# Chunk Size and Retrieval Evaluation

**Goal**\
Determine the chunk size that yields the best retrieval performance for our RAG system.\
\
**Data**\
Modules: Kritisches Denken, Agiles Mindset (2699, 2700)\
Chunk sizes: 100 and 150\
20 content-related questions for each course.\
\
**Method/Approach**\
Sentence splitting for chunking\
SBERT for question encoding\
OpenSearch vector search for retrieval (HNSW parameters: M=24, ef\_search=100)\
\
**Results**\
Chunk size 100 performed best for Kritisches Denken; chunk size 150 performed best for Agiles Mindset. A chunk size of 150 was chosen for standardization.\
\
**Evaluation Metrics**\
Retrieval quality: Cumulative Gain (CG) at k=1 to k=6\
\
**Conclusions**\
Standardizing to a chunk size of 150 led to consistent improvement in retrieval quality. However, retrieval performance should be monitored as new courses are added to ensure the chosen parameters remain optimal.
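
The Cumulative Gain metric named above can be sketched as follows. CG at rank k is the plain sum of the relevance grades of the top k retrieved chunks, with no rank discount. This is a minimal illustration, not the evaluation code used in this study: the function names, the binary relevance grades, and the toy judgments are all assumptions made for the example, and the numbers are not the study's results.

```python
from typing import Dict, List, Sequence


def cumulative_gain(relevances: Sequence[float], k: int) -> float:
    """Sum the relevance grades of the top-k retrieved chunks (no rank discount)."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return float(sum(relevances[:k]))


def mean_cg_curve(per_question: List[Sequence[float]], max_k: int = 6) -> Dict[int, float]:
    """Average CG@k over all questions for k = 1..max_k."""
    if not per_question:
        raise ValueError("need at least one question")
    n = len(per_question)
    return {
        k: sum(cumulative_gain(rels, k) for rels in per_question) / n
        for k in range(1, max_k + 1)
    }


# Hypothetical relevance judgments (1 = relevant, 0 = not relevant), one list
# per question, ordered by retrieval rank. Toy data for illustration only.
judgments = {
    100: [[1, 0, 1, 0, 0, 0], [0, 1, 0, 0, 1, 0]],
    150: [[1, 1, 0, 0, 0, 1], [1, 0, 0, 1, 0, 0]],
}

# One CG curve (k = 1..6) per chunk size, averaged over the questions.
curves = {size: mean_cg_curve(rels) for size, rels in judgments.items()}

for size, curve in curves.items():
    print(size, curve)
```

Comparing the resulting curves per module at each k is one way to decide between chunk sizes; because CG ignores rank position within the top k, it rewards how many relevant chunks are retrieved rather than how early they appear.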

