# Chunk Size and Retrieval Evaluation

**Goal**\
Determine the chunk size that yields the best retrieval performance for our RAG system.\
\
**Data**\
Modules: Kritisches Denken, Agiles Mindset (2699, 2700)\
Chunk sizes: 100 and 150\
20 content-related questions for each course.\
\
**Method/Approach**\
Sentence splitting for chunking\
SBERT for question encoding\
OpenSearch vector search for retrieval (HNSW parameters: M=24, ef\_search=100)\
\
**Results**\
Chunk size 100 performed best for Kritisches Denken; chunk size 150 performed best for Agiles Mindset. 150 was chosen as the standard for both modules.\
\
**Evaluation Metrics**\
Retrieval quality: Cumulative Gain (CG at k=1 to k=6)\
\
**Conclusions**\
Standardizing on a chunk size of 150 led to consistent retrieval quality across both modules, even though 100 performed better for Kritisches Denken alone. Retrieval performance should be monitored as new courses are added to confirm that the chosen parameters remain optimal.
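
The sentence-splitting step listed under Method/Approach could look roughly like the sketch below. It is not the pipeline's actual code. It assumes that chunk size counts whitespace-separated words (the note does not state the unit), uses a naive regex splitter in place of whatever sentence splitter was used, and the function names `split_sentences` and `chunk_by_sentences` are hypothetical.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_by_sentences(text: str, max_words: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_words words.

    Assumption: size is measured in whitespace-separated words.
    A single sentence longer than max_words becomes its own oversized chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for sentence in split_sentences(text):
        n_words = len(sentence.split())
        # Close the current chunk if adding this sentence would exceed the limit.
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    # Made-up sample text, for illustration only.
    sample = (
        "Critical thinking starts with a question. Evidence comes next! "
        "Does the claim hold? Then decide."
    )
    for size in (5, 8):
        print(size, chunk_by_sentences(sample, max_words=size))
```

Keeping sentences whole means a chunk can fall short of the limit, so the two sizes compared here (100 and 150) act as upper bounds under this assumption, not exact lengths.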

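For the Cumulative Gain metric named under Evaluation Metrics, a minimal sketch follows. CG at k is the undiscounted sum of the relevance grades of the top k retrieved chunks. The note does not say how relevance was graded or how scores were aggregated, so the binary grades, the averaging over questions, and the names `cumulative_gain` and `mean_cg_at_k` are assumptions made for illustration.

```python
def cumulative_gain(relevances: list[float], k: int) -> float:
    """Sum the relevance grades of the top-k results (no rank discount).

    `relevances` must be ordered by retrieval rank, best match first.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    return float(sum(relevances[:k]))


def mean_cg_at_k(runs: list[list[float]], k: int) -> float:
    """Average CG@k over several questions (assumed aggregation)."""
    if not runs:
        return 0.0
    return sum(cumulative_gain(r, k) for r in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical judgments for three questions: 1 = relevant chunk, 0 = not.
    judged = [
        [1, 0, 1, 0, 0, 1],
        [0, 1, 1, 0, 0, 0],
        [1, 1, 0, 0, 1, 0],
    ]
    for k in range(1, 7):
        print(f"CG@{k} = {mean_cg_at_k(judged, k):.2f}")
```

Because CG ignores rank position within the top k, two configurations with the same CG at k=6 can still differ at k=1, which is one reason to report the full range from k=1 to k=6.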