Chunk Size and Retrieval Evaluation

Goal

Determine the chunk size that yields the best retrieval performance for our RAG system.

Data

  • Modules: Kritisches Denken, Agiles Mindset (2699, 2700)
  • Chunk sizes: 100 and 150
  • 20 content-related questions per course

Method/Approach

  • Sentence splitting for chunking
  • SBERT for question encoding
  • OpenSearch with vector search and retrieval (HNSW parameters: M=24, ef_search=100); see the pipeline sketch below

Results

  • Chunk size 100 performed best for Kritisches Denken.
  • Chunk size 150 performed best for Agiles Mindset.
  • A chunk size of 150 was chosen for standardization.

Evaluation Metrics

  • Retrieval quality: Cumulative Gain (CG at k=1 to k=6); see the metric sketch below

Conclusions

Standardizing on a chunk size of 150 led to a consistent improvement in retrieval quality. However, retrieval performance should be monitored as new courses are added, to ensure the chosen parameters remain optimal.
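To make the method concrete, here is a minimal sketch of the pipeline described above: sentence-based chunking, SBERT question/chunk encoding, and OpenSearch k-NN retrieval with the stated HNSW parameters (M=24, ef_search=100). The index name, field names, SBERT model, ef_construction value, and the use of words as the chunk-size unit are assumptions for illustration, not taken from the project code.

```python
import re

from opensearchpy import OpenSearch
from sentence_transformers import SentenceTransformer

# Assumption: a multilingual SBERT model (the evaluated courses are German);
# the model actually used by the project may differ.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])
INDEX = "rag_chunks"  # hypothetical index name


def sentence_chunks(text: str, max_size: int = 150) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_size words.

    Assumption: chunk size is counted in words; the project may count tokens.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sent in sentences:
        n = len(sent.split())
        if current and count + n > max_size:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sent)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def create_index() -> None:
    """Create an HNSW index with the evaluated parameters: M=24, ef_search=100."""
    client.indices.create(index=INDEX, body={
        "settings": {"index": {"knn": True, "knn.algo_param.ef_search": 100}},
        "mappings": {"properties": {
            "text": {"type": "text"},
            "embedding": {
                "type": "knn_vector",
                "dimension": model.get_sentence_embedding_dimension(),
                "method": {
                    "name": "hnsw",
                    "space_type": "cosinesimil",
                    "engine": "nmslib",
                    # ef_construction is an assumed value, not from the report
                    "parameters": {"m": 24, "ef_construction": 128},
                },
            },
        }},
    })


def index_course(course_text: str, chunk_size: int = 150) -> None:
    """Chunk a course's text and index each chunk with its SBERT embedding."""
    for chunk in sentence_chunks(course_text, chunk_size):
        client.index(index=INDEX, body={
            "text": chunk,
            "embedding": model.encode(chunk).tolist(),
        })


def retrieve(question: str, k: int = 6) -> list[str]:
    """Encode the question with SBERT and fetch the top-k chunks via k-NN search."""
    res = client.search(index=INDEX, body={
        "size": k,
        "query": {"knn": {"embedding": {
            "vector": model.encode(question).tolist(), "k": k,
        }}},
    })
    return [hit["_source"]["text"] for hit in res["hits"]["hits"]]
```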

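The evaluation metric itself is simple: Cumulative Gain at cutoff k sums the relevance scores of the top-k retrieved chunks, evaluated here for k=1 to k=6. The sketch below assumes binary relevance judgments; the label values shown are illustrative, not real evaluation data.

```python
def cumulative_gain(relevances: list[int], k: int) -> int:
    """CG@k: sum of the relevance scores of the top-k retrieved chunks."""
    return sum(relevances[:k])


# Hypothetical relevance judgments (1 = relevant, 0 = not relevant) for the
# top-6 chunks retrieved for one question.
rels = [1, 1, 0, 1, 0, 0]
for k in range(1, 7):
    print(f"CG@{k} = {cumulative_gain(rels, k)}")
```

Per question, CG@k is computed over the top-6 hits and then compared across the two chunk-size settings, which is how the per-course winners above were determined.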