Comparative Retrieval Performance: Modules vs Golden Case


Goal: Compare Cumulative Gain (CG) for the current modules with the golden case.

Data: Modules Kritisches Denken and Agiles Mindset, indexed as xapi; 20 test questions per module.

Method/Approach: Retrieval results evaluated with a binary relevance score (1/0) against manually labeled relevant chunks.

Results: The best retrieval performance for Kritisches Denken was 40% at k=3; for Agiles Mindset, 50% at k=2.

Evaluation Metrics: Retrieval quality measured as Cumulative Gain (CG at k=1 to k=6).

Conclusions: CG for the current modules was notably lower than for the golden case. Further improvements are needed to close the gap in retrieval quality relative to the golden case's ideally structured dataset.
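
A minimal sketch of how CG@k can be computed from the binary relevance labels (the labels below are illustrative, not taken from the experiment; the reported percentages suggest CG averaged over the test questions, though the exact normalization is an assumption):

```python
def cumulative_gain(relevance, k):
    """CG@k: sum of binary relevance labels over the top-k retrieved chunks."""
    return sum(relevance[:k])

# Illustrative manual labels (1 = relevant chunk) for three test questions.
labels = [
    [0, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 0],
]

for k in range(1, 7):
    avg = sum(cumulative_gain(r, k) for r in labels) / len(labels)
    print(f"CG@{k} averaged over questions = {avg:.2f}")
```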


Content retrieval details

After testing with our real courses, we decided to build a golden case of almost 5,000 question-answer pairs and proceed with further experiments.

Experiments with this golden case across different LLMs show no significant difference in the retrieved results.

There is, however, a large gap between retrieval success on the golden case (above 90%) and on our test learning content (which stays below 40%).
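
A sketch of how such a retrieval-success comparison can be run over a Q&A set; `retriever` and its `retrieve` method are hypothetical stand-ins for our actual retrieval backend, and the data variables are placeholders:

```python
def retrieval_success(qa_items, retriever, k=3):
    """Share of questions with at least one labeled-relevant chunk in the top k."""
    hits = 0
    for item in qa_items:
        top_chunks = retriever.retrieve(item["question"], top_k=k)
        if any(chunk.id in item["relevant_chunk_ids"] for chunk in top_chunks):
            hits += 1
    return hits / len(qa_items)

# golden_qa: the ~5000 golden-case Q&A pairs; module_qa: module test questions.
# print(f"Golden case:    {retrieval_success(golden_qa, retriever):.0%}")  # above 90%
# print(f"Module content: {retrieval_success(module_qa, retriever):.0%}")  # below 40%
```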

We decided to test different chunking strategies to try to improve the retrieval success scores.

After switching to the sentence splitter strategy, we obtained improved results for our test learning content, tracked in the tickets below (a chunking sketch follows them):

https://eduplex.atlassian.net/browse/EDX-529
https://eduplex.atlassian.net/browse/EDX-530
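
A minimal sketch of sentence-based chunking, assuming LlamaIndex's SentenceSplitter; the actual splitter implementation and the chunk-size/overlap parameters used in the experiments may differ:

```python
from llama_index.core.node_parser import SentenceSplitter

# Sentence-aware splitter: prefers to break on sentence boundaries rather than
# mid-sentence, which keeps retrieved chunks semantically coherent.
splitter = SentenceSplitter(chunk_size=256, chunk_overlap=32)

text = (
    "Critical thinking is the ability to question information systematically. "
    "It includes examining sources and weighing arguments before drawing conclusions."
)
chunks = splitter.split_text(text)
for i, chunk in enumerate(chunks):
    print(f"Chunk {i}: {chunk!r}")
```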