Goal: Compare the Cumulative Gain (CG) of the current modules with that of the golden case.
Data: Modules Kritisches Denken and Agiles Mindset, indexed xAPI. 20 test questions for each module.
Method/Approach: Retrieval results were evaluated with a binary relevance score (1 or 0), based on manually labeled relevant chunks.
Results: The best retrieval performance achieved for the module Kritisches Denken was 40% at k=3. For Agiles Mindset, the best achieved was 50% at k=2.
Evaluation Metrics: Retrieval quality, measured as Cumulative Gain (CG) at k=1 to k=6.
Conclusions: CG for the current modules was notably lower than for the golden case. Further improvements are needed to close the gap between our retrieval quality and that of the ideally structured golden-case dataset.
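For reference, the metric named above can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation script used for these experiments: the function names and the sample labels are hypothetical, and the page does not state how the reported percentages were derived from CG, so the sketch only computes raw CG at k and its mean over the test questions.

```python
from typing import Sequence


def cumulative_gain(relevance: Sequence[int], k: int) -> int:
    """Sum the binary relevance labels (1 or 0) of the top-k retrieved chunks."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return sum(relevance[:k])


def mean_cg_at_k(all_relevance: Sequence[Sequence[int]], k: int) -> float:
    """Average CG at k over all test questions."""
    if not all_relevance:
        return 0.0
    return sum(cumulative_gain(r, k) for r in all_relevance) / len(all_relevance)


# Hypothetical labels for two questions, six retrieved chunks each,
# listed in ranking order (1 = relevant, 0 = not relevant).
labels = [
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]

for k in range(1, 7):
    print(f"mean CG at k={k}: {mean_cg_at_k(labels, k)}")
```

Unlike discounted variants such as DCG, plain CG ignores where in the top k a relevant chunk appears, so it only reports how many relevant chunks were retrieved within the cutoff.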
After testing with our real courses, we decided to obtain a golden case of almost 5000 questions and answers and to run further experiments.
Experiments with this golden case and different LLMs showed no large difference in the retrieved results.
However, there is a large gap between the results for the golden case (above 90%) and those for our test learning content (which stays below 40% retrieval success).
We therefore decided to test different chunking strategies to try to improve the retrieval success scores.
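One strategy of this kind is to split on sentence boundaries, so that no chunk cuts a sentence in half. The sketch below only illustrates the idea and is not the splitter used in these experiments: the function names, the naive regex for sentence boundaries, and the `max_chars` limit are all assumptions made for the example.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naive sentence split: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def chunk_by_sentences(text: str, max_chars: int = 200) -> List[str]:
    """Pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for sentence in split_sentences(text):
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


print(chunk_by_sentences("One. Two. Three.", max_chars=9))
```

A production splitter would also need to handle abbreviations, decimal numbers, and text without terminal punctuation, which this regex does not.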
With the sentence splitter strategy, we obtained improved results for our test learning content: