大数据 (Jan 2025)

Question Answering Enhancement Method for Large Educational Models Based on Re-ranking and Post-retrieval Reflection

  • SUN Haoran,
  • WANG Zhihao,
  • WU Yifan,
  • XIANG Yang

Abstract

Computer education is one of the core requirements of education in the modern information society. As large language models (LLMs) have developed, interest in applying them to computer education has grown. However, the hallucination problem of LLMs poses significant challenges to such applications. Retrieval-augmented generation (RAG) mitigates this problem by incorporating external knowledge bases, which can improve the quality of the responses LLMs generate. Traditional RAG methods, however, often fail to filter out irrelevant external knowledge, and the resulting interference from unrelated information leaves the hallucination problem inadequately addressed. In this paper, we collect computer-related textbooks and knowledge documents and divide them into knowledge document blocks to construct an external knowledge base. We propose a question-answering enhancement method for large educational models based on re-ranking and post-retrieval reflection. A high-performance multilingual re-ranking model based on a cross-encoder captures deep semantic information to filter the retrieved content, alleviating the shortcomings of traditional retrieval-augmented generation. We additionally use the retrieved knowledge to drive model reflection, which further improves response quality. The approach significantly improves the accuracy of LLMs on computer question-answering tasks. Tested on several popular generative models, it achieves promising results on CS-Bench, with an accuracy increase of approximately 5% on computer question-answering tasks.
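
The pipeline described in the abstract (block construction, retrieval, re-ranking, then reflection over the retrieved context) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: every function name is hypothetical, the token-overlap scorer is only a stand-in for the multilingual cross-encoder re-ranker, the block size and cut-offs are arbitrary, and the prompts are invented for the example. The `llm` argument is any callable that maps a prompt string to a response string.

```python
from typing import Callable, List

def split_into_blocks(text: str, max_words: int = 40) -> List[str]:
    """Split a document into fixed-size word blocks (assumed chunking rule)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def tokens(text: str) -> set:
    """Lower-cased word set with surrounding punctuation stripped."""
    return {w.strip(".,?!:;()").lower() for w in text.split()} - {""}

def overlap_score(query: str, block: str) -> float:
    """Toy relevance score: fraction of query tokens present in the block.
    Stand-in only; the paper uses a cross-encoder re-ranking model here."""
    q = tokens(query)
    return len(q & tokens(block)) / len(q) if q else 0.0

def retrieve(query: str, blocks: List[str], top_k: int = 5) -> List[str]:
    """First-stage retrieval: keep the top_k blocks by a cheap score."""
    return sorted(blocks, key=lambda b: overlap_score(query, b), reverse=True)[:top_k]

def rerank(query: str, candidates: List[str],
           scorer: Callable[[str, str], float],
           keep: int = 2, min_score: float = 0.0) -> List[str]:
    """Second stage: re-score candidates and drop low-relevance blocks."""
    scored = sorted(((scorer(query, c), c) for c in candidates),
                    key=lambda pair: pair[0], reverse=True)
    return [c for s, c in scored[:keep] if s > min_score]

def answer_with_reflection(question: str, blocks: List[str],
                           llm: Callable[[str], str],
                           scorer: Callable[[str, str], float] = overlap_score) -> str:
    """Draft an answer from the filtered context, then ask the model to
    check that draft against the same context (assumed reflection prompt)."""
    context = "\n".join(rerank(question, retrieve(question, blocks), scorer))
    draft = llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
    return llm(f"Context:\n{context}\n\nQuestion: {question}\n"
               f"Draft answer: {draft}\n"
               "Check the draft against the context and give a corrected final answer:")
```

In a real system, `scorer` would wrap a cross-encoder that scores each (question, block) pair jointly, and `llm` would wrap the generative model under evaluation. The two-call structure is the point of the sketch: irrelevant blocks are filtered out before generation, and the same filtered context is reused when the model reviews its own draft.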

Keywords