Frontiers in Medicine (May 2024)

GastroBot: a Chinese gastrointestinal disease chatbot based on the retrieval-augmented generation

  • Qingqing Zhou,
  • Can Liu,
  • Yuchen Duan,
  • Kaijie Sun,
  • Yu Li,
  • Hongxing Kan,
  • Zongyun Gu,
  • Jianhua Shu,
  • Jili Hu

DOI
https://doi.org/10.3389/fmed.2024.1392555
Journal volume & issue
Vol. 11

Abstract

Introduction: Large Language Models (LLMs) play a crucial role in clinical information processing and generalize robustly across diverse language tasks. However, existing LLMs are not optimized for clinical applications, which raises problems of hallucination and interpretability. Retrieval-Augmented Generation (RAG) addresses these issues by grounding answer generation in retrieved sources, thereby reducing errors. This study explores the application of RAG in clinical gastroenterology to improve the generation of knowledge about gastrointestinal diseases.

Methods: We fine-tuned the embedding model on a corpus of 25 guidelines on gastrointestinal diseases. The fine-tuned model achieved an 18% improvement in hit rate over its base model, gte-base-zh, and outperformed OpenAI’s Embedding model by 20%. Using the RAG framework with llama-index, we developed a Chinese gastroenterology chatbot named “GastroBot,” which significantly improves answer accuracy and contextual relevance, minimizing errors and the risk of disseminating misleading information.

Results: When evaluated with the RAGAS framework, GastroBot achieved a context recall of 95%, a faithfulness to the source of 93.73%, and an answer relevance of 92.28%. These findings highlight the effectiveness of GastroBot in providing accurate and contextually relevant information about gastrointestinal diseases. In manual assessment against other models, GastroBot delivered a substantial amount of valuable knowledge while maintaining the completeness and consistency of its results.

Discussion: The findings suggest that incorporating RAG into clinical gastroenterology can enhance the accuracy and reliability of large language models. As a practical implementation of this method, GastroBot demonstrated significant improvements in contextual comprehension and response quality. Continued exploration and refinement of the model could advance clinical information processing and decision support in gastroenterology.
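
The abstract reports hit rate for the embedding models but does not define it. A minimal sketch, assuming the common definition (the fraction of queries whose relevant passage appears among the top-k retrieved passages), is given below. The function name, the query and chunk identifiers, and the toy runs are hypothetical illustrations, not the authors' data or code.

```python
from typing import Dict, List


def hit_rate(retrieved: Dict[str, List[str]], relevant: Dict[str, str], k: int = 5) -> float:
    """Fraction of queries whose relevant chunk id is in the top-k retrieved ids."""
    if not relevant:
        raise ValueError("no queries to evaluate")
    hits = sum(
        1
        for query_id, gold_chunk in relevant.items()
        if gold_chunk in retrieved.get(query_id, [])[:k]
    )
    return hits / len(relevant)


# Hypothetical toy data: one relevant chunk per query, ranked results per model.
relevant = {"q1": "c3", "q2": "c7", "q3": "c1", "q4": "c9"}
base_run = {"q1": ["c3", "c4"], "q2": ["c2", "c5"], "q3": ["c8", "c1"], "q4": ["c6", "c0"]}
tuned_run = {"q1": ["c3", "c4"], "q2": ["c7", "c5"], "q3": ["c1", "c8"], "q4": ["c6", "c9"]}

base_score = hit_rate(base_run, relevant, k=2)
tuned_score = hit_rate(tuned_run, relevant, k=2)
print(f"base hit rate@2:  {base_score:.2f}")
print(f"tuned hit rate@2: {tuned_score:.2f}")
```

Comparing a base and a fine-tuned embedding model on the same query set in this way yields the kind of hit-rate difference the Methods paragraph refers to; the choice of k and of a single relevant chunk per query are assumptions of this sketch.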

Keywords