Towards Reliable Healthcare LLM Agents: A Case Study for Pilgrims during Hajj

Hanan M. Alghamdi; Abeer Mostafa

doi:10.3390/info15070371

Information (Jun 2024)

Affiliations

Hanan M. Alghamdi: Department of Computers, College of Engineering and Computing Al Qunfidhah, Umm Al-Qura University, Makkah 24382, Saudi Arabia
Abeer Mostafa: Department of Computer Science and Engineering, Egypt-Japan University of Science and Technology, Alexandria 21934, Egypt

DOI: https://doi.org/10.3390/info15070371
Journal volume & issue: Vol. 15, no. 7, p. 371

Abstract

There is a pressing need for healthcare conversational agents with domain-specific expertise that provide accurate and reliable information tailored to specific medical contexts. Moreover, there is a notable gap in research on ensuring the credibility and trustworthiness of the information these agents provide, particularly in critical scenarios such as medical emergencies. Pilgrims come from diverse cultural and linguistic backgrounds and often face difficulties in accessing medical advice and information. An AI-powered multilingual chatbot can bridge this gap by providing readily available medical guidance and support, contributing to the well-being and safety of pilgrims. In this paper, we present a comprehensive methodology for enhancing the reliability and efficacy of healthcare conversational agents, with a specific focus on the needs of Hajj pilgrims. Our approach introduces the HajjHealthQA dataset and combines domain-specific fine-tuning of a large language model with synthetic data augmentation to improve the delivery of contextually relevant healthcare information. Additionally, we employ a retrieval-augmented generation (RAG) module to validate uncertain generated responses, which improves model performance by 5%. Moreover, we train a secondary AI agent on a well-known health fact-checking dataset and use it to validate medical information in the generated responses. Our approach significantly elevates the chatbot’s accuracy, demonstrating its adaptability to a wide range of pilgrim queries. We evaluate the chatbot’s performance using quantitative and qualitative metrics, highlighting its proficiency in generating accurate responses and achieving competitive results compared to state-of-the-art models, in addition to mitigating the risk of misinformation and providing users with trustworthy health information.
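
The control flow described in the abstract (generate a response, fall back to retrieval when the response is uncertain, then pass the result to a fact-checking agent) might be sketched as follows. This is only an illustrative outline, not the authors' implementation: the abstract does not say how uncertainty is measured, so the confidence score, the `threshold` value, and every function name here (`answer_query`, `toy_generate`, `toy_retrieve`, `toy_generate_with_context`, `toy_fact_check`) are assumptions, and the toy functions stand in for the fine-tuned model, the RAG module, and the secondary agent.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Answer:
    text: str
    confidence: float  # hypothetical score in [0, 1]; not defined in the abstract
    source: str        # "llm" if answered directly, "rag" if retrieval was used
    verified: bool     # verdict of the (hypothetical) fact-checking agent


def answer_query(
    query: str,
    generate: Callable[[str], Tuple[str, float]],
    retrieve: Callable[[str], List[str]],
    generate_with_context: Callable[[str, List[str]], Tuple[str, float]],
    fact_check: Callable[[str], bool],
    threshold: float = 0.7,  # assumed cut-off for an "uncertain" response
) -> Answer:
    """Generate a response, use retrieval if it is uncertain, then fact-check it."""
    text, confidence = generate(query)
    source = "llm"
    if confidence < threshold:
        # Uncertain response: regenerate it with retrieved passages as context.
        passages = retrieve(query)
        text, confidence = generate_with_context(query, passages)
        source = "rag"
    # The secondary agent checks the final text regardless of which path produced it.
    return Answer(text, confidence, source, fact_check(text))


# Toy stand-ins so the sketch runs end to end; none of these are real models.
def toy_generate(query: str) -> Tuple[str, float]:
    return ("Drink water regularly and rest in the shade.", 0.4)


def toy_retrieve(query: str) -> List[str]:
    return ["Heat exhaustion guidance: move to a cool place and rehydrate."]


def toy_generate_with_context(query: str, passages: List[str]) -> Tuple[str, float]:
    return ("Move to a cool place and rehydrate.", 0.9)


def toy_fact_check(text: str) -> bool:
    return "rehydrate" in text


if __name__ == "__main__":
    result = answer_query(
        "What should I do if I feel dizzy in the heat?",
        toy_generate,
        toy_retrieve,
        toy_generate_with_context,
        toy_fact_check,
    )
    print(result)
```

Passing the components in as callables keeps the sketch independent of any particular model or retrieval library; a real system would replace the toy functions with its own.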

Published in Information

ISSN: 2078-2489 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/information/
