AI chatbots not yet ready for clinical use

Joshua Au Yeung; Zeljko Kraljevic; Akish Luintel; Alfred Balston; Esther Idowu; Richard J. Dobson; James T. Teo

doi:10.3389/fdgth.2023.1161098

Frontiers in Digital Health (Apr 2023)

Affiliations

Joshua Au Yeung: Department of Neuroscience, King's College Hospital, London, United Kingdom
Joshua Au Yeung: Guy's & St Thomas' Hospital, London, United Kingdom
Zeljko Kraljevic: Department of Biostatistics, King's College London, London, United Kingdom
Akish Luintel: Department of Neuroscience, King's College Hospital, London, United Kingdom
Alfred Balston: Guy's & St Thomas' Hospital, London, United Kingdom
Esther Idowu: Guy's & St Thomas' Hospital, London, United Kingdom
Richard J. Dobson: Department of Biostatistics, King's College London, London, United Kingdom
Richard J. Dobson: NIHR Biomedical Research Centre, South London and Maudsley NHS Foundation Trust and King's College London, London, United Kingdom
James T. Teo: Department of Neuroscience, King's College Hospital, London, United Kingdom
James T. Teo: Guy's & St Thomas' Hospital, London, United Kingdom

DOI: https://doi.org/10.3389/fdgth.2023.1161098
Journal volume & issue: Vol. 5

Abstract

As large language models (LLMs) expand and become more advanced, so do the natural language processing capabilities of conversational AI, or “chatbots”. OpenAI's recent release, ChatGPT, uses a transformer-based model to enable human-like text generation and question answering on general domain knowledge, while healthcare-specific LLMs such as GatorTron focus on real-world healthcare domain knowledge. As LLMs advance towards near human-level performance on medical question-answering benchmarks, it is probable that conversational AI will soon be developed for use in healthcare. In this article we discuss the potential, and compare the performance, of two different approaches to generative pretrained transformers (GPTs): ChatGPT, the most widely used general conversational LLM, and Foresight, a GPT-based model focused on modelling patients and disorders. The comparison is conducted on the task of forecasting relevant diagnoses based on clinical vignettes. We also discuss important considerations and limitations of transformer-based chatbots for clinical use.

Published in Frontiers in Digital Health

ISSN: 2673-253X (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Medicine: Public aspects of medicine; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/digital-health#
