Evaluating the strengths and limitations of multimodal ChatGPT-4 in detecting glaucoma using fundus images

Saif Aldeen AlRyalat; Ayman Mohammed Musleh; Malik Y. Kahook

doi:10.3389/fopht.2024.1387190

Frontiers in Ophthalmology (Jun 2024)

Affiliations

Saif Aldeen AlRyalat: Department of Ophthalmology, The University of Jordan, Amman, Jordan
Saif Aldeen AlRyalat: Department of Ophthalmology, Houston Methodist Hospital, Houston, TX, United States
Ayman Mohammed Musleh: Jordan University Hospital, Amman, Jordan
Malik Y. Kahook: Department of Ophthalmology, University of Colorado School of Medicine, Sue Anschutz-Rodgers Eye Center, Aurora, CO, United States

DOI: https://doi.org/10.3389/fopht.2024.1387190
Journal volume & issue: Vol. 4

Abstract

Overview: This study evaluates the diagnostic accuracy of a multimodal large language model (LLM), ChatGPT-4, in recognizing glaucoma using color fundus photographs (CFPs) with a benchmark dataset and without prior training or fine tuning.

Methods: The publicly accessible Retinal Fundus Glaucoma Challenge “REFUGE” dataset was utilized for analyses. The input data consisted of the entire 400 image testing set. The task involved classifying fundus images into either ‘Likely Glaucomatous’ or ‘Likely Non-Glaucomatous’. We constructed a confusion matrix to visualize the results of predictions from ChatGPT-4, focusing on accuracy of binary classifications (glaucoma vs non-glaucoma).

Results: ChatGPT-4 demonstrated an accuracy of 90% with a 95% confidence interval (CI) of 87.06%-92.94%. The sensitivity was found to be 50% (95% CI: 34.51%-65.49%), while the specificity was 94.44% (95% CI: 92.08%-96.81%). The precision was recorded at 50% (95% CI: 34.51%-65.49%), and the F1 Score was 0.50.

Conclusion: ChatGPT-4 achieved relatively high diagnostic accuracy without prior fine tuning on CFPs. Considering the scarcity of data in specialized medical fields, including ophthalmology, the use of advanced AI techniques, such as LLMs, might require less data for training compared to other forms of AI with potential savings in time and financial resources. It may also pave the way for the development of innovative tools to support specialized medical care, particularly those dependent on multimodal data for diagnosis and follow-up, irrespective of resource constraints.
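The reported metrics can be related to a confusion matrix with a short calculation. The sketch below is illustrative only: the abstract does not give the cell counts or the confidence-interval method, so the counts (20 true positives, 20 false negatives, 340 true negatives, 20 false positives, i.e. 40 glaucomatous and 360 non-glaucomatous images out of 400) are assumptions chosen to be consistent with the reported percentages, and the normal-approximation (Wald) interval is likewise an assumption. The names `wald_ci`, `binary_metrics`, and `ASSUMED_COUNTS` are hypothetical.

```python
import math
from statistics import NormalDist


def wald_ci(p, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def binary_metrics(tp, fn, tn, fp):
    """Return {metric: (value, ci)} for a 2x2 confusion matrix."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # recall on glaucomatous images
    specificity = tn / (tn + fp)   # recall on non-glaucomatous images
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (accuracy, wald_ci(accuracy, total)),
        "sensitivity": (sensitivity, wald_ci(sensitivity, tp + fn)),
        "specificity": (specificity, wald_ci(specificity, tn + fp)),
        "precision": (precision, wald_ci(precision, tp + fp)),
        "f1": (f1, None),  # no interval computed for F1 in this sketch
    }


# ASSUMED cell counts (not stated in the abstract), chosen to match the
# reported accuracy, sensitivity, specificity, and precision.
ASSUMED_COUNTS = dict(tp=20, fn=20, tn=340, fp=20)

metrics = binary_metrics(**ASSUMED_COUNTS)
for name, (value, ci) in metrics.items():
    if ci is None:
        print(f"{name}: {value:.2f}")
    else:
        print(f"{name}: {value:.2%} (95% CI {ci[0]:.2%} to {ci[1]:.2%})")
```

Under these assumed counts, the calculation is consistent with the figures in the Results. It also shows why accuracy and sensitivity diverge: with only about 10% of images assumed glaucomatous, correct calls on the majority class dominate accuracy, while sensitivity depends solely on the 40 assumed glaucoma cases.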

Published in Frontiers in Ophthalmology

ISSN: 2674-0826 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Medicine
Website: https://www.frontiersin.org/journals/ophthalmology
