Analysis of ChatGPT Responses to Ophthalmic Cases: Can ChatGPT Think like an Ophthalmologist?

Jimmy S. Chen, MD; Akshay J. Reddy, BS; Eman Al-Sharif, MD; Marissa K. Shoji, MD; Fritz Gerald P. Kalaw, MD; Medi Eslani, MD; Paul Z. Lang, MD; Malvika Arya, MD; Zachary A. Koretz, MD, MPH; Kyle A. Bolo, MD; Justin J. Arnett, MD; Aliya C. Roginiel, MD, MPH; Jiun L. Do, MD, PhD; Shira L. Robbins, MD; Andrew S. Camp, MD; Nathan L. Scott, MD; Jolene C. Rudell, MD, PhD; Robert N. Weinreb, MD; Sally L. Baxter, MD, MSc; David B. Granet, MD, MHCM

Ophthalmology Science (Jan 2025)

Affiliations

Jimmy S. Chen, MD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California; UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
Akshay J. Reddy, BS: School of Medicine, California University of Science and Medicine, Colton, California
Eman Al-Sharif, MD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California; Surgery Department, College of Medicine, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
Marissa K. Shoji, MD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Fritz Gerald P. Kalaw, MD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California; UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
Medi Eslani, MD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Paul Z. Lang, MD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Malvika Arya, MD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Zachary A. Koretz, MD, MPH: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Kyle A. Bolo, MD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Justin J. Arnett, MD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Aliya C. Roginiel, MD, MPH: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Jiun L. Do, MD, PhD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Shira L. Robbins, MD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Andrew S. Camp, MD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Nathan L. Scott, MD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Jolene C. Rudell, MD, PhD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Robert N. Weinreb, MD: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California; UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
Sally L. Baxter, MD, MSc: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California; UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
David B. Granet, MD, MHCM: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California

Correspondence: David B. Granet, MD, MHCM, Anne F. Ratner Chair in Pediatric Ophthalmology, Ratner Eye Center, Shiley Eye Institute, 9415 Campus Point Dr MC0946, La Jolla, CA, 92093.

Journal volume & issue: Vol. 5, no. 1, p. 100600

Abstract

Objective: Large language models such as ChatGPT have demonstrated significant potential in question-answering within ophthalmology, but there is a paucity of literature evaluating their ability to generate clinical assessments and discussions. The objectives of this study were to (1) assess the accuracy of assessments and plans generated by ChatGPT and (2) evaluate ophthalmologists’ abilities to distinguish between responses generated by clinicians versus ChatGPT.

Design: Cross-sectional mixed-methods study.

Subjects: Sixteen ophthalmologists from a single academic center, of whom 10 were board-eligible and 6 were board-certified, were recruited to participate in this study.

Methods: Prompt engineering was used to ensure that ChatGPT output discussions in the style of the ophthalmologist author of the Medical College of Wisconsin Ophthalmic Case Studies. Cases where ChatGPT accurately identified the primary diagnoses were included and then paired. Masked human-generated and ChatGPT-generated discussions were sent to participating ophthalmologists, who were asked to identify the author of each discussion. Response confidence was assessed using a 5-point Likert scale score, and subjective feedback was manually reviewed.

Main Outcome Measures: Accuracy of ophthalmologist identification of discussion author, as well as subjective perceptions of human-generated versus ChatGPT-generated discussions.

Results: Overall, ChatGPT correctly identified the primary diagnosis in 15 of 17 (88.2%) cases. Two cases were excluded from the paired comparison due to hallucinations or fabrications of nonuser-provided data. Ophthalmologists correctly identified the author in 77.9% ± 26.6% of the 13 included cases, with a mean Likert scale confidence rating of 3.6 ± 1.0. No significant differences in performance or confidence were found between board-certified and board-eligible ophthalmologists. Subjectively, ophthalmologists found that discussions written by ChatGPT tended to be more generic, to include irrelevant information, to hallucinate more frequently, and to have distinct syntactic patterns (all P < 0.01).

Conclusions: Large language models have the potential to synthesize clinical data and generate ophthalmic discussions. While these findings have exciting implications for artificial intelligence-assisted health care delivery, more rigorous real-world evaluation of these models is necessary before clinical deployment.

Financial Disclosures: The author(s) have no proprietary or commercial interest in any materials discussed in this article.
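
The case counts reported in the Results can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the study's methods; the variable names and the `percent` helper are assumptions introduced here, and the only inputs are the counts stated in the abstract.

```python
# Counts taken directly from the Results section of the abstract.
cases_total = 17        # cases presented to ChatGPT
cases_correct_dx = 15   # cases where the primary diagnosis was correct
cases_excluded = 2      # excluded for hallucinated or fabricated data


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of cases with a correct primary diagnosis.
diagnostic_accuracy = percent(cases_correct_dx, cases_total)

# Cases remaining for the paired human-versus-ChatGPT comparison.
cases_compared = cases_correct_dx - cases_excluded

print(diagnostic_accuracy)  # 88.2
print(cases_compared)       # 13
```

Both values agree with the reported 88.2% diagnostic accuracy and the 13 cases included in the paired comparison.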

Published in Ophthalmology Science

ISSN: 2666-9145 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Ophthalmology
Website: https://www.journals.elsevier.com/ophthalmology-science/
