Clinical Ophthalmology (Dec 2023)

The Utility of ChatGPT in Diabetic Retinopathy Risk Assessment: A Comparative Study with Clinical Diagnosis

  • Raghu K,
  • S T,
  • S Devishamani C,
  • M S,
  • Rajalakshmi R,
  • Raman R

Journal volume & issue
Vol. 17
pp. 4021 – 4031

Abstract

Keerthana Raghu,1,* Tamilselvi S,2,* Chitralekha S Devishamani,1 Suchetha M,2 Ramachandran Rajalakshmi,3 Rajiv Raman1

1 Shri Bhagwan Mahavir Vitreoretinal Services, Sankara Nethralaya, Chennai, Tamil Nadu, India
2 Centre for Health Care Advancement, Innovation, and Research Department, Vellore Institute of Technology, Chennai, Tamil Nadu, India
3 Department of Diabetology, Ophthalmology and Epidemiology, Madras Diabetes Research Foundation & Dr. Mohan’s Diabetes Specialities Centre, Chennai, Tamil Nadu, India

* These authors contributed equally to this work

Correspondence: Rajiv Raman, Shri Bhagwan Mahavir Vitreoretinal Services, Sankara Nethralaya, Sankara Nethralaya (Main Campus), No. 41 (Old 18), College Road, Chennai, Tamil Nadu, 600006, India, Tel +91-444288708, Fax +91-44-28254180, Email [email protected]

Purpose: To evaluate the ability of an artificial intelligence (AI) model, ChatGPT, to predict diabetic retinopathy (DR) risk.

Methods: This retrospective observational study used an anonymized dataset of 111 patients with diabetes who underwent a comprehensive eye examination along with clinical and biochemical assessments. Clinical and biochemical data, with and without central subfield thickness (CST) values of the macula from OCT, were uploaded to ChatGPT-4, and ChatGPT's responses were compared with the clinical DR diagnosis made by an ophthalmologist.

Results: The consistency of ChatGPT's responses was assessed, yielding an intraclass correlation coefficient (ICC) of 0.936 (95% CI, 0.913–0.954, p < 0.001) with CST and 0.915 (95% CI, 0.706–0.846, p < 0.001) without CST; both values indicate excellent reliability. ChatGPT's sensitivity for predicting DR cases was 67% with CST and 73% without CST, and its specificity was 68% with CST and 54% without CST. However, Cohen’s kappa showed only fair agreement between ChatGPT predictions and clinical DR status in both situations: with CST (kappa = 0.263, p = 0.005) and without CST (kappa = 0.351, p < 0.001).

Conclusion: This study suggests that ChatGPT has potential as a preliminary DR screening tool, but further optimization is needed before clinical use.

Keywords: ChatGPT, artificial intelligence, diabetic retinopathy, diabetes
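The abstract reports sensitivity, specificity, and Cohen’s kappa for ChatGPT's predictions against the ophthalmologist's diagnosis. As a reading aid, the minimal Python sketch below shows how these three quantities are conventionally derived from a 2×2 table of predicted versus clinical DR status. The counts are hypothetical and chosen only for easy arithmetic; they are not the study's data, and the names (`ConfusionCounts`, `sensitivity`, `specificity`, `cohens_kappa`) are illustrative rather than taken from the authors' analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of model prediction vs. clinical diagnosis (hypothetical helper)."""
    tp: int  # predicted DR, clinically DR
    fp: int  # predicted DR, clinically no DR
    fn: int  # predicted no DR, clinically DR
    tn: int  # predicted no DR, clinically no DR


def sensitivity(c: ConfusionCounts) -> float:
    # Fraction of clinically confirmed DR cases that the model flagged.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Fraction of clinically DR-free patients that the model cleared.
    return c.tn / (c.tn + c.fp)


def cohens_kappa(c: ConfusionCounts) -> float:
    # Agreement beyond chance: (observed - expected) / (1 - expected).
    n = c.tp + c.fp + c.fn + c.tn
    observed = (c.tp + c.tn) / n
    expected_yes = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    expected_no = ((c.fn + c.tn) / n) * ((c.fp + c.tn) / n)
    expected = expected_yes + expected_no
    return (observed - expected) / (1 - expected)


# Illustrative counts only (assumption, NOT the study's data).
example = ConfusionCounts(tp=30, fp=20, fn=10, tn=40)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.3f}")
    print(f"specificity = {specificity(example):.3f}")
    print(f"kappa       = {cohens_kappa(example):.3f}")
```

Because kappa discounts the agreement expected by chance, a classifier can show moderate sensitivity and specificity while its kappa stays low, which is the pattern the results describe.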