Scientific Reports (Mar 2024)

Development of a deep learning model to distinguish the cause of optic disc atrophy using retinal fundus photography

  • Dong Kyu Lee,
  • Young Jo Choi,
  • Seung Jae Lee,
  • Hyun Goo Kang,
  • Yu Rang Park

DOI
https://doi.org/10.1038/s41598-024-55054-0
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 9

Abstract

The differential diagnosis of optic atrophy can be challenging and requires expensive, time-consuming ancillary testing to determine the cause. While Leber's hereditary optic neuropathy (LHON) and optic neuritis (ON) are both clinically significant causes of optic atrophy, both are relatively rare in the general population, which limits the availability of large imaging datasets. This study therefore aims to develop a deep learning (DL) model based on small datasets that could distinguish the cause of optic disc atrophy using only fundus photography. We retrospectively reviewed fundus photographs of 120 normal eyes, 30 eyes (15 patients) with genetically confirmed LHON, and 30 eyes (26 patients) with ON. Images were split into training and test datasets, and the model was trained with ResNet-18. To visualize the critical regions in retinal photographs that are highly associated with disease prediction, Gradient-weighted Class Activation Mapping (Grad-CAM) was used to generate image-level attention heat maps and to enhance the interpretability of the DL system. In the 3-class classification of normal, LHON, and ON, the area under the receiver operating characteristic curve (AUROC) was 1.0 for normal, 0.988 for LHON, and 0.990 for ON, clearly differentiating each class from the others with an overall accuracy of 0.93. Specifically, when distinguishing between normal and disease cases, the precision, recall, and F1 scores were perfect at 1.0. Furthermore, in the differentiation of LHON from other conditions, ON from others, and between LHON and ON, we consistently observed precision, recall, and F1 scores of 0.8. Model performance was maintained down to the point where only 10% of the image's pixel values, those identified as important by Grad-CAM, were preserved and the rest were masked, followed by retraining and evaluation.
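The masking step described at the end of the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything here is an assumption made for illustration: the function name `mask_keep_top_fraction`, the use of zero as the fill value for masked pixels, and the representation of the image and the Grad-CAM heat map as plain nested lists of equal shape (a real pipeline would more likely operate on image tensors).

```python
def mask_keep_top_fraction(image, heatmap, keep_fraction=0.10, fill_value=0):
    """Keep the pixels with the highest heat-map scores; mask all others.

    image and heatmap are 2-D nested lists of the same shape.
    fill_value (assumed here to be 0) replaces every masked pixel.
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")

    rows, cols = len(image), len(image[0])

    # Rank all pixel coordinates by heat-map importance, highest first.
    coords = sorted(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: heatmap[rc[0]][rc[1]],
        reverse=True,
    )

    # Number of pixels to preserve (at least one).
    n_keep = max(1, int(round(keep_fraction * rows * cols)))
    keep = set(coords[:n_keep])

    # Copy preserved pixels; replace everything else with fill_value.
    return [
        [image[r][c] if (r, c) in keep else fill_value for c in range(cols)]
        for r in range(rows)
    ]


# Toy 10x10 example: importance increases from top-left to bottom-right.
toy_image = [[7 for _ in range(10)] for _ in range(10)]
toy_heatmap = [[r * 10 + c for c in range(10)] for r in range(10)]
masked = mask_keep_top_fraction(toy_image, toy_heatmap, keep_fraction=0.10)
```

In this toy case the ten highest-scoring pixels all lie in the last row, so only that row survives the mask. In the workflow the abstract describes, images masked in this way would then be used to retrain and re-evaluate the classifier.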

Keywords