Ophthalmology Science (Jun 2022)
Deep Learning-Based Cataract Detection and Grading from Slit-Lamp and Retro-Illumination Photographs
Abstract
Purpose: To develop and validate an automated deep learning (DL)-based artificial intelligence (AI) platform for diagnosing and grading cataracts from slit-lamp and retroillumination lens photographs, based on the Lens Opacities Classification System (LOCS) III.
Design: Cross-sectional study in which a convolutional neural network was trained and tested using slit-lamp and retroillumination lens photographs.
Participants: One thousand three hundred thirty-five slit-lamp images and 637 retroillumination lens images from 596 patients.
Methods: Slit-lamp and retroillumination lens photographs were graded by 2 trained graders using LOCS III. Image datasets were labeled and divided into training, validation, and test datasets. We trained and validated AI platforms with 4 key strategies from the AI domain: (1) a region detection network to handle redundant information within the images, (2) data augmentation and transfer learning to address the small dataset size, (3) generalized cross-entropy loss to address dataset bias, and (4) class-balanced loss to address class imbalance. The performance of the AI platform was reinforced with an ensemble of 3 AI algorithms: ResNet18, WideResNet50-2, and ResNext50.
Main Outcome Measures: Diagnostic and LOCS III-based grading prediction performance of AI platforms.
Results: For nuclear opalescence (NO) and nuclear color (NC), respectively, using slit-lamp photographs, the AI platform showed robust diagnostic performance (area under the receiver operating characteristic curve [AUC], 0.9992 [95% confidence interval (CI), 0.9986–0.9998] and 0.9994 [95% CI, 0.9989–0.9998]; accuracy, 98.82% [95% CI, 97.7%–99.9%] and 98.51% [95% CI, 97.4%–99.6%]) and LOCS III-based grading prediction performance (AUC, 0.9567 [95% CI, 0.9501–0.9633] and 0.9650 [95% CI, 0.9509–0.9792]; accuracy, 91.22% [95% CI, 89.4%–93.0%] and 90.26% [95% CI, 88.6%–91.9%]). For cortical opacity (CO) and posterior subcapsular opacity (PSC), respectively, using retroillumination images, the system achieved high diagnostic performance (AUC, 0.9680 [95% CI, 0.9579–0.9781] and 0.9465 [95% CI, 0.9348–0.9582]; accuracy, 96.21% [95% CI, 94.4%–98.0%] and 92.17% [95% CI, 88.6%–95.8%]) and good LOCS III-based grading prediction performance (AUC, 0.9044 [95% CI, 0.8958–0.9129] and 0.9174 [95% CI, 0.9055–0.9295]; accuracy, 91.33% [95% CI, 89.7%–93.0%] and 87.89% [95% CI, 85.6%–90.2%]).
Conclusions: Our DL-based AI platform yielded accurate and precise detection and grading of NO and NC in 7-level classification and of CO and PSC in 6-level classification, overcoming limitations of medical databases such as limited training data and biased label distributions.
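The loss functions and the ensemble named in the Methods can be illustrated with a short sketch. The abstract does not give the exact formulations used, so the code below assumes the commonly used forms: generalized cross-entropy as (1 - p_y^q) / q, class-balanced weights derived from the effective number of samples, (1 - beta) / (1 - beta^n_c), and an ensemble that averages the softmax probabilities of the individual models. All function names, the values of q and beta, and the example class counts are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def class_balanced_weights(class_counts, beta=0.999):
    """Per-class weights from the effective number of samples (assumed form).

    Every class count must be positive. Weights are rescaled so that they
    sum to the number of classes.
    """
    counts = np.asarray(class_counts, dtype=float)
    effective_num = 1.0 - np.power(beta, counts)
    weights = (1.0 - beta) / effective_num
    return weights * len(counts) / weights.sum()


def generalized_cross_entropy(probs, labels, q=0.7, class_weights=None):
    """Generalized cross-entropy, (1 - p_y**q) / q, averaged over samples.

    With q = 1 this reduces to 1 - p_y; as q approaches 0 it approaches
    the standard cross-entropy. Optional class weights rescale each sample.
    """
    labels = np.asarray(labels)
    p_true = probs[np.arange(len(labels)), labels]
    loss = (1.0 - np.power(p_true, q)) / q
    if class_weights is not None:
        loss = loss * np.asarray(class_weights)[labels]
    return loss.mean()


def ensemble_probs(logits_per_model):
    """Average the softmax outputs of several models (assumed combination rule)."""
    return np.mean([softmax(l) for l in logits_per_model], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_samples, n_classes = 8, 7           # 7 grading levels, as for NO and NC
    counts = [400, 300, 250, 200, 100, 60, 25]   # hypothetical label counts
    labels = rng.integers(0, n_classes, size=n_samples)

    # Stand-ins for the logits of three separately trained networks.
    logits = [rng.normal(size=(n_samples, n_classes)) for _ in range(3)]

    weights = class_balanced_weights(counts)
    probs = ensemble_probs(logits)
    print("class weights:", np.round(weights, 3))
    print("weighted GCE loss:", generalized_cross_entropy(probs, labels, 0.7, weights))
    print("predicted grades:", probs.argmax(axis=1))
```

In this sketch the rarest grade receives the largest weight, so its errors contribute more to the loss, while the bounded form of the generalized cross-entropy limits the influence of any single, possibly mislabeled, image.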