PLoS Neglected Tropical Diseases (Dec 2022)
Detection of trachoma using machine learning approaches.
Abstract
BackgroundThough significant progress in disease elimination has been made over the past decades, trachoma is the leading infectious cause of blindness globally. Further efforts in trachoma elimination are paradoxically being limited by the relative rarity of the disease, which makes clinical training for monitoring surveys difficult. In this work, we evaluate the plausibility of an Artificial Intelligence model to augment or replace human image graders in the evaluation/diagnosis of trachomatous inflammation-follicular (TF).MethodsWe utilized a dataset consisting of 2300 images with a 5% positivity rate for TF. We developed classifiers by implementing two state-of-the-art Convolutional Neural Network architectures, ResNet101 and VGG16, and applying a suite of data augmentation/oversampling techniques to the positive images. We then augmented our data set with additional images from independent research groups and evaluated performance.ResultsModels performed well in minimizing the number of false negatives, given the constraint of the low numbers of images in which TF was present. The best performing models achieved a sensitivity of 95% and positive predictive value of 50-70% while reducing the number images requiring skilled grading by 66-75%. Basic oversampling and data augmentation techniques were most successful at improving model performance, while techniques that are grounded in clinical experience, such as highlighting follicles, were less successful.DiscussionThe developed models perform well and significantly reduce the burden on graders by minimizing the number of false negative identifications. Further improvements in model skill will benefit from data sets with more TF as well as a range in image quality and image capture techniques used. While these models approach/meet the community-accepted standard for skilled field graders (i.e., Cohen's Kappa >0.7), they are insufficient to be deployed independently/clinically at this time; rather, they can be utilized to significantly reduce the burden on skilled image graders.