Towards population-independent, multi-disease detection in fundus photographs

Sarah Matta; Mathieu Lamard; Pierre-Henri Conze; Alexandre Le Guilcher; Clément Lecat; Romuald Carette; Fabien Basset; Pascale Massin; Jean-Bernard Rottier; Béatrice Cochener; Gwenolé Quellec

doi:10.1038/s41598-023-38610-y

Scientific Reports (Jul 2023)

Affiliations

Sarah Matta: Université de Bretagne Occidentale
Mathieu Lamard: Université de Bretagne Occidentale
Pierre-Henri Conze: INSERM, UMR 1101
Alexandre Le Guilcher: Evolucare Technologies
Clément Lecat: Evolucare Technologies
Romuald Carette: Evolucare Technologies
Fabien Basset: Evolucare Technologies
Pascale Massin: Service d’Ophtalmologie, Hôpital Lariboisière, APHP
Jean-Bernard Rottier: Bâtiment de consultation porte 14 Pôle Santé Sud CMCM
Béatrice Cochener: Université de Bretagne Occidentale
Gwenolé Quellec: INSERM, UMR 1101

DOI: https://doi.org/10.1038/s41598-023-38610-y
Journal volume & issue: Vol. 13, no. 1
pp. 1 – 11

Abstract

Independent validation studies of automatic diabetic retinopathy screening systems have recently shown a drop in screening performance on external data. Beyond diabetic retinopathy, this study investigates the generalizability of deep learning (DL) algorithms for screening various ocular anomalies in fundus photographs, across heterogeneous populations and imaging protocols. Four datasets are considered: OPHDIAT (France, diabetic population), OphtaMaine (France, general population), RIADD (India, general population) and ODIR (China, general population). Two multi-disease DL algorithms were developed: a Single-Dataset (SD) network, trained on the largest dataset (OPHDIAT), and a Multiple-Dataset (MD) network, trained on multiple datasets simultaneously. To assess their generalizability, both algorithms were evaluated in two settings: when training and test data originate from overlapping datasets, and when they originate from disjoint datasets. The SD network achieved a mean per-disease area under the receiver operating characteristic curve (mAUC) of 0.9571 on OPHDIAT. However, it generalized poorly to the other three datasets (mAUC < 0.9). When all four datasets were involved in training, the MD network significantly outperformed the SD network (p = 0.0058), indicating improved generality. However, in leave-one-dataset-out experiments, the performance of the MD network was significantly lower on populations unseen during training than on populations involved in training (p < 0.0001), indicating imperfect generalizability.
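
As an illustration of the two evaluation notions used in the abstract, the following minimal Python sketch computes a mean per-disease AUC (mAUC) and enumerates leave-one-dataset-out splits. It is not the authors' code: the function names (`auc`, `mean_auc`, `leave_one_dataset_out`) are hypothetical, the AUC is estimated with the standard pairwise (Mann-Whitney) formulation, and the paper may use a different estimator or averaging scheme.

```python
from typing import Dict, List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mean_auc(
    labels_by_disease: Dict[str, Sequence[int]],
    scores_by_disease: Dict[str, Sequence[float]],
) -> float:
    """Unweighted mean of the per-disease AUCs (assumed definition of mAUC)."""
    aucs = [
        auc(labels_by_disease[d], scores_by_disease[d])
        for d in labels_by_disease
    ]
    return sum(aucs) / len(aucs)


def leave_one_dataset_out(datasets: List[str]) -> List[Tuple[List[str], str]]:
    """Return (training datasets, held-out test dataset) for each dataset."""
    return [
        ([d for d in datasets if d != held_out], held_out)
        for held_out in datasets
    ]
```

Calling `leave_one_dataset_out(["OPHDIAT", "OphtaMaine", "RIADD", "ODIR"])` yields four splits, each training on three datasets and testing on the fourth. Comparing `mean_auc` on held-out datasets against datasets seen during training mirrors the comparison described in the abstract.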

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/