JMIR Medical Informatics (Aug 2021)

Gender Prediction for a Multiethnic Population via Deep Learning Across Different Retinal Fundus Photograph Fields: Retrospective Cross-sectional Study

  • Bjorn Kaijun Betzler,
  • Henrik Hee Seung Yang,
  • Sahil Thakur,
  • Marco Yu,
  • Ten Cheer Quek,
  • Zhi Da Soh,
  • Geunyoung Lee,
  • Yih-Chung Tham,
  • Tien Yin Wong,
  • Tyler Hyungtaek Rim,
  • Ching-Yu Cheng

DOI
https://doi.org/10.2196/25165
Journal volume & issue
Vol. 9, no. 8
p. e25165

Abstract

Read online

BackgroundDeep learning algorithms have been built for the detection of systemic and eye diseases based on fundus photographs. The retina possesses features that can be affected by gender differences, and the extent to which these features are captured via photography differs depending on the retinal image field. ObjectiveWe aimed to compare deep learning algorithms’ performance in predicting gender based on different fields of fundus photographs (optic disc–centered, macula-centered, and peripheral fields). MethodsThis retrospective cross-sectional study included 172,170 fundus photographs of 9956 adults aged ≥40 years from the Singapore Epidemiology of Eye Diseases Study. Optic disc–centered, macula-centered, and peripheral field fundus images were included in this study as input data for a deep learning model for gender prediction. Performance was estimated at the individual level and image level. Receiver operating characteristic curves for binary classification were calculated. ResultsThe deep learning algorithms predicted gender with an area under the receiver operating characteristic curve (AUC) of 0.94 at the individual level and an AUC of 0.87 at the image level. Across the three image field types, the best performance was seen when using optic disc–centered field images (younger subgroups: AUC=0.91; older subgroups: AUC=0.86), and algorithms that used peripheral field images had the lowest performance (younger subgroups: AUC=0.85; older subgroups: AUC=0.76). Across the three ethnic subgroups, algorithm performance was lowest in the Indian subgroup (AUC=0.88) compared to that in the Malay (AUC=0.91) and Chinese (AUC=0.91) subgroups when the algorithms were tested on optic disc–centered images. Algorithms’ performance in gender prediction at the image level was better in younger subgroups (aged <65 years; AUC=0.89) than in older subgroups (aged ≥65 years; AUC=0.82). ConclusionsWe confirmed that gender among the Asian population can be predicted with fundus photographs by using deep learning, and our algorithms’ performance in terms of gender prediction differed according to the field of fundus photographs, age subgroups, and ethnic groups. Our work provides a further understanding of using deep learning models for the prediction of gender-related diseases. Further validation of our findings is still needed.