npj Digital Medicine (Jun 2024)

Development and validation of a smartphone-based deep-learning-enabled system to detect middle-ear conditions in otoscopic images

  • Constance Dubois,
  • David Eigen,
  • François Simon,
  • Vincent Couloigner,
  • Michael Gormish,
  • Martin Chalumeau,
  • Laurent Schmoll,
  • Jérémie F. Cohen

DOI
https://doi.org/10.1038/s41746-024-01159-9
Journal volume & issue
Vol. 7, no. 1
pp. 1 – 10

Abstract

Middle-ear conditions are common causes of primary care visits, hearing impairment, and inappropriate antibiotic use. Deep learning (DL) may assist clinicians in interpreting otoscopic images. This study included patients over 5 years old from an ambulatory ENT practice in Strasbourg, France, between 2013 and 2020. Digital otoscopic images were obtained using a smartphone-attached otoscope (Smart Scope, Karl Storz, Germany) and labeled by a senior ENT specialist across 11 diagnostic classes (reference standard). An Inception-v2 DL model was trained using 41,664 otoscopic images, and its diagnostic accuracy was evaluated by calculating class-specific estimates of sensitivity and specificity. The model was then incorporated into a smartphone app called i-Nside. The DL model was evaluated on a validation set of 3,962 images and a held-out test set comprising 326 images. On the validation set, all class-specific estimates of sensitivity and specificity exceeded 98%. On the test set, the DL model achieved a sensitivity of 99.0% (95% confidence interval: 94.5–100) and a specificity of 95.2% (91.5–97.6) for the binary classification of normal vs. abnormal images; wax plugs were detected with a sensitivity of 100% (94.6–100) and specificity of 97.7% (95.0–99.1); other class-specific estimates of sensitivity and specificity ranged from 33.3% to 92.3% and 96.0% to 100%, respectively. We present an end-to-end DL-enabled system able to achieve expert-level diagnostic accuracy for identifying normal tympanic aspects and wax plugs within digital otoscopic images. However, the system’s performance varied for other middle-ear conditions. Further prospective validation is necessary before wider clinical deployment.
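
As an illustration of how class-specific sensitivity and specificity with 95% confidence intervals can be computed, the following is a minimal sketch in Python. It treats one class as "positive" and all other classes as "negative" (one-vs-rest). The counts are invented for illustration and are not taken from the study. The Wilson score interval is used here as an assumption; the abstract does not state which interval method the authors applied, so the bounds it produces need not match the reported ones.

```python
import math


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from one-vs-rest confusion counts."""
    sensitivity = tp / (tp + fn)  # share of true positives among all positives
    specificity = tn / (tn + fp)  # share of true negatives among all negatives
    return sensitivity, specificity


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z=1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the edges.
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts for a single diagnostic class (NOT study data).
tp, fn, tn, fp = 90, 10, 180, 20

sens, spec = sensitivity_specificity(tp, fn, tn, fp)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)

print(f"sensitivity = {sens:.3f}, 95% CI = ({sens_ci[0]:.3f}, {sens_ci[1]:.3f})")
print(f"specificity = {spec:.3f}, 95% CI = ({spec_ci[0]:.3f}, {spec_ci[1]:.3f})")
```

The width of the interval depends on the number of images in the class, so an estimate for a class with only a few test images will be far less precise than one for a class with many.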