Frontiers in Pediatrics (Jan 2020)

Using Deep Learning Algorithms to Grade Hydronephrosis Severity: Toward a Clinical Adjunct

  • Lauren C. Smail,
  • Kiret Dhindsa,
  • Luis H. Braga,
  • Suzanna Becker,
  • Ranil R. Sonnadara

DOI
https://doi.org/10.3389/fped.2020.00001
Journal volume & issue
Vol. 8

Abstract

Grading hydronephrosis severity relies on subjective interpretation of renal ultrasound images. Deep learning is a data-driven algorithmic approach to classifying data, including images, and is a promising option for grading hydronephrosis. The current study explored the potential of deep convolutional neural networks (CNNs), a type of deep learning algorithm, to grade hydronephrosis ultrasound images according to the 5-point Society for Fetal Urology (SFU) classification system, and discussed their potential applications in developing decision and teaching aids for clinical practice. We developed a five-layer CNN to grade 2,420 sagittal hydronephrosis ultrasound images [191 SFU 0 (8%), 407 SFU I (17%), 666 SFU II (28%), 833 SFU III (34%), and 323 SFU IV (13%)] from 673 patients ranging from 0 to 116.29 months old (mean age = 16.53 months, SD = 17.80). Five-way (all grades) and two-way classification problems [i.e., II vs. III, and low (0–II) vs. high (III–IV)] were explored. The CNN classified 94% (95% CI, 93–95%) of the images correctly or within one grade of the provided label in the five-way classification problem. Fifty-one percent of these images (95% CI, 49–53%) were correctly predicted, with an average weighted F1 score of 0.49 (95% CI, 0.47–0.51). The CNN achieved an average accuracy of 78% (95% CI, 75–82%) with an average weighted F1 score of 0.78 (95% CI, 0.74–0.82) when classifying low vs. high grades, and an average accuracy of 71% (95% CI, 68–74%) with an average weighted F1 score of 0.71 (95% CI, 0.68–0.75) when discriminating between grades II and III. Our model performs well above chance level and classifies almost all images either correctly or within one grade of the provided label. We have demonstrated the applicability of a CNN approach to hydronephrosis ultrasound image classification. Further investigation into a deep learning-based clinical adjunct for hydronephrosis is warranted.
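
The abstract reports three kinds of figures: exact accuracy, accuracy within one grade of the provided label, and a weighted F1 score. The following is a minimal sketch of how such metrics can be computed, assuming SFU grades are encoded as the integers 0 to 4 and that the weighted F1 is the support-weighted mean of per-class F1 scores. It is not the authors' evaluation code; the function names and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def exact_accuracy(y_true, y_pred):
    """Fraction of predictions that match the provided label exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def within_one_accuracy(y_true, y_pred):
    """Fraction of predictions that are correct or off by one grade."""
    return sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


def to_low_high(grades):
    """Collapse SFU grades into low (0-II -> 0) vs. high (III-IV -> 1)."""
    return [int(g >= 3) for g in grades]


# Hypothetical toy labels (assumption: not data from the study).
y_true = [0, 1, 2, 3, 4, 2, 3, 3]
y_pred = [0, 2, 2, 3, 3, 3, 3, 1]

print(exact_accuracy(y_true, y_pred))
print(within_one_accuracy(y_true, y_pred))
print(weighted_f1(y_true, y_pred))
print(exact_accuracy(to_low_high(y_true), to_low_high(y_pred)))

# Class counts as reported in the abstract; they sum to the 2,420 images.
sfu_counts = {0: 191, 1: 407, 2: 666, 3: 833, 4: 323}
print(sum(sfu_counts.values()))
```

Because the SFU grades are ordinal, the within-one-grade figure treats a prediction of III for a grade II image as a near miss rather than a full error, which is why it can sit far above the exact accuracy on the same predictions.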

Keywords