Ultrasonography (Jan 2024)

Feasibility of a deep learning artificial intelligence model for the diagnosis of pediatric ileocolic intussusception with grayscale ultrasonography

  • Se Woo Kim,
  • Jung-Eun Cheon,
  • Young Hun Choi,
  • Jae-Yeon Hwang,
  • Su-Mi Shin,
  • Yeon Jin Cho,
  • Seunghyun Lee,
  • Seul Bi Lee

DOI
https://doi.org/10.14366/usg.23153
Journal volume & issue
Vol. 43, no. 1
pp. 57–67

Abstract

Purpose: This study explored the feasibility of utilizing a deep learning artificial intelligence (AI) model to detect ileocolic intussusception on grayscale ultrasound images.

Methods: This retrospective observational study incorporated ultrasound images of children who underwent emergency ultrasonography for suspected ileocolic intussusception. After excluding video clips, Doppler images, and annotated images, 40,765 images from two tertiary hospitals were included (positive-to-negative ratio: hospital A, 2,775:35,373; hospital B, 140:2,477). Images from hospital A were split into a training set, a tuning set, and an internal test set (ITS) at a ratio of 7:1.5:1.5. Images from hospital B comprised an external test set (ETS). For each image indicating intussusception, two radiologists provided a bounding box as the ground-truth label. If intussusception was suspected in the input image, the model generated a bounding box with a confidence score (0 to 1) at the estimated lesion location. Average precision (AP) was used to evaluate overall model performance. The performance of practical thresholds for the model-generated confidence score, as determined from the ITS, was verified using the ETS.

Results: The AP values for the ITS and ETS were 0.952 and 0.936, respectively. Two confidence thresholds, CTopt and CTprecision, were set at 0.557 and 0.790, respectively. For the ETS, the per-image precision and recall were 95.7% and 80.0% with CTopt, and 98.4% and 44.3% with CTprecision. For per-patient diagnosis, the sensitivity and specificity were 100.0% and 97.1% with CTopt, and 100.0% and 99.0% with CTprecision. The average number of false positives per patient was 0.04 with CTopt and 0.01 with CTprecision.

Conclusion: The feasibility of using an AI model to diagnose ileocolic intussusception on ultrasonography was demonstrated. However, further study involving bias-free data is warranted for robust clinical validation.
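
The relationship between the confidence thresholds and the per-image and per-patient figures reported above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes each image is reduced to its highest model confidence score, that an image counts as a positive prediction when that score meets the threshold (bounding-box overlap with the ground truth is ignored), and that a patient is called positive when any of their images is predicted positive. The abstract does not state these rules, and the function names and toy records are hypothetical.

```python
from collections import defaultdict

# Thresholds reported in the abstract.
CT_OPT = 0.557
CT_PRECISION = 0.790

# Hypothetical toy data: (patient_id, image_shows_intussusception, max_confidence).
records = [
    ("p1", True, 0.91),
    ("p1", True, 0.60),
    ("p1", False, 0.10),
    ("p2", False, 0.05),
    ("p2", False, 0.58),
    ("p3", False, 0.02),
]


def per_image_metrics(records, threshold):
    """Return (precision, recall) over images at the given threshold."""
    tp = fp = fn = 0
    for _patient_id, has_lesion, score in records:
        predicted = score >= threshold
        if predicted and has_lesion:
            tp += 1
        elif predicted:
            fp += 1
        elif has_lesion:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def per_patient_metrics(records, threshold):
    """Return (sensitivity, specificity) over patients.

    Assumption: a patient is positive if any image is positive,
    for both the ground truth and the prediction.
    """
    truth = defaultdict(bool)
    pred = defaultdict(bool)
    for patient_id, has_lesion, score in records:
        truth[patient_id] |= has_lesion
        pred[patient_id] |= score >= threshold
    positives = sum(truth.values())
    negatives = len(truth) - positives
    tp = sum(truth[p] and pred[p] for p in truth)
    tn = sum(not truth[p] and not pred[p] for p in truth)
    sensitivity = tp / positives if positives else 0.0
    specificity = tn / negatives if negatives else 0.0
    return sensitivity, specificity


for name, threshold in (("CTopt", CT_OPT), ("CTprecision", CT_PRECISION)):
    print(name, per_image_metrics(records, threshold),
          per_patient_metrics(records, threshold))
```

On the toy records, raising the threshold lowers per-image recall while per-patient sensitivity is unchanged, because a single confident image is enough to flag the patient under the assumed aggregation rule. This mirrors the direction of the trade-off reported for the ETS, though the toy numbers carry no clinical meaning.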

Keywords