Journal of Orthopaedic Surgery and Research (Jun 2023)

Development and clinical validation of deep learning for auto-diagnosis of supraspinatus tears

  • Deming Guo,
  • Xiaoning Liu,
  • Dawei Wang,
  • Xiongfeng Tang,
  • Yanguo Qin

DOI
https://doi.org/10.1186/s13018-023-03909-z
Journal volume & issue
Vol. 18, no. 1
pp. 1 – 12

Abstract

Background Accurately diagnosing supraspinatus tears on magnetic resonance imaging (MRI) is challenging and time-consuming because experience levels vary among musculoskeletal radiologists and orthopedic surgeons. We developed a deep learning-based model for automatically diagnosing supraspinatus tears (STs) from shoulder MRI and validated its feasibility in clinical practice.
Materials and methods A total of 701 shoulder MRI examinations (2804 images) were retrospectively collected for model training and internal testing. An additional 69 shoulder MRIs (276 images) were collected from patients who underwent shoulder arthroplasty and constituted the surgery test set for clinical validation. Two advanced convolutional neural networks (CNNs) based on Xception were trained and optimized to detect STs. The diagnostic performance of the CNNs was evaluated in terms of sensitivity, specificity, precision, accuracy, and F1 score. Subgroup analyses were performed to verify robustness, and we also compared the CNN's performance with that of 4 radiologists and 4 orthopedic surgeons on the surgery and internal test sets.
Results Optimal diagnostic performance was achieved with the 2D model, which yielded F1 scores of 0.824 and 0.75 and areas under the ROC curve of 0.921 (95% confidence interval, 0.841–1.000) and 0.882 (0.817–0.947) on the surgery and internal test sets, respectively. In the subgroup analysis, the 2D CNN model demonstrated sensitivities of 0.33–1.000 and 0.625–1.000 for different degrees of tears on the surgery and internal test sets, respectively, and there was no significant performance difference between 1.5 T and 3.0 T data. Compared with the eight clinicians, the 2D CNN model exhibited better diagnostic performance than the junior clinicians and was equivalent to the senior clinicians.
Conclusions The proposed 2D CNN model achieved adequate and efficient automatic diagnosis of STs, with performance comparable to that of junior musculoskeletal radiologists and orthopedic surgeons. It may help less experienced radiologists, especially in community settings that lack consulting experts.
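
The evaluation metrics named in the methods (sensitivity, specificity, precision, accuracy, and F1 score) all follow from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard binary diagnostic metrics from confusion-matrix counts.

    tp: tears correctly flagged      fp: intact tendons flagged as torn
    tn: intact tendons correctly cleared   fn: tears that were missed
    """
    sensitivity = safe_ratio(tp, tp + fn)   # recall on torn cases
    specificity = safe_ratio(tn, tn + fp)   # recall on intact cases
    precision = safe_ratio(tp, tp + fp)     # positive predictive value
    accuracy = safe_ratio(tp + tn, tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = safe_ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not the study's data).
    for name, value in diagnostic_metrics(tp=8, fp=2, tn=18, fn=2).items():
        print(f"{name}: {value:.3f}")
```

Because F1 combines precision and sensitivity, it is less affected than accuracy by an imbalance between torn and intact cases, which may be one reason it is reported alongside the area under the ROC curve.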

Keywords