Frontiers in Oncology (Nov 2022)

Deep learning for the diagnosis of suspicious thyroid nodules based on multimodal ultrasound images

  • Yi Tao,
  • Yanyan Yu,
  • Tong Wu,
  • Xiangli Xu,
  • Quan Dai,
  • Hanqing Kong,
  • Lei Zhang,
  • Weidong Yu,
  • Xiaoping Leng,
  • Weibao Qiu,
  • Jiawei Tian

DOI
https://doi.org/10.3389/fonc.2022.1012724
Journal volume & issue
Vol. 12

Abstract

Objectives: This study aimed to differentially diagnose thyroid nodules (TNs) of Thyroid Imaging Reporting and Data System (TI-RADS) categories 3–5 using a deep learning (DL) model based on multimodal ultrasound (US) images, and to explore its auxiliary role for radiologists with varying degrees of experience.

Methods: Preoperative multimodal US images of 1,138 TNs of TI-RADS categories 3–5, all confirmed by pathology, were randomly divided into a training set (n = 728), a validation set (n = 182), and a test set (n = 228) in a 4:1:1.25 ratio. Grayscale US (GSU), color Doppler flow imaging (CDFI), strain elastography (SE), and region of interest mask (Mask) images were acquired in both transverse and longitudinal sections. Fivefold cross-validation was used to evaluate the performance of the proposed DL model. The diagnostic performance of the mature DL model was compared with that of radiologists on the test set, and whether DL assistance could improve radiologists' diagnostic performance was assessed. Specificity, sensitivity, accuracy, positive predictive value, negative predictive value, and area under the receiver operating characteristic curve (AUC) were obtained.

Results: The AUCs of DL in the differentiation of TNs were 0.858 based on (GSU + SE), 0.909 based on (GSU + CDFI), 0.906 based on (GSU + CDFI + SE), and 0.881 based on (GSU + Mask), all superior to the AUC of 0.825 based on GSU alone (p = 0.014, p < 0.001, p < 0.001, and p = 0.002, respectively). The highest AUC of 0.928 was achieved by DL based on (G + C + E + M)US, the highest specificity of 89.5% was achieved by (G + C + E)US, and the highest accuracy of 86.2% and sensitivity of 86.9% were achieved by DL based on (G + C + M)US. With DL assistance, the AUC of junior radiologists increased from 0.720 to 0.796 (p < 0.001), which was slightly higher than that of senior radiologists without DL assistance (0.796 vs. 0.794, p > 0.05). Senior radiologists with DL assistance exhibited higher accuracy than DL based on GSU alone (83.4% vs. 78.9%, p = 0.041) and a comparable AUC (0.822 vs. 0.825, p = 0.512). However, the AUC of DL based on multimodal US images was significantly higher than that of visual diagnosis by radiologists (p < 0.05).

Conclusion: The DL models based on multimodal US images showed exceptional performance in the differential diagnosis of suspicious TNs, effectively increased the diagnostic efficacy of TN evaluations by junior radiologists, and provided an objective assessment for the subsequent clinical and surgical management phases.
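
For readers less familiar with the performance measures named in the Methods, the following is a minimal Python sketch of how sensitivity, specificity, accuracy, positive and negative predictive value, and AUC can be computed from binary labels and model scores. It is not the authors' code. The function names `diagnostic_metrics` and `auc`, the label coding (1 = malignant, 0 = benign), and the 0.5 decision threshold are assumptions made for illustration, and the toy values are invented rather than taken from the study.

```python
from typing import Dict, Sequence


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix metrics for binary labels (assumed: 1 = malignant, 0 = benign).

    Assumes both classes occur in y_true and in y_pred, so no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


# Invented toy data, for illustration only (not from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed threshold of 0.5

toy_metrics = diagnostic_metrics(toy_labels, toy_preds)
toy_auc = auc(toy_labels, toy_scores)
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for a test set of a few hundred nodules; a library routine would normally be preferred for larger data.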

Keywords