Scientific Reports (May 2024)

Differential diagnosis of thyroid nodules using heterogeneity quantification software on ultrasound images: correlation with the Bethesda system and surgical pathology

  • Young Jae Ryu,
  • Jin Woong Kim,
  • Sang Chun Park,
  • Young Hoe Hur,
  • Hyung Joong Kim,
  • Tae-Hoon Kim

DOI
https://doi.org/10.1038/s41598-024-60881-2
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 10

Abstract

Ultrasonography (US)-guided fine-needle aspiration cytology (FNAC) is the primary modality for evaluating thyroid nodules. However, in cases of atypia of undetermined significance (AUS) or follicular lesion of undetermined significance (FLUS), supplemental tests are necessary for a definitive diagnosis. Accordingly, we aimed to develop non-invasive quantification software based on the heterogeneity scores of thyroid nodules. This cross-sectional study retrospectively enrolled 188 patients, who were categorized into four groups according to their diagnostic classification in the Bethesda system and surgical pathology [II-benign (B) (n = 24); III-B (n = 52); III-malignant (M) (n = 54); V/VI-M (n = 58)]. Heterogeneity scores were derived from an image pixel-based heterogeneity index, expressed as a coefficient of variation (CV) value, and analyzed across all US images. Differences in heterogeneity scores were compared using one-way analysis of variance with Tukey’s test. Diagnostic accuracy was assessed by calculating the area under the receiver operating characteristic (AUROC) curve. Mean heterogeneity scores differed significantly between benign and malignant thyroid nodules; the comparison between III-M and V/VI-M nodules, both malignant, showed no significant difference. Among malignant nodules, the Bethesda classification was not associated with mean heterogeneity scores. Moreover, heterogeneity scores correlated positively with the combined diagnostic category, which was based on the Bethesda system and surgical cytology grades (R = 0.639, p < 0.001). The AUROC for heterogeneity scores was highest (0.818; cut-off: 30.22% CV value) for differentiating the benign group (normal/II-B/III-B) from the malignant group (III-M/V/VI-M), with a diagnostic accuracy of 72.5% (161/222).
Quantitative heterogeneity measurement of US images is a valuable non-invasive diagnostic tool for predicting the likelihood of malignancy in thyroid nodules, including AUS or FLUS.
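
As a rough illustration of the kind of score described above, the following minimal sketch computes a coefficient of variation (standard deviation divided by mean, as a percentage) over the pixel intensities of a region of interest and compares it with the 30.22% cut-off reported in the abstract. It is not the authors' software. The function names, the use of the population standard deviation, and the assumption that scores at or above the cut-off point toward malignancy are illustrative choices, not details given in the abstract.

```python
from statistics import mean, pstdev

# Cut-off reported in the abstract (CV value, in percent).
CUTOFF_CV_PERCENT = 30.22


def heterogeneity_score(pixels):
    """Return the coefficient of variation (%) of a 2D grid of pixel intensities.

    Hypothetical helper: the abstract does not say how the region of interest
    is selected or which standard deviation estimator is used.
    """
    values = [float(p) for row in pixels for p in row]
    mu = mean(values)
    if mu == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    # Population standard deviation is an assumption made for this sketch.
    return 100.0 * pstdev(values) / mu


def exceeds_cutoff(score, cutoff=CUTOFF_CV_PERCENT):
    """Assumption: scores at or above the cut-off suggest malignancy."""
    return score >= cutoff


# Toy regions of interest (made-up intensities, not study data).
uniform_roi = [[100, 100], [100, 100]]
mixed_roi = [[50, 150], [50, 150]]

uniform_score = heterogeneity_score(uniform_roi)
mixed_score = heterogeneity_score(mixed_roi)
```

A perfectly uniform region has a CV of zero, whereas the mixed region above has a standard deviation equal to half its mean and therefore falls above the cut-off under the stated assumptions.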