Health Sciences Review (Jun 2023)

Role of machine learning in differentiating benign from malignant indeterminate thyroid nodules: A literature review

  • Julian M. Conn Busch,
  • Joseph L. Cozzi,
  • Hui Li,
  • Li Lan,
  • Maryellen L. Giger,
  • Xavier M. Keutgen

Journal volume & issue
Vol. 7
p. 100089

Abstract

Rationale and Objectives: Thyroid nodules are common, and up to 30% are indeterminate on fine-needle aspiration (FNA) biopsy. Use of machine learning (ML) algorithms as computer aids, i.e., computer-aided diagnosis (CADx), may provide accessible, accurate methods of diagnosing thyroid cancer on ultrasound (US) images of thyroid nodules. We conducted a systematic review to synthesize the current knowledge and potential of ML in assisting physicians in diagnosing indeterminate thyroid nodules.

Materials and Methods: We searched PubMed for studies without a year constraint. Inclusion criteria were as follows: (1) publication of the study as a full-text manuscript in a peer-reviewed journal, and (2) original work (e.g., not a review). Studies were excluded if they (1) did not perform FNA biopsy, (2) excluded all indeterminate lesions, (3) focused on a non-cancer thyroid pathology, (4) described only non-diagnostic uses of their model(s) (e.g., calcification quantification), or (5) did not use grayscale ultrasound. We used the Joanna Briggs Institute Appraisal Checklist for Diagnostic Test Accuracy Studies to synthesize and present quality data and PRISMA guidelines for reporting.

Results: Our search yielded only 6 articles, published between 2013 and 2022. All studies used either FNA biopsy alone or a combination of FNA and surgery as the reference standard. Of these, one study included only indeterminate nodules, while the other 5 also included nodules in other categories. Results varied greatly, and studies were inconsistent in which performance metrics they reported. Physician performance yielded areas under the ROC curve (AUCs) ranging from 0.68 to 0.83, while ML models gave AUCs from 0.666 to 0.954. One study showed that its model had a greater AUC than physicians, while another study showed no difference between the AUCs of its ML model and physicians. Another study demonstrated that its ML-assisted approach had a greater AUC (0.917) than either an American College of Radiology (ACR) Thyroid Imaging Reporting & Data System (TI-RADS)-only approach (AUC 0.68) or an ML-only approach (AUC 0.77). One study measured the performance of ML models without physician comparison, with AUCs ranging from 0.85 to 0.90. One study found no difference between its ML model and molecular testing (AUC 0.88 and 0.81, respectively).

Conclusion: The growing field of CADx and the increasing availability of commercial models suggest increasing interest in using ML-assisted diagnosis in the clinical setting. However, there is a significant lack of research on the role of ML in the diagnosis of indeterminate thyroid nodules.

Funding: The source of funding for this review was institutional funds from the University of Chicago.

Keywords