Frontiers in Oncology (Dec 2022)

A machine learning-based approach to predicting the malignant and metastasis of thyroid cancer

  • Jianhua Gu,
  • Rongli Xie,
  • Yanna Zhao,
  • Zhifeng Zhao,
  • Dan Xu,
  • Min Ding,
  • Tingyu Lin,
  • Wenjuan Xu,
  • Zihuai Nie,
  • Enjun Miao,
  • Dan Tan,
  • Sibo Zhu,
  • Dongjie Shen,
  • Jian Fei

DOI
https://doi.org/10.3389/fonc.2022.938292
Journal volume & issue
Vol. 12

Abstract

Background: Thyroid cancer (TC) is the most common malignant disease of the endocrine system, and its incidence is increasing year by year. Early diagnosis, management of malignant nodules, and scientific treatment are crucial for TC prognosis. The first aim of this study is to construct a classification model for TC based on risk factors. The second aim is to construct a prediction model for metastasis based on risk factors.

Methods: We retrospectively collected approximately 70 preoperative demographic and laboratory test indices from 1735 TC patients. Machine learning pipelines, including ridge regression (a linear regression model), logistic regression (LR), and eXtreme Gradient Boosting (XGBoost), were compared to select the best model for predicting deterioration and metastasis of TC. A comprehensive comparative analysis was performed against a prediction model using only the Thyroid Imaging Reporting and Data System (TI-RADS).

Results: The XGBoost model achieved the best performance in the final thyroid nodule diagnosis (AUC: 0.84) and metastasis (AUC: 0.72-0.77) predictions. Its AUCs for predicting Grade 4 TC deterioration and metastasis reached 0.84 and 0.97, respectively, while none of the AUCs for the TI-RADS-only model reached 0.70. Based on multivariate analysis and feature selection, age, obesity, prothrombin time, fibrinogen, and HBeAb were common significant risk factors for tumor progression and metastasis. Monocyte, D-dimer, T3, FT3, and albumin were common protective factors. Tumor size (11.14 ± 7.14 mm) was the most important indicator of metastasis formation. In addition, GGT, glucose, platelet volume distribution width, and neutrophil percentage also contributed to the development of metastases. Abnormal levels of blood lipids and uric acid were closely related to tumor deterioration. The dual role of mean erythrocytic hemoglobin concentration in TC needs to be verified in a larger patient cohort. We have established a free online tool (http://www.cancer-thyroid.com/) that is available to all clinicians for the prognosis of patients at high risk of TC.

Conclusion: It is feasible to use the XGBoost algorithm, combined with preoperative laboratory test indices and demographic characteristics, to predict tumor progression and metastasis in patients with TC, and its performance is better than that of using TI-RADS alone. The web tool we developed can help less experienced physicians make appropriate clinical decisions or obtain secondary confirmation of diagnostic results.
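
The abstract compares models by AUC (area under the ROC curve). As a rough illustration of what that metric measures, the sketch below computes AUC as the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case. It is not the authors' pipeline: the function name `roc_auc`, the labels, and both score lists are hypothetical and invented for illustration only. They are not data or results from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win
    (the Mann-Whitney U formulation of AUC).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, NOT taken from the study: 1 = metastasis, 0 = none.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
# Made-up risk scores from two imaginary models (assumed names).
full_model_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.5, 0.1, 0.7]
tirads_only_scores = [0.6, 0.4, 0.4, 0.5, 0.2, 0.6, 0.3, 0.5]

auc_full = roc_auc(labels, full_model_scores)
auc_tirads = roc_auc(labels, tirads_only_scores)
print(f"full model AUC:   {auc_full:.3f}")
print(f"TI-RADS-only AUC: {auc_tirads:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values (for example, 0.84 for XGBoost versus below 0.70 for TI-RADS alone) should be read. The pairwise loop is quadratic in the number of cases and is meant for clarity; a rank-based implementation would be preferable on a cohort of realistic size.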

Keywords