International Journal of Endocrinology (Jan 2023)
Machine Learning for Predicting Distant Metastasis of Medullary Thyroid Carcinoma Using the SEER Database
Abstract
Objectives. We aimed to establish an effective machine learning (ML) model for predicting the risk of distant metastasis (DM) in medullary thyroid carcinoma (MTC). Methods. Demographic and clinicopathological data of patients with MTC from 2004 to 2015 were extracted from the Surveillance, Epidemiology, and End Results (SEER) database of the National Institutes of Health and used to develop six ML models. The models were evaluated on accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC). The association between clinicopathological characteristics and the target variable (DM) was interpreted. For comparison, analyses were also performed using traditional logistic regression (LR). Results. In total, 2049 patients were included, of whom 138 (6.7%) developed DM. Multivariable LR showed that age, sex, tumor size, extrathyroidal extension, and lymph node metastasis were predictive features for DM in MTC. Among the six ML models, the random forest (RF) performed best in predicting the risk of DM in MTC, with accuracy, precision, recall, F1-score, and AUC all higher than those of the traditional binary LR model. Conclusion. RF was superior to traditional LR in predicting the risk of DM in MTC and can provide a valuable reference for clinicians in decision-making.
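The five evaluation metrics named in the Methods can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `classification_metrics` and `roc_auc` are hypothetical, and the "always predict no DM" baseline is an assumption introduced only for illustration. The only figures taken from the abstract are the cohort counts (138 DM cases among 2049 patients). The sketch shows why accuracy alone is a weak yardstick on such an imbalanced cohort: the trivial baseline is right for about 93% of patients while detecting no metastasis at all, which is presumably why precision, recall, F1-score, and AUC are reported alongside it.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = DM, 0 = no DM)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    # Convention used here: an undefined ratio (zero denominator) is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored above a
    random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Cohort counts from the abstract: 138 patients with DM out of 2049.
y_true = [1] * 138 + [0] * (2049 - 138)

# Hypothetical baseline (an assumption, not a model from the study):
# predict "no DM" for every patient.
baseline_pred = [0] * len(y_true)
baseline = classification_metrics(y_true, baseline_pred)
print(baseline)

# Tiny made-up score example to show the AUC calculation itself.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)
```

The pairwise AUC used here is quadratic in the number of patients, which is acceptable at this cohort size; a rank-based formula or a library routine would be the usual choice for larger datasets.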