International Journal of General Medicine (Nov 2021)

Risk Assessment of Pulmonary Metastasis for Cervical Cancer Patients by Ensemble Learning Models: A Large Population Based Real-World Study

  • Zhu M,
  • Wang B,
  • Wang T,
  • Chen Y,
  • He D

Journal volume & issue
Vol. 14
pp. 8713 – 8723

Abstract

Menglin Zhu,1,* Bo Wang,2,* Tiejun Wang,3 Yilin Chen,1,4 Du He1,5

1Department of Anesthesiology, Hubei Minzu University Affiliated Enshi Clinical Medical School, The Central Hospital of Enshi Tujia and Miao Autonomous Prefecture, Enshi, Hubei, 445000, People’s Republic of China
2National Clinical Research Center for Obstetrical and Gynecological Diseases; Key Laboratory of Cancer Invasion and Metastasis, Ministry of Education; Department of Obstetrics and Gynecology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, People’s Republic of China
3Department of Oncology, Hubei Cancer Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, People’s Republic of China
4Department of Pulmonary and Critical Care Medicine, Hubei Minzu University Affiliated Enshi Clinical Medical School, The Central Hospital of Enshi Tujia and Miao Autonomous Prefecture, Enshi, Hubei, 445000, People’s Republic of China
5Department of Oncology, Hubei Minzu University Affiliated Enshi Clinical Medical School, The Central Hospital of Enshi Tujia and Miao Autonomous Prefecture, Enshi, Hubei, 445000, People’s Republic of China

*These authors contributed equally to this work

Correspondence: Du He; Yilin Chen
Email [email protected]; [email protected]

Pulmonary metastasis (PM) is an independent risk factor affecting the prognosis of cervical cancer patients, but tools for predicting it are still lacking. This study aimed to develop machine learning-based predictive models for PM.

Methods: A total of 22,766 patients, with or without PM, from the Surveillance, Epidemiology, and End Results (SEER) database were enrolled in this study. The cohort was randomly split into a training set (70%) and a validation set (30%). In addition, 884 Chinese patients from two tertiary medical centers were included as an external validation set. Duplicate and uninformative candidate variables were excluded, and sixteen variables were entered into the machine learning algorithms. We developed five predictive models: the generalized linear model (GLM), random forest model (RFM), naive Bayesian model (NBM), artificial neural network model (ANNM), and decision tree model (DTM). The predictive performance of these models was evaluated with the receiver operating characteristic (ROC) curve and the calibration curve. The Cox proportional hazard model (CPHM) and competing risk model (CRM) were also included for survival outcome prediction.

Results: Of the patients included in the analysis, 2456 (4.38%) were diagnosed with PM. Age, organ-site metastasis (liver, bone, brain), distant lymph metastasis, tumor size, and pathology were the important predictors of PM. The RFM built with 9 variables was identified as the best predictive model for PM (AUC = 0.972, 95% CI: 0.958–0.986). The C-index for the CPHM and CRM was 0.626 (95% CI: 0.604–0.648) and 0.611 (95% CI: 0.586–0.636), respectively.

Conclusion: The prediction algorithm derived by machine-learning-based methods shows a robust ability to predict PM. This result suggests that machine learning techniques have the potential to improve the development and validation of predictive modeling in cervical cancer patients with PM.

Keywords: cervical cancer, pulmonary metastasis, machine learning, predictive model, prognosis, SEER database
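For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch, not the authors' code, of a random 70/30 split, the area under the ROC curve (AUC), and a Harrell-type concordance index (C-index). It uses only the Python standard library. The function names and the toy values are hypothetical and chosen for illustration; they are not taken from the study or from the SEER data.

```python
import random


def train_validation_split(records, train_fraction=0.7, seed=0):
    """Randomly split records into a training part and a validation part."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC: probability that a random positive case receives a higher score
    than a random negative case (ties count one half). Computed from the
    rank sum of the positive cases (Mann-Whitney U statistic)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied scores so they share an average rank.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def concordance_index(times, events, risks):
    """Harrell-type C-index: among comparable pairs (the patient with the
    shorter follow-up time had the event), the fraction in which that
    patient also has the higher predicted risk (ties count one half)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    train, valid = train_validation_split(range(10))
    print(len(train), len(valid))
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    print(concordance_index([1, 2, 3], [1, 1, 0], [0.9, 0.5, 0.1]))
```

An AUC or C-index of 0.5 corresponds to chance-level discrimination and 1.0 to perfect discrimination, which is the scale on which the values reported in the abstract are read.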

Keywords