BMC Bioinformatics (Jun 2021)

Machine learning-based prediction of survival prognosis in cervical cancer

  • Dongyan Ding,
  • Tingyuan Lang,
  • Dongling Zou,
  • Jiawei Tan,
  • Jia Chen,
  • Lei Zhou,
  • Dong Wang,
  • Rong Li,
  • Yunzhe Li,
  • Jingshu Liu,
  • Cui Ma,
  • Qi Zhou

DOI
https://doi.org/10.1186/s12859-021-04261-x
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 17

Abstract

Background: Accurate prognosis forecasting could improve cervical cancer management; however, the clinical features currently in use do not provide enough information. The aim of this study is to improve forecasting capability by developing a miRNA-based machine learning survival prediction model.

Results: The expression characteristics of miRNAs were chosen as features for model development. Cervical cancer miRNA expression data were obtained from The Cancer Genome Atlas database. Preprocessing was performed, including removal of unquantified data, missing value imputation, sample normalization, log transformation, and feature scaling. In total, 42 survival-related miRNAs were identified by Cox proportional-hazards analysis. K-means clustering based on the top 10 survival-related miRNAs optimally grouped the patients into four clusters with three distinct 5-year survival outcomes (≥ 90%, ≈ 65%, ≤ 40%). Based on the K-means clustering result, a high-performing prediction model was established. Pathway analysis indicated that the miRNAs used are involved in the regulation of cancer stem cells.

Conclusion: A miRNA-based machine learning survival prediction model for cervical cancer was developed that robustly stratifies patients into high (5-year survival rate ≥ 90%), moderate (5-year survival rate ≈ 65%), and low (5-year survival rate ≤ 40%) survival groups.
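The log transformation, feature scaling, and K-means steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `log_scale` and `kmeans`, the log2(x + 1) form of the transform, the z-score scaling, and the random expression matrix are all assumptions made for illustration, and the Cox proportional-hazards feature selection is omitted entirely.

```python
import math
import random


def log_scale(matrix):
    """Apply log2(x + 1), then z-score each feature (column).

    Both choices are assumptions; the abstract only says that log
    transformation and feature scaling were performed.
    """
    logged = [[math.log2(v + 1.0) for v in row] for row in matrix]
    n = len(logged)
    cols = list(zip(*logged))
    means = [sum(c) / n for c in cols]
    # A constant column has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in logged]


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm with centers initialized from random points."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its members; keep empty clusters as-is.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centers


# Synthetic stand-in for an expression matrix: 40 patients x 10 miRNAs.
rng = random.Random(42)
expr = [[rng.uniform(0.0, 1000.0) for _ in range(10)] for _ in range(40)]

labels, centers = kmeans(log_scale(expr), k=4)
print({j: labels.count(j) for j in range(4)})  # patients per cluster
```

In a real analysis the ten columns would be the top survival-related miRNAs, and each cluster's 5-year survival would then be estimated separately, for example with Kaplan-Meier curves.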

Keywords