Frontiers in Molecular Biosciences (Dec 2022)

Machine learning models for predicting one-year survival in patients with metastatic gastric cancer who experienced upfront radical gastrectomy

  • Cheng Zhang,
  • Yi Zhang,
  • Ya-Hui Yang,
  • Hui Xu,
  • Xiao-Peng Zhang,
  • Zhi-Jun Wu,
  • Min-Min Xie,
  • Ying Feng,
  • Chong Feng,
  • Tai Ma

DOI
https://doi.org/10.3389/fmolb.2022.937242
Journal volume & issue
Vol. 9

Abstract

Tumor metastasis is a common event in patients with gastric cancer (GC) who previously underwent curative gastrectomy. Using high-volume clinical data to predict the survival of patients with metastatic GC is meaningful. We aimed to establish an improved machine learning (ML) classifier for predicting whether a patient with metastatic GC would die within 12 months. Eligible patients were enrolled from a Chinese GC cohort, and complete, detailed information was extracted from their medical records to generate a high-dimensional dataset. Appropriate feature engineering and feature filtering were conducted before modeling with eight algorithms. A 10-fold cross-validation (CV) nested in a holdout CV (8:2) was employed for hyperparameter tuning and model evaluation. Model selection was based on the area under the receiver operating characteristic curve (AUROC), recall, and precision. The selected model was globally explained using interpretable surrogate models. Of the 399 cases in total (median survival of 8.2 months), 242 patients survived less than 12 months. The linear discriminant analysis (LDA), support vector machine (SVM), and random forest (RF) models had the highest AUROC (0.78 ± 0.021), recall (0.93 ± 0.031), and precision (0.80 ± 0.026), respectively. The LDA model created a new function that generally separated the two classes. The predicted probability of the SVM model was interpreted using a linear regression model visualized as a nomogram. The predicted class of the RF model was explained using a decision tree model. In summary, analyzing high-volume medical data with ML helps produce an improved model for predicting survival in patients with metastatic GC. The algorithm should be carefully selected for different practical scenarios.
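
The resampling design and the three selection metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`holdout_split`, `k_folds`, `recall_precision`, `auroc`) are hypothetical, the split is unstratified and uses an arbitrary seed, and only the cohort size (399), the 8:2 holdout ratio, and the 10 folds are taken from the abstract. The sketch assumes that hyperparameters would be tuned on the inner folds and that the final scores would come from the held-out 20%.

```python
import random

def holdout_split(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and return (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seed is an arbitrary assumption
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]

def k_folds(indices, k=10):
    """Yield (inner_train, validation) index lists for k-fold CV."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        inner_train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield inner_train, folds[i]

def recall_precision(y_true, y_pred):
    """Recall and precision for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision

def auroc(y_true, scores):
    """AUROC as the probability that a positive outscores a negative.

    Ties count as half a win. Requires at least one case of each class.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Outer 8:2 holdout on the 399 cases, then 10 inner folds on the training part.
train, test = holdout_split(399)
fold_sizes = [len(val) for _, val in k_folds(train)]
```

With 399 cases, this leaves 319 cases for tuning across the inner folds and 80 for the final evaluation. A stratified split, which keeps the proportion of patients who died within 12 months similar in each part, would be a reasonable refinement that the sketch omits.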

Keywords