Cancer Medicine (Mar 2025)

Identification and Validation of Four Serum Biomarkers With Optimal Diagnostic and Prognostic Potential for Gastric Cancer Based on Machine Learning Algorithms

  • Yi Liu,
  • Bingxian Bian,
  • Shiyu Chen,
  • Bingqian Zhou,
  • Peng Zhang,
  • Lisong Shen,
  • Hui Chen

DOI
https://doi.org/10.1002/cam4.70659
Journal volume & issue
Vol. 14, no. 6

Abstract

Background Gastric cancer (GC) is a highly heterogeneous disease, and a comprehensive approach that integrates molecular data from multiple biological levels is currently lacking.
Methods This study combined several analyses, including identification of differentially expressed genes (DEGs), weighted correlation network analysis (WGCNA), single‐cell RNA sequencing (scRNA‐seq), the mRNA expression‐based stemness index (mRNAsi), and multiCox analysis, using data from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases. Machine learning algorithms, namely least absolute shrinkage and selection operator (LASSO) regression and random forest (RF), were then combined with multiCox analysis to identify hub genes. The findings were validated with receiver operating characteristic (ROC) curves and Kaplan–Meier analysis, and were experimentally confirmed in GC samples by reverse transcription–polymerase chain reaction (RT‐PCR) and enzyme‐linked immunosorbent assay (ELISA).
Results Integrated analysis of the TCGA and GEO databases, coupled with the LASSO regression and RF algorithms, identified 18 hub genes encoding differentially expressed secreted proteins in GC. RT‐PCR and bioinformatics analysis narrowed these to four biomarkers with optimal diagnostic and prognostic potential: CHI3L1, FCGBP, VSIG2, and TFF2. ROC analysis and Kaplan–Meier curves indicated that these four biomarkers offered superior modeling accuracy, and RT‐PCR and ELISA further confirmed the findings, supporting their clinical utility. Additionally, CIBERSORT analysis indicated a potential correlation between the four biomarkers and the infiltration of memory B cells and regulatory T (Treg) cells.
Conclusion This study unveiled four promising biomarkers present in the serum of patients with GC, which could serve as powerful indicators of GC and provide valuable insights for further research into GC pathogenesis.
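
The abstract reports ROC analysis as one of the validation steps but does not say how it was computed. The following is a minimal sketch of how the area under the ROC curve (AUC) for a single serum marker could be calculated, using the rank-based (Mann–Whitney) definition: the probability that a randomly chosen patient has a higher marker level than a randomly chosen control. The function name `roc_auc` and the sample values are hypothetical and are not taken from the study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return the ROC AUC of `scores` for binary `labels` (1 = case, 0 = control).

    Uses the pairwise (Mann-Whitney) definition: the fraction of
    case/control pairs in which the case has the higher score,
    counting ties as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical serum concentrations in arbitrary units (illustration only,
# not data from the study). 1 = GC patient, 0 = healthy control.
serum_levels = [8.1, 6.4, 7.9, 5.2, 3.3, 4.8, 5.2, 2.9]
is_gc = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(serum_levels, is_gc)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to a marker that does not separate patients from controls, and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, which is acceptable for an illustration; a sorted-rank implementation would be preferable for large cohorts.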

Keywords