European Journal of Medical Research (Oct 2022)

Construction of mRNA prognosis signature associated with differentially expressed genes in early stage of stomach adenocarcinomas based on TCGA and GEO datasets

  • Fuquan Jiang,
  • Haiguan Lin,
  • Hongfeng Yan,
  • Xiaomin Sun,
  • Jianwu Yang,
  • Manku Dong

DOI
https://doi.org/10.1186/s40001-022-00827-4
Journal volume & issue
Vol. 27, no. 1
pp. 1 – 15

Abstract

Background: Stomach adenocarcinoma (STAD) is the most common malignancy of the human digestive system and the fourth leading cause of cancer-related death. Because early-stage STAD is generally mild or asymptomatic, it is often diagnosed only at an advanced stage, when overall survival is short. Early diagnosis of STAD therefore has a considerable influence on clinical outcomes.

Methods: The mRNA expression data and clinical indicators of STAD and normal tissues were obtained from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. Differences in gene expression were analyzed with R packages, and gene function enrichment analysis was performed. The Kaplan–Meier method and univariate Cox proportional hazards regression were used to screen differentially expressed genes (DEGs) related to the survival of STAD patients. Multivariate Cox proportional hazards regression was then used to select the prognostic DEGs and to construct a multigene prognostic signature. The accuracy of the signature was tested with a receiver operating characteristic (ROC) curve software package, and a nomogram for patients with STAD was drawn. Cox regression was used to investigate the correlation between the multigene prognostic signature and clinical factors. The predictive performance of the model was compared with that of two models proposed in previous studies using Kaplan–Meier (KM) survival analysis, ROC curve analysis, Harrell's concordance index, and decision curve analysis (DCA). qRT-PCR and Western blot were used to verify the expression levels of the prognostic genes. Pathways and functions in which the signature may be involved were predicted by gene set enrichment analysis (GSEA).

Results: A total of 569 early-stage-specific DEGs were retrieved from the TCGA-STAD dataset, including 229 up-regulated and 340 down-regulated genes. Enrichment analysis showed that the early-stage-specific DEGs were associated with cytokine–cytokine receptor interaction, neuroactive ligand–receptor interaction, and the calcium signaling pathway. Multivariate Cox regression identified 10 early-stage-specific DEGs associated with overall survival (P < 0.01) of STAD patients, and a multi-mRNA prognostic signature was established. Patients were divided into a high-risk group and a low-risk group according to the risk score. In the training set, the prognostic signature was positively correlated with tumor size and stage (P < 0.05) and was evaluated by survival curve analysis (P < 0.001) and time-dependent ROC analysis (AUC = 0.625). The signature showed good predictive efficiency in both the training and test datasets. Cox regression and DCA revealed that the prognostic signature was an independent prognostic factor and predicted the prognosis of STAD patients better than the conventional TNM staging system and previously published biomarkers.

Conclusion: The prognostic signature constructed in this study from early-stage specifically expressed genes in the TCGA and GEO datasets may become an indicator for clinical prognosis assessment of STAD and may offer a new strategy for targeted therapy in the future.
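
As a rough illustration of how a multigene risk score of this kind is typically applied, the Python sketch below computes a linear risk score from Cox coefficients, splits patients into high-risk and low-risk groups, and evaluates the ranking with Harrell's concordance index. It is not the authors' code: the gene names, coefficients, and patient values are hypothetical, and the median cutoff is an assumption, since the abstract does not state how the groups were separated.

```python
from statistics import median
from typing import Dict, List, Sequence

# Hypothetical Cox coefficients; the study's 10 genes and their weights are not given here.
COEFFICIENTS: Dict[str, float] = {"GENE_A": 0.5, "GENE_B": -0.25}


def risk_score(expression: Dict[str, float], coefficients: Dict[str, float]) -> float:
    """Linear predictor: sum of (coefficient * expression) over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_median(scores: Sequence[float]) -> List[str]:
    """Label each patient 'high' or 'low' relative to the median score (assumed cutoff)."""
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


def harrell_c_index(times: Sequence[float], events: Sequence[int],
                    scores: Sequence[float]) -> float:
    """Harrell's C: share of comparable pairs in which the patient who died
    earlier has the higher risk score (tied scores count as 0.5)."""
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a comparable pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up expression values, follow-up times (months), and event flags (1 = death).
patients = [
    {"GENE_A": 2.0, "GENE_B": 1.0},
    {"GENE_A": 1.0, "GENE_B": 2.0},
    {"GENE_A": 3.0, "GENE_B": 0.0},
    {"GENE_A": 0.0, "GENE_B": 4.0},
]
times = [20, 40, 10, 60]
events = [1, 0, 1, 1]

scores = [risk_score(patient, COEFFICIENTS) for patient in patients]
groups = split_by_median(scores)
c_index = harrell_c_index(times, events, scores)
```

In practice the coefficients would come from a fitted multivariate Cox model, and the cutoff would be fixed on the training set and reused unchanged on the test set.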

Keywords