Scientific Reports (Sep 2024)

Development of a prognostic model for NSCLC based on differential genes in tumour stem cells

  • Yuqi Ma,
  • Jiawei Li,
  • Chunping Xiong,
  • Xiaoluo Sun,
  • Tao Shen

DOI
https://doi.org/10.1038/s41598-024-71317-2
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 12

Abstract

Non-small cell lung cancer (NSCLC) accounts for a large proportion of lung cancers, and cytotoxic drugs (e.g. cisplatin) are currently the first-line treatment. However, NSCLC can develop resistance to such drugs, which limits the therapeutic effect and thus worsens prognosis. NSCLC scRNA-seq data were downloaded from the GEO database and the KU Leuven Laboratory for Functional Epigenetics, and bulk RNA-seq data were obtained from the TCGA database. The “Seurat” package was employed for scRNA-seq data processing, and uniform manifold approximation and projection (UMAP) was applied for dimensionality reduction and cluster identification. The FindAllMarkers function was used to identify differentially expressed genes (DEGs) for tumour stem cells. We then performed univariate regression analyses on the DEGs to identify potential prognostic genes. Based on these genes, we built a machine learning framework that combines 10 machine learning methods in 101 combinations to obtain the optimal prognostic risk model. The model was evaluated in the training and validation sets. A nomogram was developed to provide physicians with a quantitative tool for prognosis prediction. Finally, we evaluated the expression and function of SLC2A1. By examining single-cell RNA sequencing datasets (GSE148071, KU_lom, GSE131907, GSE136246, GSE127465), we identified 22 cell clusters containing 218,379 cells. Tumour cells were isolated for subpopulation analysis, and 162 differential genes from SOX2_cancer were obtained. Univariate Cox analysis identified 23 genes with potential prognostic value, which were used to develop the 101-combination machine learning framework. We ultimately selected the best-performing combination, ‘StepCox[both] + RSF’, which includes 8 genes. The model showed relatively high prediction accuracy in both the TCGA and GEO datasets.
In vitro, targeted suppression of the SLC2A1 gene significantly reduced proliferation, invasion and migration in A549 cells, and significantly reduced cisplatin resistance in A549/DDP cells. These results support the accuracy and reliability of the prognostic model for NSCLC and highlight its potential value in the treatment and prognosis of individuals affected by this disease. SLC2A1 may be a promising prognostic marker and a potential therapeutic target, offering insights to inform clinical treatment decisions.
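
The univariate Cox screening step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `univariate_cox`, the toy survival times and the expression values are all hypothetical. It assumes one gene is tested at a time, that ties are handled with the Breslow approximation, and that significance is judged with a Wald test.

```python
import math

def univariate_cox(times, events, x, max_iter=50, tol=1e-9):
    """Fit a one-covariate Cox model by Newton-Raphson (Breslow ties).

    Returns (beta, hazard_ratio, wald_p_value).
    """
    beta = 0.0
    info = 0.0
    n = len(times)
    for _ in range(max_iter):
        grad = 0.0  # first derivative of the log partial likelihood
        info = 0.0  # observed information (negative second derivative)
        for i in range(n):
            if not events[i]:
                continue  # censored patients only appear in risk sets
            s0 = s1 = s2 = 0.0
            for j in range(n):
                if times[j] >= times[i]:  # patient j is still at risk
                    w = math.exp(beta * x[j])
                    s0 += w
                    s1 += w * x[j]
                    s2 += w * x[j] ** 2
            mean = s1 / s0
            grad += x[i] - mean
            info += s2 / s0 - mean ** 2
        if info <= 0.0:
            raise ValueError("covariate has no variation within the risk sets")
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    se = 1.0 / math.sqrt(info)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided Wald test
    return beta, math.exp(beta), p_value

# Hypothetical toy data: follow-up in months, event flag (1 = death,
# 0 = censored) and expression of a single candidate gene.
times = [5, 8, 12, 15, 20, 24, 30, 36]
events = [1, 1, 1, 0, 1, 1, 0, 1]
expression = [2.9, 1.0, 2.1, 1.4, 2.5, 0.8, 1.9, 1.2]

beta, hr, p = univariate_cox(times, events, expression)
print(f"beta={beta:.3f}  HR={hr:.3f}  p={p:.3f}")
```

In a screening setting, a function like this would be run once per candidate gene, keeping genes whose p-value falls below a chosen threshold. The threshold used in the study is not stated in the abstract.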

Keywords