BMC Medical Genomics (Feb 2022)

Machine learning and bioinformatics analysis revealed classification and potential treatment strategy in stage 3–4 NSCLC patients

  • Chang Li,
  • Chen Tian,
  • Yulan Zeng,
  • Jinyan Liang,
  • Qifan Yang,
  • Feifei Gu,
  • Yue Hu,
  • Li Liu

DOI
https://doi.org/10.1186/s12920-022-01184-1
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 18

Abstract

Background: Precision medicine has increased the accuracy of cancer diagnosis and treatment, especially in the era of cancer immunotherapy. Despite recent advances in cancer immunotherapy, the overall survival rate of advanced NSCLC patients remains low. A better classification of advanced NSCLC is important for developing more effective treatments.

Methods: The abundances of tumor-infiltrating immune cells (TIICs) were calculated using Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts (CIBERSORT), xCell, Tumor IMmune Estimation Resource (TIMER), Estimate the Proportion of Immune and Cancer cells (EPIC), and Microenvironment Cell Populations-counter (MCP-counter). K-means clustering was used to classify patients, and four machine learning methods (SVM, random forest, AdaBoost, XGBoost) were used to build the classifiers. Multi-omics datasets (including transcriptomics, DNA methylation, copy number alterations, and miRNA profiles) and immune checkpoint inhibitor (ICI) treatment cohorts were obtained from various databases. The drug sensitivity data were derived from the PRISM and CTRP databases.

Results: Patients with stage 3–4 NSCLC were divided into three clusters according to the abundance of TIICs, and classifiers based on the four machine learning algorithms (SVM, random forest, XGBoost, and AdaBoost) were established to distinguish these clusters. Patients in cluster-2 were found to have a survival advantage and might respond favorably to immunotherapy. We then constructed an immune-related Poor Prognosis Signature that successfully predicted survival in patients with advanced NSCLC, and through epigenetic analysis we found 3 key molecules (HSPA8, CREB1, RAP1A) that might serve as potential therapeutic targets in cluster-1. Finally, after screening the drug sensitivity data derived from the CTRP and PRISM databases, we identified several compounds that might serve as medication for the different clusters.

Conclusions: Our study has not only depicted the landscape of different clusters of stage 3–4 NSCLC but also presented a treatment strategy for patients with advanced NSCLC.
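As a rough illustration of the clustering step described in the abstract, the sketch below groups patients into three clusters from immune-cell abundance vectors using plain k-means (Lloyd's algorithm). It is not the authors' pipeline: the abundance values are synthetic, the four features stand in for hypothetical cell types, and the farthest-first initialization is an assumption chosen only to keep the example deterministic.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic initialization (an assumption, not from the paper):
    start at the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    centroids = farthest_first(points, k)
    labels = [-1] * len(points)
    for _ in range(n_iter):
        # Assignment step: each patient goes to the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids


# Synthetic abundances: 3 hypothetical immune profiles x 20 patients each,
# 4 hypothetical cell-type features per patient (not real data).
rng = random.Random(42)
profiles = [
    [0.6, 0.2, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.1, 0.1, 0.2, 0.6],
]
abundances = [
    [max(0.0, m + rng.gauss(0.0, 0.03)) for m in profile]
    for profile in profiles
    for _ in range(20)
]

labels, centroids = kmeans(abundances, k=3)
```

In practice the resulting cluster labels would then serve as the training target for supervised classifiers such as those named in the abstract; that step is omitted here.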

Keywords