Scientific Reports (Nov 2024)
Survival prediction and molecular subtyping of squamous cell lung cancer based on network embedding
Abstract
Squamous cell lung cancer (SQCLC) is a fatal, heterogeneous disease with diverse genetic and histological features. We used SBMOI, a multi-omics data integration method from a previous study, to integrate clinical, gene expression, and somatic mutation data of SQCLC patients and construct new patient features. We then built a random survival forest (RSF) model and a SimpleMKL model to predict the survival of SQCLC patients, and a K-means model to perform molecular subtyping. The RSF model performed best when the dimension of the patient features was 11 × 364 and the hard threshold was 0.2, with an AUC of 0.706 for the 1-year time-dependent ROC curve. The SimpleMKL model, constructed using the same patient features, performed exceptionally well, with 1-year, 5-year, and 10-year survival prediction AUC values of 0.944, 0.947, and 0.950, respectively. K-means analysis identified three SQCLC molecular subtypes with significant survival differences. The patient features constructed by SBMOI thus supported effective survival prediction and molecular subtyping of SQCLC patients. In addition, our study further confirmed the effectiveness of SBMOI in multi-omics data integration tasks and its broad applicability.
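As an illustration only, the subtyping step can be sketched as a plain K-means (Lloyd's algorithm) run with three clusters. Everything here is an assumption made for the example: the function name `kmeans`, its parameters, and the synthetic 11-dimensional vectors that stand in for patient features. It is not the study's implementation and does not use SBMOI output.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Minimal Lloyd's algorithm (hypothetical helper, not the study's code).

    Returns (labels, centroids) for a list of equal-length numeric vectors.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Stop once assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Synthetic stand-in data: three groups of 30 "patients", 11 features each.
data_rng = random.Random(42)
features = [
    [data_rng.gauss(center, 0.5) for _ in range(11)]
    for center in (0.0, 5.0, 10.0)
    for _ in range(30)
]

labels, centroids = kmeans(features, k=3)
sizes = [labels.count(c) for c in range(3)]
print("cluster sizes:", sizes)
```

With real data, the cluster labels would then be compared against survival outcomes (for example with Kaplan-Meier curves and a log-rank test) to check whether the subtypes differ in survival.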
Keywords