Frontiers in Immunology (Nov 2022)
Identification of two robust subclasses of sepsis with both prognostic and therapeutic values based on machine learning analysis
Abstract
Background: Sepsis is a heterogeneous syndrome with high morbidity and mortality. Effective classification schemes are urgently needed.
Methods and results: A total of 1,936 samples (sepsis samples, n=1,692; normal samples, n=244) from 7 discovery datasets were included in a weighted gene co-expression network analysis (WGCNA) to filter out candidate genes related to sepsis. Two subtypes of sepsis, Adaptive and Inflammatory, were then identified in the training sepsis set (n=1,692) using K-means clustering on 90 sepsis-related features. We validated these subtypes using 617 samples in 5 independent datasets and in the merged 5 sets. The CIBERSORT method revealed that the Adaptive subtype was associated with high infiltration levels of T cells and natural killer (NK) cells and with a better clinical outcome. These immune features were validated by single-cell RNA sequencing (scRNA-seq) analysis. The Inflammatory subtype was associated with high infiltration of macrophages and a poor prognosis. Functional analysis showed that the Toll-like receptor signaling pathway was upregulated in the Inflammatory subtype, whereas NK cell-mediated cytotoxicity and the T cell receptor signaling pathway were upregulated in the Adaptive subtype. To quantify the cluster findings, a scoring system, termed the risk score, was established using four datasets (n=980) in the discovery cohorts based on least absolute shrinkage and selection operator (LASSO) and logistic regression, and was validated in external sets (n=760). Multivariate logistic regression analysis revealed that the risk score was an independent predictor of outcomes in sepsis patients (odds ratio [OR], 2.752; 95% confidence interval [CI], 2.234-3.389; P<0.001) after adjustment for age and gender. The validation sets confirmed this performance (OR, 1.638; 95% CI, 1.309-2.048; P<0.001). Finally, nomograms combining the risk score, age and gender demonstrated greater discriminatory potential than any of these variables alone (training set: AUC=0.682, 95% CI, 0.643-0.719; validation set: AUC=0.624, 95% CI, 0.576-0.664). Decision curve analysis (DCA) demonstrated that the nomograms were clinically useful and identified patients at high risk better than age, gender or the risk score taken individually.
Conclusions: In-depth analysis of the transcriptomic landscape of sepsis might contribute to personalized treatments and to the prediction of clinical outcomes.
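As a reading aid for the odds ratios reported above, the following minimal sketch converts each OR and its 95% CI back to a log-odds coefficient, an approximate standard error, and a z statistic. It assumes the intervals are Wald-type (symmetric on the log scale), which the abstract does not state; the helper name `log_odds_summary` is hypothetical and is not taken from the paper.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def log_odds_summary(odds_ratio, ci_low, ci_high):
    """Recover (beta, se, z, p) from an OR and its 95% CI.

    Assumption: the CI is a Wald interval, exp(beta +/- Z_95 * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return beta, se, z, p


# Values copied from the abstract (training and validation sets).
training = log_odds_summary(2.752, 2.234, 3.389)
validation = log_odds_summary(1.638, 1.309, 2.048)

for name, (beta, se, z, p) in [("training", training), ("validation", validation)]:
    print(f"{name}: beta={beta:.3f}, se={se:.3f}, z={z:.2f}, p={p:.2e}")
```

Under this assumption, both recovered p-values fall below 0.001, in line with the reported significance levels, and the smaller coefficient in the validation sets reflects the weaker but still significant association there.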
Keywords