Frontiers in Immunology (Nov 2024)
Predicting patients with septic shock and sepsis through analyzing whole-blood expression of NK cell-related hub genes using an advanced machine learning framework
Abstract
Background: Sepsis is a life-threatening condition that causes millions of deaths globally each year. Biomarkers that predict the progression of sepsis to septic shock are still critically needed, and rapid, reliable methods remain lacking. Transcriptomic data have recently emerged as a valuable resource for disease phenotyping and endotyping, making them a promising tool for predicting disease stages. We therefore aimed to establish an advanced machine learning framework that predicts sepsis and septic shock from transcriptomic datasets with rapid turnaround.
Methods: We retrieved four NCBI GEO transcriptomic datasets previously generated from peripheral blood samples of healthy individuals and patients with sepsis or septic shock. The datasets were processed for bioinformatic analysis and supplemented with a series of bench experiments, leading to the identification of a hub gene panel relevant to sepsis and septic shock. The hub gene panel was used to establish a novel prediction model that distinguishes sepsis from septic shock through a multistage machine learning pipeline incorporating linear discriminant analysis, risk score analysis, and an ensemble method combined with Least Absolute Shrinkage and Selection Operator (LASSO) analysis. Finally, we validated the prediction model with a hub gene dataset generated by RT-qPCR from peripheral blood samples of newly recruited patients.
Results: Our analysis identified six hub genes (GZMB, PRF1, KLRD1, SH2D1A, LCK, and CD247) related to NK cell cytotoxicity and septic shock, collectively termed 6-HubGss. Using this panel, we created SepxFindeR, a machine learning model that demonstrated high accuracy in predicting sepsis and septic shock and in distinguishing septic shock from sepsis in a cross-database context. The SepxFindeR model also proved compatible with RT-qPCR datasets based on the 6-HubGss panel, facilitating the identification of newly recruited patients with sepsis and septic shock.
Conclusions: Our bioinformatic approach led to the discovery of the 6-HubGss biomarker panel and the development of the SepxFindeR machine learning model, which enables accurate prediction of septic shock and its distinction from sepsis with rapid processing.
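The abstract names a risk score analysis and a LASSO step but does not give their details. As a rough illustration only, the sketch below fits an L1-penalized (LASSO-style) logistic regression on six features and uses the fitted linear combination as a risk score. It is not the SepxFindeR implementation: the expression values are synthetic random numbers, the two class labels are arbitrary, and the function names, penalty, step size, and iteration count are assumptions chosen for the example.

```python
import math
import random

# Gene names come from the abstract; everything else here is hypothetical.
GENES = ["GZMB", "PRF1", "KLRD1", "SH2D1A", "LCK", "CD247"]


def soft_threshold(value, cutoff):
    """Shrink value toward zero by cutoff (the proximal step for an L1 penalty)."""
    if value > cutoff:
        return value - cutoff
    if value < -cutoff:
        return value + cutoff
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def risk_score(weights, bias, sample):
    """Linear risk score: intercept plus the weighted sum of expression values."""
    return bias + sum(w * v for w, v in zip(weights, sample))


def fit_l1_logistic(X, y, penalty=0.05, step=0.1, iterations=500):
    """Proximal gradient descent for L1-penalized logistic regression."""
    n, p = len(X), len(X[0])
    weights, bias = [0.0] * p, 0.0
    for _ in range(iterations):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            error = sigmoid(risk_score(weights, bias, row)) - label
            grad_b += error / n
            for j in range(p):
                grad_w[j] += error * row[j] / n
        bias -= step * grad_b  # the intercept is not penalized
        weights = [soft_threshold(w - step * g, step * penalty)
                   for w, g in zip(weights, grad_w)]
    return weights, bias


# Synthetic toy data: two arbitrary groups of 40 samples each. The shift on
# the first three features is arbitrary and only makes the groups separable.
random.seed(0)
X, y = [], []
for label in (0, 1):
    for _ in range(40):
        shift = -1.0 if label else 0.0
        X.append([random.gauss(shift if j < 3 else 0.0, 1.0) for j in range(6)])
        y.append(label)

weights, bias = fit_l1_logistic(X, y)
scores = [risk_score(weights, bias, row) for row in X]
predicted = [1 if s > 0 else 0 for s in scores]  # score > 0 means P(class 1) > 0.5
accuracy = sum(p == t for p, t in zip(predicted, y)) / len(y)
print(dict(zip(GENES, weights)), accuracy)
```

The L1 penalty can drive uninformative coefficients exactly to zero, which is why LASSO is often paired with risk scores built from small gene panels. In practice, a library implementation with cross-validated penalty selection and held-out validation would be preferable to this hand-rolled loop.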
Keywords