Heliyon (Jul 2024)

Development and validation of a machine learning-based interpretable model for predicting sepsis by complete blood cell parameters

  • Tiancong Zhang,
  • Shuang Wang,
  • Qiang Meng,
  • Liman Li,
  • Mengxue Yuan,
  • Shuo Guo,
  • Yang Fu

Journal volume & issue
Vol. 10, no. 14
p. e34498

Abstract

Background: Sepsis, a severe infectious disease, carries a high mortality rate. Early detection and prompt treatment are crucial for reducing mortality and improving prognosis. This study aimed to develop a clinical prediction model that uses machine learning algorithms and complete blood cell (CBC) parameters to detect sepsis at an early stage.

Methods: The study involved 572 patients admitted to West China Hospital of Sichuan University between July 2020 and September 2021. Of these, 215 were diagnosed with sepsis and 357 had local infections. Demographic information was collected, and 57 CBC parameters were analyzed to identify potential predictors using the Least Absolute Shrinkage and Selection Operator (LASSO), Random Forest (RF), Support Vector Machine (SVM), and eXtreme Gradient Boosting (XGBoost). The prediction model was built using Logistic Regression and evaluated for diagnostic specificity, discrimination, and clinical applicability, using the area under the curve (AUC), the calibration curve, the clinical impact curve, and the clinical decision curve. The model's diagnostic performance was also assessed on a separate validation cohort. SHapley Additive exPlanations (SHAP) and breakdown (BD) profiles were used to explain the contribution of each variable to the predicted outcome.

Results: Among the prediction models derived from the machine learning methods, the LASSO-based model (λ = min) demonstrated the highest diagnostic performance in both the discovery cohort (AUC = 0.9446, P < 0.001) and the validation cohort (AUC = 0.9001, P < 0.001). Local analysis and interpretation of the model further showed that LY-Z, MO-Z, and PLT-I had the most significant impact on the outcome.

Conclusions: The predictive model based on CBC parameters can be used as an effective approach for the early detection of sepsis.
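
The workflow described in the Methods (LASSO-style variable selection, a Logistic Regression model, and AUC on a held-out cohort) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the data are synthetic and merely mimic the cohort size (572 rows, 57 features, about 38% positives), scikit-learn is an assumed toolchain, the 70/30 split is an arbitrary choice, and picking the L1 penalty by cross-validated AUC is only a loose analogue of the "λ = min" rule named in the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort (NOT patient data): 572 samples,
# 57 candidate predictors, roughly 38% positives (215 of 572 in the study).
X, y = make_classification(
    n_samples=572,
    n_features=57,
    n_informative=8,
    n_redundant=5,
    n_clusters_per_class=1,
    weights=[0.62, 0.38],
    random_state=0,
)

# Hypothetical discovery/validation split; the paper's split is not given here.
X_disc, X_val, y_disc, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Standardise using discovery-cohort statistics only.
scaler = StandardScaler().fit(X_disc)
X_disc_s = scaler.transform(X_disc)
X_val_s = scaler.transform(X_val)

# Step 1: L1-penalised (LASSO-type) logistic regression; the penalty strength
# is chosen by 5-fold cross-validated AUC on the discovery cohort.
lasso = LogisticRegressionCV(
    Cs=10, cv=5, penalty="l1", solver="liblinear",
    scoring="roc_auc", random_state=0,
)
lasso.fit(X_disc_s, y_disc)

# Keep the predictors whose coefficients were not shrunk to zero.
selected = np.flatnonzero(lasso.coef_.ravel() != 0)

# Step 2: refit an ordinary logistic regression on the selected predictors.
final_model = LogisticRegression(max_iter=1000)
final_model.fit(X_disc_s[:, selected], y_disc)

# Step 3: discrimination (AUC) on the held-out validation cohort.
val_scores = final_model.predict_proba(X_val_s[:, selected])[:, 1]
val_auc = roc_auc_score(y_val, val_scores)

print(f"{len(selected)} of {X.shape[1]} predictors kept; validation AUC = {val_auc:.3f}")
```

Numbers printed by this sketch depend entirely on the synthetic data and bear no relation to the AUC values reported in the Results.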

Keywords