Computation (Jan 2025)
Advancements in Predictive Analytics: Machine Learning Approaches to Estimating Length of Stay and Mortality in Sepsis
Abstract
Sepsis remains a major global health concern, causing high mortality, prolonged hospital stays, and a substantial economic burden. Accurate prediction of clinical outcomes such as mortality and length of stay (LOS) is critical for optimizing hospital resource allocation and improving patient management. This study investigates the potential of machine learning (ML) models to predict these outcomes using a dataset of 1492 sepsis patients with clinical, physiological, and demographic features. After rigorous preprocessing to address missing data and ensure consistency, multiple classifiers, including Random Forest, Extra Trees, and Gradient Boosting, were trained and validated. Random Forest and Extra Trees achieved high accuracy for LOS prediction, while Gradient Boosting and Bernoulli Naïve Bayes predicted mortality effectively. Feature importance analysis identified ICU stay duration (ICU_DAYS_OBS) as the most influential predictor for both outcomes, followed by vital signs, white blood cell counts, and lactic acid levels. These findings highlight the potential of ML-driven clinical decision support systems (CDSSs) to enhance early risk assessment, optimize ICU resource planning, and support timely interventions. Future research should refine the predictive features, integrate advanced biomarkers, and validate the models on larger and more diverse datasets to improve scalability and clinical impact.
Keywords