Intensive Care Research (May 2024)
Stratifying Mortality Risk in Intensive Care: A Comprehensive Analysis Using Cluster Analysis and Classification and Regression Tree Algorithms
Abstract
Background: Machine learning (ML) is a promising approach for stratifying patients into homogeneous groups and assessing mortality risk from combinations of severity scores. Using ML, we compared mortality prediction performance between clustered and non-clustered models and sought to develop a simple decision algorithm, based on classification and regression trees (CART), to predict a patient's cluster membership.
Methods: Retrospective study of patients requiring ICU admission (1st January 2011–16th September 2022). Clusters were identified by combining the Charlson Comorbidity Index (CCI) with either the Simplified Acute Physiology Score II (SAPS II) or the Sequential Organ Failure Assessment (SOFA). Intercluster and survival analyses were performed. We analyzed the association with mortality using multivariate logistic regression and receiver operating characteristic (ROC) curves for models with and without clusters. Nested models were compared with likelihood ratio tests (LRT); non-nested models were compared using the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). With the best model, we used CART to build a decision tree for cluster membership.
Results: The sample comprised 2605 patients (mortality 59.7%). For both score combinations, we identified two clusters (A and B for CCI + SAPS II, α and β for CCI + SOFA). Belonging to cluster B or β was associated with shorter survival times (Peto-Peto p-values < 0.0001) and increased mortality (odds ratios 4.65 and 5.44, respectively). According to the LRT and ROC analyses, clustered models performed better, and the CCI + SOFA model showed the lowest AIC and BIC values (AIC = 3021.21, BIC = 3132.65). Using CART, with β cluster membership as the positive class, the decision tree achieved an accuracy of 94.8%.
Conclusion: Clustered models significantly improved mortality prediction. The CCI + SOFA clustered model showed the best balance between complexity and data fit and should be preferred. A user-friendly decision algorithm for cluster membership, built with CART, showed high accuracy. Further validation studies are needed to confirm these findings.
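The model-comparison criteria named in the Methods (LRT for nested models, AIC and BIC for non-nested ones) can be illustrated with a minimal sketch. It uses the standard textbook definitions, not the authors' actual code. The function names are hypothetical, the log-likelihood values and parameter counts in the demo are invented purely for illustration, and the LRT helper assumes the larger model adds exactly one parameter (for example, a single binary cluster indicator).

```python
import math


def aic(log_lik: float, k: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * k - 2 * log_lik


def bic(log_lik: float, k: int, n: int) -> float:
    """Bayesian Information Criterion: k ln(n) - 2 ln(L)."""
    return k * math.log(n) - 2 * log_lik


def lrt_one_df(log_lik_reduced: float, log_lik_full: float) -> tuple[float, float]:
    """Likelihood ratio test for nested models differing by ONE parameter.

    Returns (statistic, p_value). For 1 degree of freedom the chi-square
    survival function reduces to erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (log_lik_full - log_lik_reduced)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods (NOT values from the study).
N_PATIENTS = 2605          # sample size reported in the abstract
ll_without_cluster = -1510.0
ll_with_cluster = -1500.0

stat, p = lrt_one_df(ll_without_cluster, ll_with_cluster)
print(f"LRT statistic = {stat:.2f}, p = {p:.2g}")
print(f"AIC without cluster = {aic(ll_without_cluster, 9):.2f}")
print(f"AIC with cluster    = {aic(ll_with_cluster, 10):.2f}")
print(f"BIC with cluster    = {bic(ll_with_cluster, 10, N_PATIENTS):.2f}")
```

Lower AIC and BIC indicate a better trade-off between fit and complexity; BIC penalizes each extra parameter more heavily than AIC whenever ln(n) exceeds 2, which holds for any sample of eight or more observations.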
Keywords