BMC Nephrology (Dec 2023)
Machine learning models to predict end-stage kidney disease in chronic kidney disease stage 4
Abstract
Introduction: End-stage kidney disease (ESKD) is associated with increased morbidity and mortality. Identifying patients with stage 4 chronic kidney disease (CKD4) who are at risk of rapid progression to ESKD remains challenging. Accurate prediction of CKD4 progression can improve patient outcomes by improving advanced care planning and optimizing healthcare resource allocation.
Methods: We obtained electronic health record data from patients with CKD4 in a large health system between January 1, 2006, and December 31, 2016. We developed and validated four models to predict ESKD at 3 years: Least Absolute Shrinkage and Selection Operator (LASSO) regression, random forest, eXtreme Gradient Boosting (XGBoost), and an artificial neural network (ANN). We used the area under the receiver operating characteristic curve (AUROC) to evaluate model performance, and Shapley additive explanation (SHAP) values and plots to characterize feature dependence in the best-performing model.
Results: We included 3,160 patients with CKD4. ESKD was observed in 538 patients (21%). All approaches had similar AUROCs: ANN and LASSO regression yielded the highest (0.77; 95% CI 0.75 to 0.79 for each), followed by random forest (0.76; 95% CI 0.74 to 0.79) and XGBoost (0.76; 95% CI 0.74 to 0.78).
Conclusions: We developed and validated several models for near-term prediction of kidney failure in CKD4. ANN, random forest, and XGBoost demonstrated similar predictive performance. Using this suite of models, interventions can be customized based on risk, and population health and resources can be allocated appropriately.
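As an illustration of the evaluation metric named in the Methods, the following minimal Python sketch computes an AUROC and a percentile-bootstrap 95% confidence interval. The abstract does not state how the confidence intervals were obtained, so the bootstrap here is an assumption, and the function names `auroc` and `bootstrap_ci` as well as the toy labels and scores are hypothetical, not the study's code or data.

```python
import random


def auroc(y_true, y_score):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (assumed method)."""
    auroc(y_true, y_score)  # fails early if only one class is present
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        stats.append(auroc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data only: 1 = progressed to ESKD within 3 years, 0 = did not.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
point = auroc(y_true, y_score)
ci_low, ci_high = bootstrap_ci(y_true, y_score)
```

In practice the same two functions would be applied to each model's predicted risks on the held-out validation set, which gives one AUROC and interval per model for comparison.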
Keywords