Digital Health (Jul 2024)

A 10-year retrospective cohort of diabetic patients in a large medical institution: Utilizing multiple machine learning models for diabetic kidney disease prediction

  • Guangpu Li,
  • Jia Li,
  • Fei Tian,
  • Jingjing Ren,
  • Zuishuang Guo,
  • Shaokang Pan,
  • Dongwei Liu,
  • Jiayu Duan,
  • Zhangsuo Liu

DOI
https://doi.org/10.1177/20552076241265220
Journal volume & issue
Vol. 10

Abstract

Objective: As the prevalence of diabetes steadily increases, the burden of diabetic kidney disease (DKD) is also intensifying. In response, we used a 10-year diabetes cohort from our medical center to train machine learning-based models for predicting DKD and interpreting the relevant factors.

Methods: Using a large dataset of 73,101 hospitalized type 2 diabetes patients at The First Affiliated Hospital of Zhengzhou University, we analyzed demographic and medication data. Machine learning models, including XGBoost, CatBoost, LightGBM, Random Forest, AdaBoost, GBDT (gradient boosting decision tree), and SGD (stochastic gradient descent), were trained on these data, with interpretability provided by SHAP. SHAP explains a model's output by assigning an importance value to each feature for a particular prediction, making it clear how individual features influence the predicted outcome.

Results: The XGBoost model achieved an area under the curve (AUC) of 0.95 and an area under the precision-recall curve (AUPR) of 0.76, while CatBoost recorded an AUC of 0.97 and an AUPR of 0.84. These results underscore the effectiveness of these models in predicting DKD in patients with type 2 diabetes.

Conclusions: This study provides a comprehensive machine learning approach for predicting DKD in patients with type 2 diabetes. The findings are important for the early detection of DKD and for early intervention, and they offer a roadmap for future research and healthcare strategies in diabetes management. Additionally, the presence of non-diabetic kidney diseases and of diabetes with complications were identified as significant factors in the development of DKD.
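
To illustrate what SHAP's per-prediction attribution means, the following is a minimal sketch that computes exact Shapley values by brute force for a toy risk score. It is not the study's model or pipeline: `toy_risk_model`, its weights, the `on_insulin` feature, and the all-False baseline are hypothetical choices made only for illustration, and the abstract does not state which SHAP explainer was used. In practice the SHAP library relies on much faster algorithms for tree ensembles than the enumeration shown here.

```python
from itertools import permutations
from math import factorial


def toy_risk_model(features):
    """Hypothetical DKD risk score (illustrative weights, not from the study)."""
    score = 0.10  # risk assigned when no factor is present
    if features["non_diabetic_kidney_disease"]:
        score += 0.30
    if features["diabetes_with_complications"]:
        score += 0.20
    # Interaction term: extra risk when both factors are present together.
    if features["non_diabetic_kidney_disease"] and features["diabetes_with_complications"]:
        score += 0.10
    # "on_insulin" is deliberately ignored by the model.
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over every order in which features can be switched from the baseline
    value to the instance value. Cost grows factorially with feature count."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = instance[name]
            value = model(current)
            totals[name] += value - previous
            previous = value
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


# Hypothetical patient and reference point (assumptions for this sketch).
patient = {
    "non_diabetic_kidney_disease": True,
    "diabetes_with_complications": True,
    "on_insulin": True,
}
baseline = {name: False for name in patient}

phi = shapley_values(toy_risk_model, patient, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.2f}")
```

The attributions add up to the difference between the patient's score and the baseline score, which is the additivity property that lets SHAP values be read as each feature's share of one prediction. The interaction term is split evenly between the two features that jointly cause it, and the feature the model ignores receives zero.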