Development and validation of machine learning models to predict unplanned hospitalizations of patients with diabetes within the next 12 months

A. E. Andreychenko; A. D. Ermak; D. V. Gavrilov; R. E. Novitskiy; A. V. Gusev

doi:10.14341/DM13065

Сахарный диабет (May 2024)

Affiliations

A. E. Andreychenko: K-SkAI LLC
A. D. Ermak: K-SkAI LLC
D. V. Gavrilov: K-SkAI LLC
R. E. Novitskiy: K-SkAI LLC
A. V. Gusev: Federal Research Institute for Health Organization and Informatics; Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies

DOI: https://doi.org/10.14341/DM13065
Journal volume & issue: Vol. 27, no. 2
pp. 142–157

Abstract

BACKGROUND: The incidence of diabetes mellitus (DM) has been rising steadily for several decades, both in the Russian Federation and worldwide. Steady population growth and the current epidemiological characteristics of DM lead to enormous economic costs and significant social losses throughout the world. The disease often progresses to specific complications, which significantly increase the likelihood of hospitalization. Building and applying a machine learning model that predicts inpatient hospitalization of patients with DM would make it possible to personalize medical care and optimize the load on the entire healthcare system.

AIM: To develop and validate models for predicting unplanned hospitalizations of patients with diabetes due to the disease itself and its complications, using machine learning algorithms and data from real clinical practice.

MATERIALS AND METHODS: The study included 170,141 depersonalized electronic health records of 23,742 patients with diabetes. Anamnestic, constitutional, clinical, instrumental and laboratory data widely used in routine medical practice were considered as potential predictors, 33 features in total. Logistic regression (LR), gradient boosting methods (LightGBM, XGBoost, CatBoost), decision tree-based methods (RandomForest and ExtraTrees), and a neural network-based algorithm (Multi-layer Perceptron) were compared. External validation was performed on data from a separate region of the Russian Federation.

RESULTS: The LightGBM model showed the best results and the greatest stability on external validation data, with an AUC of 0.818 (95% CI 0.802–0.834) in internal testing and 0.802 (95% CI 0.773–0.832) in external validation.

CONCLUSION: The metrics of the best model were superior to those reported in previously published studies. External validation showed that the model is relatively stable on new data from another region, which suggests that it could be applied in real clinical practice.
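
The AUC values above are reported with 95% confidence intervals, but the abstract does not state how those intervals were obtained. As a minimal sketch, assuming a percentile bootstrap (a common choice, not necessarily the authors' method), the AUC and its interval can be estimated from binary labels and predicted scores as shown here. The function names `roc_auc` and `bootstrap_auc_ci`, the parameter defaults, and the synthetic data are illustrative assumptions and do not come from the study.

```python
import random


def roc_auc(y_true, y_score):
    """AUC via the Mann-Whitney rank-sum statistic; ties get average ranks.

    y_true must contain 0/1 labels, y_score the predicted scores.
    """
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    n_pos = sum(y_true)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the block of tied scores pairs[i:j].
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1..j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def bootstrap_auc_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Return (auc, lower, upper) using a percentile bootstrap over patients."""
    rng = random.Random(seed)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        labels = [y_true[k] for k in idx]
        # Redraw resamples that contain only one class (AUC undefined).
        if 0 < sum(labels) < n:
            aucs.append(roc_auc(labels, [y_score[k] for k in idx]))
    aucs.sort()
    lo_idx = int(n_boot * alpha / 2)
    hi_idx = max(lo_idx, n_boot - 1 - lo_idx)
    return roc_auc(y_true, y_score), aucs[lo_idx], aucs[hi_idx]


if __name__ == "__main__":
    # Synthetic stand-in data: roughly 20% positives, scores shifted by class.
    rng = random.Random(42)
    labels = [1 if rng.random() < 0.2 else 0 for _ in range(500)]
    scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
    auc, lower, upper = bootstrap_auc_ci(labels, scores, n_boot=500)
    print(f"AUC {auc:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```

Resampling here treats each row as independent. Because the study has several records per patient, resampling whole patients rather than individual records would be the more cautious choice for data of that shape.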

Published in Сахарный диабет

ISSN: 2072-0351 (Print); 2072-0378 (Online)
Publisher: Endocrinology Research Centre
Country of publisher: Russian Federation
LCC subjects: Medicine: Internal medicine: Specialties of internal medicine: Nutritional diseases. Deficiency diseases
Website: https://www.dia-endojournals.ru/jour/index