PLoS Medicine (Apr 2024)

A novel electronic health record-based, machine-learning model to predict severe hypoglycemia leading to hospitalizations in older adults with diabetes: A territory-wide cohort and modeling study.

  • Mai Shi,
  • Aimin Yang,
  • Eric S H Lau,
  • Andrea O Y Luk,
  • Ronald C W Ma,
  • Alice P S Kong,
  • Raymond S M Wong,
  • Jones C M Chan,
  • Juliana C N Chan,
  • Elaine Chow

DOI
https://doi.org/10.1371/journal.pmed.1004369
Journal volume & issue
Vol. 21, no. 4
p. e1004369

Abstract

Background
Older adults with diabetes are at high risk of severe hypoglycemia (SH). Many machine-learning (ML) models that predict short-term hypoglycemia are not specific to older adults and show poor precision-recall performance. We aimed to develop a multidimensional, electronic health record (EHR)-based ML model to predict one-year risk of SH requiring hospitalization in older adults with diabetes.

Methods and findings
We adopted a case-control design for a retrospective territory-wide cohort of 1,456,618 records from 364,863 unique older adults (age ≥65 years) with diabetes and at least 1 Hong Kong Hospital Authority attendance from 2013 to 2018. We used 258 predictors, including demographics, admissions, diagnoses, medications, and routine laboratory tests in a one-year period, to predict SH events requiring hospitalization in the following 12 months. The cohort was randomly split into training, testing, and internal validation sets in a 7:2:1 ratio. Six ML algorithms were evaluated: logistic regression, random forest, gradient boosting machine, deep neural network (DNN), XGBoost, and RuleFit. We tested our model in a temporal validation cohort in the Hong Kong Diabetes Register (HKDR), with predictors defined in 2018 and outcome events defined in 2019. Predictive performance was assessed using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), and positive predictive value (PPV). We identified 11,128 SH events requiring hospitalization during the observation periods. The XGBoost model yielded the best performance (AUROC = 0.978 [95% CI 0.972 to 0.984]; AUPRC = 0.670 [95% CI 0.652 to 0.688]; PPV = 0.721 [95% CI 0.703 to 0.739]). This was superior to an 11-variable conventional logistic regression model comprising age, sex, history of SH, hypertension, blood glucose, kidney function measurements, and use of oral glucose-lowering drugs (GLDs) (AUROC = 0.906; AUPRC = 0.085; PPV = 0.468). The most impactful predictors included non-use of lipid-regulating drugs, in-patient admission, urgent emergency triage, insulin use, and history of SH. External validation in the HKDR cohort yielded an AUROC of 0.856 [95% CI 0.838 to 0.873]. The main limitations of this study were limited transportability of the model and lack of geographically independent validation.

Conclusions
Our novel ML model demonstrated good discrimination and high precision in predicting one-year risk of SH requiring hospitalization. This may be integrated into EHR decision support systems for preemptive intervention in older adults at highest risk.
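
Illustrative sketch (not the authors' code)
The abstract reports a random 7:2:1 split and three metrics: AUROC, AUPRC, and PPV. The following minimal Python sketch shows one common way such quantities can be computed. It is an illustration under stated assumptions: the function names, the toy labels and scores, and the 0.5 threshold are hypothetical and are not taken from the study; AUPRC is approximated here by average precision, and the abstract does not specify which estimator or decision threshold was used.

```python
import random


def split_7_2_1(ids, seed=0):
    """Shuffle ids and split them 70% / 20% / 10% (hypothetical helper)."""
    rng = random.Random(seed)
    ids = list(ids)
    rng.shuffle(ids)
    n_train = len(ids) * 7 // 10
    n_test = len(ids) * 2 // 10
    return ids[:n_train], ids[n_train:n_train + n_test], ids[n_train + n_test:]


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each positive (assumes no tied scores)."""
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def ppv(y_true, scores, threshold=0.5):
    """Fraction of true positives among cases flagged at or above the threshold."""
    flagged = [y for y, s in zip(y_true, scores) if s >= threshold]
    return sum(flagged) / len(flagged)


# Toy data for illustration only; these are NOT study data.
toy_labels = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.05, 0.7, 0.9, 0.3, 0.15]

if __name__ == "__main__":
    train, test, valid = split_7_2_1(range(100))
    print(len(train), len(test), len(valid))
    print(auroc(toy_labels, toy_scores))
    print(average_precision(toy_labels, toy_scores))
    print(ppv(toy_labels, toy_scores))
```

With rare outcomes such as SH hospitalization, AUROC can remain high while AUPRC and PPV are much lower, which is consistent with the gap between the two models' AUPRC values reported above despite their similar AUROC.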