Frontiers in Endocrinology (Nov 2022)

Development and validation of a machine learning-augmented algorithm for diabetes screening in community and primary care settings: A population-based study

  • XiaoHuan Liu,
  • Weiyue Zhang,
  • Qiao Zhang,
  • Long Chen,
  • TianShu Zeng,
  • JiaoYue Zhang,
  • Jie Min,
  • ShengHua Tian,
  • Hao Zhang,
  • Hantao Huang,
  • Ping Wang,
  • Xiang Hu,
  • LuLu Chen

DOI
https://doi.org/10.3389/fendo.2022.1043919
Journal volume & issue
Vol. 13

Abstract


Background: Timely screening for diabetes is crucial to reduce its related morbidity, mortality, and socioeconomic burden. Machine learning (ML) has excellent capability to maximize predictive accuracy. We aim to develop ML-augmented models for diabetes screening in community and primary care settings.

Methods: A total of 8425 participants were included from a population-based study conducted in Hubei, China, since 2011. The dataset was split into a development set and a testing set. Seven different ML algorithms were compared to generate predictive models. Non-laboratory features were employed in the ML model for community settings, and laboratory test features were further introduced in the ML+lab models for primary care. The area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (auPR), and the average detection cost per participant of these models were compared with those of their counterparts based on the New China Diabetes Risk Score (NCDRS), which is currently recommended for diabetes screening.

Results: The AUC and auPR of the ML model were 0·697 and 0·303 in the testing set, seemingly outperforming those of NCDRS by 10·99% and 64·67%, respectively. At the same sensitivity (0·72), the average detection cost of the ML model was 12·81% lower than that of NCDRS. Moreover, the average detection cost of the ML+FPG model was the lowest among the ML+lab models and lower than that of both the ML model and the NCDRS+FPG model.

Conclusion: The ML model and the ML+FPG model achieved higher predictive accuracy and lower detection costs than their counterparts based on NCDRS. Thus, the ML-augmented algorithm has the potential to be employed for diabetes screening in community and primary care settings.
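For readers unfamiliar with the two accuracy metrics named in the abstract, the following is a minimal Python sketch of how AUC and a step-wise estimate of auPR (average precision) can be computed from risk scores, together with a relative-gain calculation. It is an illustration only: the labels and scores are toy values, not study data; the function names are hypothetical; and treating the reported percentages as relative gains over the NCDRS baseline is an assumption, since the abstract does not state how they were calculated.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Averages the precision observed at the rank of each positive case.
    Assumes no tied scores (ties would make the ranking order-dependent).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive case")
    return precision_sum / hits


def relative_gain_percent(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent (assumed definition)."""
    return (new - baseline) / baseline * 100.0


# Toy example: 1 = diabetes, 0 = no diabetes; scores are hypothetical risk scores.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_aupr = average_precision(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}, auPR (average precision) = {toy_aupr:.3f}")
```

auPR is sensitive to the proportion of positive cases, which is why it is often reported alongside AUC when the condition being screened for is relatively uncommon in the sample.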

Keywords