Diabetes & Metabolism Journal (Jul 2022)

Development of Various Diabetes Prediction Models Using Machine Learning Techniques

  • Juyoung Shin,
  • Jaewon Kim,
  • Chanjung Lee,
  • Joon Young Yoon,
  • Seyeon Kim,
  • Seungjae Song,
  • Hun-Sung Kim

DOI
https://doi.org/10.4093/dmj.2021.0115
Journal volume & issue
Vol. 46, no. 4
pp. 650 – 657

Abstract

Background: Many models exist for predicting diabetes mellitus (DM), but their clinical implications remain unclear. We therefore aimed to develop several DM prediction models using easily accessible health screening test parameters.

Methods: Two sets of variables were used to develop eight DM prediction models. The first set comprised 62 commonly used, easily accessible examination results from a tertiary university hospital. The second set comprised the 27 of those 62 variables that are included in national routine health checkups. Gradient boosting and random forest algorithms were used to develop the models. Internal validation was performed using stratified 10-fold cross-validation.

Results: Among the eight DM prediction models, the 62-variable model making 12-month predictions for subjects without diabetes had the largest area under the receiver operating characteristic curve (ROC-AUC), at 0.928. The ROC-AUC dropped by more than 0.04 when training with the simplified 27-variable set, but performance remained fairly good, with ROC-AUCs between 0.842 and 0.880. Accuracy was up to 11.5% lower (0.714 versus 0.807) when fasting glucose was excluded.

Conclusion: We created easily applicable diabetes prediction models that deliver good performance using parameters commonly assessed during tertiary university hospital and national routine health checkups. We plan to perform prospective external validation, in the hope that the developed DM prediction models will be widely used in clinical practice.
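To make the internal validation scheme named in the Methods concrete, the following is a minimal sketch of stratified 10-fold cross-validation scored by ROC-AUC. It is not the authors' code. The data are synthetic, the single `glucose` feature and its distributions are invented for illustration, and a trivial class-mean midpoint scorer stands in for the gradient boosting and random forest models, which the abstract does not specify in detail. The helper names `stratified_kfold` and `roc_auc` are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean


def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds keep the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)  # deal each class round-robin across folds
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def roc_auc(y_true, scores):
    """ROC-AUC as the chance a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic cohort (assumption): 40 subjects who develop DM, 160 who do not.
rng = random.Random(42)
labels = [1] * 40 + [0] * 160
glucose = [rng.gauss(115, 12) if y == 1 else rng.gauss(95, 10) for y in labels]

fold_aucs = []
for train, test in stratified_kfold(labels, k=10):
    # Stand-in "model": fit the two class means on the training fold only,
    # then score each test subject by signed distance from their midpoint.
    mu_pos = mean(glucose[i] for i in train if labels[i] == 1)
    mu_neg = mean(glucose[i] for i in train if labels[i] == 0)
    midpoint = (mu_pos + mu_neg) / 2
    direction = 1.0 if mu_pos >= mu_neg else -1.0
    scores = [direction * (glucose[i] - midpoint) for i in test]
    fold_aucs.append(roc_auc([labels[i] for i in test], scores))

mean_auc = mean(fold_aucs)
print(f"mean ROC-AUC over {len(fold_aucs)} folds: {mean_auc:.3f}")
```

Stratifying the folds keeps the proportion of DM cases the same in every test fold, which matters when the positive class is small. In practice the stand-in scorer would be replaced by a fitted classifier's predicted probabilities.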

Keywords