Diabetes & Metabolism Journal (Jul 2021)

Development and Validation of a Deep Learning Based Diabetes Prediction System Using a Nationwide Population-Based Cohort

  • Sang Youl Rhee,
  • Ji Min Sung,
  • Sunhee Kim,
  • In-Jeong Cho,
  • Sang-Eun Lee,
  • Hyuk-Jae Chang

DOI
https://doi.org/10.4093/dmj.2020.0081
Journal volume & issue
Vol. 45, no. 4
pp. 515 – 525

Abstract

Background: Previously developed prediction models for type 2 diabetes mellitus (T2DM) have limited performance. We developed a deep learning (DL)-based model using a cohort representative of the Korean population.

Methods: This study was based on the National Health Insurance Service-Health Screening (NHIS-HEALS) cohort of Korea. Overall, 335,302 subjects without T2DM at baseline were included. We developed the model using 80% of the subjects and validated its performance in the remaining 20%. Predictive models for T2DM were constructed using a recurrent neural network with long short-term memory (RNN-LSTM) and the Cox longitudinal summary model. The performance of the two models over a 10-year period was compared using the time-dependent area under the curve.

Results: During a mean follow-up of 10.4±1.7 years, the mean frequency of periodic health check-ups was 2.9±1.0 per subject. During the observation period, T2DM was newly observed in 8.7% of the subjects. The annual performance of the model created using the RNN-LSTM network was superior to that of the Cox model. The risk factors for T2DM derived from the two models were similar, although certain results differed.

Conclusion: The DL-based T2DM prediction model, constructed using a cohort representative of the population, performs better than the conventional model. After pilot tests, this model will be provided to all Korean national health screening recipients in the future.
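
As an illustration of the 80%/20% development and validation split mentioned in the Methods, the following is a minimal sketch in Python. The abstract does not state how subjects were assigned to each set, so the simple seeded random shuffle used here is an assumption, and the function name `split_subjects`, its parameters, and the placeholder integer IDs are hypothetical.

```python
import random


def split_subjects(subject_ids, dev_num=4, dev_den=5, seed=0):
    """Randomly split subject IDs into development and validation sets.

    Assumption: a simple seeded random split. The abstract does not say
    whether the actual split was random, stratified, or something else.
    """
    ids = list(subject_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    # Integer arithmetic avoids floating-point rounding in the 80% cut-off.
    n_dev = len(ids) * dev_num // dev_den
    return ids[:n_dev], ids[n_dev:]


# Placeholder IDs 0..335301 stand in for the 335,302 cohort subjects.
dev_ids, val_ids = split_subjects(range(335302))
print(len(dev_ids), len(val_ids))
```

With 335,302 subjects, a split of this kind assigns 268,241 subjects to development and 67,061 to validation; the exact counts used in the study are not given in the abstract.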

Keywords