Diabetology & Metabolic Syndrome (Jul 2023)

Machine learning for predicting diabetes risk in western China adults

  • Lin Li,
  • Yinlin Cheng,
  • Weidong Ji,
  • Mimi Liu,
  • Zhensheng Hu,
  • Yining Yang,
  • Yushan Wang,
  • Yi Zhou

DOI
https://doi.org/10.1186/s13098-023-01112-y
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 12

Abstract

Read online

Abstract Objective Diabetes mellitus is a global epidemic disease. Long-time exposure of patients to hyperglycemia can lead to various type of chronic tissue damage. Early diagnosis of and screening for diabetes are crucial to population health. Methods We collected the national physical examination data in Xinjiang, China, in 2020 (a total of more than 4 million people). Three types of physical examination indices were analyzed: questionnaire, routine physical examination and laboratory values. Integrated learning, deep learning and logistic regression methods were used to establish a risk model for type-2 diabetes mellitus. In addition, to improve the convenience and flexibility of the model, a diabetes risk score card was established based on logistic regression to assess the risk of the population. Results An XGBoost-based risk prediction model outperformed the other five risk assessment algorithms. The AUC of the model was 0.9122. Based on the feature importance ranking map, we found that hypertension, fasting blood glucose, age, coronary heart disease, ethnicity, parental diabetes mellitus, triglycerides, waist circumference, total cholesterol, and body mass index were the most important features of the risk prediction model for type-2 diabetes. Conclusions This study established a diabetes risk assessment model based on multiple ethnicities, a large sample and many indices, and classified the diabetes risk of the population, thus providing a new forecast tool for the screening of patients and providing information on diabetes prevention for healthy populations.

Keywords