Frontiers in Endocrinology (Mar 2024)

Identifying diagnostic indicators for type 2 diabetes mellitus from physical examination using interpretable machine learning approach

  • Xiang Lv,
  • Jiesi Luo,
  • Wei Huang,
  • Hui Guo,
  • Xue Bai,
  • Pijun Yan,
  • Zongzhe Jiang,
  • Yonglin Zhang,
  • Runyu Jing,
  • Qi Chen,
  • Menglong Li

DOI
https://doi.org/10.3389/fendo.2024.1376220
Journal volume & issue
Vol. 15

Abstract

Background: Identifying patients at risk for type 2 diabetes mellitus (T2DM) can prevent complications, reduce suffering, and ease the health care burden. Routine physical examination provides useful information for diagnosis, but manual exploration of routine physical examination records is not feasible given the high prevalence of T2DM.

Objectives: We aim to build interpretable machine learning models for T2DM diagnosis and to uncover important diagnostic indicators from physical examination, including age- and sex-related indicators.

Methods: We present three weighted diversity density (WDD)-based algorithms for T2DM screening that use physical examination indicators. The algorithms are highly transparent and interpretable, and two of them tolerate missing values.

Patients: We collected data on 43 physical examination indicators from 11,071 T2DM patients and 126,622 healthy controls at the Affiliated Hospital of Southwest Medical University. After data processing, we used a data matrix containing 16,004 electronic health records (EHRs) and 43 clinical indicators for modelling.

Results: The indicators were ranked according to their model weights, and the top 25% of indicators were found to be directly or indirectly related to T2DM. We further investigated the clinical characteristics of different age and sex groups and found that the algorithms can detect relevant indicators specific to these groups. The algorithms performed well in T2DM screening, with the highest area under the receiver operating characteristic curve (AUC) reaching 0.9185.

Conclusion: This work used interpretable WDD-based algorithms to construct T2DM diagnostic models based on physical examination indicators. By modeling data grouped by age and sex, we identified several predictive markers related to age and sex, uncovering characteristic differences among various groups of T2DM patients.
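The two evaluation steps named in the abstract, ranking indicators by model weight and scoring a screener by AUC, can be illustrated with a minimal Python sketch. It does not implement the WDD algorithms, which the abstract does not specify. The function names, the indicator names, the example weights, and the choice to rank by absolute weight are all hypothetical assumptions made for illustration only.

```python
import math
from typing import Dict, List, Sequence


def top_fraction_by_weight(weights: Dict[str, float], fraction: float = 0.25) -> List[str]:
    """Return the indicators in the top `fraction` when ranked by weight.

    Assumption: importance is taken as the absolute value of the weight;
    the paper may rank its weights differently.
    """
    ranked = sorted(weights, key=lambda name: abs(weights[name]), reverse=True)
    keep = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:keep]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical weights for four made-up indicators (not from the paper).
example_weights = {"fasting_glucose": 0.9, "age": 0.4, "bmi": -0.6, "height": 0.05}
top_indicators = top_fraction_by_weight(example_weights)

# Hypothetical labels (1 = T2DM, 0 = control) and model scores.
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise AUC loop is quadratic in the number of cases, which keeps the sketch short. For a dataset of the size described here, a rank-based formulation or a library routine would be the more practical choice.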

Keywords