Application of machine learning algorithm incorporating dietary intake in prediction of gestational diabetes mellitus

Tianze Ding; Peijie Liu; Jie Jia; Hui Wu; Jie Zhu; Kefeng Yang

doi:10.1530/EC-24-0169

Endocrine Connections (Nov 2024)

Affiliations

Tianze Ding: Department of Clinical Nutrition, Xin Hua Hospital Affiliated to School of Medicine, Shanghai Jiao Tong University, Shanghai, China; Department of Clinical Nutrition, College of Health Science and Technology, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
Peijie Liu: Department of Clinical Nutrition, Xin Hua Hospital Affiliated to School of Medicine, Shanghai Jiao Tong University, Shanghai, China; Department of Clinical Nutrition, College of Health Science and Technology, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
Jie Jia: Department of Clinical Nutrition, Xin Hua Hospital Affiliated to School of Medicine, Shanghai Jiao Tong University, Shanghai, China; Department of Clinical Nutrition, College of Health Science and Technology, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
Hui Wu: Department of Nutrition, Seventh People’s Hospital of Shanghai University of Traditional Chinese Medicine, Shanghai, China
Jie Zhu: Nutrition and Foods Program, School of Family and Consumer Sciences, Texas State University, San Marcos, Texas, USA
Kefeng Yang: Department of Clinical Nutrition, Xin Hua Hospital Affiliated to School of Medicine, Shanghai Jiao Tong University, Shanghai, China; Department of Clinical Nutrition, College of Health Science and Technology, School of Medicine, Shanghai Jiao Tong University, Shanghai, China

DOI: https://doi.org/10.1530/EC-24-0169
Journal volume & issue: Vol. 13, no. 12
pp. 1 – 8

Abstract

Introduction: Gestational diabetes mellitus (GDM) significantly affects pregnancy outcomes. Prediction models are therefore important because they can guide timely interventions that reduce the incidence of GDM and its associated adverse effects.

Methods: A total of 554 pregnant women were included, and their sociodemographic characteristics, clinical data and dietary data were collected. Dietary intake was assessed with a validated semi-quantitative food frequency questionnaire (FFQ). Features were selected by random forest mean decrease in impurity, and models were built using the logistic regression, XGBoost and LightGBM algorithms. The prediction performance of the models was compared by accuracy, sensitivity, specificity, area under the curve (AUC) and the Hosmer–Lemeshow test.

Results: Blood glucose, age, pre-pregnancy body mass index (BMI), triglycerides and high-density lipoprotein cholesterol (HDL) were the five top-ranked features in the feature selection. Among the three algorithms, XGBoost performed best (AUC = 0.788), followed by LightGBM (AUC = 0.749) and logistic regression (AUC = 0.712). In addition, XGBoost and LightGBM both performed better when dietary information was included than on the non-dietary dataset (AUC 0.788 vs 0.718 for XGBoost; 0.749 vs 0.726 for LightGBM).

Conclusion: The XGBoost and LightGBM algorithms outperformed logistic regression in predicting GDM among Chinese pregnant women. In addition, dietary data may improve model performance, which deserves more in-depth investigation with a larger sample size.
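The evaluation measures named in the Methods (accuracy, sensitivity, specificity, AUC and the Hosmer–Lemeshow statistic) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the 0.5 decision threshold, the equal-size risk groups and the toy inputs are all assumptions made for illustration, and AUC is taken here to mean the area under the ROC curve, which the abstract does not state explicitly.

```python
from typing import Dict, List, Sequence, Tuple


def classification_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                           threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity at an assumed probability cut-off."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """ROC AUC as the share of (case, non-case) pairs ranked correctly; ties count half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow_statistic(y_true: Sequence[int], y_prob: Sequence[float],
                              groups: int = 10) -> float:
    """Hosmer-Lemeshow chi-square statistic over groups of sorted predicted risk.

    The p-value (chi-square with groups - 2 degrees of freedom) is omitted
    to keep the sketch free of third-party dependencies.
    """
    pairs: List[Tuple[float, int]] = sorted(zip(y_prob, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # observed cases in the group
        expected = sum(p for p, _ in chunk)   # expected cases in the group
        mean_p = expected / size
        denom = size * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


# Toy labels and predicted probabilities, invented purely for illustration.
toy_labels = [0, 0, 1, 1]
toy_probs = [0.1, 0.4, 0.35, 0.8]
toy_metrics = classification_metrics(toy_labels, toy_probs)
toy_auc = roc_auc(toy_labels, toy_probs)
toy_hl = hosmer_lemeshow_statistic(toy_labels, toy_probs, groups=2)
```

In practice these quantities would normally be computed on held-out predictions with an established library; the pairwise AUC above is quadratic in the sample size and is written for clarity rather than speed.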

Published in Endocrine Connections

ISSN: 2049-3614 (Online)
Publisher: Bioscientifica
Country of publisher: United Kingdom
LCC subjects: Medicine: Internal medicine: Specialties of internal medicine: Diseases of the endocrine glands. Clinical endocrinology
Website: http://www.endocrineconnections.com/
