Scientific Reports (Aug 2023)
Prediction of gestational diabetes mellitus in Asian women using machine learning algorithms
Abstract
This study developed a machine learning algorithm to predict gestational diabetes mellitus (GDM) using retrospective data from 34,387 pregnancies at multiple centers in South Korea. Variables were collected at baseline, E0 (until 10 weeks’ gestation), E1 (11–13 weeks’ gestation), and M1 (14–24 weeks’ gestation). The data set was randomly divided into training and test sets (7:3 ratio) to compare the performance of the light gradient boosting machine (LGBM) and extreme gradient boosting (XGBoost) algorithms using the full set of variables (original). A prediction model with the whole cohort achieved area under the receiver operating characteristic curve (AUC) and area under the precision-recall curve (AUPR) values of 0.711 and 0.246 at baseline, 0.720 and 0.256 at E0, 0.721 and 0.262 at E1, and 0.804 and 0.442 at M1, respectively. Three models with different variable sets were then compared: [a] variables from clinical guidelines; [b] variables selected by Shapley additive explanations (SHAP) values; and [c] variables selected by the Boruta algorithm. Based on model [c], which had the fewest variables and similar or better performance than the other models, simple questionnaires were developed. The combined use of maternal factors and laboratory data could effectively predict individual risk of GDM using a machine learning model.
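The evaluation protocol summarized above (a random 7:3 split, then AUC and AUPR on the held-out set) can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_70_30`, `roc_auc`, and `average_precision` are hypothetical, the toy labels and scores are invented for illustration, AUPR is approximated here by average precision, and tied scores are assumed not to occur in the AUPR calculation.

```python
import random

def split_70_30(rows, seed=0):
    """Randomly split rows into a training set (70%) and a test set (30%)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """AUPR approximated as average precision: the mean of the precision
    values observed at each true positive, ranked by descending score."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Toy example (invented values): 1 = GDM, 0 = no GDM.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
print(roc_auc(toy_labels, toy_scores))            # 0.75
print(average_precision(toy_labels, toy_scores))  # about 0.833
```

When reading the two metrics together, note that a classifier with no discriminative ability has an AUC near 0.5 but an AUPR near the outcome prevalence, so AUPR values are best interpreted against the proportion of positive cases in the test set.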