Heliyon (Jun 2024)
A stepwise prediction and interpretation of gestational diabetes mellitus: Foster the practical application of machine learning in clinical decision
Abstract
Background: Machine learning has been shown to be an effective method for the early prediction of gestational diabetes mellitus (GDM), enabling early intervention that can greatly decrease GDM incidence, reduce maternal and infant complications, and improve prognosis. However, there is still much room for improvement in data quality, feature dimensionality, and prediction accuracy. How clinical data from different pregnancy stages contribute to prediction accuracy, and the mechanisms behind those contributions, remain poorly explained. More importantly, current models still face notable obstacles in practical application because of their complex and diverse input features and the difficulty of redeploying them. As a result, a simple, practical, yet sufficiently accurate model is urgently needed.
Design and methods: In this study, 2309 samples from two public hospitals in Shenzhen, China were collected for analysis. Different algorithms were systematically compared to build a robust, stepwise prediction system (levels A to C) based on advanced machine learning, and the models at each level were interpreted.
Results: XGBoost achieved the best performance on the test set, with accuracy (ACC) of 0.922, 0.859, and 0.850 and area under the curve (AUC) of 0.974, 0.924, and 0.913 for the selected level A, B, and C models, respectively. Tree-based feature importance and the SHAP method identified the commonly recognized risk factors, while also indicating new, inconsistent impact trends for GDM across different stages of pregnancy.
Conclusion: A stepwise prediction system was successfully established. A practical tool that enables quick prediction of GDM was released at https://github.com/ifyoungnet/MedGDM. This study is expected to provide a more detailed profile of GDM risk and to lay the foundation for applying the model in practice.
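For readers less familiar with the two metrics reported in the Results, the following is a minimal sketch of how ACC and AUC are commonly computed from a classifier's predicted probabilities. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and do not come from the study data.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_prob: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded prediction matches the label.

    The 0.5 threshold is an assumption for this sketch.
    """
    preds = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(int(p == t) for p, t in zip(preds, y_true))
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = GDM, 0 = no GDM); not taken from the paper.
y_true = [1, 0, 1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.6, 0.8, 0.1]

acc = accuracy(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
print(f"ACC = {acc:.3f}, AUC = {auc:.3f}")
```

ACC depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model can report a noticeably higher AUC than ACC, as in the figures above.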