Frontiers in Endocrinology (Jun 2024)
Prediction model of preeclampsia using machine learning based methods: a population based cohort study in China
Abstract
Introduction
Preeclampsia is a disease of unknown pathogenesis and one of the leading causes of maternal and perinatal morbidity. At present, early identification of women at high risk, followed by timely intervention with aspirin, is an effective way to prevent preeclampsia. This study aims to develop a robust preeclampsia prediction model using machine learning algorithms based on maternal characteristics and biophysical and biochemical markers at 11–13+6 weeks’ gestation, providing an effective tool for early screening and prediction of preeclampsia.
Methods
This study included 5116 women with singleton pregnancies who underwent screening for preeclampsia (PE) and fetal aneuploidy in a prospective longitudinal cohort study in China. Maternal characteristics (such as maternal age, height, and pre-pregnancy weight), past medical history, mean arterial pressure, uterine artery pulsatility index, pregnancy-associated plasma protein A (PAPP-A), and placental growth factor (PLGF) were collected as covariates for the preeclampsia prediction model. Five classification algorithms (Logistic Regression, Extra Trees Classifier, Voting Classifier, Gaussian Process Classifier, and Stacking Classifier) were used to develop the prediction model. Five-fold cross-validation with an 8:2 train-test split was used for model validation.
Results
The final analysis included data from 4644 pregnant women, among whom there were 49 cases of preterm preeclampsia and 161 cases of term preeclampsia. With all covariates included, the Voting Classifier outperformed the other algorithms in predicting preterm preeclampsia, both in area under the curve (AUC) and in detection rate (DR) at a 10% false-positive rate (FPR) (AUC = 0.884, DR at 10% FPR = 0.625). However, its performance was similar to that of the other algorithms in predicting all PE and term PE. In the prediction of all preeclampsia, the contribution of PLGF was higher than that of PAPP-A (11.9% vs. 8.7%), whereas the opposite was true in the prediction of preterm preeclampsia (7.2% vs. 16.5%). The performance of the machine learning algorithms for preeclampsia and preterm preeclampsia was similar to that achieved by the Fetal Medicine Foundation competing risk model with the same predictive factors (AUCs of 0.797 and 0.856 for PE and preterm PE, respectively).
Conclusions
Our models provide an accessible tool for large-scale population screening and prediction of preeclampsia, which may help reduce the disease burden and improve maternal and fetal outcomes.
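The abstract reports the detection rate (DR) at a 10% false-positive rate (FPR) alongside the AUC. As a rough illustration of how such a figure can be derived from predicted risks, the following is a minimal sketch, not the authors' implementation. The function name `detection_rate_at_fpr`, its parameters, and the toy scores are all hypothetical, and handling ties by flagging only scores strictly above the threshold is an assumption.

```python
import math


def detection_rate_at_fpr(scores, labels, max_fpr=0.10):
    """Fraction of cases flagged when at most max_fpr of non-cases are flagged.

    scores: predicted risks (higher means higher risk).
    labels: 1 for a case (e.g. preterm PE), 0 for a non-case.
    """
    case_scores = [s for s, y in zip(scores, labels) if y == 1]
    control_scores = sorted(
        (s for s, y in zip(scores, labels) if y == 0), reverse=True
    )
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one non-case")

    # Number of false positives we can afford at the chosen FPR; the small
    # epsilon guards against floating-point round-off in the product.
    allowed_fp = math.floor(max_fpr * len(control_scores) + 1e-9)

    # Flag a pregnancy as screen-positive when its score is strictly above the
    # (allowed_fp + 1)-th highest non-case score, so that no more than
    # allowed_fp non-cases are flagged even when scores are tied.
    if allowed_fp < len(control_scores):
        threshold = control_scores[allowed_fp]
    else:
        threshold = -math.inf

    detected = sum(1 for s in case_scores if s > threshold)
    return detected / len(case_scores)


# Made-up toy scores for illustration only; these are not data from the study.
toy_controls = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.50]
toy_cases = [0.60, 0.40, 0.08, 0.20]
toy_scores = toy_controls + toy_cases
toy_labels = [0] * len(toy_controls) + [1] * len(toy_cases)

print(detection_rate_at_fpr(toy_scores, toy_labels, max_fpr=0.10))
```

In the toy data, a 10% FPR allows one of the ten non-cases to be flagged, so the threshold falls at 0.09 and three of the four cases exceed it, giving a detection rate of 0.75. In practice the scores would come from held-out folds of the cross-validation rather than from the training data.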
Keywords