Life (Apr 2022)
Prediction of Emergency Cesarean Section Using Machine Learning Methods: Development and External Validation of a Nationwide Multicenter Dataset in Republic of Korea
Abstract
This multicenter retrospective cohort study of term nulliparous women who underwent labor was conducted to develop an automated machine learning model for predicting emergent cesarean section (CS) before the onset of labor. Nine machine learning methods were applied and compared for prediction of emergent CS during active labor: logistic regression, random forest, support vector machine (SVM), gradient boosting, extreme gradient boosting (XGBoost), light gradient boosting machine (LGBM), k-nearest neighbors (KNN), Voting, and Stacking. External validation was performed using a nationwide multicenter dataset for Korean fetal growth. A total of 6549 term nulliparous women were included in the analysis, and the emergent CS rate was 16.1%. The C-statistic values for KNN, Voting, XGBoost, Stacking, gradient boosting, random forest, LGBM, logistic regression, and SVM were 0.6, 0.69, 0.64, 0.59, 0.66, 0.68, 0.68, 0.7, and 0.69, respectively. The logistic regression model showed the best predictive performance, with an accuracy of 0.78. The machine learning model identified nine significant variables, comprising maternal age, height, pre-pregnancy weight, pregnancy-associated hypertension, gestational age, and fetal sonographic findings. In the external validation set (1391 term nulliparous women), the logistic regression model had a C-statistic of 0.69, an overall accuracy of 0.68, a specificity of 0.83, and a sensitivity of 0.41. Machine learning algorithms using clinical and sonographic parameters obtained near term could be useful tools for predicting the individual risk of emergent CS during active labor in nulliparous women.
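For readers less familiar with the reported metrics, the following is a minimal Python sketch of how accuracy, sensitivity, specificity, and the C-statistic are conventionally computed for a binary outcome such as emergent CS. It is not the study's code: the function names, the 0.5 decision threshold, and the eight-patient toy data are all assumptions made purely for illustration.

```python
from itertools import product


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = emergent CS)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def summary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on cases), specificity (recall on non-cases)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def c_statistic(y_true, scores):
    """Probability that a random case scores higher than a random non-case.

    Ties count as half. For a binary outcome this equals the area under
    the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p, n in product(pos, neg)
    )
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the study.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = summary_metrics(y_true, y_pred)
auc = c_statistic(y_true, scores)
print(metrics, round(auc, 3))
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas the C-statistic summarizes ranking quality across all thresholds. This is one reason a model can pair a high specificity with a low sensitivity, as in the external validation results above.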
Keywords