International Journal of General Medicine (Nov 2023)

Establishment and Validation of a Machine Learning-Based Prediction Model for Termination of Pregnancy via Cesarean Section

  • Zhang R,
  • Sheng W,
  • Liu F,
  • Zhang J,
  • Bai W

Journal volume & issue
Vol. 16
pp. 5567 – 5578

Abstract

Rui Zhang,1 Weixuan Sheng,2 Feiran Liu,1 Jin Zhang,1 Wenpei Bai1

1 Department of Obstetrics and Gynaecology, Beijing Shijitan Hospital, Capital Medical University, Beijing, People’s Republic of China; 2 Department of Anesthesiology, Beijing Shijitan Hospital, Capital Medical University, Beijing, People’s Republic of China

Correspondence: Wenpei Bai, Email [email protected]

This study aimed to investigate the risk factors for cesarean section and to establish a prediction model for cesarean section based on the characteristics of pregnant women.

Methods: The clinical characteristics of 2552 women with singleton pregnancies who delivered a live baby between January 2020 and December 2021 were retrospectively reviewed. They were divided into a vaginal delivery group (n = 1850) and a cesarean section group (n = 702). The subjects were split into a training set (January 2020 to June 2021) and a validation set (July 2021 to December 2021). In the training set, univariate analysis, Lasso regression, and Boruta were used to screen independent risk factors for cesarean section. Four models, Logistic Regression (LR), K-Nearest Neighbor (KNN), Classification and Regression Tree (CART), and Random Forest (RF), were established in the training set using K-fold cross-validation, hyperparameter optimization, and random oversampling. The best-performing model was selected, and a ranked plot of the feature variables, univariate partial dependence profiles, and Break Down profiles were plotted. In the validation set, the confusion matrix parameters were calculated, and the receiver operating characteristic (ROC) curve, precision-recall curve (PRC), calibration curve, and clinical decision curve analysis (DCA) were plotted.

Results: The risk factors for cesarean section included maternal age and height, weight at delivery, weight gain, parity, assisted reproduction, abnormal blood glucose during pregnancy, pregnancy hypertension, scarred uterus, premature rupture of membranes (PROM), placenta previa, fetal malposition, thrombocytopenia, floating fetal head, and labor analgesia. RF had the best performance among the four models, with an accuracy of 0.8956357 derived from the confusion matrix. The Matthews correlation coefficient (MCC) was 0.753012. The area under the ROC curve (AUC-ROC) was 0.9790787, and the area under the PRC (AUC-PRC) was 0.957888.

Conclusion: The RF prediction model for cesarean section has high discrimination performance, accuracy, and consistency, and outstanding generalization ability.

Keywords: caesarean section, machine learning, confusion matrix, univariate partial dependence profile
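The abstract names random oversampling in the training set and reports accuracy and MCC from the validation confusion matrix, but it does not give the implementation. The following is a minimal Python sketch of those two steps under stated assumptions: the function names (`random_oversample`, `confusion_counts`, `accuracy`, `mcc`) are hypothetical, the labels are toy values rather than study data, and 1 is assumed to encode cesarean section and 0 vaginal delivery.

```python
import math
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until both classes are equal in size.

    Assumes binary labels (0/1) with at least one sample in each class.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    # Sample minority indices with replacement to close the gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy example (hypothetical values, not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
acc = accuracy(*counts)
coef = mcc(*counts)

# Balance a toy training set; the feature rows are placeholders.
toy_rows = [[i] for i in range(len(y_true))]
balanced_rows, balanced_labels = random_oversample(toy_rows, y_true)
```

In this sketch, oversampling is applied only to training data, in line with the abstract, so that the validation metrics reflect the original class distribution. MCC is reported alongside accuracy because it uses all four cells of the confusion matrix, which makes it less sensitive to class imbalance than accuracy alone.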

Keywords