BMC Pregnancy and Childbirth (Nov 2024)
Building a machine learning-based risk prediction model for second-trimester miscarriage
Abstract
Background
Second-trimester miscarriage is a common adverse pregnancy outcome that places substantial economic and psychological burdens on patients and their families. Research on predictive models for the risk of second-trimester miscarriage is currently scarce.

Methods
Clinical data were retrospectively collected from patients in the second trimester of pregnancy (14+0 to 27+6 weeks of gestation) who were hospitalized at the Women and Children’s Hospital of Ningbo University between January 2020 and October 2023 with a main diagnosis of “threatened abortion”. After preliminary data processing, the cohort was randomly split, with stratification, into a training cohort (70%) and a validation cohort (30%). The Boruta algorithm and multifactor analysis were used to refine the candidate features and identify the optimal features associated with second-trimester miscarriage. The class imbalance in the training cohort was corrected with the synthetic minority oversampling technique (SMOTE). Seven machine learning models were built, validated, and compared on their predictive performance, and the optimal model was selected. Shapley additive explanations (SHAP) were generated to explain the model’s predictions, and a visual representation of the predictive model was built.

Results
A total of 2006 patients were included in the study, of whom 395 (19.69%) had a second-trimester miscarriage. XGBoost was identified as the optimal model after the seven models were compared on accuracy, precision, recall, F1 score, average precision (from the precision-recall curve), area under the receiver operating characteristic curve, decision curve analysis, and the calibration curve. SHAP-based relevance rankings identified the top ten features for second-trimester miscarriage, of which cervical length was the most significant.

Conclusion
The visual risk prediction model, built on the machine learning approach described above, can accurately predict the risk of second-trimester miscarriage.
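
The two data-preparation steps named in the Methods, a stratified 70/30 split followed by SMOTE applied to the training cohort only, can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the abstract does not describe and which most likely relied on an established library. The function names `stratified_split` and `smote`, the neighbour count `k=5`, and the toy data (100 randomly generated two-feature records with 20% positives, chosen only to resemble the reported 19.69% event rate) are all assumptions made for illustration.

```python
import math
import random


def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, valid_idx) with class proportions preserved."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, valid_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)                      # random assignment within each class
        cut = round(train_frac * len(idx))    # 70% of this class goes to training
        train_idx.extend(idx[:cut])
        valid_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(valid_idx)


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours (the core idea of SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()                    # position along the segment base -> nb
        synthetic.append([b + gap * (n - b) for b, n in zip(base, nb)])
    return synthetic


# Toy data (hypothetical, NOT the study cohort): 20 positives, 80 negatives.
data_rng = random.Random(42)
labels = [1] * 20 + [0] * 80
features = [[data_rng.gauss(2.0 if y else 0.0, 1.0), data_rng.gauss(0.0, 1.0)]
            for y in labels]

train_idx, valid_idx = stratified_split(labels)
train_pos = [features[i] for i in train_idx if labels[i] == 1]
train_neg = [features[i] for i in train_idx if labels[i] == 0]

# Oversample only the training minority class; validation data stay untouched.
new_pos = smote(train_pos, n_new=len(train_neg) - len(train_pos))
balanced_pos = train_pos + new_pos

print(len(train_idx), len(valid_idx), len(balanced_pos), len(train_neg))
# prints: 70 30 56 56
```

Oversampling after the split, and on the training cohort alone, keeps synthetic records out of the validation cohort, so validation metrics are still computed on the original class distribution.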
Keywords