International Journal of Women's Health (Oct 2024)
A Random Survival Forest Model for Predicting Residual and Recurrent High-Grade Cervical Intraepithelial Neoplasia in Premenopausal Women
Abstract
Furui Zhai, Shanshan Mu, Yinghui Song, Min Zhang, Cui Zhang, Ze Lv
Gynecological Clinic, Cangzhou Central Hospital, Cangzhou, Hebei, People’s Republic of China
Correspondence: Furui Zhai, Gynecological Clinic, Cangzhou Central Hospital, 16 Xinhua West Road, Cangzhou City, Hebei Province, People’s Republic of China, Tel +86-0317-2075783, Email [email protected]
Loop electrosurgical excision procedure (LEEP) for high-grade cervical intraepithelial neoplasia (CIN) carries significant risks of recurrence and persistence. This study compares the efficacy of a random survival forest (RSF) model with that of a conventional Cox regression model for predicting residual and recurrent high-grade CIN in premenopausal women after LEEP.
Methods: Data from 458 premenopausal women treated for CIN2/3 at our hospital between 2016 and 2020 were analyzed. The RSF model incorporated demographic, pathological, and treatment-related variables. Feature selection utilizing LASSO and three other algorithms was performed to enhance the RSF model, which was further compared to a Cox regression model. Model performance was assessed using area under the curve (AUC), out-of-bag (OOB) error rates, and SHAP values to interpret predictor importance.
Results: The RSF model showed superior performance compared to the Cox regression model, with AUC values of 0.767–0.901 and peak predictive performance at 36 months post-LEEP. In contrast, the highest AUC achieved by Cox regression was 0.880. The RSF model also exhibited relatively lower OOB error rates, indicating better generalizability. Moreover, SHAP value analysis identified margin status and CIN severity as the most prominent predictors that directly affected risk predictions. Lastly, an online tool providing real-time predictions in clinical settings was successfully implemented using the RSF model.
Conclusion: The RSF model outperformed the traditional Cox regression model in predicting residual and recurrent high-grade CIN risks post-LEEP. This model may be a more accurate clinical tool that facilitates improved personalized care and early interventions in gynecological oncology.
Keywords: cervical intraepithelial neoplasia, residual/recurrent, random survival forest, Cox regression, premenopausal women
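
The abstract reports out-of-bag (OOB) error rates without defining them. As a minimal sketch, assuming the error is one minus Harrell's concordance index (a common convention in random survival forest software, though not stated in the source), the metric could be computed as follows. The function name, variable names, and the six toy patients are hypothetical illustrations and are not study data; in an actual RSF, the risk scores would come from trees that did not see the patient during training.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (simplified sketch).

    times       : follow-up time per patient (e.g. months after LEEP)
    events      : 1 if residual/recurrent disease was observed, 0 if censored
    risk_scores : model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter time ends in an event.
        # Assumption: pairs with tied times are skipped for simplicity.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event got the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort (NOT data from the study).
times = [6, 12, 18, 24, 30, 36]
events = [1, 0, 1, 0, 1, 0]
risk_scores = [0.9, 0.4, 0.7, 0.8, 0.3, 0.1]

c_index = concordance_index(times, events, risk_scores)
error_rate = 1.0 - c_index  # assumed definition of the reported error
print(f"C-index = {c_index:.3f}, error rate = {error_rate:.3f}")
```

Under this assumed definition, a lower error rate means the model more often ranks the patient who relapses earlier as higher risk, which is the sense in which the abstract links lower OOB error to better generalizability.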