Infectious Agents and Cancer (Dec 2024)
Assessing the risk of high-grade squamous intraepithelial lesions (HSIL+) in women with LSIL biopsies: a machine learning-based study
Abstract
Objective: This study analyzes factors associated with the missed diagnosis of high-grade squamous intraepithelial lesions or worse (HSIL+) in patients initially diagnosed with low-grade squamous intraepithelial lesions (LSIL) by colposcopic biopsy, and develops a predictive model for assessing the risk of missed HSIL+.
Methods: We conducted a retrospective analysis of 505 patients who underwent a loop electrosurgical excision procedure (LEEP) following an LSIL diagnosis by colposcopic biopsy. Logistic regression was used to identify demographic and pathological parameters associated with missed diagnoses of HSIL+. Several machine learning methods were then used to construct risk prediction models and assess their performance.
Results: The overall rate of missed HSIL+ diagnoses was 15.2%. The independent risk factors identified were HPV16/18 infection (OR 2.071; 95% CI 1.039–4.127; p = 0.039), TCT ≥ ASC-H (OR 4.147; 95% CI 1.392–12.355; p = 0.011), type 3 transformation zone, TZ3 (OR 1.966; 95% CI 1.003–3.853; p = 0.049), and colposcopic impression G2 (OR 3.627; 95% CI 1.350–9.743; p = 0.011). Among the models tested, the Decision Tree algorithm performed best in the validation set, with an accuracy of 94.7%, sensitivity of 80.0%, specificity of 96.9%, and an area under the curve (AUC) of 0.936.
Conclusion: The key independent risk factors for missed diagnosis of HSIL+ in patients with LSIL are HPV16/18 infection, TCT ≥ ASC-H, TZ3, and colposcopic impression G2. The Decision Tree model offers a cost-effective, reliable, and clinically valuable tool for predicting the risk of missed HSIL+ diagnosis, facilitating early intervention and management.
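The odds ratios, confidence intervals, and p-values reported in the Results can be related to one another with a short calculation. The sketch below is illustrative only and is not the authors' code: it assumes the intervals are standard Wald intervals on the log-odds scale (OR = exp(β), 95% CI = exp(β ± 1.96·SE)), which the abstract does not state explicitly. The function name `wald_p_from_ci` and the dictionary `reported` are hypothetical names introduced here. Under that assumption, the recomputed two-sided p-values should land close to the reported ones.

```python
import math


def wald_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover beta, SE, z and a two-sided p-value from an OR and its CI.

    Assumes a Wald interval on the log-odds scale (an assumption,
    not something stated in the abstract).
    """
    beta = math.log(odds_ratio)                      # log-odds coefficient
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se                                    # Wald z statistic
    p = math.erfc(abs(z) / math.sqrt(2))             # two-sided normal p-value
    return beta, se, z, p


# Values copied from the Results: (OR, CI lower, CI upper, reported p)
reported = {
    "HPV16/18 infection": (2.071, 1.039, 4.127, 0.039),
    "TCT >= ASC-H": (4.147, 1.392, 12.355, 0.011),
    "TZ3": (1.966, 1.003, 3.853, 0.049),
    "Colposcopic impression G2": (3.627, 1.350, 9.743, 0.011),
}

for name, (or_, lo, hi, p_reported) in reported.items():
    beta, se, z, p = wald_p_from_ci(or_, lo, hi)
    print(f"{name}: beta={beta:.3f} SE={se:.3f} z={z:.2f} "
          f"p={p:.3f} (reported {p_reported})")
```

A check of this kind only confirms that the reported figures are internally consistent; it says nothing about the model specification or the underlying data.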
Keywords