Nature Communications (Apr 2025)
Machine learning center-specific models show improved IVF live birth predictions over US national registry-based model
Abstract
Expanding in vitro fertilization (IVF) access requires improved patient counseling and affordability via cost-success transparency. Clinicians ask how two types of live birth prediction (LBP) models perform: machine learning, center-specific (MLCS) models and the multicenter, US national registry-based model produced by the Society for Assisted Reproductive Technology (SART). In a retrospective model validation study, we tested whether MLCS performs better than SART using 4635 patients’ first-IVF cycle data from 6 centers. MLCS significantly improved the minimization of false positives and false negatives, both overall (precision-recall area under the curve) and at the 50% LBP threshold (F1 score), compared to SART (p < 0.05). To contextualize, MLCS more appropriately assigned 23% and 11% of all patients to LBP ≥ 50% and LBP ≥ 75%, respectively, whereas SART gave lower LBPs. Here, we show that MLCS improves model metrics relevant for clinical utility (personalizing prognostic counseling and cost-success transparency) and is externally validated. We recommend evaluating MLCS in a larger sample of fertility centers.
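For readers less familiar with the two metrics named in the abstract, the following is a minimal Python sketch of how an F1 score at a fixed probability threshold and a precision-recall area under the curve can be computed from predicted live birth probabilities and observed outcomes. It is not the study's code. The function names (`f1_at_threshold`, `average_precision`) and the toy data are hypothetical, and the area is estimated here with average precision, one common step-wise estimator that may differ from the method the authors used.

```python
from typing import Sequence


def f1_at_threshold(y_true: Sequence[int], y_prob: Sequence[float],
                    threshold: float = 0.5) -> float:
    """F1 score when a patient is called 'positive' if y_prob >= threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, y_prob) if p < threshold and y == 1)
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0 when there is nothing to score.
    return 2 * tp / denom if denom else 0.0


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    Sums precision at each distinct score cutoff, weighted by the gain in
    recall at that cutoff. Tied scores are processed as one group.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        return 0.0
    # Walk through predictions from highest to lowest probability.
    pairs = sorted(zip(y_prob, y_true), key=lambda t: -t[0])
    ap, tp, prev_recall, i = 0.0, 0, 0.0, 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            j += 1
        precision = tp / j          # j = number of patients at or above cutoff
        recall = tp / total_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Toy, made-up data for illustration only (1 = live birth, 0 = no live birth).
outcomes = [1, 0, 1, 1, 0, 0]
predicted_lbp = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

f1 = f1_at_threshold(outcomes, predicted_lbp, threshold=0.5)
pr_auc = average_precision(outcomes, predicted_lbp)
```

Both metrics ignore true negatives, which is why they are often read as summarizing how well a model limits false positives and false negatives. To compare two models, the same functions would be applied to each model's predictions on the same patients.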