Machine learning center-specific models show improved IVF live birth predictions over US national registry-based model

Mylene W. M. Yao; Elizabeth T. Nguyen; Matthew G. Retzloff; L. April Gago; John E. Nichols; John F. Payne; Barry A. Ripps; Michael Opsahl; Jeremy Groll; Ronald Beesley; Gregory Neal; Jaye Adams; Lorie Nowak; Trevor Swanson; Xiaocong Chen

doi:10.1038/s41467-025-58744-z

Nature Communications (Apr 2025)

Affiliations

Mylene W. M. Yao: R&D Department, Univfy
Elizabeth T. Nguyen: R&D Department, Univfy
Matthew G. Retzloff: Fertility Center of San Antonio
L. April Gago: Gago Center for Fertility
John E. Nichols: Piedmont Reproductive Endocrinology Group
John F. Payne: Piedmont Reproductive Endocrinology Group
Barry A. Ripps: NewLIFE Fertility
Michael Opsahl: Poma Fertility
Jeremy Groll: SpringCreek Fertility
Ronald Beesley: Poma Fertility
Gregory Neal: Fertility Center of San Antonio
Jaye Adams: Fertility Center of San Antonio
Lorie Nowak: SpringCreek Fertility
Trevor Swanson: R&D Department, Univfy
Xiaocong Chen: R&D Department, Univfy

DOI: https://doi.org/10.1038/s41467-025-58744-z
Journal volume & issue: Vol. 16, no. 1
pp. 1–14

Abstract


Expanding in vitro fertilization (IVF) access requires improved patient counseling and affordability via cost-success transparency. Clinicians ask how two types of live birth prediction (LBP) models perform: machine learning, center-specific (MLCS) models and the multicenter, US national registry-based model produced by the Society for Assisted Reproductive Technology (SART). In a retrospective model validation study, we tested whether MLCS performs better than SART using 4635 patients’ first-IVF cycle data from 6 centers. MLCS significantly improved minimization of false positives and negatives overall (precision-recall area under the curve) and at the 50% LBP threshold (F1 score) compared to SART (p < 0.05). To contextualize, MLCS more appropriately assigned 23% and 11% of all patients to LBP ≥ 50% and LBP ≥ 75%, respectively, whereas SART gave lower LBPs. Here, we show MLCS improves model metrics relevant for clinical utility (personalizing prognostic counseling and cost-success transparency) and is externally validated. We recommend evaluating MLCS in a larger sample of fertility centers.
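For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how an F1 score at a 50% probability threshold and a step-wise precision-recall area under the curve (average precision) might be computed from predicted live birth probabilities and observed outcomes. It is not the authors' code. The function names, the toy outcome vector, and the two probability vectors (`model_a`, `model_b`) are hypothetical and purely illustrative, and the study may have used a different PR-AUC estimator. The sketch also assumes no tied probabilities.

```python
from typing import Sequence


def f1_at_threshold(y_true: Sequence[int], y_prob: Sequence[float],
                    threshold: float = 0.5) -> float:
    """F1 score when a predicted probability >= threshold counts as positive."""
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for y, p in pairs if p >= threshold and y == 1)
    fp = sum(1 for y, p in pairs if p >= threshold and y == 0)
    fn = sum(1 for y, p in pairs if p < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Step-wise PR-AUC: mean of the precision at each rank holding a positive.

    Assumes untied scores; tied scores would need to be grouped per threshold.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        return 0.0
    # Rank cases from highest to lowest predicted probability.
    ranked = sorted(zip(y_prob, y_true), key=lambda t: -t[0])
    tp = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            total += tp / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data (NOT from the study): 1 = live birth, 0 = no live birth.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [0.82, 0.35, 0.64, 0.55, 0.48, 0.20, 0.71, 0.52]
model_b = [0.45, 0.30, 0.42, 0.38, 0.47, 0.15, 0.58, 0.33]

f1_a = f1_at_threshold(y_true, model_a)
f1_b = f1_at_threshold(y_true, model_b)
ap_a = average_precision(y_true, model_a)
ap_b = average_precision(y_true, model_b)
```

In this toy example, `model_b` ranks cases reasonably well but assigns most live birth cases a probability below 50%, so its F1 score at the 50% threshold is much lower than that of `model_a`. This is the kind of gap a threshold-based metric is meant to surface.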

Published in Nature Communications

ISSN: 2041-1723 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Science
Website: https://www.nature.com/ncomms/
