PLoS Computational Biology (Apr 2023)

Personalized prediction of the secondary oocytes number after ovarian stimulation: A machine learning model based on clinical and genetic data

  • Krystian Zieliński,
  • Sebastian Pukszta,
  • Małgorzata Mickiewicz,
  • Marta Kotlarz,
  • Piotr Wygocki,
  • Marcin Zieleń,
  • Dominika Drzewiecka,
  • Damian Drzyzga,
  • Anna Kloska,
  • Joanna Jakóbkiewicz-Banecka

Journal volume & issue
Vol. 19, no. 4

Abstract

Controlled ovarian stimulation is tailored to the patient based on clinical parameters, but estimating the number of retrieved metaphase II (MII) oocytes remains a challenge. Here, we developed a model that uses the patient’s genetic and clinical characteristics together to predict the stimulation outcome. Sequence variants in reproduction-related genes, identified by next-generation sequencing, were matched to groups with different MII oocyte counts using ranking, correspondence analysis, and self-organizing map methods. Gradient boosting machine models were trained on a clinical dataset of 8,574 ovarian stimulations or a clinical-genetic dataset of 516 ovarian stimulations. The clinical-genetic model predicted the number of MII oocytes better than the model based on clinical data alone. Anti-Müllerian hormone level and antral follicle count were the two most important predictors, while a genetic feature consisting of sequence variants in the GDF9, LHCGR, FSHB, ESR1, and ESR2 genes was the third. The combined contribution of the genetic features important for the prediction was over one-third of that of anti-Müllerian hormone. Predictions of our clinical-genetic model accurately matched individuals’ actual outcomes, preventing over- or underestimation. Genetic data improve the personalized prediction of ovarian stimulation outcomes, thus improving the in vitro fertilization procedure.

Author summary

Infertility is a condition that leads to the failure of natural conception. It affects more than 186 million people worldwide. Because in vitro fertilization (IVF) is an effective infertility treatment, optimizing each step of the process is essential to best assist those trying to conceive. The IVF process begins with ovarian stimulation, during which the woman takes ovary-stimulating hormones (gonadotropins) to produce a certain number of viable, fertilization-ready egg cells. Predicting the number of egg cells collected after such stimulation is difficult, however. These predictions are usually based on the patient’s clinical parameters and depend on the physician’s experience, making them highly subjective. Here, we used machine learning to identify features that physicians could adopt to predict the number of egg cells obtained after ovarian stimulation. We found that clinical parameters (anti-Müllerian hormone level and antral follicle count) as well as genetic characteristics (variants in the reproduction-related genes GDF9, LHCGR, FSHB, ESR1, and ESR2) are important features that increase the accuracy of such predictions. Our predictive model has been designed to help physicians tailor ovarian stimulation protocols on a case-by-case basis, thereby improving the safety and efficiency of the IVF process.
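
The comparison described in the abstract, a gradient boosting model trained on clinical features versus one trained on clinical plus genetic features, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn's GradientBoostingRegressor as the boosting library, uses entirely synthetic data, and the feature names (`amh`, `afc`, `age`, `variant_score`) and the coefficients that generate the synthetic target are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 516  # same size as the clinical-genetic dataset in the abstract; values are synthetic

# Hypothetical features (assumptions, not the study's actual variables or units).
amh = rng.gamma(2.0, 1.5, n)                          # anti-Mullerian hormone level
afc = rng.poisson(12, n).astype(float)                # antral follicle count
age = rng.uniform(25, 42, n)                          # patient age in years
variant_score = rng.integers(0, 3, n).astype(float)   # stand-in for a genetic feature

# Synthetic target: an arbitrary formula standing in for the MII oocyte count.
noise = rng.normal(0.0, 1.5, n)
mii = np.clip(1.5 * amh + 0.4 * afc - 0.1 * (age - 25) + variant_score + noise, 0, None)

feature_names = ["amh", "afc", "age", "variant_score"]
X = np.column_stack([amh, afc, age, variant_score])
X_train, X_test, y_train, y_test = train_test_split(
    X, mii, test_size=0.25, random_state=0
)


def fit_and_score(columns):
    """Train a gradient boosting regressor on the given columns; return model and test MAE."""
    model = GradientBoostingRegressor(random_state=0)
    model.fit(X_train[:, columns], y_train)
    mae = mean_absolute_error(y_test, model.predict(X_test[:, columns]))
    return model, mae


# Clinical-only model versus clinical-genetic model on the same train/test split.
clinical_model, clinical_mae = fit_and_score([0, 1, 2])
combined_model, combined_mae = fit_and_score([0, 1, 2, 3])

# Impurity-based importances of the combined model, normalized to sum to 1.
importances = dict(zip(feature_names, combined_model.feature_importances_))

print(f"clinical-only MAE:    {clinical_mae:.2f}")
print(f"clinical-genetic MAE: {combined_mae:.2f}")
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name:>14}: {value:.3f}")
```

Evaluating both models on the same held-out split keeps the comparison fair, and ranking `feature_importances_` is one common way to obtain the kind of predictor ordering the abstract reports. Results on this synthetic data say nothing about the study's actual findings.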