Scientific Reports (May 2024)

Diagnostic utility of clinicodemographic, biochemical and metabolite variables to identify viable pregnancies in a symptomatic cohort during early gestation

  • Christopher J. Hill,
  • Marie M. Phelan,
  • Philip J. Dutton,
  • Paula Busuulwa,
  • Alison Maclean,
  • Andrew S. Davison,
  • Josephine A. Drury,
  • Nicola Tempest,
  • Andrew W. Horne,
  • Eva Caamaño Gutiérrez,
  • Dharani K. Hapangama

DOI
https://doi.org/10.1038/s41598-024-61690-3
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 13

Abstract

A significant number of pregnancies are lost in the first trimester and 1–2% are ectopic pregnancies (EPs). Early pregnancy loss in general can cause significant morbidity with bleeding or infection, while EPs are the leading cause of maternal mortality in the first trimester. Symptoms of pregnancy loss and EP are very similar (including pain and bleeding); however, these symptoms are also common in live normally sited pregnancies (LNSP). To date, no biomarkers have been identified to differentiate LNSP from pregnancies that will not progress beyond early gestation (non-viable or EPs), defined together as combined adverse outcomes (CAO). In this study, we present a novel machine learning pipeline to create prediction models that identify a composite biomarker to differentiate LNSP from CAO in symptomatic women. This prospective cohort study included 370 participants. A single blood sample was prospectively collected from participants on first emergency presentation prior to final clinical diagnosis of pregnancy outcome: LNSP, miscarriage, pregnancy of unknown location (PUL) or tubal EP (tEP). Miscarriage, PUL and tEP were grouped together into a CAO group. Human chorionic gonadotrophin β (β-hCG) and progesterone concentrations were measured in plasma. Serum samples were subjected to untargeted metabolomic profiling. The cohort was randomly split into train and validation data sets, with the train data set subjected to variable selection. Nine metabolite signals were identified as key discriminators of LNSP versus CAO. Random forest models were constructed using stable metabolite signals alone, or in combination with plasma hormone concentrations and demographic data. When comparing LNSP with CAO, a model with stable metabolite signals only demonstrated a modest predictive accuracy (0.68), which was comparable to a model of β-hCG and progesterone (0.71). The best model for LNSP prediction comprised stable metabolite signals and hormone concentrations (accuracy = 0.79). In conclusion, serum metabolite levels and biochemical markers from a single blood sample possess modest predictive utility in differentiating LNSP from CAO pregnancies upon first presentation, which is improved by variable selection and combination using machine learning. A diagnostic test to confirm LNSP and thus exclude pregnancies affecting maternal morbidity and potentially life-threatening outcomes would be invaluable in emergency situations.

Keywords