Journal of Medical Internet Research (Aug 2024)

Five-Feature Models to Predict Preeclampsia Onset Time From Electronic Health Record Data: Development and Validation Study

  • Hailey K Ballard,
  • Xiaotong Yang,
  • Aditya D Mahadevan,
  • Dominick J Lemas,
  • Lana X Garmire

DOI
https://doi.org/10.2196/48997
Journal volume & issue
Vol. 26
p. e48997

Abstract

Background: Preeclampsia is a potentially fatal complication of pregnancy, characterized by high blood pressure and excessive protein in the urine. Due to its complexity, the prediction of preeclampsia onset is often difficult and inaccurate.

Objective: This study aimed to create quantitative models to predict the gestational age at preeclampsia onset using electronic health records.

Methods: We retrospectively collected 1178 preeclamptic pregnancy records from the University of Michigan Health System as the discovery cohort, and 881 records from the University of Florida Health System as the validation cohort. We constructed 2 Cox proportional hazards models: a baseline model using maternal and pregnancy characteristics, and a full model with additional laboratory findings, vitals, and medications. We built the models using 80% of the discovery data, tested them on the remaining 20% of the discovery data, and validated them with the University of Florida data. We further stratified the patients into high- and low-risk groups for preeclampsia onset risk assessment.

Results: The baseline model reached concordance indices of 0.64 and 0.61 in the 20% testing data and the validation data, respectively, while the full model reached concordance indices of 0.69 and 0.61, respectively. For preeclampsia diagnosed at 34 weeks, the baseline and full models had area under the curve (AUC) values of 0.65 and 0.70, respectively; for preeclampsia diagnosed at 37 weeks, the AUC values were 0.69 and 0.70, respectively. Both models contain 5 selected features, among which the number of fetuses in the pregnancy, hypertension, and parity are shared between the 2 models with similar hazard ratios and significant P values. In the full model, maximum diastolic blood pressure in early pregnancy was the predominant feature.

Conclusions: Electronic health record data provide useful information to predict the gestational age of preeclampsia onset. Stratification of the cohorts using 5-predictor Cox proportional hazards models provides clinicians with convenient tools to assess the onset time of preeclampsia in patients.
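
For readers unfamiliar with the concordance index used to evaluate the models, the following is a minimal sketch of how such a value can be computed from predicted risk scores and observed onset times, followed by a simple high- and low-risk split. It is not the authors' code. It assumes every onset time is observed (no censoring), which matches cohorts made up only of preeclamptic pregnancies, and it assumes a median cutoff for the risk groups, which the abstract does not specify. The function names and the toy data are hypothetical.

```python
from statistics import median


def concordance_index(onset_times, risk_scores):
    """Harrell's concordance index, assuming no censored observations.

    A pair of patients is concordant when the one with the earlier onset
    has the higher predicted risk. Tied risk scores count as 0.5, and
    pairs with identical onset times are skipped as not comparable.
    """
    concordant = 0.0
    comparable = 0
    n = len(onset_times)
    for i in range(n):
        for j in range(i + 1, n):
            if onset_times[i] == onset_times[j]:
                continue  # tied times: pair is not comparable
            comparable += 1
            # Identify which patient of the pair had the earlier onset.
            earlier, later = (i, j) if onset_times[i] < onset_times[j] else (j, i)
            if risk_scores[earlier] > risk_scores[later]:
                concordant += 1.0
            elif risk_scores[earlier] == risk_scores[later]:
                concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def stratify_by_median(risk_scores):
    """Label each patient 'high' or 'low' risk using a median cutoff (an assumption)."""
    cutoff = median(risk_scores)
    return ["high" if score > cutoff else "low" for score in risk_scores]


# Hypothetical toy data: gestational age at onset (weeks) and model risk scores.
onset_weeks = [30.0, 33.5, 36.0, 37.5, 39.0]
risk = [2.1, 1.2, 1.4, 0.3, 0.5]

c_index = concordance_index(onset_weeks, risk)  # 0.8 for this toy data
groups = stratify_by_median(risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported values of 0.61 to 0.69 indicate moderate discrimination.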