PLoS ONE (Jan 2023)

Identification of pregnancies and their outcomes in healthcare claims data, 2008–2019: An algorithm

  • Elizabeth C. Ailes,
  • Weiming Zhu,
  • Elizabeth A. Clark,
  • Ya-lin A. Huang,
  • Margaret A. Lampe,
  • Athena P. Kourtis,
  • Jennita Reefhuis,
  • Karen W. Hoover

Journal volume & issue
Vol. 18, no. 4

Abstract

Read online

Pregnancy is a condition of broad interest across many medical and health services research domains, but one not easily identified in healthcare claims data. Our objective was to establish an algorithm to identify pregnant women and their pregnancies in claims data. We identified pregnancy-related diagnosis, procedure, and diagnosis-related group codes, accounting for the transition to International Statistical Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) diagnosis and procedure codes, in health encounter reporting on 10/1/2015. We selected women in Merative MarketScan commercial databases aged 15–49 years with pregnancy-related claims, and their infants, during 2008–2019. Pregnancies, pregnancy outcomes, and gestational ages were assigned using the constellation of service dates, code types, pregnancy outcomes, and linkage to infant records. We describe pregnancy outcomes and gestational ages, as well as maternal age, census region, and health plan type. In a sensitivity analysis, we compared our algorithm-assigned date of last menstrual period (LMP) to fertility procedure-based LMP (date of procedure + 14 days) among women with embryo transfer or insemination procedures. Among 5,812,699 identified pregnancies, most (77.9%) were livebirths, followed by spontaneous abortions (16.2%); 3,274,353 (72.2%) livebirths could be linked to infants. Most pregnancies were among women 25–34 years (59.1%), living in the South (39.1%) and Midwest (22.4%), with large employer-sponsored insurance (52.0%). Outcome distributions were similar across ICD-9 and ICD-10 eras, with some variation in gestational age distribution observed. Sensitivity analyses supported our algorithm’s framework; algorithm- and fertility procedure-derived LMP estimates were within a week of each other (mean difference: -4 days [IQR: -13 to 6 days]; n = 107,870). We have developed an algorithm to identify pregnancies, their gestational age, and outcomes, across ICD-9 and ICD-10 eras using administrative data. This algorithm may be useful to reproductive health researchers investigating a broad range of pregnancy and infant outcomes.