Clinical Epigenetics (Jul 2017)

Maternal blood contamination of collected cord blood can be identified using DNA methylation at three CpGs

  • Alexander M. Morin,
  • Evan Gatev,
  • Lisa M. McEwen,
  • Julia L. MacIsaac,
  • David T. S. Lin,
  • Nastassja Koen,
  • Darina Czamara,
  • Katri Räikkönen,
  • Heather J. Zar,
  • Karestan Koenen,
  • Dan J. Stein,
  • Michael S. Kobor,
  • Meaghan J. Jones

DOI
https://doi.org/10.1186/s13148-017-0370-2
Journal volume & issue
Vol. 9, no. 1
pp. 1 – 9

Abstract

Background

Cord blood is a commonly used tissue in environmental, genetic, and epigenetic population studies due to its ready availability and potential to inform on a sensitive period of human development. However, the introduction of maternal blood during labor or cross-contamination during sample collection may complicate downstream analyses. After discovering maternal contamination of cord blood in a cohort study of 150 neonates using Illumina 450K DNA methylation (DNAm) data, we used a combination of linear regression and random forest machine learning to create a DNAm-based screening method. We identified a panel of DNAm sites that could discriminate between contaminated and non-contaminated samples, then designed pyrosequencing assays to pre-screen DNA before it is assayed on an array.

Results

Maternal contamination of cord blood was initially identified by unusual X chromosome DNA methylation patterns in 17 males. We used our DNAm panel to detect the contaminated male samples and a proportional number of female samples in the same cohort. We validated our DNAm screening method on an additional 189-sample cohort using both pyrosequencing and DNAm arrays, as well as on 9 publicly available cord blood 450K data sets. The rate of contamination varied from 0 to 10% within these studies, likely related to collection-specific methods.

Conclusions

Maternal blood can contaminate cord blood during sample collection at appreciable levels across multiple studies. We have identified a panel of markers that can be used to identify this contamination, either post hoc after DNAm arrays have been completed, or in advance using a targeted technique like pyrosequencing.
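As a rough illustration of how a small CpG panel could be applied as a screen, the sketch below flags samples whose methylation exceeds a cutoff at every site in a three-site panel. It is not the authors' procedure: the site labels (`cpg_A`, `cpg_B`, `cpg_C`), the cutoffs, the beta values, and the assumption that maternal blood raises methylation at each site are all hypothetical, since the abstract does not give the CpG identifiers, thresholds, or direction of effect.

```python
from typing import Dict, List

# Hypothetical site labels and cutoffs, for illustration only;
# the abstract does not name the three CpGs or give thresholds.
THRESHOLDS: Dict[str, float] = {"cpg_A": 0.20, "cpg_B": 0.20, "cpg_C": 0.20}


def flag_contaminated(
    samples: Dict[str, Dict[str, float]],
    thresholds: Dict[str, float] = THRESHOLDS,
    min_hits: int = 3,
) -> List[str]:
    """Return IDs of samples that exceed the cutoff at >= min_hits panel sites.

    Assumes (not stated in the abstract) that maternal blood raises
    methylation at each panel site relative to clean cord blood.
    """
    flagged = []
    for sample_id, betas in samples.items():
        # A site missing from a sample is treated as "not over the cutoff".
        hits = sum(
            1
            for site, cutoff in thresholds.items()
            if betas.get(site, 0.0) > cutoff
        )
        if hits >= min_hits:
            flagged.append(sample_id)
    return flagged


# Made-up beta values (fraction methylated, 0 to 1) for three samples.
example = {
    "cord_01": {"cpg_A": 0.05, "cpg_B": 0.08, "cpg_C": 0.04},
    "cord_02": {"cpg_A": 0.35, "cpg_B": 0.41, "cpg_C": 0.30},
    "cord_03": {"cpg_A": 0.25, "cpg_B": 0.07, "cpg_C": 0.06},
}

print(flag_contaminated(example))  # ['cord_02']
```

The same function could in principle take beta values from an array (post hoc screening) or percent methylation from pyrosequencing rescaled to 0 to 1 (pre-screening). Lowering `min_hits` makes the screen more sensitive at the cost of more false positives.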

Keywords