PLoS ONE (Jan 2016)

BEclear: Batch Effect Detection and Adjustment in DNA Methylation Data.

  • Ruslan Akulenko,
  • Markus Merl,
  • Volkhard Helms

DOI
https://doi.org/10.1371/journal.pone.0159921
Journal volume & issue
Vol. 11, no. 8
p. e0159921

Abstract

Batch effects are non-natural variations in, for example, large-scale genomic data sets. If not corrected by suitable numerical algorithms, batch effects may seriously affect the analysis of these data sets. The novel array-platform-independent software tool BEclear enables researchers to identify those portions of the data that deviate in a statistically significant way from the remaining data and to replace these portions with typical values reconstructed from neighboring data entries based on latent factor models. In contrast to comparable methods, which often apply some form of global normalization to the data, BEclear avoids changing the apparently unaffected parts of the data. We tested the performance of this approach on DNA methylation data for various tumor data sets taken from The Cancer Genome Atlas and compared the results to those obtained with the existing algorithms ComBat, Surrogate Variable Analysis, RUVm and Functional normalization. BEclear consistently performed on par with or better than these methods. BEclear is available as an R package at the Bioconductor project http://bioconductor.org/packages/release/bioc/html/BEclear.html.
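The idea of replacing only the flagged entries with values reconstructed from a latent factor model can be illustrated with a minimal sketch. This is not the BEclear implementation (which is an R package); it is a generic low-rank matrix factorization fitted by gradient descent, written in Python with NumPy. The function name `impute_latent_factors`, the boolean `mask` marking batch-affected entries, and all hyperparameters (`rank`, `lr`, `reg`, `epochs`) are assumptions chosen for illustration, and the statistical detection step that would produce the mask is assumed to have been done already.

```python
import numpy as np


def impute_latent_factors(data, mask, rank=2, lr=0.01, reg=0.01,
                          epochs=2000, seed=0):
    """Replace entries where mask is True with low-rank reconstructions.

    data : 2-D float array (e.g. genes x samples of beta values in [0, 1])
    mask : boolean array of the same shape; True marks entries flagged as
           batch-affected (hypothetical input, detection is not shown here)
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = data.shape
    # Latent factors: data is approximated by left @ right.
    left = rng.normal(scale=0.1, size=(n_rows, rank))
    right = rng.normal(scale=0.1, size=(rank, n_cols))

    observed = ~mask
    # Flagged entries are zeroed so they cannot leak into the fit.
    trusted = np.where(observed, data, 0.0)

    for _ in range(epochs):
        # Residuals are computed on trusted (unflagged) entries only.
        err = (trusted - left @ right) * observed
        grad_left = err @ right.T - reg * left
        grad_right = left.T @ err - reg * right
        left += lr * grad_left
        right += lr * grad_right

    # Only flagged entries are overwritten; all other values stay untouched.
    result = data.copy()
    result[mask] = np.clip(left @ right, 0.0, 1.0)[mask]
    return result
```

A call might look like `adjusted = impute_latent_factors(beta_values, flagged, rank=2)`, where `beta_values` and `flagged` are hypothetical names for the methylation matrix and the mask of batch-affected entries. Overwriting only the masked entries mirrors the abstract's point that the apparently unaffected parts of the data are left unchanged, in contrast to a global normalization.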