PLoS ONE (Jan 2014)

In silico pooling of ChIP-seq control experiments.

  • Guannan Sun,
  • Rajini Srinivasan,
  • Camila Lopez-Anido,
  • Holly A Hung,
  • John Svaren,
  • Sündüz Keleş

DOI
https://doi.org/10.1371/journal.pone.0109691
Journal volume & issue
Vol. 9, no. 11
p. e109691

Abstract

As next-generation sequencing technologies become more economical, large-scale ChIP-seq studies are enabling investigation of the roles of transcription factor binding and the epigenome in phenotypic variation. Studying such variation requires individual-level ChIP-seq experiments. Standard designs for ChIP-seq experiments employ a paired control for each ChIP-seq sample, and genomic coverage of the control experiments is often sacrificed to free resources for the ChIP samples. However, the quality of the ChIP-enriched regions identifiable from a ChIP-seq experiment depends on the quality and coverage of the control experiments: insufficient control coverage leads to a loss of power in detecting enrichment. We investigate the effect of in silico pooling of control samples across multiple biological replicates, multiple treatment conditions, and multiple cell lines and tissues, using multiple datasets with varying levels of genomic coverage. Our computational studies suggest guidelines for performing in silico pooling of control experiments. Using a large collection of ENCODE data, we show that pairwise correlations between control samples originating from multiple biological replicates, treatments, and cell lines/tissues can be grouped into two classes, according to whether or not in silico pooling leads to a power gain in detecting enrichment between the ChIP and the control samples. Our findings have important implications for multiplexing samples.
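
To illustrate the idea, here is a minimal sketch in Python of what in silico pooling and a pairwise correlation check between control samples might look like. It is not the authors' pipeline. It assumes each control has already been summarized as read counts over the same genomic bins; the sample names, the toy counts, the helper functions, and the 0.9 correlation cutoff are all hypothetical and chosen only for illustration. The abstract does not state how the two correlation classes are separated.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of bin counts."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(controls):
    """Correlation for every pair of control samples, keyed by sorted name pair."""
    return {
        (a, b): pearson(controls[a], controls[b])
        for a, b in combinations(sorted(controls), 2)
    }


def pool(controls, names):
    """In silico pooling: sum the read counts of the chosen controls bin by bin."""
    return [sum(counts) for counts in zip(*(controls[name] for name in names))]


# Toy binned read counts (hypothetical data, one value per genomic bin).
controls = {
    "rep1": [4, 7, 2, 9, 5, 3],
    "rep2": [5, 8, 3, 10, 6, 4],
    "other_tissue": [9, 2, 8, 1, 4, 7],
}

correlations = pairwise_correlations(controls)

# Hypothetical cutoff: treat highly correlated pairs as candidates for pooling.
CUTOFF = 0.9
poolable = [pair for pair, r in correlations.items() if r >= CUTOFF]

# Pool the two replicates to obtain a deeper combined control track.
pooled = pool(controls, ["rep1", "rep2"])
```

In this toy example only the two replicates pass the cutoff, so only they are pooled; the control from the other tissue is kept separate. In practice the cutoff, the bin size, and any normalization for sequencing depth would have to be chosen from the data.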