BMC Bioinformatics (Jan 2019)

InTAD: chromosome conformation guided analysis of enhancer target genes

  • Konstantin Okonechnikov,
  • Serap Erkek,
  • Jan O. Korbel,
  • Stefan M. Pfister,
  • Lukas Chavez

DOI
https://doi.org/10.1186/s12859-019-2655-2
Journal volume & issue
Vol. 20, no. 1
pp. 1 – 7

Abstract


Background: High-throughput technologies for analyzing chromosome conformation at a genome scale have revealed that chromatin is organized in topologically associated domains (TADs). While TADs are relatively stable across cell types, intra-TAD activity is cell-type specific. Epigenetic profiling of different tissues and cell types has identified a large number of non-coding epigenetic regulatory elements (‘enhancers’) that can be located far away from coding genes. Linear proximity is a commonly chosen criterion for associating enhancers with their potential target genes. While enhancers frequently regulate the closest gene, unambiguous identification of enhancer-regulated genes remains a challenge in the absence of sample-matched chromosome conformation data.

Results: To associate enhancers with their target genes, we have previously developed and applied a method that tests for significant correlations between enhancer signal and gene expression across a cohort of samples. To limit the number of tests, we constrain this analysis to gene-enhancer pairs embedded in the same TAD, where information on TAD boundaries is borrowed from publicly available chromosome conformation capture (‘Hi-C’) data. We have now implemented this method as an R Bioconductor package ‘InTAD’ and verified the software package by reanalyzing available enhancer and gene expression data derived from ependymoma brain tumors.

Conclusion: The open-source package InTAD is an easy-to-use software tool for identifying proximal and distal enhancer target genes by leveraging information on correlated expression of enhancers and genes that are located in the same TAD. InTAD can be applied to any heterogeneous cohort of samples analyzed by a combination of gene expression and epigenetic profiling techniques and integrates either public or custom information on TAD boundaries.
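The core idea described in the abstract, correlating enhancer signal with gene expression only for pairs that fall in the same TAD, can be illustrated with a minimal sketch. This is not the InTAD package or its API (InTAD is an R Bioconductor package); it is a plain Python illustration under stated assumptions. All names (`pearson`, `find_tad`, `correlate_within_tads`), the toy coordinates, and the toy signal values are hypothetical. Each enhancer and gene is reduced to a single position, and the sketch reports only the Pearson coefficient, leaving out the significance testing that the abstract mentions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant signal: correlation undefined, report 0
    return cov / sqrt(var_x * var_y)


def find_tad(chrom, pos, tads):
    """Return the index of the TAD containing (chrom, pos), or None."""
    for i, (tad_chrom, start, end) in enumerate(tads):
        if tad_chrom == chrom and start <= pos < end:
            return i
    return None


def correlate_within_tads(enhancers, genes, tads):
    """Correlate each enhancer only with genes located in the same TAD.

    enhancers / genes: dict name -> (chrom, position, per-sample values)
    tads: list of (chrom, start, end) half-open intervals
    """
    results = []
    for e_name, (e_chrom, e_pos, e_signal) in enhancers.items():
        e_tad = find_tad(e_chrom, e_pos, tads)
        if e_tad is None:
            continue  # enhancer lies outside every known TAD
        for g_name, (g_chrom, g_pos, g_expr) in genes.items():
            if find_tad(g_chrom, g_pos, tads) == e_tad:
                results.append((e_name, g_name, pearson(e_signal, g_expr)))
    return results


# Toy data (hypothetical): two TADs, one enhancer, two genes, four samples.
tads = [("chr1", 0, 1000), ("chr1", 1000, 2000)]
enhancers = {"enh1": ("chr1", 100, [1.0, 2.0, 3.0, 4.0])}
genes = {
    "geneA": ("chr1", 800, [2.0, 4.0, 6.0, 8.0]),   # same TAD as enh1
    "geneB": ("chr1", 1200, [2.0, 4.0, 6.0, 8.0]),  # neighbouring TAD
}
pairs = correlate_within_tads(enhancers, genes, tads)
```

In this toy input, `geneB` has the same expression profile as `geneA` but sits in the neighbouring TAD, so it is never tested against `enh1`. Restricting the candidate pairs this way is what keeps the number of correlation tests small.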

Keywords