Epigenetics & Chromatin (Feb 2019)

Combined analysis of dissimilar promoter accessibility and gene expression profiles identifies tissue-specific genes and actively repressed networks

  • Rebekah R. Starks,
  • Anilisa Biswas,
  • Ashish Jain,
  • Geetu Tuteja

DOI
https://doi.org/10.1186/s13072-019-0260-2
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 16

Abstract


Background: The assay for transposase-accessible chromatin (ATAC-seq) is a powerful method to examine chromatin accessibility. While many studies have reported a positive correlation between gene expression and promoter accessibility, few have investigated the genes that deviate from this trend. In this study, we aimed to understand the relationship between gene expression and promoter accessibility in multiple cell types while also identifying gene regulatory networks in the placenta, an understudied organ that is critical for a successful pregnancy.

Results: We started by assaying the open chromatin landscape in the mid-gestation placenta, when the fetal vasculature has started developing. After incorporating transcriptomic data generated in the placenta at the same time point, we grouped genes based on their expression levels and ATAC-seq promoter coverage. We found that the genes with the strongest correlation (high expression and high coverage) are likely involved in housekeeping functions, whereas tissue-specific genes were highly expressed and had only medium–low coverage. We also predicted that genes with medium–low expression and high promoter coverage were actively repressed. Within this group, we extracted a protein–protein interaction network enriched for neuronal functions, likely preventing the cells from adopting a neuronal fate. We further confirmed that a repressive histone mark is bound to the promoters of genes in this network. Finally, we ran our pipeline using ATAC-seq and RNA-seq data generated in ten additional cell types. We again found that genes with the strongest correlation are enriched for housekeeping functions and that genes with medium–low promoter coverage and high expression are more likely to be tissue-specific. These results demonstrate that only two data types, both of which require relatively low starting material to generate and are becoming more commonly available, can be integrated to understand multiple aspects of gene regulation.

Conclusions: Within the placenta, we identified an active placenta-specific gene network as well as a repressed neuronal network. Beyond the placenta, we demonstrate that ATAC-seq data and RNA-seq data can be integrated to identify tissue-specific genes and actively repressed gene networks in multiple cell types.
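
The grouping step described in the abstract (binning genes by expression level and by ATAC-seq promoter coverage) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract gives no cutoffs, so the top-fraction threshold, the function names, and the input dictionaries here are all assumptions made for illustration. The labels attached to each group simply restate the interpretations given in the abstract.

```python
from typing import Dict, Iterable, Tuple


def upper_cutoff(values: Iterable[float], fraction: float) -> float:
    """Return the smallest value that still falls in the top `fraction`.

    The top-fraction rule is an assumption; the abstract does not say
    how the "high" and "medium-low" tiers were defined.
    """
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return ranked[k - 1]


def tier(value: float, cutoff: float) -> str:
    """Label a value as 'high' or 'medium-low' relative to a cutoff."""
    return "high" if value >= cutoff else "medium-low"


def group_genes(
    expression: Dict[str, float],
    coverage: Dict[str, float],
    fraction: float = 0.25,
) -> Dict[str, Tuple[str, str]]:
    """Assign each gene an (expression tier, promoter coverage tier) pair.

    Only genes present in both inputs are grouped.
    """
    genes = sorted(set(expression) & set(coverage))
    expr_cut = upper_cutoff((expression[g] for g in genes), fraction)
    cov_cut = upper_cutoff((coverage[g] for g in genes), fraction)
    return {
        g: (tier(expression[g], expr_cut), tier(coverage[g], cov_cut))
        for g in genes
    }


# Interpretations as stated in the abstract, keyed by
# (expression tier, promoter coverage tier).
INTERPRETATION = {
    ("high", "high"): "candidate housekeeping",
    ("high", "medium-low"): "candidate tissue-specific",
    ("medium-low", "high"): "candidate actively repressed",
    ("medium-low", "medium-low"): "not characterized in the abstract",
}
```

Given two hypothetical dictionaries mapping gene names to expression values and promoter coverage values, `INTERPRETATION[group_genes(expression, coverage)[gene]]` returns the candidate label for a gene. A two-tier split is used only to keep the sketch short; a real analysis would need cutoffs justified by the data.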

Keywords