PLoS ONE (Jan 2020)

Combining signal and sequence to detect RNA polymerase initiation in ATAC-seq data.

  • Ignacio J Tripodi,
  • Murad Chowdhury,
  • Margaret Gruca,
  • Robin D Dowell

DOI
https://doi.org/10.1371/journal.pone.0232332
Journal volume & issue
Vol. 15, no. 4
p. e0232332

Abstract

The assay for transposase-accessible chromatin followed by sequencing (ATAC-seq) is an inexpensive protocol for measuring open chromatin regions. ATAC-seq is also relatively simple and requires fewer cells than many other high-throughput sequencing protocols. It is therefore tractable in numerous settings where other high-throughput assays are challenging or impossible, which makes it important to understand the limits of what can be inferred from ATAC-seq data. In this work, we leverage ATAC-seq to predict the presence of nascent transcription. Nascent transcription assays are the current gold standard for identifying regions of active transcription, including markers for functional transcription factor (TF) binding. We combine mapped short reads from ATAC-seq with the underlying peak sequence to determine regions of active transcription genome-wide. We show that a hybrid signal/sequence representation, classified using recurrent neural networks (RNNs), can identify these regions across different cell types.
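
To make the idea of a hybrid signal/sequence representation concrete, the following is a minimal sketch of one way such an input could be built for a single peak. It is an illustration under stated assumptions, not the encoding used in the paper: the function name `hybrid_encoding`, the one-hot nucleotide scheme, and the scaling of coverage by its per-peak maximum are all hypothetical choices. Each position in the peak becomes a five-value vector (four nucleotide indicators plus one read-coverage value), which is the kind of per-position sequence an RNN could consume.

```python
import numpy as np

NUCLEOTIDES = "ACGT"  # column order of the one-hot block (an assumption)


def hybrid_encoding(sequence, coverage):
    """Return an (L, 5) array: 4 one-hot nucleotide columns + 1 coverage column.

    Hypothetical helper for illustration only; the paper's actual
    representation may differ.
    """
    if len(sequence) != len(coverage):
        raise ValueError("sequence and coverage must have the same length")

    # One-hot encode the peak sequence; unknown bases such as N stay all-zero.
    one_hot = np.zeros((len(sequence), 4), dtype=np.float32)
    for i, base in enumerate(sequence.upper()):
        j = NUCLEOTIDES.find(base)
        if j >= 0:
            one_hot[i, j] = 1.0

    # Scale per-base read coverage to [0, 1] by the peak's maximum (assumed).
    cov = np.asarray(coverage, dtype=np.float32)
    peak_max = cov.max()
    scaled = cov / peak_max if peak_max > 0 else cov

    # Append the signal as a fifth channel alongside the sequence channels.
    return np.concatenate([one_hot, scaled[:, None]], axis=1)


# Toy peak: five bases with made-up per-base read counts.
features = hybrid_encoding("ACGTN", [0, 2, 4, 2, 0])
```

Here `features` has one row per base, so a batch of equal-length peaks would stack into a (batch, length, 5) tensor, the usual input shape for recurrent layers.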