Genome Biology (Aug 2023)

Predicting the impact of sequence motifs on gene regulation using single-cell data

  • Jacob Hepkema,
  • Nicholas Keone Lee,
  • Benjamin J. Stewart,
  • Siwat Ruangroengkulrith,
  • Varodom Charoensawan,
  • Menna R. Clatworthy,
  • Martin Hemberg

DOI
https://doi.org/10.1186/s13059-023-03021-9
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 22

Abstract

The binding of transcription factors at proximal promoters and distal enhancers is central to gene regulation. Identifying regulatory motifs and quantifying their impact on expression remains challenging. Using a convolutional neural network trained on single-cell data, we infer putative regulatory motifs and cell type-specific importance. Our model, scover, explains 29% of the variance in gene expression in multiple mouse tissues. Applying scover to distal enhancers identified using scATAC-seq from the developing human brain, we identify cell type-specific motif activities in distal enhancers. Scover can identify regulatory motifs and their importance from single-cell data where all parameters and outputs are easily interpretable.
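
To make the general idea concrete, the following is a minimal sketch of how a convolutional filter can act as a motif detector whose pooled score feeds cell type-specific weights. It is not the scover implementation: the function names, the global max-pooling step, the toy filter, and the weights are all assumptions chosen for illustration.

```python
# Illustrative sketch only: names, shapes, and weights below are assumptions,
# not taken from the scover implementation.
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def scan_motif(encoded, motif):
    """Slide a (width x 4) motif filter along the sequence; return each window's score."""
    width = len(motif)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = sum(
            encoded[start + i][j] * motif[i][j]
            for i in range(width)
            for j in range(4)
        )
        scores.append(score)
    return scores


def predict_expression(seq, motifs, weights):
    """Max-pool each motif's scan, then combine linearly per cell type.

    weights[c][m] is the hypothetical influence of motif m in cell type c.
    """
    encoded = one_hot(seq)
    pooled = [max(scan_motif(encoded, m)) for m in motifs]
    return [sum(w * p for w, p in zip(row, pooled)) for row in weights]


# Toy filter that rewards the 3-mer "TGA" (hypothetical values).
tga_filter = [
    [0.0, 0.0, 0.0, 1.0],  # T
    [0.0, 0.0, 1.0, 0.0],  # G
    [1.0, 0.0, 0.0, 0.0],  # A
]
# Two hypothetical cell types, one motif: activating in the first, repressing in the second.
cell_type_weights = [[2.0], [-0.5]]

prediction = predict_expression("CCTGACC", [tga_filter], cell_type_weights)
```

In this toy setup the filter matches "TGA" exactly once, so the pooled score is 3.0 and the two cell types receive predictions of 6.0 and -1.5. The sign and size of each weight is what would be read as a motif's cell type-specific importance in a model of this shape.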