Nature Communications (Jan 2024)

Quantifying negative selection in human 3ʹ UTRs uncovers constrained targets of RNA-binding proteins

  • Scott D. Findlay,
  • Lindsay Romo,
  • Christopher B. Burge

DOI
https://doi.org/10.1038/s41467-023-44456-9
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 15

Abstract

Many non-coding variants associated with phenotypes occur in 3ʹ untranslated regions (3ʹ UTRs), and may affect interactions with RNA-binding proteins (RBPs) to regulate gene expression post-transcriptionally. However, identifying functional 3ʹ UTR variants has proven difficult. We use allele frequencies from the Genome Aggregation Database (gnomAD) to identify classes of 3ʹ UTR variants under strong negative selection in humans. We develop intergenic mutability-adjusted proportion singleton (iMAPS), a generalized measure related to MAPS, to quantify negative selection in non-coding regions. This approach, in conjunction with in vitro and in vivo binding data, identifies precise RBP binding sites, miRNA target sites, and polyadenylation signals (PASs) under strong selection. For each class of sites, we identify thousands of gnomAD variants under selection comparable to missense coding variants, and find that sites in core 3ʹ UTR regions upstream of the most-used PAS are under strongest selection. Together, this work improves our understanding of selection on human genes and validates approaches for interpreting genetic variants in human 3ʹ UTRs.
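
The abstract does not spell out how iMAPS is computed, so the following is only a minimal sketch of the general idea behind a MAPS-style score, not the authors' implementation. A reference set of presumed-neutral variants is used to model how the proportion of singletons depends on mutability, and a variant class is then scored as its observed singleton proportion minus the proportion expected from mutability alone. That the reference set would be intergenic variants is an assumption suggested by the name iMAPS. The function names, the input layout, and the unweighted linear fit are all hypothetical simplifications chosen for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def maps_like_score(variants, calibration):
    """Observed minus expected proportion of singletons for a variant class.

    variants:    list of (mutability, is_singleton) pairs for the class
                 of interest (hypothetical input layout).
    calibration: list of (mutability, proportion_singleton) pairs from a
                 presumed-neutral reference set (assumed here to be
                 intergenic variants grouped by sequence context).
    """
    # Model the singleton proportion expected from mutability alone.
    intercept, slope = fit_line(
        [m for m, _ in calibration],
        [p for _, p in calibration],
    )
    observed = mean(1.0 if is_singleton else 0.0 for _, is_singleton in variants)
    expected = mean(intercept + slope * m for m, _ in variants)
    # A positive score means an excess of singletons, which is consistent
    # with negative selection keeping variants in this class rare.
    return observed - expected
```

Under this sketch, a class whose variants are singletons more often than their mutability predicts receives a positive score, which is the sense in which such measures quantify negative selection. A fuller treatment would typically weight the calibration fit by the number of variants in each context.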