Nature Communications (May 2024)
Quantifying 3′UTR length from scRNA-seq data reveals changes independent of gene expression
Abstract
Although more than half of all genes generate transcripts that differ in 3′UTR length, current analysis pipelines only quantify the amount but not the length of mRNA transcripts. 3′UTR length is determined by 3′ end cleavage sites (CS). We map CS in more than 200 primary human and mouse cell types and increase CS annotations relative to the GENCODE database by 40%. Approximately half of all CS are used in few cell types, revealing that most genes only have one or two major 3′ ends. We incorporate the CS annotations into a computational pipeline, called scUTRquant, for rapid, accurate, and simultaneous quantification of gene and 3′UTR isoform expression from single-cell RNA sequencing (scRNA-seq) data. When applying scUTRquant to data from 474 cell types and 2134 perturbations, we discover extensive 3′UTR length changes across cell types that are as widespread and coordinately regulated as gene expression changes but affect mostly different genes. Our data indicate that mRNA abundance and mRNA length are two largely independent axes of gene regulation that together determine the amount and spatial organization of protein synthesis.