PLoS ONE (Jan 2017)
Identification of human short introns.
Abstract
Canonical pre-mRNA splicing requires snRNPs and associated splicing factors to excise conserved intronic sequences, and efficient splicing requires a minimum intron length. Non-canonical splicing (intron excision without the spliceosome) has also been documented; most notably, some tRNAs and the XBP1 mRNA contain short introns that are not removed by the spliceosome. There have been some efforts to identify additional short introns, but little is known about how many short introns are processed from mRNAs. Here, we report an approach to identify short introns from RNA-Seq data while discriminating against small genomic deletions. We identify hundreds of short introns conserved among multiple human cell lines. These short introns are often alternatively spliced and are found in a variety of RNAs, both mRNAs and lncRNAs. Short intron splicing efficiency is increased by secondary structure, and we detect both canonical and non-canonical short introns. In many cases, splicing of these short introns from mRNAs is predicted to alter the reading frame and change protein output. Our findings imply that standard gene prediction models, which often assume a lower limit for intron size, fail to predict short introns effectively. We conclude that short introns are abundant in the human transcriptome and that short intron splicing represents an additional layer of mRNA regulation.