Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, United States; Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, United States
Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, United States; Department of Genetics and Biochemistry, Center for Human Genetics, Clemson University, Greenwood, United States
Justin M Waldern
Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, United States
Abhishek Dey
Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, United States
Verna and Marrs McClean Department of Biochemistry and Molecular Biology, Therapeutic Innovation Center (THINC), and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, United States
Kevin M Weeks
Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, United States
David H Mathews
Department of Biochemistry & Biophysics and Center for RNA Biology, School of Medicine and Dentistry, University of Rochester, Rochester, United States
Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, United States; Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, United States
Splicing is highly regulated and is modulated by numerous factors. Quantitative predictions of how a mutation will affect precursor mRNA (pre-mRNA) structure and downstream function are particularly challenging. Here, we used a novel chemical probing strategy to visualize endogenous precursor and mature MAPT mRNA structures in cells. We used these data to estimate Boltzmann suboptimal structural ensembles, which we then analyzed to predict the consequences of mutations on pre-mRNA structure. Further analysis of recent cryo-EM structures of the spliceosome at different stages of the splicing cycle revealed that the footprint of the Bact complex on pre-mRNA best predicted alternative splicing outcomes for inclusion of exon 10 of MAPT, achieving 74% accuracy. We further developed a β-regression weighting framework that incorporates splice site strength, RNA structure, and exonic/intronic splicing regulatory elements; this framework predicted, with 90% accuracy, the effects of 47 known and 6 newly discovered mutations on inclusion of exon 10 of MAPT. This combined experimental and computational framework represents a path forward for accurate prediction of splicing-related disease-causing variants.
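
To make the notion of a Boltzmann structural ensemble concrete, the following minimal Python sketch shows how a set of candidate structures can be weighted by their folding free energies, and how an ensemble-level quantity (here, the probability that a nucleotide is unpaired) follows from those weights. It illustrates the general statistical-mechanics idea only and is not the pipeline used in this work: the function names, the dot-bracket structures, and the free energy values are hypothetical, and the chemical probing data that guide the ensembles described above are not modeled.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def boltzmann_weights(free_energies, temperature_k=310.15):
    """Return the Boltzmann probability of each structure.

    free_energies: folding free energies in kcal/mol (hypothetical inputs).
    temperature_k: temperature in kelvin (310.15 K is 37 degrees C).
    """
    rt = R_KCAL * temperature_k
    # Subtract the minimum energy before exponentiating for numerical stability;
    # this shift cancels in the normalization.
    g_min = min(free_energies)
    factors = [math.exp(-(g - g_min) / rt) for g in free_energies]
    partition = sum(factors)
    return [f / partition for f in factors]


def unpaired_probability(structures, free_energies, position):
    """Ensemble probability that `position` (0-based) is unpaired.

    structures: dot-bracket strings of equal length, where '.' means unpaired.
    """
    weights = boltzmann_weights(free_energies)
    return sum(w for s, w in zip(structures, weights) if s[position] == ".")


if __name__ == "__main__":
    # Two hypothetical structures for a six-nucleotide toy sequence.
    toy_structures = ["((..))", "......"]
    toy_energies = [-1.0, 0.0]  # kcal/mol, illustrative values only
    print(boltzmann_weights(toy_energies))
    print(unpaired_probability(toy_structures, toy_energies, 0))
```

In this toy case the lower-energy hairpin dominates the ensemble, so position 0 is unpaired only in the minority structure. A mutation that changes the relative free energies shifts these weights, which is the sense in which ensemble-level features can respond to sequence variants.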