Frontiers in Molecular Biosciences (Apr 2019)
Evaluating and Correcting Inherent Bias of microRNA Expression in Illumina Sequencing Analysis
Abstract
microRNA (miRNA) expression profiles based on the powerful Illumina sequencing technology rely on the construction of cDNA libraries, in which adaptor ligation is known to strongly favor some miRNAs over others. This bias distorts measurements of miRNA abundances and of relative miRNA quantities in biological samples. Here, using the commercial miRXplore Universal Reference, which contains an equimolar mixture of 963 animal miRNAs, together with TruSeq or bulged adaptors, we describe a method for correcting ligation biases in expression profiles obtained with standard protocols of cDNA library construction, and we provide data for quantifying the true miRNA abundances in biological samples. Ligation biases were evaluated at three miRNA:3′-adaptor ratios and four numbers of polymerase chain reaction amplification cycles by calculating efficiency captures/correcting factors for each miRNA. We show that ligation biases lead to over- or under-expression covering a 10⁵ amplitude range. We also show that, at each miRNA:3′-adaptor ratio, coefficients of variation (CVs) of efficiency captures calculated over the four numbers of amplification cycles, using sliding windows of 10 values, ranged from 0.1 for highly expressed miRNAs to 0.6 for weakly expressed miRNAs. Efficiency captures of miRNAs of high and low expression in profiles are therefore impacted differently by the number of amplification cycles. Importantly, we observed that, at a given number of amplification cycles, CVs of efficiency captures calculated over the three miRNA:3′-adaptor ratios displayed a steady value of 0.3 ± 0.05 (standard deviation) for miRNAs of both high and low expression. This allows, at a given number of amplification cycles, accurate comparison of miRNA expression between biological samples over a substantial expression range.
Finally, we provide tables of correcting factors that allow the abundances of 963 miRNAs in biological samples to be measured from TruSeq-based expression profiles, and we illustrate their use by characterizing miRNAs of the let-7, miR-26, miR-29, and miR-30 families as the most abundant miRNAs of the adult rat cerebellum.
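The arithmetic behind efficiency captures, correcting factors, and CVs can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the efficiency capture is assumed to be the observed read fraction of a miRNA divided by the fraction expected from an equimolar pool, the correcting factor is assumed to be its reciprocal, the CV uses the population standard deviation, and all function names and miRNA labels are hypothetical.

```python
from statistics import mean, pstdev


def efficiency_captures(reference_counts):
    """Assumed definition: observed read fraction / equimolar expected fraction."""
    total = sum(reference_counts.values())
    expected = 1.0 / len(reference_counts)  # each miRNA is equally abundant
    return {name: (n / total) / expected for name, n in reference_counts.items()}


def correcting_factors(efficiencies):
    """Assumed definition: reciprocal of the efficiency capture.
    miRNAs with zero reads in the reference cannot be corrected and are skipped."""
    return {name: 1.0 / e for name, e in efficiencies.items() if e > 0}


def correct_profile(sample_counts, factors):
    """Rescale raw read counts of a biological sample by the correcting factors."""
    return {name: n * factors[name]
            for name, n in sample_counts.items() if name in factors}


def coefficient_of_variation(values):
    """CV of one miRNA's efficiency captures across conditions
    (population standard deviation assumed)."""
    m = mean(values)
    return pstdev(values) / m if m else float("nan")


def sliding_mean(values, window=10):
    """Average consecutive values (e.g. CVs of miRNAs ranked by expression)
    over a sliding window."""
    return [mean(values[i:i + window])
            for i in range(len(values) - window + 1)]


# Hypothetical read counts from an equimolar reference library (toy numbers).
reference = {"miR-a": 400, "miR-b": 100, "miR-c": 100, "miR-d": 200}
factors = correcting_factors(efficiency_captures(reference))

# Two miRNAs with equal raw counts in a sample differ once the bias is removed.
corrected = correct_profile({"miR-a": 1000, "miR-b": 1000}, factors)
print(corrected)
```

In this toy case miR-a is over-captured and miR-b under-captured in the reference, so equal raw counts in the sample translate into unequal corrected abundances. With real data, the factors would come from the published tables for the matching adaptor type, miRNA:3′-adaptor ratio, and number of amplification cycles.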
Keywords