BMC Bioinformatics (Jan 2022)

how_are_we_stranded_here: quick determination of RNA-Seq strandedness

  • Brandon Signal,
  • Tim Kahlke

DOI
https://doi.org/10.1186/s12859-022-04572-7
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 9

Abstract

Read online

Abstract Background Quality control checks are the first step in RNA-Sequencing analysis, which enable the identification of common issues that occur in the sequenced reads. Checks for sequence quality, contamination, and complexity are commonplace, and allow users to implement steps downstream which can account for these issues. Strand-specificity of reads is frequently overlooked and is often unavailable even in published data, yet when unknown or incorrectly specified can have detrimental effects on the reproducibility and accuracy of downstream analyses. Results To address these issues, we developed how_are_we_stranded_here, a Python library that helps to quickly infer strandedness of paired-end RNA-Sequencing data. Testing on both simulated and real RNA-Sequencing reads showed that it correctly measures strandedness, and measures outside the normal range may indicate sample contamination. Conclusions how_are_we_stranded_here is fast and user friendly, making it easy to implement in quality control pipelines prior to analysing RNA-Sequencing data. how_are_we_stranded_here is freely available at https://github.com/betsig/how_are_we_stranded_here .

Keywords