PLoS ONE (Jan 2020)

RefShannon: A genome-guided transcriptome assembler using sparse flow decomposition.

  • Shunfu Mao,
  • Lior Pachter,
  • David Tse,
  • Sreeram Kannan

DOI
https://doi.org/10.1371/journal.pone.0232946
Journal volume & issue
Vol. 15, no. 6
p. e0232946

Abstract

High-throughput sequencing of RNA (RNA-Seq) has become a staple of modern molecular biology, with applications not only in quantifying gene expression but also in isoform-level analysis of RNA transcripts. To enable such isoform-level analysis, a transcriptome assembly algorithm is used to stitch the observed short reads together into the corresponding transcripts. This task is complicated by alternative splicing, a mechanism by which the same gene may generate multiple distinct RNA transcripts. We develop a novel genome-guided transcriptome assembler, RefShannon, that exploits the varying abundances of the different transcripts to enable accurate reconstruction of the transcripts. Our evaluation shows that RefShannon improves sensitivity by up to 22% at a given specificity in comparison with other state-of-the-art assemblers. RefShannon is written in Python and is available from GitHub (https://github.com/shunfumao/RefShannon).
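
To illustrate why differing transcript abundances can help disentangle isoforms, the following is a minimal sketch of a generic greedy flow decomposition on a toy splice graph. It is not RefShannon's algorithm or API: the function names (`widest_path`, `greedy_flow_decomposition`), the exon labels, the artificial source `s` and sink `t`, and the edge abundances are all hypothetical and chosen only for illustration. In the toy graph, two transcripts share exon `e3`, so four exon paths are structurally possible, but the edge abundances (10 and 3) suggest which incoming edge pairs with which outgoing edge.

```python
import heapq
from collections import defaultdict


def widest_path(remaining, source, sink):
    """Return the source-to-sink path with the largest bottleneck flow.

    `remaining` maps (u, v) edges to their remaining flow. Uses a
    Dijkstra-style search that maximizes the minimum edge flow on the path.
    """
    adj = defaultdict(list)
    for (u, v), flow in remaining.items():
        if flow > 0:
            adj[u].append((v, flow))

    best = {source: float("inf")}  # best bottleneck found so far per node
    parent = {}
    heap = [(-float("inf"), source)]  # max-heap via negated widths
    while heap:
        neg_width, u = heapq.heappop(heap)
        width = -neg_width
        if width < best.get(u, 0):
            continue  # stale heap entry
        for v, flow in adj[u]:
            candidate = min(width, flow)
            if candidate > best.get(v, 0):
                best[v] = candidate
                parent[v] = u
                heapq.heappush(heap, (-candidate, v))

    if sink not in best:
        return None, 0
    path = [sink]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return path[::-1], best[sink]


def greedy_flow_decomposition(edge_flow, source, sink):
    """Greedily peel off the widest remaining path until no flow is left."""
    remaining = dict(edge_flow)
    paths = []
    while True:
        path, bottleneck = widest_path(remaining, source, sink)
        if path is None:
            break
        for edge in zip(path, path[1:]):
            remaining[edge] -= bottleneck
        paths.append((path, bottleneck))
    return paths


# Hypothetical splice graph: exons e1 and e2 both feed e3, which feeds e4 and e5.
toy_flow = {
    ("s", "e1"): 10, ("s", "e2"): 3,
    ("e1", "e3"): 10, ("e2", "e3"): 3,
    ("e3", "e4"): 10, ("e3", "e5"): 3,
    ("e4", "t"): 10, ("e5", "t"): 3,
}

for path, abundance in greedy_flow_decomposition(toy_flow, "s", "t"):
    print(" -> ".join(path), "abundance", abundance)
```

On this toy input the greedy procedure explains all of the flow with two paths rather than four, pairing the high-abundance edges with each other and the low-abundance edges with each other. A greedy heuristic of this kind does not guarantee the sparsest decomposition in general; it only shows the intuition that abundance differences carry information about which exon connections belong to the same transcript.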