PeerJ Computer Science (Feb 2016)

PATACSDB—the database of polyA translational attenuators in coding sequences

  • Malgorzata Habich,
  • Sergej Djuranovic,
  • Pawel Szczesny

DOI
https://doi.org/10.7717/peerj-cs.45
Journal volume & issue
Vol. 2
p. e45

Abstract

Recent additions to the repertoire of gene expression regulatory mechanisms are polyadenylate (polyA) tracks that encode poly-lysine runs in protein sequences. Such tracks stall the translation apparatus and induce frameshifting independently of the effects of the charged nascent poly-lysine sequence on the ribosome exit channel. As such, they substantially influence the stability of mRNA and the amount of protein produced from a given transcript. Single base changes in these regions are enough to produce a measurable change in both protein and mRNA abundance; this makes each of these sequences a potentially interesting case study for the effects of synonymous mutation, gene dosage balance and natural frameshifting. Here we present PATACSDB, a resource that contains a comprehensive list of polyA tracks from over 250 eukaryotic genomes. Our data are drawn from the Ensembl genomic database of coding sequences and filtered with the 12A-1 algorithm, which selects polyA tracks with a minimum length of 12 A’s, allowing for one mismatched base. The PATACSDB database is accessible at http://sysbio.ibb.waw.pl/patacsdb. The source code is available at http://github.com/habich/PATACSDB and includes the scripts with which the database can be recreated.
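
The 12A-1 criterion described in the abstract can be illustrated with a minimal sketch. It is not the code from the PATACSDB repository. It assumes one reading of the rule: a coding sequence qualifies if any window of 12 consecutive nucleotides contains at least 11 A's. The function names `polya_window_starts` and `passes_12a1` and their parameters are hypothetical, introduced only for this illustration.

```python
def polya_window_starts(cds, window=12, max_mismatches=1):
    """Return the 0-based start of every window that meets the criterion.

    Assumed reading of "12A-1": a window of `window` consecutive
    nucleotides qualifies when at most `max_mismatches` of them are not 'A'.
    """
    seq = cds.upper()  # treat soft-masked (lowercase) bases like uppercase
    starts = []
    # range() is empty when the sequence is shorter than one window
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        mismatches = window - chunk.count("A")
        if mismatches <= max_mismatches:
            starts.append(start)
    return starts


def passes_12a1(cds):
    """True if the coding sequence has at least one qualifying window."""
    return bool(polya_window_starts(cds))


if __name__ == "__main__":
    # Hypothetical toy sequence: start codon, 12 A's, then a GGC codon.
    example = "ATG" + "A" * 12 + "GGC"
    print(polya_window_starts(example))
    print(passes_12a1(example))
```

A fixed-width sliding window keeps the rule easy to check by hand. Overlapping hits that belong to one longer track would still need to be merged before reporting, and this sketch does not attempt that step.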

Keywords