BMC Genomics (Mar 2008)

Bioinformatics prediction of overlapping frameshifted translation products in mammalian transcripts

  • Seuwen Klaus,
  • Baumgarten Birgit,
  • Brüngger Adrian,
  • Ribrioux Sebastien,
  • John Markus R

DOI
https://doi.org/10.1186/1471-2164-9-122
Journal volume & issue
Vol. 9, no. 1
p. 122

Abstract

Read online

Abstract Background Exceptionally, a single nucleotide sequence can be translated in vivo in two different frames to yield distinct proteins. In the case of the G-protein alpha subunit XL-alpha-s transcript, a frameshifted open reading frame (ORF) in exon 1 is translated to yield a structurally distinct protein called Alex, which plays a role in platelet aggregation and neurological processes. We carried out a novel bioinformatics screen for other possible dual-frame translated sequences, based on comparative genomics. Results Our method searched human, mouse and rat transcripts in frames +1 and -1 for ORFs which are unusually well conserved at the amino acid level. We name these conserved frameshifted overlapping ORFs 'matreshkas' to reflect their nested character. Select findings of our analysis revealed that the G-protein coupled receptor GPR27 is entirely contained within a frame -1 matreshka, thrombopoietin contains a matreshka which spans ~70% of its length, platelet glycoprotein IIIa (ITGB3) contains a matreshka with the predicted characteristics of a secreted peptide hormone, while the potassium channel KCNK12 contains a matreshka spanning >400 amino acids. Conclusion Although the in vivo existence of translated matreshkas has not been experimentally verified, this genome-wide analysis provides strong evidence that substantial overlapping coding sequences exist in a number of human and rodent transcripts.