Biology Direct (Sep 2010)

Relating underrepresented genomic DNA patterns and tiRNAs: the rule behind the observation and beyond

  • Varnai Peter,
  • Turu Gabor,
  • Cserzo Miklos,
  • Hunyady Laszlo

DOI
https://doi.org/10.1186/1745-6150-5-56
Journal volume & issue
Vol. 5, no. 1
p. 56

Abstract

Background
One of the central problems of post-genomic biology is understanding the regulatory network of genes. Traditionally, the problem is approached from the perspective of protein-DNA interactions. In recent years, various types of non-coding RNAs have appeared on the scene as potent new players. The exact role of these molecules in gene expression control is mostly unknown at present, although their importance is generally recognized.

Results
The human and mouse genomes were screened with a statistical model for sequence patterns underrepresented in these genomes, and a subset of motifs, named spanions, was identified. The motif lists of the two species overlap by 75%, indicating evolutionary conservation of this feature. These motifs are arranged in clusters in close proximity to distinct genetic landmarks: the 5' ends of genes, the exon side of exon/intron junctions, and the 5' ends of 3' UTRs. The clusters are typically 20 to 25 bases long. The findings agree with the known C/G bias of promoter regions, while the model captures much more sequence information than a simple composition-based model. According to the presented results, the recently reported transcription initiation RNAs (tiRNAs) in the human genome are typically transcribed from these spanion clusters, which account for 70% of the published tiRNAs. The model apparently captures a common statistical feature of this new and mostly uncharacterized non-coding RNA class and in this way gives the experimental observations a theoretical background.

Conclusions
The presented results seem to support the emerging model of RNA-driven eukaryotic gene expression control. Beyond that, the model detects spanion clusters at genetic positions where no tiRNA counterpart was considered and reported. GO-term analysis of genes with a high concentration of spanion clusters in their promoter-proximal region indicates involvement in gene regulatory processes. The results suggest that the gene regulatory potential of small non-coding RNAs is grossly underestimated at present.

Reviewers
This article was reviewed by Frank Eisenhaber, Sandor Pongor and Rotem Sorek (nominated by Doron Lancet).
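
The abstract does not specify the statistical model used to screen for underrepresented patterns. As a rough illustration of the general idea only, the sketch below compares observed k-mer counts with the counts expected under a zero-order (independent base frequency) background. This background model, the function names `kmer_obs_exp` and `underrepresented`, and the `threshold` parameter are all assumptions made for this example, not the authors' method.

```python
from collections import Counter
from itertools import product


def kmer_obs_exp(seq, k):
    """Return {kmer: observed/expected} for every k-mer over ACGT.

    Assumption: the expected count comes from a zero-order model, i.e. the
    product of single-base frequencies times the number of k-mer windows.
    This is a stand-in, not the model used in the paper.
    """
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        return {}
    # Single-base frequencies measured on the sequence itself.
    base_freq = {b: seq.count(b) / len(seq) for b in "ACGT"}
    observed = Counter(seq[i:i + k] for i in range(n_windows))
    ratios = {}
    for kmer in map("".join, product("ACGT", repeat=k)):
        p = 1.0
        for b in kmer:
            p *= base_freq[b]
        expected = p * n_windows
        if expected > 0:  # skip k-mers that cannot occur under the background
            ratios[kmer] = observed[kmer] / expected
    return ratios


def underrepresented(seq, k, threshold=0.5):
    """List k-mers whose observed/expected ratio is below a hypothetical cutoff."""
    return sorted(m for m, r in kmer_obs_exp(seq, k).items() if r < threshold)


# Toy usage: in an alternating sequence, the homodimers never occur.
print(underrepresented("ACACACACAC", 2))  # ['AA', 'CC']
```

A genome-scale screen would need a richer background model (for example a higher-order Markov chain) and a significance test rather than a fixed ratio cutoff; the toy version only shows the observed-versus-expected comparison.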