BMC Bioinformatics (Jan 2009)
Searching for bidirectional promoters in <it>Arabidopsis thaliana</it>
Abstract
Abstract Background A "bidirectional gene pair" is defined as two adjacent genes which are located on opposite strands of DNA with transcription start sites (TSSs) not more than 1000 base pairs apart and the intergenic region between two TSSs is commonly designated as a putative "bidirectional promoter". Individual examples of bidirectional gene pairs have been reported for years, as well as a few genome-wide analyses have been studied in mammalian and human genomes. However, no genome-wide analysis of bidirectional genes for plants has been done. Furthermore, the exact mechanism of this gene organization is still less understood. Results We conducted comprehensive analysis of bidirectional gene pairs through the whole Arabidopsis thaliana genome and identified 2471 bidirectional gene pairs. The analysis shows that bidirectional genes are often coexpressed and tend to be involved in the same biological function. Furthermore, bidirectional gene pairs associated with similar functions seem to have stronger expression correlation. We pay more attention to the regulatory analysis on the intergenic regions between bidirectional genes. Using a hierarchical stochastic language model (HSL) (which is developed by ourselves), we can identify intergenic regions enriched of regulatory elements which are essential for the initiation of transcription. Finally, we picked 27 functionally associated bidirectional gene pairs with their intergenic regions enriched of regulatory elements and hypothesized them to be regulated by bidirectional promoters, some of which have the same orthologs in ancient organisms. More than half of these bidirectional gene pairs are further supported by sharing similar functional categories as these of handful experimental verified bidirectional genes. Conclusion Bidirectional gene pairs are concluded also prevalent in plant genome. Promoter analyses of the intergenic regions between bidirectional genes could be a new way to study the bidirectional gene structure, which may provide a important clue for further analysis. Such a method could be applied to other genomes.