BMC Genomics (May 2008)
Uncovering rate variation of lateral gene transfer during bacterial genome evolution
Abstract
Background Large-scale genome arrangement, such as whole gene insertion/deletion, plays an important role in bacterial genome evolution. Various methods, including parsimony and maximum likelihood, have been used to study the dynamic process of gene insertions and deletions. Previous maximum likelihood studies have assumed that the rate of gene insertions/deletions is constant across genes. This assumption is unrealistic. For instance, it has been shown that informational genes are less likely to be laterally transferred than non-informational genes. However, it is unclear how much of the variation in gene transfer rates is due to the difference between informational and non-informational genes. In this study, a Γ-distribution was incorporated into the likelihood estimation to model variation among genes in the rate of insertions/deletions. This makes it possible to test whether the difference between informational and non-informational genes is the main contributor to rate variation of lateral gene transfers.
Results Models incorporating rate variation fit the data better than constant-rate models in many phylogenetic groups. Even though informational genes are less likely to be laterally transferred than non-informational genes, the degree of rate variation for insertions/deletions did not change dramatically and remained high even when informational genes were excluded from the study. This suggests that the variation in the rate of insertions/deletions is not due mainly to the simple difference between informational and non-informational genes. Among genes that are not classified as informational, and among the informational genes themselves, there are still large differences in the rates at which genes are inserted and deleted.
Conclusion While the rate difference between informational and non-informational genes contributes to rate variation, it accounts for only a small fraction of the variation present; a substantial amount of rate variation for insertions/deletions remains among both informational and non-informational genes.
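
To make the idea of incorporating a Γ-distribution into a likelihood more concrete, the following is a minimal sketch of one common way to do it: approximating the Γ-distribution by a small number of equal-probability rate categories and averaging a per-gene likelihood over them. It is not the implementation used in this study. The function names, the choice of four categories, and the toy two-state presence/absence model (`toy_pattern_likelihood`, with its branch length `t`) are all assumptions made for illustration only.

```python
import math


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma function P(a, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total) and n < 100000:
        term *= x / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def gamma_quantile(p, shape, rate):
    """Quantile of Gamma(shape, rate), found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(shape, rate * hi) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(shape, rate * mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discrete_gamma_rates(alpha, k=4):
    """Mean rate of each of k equal-probability categories of a
    Gamma(shape=alpha, rate=alpha) distribution, which has mean 1."""
    cuts = [0.0] + [gamma_quantile(i / k, alpha, alpha) for i in range(1, k)]
    # P(alpha + 1, alpha * c) is the share of the overall mean lying below c.
    partial = [reg_lower_gamma(alpha + 1.0, alpha * c) for c in cuts] + [1.0]
    return [k * (partial[i + 1] - partial[i]) for i in range(k)]


def mixture_likelihood(per_rate_likelihood, rates):
    """Average a per-gene likelihood over equally weighted rate categories."""
    return sum(per_rate_likelihood(r) for r in rates) / len(rates)


def toy_pattern_likelihood(r, t=0.3, differs=True):
    """Hypothetical symmetric two-state model: probability that a gene's
    presence/absence state differs between two genomes at distance t,
    given a relative insertion/deletion rate r."""
    p_diff = 0.5 * (1.0 - math.exp(-2.0 * r * t))
    return p_diff if differs else 1.0 - p_diff


if __name__ == "__main__":
    for alpha in (0.5, 50.0):
        rates = discrete_gamma_rates(alpha, k=4)
        lik = mixture_likelihood(toy_pattern_likelihood, rates)
        print(alpha, [round(r, 4) for r in rates], round(lik, 4))
```

In this sketch a small shape parameter (here 0.5) spreads the category rates widely, which corresponds to strong rate variation among genes, while a large shape parameter (here 50) pulls all categories toward 1 and approaches a constant-rate model. Comparing the fitted likelihoods of the two kinds of model is the sort of test the abstract describes.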