BMC Evolutionary Biology (Feb 2007)

Long-term trends in evolution of indels in protein sequences

  • Shoemaker Benjamin,
  • Babenko Vladimir,
  • Madej Thomas,
  • Wolf Yuri,
  • Panchenko Anna R

DOI
https://doi.org/10.1186/1471-2148-7-19
Journal volume & issue
Vol. 7, no. 1
p. 19

Abstract

Background

In this paper we describe an analysis of the size evolution of both protein domains and their indels, as inferred by changing sizes of whole domains or individual unaligned regions or "spacers". We studied relatively early evolutionary events and focused on protein domains which are conserved among various taxonomy groups.

Results

We found that more than one third of all domains have a statistically significant tendency to increase/decrease in size in evolution as judged from the overall domain size distribution as well as from the size distribution of individual spacers. Moreover, the fraction of domains and individual spacers increasing in size is almost twofold larger than the fraction decreasing in size.

Conclusion

We showed that the tolerance to insertion and deletion events depends on the domain's taxonomy span. Eukaryotic domains are depleted in insertions compared to the overall test set, namely, the number of spacers increasing in size is about the same as the number of spacers decreasing in size. On the other hand, ancient domain families show some bias towards insertions or spacers which grow in size in evolution. Domains from several Gene Ontology categories also demonstrate certain tendencies for insertion or deletion events as inferred from the analysis of spacer sizes.
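As an illustration only, the comparison between spacers that grow and spacers that shrink can be framed as a sign test. The sketch below is not the authors' procedure (the abstract does not specify the statistical test used); it assumes a simple two-sided exact binomial test of the null hypothesis that growth and shrinkage are equally likely. The function name `sign_test_p_value` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
from math import comb


def sign_test_p_value(n_increase: int, n_decrease: int) -> float:
    """Two-sided exact binomial (sign) test.

    Null hypothesis: a spacer is equally likely to increase or
    decrease in size. This is an illustrative assumption, not the
    test reported in the paper.
    """
    n = n_increase + n_decrease
    if n == 0:
        return 1.0  # no informative observations
    k = min(n_increase, n_decrease)
    # Probability of seeing k or fewer of the rarer outcome under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail for a two-sided test, capped at 1
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical counts chosen to mimic a twofold excess of growing spacers
    growing, shrinking = 20, 10
    print(f"ratio = {growing / shrinking:.1f}, p = {sign_test_p_value(growing, shrinking):.4f}")
```

Under this framing, a balanced result such as the one described for eukaryotic domains would give a p-value near 1, while a strong excess of growing spacers would give a small one.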