PLoS Neglected Tropical Diseases (Aug 2010)
Trichomonas transmembrane cyclases result from massive gene duplication and concomitant development of pseudogenes.
Abstract
Trichomonas vaginalis has an unusually large genome (approximately 160 Mb) encoding approximately 60,000 proteins. With the goal of beginning to understand why some Trichomonas genes are present in so many copies, we characterized here a family of approximately 123 Trichomonas genes that encode transmembrane adenylyl cyclases (TMACs).The large family of TMACs genes is the result of recent duplications of a small set of ancestral genes that appear to be unique to trichomonads. Duplicated TMAC genes are not closely associated with repetitive elements, and duplications of flanking sequences are rare. However, there is evidence for TMAC gene replacements by homologous recombination. A high percentage of TMAC genes (approximately 46%) are pseudogenes, as they contain stop codons and/or frame shifts, or the genes are truncated. Numerous stop codons present in the genome project G3 strain are not present in orthologous genes of two other Trichomonas strains (S1 and B7RC2). Each TMAC is composed of a series of N-terminal transmembrane helices and a single C-terminal cyclase domain that has adenylyl cyclase activity. Multiple TMAC genes are transcribed by Trichomonas cloned by limiting dilution.We conclude that one reason for the unusually large genome of Trichomonas is the presence of unstable families of genes such as those encoding TMACs that are undergoing massive gene duplication and concomitant development of pseudogenes.