BMC Evolutionary Biology (Sep 2010)
Unusual conservation among genes encoding small secreted salivary gland proteins from a gall midge
Abstract
Background
In most protein-coding genes, greater sequence variation is observed in noncoding regions (introns and untranslated regions) than in coding regions, owing to selective constraints on the coding sequence. During characterization of genes and transcripts encoding small secreted salivary gland proteins (SSSGPs) from the Hessian fly, we found exactly the opposite pattern of conservation in several families of genes: the noncoding regions were highly conserved, but the coding regions were highly variable.
Results
Seven genes from the SSSGP-1 family are clustered as one inverted and six tandem repeats within a 15 kb region of the genome. Except for SSSGP-1A2, a gene that encodes a protein identical to that encoded by SSSGP-1A1, the other six genes consist of a highly diversified mature protein-coding region together with highly conserved regions, including the promoter, the 5'- and 3'-UTRs, a signal peptide coding region, and an intron. This unusual pattern of highly diversified coding regions coupled with highly conserved regions in the rest of the gene was also observed in several other groups of SSSGP-encoding genes or cDNAs. The same pattern was found in some of the SSSGP cDNAs from the Asian rice gall midge, but not in those from the orange wheat blossom midge. Strong positive selection was one of the forces driving diversification, whereas concerted homogenization was likely a mechanism for sequence conservation.
Conclusion
Rapid diversification in mature SSSGPs suggests that the genes are under selection pressure for functional adaptation. The conservation in the noncoding regions of these genes, including introns, also suggests mechanisms for sequence homogenization that are not yet fully understood. This report should be useful for future studies on the genetic mechanisms involved in the evolution and functional adaptation of parasite genes.