PLoS ONE (Jan 2013)

CLCAs - a family of metalloproteases of intriguing phylogenetic distribution and with cases of substituted catalytic sites.

  • Anna Lenart,
  • Małgorzata Dudkiewicz,
  • Marcin Grynberg,
  • Krzysztof Pawłowski

DOI
https://doi.org/10.1371/journal.pone.0062272
Journal volume & issue
Vol. 8, no. 5
p. e62272

Abstract

The zinc-dependent metalloproteases with the His-Glu-x-x-His (HExxH) active site motif, zincins, are a broad group of proteins involved in many metabolic and regulatory functions, and are found in all forms of life. The human genome contains more than 100 genes encoding proteins with known zincin-like domains. A survey of all proteins containing the HExxH motif shows that approximately 52% of HExxH occurrences fall within known protein structural domains (as defined in the Pfam database). Domain families in which a majority of members possess a conserved HExxH motif include, not surprisingly, many known and putative metalloproteases. Furthermore, several HExxH-containing protein domains thus identified can be confidently predicted to be putative peptidases of the zincin fold. Thus, we predict a zincin-like fold for eight uncharacterised Pfam families. Besides the domains in which the HExxH motif is strictly conserved, and those with only sporadic occurrences, intermediate families are identified that contain some members with a conserved HExxH motif, but also many homologues with substitutions at the conserved positions. Such substitutions can be evolutionarily conserved and non-random, yet the functional roles of these inactive zincins are not known. The CLCAs are a novel zincin-like protease family with many cases of substituted active sites. We show that this allegedly metazoan family has a number of bacterial and archaeal members. The extremely patchy phylogenetic distribution of CLCAs in prokaryotes and their conserved protein domain composition strongly suggest an evolutionary scenario of horizontal gene transfer (HGT) from multicellular eukaryotes to bacteria, providing an example of eukaryote-derived xenologues in bacterial genomes. Additionally, in a protein family identified here as closely homologous to CLCA, the CLCA_X (CLCA-like) family, a number of proteins are found in phages and plasmids, supporting the HGT scenario.
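As an illustration only, the kind of motif scan that a survey of HExxH occurrences implies can be sketched in a few lines of Python. This is a minimal sketch and not the authors' pipeline: the function names `find_hexxh` and `site_status` and the toy sequences are hypothetical, and a real survey would also map each hit onto Pfam domain boundaries, which is not shown here.

```python
import re

# HExxH in one-letter amino acid codes: His, Glu, any two residues, His.
# The lookahead allows overlapping occurrences to be reported.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")


def find_hexxh(sequence):
    """Return (0-based start, matched 5-mer) for every HExxH occurrence."""
    return [(m.start(), m.group(1)) for m in HEXXH.finditer(sequence.upper())]


def site_status(five_mer):
    """Classify an aligned 5-residue site (hypothetical helper).

    'conserved' means His, Glu, His at positions 1, 2 and 5;
    anything else is reported as 'substituted'.
    """
    if len(five_mer) != 5:
        raise ValueError("expected exactly five residues")
    s = five_mer.upper()
    return "conserved" if (s[0], s[1], s[4]) == ("H", "E", "H") else "substituted"


if __name__ == "__main__":
    # Toy sequences invented for this example, not real proteins.
    print(find_hexxh("MKHEAAHLL"))
    print(site_status("HEAAH"), site_status("HQAAH"))
```

Applying `site_status` to the alignment column block that corresponds to the motif in each homologue gives a simple way to count how many family members keep the catalytic residues and how many carry substitutions, under the assumption that the alignment is already available.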