PLoS ONE (Jan 2021)
The GH19 Engineering Database: Sequence diversity, substrate scope, and evolution in glycoside hydrolase family 19.
Abstract
The glycoside hydrolase 19 (GH19) is a bifunctional family of chitinases and endolysins, which have been studied for the control of plant fungal pests, the recycle of chitin biomass, and the treatment of multi-drug resistant bacteria. The GH19 domain-containing sequences (22,461) were divided into a chitinase and an endolysin subfamily by analyzing sequence networks, guided by taxonomy and the substrate specificity of characterized enzymes. The chitinase subfamily was split into seventeen groups, thus extending the previous classification. The endolysin subfamily is more diverse and consists of thirty-four groups. Despite their sequence diversity, twenty-six residues are conserved in chitinases and endolysins, which can be distinguished by two specific sequence patterns at six and four positions, respectively. Their location outside the catalytic cleft suggests a possible mechanism for substrate specificity that goes beyond the direct interaction with the substrate. The evolution of the GH19 catalytic domain was investigated by large-scale phylogeny. The inferred evolutionary history and putative horizontal gene transfer events differ from previous works. While no clear patterns were detected in endolysins, chitinases varied in sequence length by up to four loop insertions, causing at least eight distinct presence/absence loop combinations. The annotated GH19 sequences and structures are accessible via the GH19 Engineering Database (GH19ED, https://gh19ed.biocatnet.de). The GH19ED has been developed to support the prediction of substrate specificity and the search for novel GH19 enzymes from neglected taxonomic groups or in regions of the sequence space where few sequences have been described yet.