PLoS ONE (Jan 2020)
Conserved sequence motifs in the abiotic stress response protein late embryogenesis abundant 3.
Abstract
LEA3 proteins, a family of abiotic stress proteins, are defined by the presence of a tryptophan-containing motif, which we name the W-motif. We used Pfam LEA3 sequences to search the Phytozome database, from which we created a W-motif definition and a LEA3 sequence dataset. A comprehensive analysis of these sequences revealed four N-terminal motifs, as well as two previously undiscovered C-terminal motifs that contain conserved acidic and hydrophobic residues. The general architecture of the LEA3 sequences consisted of an N-terminal motif with a potential mitochondrial transport signal and the twin-arginine motif cut-site, followed by a W-motif and often a C-terminal motif. Analysis of the species distribution of the motifs showed that one architecture was found exclusively in Commelinids, while two were distributed fairly evenly across all species. The physicochemical properties of the different architectures clustered in a relatively narrow range compared to those of the previously studied dehydrins. The evolutionary analysis revealed that the sequences grouped into clades based on architecture, and that there appear to be at least two distinct groups of LEA3 proteins based on their architectures and physicochemical properties. The presence of LEA3 proteins in non-vascular plants and their absence in algae suggest that LEA3 may have arisen during the evolution of land plants.
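As a rough illustration of how sequences might be compared on physicochemical properties, the sketch below computes two common descriptors for protein sequences: the grand average of hydropathy (GRAVY, using the Kyte-Doolittle scale) and the net charge per residue. The abstract does not state which properties were measured, so these two descriptors are an assumption made here for illustration, and the example sequences and their names are hypothetical toy strings, not LEA3 proteins.

```python
# Minimal sketch: two simple physicochemical descriptors per sequence.
# Assumption: sequences contain only the 20 standard amino acids.

# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def net_charge_per_residue(seq: str) -> float:
    """(K + R - D - E) / length; histidine is treated as neutral."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return (positive - negative) / len(seq)


def summarize(sequences: dict) -> dict:
    """Map each sequence name to its (GRAVY, net charge per residue) pair."""
    return {
        name: (gravy(seq), net_charge_per_residue(seq))
        for name, seq in sequences.items()
    }


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    toy = {
        "toy_seq_1": "MARSLKDEWAAG",
        "toy_seq_2": "MKKRSVDEEWLL",
    }
    for name, (hydropathy, charge) in summarize(toy).items():
        print(f"{name}: GRAVY={hydropathy:.3f}, net charge/residue={charge:.3f}")
```

Plotting such pairs for each sequence, grouped by architecture, is one way to see whether a family occupies a narrow or a broad region of property space.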