BMC Bioinformatics (Oct 2020)

ConCysFind: a pipeline tool to predict conserved amino acids of protein sequences across the plant kingdom

  • Marten Moore,
  • Corinna Wesemann,
  • Nikolaj Gossmann,
  • Arne Sahm,
  • Jan Krüger,
  • Alexander Sczyrba,
  • Karl-Josef Dietz

DOI
https://doi.org/10.1186/s12859-020-03749-2
Journal volume & issue
Vol. 21, no. 1
pp. 1 – 10

Abstract

Background: Post-translational modifications (PTMs) of amino acid (AA) side chains in peptides control protein structure and function. PTMs depend on the specific characteristics of each AA. The reactivity of cysteine thiol-based PTMs is unique among all proteinogenic AAs. The pipeline presented here aims to ease the identification of conserved AAs in polypeptides or protein families based on their phylogenetic occurrence in the plant kingdom. The tool is customizable to include any species. The degree of AA conservation is taken as an indicator of structural and functional significance, especially for PTM-based regulation. Further, the pipeline gives insight into the evolution of these peptides of potential regulatory importance.

Results: The web-based or stand-alone pipeline tool Conserved Cysteine Finder (ConCysFind) was developed to identify conserved AAs such as cysteine, tryptophan, serine, threonine, tyrosine and methionine. ConCysFind evaluates multiple alignments considering the proteomes of 21 plant species. This exemplar study focused on Cys as an evolutionarily conserved target of multiple redox PTMs. Phylogenetic trees and tables with the compressed results of the scoring algorithm are generated for each Cys in the query polypeptide. Analysis of 33 translation elongation and release factors, alongside known redox proteins from Arabidopsis thaliana, for conserved Cys residues confirmed the suitability of the tool for identifying conserved and functional PTM sites. As an example, the redox sensitivity of cysteines in the eukaryotic release factor 1-1 (eRF1-1) was experimentally validated.

Conclusion: ConCysFind is a valuable tool for predicting new potential protein PTM targets in a broad spectrum of species, based on AAs conserved throughout the plant kingdom. The identified targets were successfully verified through protein biochemical assays. The pipeline is universally applicable to other phylogenetic branches by customization of the database.
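The abstract does not specify the scoring algorithm, so the following is only a minimal sketch of the general idea of residue conservation in an alignment, not ConCysFind's actual method. It assumes that pre-aligned, equal-length sequences are already available and takes the fraction of other sequences carrying the same residue as a stand-in conservation measure. The function name `conserved_residue_fractions` and the toy sequences are hypothetical.

```python
from typing import Dict, List


def conserved_residue_fractions(
    query: str, others: List[str], residue: str = "C"
) -> Dict[int, float]:
    """Map each 1-based query position holding `residue` to the fraction
    of the other aligned sequences that carry it in the same column.

    Assumption: a plain fraction, not the scoring used by ConCysFind.
    """
    if any(len(seq) != len(query) for seq in others):
        raise ValueError("all aligned sequences must have the same length")

    fractions: Dict[int, float] = {}
    position = 0  # ungapped position within the query sequence
    for column, aa in enumerate(query):
        if aa == "-":
            continue  # gap in the query: no query residue in this column
        position += 1
        if aa.upper() != residue:
            continue
        matches = sum(1 for seq in others if seq[column].upper() == residue)
        fractions[position] = matches / len(others) if others else 0.0
    return fractions


# Toy alignment with invented sequences (not real proteins).
query = "MC-ACK"
others = ["MCGACK", "MS-ACR", "MCG-CK"]
print(conserved_residue_fractions(query, others))
```

In this toy alignment the cysteine at query position 2 is shared by two of the three other sequences, while the one at position 4 is shared by all three. A real analysis would additionally need ortholog retrieval, alignment, and phylogenetic weighting, none of which is attempted here.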

Keywords