BMC Bioinformatics (May 2011)

Extraction of consensus protein patterns in regions containing non-proline cis peptide bonds and their functional assessment

  • Rigas Georgios,
  • Exarchos Themis P,
  • Exarchos Konstantinos P,
  • Papaloukas Costas,
  • Fotiadis Dimitrios I

DOI
https://doi.org/10.1186/1471-2105-12-142
Journal volume & issue
Vol. 12, no. 1
p. 142

Abstract

Background: In peptides and proteins, only a small percentage of peptide bonds adopt the cis configuration. For amide peptide bonds in particular, cis conformations are so rare that systematic studies were hampered until recently. The growing number of 3D protein structures in public databases has now produced a considerable number of sequences containing non-proline cis formations (cis-nonPro).

Results: We extract regular expression-type patterns that describe the regions surrounding cis-nonPro formations. Three types of pattern discovery are performed: i) exact pattern discovery, ii) pattern discovery using a chemical equivalency set, and iii) pattern discovery using a structural equivalency set. Using each pattern as a predicate, we then search the Eukaryotic Linear Motif (ELM) resource to identify potential functional implications of regions with cis-nonPro peptide bonds. The patterns extracted from each type of pattern discovery are also used to build a pattern-based classifier that discriminates between cis-nonPro and trans-nonPro formations.

Conclusions: In terms of functional implications, we observe a significant association of cis-nonPro peptide bonds with ligand/binding functionalities. For the pattern-based classification scheme, the best results were obtained using the structural equivalency set, which yielded 70% accuracy, 77% sensitivity and 63% specificity.
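To make the idea of pattern discovery with an equivalency set more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a pattern is a short string of residue letters with `.` as a wildcard, and that an equivalency set is a list of residue groups treated as interchangeable. The groups in `EQUIV_GROUPS`, the function names, and the example pattern are all hypothetical and are not taken from the paper. The `metrics` helper only restates the standard definitions of accuracy, sensitivity and specificity.

```python
import re

# Hypothetical equivalency groups, for illustration only
# (NOT the chemical or structural sets used in the paper).
EQUIV_GROUPS = ["ILVM", "FWY", "KRH", "DE", "ST", "NQ"]


def to_regex(pattern, groups=EQUIV_GROUPS):
    """Expand each residue in `pattern` to its equivalency group.

    '.' is kept as a single-residue wildcard; residues that belong
    to no group are matched literally.
    """
    lookup = {aa: group for group in groups for aa in group}
    parts = []
    for ch in pattern:
        if ch == ".":
            parts.append(".")
        else:
            group = lookup.get(ch, ch)
            parts.append(f"[{group}]" if len(group) > 1 else group)
    return "".join(parts)


def matches_any(window, patterns, groups=EQUIV_GROUPS):
    """Return True if any expanded pattern occurs in the sequence window."""
    return any(re.search(to_regex(p, groups), window) for p in patterns)


def metrics(tp, fn, tn, fp):
    """Standard accuracy, sensitivity and specificity from a confusion matrix."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity


if __name__ == "__main__":
    print(to_regex("D.F"))                  # pattern after expansion
    print(matches_any("AAEGWAA", ["D.F"]))  # window containing an equivalent match
    # Made-up counts, used only to exercise the formulas.
    print(metrics(tp=8, fn=2, tn=6, fp=4))
```

In this sketch a window is labelled cis-nonPro as soon as one pattern matches; a real pattern-based classifier could weight or combine patterns differently, and the abstract does not specify how.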