Open Chemistry (Jun 2019)

Applying Discriminant and Cluster Analyses to Separate Allergenic from Non-allergenic Proteins

  • Naneva L.,
  • Nedyalkova M.,
  • Madurga S.,
  • Mas F.,
  • Simeonov V.

DOI
https://doi.org/10.1515/chem-2019-0045
Journal volume & issue
Vol. 17, no. 1
pp. 401 – 407

Abstract


As a result of increased healthcare requirements and the introduction of genetically modified foods, allergies are becoming a growing health problem. This has prompted the use of new methods such as genomics and proteomics to uncover the nature of allergies. In the present study, a selection of 1400 food proteins was analysed by PLS-DA (Partial Least Squares Discriminant Analysis) after suitable transformation of structural parameters into uniform vectors: the resulting descriptor strings, which differ in length from protein to protein, were converted into vectors of equal length by Auto and Cross-Covariance (ACC) analysis. Hierarchical and non-hierarchical (K-means) Cluster Analysis (CA) was also performed in order to achieve a degree of separation within a small training set of plant proteins (16 allergenic and 16 non-allergenic), using a new three-dimensional descriptor based on surface protein properties in combination with amino acid hydrophobicity scales. The article describes the novelty of this approach to differentiating proteins into allergenic and non-allergenic classes.
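
To illustrate how an ACC transformation can turn sequences of different length into vectors of equal length, here is a minimal Python sketch. It follows one common formulation of auto and cross-covariance (lagged products of per-residue descriptors averaged over the sequence) and is not necessarily the exact variant used in the study. The function name `acc_transform`, the two-descriptor toy proteins, and the choice of `max_lag=2` are all hypothetical; the abstract does not state the descriptor set or the lag. Descriptors are assumed to be already centred and scaled.

```python
def acc_transform(descriptors, max_lag):
    """Auto and cross-covariance (ACC) sketch.

    descriptors: list of residues, each a list of d numeric descriptor values
                 (assumed to be centred/scaled beforehand).
    max_lag:     largest sequence separation considered (must be < length).

    Returns a flat list of d * d * max_lag values, so the output length
    depends only on d and max_lag, not on the protein length.
    """
    n = len(descriptors)
    d = len(descriptors[0])
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must satisfy 1 <= max_lag < sequence length")

    features = []
    for lag in range(1, max_lag + 1):
        for j in range(d):
            for k in range(d):
                # j == k gives an auto-covariance term,
                # j != k gives a cross-covariance term.
                total = sum(
                    descriptors[i][j] * descriptors[i + lag][k]
                    for i in range(n - lag)
                )
                features.append(total / (n - lag))
    return features


# Two hypothetical proteins of different length, two made-up descriptors per residue.
short_protein = [[0.1, -0.3], [0.5, 0.2], [-0.4, 0.7], [0.0, -0.1]]
long_protein = short_protein + [[0.3, 0.3], [-0.2, 0.6]]

vec_short = acc_transform(short_protein, max_lag=2)
vec_long = acc_transform(long_protein, max_lag=2)
print(len(vec_short), len(vec_long))
```

Both calls return vectors of the same length (d × d × max_lag), which is what allows proteins of different sizes to be placed in one data matrix for PLS-DA or cluster analysis.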

Keywords