Journal of Cheminformatics (Oct 2024)

Bitter peptide prediction using graph neural networks

  • Prashant Srivastava,
  • Alexandra Steuer,
  • Francesco Ferri,
  • Alessandro Nicoli,
  • Kristian Schultz,
  • Saptarshi Bej,
  • Antonella Di Pizio,
  • Olaf Wolkenhauer

DOI
https://doi.org/10.1186/s13321-024-00909-x
Journal volume & issue
Vol. 16, no. 1
pp. 1 – 13

Abstract

Read online

Abstract Bitter taste is an unpleasant taste modality that affects food consumption. Bitter peptides are generated during enzymatic processes that produce functional, bioactive protein hydrolysates or during the aging process of fermented products such as cheese, soybean protein, and wine. Understanding the underlying peptide sequences responsible for bitter taste can pave the way for more efficient identification of these peptides. This paper presents BitterPep-GCN, a feature-agnostic graph convolution network for bitter peptide prediction. The graph-based model learns the embedding of amino acids in the bitter peptide sequences and uses mixed pooling for bitter classification. BitterPep-GCN was benchmarked using BTP640, a publicly available bitter peptide dataset. The latent peptide embeddings generated by the trained model were used to analyze the activity of sequence motifs responsible for the bitter taste of the peptides. Particularly, we calculated the activity for individual amino acids and dipeptide, tripeptide, and tetrapeptide sequence motifs present in the peptides. Our analyses pinpoint specific amino acids, such as F, G, P, and R, as well as sequence motifs, notably tripeptide and tetrapeptide motifs containing FF, as key bitter signatures in peptides. This work not only provides a new predictor of bitter taste for a more efficient identification of bitter peptides in various food products but also gives a hint into the molecular basis of bitterness. Scientific Contribution Our work provides the first application of Graph Neural Networks for the prediction of peptide bitter taste. The best-developed model, BitterPep-GCN, learns the embedding of amino acids in the bitter peptide sequences and uses mixed pooling for bitter classification. The embeddings were used to analyze the sequence motifs responsible for the bitter taste.

Keywords