International Journal of Molecular Sciences (Oct 2023)

ACP-BC: A Model for Accurate Identification of Anticancer Peptides Based on Fusion Features of Bidirectional Long Short-Term Memory and Chemically Derived Information

  • Mingwei Sun,
  • Haoyuan Hu,
  • Wei Pang,
  • You Zhou

DOI
https://doi.org/10.3390/ijms242015447
Journal volume & issue
Vol. 24, no. 20
p. 15447

Abstract

Read online

Anticancer peptides (ACPs) have been proven to possess potent anticancer activities. Although computational methods have emerged for rapid ACPs identification, their accuracy still needs improvement. In this study, we propose a model called ACP-BC, a three-channel end-to-end model that utilizes various combinations of data augmentation techniques. In the first channel, features are extracted from the raw sequence using a bidirectional long short-term memory network. In the second channel, the entire sequence is converted into a chemical molecular formula, which is further simplified using Simplified Molecular Input Line Entry System notation to obtain deep abstract features through a bidirectional encoder representation transformer (BERT). In the third channel, we manually selected four effective features according to dipeptide composition, binary profile feature, k-mer sparse matrix, and pseudo amino acid composition. Notably, the application of chemical BERT in predicting ACPs is novel and successfully integrated into our model. To validate the performance of our model, we selected two benchmark datasets, ACPs740 and ACPs240. ACP-BC achieved prediction accuracy with 87% and 90% on these two datasets, respectively, representing improvements of 1.3% and 7% compared to existing state-of-the-art methods on these datasets. Therefore, systematic comparative experiments have shown that the ACP-BC can effectively identify anticancer peptides.

Keywords