BMC Bioinformatics (Nov 2020)

DeepFrag-k: a fragment-based deep learning approach for protein fold recognition

  • Wessam Elhefnawy,
  • Min Li,
  • Jianxin Wang,
  • Yaohang Li

DOI
https://doi.org/10.1186/s12859-020-3504-z
Journal volume & issue
Vol. 21, no. S6
pp. 1 – 12

Abstract

Background: One of the most essential problems in structural bioinformatics is protein fold recognition. In this paper, we design a novel deep learning architecture, so-called DeepFrag-k, which identifies fold discriminative features at fragment level to improve the accuracy of protein fold recognition. DeepFrag-k is composed of two stages: the first stage employs a multi-modal Deep Belief Network (DBN) to predict the potential structural fragments given a sequence, represented as a fragment vector, and then the second stage uses a deep convolutional neural network (CNN) to classify the fragment vector into the corresponding fold.

Results: Our results show that DeepFrag-k yields 92.98% accuracy in predicting the top-100 most popular fragments, which can be used to generate discriminative fragment feature vectors to improve protein fold recognition.

Conclusions: There is a set of fragments that can serve as structural “keywords” distinguishing between major protein folds. The deep learning architecture in DeepFrag-k is able to accurately identify these fragments as structure features to improve protein fold recognition.
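
The two-stage data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: a single sigmoid layer stands in for the multi-modal DBN, one 1-D convolution plus a softmax layer stands in for the deep CNN, and the weights are random rather than trained. Only the vocabulary size of 100 fragments comes from the abstract; the input feature size, the number of fold classes, the kernel size, and all function names are hypothetical.

```python
import math
import random

K = 100            # fragment vocabulary size (the abstract's top-100 fragments)
N_FEATURES = 40    # hypothetical size of the per-sequence input features
N_FOLDS = 7        # hypothetical number of fold classes
KERNEL_SIZE = 5    # hypothetical convolution kernel width


def init_params(seed=0):
    """Random, untrained weights; placeholders for learned parameters."""
    rnd = random.Random(seed)

    def matrix(rows, cols):
        return [[rnd.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "w1": matrix(K, N_FEATURES),
        "kernel": [rnd.gauss(0, 0.1) for _ in range(KERNEL_SIZE)],
        "w2": matrix(N_FOLDS, K - KERNEL_SIZE + 1),
    }


def stage1_fragment_vector(features, w1):
    """Stand-in for the DBN: one score in (0, 1) per candidate fragment."""
    scores = []
    for row in w1:
        activation = sum(w * x for w, x in zip(row, features))
        scores.append(1.0 / (1.0 + math.exp(-activation)))
    return scores


def stage2_fold_probs(frag_vec, kernel, w2):
    """Stand-in for the CNN: 1-D convolution, ReLU, then softmax over folds."""
    n_out = len(frag_vec) - len(kernel) + 1
    conv = [
        sum(kernel[j] * frag_vec[i + j] for j in range(len(kernel)))
        for i in range(n_out)
    ]
    hidden = [max(0.0, c) for c in conv]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_fold(features, params):
    """Sequence features -> fragment vector -> fold probabilities."""
    frag_vec = stage1_fragment_vector(features, params["w1"])
    probs = stage2_fold_probs(frag_vec, params["kernel"], params["w2"])
    return frag_vec, probs
```

Calling predict_fold([0.5] * N_FEATURES, init_params()) returns a 100-element fragment vector and a probability distribution over the hypothetical fold classes. The point of the sketch is the intermediate representation: the second stage never sees the sequence, only the fragment vector produced by the first.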

Keywords