Frontiers in Cell and Developmental Biology (Oct 2024)

DeepO-GlcNAc: a web server for prediction of protein O-GlcNAcylation sites using deep learning combined with attention mechanism

  • Liyuan Zhang,
  • Tingzhi Deng,
  • Shuijing Pan,
  • Minghui Zhang,
  • Yusen Zhang,
  • Chunhua Yang,
  • Xiaoyong Yang,
  • Geng Tian,
  • Jia Mi

DOI
https://doi.org/10.3389/fcell.2024.1456728
Journal volume & issue
Vol. 12

Abstract

Introduction: Protein O-GlcNAcylation is a dynamic post-translational modification involved in major cellular processes and associated with many human diseases. Bioinformatic prediction of O-GlcNAc sites before experimental validation is a challenging task in O-GlcNAc research. Recent advancements in deep learning algorithms and the availability of O-GlcNAc proteomics data present an opportunity to improve O-GlcNAc site prediction.

Objectives: This study aims to develop a deep learning-based tool to improve O-GlcNAcylation site prediction.

Methods: We construct an annotated unbalanced O-GlcNAcylation data set and propose a new deep learning framework, DeepO-GlcNAc, that combines Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) layers with an attention mechanism.

Results: The ablation study confirms that the additional model components in DeepO-GlcNAc, such as the attention mechanism and LSTM, improve prediction performance. Our model demonstrates strong robustness across five non-human cross-species datasets. We also compare our model with three external predictors using an independent dataset. Our results demonstrate that DeepO-GlcNAc outperforms the external predictors, achieving an accuracy of 92%, an average precision of 72%, an MCC of 0.60, and an AUC of 92% in ROC analysis. Moreover, we have implemented DeepO-GlcNAc as a web server to facilitate further investigation and usage by the scientific community.

Conclusion: Our work demonstrates the feasibility of utilizing deep learning for O-GlcNAc site prediction and provides a novel tool for O-GlcNAc investigation.
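
The abstract reports the Matthews correlation coefficient (MCC) alongside accuracy, which matters because the data set is unbalanced. The following is a minimal sketch of how the two metrics can diverge on unbalanced data. It is not the authors' evaluation code; the function names and the confusion-matrix counts are hypothetical and chosen only for illustration.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention
    for the case where a row or column of the matrix is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical unbalanced test set: 100 modified sites, 900 unmodified sites.
# A trivial predictor that labels every site "unmodified":
trivial = dict(tp=0, tn=900, fp=0, fn=100)
# A perfect predictor on the same set:
perfect = dict(tp=100, tn=900, fp=0, fn=0)

print(accuracy(**trivial), mcc(**trivial))
print(accuracy(**perfect), mcc(**perfect))
```

In this illustration the trivial predictor still reaches 90% accuracy while its MCC is 0, which is why a correlation-style metric is a useful companion to accuracy when negatives greatly outnumber positives.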

Keywords