Engineering, Technology & Applied Science Research (Apr 2024)

Transformer Encoder with Protein Language Model for Protein Secondary Structure Prediction

  • Ammar Kazm,
  • Aida Ali,
  • Haslina Hashim

DOI
https://doi.org/10.48084/etasr.6855
Journal volume & issue
Vol. 14, no. 2

Abstract

In bioinformatics, protein secondary structure prediction plays a significant role in understanding protein function and interactions. This study presents TE_SS, an approach that combines a transformer encoder-based model with the Ankh protein language model to predict protein secondary structures. The research focuses on predicting the nine structure classes defined by the Dictionary of Secondary Structure of Proteins (DSSP) version 4. The model's performance was evaluated on several datasets, and the model was also compared with state-of-the-art methods on the prediction of eight structure classes. The findings indicate that TE_SS excels in nine- and three-class structure prediction and also performs strongly in the eight-class category. This is reflected in its results on the Qs and segment overlap (SOV) evaluation metrics, which demonstrate its ability to discern complex protein sequence patterns. TE_SS thus offers a useful tool for protein structure analysis.
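To make the class groupings and the accuracy metric concrete, the following is a minimal Python sketch, not the authors' code. It assumes that a Q score is the percentage of residues whose predicted label matches the reference label, and it assumes a commonly used reduction of DSSP labels to three states (H, G, I to helix; E, B to strand; everything else to coil). The abstract does not state which reduction TE_SS uses, and the names `THREE_STATE`, `to_three_state`, and `q_accuracy` are hypothetical. SOV is not covered here.

```python
# Assumed reduction of DSSP labels to three states: helix (H), strand (E), coil (C).
# This mapping is a common convention, not one confirmed by the paper.
THREE_STATE = {
    "H": "H", "G": "H", "I": "H",  # helical classes
    "E": "E", "B": "E",            # strand and bridge classes
}


def to_three_state(labels: str) -> str:
    """Map each DSSP label to H, E, or C; any unlisted label becomes coil."""
    return "".join(THREE_STATE.get(ch, "C") for ch in labels)


def q_accuracy(true_labels: str, pred_labels: str) -> float:
    """Percentage of residues whose predicted label equals the reference label."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label strings must have the same length")
    if not true_labels:
        raise ValueError("label strings must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, pred_labels))
    return 100.0 * correct / len(true_labels)


# Toy labels for illustration only; these are not data from the study.
reference = "HHGGEEBTSP"
predicted = "HHHHEEETS-"

q_fine = q_accuracy(reference, predicted)
q_three = q_accuracy(to_three_state(reference), to_three_state(predicted))
```

On these toy strings the fine-grained score is lower than the three-state score, because errors such as G predicted as H disappear once both labels collapse to the same state. This is one reason results for nine, eight, and three classes are reported separately.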

Keywords