International Journal of Molecular Sciences (Jun 2024)

TGC-ARG: Anticipating Antibiotic Resistance via Transformer-Based Modeling and Contrastive Learning

  • Yihan Dong,
  • Hanming Quan,
  • Chenxi Ma,
  • Linchao Shan,
  • Lei Deng

DOI
https://doi.org/10.3390/ijms25137228
Journal volume & issue
Vol. 25, no. 13
p. 7228

Abstract

Read online

In various domains, including everyday activities, agricultural practices, and medical treatments, the escalating challenge of antibiotic resistance poses a significant concern. Traditional approaches to studying antibiotic resistance genes (ARGs) often require substantial time and effort and are limited in accuracy. Moreover, the decentralized nature of existing data repositories complicates comprehensive analysis of antibiotic resistance gene sequences. In this study, we introduce a novel computational framework named TGC-ARG designed to predict potential ARGs. This framework takes protein sequences as input, utilizes SCRATCH-1D for protein secondary structure prediction, and employs feature extraction techniques to derive distinctive features from both sequence and structural data. Subsequently, a Siamese network is employed to foster a contrastive learning environment, enhancing the model’s ability to effectively represent the data. Finally, a multi-layer perceptron (MLP) integrates and processes sequence embeddings alongside predicted secondary structure embeddings to forecast ARG presence. To evaluate our approach, we curated a pioneering open dataset termed ARSS (Antibiotic Resistance Sequence Statistics). Comprehensive comparative experiments demonstrate that our method surpasses current state-of-the-art methodologies. Additionally, through detailed case studies, we illustrate the efficacy of our approach in predicting potential ARGs.

Keywords