IEEE Access (Jan 2024)

AbGraftBERT: Enhancing Antibody Design Through Self-Grafting and Amino Acid-Based Perturbation of Pre-Trained Language Models

  • Ziheng Wang,
  • Xinhe Li,
  • Ying Xie,
  • Zeyu Wen,
  • Ruinian Jin,
  • Hiroaki Takada,
  • Ryoichi Nagatomi

DOI
https://doi.org/10.1109/ACCESS.2024.3416461
Journal volume & issue
Vol. 12
pp. 87438–87450

Abstract

Recent advancements in sequence-structure co-design have demonstrated the efficacy of integrating pre-trained language models (PLMs) with Graph Neural Network (GNN)-based modules for antigen-specific antibody design. A significant advantage of PLMs is their ability to directly predict the Complementarity-Determining Region (CDR) or to reduce reliance on structural data by providing a good initialization for downstream GNN modules. However, while the performance scaling law suggests that training larger models can lead to improved performance, scaling up PLMs for downstream modules is limited by the size of the datasets available for training and fine-tuning. To address the limitations of larger PLMs trained on constrained datasets, we introduce AbGraftBERT. First, inspired by the depth up-scaling approach, we propose a Self-Grafting strategy that divides a PLM into two components, the root layers (first 10 layers, with parameters frozen) and the crown layers (last 10 layers, with parameters frozen), and interconnects these layers through a fully connected layer inspired by the model reprogramming strategy. Furthermore, we introduce Cross-Embedding Attentional Perturbation (CEAP), which perturbs the embeddings passed to the downstream GNN, enabling the model to sharpen its focus on critical amino acids and improve robustness. Our experiments demonstrate that our model surpasses current state-of-the-art models in antibody-related tasks, including sequence and structure prediction, antigen-binding antibody design, and optimization of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) neutralization. The source code of the AbGraftBERT model is available at https://github.com/azusakou/AbGraftBERT.
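
The following is a minimal PyTorch-style sketch of the Self-Grafting split and an attention-weighted embedding perturbation in the spirit of CEAP, assuming a 20-layer PLM with a 768-dimensional hidden state. Every class, function, and parameter name here (SelfGraftedEncoder, graft, perturb_embeddings, noise_scale) is an illustrative assumption, not the authors' implementation; see the linked repository for the actual code.

# Illustrative sketch only: a frozen root/crown split joined by a trainable
# fully connected "graft" layer, plus a toy attention-weighted perturbation
# of embeddings before they are handed to a downstream GNN.
import torch
import torch.nn as nn

class SelfGraftedEncoder(nn.Module):
    def __init__(self, pretrained_layers, hidden_dim=768):
        super().__init__()
        assert len(pretrained_layers) == 20, "sketch assumes a 20-layer PLM"
        # Root: first 10 pre-trained layers, parameters frozen.
        self.root = nn.ModuleList(pretrained_layers[:10])
        # Crown: last 10 pre-trained layers, parameters frozen.
        self.crown = nn.ModuleList(pretrained_layers[10:])
        for p in list(self.root.parameters()) + list(self.crown.parameters()):
            p.requires_grad = False
        # Trainable fully connected layer that reprograms root outputs
        # before they enter the crown (model-reprogramming style).
        self.graft = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, x):
        for layer in self.root:
            x = layer(x)
        x = self.graft(x)  # only trainable path through the encoder
        for layer in self.crown:
            x = layer(x)
        return x

def perturb_embeddings(emb, attn_weights, noise_scale=0.05):
    # Toy stand-in for CEAP: add attention-weighted noise so the downstream
    # GNN receives slightly perturbed embeddings, emphasising residues that
    # carry high attention mass.
    noise = torch.randn_like(emb) * noise_scale
    return emb + attn_weights.unsqueeze(-1) * noise

Because the root and crown parameters are frozen in this sketch, only the graft layer (and whatever downstream GNN follows) receives gradients, which mirrors the abstract's motivation of adapting a large PLM without fine-tuning it on a small antibody dataset.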

Keywords