Communications Biology (Oct 2023)

SaLT&PepPr is an interface-predicting language model for designing peptide-guided protein degraders

  • Garyk Brixi,
  • Tianzheng Ye,
  • Lauren Hong,
  • Tian Wang,
  • Connor Monticello,
  • Natalia Lopez-Barbosa,
  • Sophia Vincoff,
  • Vivian Yudistyra,
  • Lin Zhao,
  • Elena Haarer,
  • Tianlai Chen,
  • Sarah Pertsemlidis,
  • Kalyan Palepu,
  • Suhaas Bhat,
  • Jayani Christopher,
  • Xinning Li,
  • Tong Liu,
  • Sue Zhang,
  • Lillian Petersen,
  • Matthew P. DeLisa,
  • Pranam Chatterjee

DOI
https://doi.org/10.1038/s42003-023-05464-z
Journal volume & issue
Vol. 6, no. 1
pp. 1 – 10

Abstract

Protein-protein interactions (PPIs) are critical for biological processes, and predicting the sites of these interactions is useful for both computational and experimental applications. We present a Structure-agnostic Language Transformer and Peptide Prioritization (SaLT&PepPr) pipeline to predict interaction interfaces from a protein sequence alone for the subsequent generation of peptidic binding motifs. Our model fine-tunes the ESM-2 protein language model (pLM) with a per-position prediction task to identify PPI sites using data from the PDB, and prioritizes motifs that are most likely to be involved in inter-chain binding. Using only amino acid sequence as input, our model is competitive with structural homology-based methods, but exhibits reduced performance compared with deep learning models that take both structural and sequence features as input. Inspired by our previous results using co-crystals to engineer target-binding “guide” peptides, we curate PPI databases to identify partners for subsequent peptide derivation. Fusing guide peptides to an E3 ubiquitin ligase domain, we demonstrate degradation of endogenous β-catenin, 4E-BP2, and TRIM8, and highlight the nanomolar binding affinity, low off-targeting propensity, and function-altering capability of our best-performing degraders in cancer cells. In total, our study suggests that prioritizing binders from natural interactions via pLMs can enable programmable protein targeting and modulation.
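
The prioritization step can be illustrated with a minimal sketch. It assumes that a per-position model has already produced one interface score per residue, and it ranks contiguous windows of a fixed length by their mean score. The function name `top_windows`, the mean-score criterion, and the example scores are all hypothetical and chosen for illustration; the abstract does not specify how the pipeline itself scores or selects motifs.

```python
from typing import List, Tuple


def top_windows(
    sequence: str, scores: List[float], length: int, k: int = 3
) -> List[Tuple[int, str, float]]:
    """Rank contiguous windows of `length` residues by mean per-residue score.

    Returns up to `k` tuples of (start index, peptide, mean score), best first.
    The scores are assumed to come from some per-position interface predictor.
    """
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    if length <= 0 or length > len(sequence):
        raise ValueError("length must be between 1 and len(sequence)")

    # Running sum so that each window is scored in constant time.
    window_sum = sum(scores[:length])
    ranked = [(0, sequence[:length], window_sum / length)]
    for start in range(1, len(sequence) - length + 1):
        window_sum += scores[start + length - 1] - scores[start - 1]
        ranked.append((start, sequence[start:start + length], window_sum / length))

    # Highest mean score first; ties keep their original left-to-right order.
    ranked.sort(key=lambda window: window[2], reverse=True)
    return ranked[:k]


# Made-up scores for a made-up nine-residue sequence.
example_sequence = "ACDEFGHIK"
example_scores = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1, 0.1, 0.3, 0.2]
best = top_windows(example_sequence, example_scores, length=3, k=1)
```

With these made-up scores, the top-ranked window is the three residues starting at index 2. In practice the candidate length, the aggregation rule, and any filtering of overlapping windows would be design choices to validate experimentally.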