Nature Communications (Jul 2022)

ProtGPT2 is a deep unsupervised language model for protein design

  • Noelia Ferruz,
  • Steffen Schmidt,
  • Birte Höcker

DOI
https://doi.org/10.1038/s41467-022-32007-7
Journal volume & issue
Vol. 13, no. 1
pp. 1–10

Abstract

Protein design aims to build novel proteins customized for specific purposes, with the potential to tackle many environmental and biomedical problems. Here the authors apply generative Transformers, one of the latest advances in natural language processing, to train ProtGPT2, a language model that explores unseen regions of protein space while designing proteins with nature-like properties.