Nature Communications (Jul 2022)
ProtGPT2 is a deep unsupervised language model for protein design
Abstract
Protein design aims to build novel proteins customized for specific purposes, and thus holds the potential to tackle many environmental and biomedical problems. Here, the authors apply generative Transformers, a recent advance in natural language processing, to train ProtGPT2, a language model that explores unseen regions of protein space while designing proteins with nature-like properties.