Nature Communications (Oct 2022)

Protein language models trained on multiple sequence alignments learn phylogenetic relationships

  • Umberto Lupo,
  • Damiano Sgarbossa,
  • Anne-Florence Bitbol

DOI
https://doi.org/10.1038/s41467-022-34032-y
Journal volume & issue
Vol. 13, no. 1
pp. 1 – 11

Abstract

Protein language models taking multiple sequence alignments as inputs capture protein structure and mutational effects. Here, the authors show that these models also encode phylogenetic relationships, and can disentangle correlations due to structural constraints from those due to phylogeny.