PLoS Computational Biology (Jul 2021)

The interplay of SARS-CoV-2 evolution and constraints imposed by the structure and functionality of its proteins.

  • Lukasz Jaroszewski,
  • Mallika Iyer,
  • Arghavan Alisoltani,
  • Mayya Sedova,
  • Adam Godzik

DOI
https://doi.org/10.1371/journal.pcbi.1009147
Journal volume & issue
Vol. 17, no. 7
p. e1009147

Abstract

The unprecedented pace of sequencing of SARS-CoV-2 genomes provides us with unique information about the genetic changes in a single pathogen during an ongoing pandemic. By analyzing close to 200,000 genomes, we show that the patterns of SARS-CoV-2 mutations along its genome are closely correlated with the structural and functional features of the encoded proteins. The requirement that proteins fold into their 3D structures and the conservation of their key functional regions, such as protein-protein interaction interfaces, are the dominant factors driving evolutionary selection in protein-coding genes. At the same time, avoidance of host immunity leads to an abundance of mutations in other regions, resulting in high variability of the missense mutation rate along the genome. "Unexplained" peaks and valleys in the mutation rate provide hints about the function of as yet uncharacterized genomic regions and about the specific protein structural and functional features they code for. Some of these observations have immediate practical implications for the selection of target regions for PCR-based COVID-19 tests, for evaluating the risk of mutations in epitopes targeted by specific antibodies, and for vaccine design strategies.