Scientific Reports (Jan 2023)

Expanding repertoire of SARS-CoV-2 deletion mutations contributes to evolution of highly transmissible variants

  • A. J. Venkatakrishnan,
  • Praveen Anand,
  • Patrick J. Lenehan,
  • Pritha Ghosh,
  • Rohit Suratekar,
  • Eli Silvert,
  • Colin Pawlowski,
  • Abhishek Siroha,
  • Dibyendu Roy Chowdhury,
  • John C. O’Horo,
  • Joseph D. Yao,
  • Bobbi S. Pritt,
  • Andrew P. Norgan,
  • Ryan T. Hurt,
  • Andrew D. Badley,
  • John Halamka,
  • Venky Soundararajan

DOI
https://doi.org/10.1038/s41598-022-26646-5
Journal volume & issue
Vol. 13, no. 1
pp. 1 – 11

Abstract

The emergence of highly transmissible SARS-CoV-2 variants and vaccine breakthrough infections globally mandated the characterization of the immuno-evasive features of SARS-CoV-2. Here, we systematically analyzed 2.13 million SARS-CoV-2 genomes from 188 countries/territories (up to June 2021) and performed whole-genome viral sequencing from 102 COVID-19 patients, including 43 vaccine breakthrough infections. We identified 92 Spike protein mutations that increased in prevalence during at least one surge in SARS-CoV-2 test positivity in any country over a 3-month window. Deletions in the Spike protein N-terminal domain were highly enriched for these ‘surge-associated mutations’ (odds ratio = 14.19, 95% CI 6.15–32.75, p value = 3.41 × 10⁻¹⁰). Based on a longitudinal analysis of mutational prevalence globally, we found an expanding repertoire of Spike protein deletions proximal to an antigenic supersite in the N-terminal domain that may be one of the key contributors to the evolution of highly transmissible variants. Finally, we generated clinically annotated SARS-CoV-2 whole-genome sequences from 102 patients and identified 107 unique mutations, including 78 substitutions and 29 deletions. In five patients, we identified distinct deletions between residues 85–90, which reside within a linear B cell epitope. Deletions in this region have arisen contemporaneously on a diverse background of variants across the globe since December 2020. Overall, our findings based on genomic epidemiology and clinical surveillance suggest that the genomic deletion of dispensable antigenic regions in SARS-CoV-2 may contribute to the evasion of immune responses and the evolution of highly transmissible variants.
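
As a reading aid for the enrichment statistic reported above, the following minimal sketch shows how an odds ratio and an approximate 95% confidence interval can be obtained from a 2×2 contingency table. It assumes a Wald interval on the log odds ratio; the abstract does not state which method the authors used, and the function name `odds_ratio_with_ci` and the cell labels are hypothetical, introduced only for illustration. No counts from the study are reproduced here.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    Hypothetical cell layout (an assumption, not taken from the paper):
      a: N-terminal domain deletions that are surge-associated
      b: N-terminal domain deletions that are not surge-associated
      c: all other mutations that are surge-associated
      d: all other mutations that are not surge-associated
    z is the normal quantile; 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        # The Wald standard error is undefined when any cell is zero.
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

The interval is symmetric on the log scale, which is why a reported interval such as 6.15–32.75 is not centered on the point estimate. For small cell counts, an exact method such as Fisher's exact test is often preferred over the Wald approximation.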