Microbiology Spectrum (Apr 2022)

Conserved Pattern and Potential Role of Recurrent Deletions in SARS-CoV-2 Evolution

  • Shenghui Weng,
  • Hangyu Zhou,
  • Chengyang Ji,
  • Liang Li,
  • Na Han,
  • Rong Yang,
  • Jingzhe Shang,
  • Aiping Wu

DOI
https://doi.org/10.1128/spectrum.02191-21
Journal volume & issue
Vol. 10, no. 2

Abstract

ABSTRACT SARS-CoV-2 has continued to adapt to human hosts throughout the worldwide pandemic that began in 2019. The virus evolves through multiple means, such as single nucleotide mutations and structural variations, which have made the prevention and control of COVID-19 considerably more difficult. Structural variation, which involves changes to multiple nucleotides such as insertions and deletions, has a greater impact than single nucleotide mutation on both genome structure and protein function. In this study, we found that deletions occurred frequently not only in SARS-CoV-2 but also in other SARS-related coronaviruses. These deletions showed an obvious location bias and formed 45 recurrent deletion regions in the viral genome. Some of these deletions showed a proliferation advantage, including four high-frequency deletions (nsp6 Δ106-109, S Δ69-70, S Δ144, and Δ28271) that were detected in around 50% of SARS-CoV-2 genomes and 19 other medium-frequency deletions. In addition, the association between deletions and the WHO-reported variants of concern (VOC) and variants of interest (VOI) of SARS-CoV-2 indicated that each of these variants had a unique combination of deletions. In the spike (S) protein, the deletions in SARS-CoV-2 were mainly in the N-terminal domain. Some deletions, such as S Δ144/145 and S Δ243-244, have been confirmed to block the binding sites of neutralizing antibodies. Overall, this study revealed a conserved regional pattern and the potential effect of some deletions across the whole SARS-CoV-2 genome, providing important evidence for potential epidemic control and vaccine development.

IMPORTANCE Mutations in SARS-CoV-2 have been studied extensively, but previous studies discussed structural variations in detail only for the spike protein. To study the role of structural variations in virus evolution, we described the distribution of structural variations across the whole genome. Conserved deletion patterns were found among SARS-CoV-2, SARS-CoV-2-like, and SARS-CoV-like viruses. There were 45 recurrent deletion regions (RDRs) in SARS-CoV-2, generated through the integration of deleted positions. In these regions, four high-frequency deletions appeared in parallel in multiple strains. Furthermore, in the spike protein, the deletions in SARS-CoV-2 were mainly in the N-terminal domain, blocking the binding sites of some neutralizing antibodies, while the structural variations in SARS-related coronaviruses were mainly in the N-terminal domain and the receptor binding domain. The receptor binding domain is closely related to host recognition, so deletions in the receptor binding domain may play a role in host adaptation.
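
The abstract says that recurrent deletion regions were generated by integrating deleted positions, but it does not spell out the procedure. One plausible reading, offered here only as an illustrative sketch and not as the authors' method, is to merge deletions whose genome coordinates overlap or touch. The function name `merge_deletions`, the `max_gap` parameter, and the coordinates in the usage example are all hypothetical.

```python
from typing import List, Tuple

# (start, end) in 1-based, inclusive genome coordinates (an assumed convention)
Interval = Tuple[int, int]


def merge_deletions(deletions: List[Interval], max_gap: int = 0) -> List[Interval]:
    """Merge deletions that overlap or lie within max_gap bases of each other.

    With max_gap=0, only overlapping or directly adjacent deletions are merged.
    This is an assumed merging rule, not one taken from the paper.
    """
    regions: List[Interval] = []
    for start, end in sorted(deletions):
        # Extend the last region if this deletion overlaps it or is close enough.
        if regions and start <= regions[-1][1] + max_gap + 1:
            prev_start, prev_end = regions[-1]
            regions[-1] = (prev_start, max(prev_end, end))
        else:
            regions.append((start, end))
    return regions


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    observed = [(100, 105), (103, 110), (111, 113), (200, 202)]
    print(merge_deletions(observed))  # [(100, 113), (200, 202)]
```

Sorting first means each deletion only needs to be compared with the most recently built region, so the merge runs in a single pass after the sort.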

Keywords