Microbiology Spectrum (Jan 2024)

Targeted amplification and genetic sequencing of the severe acute respiratory syndrome coronavirus 2 surface glycoprotein

  • Matthew W. Keller,
  • Lisa M. Keong,
  • Benjamin L. Rambo-Martin,
  • Norman Hassell,
  • Kristine A. Lacek,
  • Malania M. Wilson,
  • Marie K. Kirby,
  • Jimma Liddell,
  • D. Collins Owuor,
  • Mili Sheth,
  • Joseph Madden,
  • Justin S. Lee,
  • Rebecca J. Kondor,
  • David E. Wentworth,
  • John R. Barnes

DOI
https://doi.org/10.1128/spectrum.02982-23
Journal volume & issue
Vol. 12, no. 1

Abstract

ABSTRACT The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike protein is a highly immunogenic and mutable protein that is the target of vaccines and antibody therapeutics. This makes the encoding S-gene an important sequencing target. The SARS-CoV-2 sequencing community overwhelmingly adopted tiling amplicon-based strategies for sequencing the entire genome. As the virus evolved, primer mismatches inevitably led to amplicon dropout. Because the spike protein is exposed to host antibodies, mutations accumulated most rapidly in the S-gene, causing amplicon failure over the most informative region of the genome. To mitigate this, we developed a targeted method to amplify and sequence the S-gene. We evaluated 20 distinct primer designs through iterative in silico and in vitro testing to select the optimal primer pairs and run conditions. Since selection, periodic in silico analysis has monitored primer conservation as SARS-CoV-2 evolves. Despite being designed during the Beta wave, the selected primers remain >99% conserved through Omicron as of 19 October 2023. To validate the final design, we compared targeted S-gene data to National SARS-CoV-2 Strain Surveillance whole-genome data for 321 matching samples. Consensus sequences from the two methods were nearly identical (99.998%) across the S-gene. This method can serve as a complement to whole-genome surveillance or can be leveraged where only S-gene sequencing is of interest.

IMPORTANCE The COVID-19 pandemic was accompanied by an unprecedented surveillance effort. The resulting data were and will continue to be critical for surveillance and control of SARS-CoV-2. However, some genomic surveillance methods experienced challenges as the virus evolved, resulting in incomplete and poor-quality data. Complete, high-quality coverage, especially of the S-gene, is important for supporting the selection of vaccine candidates. As such, we developed a robust method to target the S-gene for amplification and sequencing. By focusing on the S-gene and imposing strict coverage and quality metrics, we hope to increase the quality of surveillance data for this continually evolving gene. Our technique is currently being deployed globally to partner laboratories, and public health representatives from 79 countries have received hands-on training and support. Expanding access to quality surveillance methods will undoubtedly lead to earlier detection of novel variants and better inform vaccine strain selection.
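
As an illustration of the kind of in silico primer conservation check the abstract mentions, the sketch below scores a primer against a set of sequences by finding its best-matching site in each one and reporting the fraction that match within a mismatch tolerance. This is a minimal sketch, not the authors' pipeline: the function names, the mismatch threshold, and the toy primer and sequences are all assumptions made for the example, and a real analysis would also handle reverse primers (via the reverse complement) and ambiguous bases.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_mismatch_count(primer: str, sequence: str):
    """Fewest mismatches between the primer and any same-length window of
    the sequence; None if the sequence is shorter than the primer."""
    k = len(primer)
    if len(sequence) < k:
        return None
    return min(hamming(primer, sequence[i:i + k])
               for i in range(len(sequence) - k + 1))


def primer_conservation(primer: str, sequences, max_mismatches: int = 0) -> float:
    """Fraction of scorable sequences with a primer site that has at most
    max_mismatches mismatches (hypothetical metric for illustration)."""
    primer = primer.upper()
    scored = 0
    conserved = 0
    for seq in sequences:
        best = best_mismatch_count(primer, seq.upper())
        if best is None:
            continue  # too short to contain the primer site; not scored
        scored += 1
        if best <= max_mismatches:
            conserved += 1
    return conserved / scored if scored else 0.0


if __name__ == "__main__":
    # Toy data only: NOT real SARS-CoV-2 sequences or the published primers.
    toy_primer = "ACGTAC"
    toy_sequences = [
        "TTACGTACGG",  # exact primer site
        "TTACGAACGG",  # one mismatch in the primer site
        "GGACGTACTT",  # exact primer site
        "ACG",         # shorter than the primer, skipped
    ]
    print(primer_conservation(toy_primer, toy_sequences))
    print(primer_conservation(toy_primer, toy_sequences, max_mismatches=1))
```

Rerunning such a check periodically against newly deposited sequences is one way to see whether conservation drops below a chosen threshold, which would signal that a primer needs redesign.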

Keywords