PLoS ONE (Jan 2023)
Genome characterization based on the Spike-614 and NS8-84 loci of SARS-CoV-2 reveals two major possible onsets of the COVID-19 pandemic.
Abstract
The global COVID-19 pandemic has lasted for 3 years since its outbreak; however, its origin is still unknown. Here, we analyzed the genotypes of 3.14 million SARS-CoV-2 genomes based on amino acid 614 of the Spike (S) protein and amino acid 84 of NS8 (nonstructural protein 8), and identified 16 linkage haplotypes. The GL haplotype (S_614G and NS8_84L) was the major haplotype driving the global pandemic and accounted for 99.2% of the sequenced genomes, while the DL haplotype (S_614D and NS8_84L) caused the pandemic in China in the spring of 2020 and accounted for approximately 60% of the genomes in China and 0.45% of the global genomes. The GS (S_614G and NS8_84S), DS (S_614D and NS8_84S), and NS (S_614N and NS8_84S) haplotypes accounted for 0.26%, 0.06%, and 0.0067% of the genomes, respectively. The main evolutionary trajectory of SARS-CoV-2 is DS→DL→GL, whereas the other haplotypes are minor byproducts of this evolution. Surprisingly, the newest haplotype, GL, had the oldest time to the most recent common ancestor (tMRCA), with a mean of May 1, 2019, while the oldest haplotype, DS, had the newest tMRCA, with a mean of October 17, indicating that the ancestral strains that gave rise to GL had gone extinct and been replaced by the better-adapted newcomer at its place of origin, just like the sequential rise and fall of the Delta and Omicron variants. However, the DL haplotype arrived and evolved into toxic strains and ignited a pandemic in China, where the GL strains had not yet arrived by the end of 2019. The GL strains had spread all over the world before they were discovered and ignited the global pandemic, which had not been noticed until the virus was declared in China. However, the GL haplotype had little influence in China during the early phase of the pandemic because of its late arrival and the strict transmission controls in China.
Therefore, we propose two major onsets of the COVID-19 pandemic: one driven mainly by the DL haplotype in China, and the other driven by the GL haplotype globally.
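The two-locus genotyping described in the abstract can be illustrated with a minimal Python sketch. It assumes each genome has already been reduced to its residues at Spike 614 and NS8 84. The function names `haplotype` and `haplotype_shares` and the toy input are hypothetical illustrations, not the authors' pipeline or data.

```python
from collections import Counter


def haplotype(spike_614: str, ns8_84: str) -> str:
    """Build the two-letter label: Spike-614 residue, then NS8-84 residue."""
    return f"{spike_614.upper()}{ns8_84.upper()}"


def haplotype_shares(genotypes):
    """Return the percentage of genomes carrying each haplotype label.

    `genotypes` is an iterable of (spike_614, ns8_84) residue pairs.
    An empty input yields an empty dict.
    """
    counts = Counter(haplotype(s, n) for s, n in genotypes)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Toy input for illustration only; these are not values from the study.
toy_genotypes = [("G", "L"), ("G", "L"), ("D", "L"), ("D", "S")]
shares = haplotype_shares(toy_genotypes)
```

With this labeling, S_614G together with NS8_84L maps to GL, S_614D with NS8_84L to DL, and S_614D with NS8_84S to DS, matching the naming used in the abstract.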