BMC Genomic Data (Nov 2023)

Probable human origin of the SARS-CoV-2 polybasic furin cleavage motif

  • Antonio R. Romeu

DOI
https://doi.org/10.1186/s12863-023-01169-8
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 12

Abstract

Background: The key evolutionary step leading to the pandemic virus was the acquisition of the PRRA furin cleavage motif at the spike glycoprotein S1/S2 junction by a progenitor of SARS-CoV-2. Two features of this motif draw attention: (i) it is absent from other known lineage B betacoronaviruses, including the recently discovered bat coronaviruses from Laos and Vietnam, which are the closest known relatives of SARS-CoV-2; and (ii) it introduced a pair of arginine codons (CGG-CGG) whose usage is extremely rare in coronaviruses. With an occurrence rate of only 3%, CGG is a minority arginine codon in SARS-CoV-2. On the other hand, the Laos and Vietnam bat coronaviruses contain receptor-binding domains that are almost identical to that of SARS-CoV-2 and can therefore infect human cells despite lacking the furin cleavage motif.

Results: Building on these observations, this work provides a detailed sequence comparison between the SARS-CoV-2 S gene insert encoding PRRA and human mRNA transcripts. The insert showed a 100% match to several human mRNA transcripts. The human genes whose mRNAs match the insert are either ubiquitous and highly expressed (e.g., the ATPase F1 (ATP5F1) and ubiquitin specific peptidase 21 (USP21) genes) or specific to target organs or tissues of SARS-CoV-2 infection (e.g., MEMO1, SALL3, TRIM17, CWC15, CCDC187, FAM71E2, GAB4, PRDM13). These results suggest that recombination between the genome of a SARS-CoV-2 progenitor and human mRNA transcripts could be the origin of the 12-nucleotide S gene insert encoding the PRRA motif of the S protein.

Conclusions: The hypothesis of a probable human origin of the SARS-CoV-2 polybasic furin cleavage motif is supported by: (i) the nature of the human genes whose mRNA sequences match the S gene insert 100%; (ii) the synonymous base substitution in the arginine codons (CGG-CGG); and (iii) further PRRA-like insertions in the spike glycoprotein, suggesting that the acquisition of PRRA may not have been a single recombination event.
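
The kind of exact-match comparison described in the Results can be illustrated with a minimal Python sketch. It is not the author's pipeline: the abstract does not state the insert sequence or the search tool, so the 12-nucleotide string below is the commonly cited in-frame form of the insert (an assumption here), the transcript strings are invented toy data, and the function names are hypothetical. The sketch checks both strands, since a match may lie on the reverse complement.

```python
# Assumption: commonly cited in-frame form of the 12-nt insert (not given in the abstract).
INSERT = "CCTCGGCGGGCA"

# Subset of the standard genetic code, only the codons needed for this example.
CODONS = {"CCT": "P", "CGG": "R", "GCA": "A"}


def translate(seq):
    """Translate an in-frame DNA string using the small codon subset above."""
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, len(seq), 3))


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(haystack, needle):
    """Return every 0-based start position of needle in haystack (overlaps allowed)."""
    positions, start = [], haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions


def exact_matches(insert, transcripts):
    """Map transcript name -> list of (strand, position) for 100% matches."""
    rc = reverse_complement(insert)
    hits = {}
    for name, seq in transcripts.items():
        found = [("+", p) for p in find_all(seq, insert)]
        found += [("-", p) for p in find_all(seq, rc)]
        if found:
            hits[name] = found
    return hits


# Hypothetical toy transcripts (DNA alphabet), invented purely for illustration.
TOY_TRANSCRIPTS = {
    "toy_sense": "ATGGAT" + "CCTCGGCGGGCA" + "TTT",
    "toy_antisense": "GGG" + "TGCCCGCCGAGG" + "AAA",
    "toy_none": "ATGAAACCCGGGTTT",
}

hits = exact_matches(INSERT, TOY_TRANSCRIPTS)
print(translate(INSERT))
print(hits)
```

On real data the dictionary of toy strings would be replaced by transcript sequences read from a reference file; a 12-nucleotide query is short, so the number of hits across a whole transcriptome has to be interpreted with that in mind.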

Keywords