Computational and Structural Biotechnology Journal (Jan 2020)
A recent origin of Orf3a from M protein across the coronavirus lineage arising by sharp divergence
Abstract
The genome of SARS-CoV-2, the coronavirus responsible for the COVID-19 pandemic, encodes a number of accessory genes. The longest accessory gene, Orf3a, plays important roles in the virus life cycle, as indicated by experimental findings, known polymorphisms, its evolutionary trajectory and a distinct three-dimensional fold. Here we show that supervised, sensitive database searches with Orf3a detect weak, yet significant and highly specific, similarities to the M proteins of coronaviruses. The similarity profiles can be used to derive low-resolution three-dimensional models for M proteins, using Orf3a as a structural template. The models also explain the emergence of Orf3a from M proteins and suggest a recent origin across the coronavirus lineage, as reflected in its restricted phylogenetic distribution. This study provides evidence for a common origin of the M and Orf3a families and proposes, for the first time, a working model for the structure of the universally distributed M proteins in coronaviruses, consistent with the properties of both protein families.