mBio (Apr 2023)
Exaptation of Inactivated Host Enzymes for Structural Roles in Orthopoxviruses and Novel Folds of Virus Proteins Revealed by Protein Structure Modeling
Abstract
ABSTRACT Viruses with large, double-stranded DNA genomes captured the majority of their genes from their hosts at different stages of evolution. The origins of many virus genes are readily detected through significant sequence similarity with cellular homologs. In particular, this is the case for virus enzymes, such as DNA and RNA polymerases or nucleotide kinases, that retain their catalytic activity after capture by an ancestral virus. However, a large fraction of virus genes have no readily detectable cellular homologs, so their origins remain enigmatic. We explored the potential origins of such proteins that are encoded in the genomes of orthopoxviruses, a thoroughly studied virus genus that includes major human pathogens. To this end, we used AlphaFold2 to predict the structures of all 214 proteins that are encoded by orthopoxviruses. Among the proteins of unknown provenance, structure prediction yielded clear indications of origin for 14 of them and validated several inferences that were previously made via sequence analysis. A notable emerging trend is the exaptation of enzymes from cellular organisms for nonenzymatic, structural roles in virus reproduction, accompanied by the disruption of catalytic sites and by an overall drastic divergence that precludes homology detection at the sequence level. The 16 orthopoxvirus proteins that were found to be inactivated enzyme derivatives include the poxvirus replication processivity factor A20, which is an inactivated NAD-dependent DNA ligase; the major core protein A3, which is an inactivated deubiquitinase; and F11, which is an inactivated prolyl hydroxylase, as well as other similar cases. For nearly one-third of the orthopoxvirus virion proteins, no significantly similar structures were identified, suggesting exaptation with subsequent major structural rearrangement that yielded unique protein folds.
IMPORTANCE Protein structures are more strongly conserved in evolution than are amino acid sequences. Comparative structural analysis is particularly important for inferring the origins of viral proteins that typically evolve at high rates. We used a powerful protein structure modeling method, namely, AlphaFold2, to model the structures of all orthopoxvirus proteins and compared them to all available protein structures. Multiple cases of recruitment of host enzymes for structural roles in viruses, accompanied by the disruption of catalytic sites, were discovered. However, many viral proteins appear to have evolved unique structural folds.
Keywords