Microbiology Spectrum (Jan 2024)
Diversity and conservation of the genome architecture of phages infecting the Alphaproteobacteria
Abstract
Bacteriophages are viruses that can replicate only inside a suitable bacterial host cell. Here, we performed a comprehensive meta-analysis of all 103 publicly available genomes of phages known to infect members of the Alphaproteobacteria. We combined the sequence data and associated metadata with results from various comparative genomic and phylogenetic methods to quantify gene presence, identify families of orthologous gene products, and assign the phages to clusters and subclusters. Comparative genomic analyses such as this one allow phages to be differentiated by their evolutionary relatedness. The results of our analyses justify classifying these phages into 16 clusters and 6 subclusters to balance sequence diversity, protein orthologs, and core genes with host range. This study is the first to group phages infecting the Alphaproteobacteria. We expect that these results will inform future studies and that our classification scheme will improve the classification of phages that infect bacterial taxa in the class Alphaproteobacteria.

IMPORTANCE This study reports the results of the largest analysis of genome sequences from phages that infect the Alphaproteobacteria class of bacterial hosts. We analyzed over 100 whole-genome phage sequences to construct dotplots, categorize the phages into genetically distinct clusters, generate a bootstrapped phylogenetic tree, compute protein orthologs, and predict packaging strategies. We determined that the phage sequences cluster primarily by bacterial host family, phage morphotype, and genome size. We expect that the findings reported in this seminal study will facilitate future analyses that improve our knowledge of the phages that infect these hosts.
Keywords