mSystems (Jul 2024)
zDB: bacterial comparative genomics made easy
Abstract
The analysis and comparison of genomes rely on different tools for tasks such as annotation, orthology prediction, and phylogenetic inference. Most tools are specialized for a single task, and additional effort is needed to integrate and visualize the results. To fill this gap, we developed zDB, an application integrating a Nextflow analysis pipeline and a Python visualization platform built on the Django framework. The application is available on GitHub (https://github.com/metagenlab/zDB) and from the bioconda channel. Starting from annotated GenBank files, zDB identifies orthologs and infers a phylogeny for each orthogroup. A species phylogeny is also constructed from shared single-copy orthologs. The results can be enriched with Pfam protein domain predictions, Clusters of Orthologous Genes and Kyoto Encyclopedia of Genes and Genomes annotations, and Swiss-Prot homologs. The web application allows searching for specific genes or annotations, running BLAST queries, and comparing genomic regions and whole genomes. The metabolic capacities of organisms can be compared at either the module or pathway level. Finally, users can run queries to examine the conservation of specific genes or annotations across a chosen subset of genomes and display the results as a list of genes, a Venn diagram, or a heatmap. These features make zDB useful for both bioinformaticians and researchers more accustomed to laboratory research.
IMPORTANCE Genome comparison and analysis rely on many independent tools, leaving scientists with the burden of integrating and visualizing the results for interpretation. To alleviate this burden, we have built zDB, a comparative genomics tool that includes both an analysis pipeline and a visualization platform. The analysis pipeline automates gene annotation, orthology prediction, and phylogenetic inference, while the visualization platform allows scientists to easily explore the results in a web browser. Among other features, the interface allows users to visually compare whole genomes and targeted regions, assess the conservation of genes or metabolic pathways, perform BLAST searches, or look for specific annotations. Altogether, this tool will be useful for a broad range of applications in comparative studies of between two and a hundred genomes. Furthermore, it is designed to make it easy to share data sets at a local or international scale, thereby supporting exploratory analyses by non-bioinformaticians on the genomes of their favorite organisms.
Keywords