mSphere (Feb 2022)
The COMBAT-TB Workbench: Making Powerful Mycobacterium tuberculosis Bioinformatics Accessible
Abstract
ABSTRACT Whole-genome sequencing (WGS) is a powerful method for detecting drug resistance, genetic diversity, and transmission dynamics of Mycobacterium tuberculosis. Implementation of WGS in public health microbiology laboratories is impeded by a lack of user-friendly, automated, and semiautomated pipelines. We present the COMBAT-TB Workbench, a modular, easy-to-install application that provides a web-based environment for Mycobacterium tuberculosis bioinformatics. The COMBAT-TB Workbench is built using two main software components: the IRIDA platform for its web-based user interface and data management capabilities and the Galaxy bioinformatics workflow platform for workflow execution. These components are combined into a single easy-to-install application using Docker container technology. We implemented two workflows, for M. tuberculosis sample analysis and phylogeny, in Galaxy. Building our workflows involved updating some Galaxy tools (Trimmomatic, snippy, and snp-sites) and writing new Galaxy tools (snp-dists, TB-Profiler, tb_variant_filter, and TB Variant Report). The irida-wf-ga2xml tool was updated to be able to work with recent versions of Galaxy and was further developed into IRIDA plugins for both workflows. In the case of the M. tuberculosis sample analysis, an interface was added to update the metadata stored for each sequence sample with results gleaned from the Galaxy workflow output. Data can be loaded into the COMBAT-TB Workbench via the web interface or via the command line IRIDA uploader tool. The COMBAT-TB Workbench application deploys IRIDA, the COMBAT-TB IRIDA plugins, the MariaDB database, and Galaxy using Docker containers (https://github.com/COMBAT-TB/irida-galaxy-deploy). IMPORTANCE While the reduction in the cost of WGS is making sequencing more affordable in lower- and middle-income countries (LMICs), public health laboratories in these countries seldom have access to bioinformaticians and system support engineers adept at using the Linux command line and complex bioinformatics software. The COMBAT-TB Workbench provides an open-source, modular, easy-to-deploy and -use environment for managing and analyzing M. tuberculosis WGS data and thereby makes WGS usable in practice in the LMIC context.
Keywords