BMC Bioinformatics (Aug 2024)
refMLST: reference-based multilocus sequence typing enables universal bacterial typing
Abstract
Background: Commonly used approaches for genomic investigation of bacterial outbreaks, including SNP and gene-by-gene approaches, are limited by the requirement for background genomes and curated allele schemes, respectively. As a result, they work only on a select subset of known organisms and fail on novel or less-studied pathogens. We introduce refMLST, a gene-by-gene approach that uses the reference genome of a bacterium to provide a scalable, reproducible and robust method for outbreak investigation.
Results: When applied to multiple outbreak-causing bacteria, including 1263 Salmonella enterica, 331 Yersinia enterocolitica and 6526 Campylobacter jejuni genomes, refMLST enabled consistent clustering, improved resolution, and faster processing in comparison to commonly used tools such as chewieSnake.
Conclusions: refMLST is a novel multilocus sequence typing approach that is applicable to any bacterial species with a public reference genome, does not require a curated scheme, and automatically accounts for genetic recombination.
Availability and implementation: refMLST is freely available for academic use at https://bugseq.com/academic.
Keywords