Frontiers in Cellular and Infection Microbiology (May 2012)
A framework for assessing the concordance of molecular typing methods and the true strain phylogeny of Campylobacter jejuni and C. coli using draft genome sequence data
Abstract
Tracking the sources of sporadic cases of campylobacteriosis remains challenging, as commonly used molecular typing methods have limited ability to unambiguously link genetically related strains. Genomics has become increasingly prominent in the public health response to enteric pathogens because it enables characterization of pathogens at an unprecedented level of resolution. However, the cost of sequencing and the expertise required for bioinformatic analyses remain prohibitive, so these comprehensive analyses are limited to a few priority strains. Although several molecular typing methods are widely used for epidemiological analysis of campylobacters, it is not clear how accurately these methods reflect true strain relationships. To address this, we analyzed 104 publicly available whole genome sequences (WGS) of C. jejuni and C. coli. In addition to in silico determination of multi-locus sequence (MLST), fla, and porA types, as well as comparative genomic fingerprint (CGF), we inferred a reference phylogeny based on conserved core genome elements. Molecular typing data were compared to the reference phylogeny for concordance using the Adjusted Wallace Coefficient (AWC) with confidence intervals. Although MLST targets sequence variability in core genes and CGF targets insertions/deletions of accessory genes, both methods are based on multilocus analysis, and both provided better estimates of true phylogeny than methods based on single loci (porA, fla). A more comprehensive WGS dataset including additional genetically related strains, both epidemiologically linked and unlinked, will be necessary to assess the performance of these methods for outbreak investigations and surveillance activities. Analyses of the strengths and weaknesses of widely used typing methodologies in inferring true strain relationships will provide guidance in the interpretation of these data for epidemiological purposes.
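As an illustration of the concordance measure named above, the following is a minimal sketch of the Adjusted Wallace Coefficient as it is commonly defined: the Wallace coefficient W(A→B) is the probability that two strains sharing a type under method A also share a type under method B, and the adjustment subtracts the agreement expected by chance, taken here as one minus Simpson's index of diversity of method B. The function name `adjusted_wallace` and the toy labels are assumptions made for illustration and are not taken from the study; confidence intervals are not computed in this sketch.

```python
from collections import Counter
from math import comb


def adjusted_wallace(a_labels, b_labels):
    """Adjusted Wallace coefficient AW(A -> B) for two partitions of the same strains.

    a_labels[i] and b_labels[i] are the types assigned to strain i by
    method A and method B. Hypothetical helper, written for illustration only.
    """
    if len(a_labels) != len(b_labels):
        raise ValueError("both methods must type the same strains")
    n = len(a_labels)
    if n < 2:
        raise ValueError("at least two strains are required")

    # Number of strain pairs that share a type under A, under B, and under both.
    pairs_a = sum(comb(c, 2) for c in Counter(a_labels).values())
    pairs_b = sum(comb(c, 2) for c in Counter(b_labels).values())
    pairs_ab = sum(comb(c, 2) for c in Counter(zip(a_labels, b_labels)).values())

    if pairs_a == 0:
        raise ValueError("method A places no two strains in the same type")

    # Wallace coefficient: P(same type in B | same type in A).
    wallace = pairs_ab / pairs_a

    # Chance agreement: P(same type in B) = 1 - Simpson's index of diversity of B.
    expected = pairs_b / comb(n, 2)
    if expected == 1:
        raise ValueError("method B assigns every strain to one type")

    return (wallace - expected) / (1 - expected)


# Toy example with made-up type labels (not data from the study).
mlst = ["ST21", "ST21", "ST45", "ST45"]
cgf = ["c1", "c1", "c2", "c2"]
print(adjusted_wallace(mlst, cgf))
```

The coefficient is directional: `adjusted_wallace(a, b)` asks how well method A predicts the partition of method B, so comparing a typing method against clusters drawn from a reference phylogeny would place the typing method first.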
Keywords