Scientific Reports (Jul 2017)

Failure of phylogeny inferred from multilocus sequence typing to represent bacterial phylogeny

  • Alan K. L. Tsang,
  • Hwei Huih Lee,
  • Siu-Ming Yiu,
  • Susanna K. P. Lau,
  • Patrick C. Y. Woo

DOI
https://doi.org/10.1038/s41598-017-04707-4
Journal volume & issue
Vol. 7, no. 1
pp. 1 – 12

Abstract

Although multilocus sequence typing (MLST) is highly discriminatory and useful for outbreak investigations and epidemiological surveillance, it has always been controversial whether clustering and phylogeny inferred from the MLST gene loci can represent the real phylogeny of bacterial strains. In this study, we compare the phylogenetic trees constructed using three approaches: (1) concatenated blocks of homologous sequence shared between the bacterial genomes, (2) genome single-nucleotide polymorphism (SNP) profiles and (3) concatenated nucleotide sequences of gene loci in the corresponding MLST schemes, for 10 bacterial species with >30 complete genome sequences available. Major differences in strain clustering at more than one position were observed between the phylogeny inferred using genome/SNP data and MLST for all 10 bacterial species. The Shimodaira-Hasegawa test revealed significant differences between the topologies of the genome and MLST trees for nine of the 10 bacterial species, and significant differences between the topologies of the SNP and MLST trees were present for all 10 bacterial species. The Matching Clusters and R-F Clusters metrics showed that the distances between the genome/SNP and MLST trees were larger than those between the SNP and genome trees. Phylogeny inferred from MLST failed to represent genome phylogeny within the same bacterial species.
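
As a rough illustration of how a cluster-based tree distance such as the R-F Clusters metric can be computed, the sketch below compares two small rooted trees. It is not the authors' pipeline: the nested-tuple tree encoding, the function names `clusters` and `rf_clusters`, the strain labels, and the choice to halve the symmetric difference are all assumptions made for this example.

```python
def clusters(tree):
    """Collect the non-trivial clusters (leaf sets under internal nodes) of a rooted tree.

    Assumed encoding: a leaf is a string, an internal node is a tuple of children.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    # The root cluster holds every leaf, so it carries no topological information.
    found.discard(all_leaves)
    return found


def rf_clusters(tree_a, tree_b):
    """Half the number of clusters present in exactly one of the two trees (assumed convention)."""
    return len(clusters(tree_a) ^ clusters(tree_b)) / 2


# Hypothetical five-strain trees; the labels are placeholders, not data from the study.
genome_tree = ((("s1", "s2"), "s3"), ("s4", "s5"))
mlst_tree = ((("s1", "s3"), "s2"), ("s4", "s5"))

distance = rf_clusters(genome_tree, mlst_tree)
print(distance)
```

A distance of zero means the two trees contain exactly the same clusters; larger values mean more clusters are found in only one tree, which is the sense in which the genome/SNP and MLST trees are reported as further apart than the SNP and genome trees.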