Evolutionary Bioinformatics (Aug 2017)
Popmarker: Identifying Phylogenetic Markers at the Population Level
Abstract
As phylogenomic approaches become common practice for constructing true bacterial phylogenies, it has become apparent that single molecular markers such as 16S ribosomal DNA often lead to misclassification of species. In this study, we present a program called Popmarker that uses the true species phylogeny to identify a minimal set of molecular markers that reflect bacterial evolutionary history and phylogenetic relationships at the resolution of populations. Popmarker ranks the proteins of the proteome by how well their orthologous sequence distances correlate with the branch lengths of the whole species tree or of a subtree. We demonstrate that 5 proteins from the top 2 ranks achieve the same resolution as the concatenation of 2203 single-copy orthologous genes, yielding the correct species classification as well as the correct split of the 2 groups of Vibrio campbellii. The top-ranking genes selected by Popmarker are candidates for genes that lead to speciation and are useful for distinguishing closely related species in microbiome studies.
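The ranking idea described above can be illustrated with a minimal sketch. It is not Popmarker's actual implementation: the function names (`pearson`, `p_distance`, `rank_markers`), the use of a simple p-distance between aligned sequences, the use of Pearson correlation, and the toy data are all assumptions made for illustration. The sketch scores each ortholog by how well its pairwise sequence distances track the pairwise distances taken from a reference species tree, then sorts the orthologs by that score.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def rank_markers(tree_dist, alignments):
    """Rank orthologs by correlation of sequence distances with tree distances.

    tree_dist:  {frozenset({taxon1, taxon2}): distance along the reference tree}
    alignments: {ortholog name: {taxon: aligned sequence}}
    Returns a list of (ortholog, correlation), best first.
    """
    taxa = sorted({t for pair in tree_dist for t in pair})
    pairs = list(combinations(taxa, 2))
    reference = [tree_dist[frozenset(p)] for p in pairs]
    scores = {}
    for name, aln in alignments.items():
        observed = [p_distance(aln[a], aln[b]) for a, b in pairs]
        scores[name] = pearson(reference, observed)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy input: three taxa, with A and B close on the reference tree.
tree_dist = {
    frozenset({"A", "B"}): 0.1,
    frozenset({"A", "C"}): 0.5,
    frozenset({"B", "C"}): 0.5,
}
alignments = {
    "geneX": {"A": "AAAAAAAAAA", "B": "AAAAAAAAAT", "C": "AAAAATTTTT"},
    "geneY": {"A": "AAAAAAAAAA", "B": "TTTTTAAAAA", "C": "AAAAAAAAAT"},
}
ranking = rank_markers(tree_dist, alignments)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

In this toy input, `geneX` separates the taxa in the same pattern as the reference tree while `geneY` does not, so `geneX` receives the higher score. A real analysis would presumably use model-based evolutionary distances and distances computed from an actual tree file rather than a hand-written dictionary.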