PeerJ (Jul 2020)

Very few sites can reshape the inferred phylogenetic tree

  • Warren R. Francis,
  • Donald E. Canfield

DOI
https://doi.org/10.7717/peerj.8865
Journal volume & issue
Vol. 8
p. e8865

Abstract


The history of animal evolution, and the relative placement of extant animal phyla in this history, is in principle testable from phylogenies derived from molecular sequence data. Though datasets have increased in size and quality in recent years, individual genes (and ultimately individual amino acid sites) contribute unequally to the final phylogeny. Here we demonstrate that removing a small fraction of sites that strongly favor one topology can produce a highly supported tree of an alternate topology. We explore this approach using a dataset for animal phylogeny, and create a highly supported tree with a monophyletic group of sponges and ctenophores, a topology not usually recovered. Because such an analysis is highly sensitive to gene selection, and because most gene sets are neither standardized nor representative of the entire genome, researchers should be diligent about making intermediate analyses available with their phylogenetic studies. Effort is needed to make these datasets maximally informative by ensuring that all genes are systematically sampled across relevant species. From there, it could be determined whether any gene or gene set introduces bias, and those biases could then be dealt with appropriately.
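
The site-removal idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that per-site log-likelihoods under two competing topologies (called A and B here) have already been computed with an external phylogenetics tool. The function names `sites_to_remove` and `total_support`, the 10% removal fraction, and the toy values are all hypothetical and chosen only for illustration.

```python
def sites_to_remove(lnl_a, lnl_b, fraction):
    """Return indices of the sites that most strongly favor topology A.

    lnl_a, lnl_b: per-site log-likelihoods under topologies A and B
    (assumed to be precomputed elsewhere).
    fraction: share of all sites to consider for removal (hypothetical knob).
    """
    if len(lnl_a) != len(lnl_b):
        raise ValueError("per-site likelihood lists must have equal length")
    # Positive delta: the site fits topology A better than topology B.
    deltas = [a - b for a, b in zip(lnl_a, lnl_b)]
    n_remove = int(len(deltas) * fraction)
    # Rank sites from strongest to weakest support for A.
    ranked = sorted(range(len(deltas)), key=lambda i: deltas[i], reverse=True)
    # Only drop sites that actually favor A.
    return sorted(i for i in ranked[:n_remove] if deltas[i] > 0)


def total_support(lnl_a, lnl_b, removed=()):
    """Summed lnL(A) - lnL(B) over the retained sites."""
    skip = set(removed)
    return sum(a - b for i, (a, b) in enumerate(zip(lnl_a, lnl_b)) if i not in skip)


# Toy per-site log-likelihoods (invented values, not taken from the paper).
# Site 7 strongly favors A; most other sites weakly favor one tree or the other.
lnl_a = [-10.0, -12.0, -8.0, -9.0, -11.0, -10.5, -9.5, -7.0, -13.0, -10.0]
lnl_b = [-10.2, -11.7, -8.1, -9.1, -10.7, -10.2, -9.6, -12.0, -12.7, -10.1]

removed = sites_to_remove(lnl_a, lnl_b, fraction=0.1)
before = total_support(lnl_a, lnl_b)
after = total_support(lnl_a, lnl_b, removed)

print("removed sites:", removed)
print("support for A before removal:", round(before, 2))
print("support for A after removal:", round(after, 2))
```

In this toy case a single site carries nearly all of the signal for topology A, so dropping it flips the sign of the summed likelihood difference. In a real analysis the tree would be re-inferred from the filtered alignment and its support values re-estimated, rather than read off a summed difference.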

Keywords