BMC Bioinformatics (Jun 2017)

Reconciliation feasibility in the presence of gene duplication, loss, and coalescence with multiple individuals per species

  • Jennifer Rogers,
  • Andrew Fishberg,
  • Nora Youngs,
  • Yi-Chieh Wu

DOI
https://doi.org/10.1186/s12859-017-1701-1
Journal volume & issue
Vol. 18, no. 1
pp. 1 – 10

Abstract

Read online

Abstract Background In phylogenetics, we often seek to reconcile gene trees with species trees within the framework of an evolutionary model. While the most popular models for eukaryotic species allow for only gene duplication and gene loss or only multispecies coalescence, recent work has combined these phenomena through a reconciliation structure, the labeled coalescent tree (LCT), that simultaneously describes the duplication-loss and coalescent history of a gene family. However, the LCT makes the simplifying assumption that only one individual is sampled per species whereas, with advances in gene sequencing, we now have access to multiple samples per species. Results We demonstrate that with these additional samples, there exist gene tree topologies that are impossible to reconcile with any species tree. In particular, the multiple samples enforce new constraints on the placement of duplications within a valid reconciliation. To model these constraints, we extend the LCT to a new structure, the partially labeled coalescent tree (PLCT) and demonstrate how to use the PLCT to evaluate the feasibility of a gene tree topology. We apply our algorithm to two clades of apes and flies to characterize possible sources of infeasibility. Conclusion Going forward, we believe that this model represents a first step towards understanding reconciliations in duplication-loss-coalescence models with multiple samples per species.

Keywords