PLoS Computational Biology (Feb 2019)

Verbalizing phylogenomic conflict: Representation of node congruence across competing reconstructions of the neoavian explosion.

  • Nico M Franz,
  • Lukas J Musher,
  • Joseph W Brown,
  • Shizhuo Yu,
  • Bertram Ludäscher

DOI
https://doi.org/10.1371/journal.pcbi.1006493
Journal volume & issue
Vol. 15, no. 2
p. e1006493

Abstract

Phylogenomic research is accelerating the publication of landmark studies that aim to resolve deep divergences of major organismal groups. Meanwhile, systems for identifying and integrating the products of phylogenomic inference, such as newly supported clade concepts, have not kept pace. However, the ability to verbalize node concept congruence and conflict across multiple, in effect simultaneously endorsed, phylogenomic hypotheses is a prerequisite for building synthetic data environments for biological systematics and other domains impacted by these conflicting inferences. Here we develop a novel solution to the conflict verbalization challenge, based on a logic representation and reasoning approach that utilizes the language of Region Connection Calculus (RCC-5) to produce consistent alignments of node concepts endorsed by incongruent phylogenomic studies. The approach employs clade concept labels to individuate concepts used by each source, even if these carry identical names. Indirect RCC-5 modeling of intensional (property-based) node concept definitions, facilitated by the local relaxation of coverage constraints, allows parent concepts to attain congruence in spite of their differentially sampled children. To demonstrate the feasibility of this approach, we align two recent phylogenomic reconstructions of higher-level avian groups that entail strong conflict in the "neoavian explosion" region. According to our representations, this conflict is constituted by 26 instances of input "whole concept" overlap. These instances are further resolvable in the output labeling schemes and visualizations as "split concepts", which provide the labels and relations needed to build truly synthetic phylogenomic data environments.
Because the RCC-5 alignments fundamentally reflect the trained, logic-enabled judgments of systematic experts, future designs for such environments need to promote a culture where experts routinely assess the intensionalities of node concepts published by our peers, even and especially when we are not in agreement with each other.
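
To make the RCC-5 vocabulary concrete, the following minimal sketch classifies a pair of clade concepts into one of the five RCC-5 relations (congruence, proper inclusion, inverse proper inclusion, overlap, exclusion) and derives the three "split" regions of an overlapping pair. It is an illustration under strong assumptions, not the authors' method: each concept is reduced to a plain set of leaf taxa (a purely extensional reading, whereas the abstract describes intensional definitions with relaxed coverage constraints), and the names `RCC5`, `rcc5`, `split_concepts`, the "Study A"/"Study B" labels, and the taxa `t1` to `t4` are all hypothetical.

```python
from enum import Enum


class RCC5(Enum):
    """The five RCC-5 base relations between two regions (here: taxon sets)."""
    CONGRUENT = "=="     # both concepts cover exactly the same members
    INCLUDES = ">"       # first concept properly includes the second
    INCLUDED_IN = "<"    # first concept is properly included in the second
    OVERLAPS = "><"      # shared members, but each also has unique members
    EXCLUDES = "!"       # no shared members


def rcc5(a: frozenset, b: frozenset) -> RCC5:
    """Classify two non-empty member sets into exactly one RCC-5 relation."""
    if not a or not b:
        raise ValueError("RCC-5 regions are assumed to be non-empty")
    if a == b:
        return RCC5.CONGRUENT
    if a > b:            # strict superset
        return RCC5.INCLUDES
    if a < b:            # strict subset
        return RCC5.INCLUDED_IN
    if a & b:            # partial intersection
        return RCC5.OVERLAPS
    return RCC5.EXCLUDES


def split_concepts(a: frozenset, b: frozenset) -> dict:
    """Resolve an overlapping pair into its three split regions.

    Hypothetical helper: returns the members unique to a, shared by both,
    and unique to b.
    """
    return {"a_only": a - b, "shared": a & b, "b_only": b - a}


# Hypothetical concepts that carry the same name in two sources; the
# source label keeps them apart even though the name "Clade X" is identical.
concepts = {
    "Clade X sec. Study A": frozenset({"t1", "t2", "t3"}),
    "Clade X sec. Study B": frozenset({"t3", "t4"}),
}

a = concepts["Clade X sec. Study A"]
b = concepts["Clade X sec. Study B"]
relation = rcc5(a, b)
print(relation.name, relation.value)
print(split_concepts(a, b))
```

In this toy case the two same-named concepts overlap, and the split regions give each part of the conflict its own addressable label. A real alignment would additionally have to reason over the parent-child structure of each tree and over expert-asserted relations, which this set comparison does not attempt.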