BMC Bioinformatics (Dec 2012)

A flexible ancestral genome reconstruction method based on gapped adjacencies

  • Gagnon Yves,
  • Blanchette Mathieu,
  • El-Mabrouk Nadia

DOI
https://doi.org/10.1186/1471-2105-13-S19-S4
Journal volume & issue
Vol. 13, no. Suppl 19
p. S4

Abstract

Background: The "small phylogeny" problem consists in inferring the ancestral genomes associated with each internal node of a phylogenetic tree of a set of extant species. Existing methods fall into two main categories: distance-based methods, which aim to minimize the total branch length, and synteny-based (or mapping) methods, which first predict a collection of relations between ancestral markers in terms of "synteny" and then assemble this collection into a set of Contiguous Ancestral Regions (CARs). The CARs predicted by synteny-based methods are likely to be more reliable, as they are deduced more directly from conservation observed in extant species. The challenge, however, is to end up with a completely assembled genome.

Results: We develop a new synteny-based method that is flexible enough to handle a model of evolution involving whole genome duplication events, in addition to rearrangements, gene insertions, and losses. Ancestral relationships between markers are defined in terms of Gapped Adjacencies, i.e. pairs of markers separated by up to a given number of markers. The method improves on a previous approach restricted to direct adjacencies, which achieved high accuracy for adjacency prediction but had the drawback of being overly conservative, i.e. of generating a large number of CARs. Applying our algorithm to various simulated data sets shows good performance: we usually end up with a completely assembled genome while keeping a low error rate.

Availability: All source code is available at http://www.iro.umontreal.ca/~mabrouk.
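
To make the notion of a Gapped Adjacency concrete, the following minimal Python sketch enumerates the marker pairs separated by at most a given number of markers in a single linear marker order. It is only an illustration of the definition, not the authors' algorithm: the function name `gapped_adjacencies`, the parameter `max_gap`, the toy genomes, and the simplification to unsigned markers on one linear chromosome are all assumptions made here.

```python
from typing import List, Set, Tuple


def gapped_adjacencies(genome: List[str], max_gap: int) -> Set[Tuple[str, str]]:
    """Return the unordered marker pairs separated by at most `max_gap` markers.

    Hypothetical helper: markers are unsigned and the genome is treated as a
    single linear chromosome. max_gap = 0 yields the direct adjacencies only.
    """
    pairs: Set[Tuple[str, str]] = set()
    for i, left in enumerate(genome):
        # Positions i and j have exactly j - i - 1 markers between them,
        # so j may run up to i + max_gap + 1 (inclusive).
        for j in range(i + 1, min(i + max_gap + 2, len(genome))):
            pairs.add(tuple(sorted((left, genome[j]))))
    return pairs


# Two toy extant marker orders that differ by a swap of "b" and "c".
genome_1 = ["a", "b", "c", "d"]
genome_2 = ["a", "c", "b", "d"]

# Pairs conserved in both genomes, for direct and for gapped adjacencies.
shared_direct = gapped_adjacencies(genome_1, 0) & gapped_adjacencies(genome_2, 0)
shared_gapped = gapped_adjacencies(genome_1, 1) & gapped_adjacencies(genome_2, 1)

print(len(shared_direct), len(shared_gapped))
```

In this toy case only one direct adjacency is shared by the two orders, whereas allowing one intervening marker recovers additional conserved pairs. That is the intuition behind relaxing direct adjacencies, under the simplifying assumptions stated above.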