BMC Bioinformatics (Oct 2011)

Genomic distance under gene substitutions

  • Ribeiro Leonardo C,
  • Machado Raphael,
  • Braga Marília D V,
  • Stoye Jens

DOI
https://doi.org/10.1186/1471-2105-12-S9-S8
Journal volume & issue
Vol. 12, no. Suppl 9
p. S8

Abstract

Background: The distance between two genomes is often computed by comparing only the markers they have in common. Some approaches can also handle non-common markers by allowing such markers to be inserted or deleted. In these models, a deletion and a subsequent insertion that occur at the same position of the genome count as two sorting steps.

Results: Here we propose a new model that sorts non-common markers with substitutions, which are more powerful operations that encompass insertions and deletions. A deletion and an insertion that occur at the same position of the genome can be modeled as a substitution, counting as a single sorting step.

Conclusions: For genomes with unequal content but without duplicated markers, we give a linear-time algorithm to compute the genomic distance considering substitutions and double-cut-and-join (DCJ) operations. This model provides a parsimonious genomic distance for genomes free of duplicated markers, which is in practice a lower bound on the real genomic distance. The method could also be used to refine orthology assignments, since in some cases a substitution could actually correspond to an unannotated orthology.
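The difference in step counting between the two models can be illustrated with a toy sketch. It is not the linear-time algorithm from the paper: it assumes two linear genomes without duplicated markers whose common markers already appear in the same order (so no DCJ operations are needed), it ignores marker orientation, and it assumes that a contiguous run of non-common markers is handled by a single operation. The function names `gap_runs` and `count_steps` are hypothetical.

```python
def gap_runs(a, b):
    """Pair up the runs of non-common markers found between common markers.

    Assumes (for illustration only) that the common markers occur in the
    same order in both genomes and that no marker is duplicated.
    """
    common = set(a) & set(b)
    if [m for m in a if m in common] != [m for m in b if m in common]:
        raise ValueError("common markers must appear in the same order")

    def split(seq):
        # Collect the run of non-common markers before each common marker,
        # plus the trailing run after the last common marker.
        runs, current = [], []
        for m in seq:
            if m in common:
                runs.append(current)
                current = []
            else:
                current.append(m)
        runs.append(current)
        return runs

    return list(zip(split(a), split(b)))


def count_steps(a, b):
    """Return (indel_steps, substitution_steps) for the toy setting."""
    gaps = gap_runs(a, b)
    # Indel-only model: a deleted run and an inserted run each cost one step.
    indel = sum(bool(only_a) + bool(only_b) for only_a, only_b in gaps)
    # Substitution model: a deletion plus an insertion at the same position
    # collapse into a single step.
    subst = sum(1 for only_a, only_b in gaps if only_a or only_b)
    return indel, subst


genome_a = ["a", "x", "b", "c", "y", "d"]
genome_b = ["a", "u", "v", "b", "c", "d", "w"]
print(count_steps(genome_a, genome_b))  # (4, 3)
```

In this example the run `x` in the first genome and the run `u v` in the second occupy the same position between `a` and `b`. The indel-only count charges two steps there, while the substitution count charges one, which is why the substitution total is lower.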