PLoS ONE (Jan 2014)

Orthology detection combining clustering and synteny for very large datasets.

  • Marcus Lechner,
  • Maribel Hernandez-Rosales,
  • Daniel Doerr,
  • Nicolas Wieseke,
  • Annelyse Thévenin,
  • Jens Stoye,
  • Roland K Hartmann,
  • Sonja J Prohaska,
  • Peter F Stadler

DOI
https://doi.org/10.1371/journal.pone.0105015
Journal volume & issue
Vol. 9, no. 8
p. e105015

Abstract

The elucidation of orthology relationships is an important step both in gene function prediction and towards understanding patterns of sequence evolution. For large datasets, orthology assignments are usually derived directly from sequence similarities because more exact approaches are computationally too expensive. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance), was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF substantially reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis, Proteinortho, making the software applicable to very large datasets.
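
To make the notion of adjacencies concrete, the following is a minimal sketch of counting conserved adjacencies between two genomes that each consist of several linear chromosomes. It is not the FFAdj-MCS heuristic. It assumes that genes have already been matched one-to-one (shared labels), ignores gene orientation, and uses hypothetical function names and toy data chosen only for illustration.

```python
def adjacencies(chromosomes):
    """Collect unordered pairs of neighbouring genes on linear chromosomes.

    Assumption: each chromosome is a list of gene labels in order, and
    gene orientation (strand) is ignored.
    """
    pairs = set()
    for chrom in chromosomes:
        # Linear chromosomes: the last gene is not adjacent to the first.
        for left, right in zip(chrom, chrom[1:]):
            pairs.add(frozenset((left, right)))
    return pairs


def conserved_adjacencies(genome_a, genome_b):
    """Return the adjacencies present in both genomes."""
    return adjacencies(genome_a) & adjacencies(genome_b)


# Toy data (hypothetical): the same five genes, arranged differently.
genome_a = [["g1", "g2", "g3"], ["g4", "g5"]]
genome_b = [["g3", "g2", "g1", "g4"], ["g5"]]

shared = conserved_adjacencies(genome_a, genome_b)
print(len(shared))
```

In this toy case the pairs g1/g2 and g2/g3 are neighbours in both genomes, whereas g4/g5 is broken in the second genome. The more adjacencies two gene orders share, the fewer breakpoints separate them, which is the sense in which the measure relates to the breakpoint distance.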