Genome Biology (Jul 2019)

Benchmarking of alignment-free sequence comparison methods

  • Andrzej Zielezinski,
  • Hani Z. Girgis,
  • Guillaume Bernard,
  • Chris-Andre Leimeister,
  • Kujin Tang,
  • Thomas Dencker,
  • Anna Katharina Lau,
  • Sophie Röhling,
  • Jae Jin Choi,
  • Michael S. Waterman,
  • Matteo Comin,
  • Sung-Hou Kim,
  • Susana Vinga,
  • Jonas S. Almeida,
  • Cheong Xin Chan,
  • Benjamin T. James,
  • Fengzhu Sun,
  • Burkhard Morgenstern,
  • Wojciech M. Karlowski

DOI
https://doi.org/10.1186/s13059-019-1755-7
Journal volume & issue
Vol. 20, no. 1
pp. 1 – 18

Abstract

Background: Alignment-free (AF) sequence comparison is attracting persistent interest driven by data-intensive applications. Hence, many AF procedures have been proposed in recent years, but a lack of a clearly defined benchmarking consensus hampers their performance assessment.

Results: Here, we present a community resource (http://afproject.org) to establish standards for comparing alignment-free approaches across different areas of sequence-based research. We characterize 74 AF methods available in 24 software tools for five research applications, namely, protein sequence classification, gene tree inference, regulatory element detection, genome-based phylogenetic inference, and reconstruction of species trees under horizontal gene transfer and recombination events.

Conclusion: The interactive web service allows researchers to explore the performance of alignment-free tools relevant to their data types and analytical goals. It also allows method developers to assess their own algorithms and compare them with current state-of-the-art tools, accelerating the development of new, more accurate AF solutions.

Keywords