Frontiers in Microbiology (Mar 2023)
Ksak: A high-throughput tool for alignment-free phylogenetics
Abstract
Phylogenetic tools are fundamental to the studies of evolutionary relationships. In this paper, we present Ksak, a novel high-throughput tool for alignment-free phylogenetic analysis. Ksak computes the pairwise distance matrix between molecular sequences, using seven widely accepted k-mer based distance measures. Based on the distance matrix, Ksak constructs the phylogenetic tree with standard algorithms. When benchmarked with a golden standard 16S rRNA dataset, Ksak was found to be the most accurate tool among all five tools compared and was 19% more accurate than ClustalW2, a high-accuracy multiple sequence aligner. Above all, Ksak was tens to hundreds of times faster than ClustalW2, which helps eliminate the computation limit currently encountered in large-scale multiple sequence alignment. Ksak is freely available at https://github.com/labxscut/ksak.
Keywords