Medical Review (Dec 2021)
TahcoRoll: fast genomic signature profiling via thinned automaton and rolling hash
Abstract
Genomic signatures like k-mers have become one of the most prominent approaches to describe genomic data. As a result, myriad real-world applications, such as the construction of de Bruijn graphs in genome assembly, have been benefited by recognizing genomic signatures. In other words, an efficient approach of genomic signature profiling is an essential need for tackling high-throughput sequencing reads. However, most of the existing approaches only recognize fixed-size k-mers while many research studies have shown the importance of considering variable-length k-mers.
Keywords