Journal of Computer Science and Technology (Oct 2020)
An analysis of k-mer frequency features with SVM and CNN for viral subtyping classification
Abstract
Viral subtyping classification is highly relevant to the appropriate diagnosis and treatment of illnesses. The most widely used tools rely on sequence alignment; however, they are becoming too slow as the volume of genomic data grows. For that reason, alignment-free methods have emerged as an alternative. In this work, we analyzed four alignment-free methods: two use k-mer frequencies (Kameris and Castor-KRFE); the third uses a frequency chaos game representation of DNA with CNNs; and the last processes DNA sequences as digital signals (ML-DSP). In the comparison, Kameris and Castor-KRFE outperformed the rest, followed by the CNN-based method.
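To illustrate the kind of feature the k-mer-based methods work with, the following is a minimal sketch of a normalized k-mer frequency vector. It is not the implementation used by Kameris or Castor-KRFE; the function name `kmer_frequencies`, the default `k=3`, and the choice to ignore k-mers containing ambiguous bases are assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=3, alphabet="ACGT"):
    """Return the normalized k-mer frequency vector of a DNA sequence.

    The vector has len(alphabet) ** k entries, ordered as generated by
    itertools.product over the alphabet (lexicographic for "ACGT").
    Assumption for this sketch: k-mers containing symbols outside the
    alphabet (for example 'N') are ignored.
    """
    sequence = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Enumerate all possible k-mers over the alphabet in a fixed order.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Normalize by the number of valid k-mers so sequences of different
    # lengths become comparable.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


if __name__ == "__main__":
    vector = kmer_frequencies("ACGTAC", k=2)
    print(len(vector), vector[1])
```

A vector like this, computed per genome, could then be passed to a classifier such as an SVM; the fixed ordering of k-mers ensures that every sequence maps to the same feature positions.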
Keywords