IEEE Access (Jan 2019)

Viral Genome Deep Classifier

  • Anna Fabijanska,
  • Szymon Grabowski

DOI
https://doi.org/10.1109/ACCESS.2019.2923687
Journal volume & issue
Vol. 7
pp. 81297 – 81307

Abstract

The task of virus classification into subtypes is an important concern in many categorization studies, e.g., in virology or epidemiology. Therefore, the problem of virus subtyping has been a subject of considerable interest in the last decade. Although several virus subtyping tools exist, they are often dedicated to a specific family of viruses. Even specialized methods, however, often fail to correctly subtype viruses such as HIV or influenza. To address these shortcomings, we present the viral genome deep classifier (VGDC), a tool for automatic virus subtyping that employs a deep convolutional neural network (CNN). The method is universal and can be applied to subtype any virus, as confirmed by experiments on dengue, hepatitis B and C, HIV-1, and influenza A datasets. For all considered virus types, the obtained classification rates are very high, with the corresponding values of the F1-score ranging from about 0.85 to 1.00 depending on the virus type and the considered number of subtypes. For HIV-1 and influenza A, the VGDC significantly outperforms the leading competitors, including CASTOR and COMET. The VGDC source code is freely available to facilitate easy usage and comparison with future approaches.
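
For readers unfamiliar with the F1-score reported above, the following minimal sketch shows how it can be computed for a multi-class subtyping task. It is an illustration only and not taken from the VGDC source code: the function names `per_class_f1` and `macro_f1`, the subtype labels, and the use of macro averaging are assumptions, since the abstract does not state how per-subtype scores are aggregated.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred.

    Uses F1 = 2*TP / (2*TP + FP + FN), which is equivalent to the
    harmonic mean of precision and recall.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted subtype p, but it was wrong
            fn[t] += 1  # true subtype t was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical subtype labels, for illustration only.
y_true = ["A", "A", "B", "B"]
y_pred = ["A", "B", "B", "B"]
scores = per_class_f1(y_true, y_pred)
overall = macro_f1(y_true, y_pred)
```

In this toy case, subtype "A" has one true positive and one false negative, while subtype "B" has two true positives and one false positive, so the two per-class scores differ even though only one sequence is misclassified.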

Keywords