Virology Journal (Mar 2012)

Human polyomaviruses identification by logic mining techniques

  • Weitschek Emanuel,
  • Lo Presti Alessandra,
  • Drovandi Guido,
  • Felici Giovanni,
  • Ciccozzi Massimo,
  • Ciotti Marco,
  • Bertolazzi Paola

DOI
https://doi.org/10.1186/1743-422X-9-58
Journal volume & issue
Vol. 9, no. 1
p. 58

Abstract

Background: Differences in genomic sequences are crucial for the classification of viruses into different species. In this work, viral DNA sequences belonging to the human polyomaviruses BKPyV, JCPyV, KIPyV, WUPyV, and MCPyV are analyzed using a logic data mining method in order to identify the nucleotides that distinguish the five human polyomaviruses.

Results: The approach presented in this work is successful, as it discovers several logic rules that effectively characterize the five studied polyomaviruses. The identified logic rules are able to precisely separate one viral type from the others and to assign an unknown DNA sequence to one of the five analyzed polyomaviruses.

Conclusions: The data mining analysis is performed by considering the complete sequences of the viruses and the sequences of the different gene regions separately, obtaining in both cases extremely high correct recognition rates.
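To illustrate how logic rules of this kind could assign an unknown sequence to a virus, here is a minimal sketch. It assumes each rule is a conjunction of "nucleotide X at position P" conditions, a form the abstract does not spell out. The positions and nucleotides in `RULES` are hypothetical placeholders, not rules reported by the authors, and the names `RULES`, `matches`, and `classify` are introduced here for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# A rule is a conjunction of (0-based position, nucleotide) conditions.
Rule = List[Tuple[int, str]]

# HYPOTHETICAL rules for illustration only; these positions and nucleotides
# are made up and are NOT the rules discovered in the paper.
RULES: Dict[str, Rule] = {
    "BKPyV": [(0, "A"), (3, "G")],
    "JCPyV": [(0, "A"), (3, "T")],
    "KIPyV": [(0, "C"), (2, "C")],
    "WUPyV": [(0, "C"), (2, "T")],
    "MCPyV": [(0, "G")],
}


def matches(seq: str, rule: Rule) -> bool:
    """Return True if every condition of the rule holds for the sequence."""
    return all(pos < len(seq) and seq[pos] == nt for pos, nt in rule)


def classify(seq: str) -> Optional[str]:
    """Return the single virus whose rule matches, or None if zero or several match."""
    seq = seq.upper()
    hits = [label for label, rule in RULES.items() if matches(seq, rule)]
    return hits[0] if len(hits) == 1 else None
```

With these placeholder rules, `classify("ACGG")` returns `"BKPyV"`, while a sequence that satisfies no rule returns `None`. Returning `None` rather than a best guess is a design choice of this sketch: it keeps unrecognized sequences visibly unclassified.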