Uludağ University Journal of The Faculty of Engineering (Dec 2016)

Detection of P53 Consensus Sequence: A Novel String Matching With Classes Algorithm

  • Gıyasettin ÖZCAN

DOI
https://doi.org/10.17482/uujfe.21385
Journal volume & issue
Vol. 21, no. 2
pp. 269 – 282

Abstract


We present a novel, fast string matching technique for special DNA pattern forms and compare the performance of recent CPU architectures on the matching problem. In particular, we consider the P53 DNA-binding consensus sequence, which plays an important role in cancer treatment. According to biological findings, the consensus P53 pattern may occur in various sequence forms, and its length is not fixed. Classic string matching algorithms therefore cannot solve the problem. For an efficient solution, we consider bitwise algorithms for string matching with classes and present a novel search technique based on 64-bit packed variables. To avoid the difficulties caused by the variable length of the pattern, we search for specific indexes of the P53 pattern in the databases. For the experimental analysis, we use Mus musculus DNA sequences with approximately 2.3 billion nucleotides. We compare the performance of the algorithm on three architectures with various levels of CPU parallelism. Test results show that our technique searches for the P53 pattern efficiently on each architecture. Owing to its structure, the algorithm also offers an efficient solution to similar string matching with classes problems.
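To illustrate the general idea of bitwise string matching with classes, the following is a minimal sketch of the classic Shift-And scheme with one 64-bit state word. It is not the algorithm proposed in the paper. The half-site motif RRRCWWGYYY (IUPAC codes: R = A/G, W = A/T, Y = C/T) is an assumption taken from the commonly cited P53 consensus and does not appear in the abstract; the function names build_masks and shift_and_search are hypothetical.

```python
# Assumption: IUPAC classes needed by the illustrative half-site motif.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine
    "Y": "CT",  # pyrimidine
    "W": "AT",  # weak pair
}

MASK64 = (1 << 64) - 1  # emulate a single 64-bit machine word


def build_masks(pattern):
    """For each nucleotide, set bit i if pattern position i accepts it."""
    if len(pattern) > 64:
        raise ValueError("pattern must fit in one 64-bit word")
    masks = {base: 0 for base in "ACGT"}
    for i, symbol in enumerate(pattern):
        for base in IUPAC[symbol]:
            masks[base] |= 1 << i
    return masks


def shift_and_search(text, pattern):
    """Return the start index of every window of text matching pattern."""
    masks = build_masks(pattern)
    accept = 1 << (len(pattern) - 1)  # bit set when the last position matches
    state = 0
    hits = []
    for pos, base in enumerate(text):
        # Extend all partial matches by one character.
        # Characters outside A/C/G/T (e.g. N) clear the state.
        state = ((state << 1) | 1) & masks.get(base, 0) & MASK64
        if state & accept:
            hits.append(pos - len(pattern) + 1)
    return hits


print(shift_and_search("TTAGACATGCCTAA", "RRRCWWGYYY"))  # [2]
```

Because each class is folded into the per-nucleotide masks during preprocessing, the scan costs one shift, one OR and one AND per text character regardless of how many alternatives a position allows. Handling the variable-length form of the full pattern would need additional logic that this sketch does not attempt.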

Keywords