HCudaBLAST: an implementation of BLAST on Hadoop and Cuda

Nilay Khare; Alind Khare; Farhan Khan

doi:10.1186/s40537-017-0102-7

Journal of Big Data (Nov 2017)

Affiliations

Nilay Khare: Maulana Azad National Institute of Technology
Alind Khare: IIIT
Farhan Khan: Maulana Azad National Institute of Technology

DOI: https://doi.org/10.1186/s40537-017-0102-7
Journal volume & issue: Vol. 4, no. 1, pp. 1–8

Abstract

DNA sequencing has been a difficult field since it was first worked on, and it is growing at an exponential rate. The amount of data involved in DNA searching is so large that conventional tools and algorithms are not suited to processing it. BLAST is a tool provided by the National Center for Biotechnology Information (NCBI) that compares nucleotide or protein sequences against sequence databases and calculates the statistical significance of matches. Variants of BLAST such as blastn, blastp, and blastx are used for nucleotide, protein, and nucleotide-to-protein searches, respectively. GPU-BLAST and HBLAST have already been proposed to handle the vast amount of data involved in searching DNA sequences and to speed up the search. In this article, we propose a new model for searching DNA sequences, HCudaBLAST, which combines CUDA processing with Hadoop for efficient searching. The results recorded after implementing HCudaBLAST are shown. The solution combines the multi-core parallelism of GPGPUs with the scalability provided by the Hadoop framework.

Published in Journal of Big Data

ISSN: 2196-1115 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://journalofbigdata.springeropen.com
