Efficient Alignment of Very Long Sequences

Advances in Science, Technology and Engineering Systems. 2018;3(2):329-345 DOI 10.25046/aj030236

 

Journal Homepage

Journal Title: Advances in Science, Technology and Engineering Systems

ISSN: 2415-6698 (Online)

Publisher: Advances in Science, Technology and Engineering Systems Journal (ASTESJ)

LCC Subject Category: Technology | Science

Country of publisher: United States

Language of fulltext: English

Full-text formats available: PDF

 

AUTHORS

Chunchun Zhao
Sartaj Sahni

EDITORIAL INFORMATION

Double blind peer review

Editorial Board

Instructions for authors

Time From Submission to Publication: 8 weeks

 

Abstract | Full Text

We consider the problem of aligning two very long biological sequences. The score for the best alignment may be found using the Smith-Waterman scoring algorithm while the best alignment itself may be determined using Myers and Miller’s alignment algorithm. Neither of these algorithms takes advantage of computer caches to obtain high efficiency. We propose cache-efficient algorithms to determine the score of the best alignment as well as the best alignment itself. All algorithms were implemented using C and OpenMP, and benchmarked using real data sets from the National Center for Biotechnology Information (NCBI) database. The test computational platforms were Xeon E5 2603, I7-x980 and Xeon E5 2695. Our best single-core cacheefficient scoring algorithm reduces the running time by as much as 19.7% relative to the Smith-Waterman scoring algorithm and our best cache-efficient alignment algorithm reduces the running time by as much as 17.1% relative to the Myers and Miller alignment algorithm. Multicore versions of our cache-efficient algorithms scale quite well up to the 24 cores we tested; achieving a speedup of 22 with 24 cores. Our multi-core scoring and alignment algorithms reduce the running time by as much as 61.4% and 47.3% relative to multi-core versions of the Smith-Waterman scoring algorithm and Myers and Miller’s alignment algorithm, respectively.