UHD Journal of Science and Technology (Aug 2017)
An Improved Parallel Multiple Sequence Alignment Algorithm on Multi-core System
Abstract
In this paper, we introduce an improved parallel algorithm for computing the number of exact matches, nid(S, T), in the local alignment of two biological sequences S and T. This number is used in the first stage of progressive alignment to compute the distance between two sequences. The distance computation is usually the most computationally intensive part of progressive alignment. Therefore, this work concentrates on improving the algorithm for this stage by vectorizing it and running it on a multi-core system. Our program, written in C++ with the OpenMP library, computes nid(S, T) between very long sequences of up to 34 k residues on an Intel Core i7-3770 quad-core processor running at 3.40 GHz with 8 GB of main memory. It outperforms ClustalW-MPI 0.13 with a 2.9-fold speedup and reaches an efficiency of 0.35. Furthermore, a higher speedup with improved efficiency can be accomplished. Its performance ranges from a low of 0.438 GCUPS to a high of 3.66 GCUPS as the lengths of the query sequences decrease from 34,500 to 9,200 residues.
Keywords