International Journal of Information and Communication Technology Research (Sep 2022)
The Efficient Alignment of Long DNA Sequences Using Divide and Conquer Approach
Abstract
Discovering mutations in DNA sequences is the most common approach to diagnosing many genome-related diseases. Optimal alignment of DNA sequences is a reliable way to discover mutations in one sequence relative to a reference sequence. Needleman-Wunsch is the most commonly applied algorithm for optimal global alignment of sequences, and Smith-Waterman is the most commonly applied one for optimal local alignment. Both perform excellently on short sequences, but as the sequences become longer their time and memory requirements grow so quickly that aligning sequences such as complete human DNA becomes practically impossible. Therefore, much research has been done, or is under way, to find ways of performing the alignment with tolerable time and memory consumption. One such effort is to break the sequences into the same number of parts and align corresponding parts to produce the overall alignment. This yields three benefits at once: reduced run time, reduced main memory usage, and the opportunity to better utilize multiprocessors, multicores, and General-Purpose Graphics Processing Units (GPGPUs). In this research, the method for breaking long sequences into smaller parts is based on the divide and conquer approach. The breaking points are selected along the longest common subsequence of the current sequences. The evaluation shows the method to be very efficient with respect to both time and main memory utilization, which are the two limiting factors.
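The general idea of splitting two sequences at points along a common subsequence and aligning the corresponding parts can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`needleman_wunsch`, `lcs_pairs`, `align_divide_and_conquer`), the scoring values, and the rule of picking evenly spaced cut points are all assumptions made for illustration. In particular, the sketch finds the longest common subsequence with a full dynamic-programming table, which would itself be too expensive for very long sequences; it only shows how the pieces fit together.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of a and b; returns the two gapped strings.
    The scoring values are illustrative assumptions."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def lcs_pairs(a, b):
    """Index pairs (i, j) with a[i] == b[j] along one longest common subsequence.
    Uses a full O(len(a) * len(b)) table, for illustration only."""
    n, m = len(a), len(b)
    length = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                length[i][j] = length[i + 1][j + 1] + 1
            else:
                length[i][j] = max(length[i + 1][j], length[i][j + 1])
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j)); i += 1; j += 1
        elif length[i + 1][j] >= length[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def align_divide_and_conquer(a, b, parts=4):
    """Split a and b at up to parts - 1 matched positions, align each pair of parts."""
    pairs = lcs_pairs(a, b)
    step = max(1, len(pairs) // parts)
    cuts = pairs[step::step][:parts - 1]  # evenly spaced anchors (an assumption)
    bounds = [(0, 0)] + cuts + [(len(a), len(b))]
    out_a, out_b = [], []
    for (i0, j0), (i1, j1) in zip(bounds, bounds[1:]):
        # Each pair of parts is independent, so these calls could run in parallel.
        part_a, part_b = needleman_wunsch(a[i0:i1], b[j0:j1])
        out_a.append(part_a); out_b.append(part_b)
    return "".join(out_a), "".join(out_b)
```

Calling `align_divide_and_conquer(seq_a, seq_b, parts=3)` returns two gapped strings of equal length. Because each part is aligned on its own, the largest score table is only as big as the largest pair of parts, which is where the memory saving described above would come from. The concatenated result is a valid alignment but is not guaranteed to match the score of a single full Needleman-Wunsch run.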