Foundations of Computing and Decision Sciences (Jun 2016)

G-MAPSEQ – a new method for mapping reads to a reference genome

  • Wojciechowski Pawel,
  • Frohmberg Wojciech,
  • Kierzynka Michal,
  • Zurkowski Piotr,
  • Blazewicz Jacek

DOI
https://doi.org/10.1515/fcds-2016-0007
Journal volume & issue
Vol. 41, no. 2
pp. 123 – 142

Abstract

Mapping reads to a reference genome is one of the most essential problems in modern computational biology. The most popular algorithms for this task are based on the Burrows-Wheeler transform and the FM-index. However, these approaches have difficulty with highly mutated sequences, because they allow only a limited number of mutations. G-MAPSEQ is a novel hybrid algorithm that combines two methods: alignment-free sequence comparison and ultra-fast sequence alignment. The former is a fast heuristic that uses k-mer characteristics of nucleotide sequences to find potential mapping positions. The latter is a very fast GPU implementation of sequence alignment, used to verify the correctness of these mapping positions. The source code of G-MAPSEQ, along with other bioinformatics software, is available at: http://gpualign.cs.put.poznan.pl.
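
The two-stage idea described in the abstract (a k-mer heuristic proposes candidate positions, then an alignment step verifies them) can be illustrated with a minimal Python sketch. This is not the G-MAPSEQ implementation: the function names, the offset-voting scheme, the parameters `k`, `top_n` and `max_errors`, and the use of a plain CPU edit distance in place of the GPU alignment are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def build_kmer_index(reference, k):
    """Map each k-mer to the list of positions where it occurs in the reference."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def candidate_offsets(read, index, k, top_n=3):
    """Alignment-free step (hypothetical scheme): each shared k-mer votes for
    the offset 'reference position minus read position'."""
    votes = Counter()
    for j in range(len(read) - k + 1):
        for i in index.get(read[j:j + k], ()):
            votes[i - j] += 1
    return [offset for offset, _ in votes.most_common(top_n)]


def edit_distance(a, b):
    """Levenshtein distance by dynamic programming; a CPU stand-in for the
    GPU alignment mentioned in the abstract."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # match or substitution
        prev = curr
    return prev[-1]


def map_read(read, reference, index, k, max_errors):
    """Verify each candidate and return (position, distance) for the best
    one within max_errors, or None if no candidate passes."""
    best = None
    for offset in candidate_offsets(read, index, k):
        start = max(offset, 0)
        window = reference[start:start + len(read)]
        dist = edit_distance(read, window)
        if dist <= max_errors and (best is None or dist < best[1]):
            best = (start, dist)
    return best


# Toy data: the read is reference[8:20] with one substituted base.
reference = "ACGTTGCAAGGCTTAACCGGTTAGCATCGA"
read = "AGGCTTCACCGG"
index = build_kmer_index(reference, 4)
print(map_read(read, reference, index, k=4, max_errors=2))  # (8, 1)
```

In this toy setting the substitution destroys four of the read's nine 4-mers, yet the remaining five still vote for the same offset, so the candidate is found and then confirmed by the alignment step. This is the sense in which a k-mer heuristic followed by verification can tolerate mutations.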

Keywords