Sistemas de Informação (Dec 2020)

A Survey on Solutions for Planted Motif Search Challenging Instances

  • AGUENA, D. S.,
  • MONGELLI, H.,
  • ALMEIDA, N. F.

Journal volume & issue
Vol. 1, no. 26
pp. 10 – 30

Abstract

In gene expression, a transcription factor molecule binds to a short substring in the promoter region of a gene to start transcription. This short substring, called a motif, appears imperfectly conserved across the promoter regions of several genes. Discovering motifs in a set of sequences representing promoter regions is an important problem in Bioinformatics. In 2000, Pevzner and Sze introduced the planted (l, d)-motif search (PMS) problem: find motifs in a set of sequences, where l is the motif length and d is the maximum difference between the motif found and its occurrences in the set. In 2001, Buhler and Tompa studied this problem, and their work made it possible to classify certain instances, considered more difficult, as challenging instances. Since then, many approaches have been proposed to solve PMS challenging instances, but there are still limitations on the maximum instance size these approaches support. In this work we present a review of solutions for PMS challenging instances.
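
To make the problem statement concrete, the following is a minimal brute-force sketch of the (l, d)-motif definition given above. It is not one of the surveyed solutions. It assumes that the "difference" is measured as Hamming distance (number of mismatched positions) and that sequences are over the DNA alphabet ACGT; the function names `hamming`, `occurs_within`, and `planted_motifs` are illustrative only. Because it enumerates all 4^l candidate strings, it is only usable for very small l, which hints at why larger instances are considered challenging.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, seq, d):
    """True if some length-l window of seq is within distance d of motif."""
    l = len(motif)
    return any(
        hamming(motif, seq[i:i + l]) <= d
        for i in range(len(seq) - l + 1)
    )


def planted_motifs(seqs, l, d, alphabet="ACGT"):
    """Return every length-l string that occurs, with at most d mismatches,
    in every sequence of seqs (exhaustive search over all candidates)."""
    found = []
    for chars in product(alphabet, repeat=l):
        candidate = "".join(chars)
        if all(occurs_within(candidate, s, d) for s in seqs):
            found.append(candidate)
    return found


# Toy input (made up for illustration, not data from the paper).
toy_sequences = ["ACGTAC", "TTACGA", "GACGTT"]
exact_motifs = planted_motifs(toy_sequences, l=3, d=0)
```

With d = 0 the search reduces to finding substrings of length l shared exactly by all sequences; increasing d admits imperfectly conserved occurrences, and the number of candidate motifs that must be checked grows quickly with l.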

Keywords