BMC Bioinformatics (Oct 2017)

An optimized approach for annotation of large eukaryotic genomic sequences using genetic algorithm

  • Biswanath Chowdhury,
  • Arnav Garai,
  • Gautam Garai

DOI
https://doi.org/10.1186/s12859-017-1874-7
Journal volume & issue
Vol. 18, no. 1
pp. 1 – 13

Abstract

Background: Detecting important functional and/or structural elements in a large eukaryotic genomic sequence, and identifying their positions, is an active research area. The gene is an important functional and structural unit of DNA. Computational gene prediction is therefore essential for detailed genome annotation.

Results: In this paper, we propose a new gene prediction technique based on a Genetic Algorithm (GA) to determine the optimal positions of the exons of a gene in a chromosome or genome. Correctly identifying the coding and non-coding regions is difficult and computationally demanding. The proposed method, named Gene Prediction with Genetic Algorithm (GPGA), simplifies this problem by searching for only one exon at a time instead of all exons together with their introns. This representation carries a significant advantage: it breaks the entire gene-finding problem into a number of smaller sub-problems, thereby reducing the computational complexity. We tested the performance of GPGA on existing benchmark datasets and compared the results with well-known and relevant techniques. The comparison shows that the proposed method performs better than or comparably to them. We also used GPGA to annotate human chromosome 21 (HS21) using cross-species comparisons with the mouse orthologs.

Conclusion: GPGA predicted true genes with better accuracy than other well-known approaches.
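The idea of searching for one exon at a time can be illustrated with a toy genetic algorithm. The sketch below is not the GPGA implementation: the per-base score array, the fitness function (sum of scores inside a candidate interval), the operators, and every name (`find_one_exon`, `predict_exons`, `toy_scores`) are assumptions made for illustration only. GPGA itself relies on cross-species comparison, which is not modelled here.

```python
import random


def find_one_exon(scores, rng, pop_size=40, generations=60, mutation_step=5):
    """Evolve a single (start, end) interval that maximises the summed score.

    `scores` is a hypothetical per-base signal: positive where a base looks
    coding, negative elsewhere. This is an assumed stand-in, not GPGA's fitness.
    """
    n = len(scores)

    def fitness(interval):
        start, end = interval
        return sum(scores[start:end])

    def random_interval():
        start, end = sorted(rng.sample(range(n + 1), 2))
        return (start, end)

    population = [random_interval() for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the better half unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # Crossover: start boundary from one parent, end from the other.
            start, end = p1[0], p2[1]
            # Mutation: shift each boundary by a small random offset.
            start += rng.randint(-mutation_step, mutation_step)
            end += rng.randint(-mutation_step, mutation_step)
            start = max(0, min(n - 1, start))
            end = max(0, min(n, end))
            if start >= end:
                start, end = p1  # fall back to a valid parent interval
            children.append((start, end))
        population = parents + children
    return max(population, key=fitness)


def predict_exons(scores, max_exons, seed=0):
    """Solve the multi-exon problem as a sequence of single-exon searches."""
    rng = random.Random(seed)
    work = list(scores)
    exons = []
    for _ in range(max_exons):
        start, end = find_one_exon(work, rng)
        if sum(work[start:end]) <= 0:
            break  # nothing that looks like an exon is left
        exons.append((start, end))
        # Mask the region so the next search cannot reuse these bases.
        for i in range(start, end):
            work[i] = float("-inf")
    return sorted(exons)


# Toy signal with two planted "exons" at [30, 60) and [120, 160).
toy_scores = [-1] * 200
for i in list(range(30, 60)) + list(range(120, 160)):
    toy_scores[i] = 1
print(predict_exons(toy_scores, max_exons=2))
```

Each individual in this sketch encodes only two integers, so the search space of one run grows with the square of the sequence length, whereas encoding all exon boundaries of a gene in one individual would grow much faster with the number of exons. This is the sense in which the decomposition reduces the problem size in this toy setting.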

Keywords