PeerJ (Jan 2018)

Genomic signal processing for DNA sequence clustering

  • Gerardo Mendizabal-Ruiz
  • Israel Román-Godínez
  • Sulema Torres-Ramos
  • Ricardo A. Salido-Ruiz
  • Hugo Vélez-Pérez
  • J. Alejandro Morales

DOI
https://doi.org/10.7717/peerj.4264
Journal volume & issue
Vol. 6
p. e4264

Abstract


Genomic signal processing (GSP) methods, which convert DNA data to numerical values, have recently been proposed; they offer the opportunity to apply existing digital signal processing methods to genomic data. One of the most widely used methods for exploring data is cluster analysis, which refers to the unsupervised classification of patterns in data. In this paper, we propose a novel approach to cluster analysis of DNA sequences based on GSP methods and the K-means algorithm. We also propose a visualization method that facilitates inspection and analysis of the results and of possible hidden behaviors. Our results support the feasibility of employing the proposed method to find and easily visualize interesting features in sets of DNA data.
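
To make the general idea concrete, the following is a minimal sketch of a GSP-plus-K-means pipeline, not the method of the paper. The abstract does not state which numerical mapping, features, or K-means settings the authors use, so everything here is an assumption chosen for illustration: a binary indicator mapping (one 0/1 signal per base), a power spectrum sampled at a fixed set of normalized frequencies so that sequences of different lengths become comparable, and a plain Lloyd's K-means with farthest-first initialization. The function names (`indicator_signals`, `spectral_features`, `kmeans`) and the toy sequences are hypothetical.

```python
import cmath

BASES = "ACGT"


def indicator_signals(seq):
    """Map a DNA string to four binary signals, one per base (assumed mapping)."""
    seq = seq.upper()
    return [[1.0 if ch == base else 0.0 for ch in seq] for base in BASES]


def spectral_features(seq, n_bins=6):
    """Summed power of the four indicator signals at frequencies k / (2 * n_bins).

    Sampling fixed normalized frequencies in (0, 0.5] gives every sequence a
    feature vector of length n_bins, regardless of sequence length.
    """
    n = len(seq)
    signals = indicator_signals(seq)
    features = []
    for k in range(1, n_bins + 1):
        freq = k / (2 * n_bins)
        power = 0.0
        for signal in signals:
            coeff = sum(
                x * cmath.exp(-2j * cmath.pi * freq * t)
                for t, x in enumerate(signal)
            )
            power += abs(coeff / n) ** 2  # normalize by length
        features.append(power)
    return features


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm with deterministic farthest-first initialization."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data (not from the paper): period-3 repeats versus period-2 repeats.
sequences = ["ACG" * 20, "ACG" * 25, "CGA" * 22, "AT" * 30, "TA" * 28, "AT" * 35]
labels = kmeans([spectral_features(s) for s in sequences], k=2)
print(labels)
```

In this toy setting the period-3 sequences concentrate their power at normalized frequency 1/3 and the period-2 sequences at 1/2, so the two groups are well separated in feature space. Real genomic data would call for a more careful choice of mapping, features, and number of clusters.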

Keywords