Information (Jan 2017)

The Matrix Method of Representation, Analysis and Classification of Long Genetic Sequences

  • Ivan V. Stepanyan,
  • Sergey V. Petoukhov

DOI
https://doi.org/10.3390/info8010012
Journal volume & issue
Vol. 8, no. 1
p. 12

Abstract

The article presents a matrix method for the comparative analysis of long nucleotide sequences in which each sequence is represented as three binary sequences. The method relies on a set of symmetries in the biochemical attributes of nucleotides, and on the possibility of presenting every complete set of N-mers as one member of a Kronecker family of genetic matrices. With this method, a long nucleotide sequence can be visualized as an individual fractal-like mosaic or another regular mosaic of binary type. In contrast to natural nucleotide sequences, artificial random sequences give non-regular patterns. Examples of binary mosaics of long nucleotide sequences are shown, including cases of human chromosomes and penicillins. The results are then discussed.
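As a rough illustration of the two ideas named in the abstract, the sketch below encodes a nucleotide sequence as three binary sequences and builds a Kronecker power of a 2×2 symbolic matrix so that it contains every N-mer once. The abstract does not list the attributes or the matrix layout, so the choices here are assumptions: the three standard binary groupings of nucleotides (purine/pyrimidine, amino/keto, strong/weak hydrogen bonding), an arbitrary assignment of 1 and 0 within each grouping, and the base matrix [C A; T G]. The names `ATTRIBUTES`, `binary_sequences`, and `kronecker_power` are hypothetical and not taken from the article.

```python
# Assumed binary attributes of nucleotides (which group gets 1 is arbitrary):
#   purine (A, G) vs pyrimidine (C, T)
#   amino (A, C) vs keto (G, T)
#   strong, three hydrogen bonds (C, G) vs weak, two hydrogen bonds (A, T)
ATTRIBUTES = {
    "purine_pyrimidine": {"A": 1, "G": 1, "C": 0, "T": 0},
    "amino_keto": {"A": 1, "C": 1, "G": 0, "T": 0},
    "strong_weak": {"C": 1, "G": 1, "A": 0, "T": 0},
}

# Assumed 2x2 arrangement of the four nucleotides.
BASE_MATRIX = [["C", "A"],
               ["T", "G"]]


def binary_sequences(seq):
    """Map one nucleotide sequence to three binary sequences, one per attribute."""
    seq = seq.upper()
    return {
        name: [table[base] for base in seq]
        for name, table in ATTRIBUTES.items()
    }


def kronecker_power(matrix, n):
    """n-th Kronecker power of a symbolic matrix.

    String concatenation plays the role of multiplication, so the result
    is a (2**n x 2**n) matrix for a 2x2 input, holding each n-mer once.
    """
    result = [[""]]
    for _ in range(n):
        result = [
            [a + b for a in row_a for b in row_b]
            for row_a in result
            for row_b in matrix
        ]
    return result


if __name__ == "__main__":
    for name, bits in binary_sequences("GATTACA").items():
        print(name, bits)
    for row in kronecker_power(BASE_MATRIX, 2):
        print(" ".join(row))
```

Each of the three binary sequences can then be cut into N-bit words and plotted cell by cell to obtain a binary mosaic; how the article does this in detail is not stated in the abstract, so that step is left out of the sketch.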

Keywords