Известия Саратовского университета. Новая серия. Серия Математика. Механика. Информатика (Aug 2019)
Classification and Recognition of Structures of Genetic Sequences
Abstract
To study the relationships between the properties of organisms and the properties of their genetic sequences, we propose a classification of genetic sequences based on numerical indicators of recurrent and Z-recurrent shapes, which define the structure of functional relationships between the elements of a sequence. A method of classifying genetic sequences by these numerical indicators is introduced. For each sequence of a biological rank considered in the recognition problem (a rank that has a meaningful interpretation in the application area), we compare a numerical characteristic that generalizes the numerical values with the numerical characteristic of the recurrent or Z-recurrent shapes that determine the structure of that sequence. The recognition problem is considered in two settings: determining whether a sequence belongs to a specific rank of sequences, and determining which group of sequences contains the experimental sequence. The main mathematical difficulty in solving these recognition problems lies in finding differences in the numerical representations of the recurrent and Z-recurrent shapes of experimental sequences. To overcome this difficulty, we construct a spectrum of numerical indicators of recurrent and Z-recurrent shapes. Classification and recognition of sequences are illustrated by an example with three ranks of genetic codes of organisms, each represented by five sequences. The Z-recurrent shape is introduced to define and extend the classification of sequences and to increase the efficiency of recognition methods.
Keywords