Revista Cubana de Ciencias Informáticas (Jan 2015)
Analysis of the use of the GSL and LAPACK libraries in the construction of phylogenetic trees
Abstract
A phylogenetic tree, or evolutionary tree, is a branching diagram or "tree" showing the evolutionary relationships among biological species based on similarities and differences in their physical or genetic characteristics. Several methods are used to construct phylogenetic trees in the study of evolution, including those based on the genetic distance between pairs of DNA or protein sequences, which require a multiple sequence alignment as input. In these methods, the data are presented as a distance matrix obtained from the sequence alignment according to a biological model. The minimum evolution criterion is one such method: the best tree is the one that minimizes the length of the internal branches. A least squares (LS) fitting method can be used to find this tree. The fit estimates the branch lengths for every candidate topology, and the topology that minimizes the difference between the given and the predicted distances is selected. The LS solution is analytic and can be obtained from a system of linear equations. Several mathematical libraries are available for solving such systems. Among them, the GSL library provides a more convenient interface, while the LAPACK library offers superior performance. This paper compares the results obtained with both libraries in an implementation of the LS method for constructing phylogenetic trees.
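The LS step described in the abstract can be illustrated with a minimal sketch in pure Python. Everything specific here is an assumption made for illustration and is not taken from the paper: the four-taxon topology ((A,B),(C,D)), the distance values, and the function names `solve_linear_system`, `least_squares_branch_lengths`, and `residual_sum_of_squares`. The sketch forms the normal equations and solves them by Gaussian elimination with partial pivoting; in a real implementation a library routine such as LAPACK's `dgels` or GSL's `gsl_multifit_linear` would presumably take the place of the hand-written solver, although the abstract does not say which routines the authors used.

```python
# Least-squares branch lengths for one fixed topology, ((A,B),(C,D)).
# Unknowns, in order: terminal branches a, b, c, d and internal branch e.
# Each row of DESIGN marks the branches on the path between one pair of taxa.
DESIGN = [
    [1, 1, 0, 0, 0],  # A-B: a + b
    [1, 0, 1, 0, 1],  # A-C: a + c + e
    [1, 0, 0, 1, 1],  # A-D: a + d + e
    [0, 1, 1, 0, 1],  # B-C: b + c + e
    [0, 1, 0, 1, 1],  # B-D: b + d + e
    [0, 0, 1, 1, 0],  # C-D: c + d
]

# Hypothetical pairwise distances, same pair order as DESIGN (illustrative only).
DISTANCES = [3.0, 9.0, 10.0, 10.0, 11.0, 7.0]


def solve_linear_system(matrix, rhs):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(map(float, row)) + [float(value)] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution


def least_squares_branch_lengths(design, distances):
    """Solve the normal equations (X^T X) b = X^T d for the branch lengths b."""
    rows, cols = len(design), len(design[0])
    xtx = [[sum(design[k][i] * design[k][j] for k in range(rows))
            for j in range(cols)] for i in range(cols)]
    xtd = [sum(design[k][i] * distances[k] for k in range(rows))
           for i in range(cols)]
    return solve_linear_system(xtx, xtd)


def residual_sum_of_squares(design, distances, lengths):
    """Sum of squared differences between predicted and given distances."""
    total = 0.0
    for row, observed in zip(design, distances):
        predicted = sum(x * b for x, b in zip(row, lengths))
        total += (predicted - observed) ** 2
    return total


if __name__ == "__main__":
    lengths = least_squares_branch_lengths(DESIGN, DISTANCES)
    print("branch lengths (a, b, c, d, e):", [round(v, 6) for v in lengths])
    print("residual sum of squares:", residual_sum_of_squares(DESIGN, DISTANCES, lengths))
```

A search over topologies would repeat this fit with a different design matrix for each candidate tree and then compare the fitted trees. Forming the normal equations explicitly keeps the sketch short but is less numerically robust than the QR or SVD based solvers that numerical libraries typically provide.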