Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki (Jan 2016)
ROBUST MODIFICATION OF THE LASSO METHOD FOR GENOME-WIDE ASSOCIATION STUDY IN VIEW OF TARGET PHENOTYPE VALUES
Abstract
We propose a modification of the Lasso method for genome-wide association studies, illustrated on doubled haploid lines of barley, that takes into account additional information about target values of the phenotype, where the phenotype is defined by some plant trait. From a statistical point of view, the problem is one of linear regression. The additional information is formalized as the intersection of two sets of weights assigned to the elements of the training set. The first set of weights is produced by the interval contamination model. The second set is formed from pairwise comparisons of phenotype values. The resulting intersection is convex and is completely determined by its extreme points. This property makes it possible to reduce the Lasso method with sets of weights to a finite number of standard Lasso problems. Numerical experiments show that the modification yields better accuracy measures than the standard Lasso when the training set is small.
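The reduction to a finite number of standard Lasso problems can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the weight set is a plain epsilon-contamination neighbourhood of the uniform weights (the set built from pairwise comparisons is omitted), the final estimate is chosen by its worst-case weighted loss over the extreme points, and all function names (`lasso_cd`, `contamination_vertices`, `robust_lasso`) are hypothetical. The sketch relies on two facts: a weighted squared loss equals an ordinary squared loss after scaling each row by the square root of its weight, and a loss that is linear in the weights attains its maximum over a convex polytope at an extreme point.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Standard Lasso, min 0.5*||y - Xb||^2 + lam*||b||_1, by coordinate descent."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    b = np.zeros(X.shape[1])
    col_sq = (X ** 2).sum(axis=0)
    r = y.copy()  # residual y - Xb for the initial b = 0
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes feature j.
            rho = X[:, j] @ r + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            r += X[:, j] * (b[j] - new_bj)
            b[j] = new_bj
    return b


def weighted_loss(X, y, b, w):
    """Weighted squared loss; linear in the weight vector w."""
    return 0.5 * np.sum(w * (y - X @ b) ** 2)


def contamination_vertices(n, eps):
    """Extreme points of {(1 - eps)*u + eps*q}, u uniform, q any point of the simplex.

    Assumption: this stands in for the paper's weight set, which is an
    intersection of two sets and is not reproduced here.
    """
    base = np.full(n, (1.0 - eps) / n)
    return [base + eps * np.eye(n)[k] for k in range(n)]


def robust_lasso(X, y, vertices, lam):
    """Fit one standard Lasso per extreme point, keep the best worst-case fit."""
    candidates = []
    for w in vertices:
        s = np.sqrt(w)
        # Row scaling by sqrt(w) turns the weighted problem into a standard one.
        candidates.append(lasso_cd(X * s[:, None], y * s, lam))

    def worst_case(b):
        # Assumed selection rule: maximum weighted loss over all extreme points.
        return max(weighted_loss(X, y, b, w) for w in vertices)

    return min(candidates, key=worst_case)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(8, 3))
    y = X @ np.array([2.0, 0.0, -1.0])
    print(robust_lasso(X, y, contamination_vertices(8, 0.2), lam=0.01))
```

With n training examples this sketch solves n standard Lasso problems, one per extreme point, which is affordable precisely in the small-sample setting the abstract targets.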
Keywords