IEEE Access (Jan 2021)
Sparse Nonnegative Interaction Models
Abstract
Non-negative least squares regression (NLS) is a constrained least squares problem in which the coefficients are restricted to be non-negative. It is useful for modeling non-negative responses such as time measurements, count data, and histograms. Existing NLS solvers are designed for cases where the predictor and response variables are linearly related, and they do not consider interactions among predictor variables. In this paper, we solve NLS over the complete space of the power set of the variables. Such an extension is particularly useful in biology, for modeling genetic associations. Our new algorithms solve NLS problems exactly while reducing the computational burden by using an active set method. The algorithm proceeds iteratively: a branch-and-bound subroutine searches for an optimal interaction term, and the terms found are added to the solution set one after another. The resulting large search space is efficiently restricted by novel pruning conditions and two kinds of sparsity-promoting regularization: the $l_{1}$ norm and non-negativity constraints. In computational experiments using HIV-1 datasets, 99% of the search space was safely pruned without losing the optimal variables. In mutagenicity datasets, the proposed method was able to identify longer and more accurate patterns than the original NLS. Code is available from https://github.com/afiveithree/inlars.
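To make the problem setting concrete, the following is a minimal sketch of non-negative, $l_{1}$-penalized least squares over interaction terms. It is not the algorithm proposed in the paper: it enumerates every variable subset up to a small order by brute force instead of using branch-and-bound with pruning, and it uses plain coordinate descent instead of an active set method. The function names `interaction_features` and `nonneg_lasso`, the parameter `max_order`, and the toy data are all assumptions made for illustration; an interaction term is assumed to be the elementwise product of binary predictor columns.

```python
from itertools import combinations


def interaction_features(X, max_order):
    """Map each non-empty variable subset (up to max_order) to its column.

    The column for a subset is the elementwise product of its variables
    (an assumed encoding of an interaction term).
    """
    n_vars = len(X[0])
    feats = {}
    for order in range(1, max_order + 1):
        for subset in combinations(range(n_vars), order):
            col = []
            for row in X:
                value = 1.0
                for j in subset:
                    value *= row[j]
                col.append(value)
            feats[subset] = col
    return feats


def nonneg_lasso(feats, y, lam=0.0, n_iter=500):
    """Coordinate descent for
    min_w 0.5 * ||y - sum_S w_S x_S||^2 + lam * sum_S w_S  subject to w >= 0.
    """
    w = {s: 0.0 for s in feats}
    resid = list(y)  # residual y - sum_S w_S x_S, kept up to date
    for _ in range(n_iter):
        for s, col in feats.items():
            norm2 = sum(c * c for c in col)
            if norm2 == 0.0:
                continue  # an all-zero column can never reduce the loss
            # Correlation of the column with the residual that excludes term s.
            rho = sum(c * (r + w[s] * c) for c, r in zip(col, resid))
            # Closed-form one-dimensional minimizer, clipped at zero.
            new_w = max(0.0, (rho - lam) / norm2)
            delta = new_w - w[s]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                w[s] = new_w
    return w


# Toy data (hypothetical): three binary predictors, and a response driven by
# the pairwise interaction x0*x1 plus the single variable x2.
X = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)]
y = [2.0 * row[0] * row[1] + 1.0 * row[2] for row in X]

weights = nonneg_lasso(interaction_features(X, max_order=2), y, lam=0.0)
for subset, value in sorted(weights.items()):
    if value > 1e-6:
        print(subset, round(value, 3))
```

The brute-force enumeration grows combinatorially with the number of variables and the interaction order, which is the cost that pruning of the search space is meant to avoid.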
Keywords