Mathematics (May 2022)
Heterogeneous Overdispersed Count Data Regressions via Double-Penalized Estimations
Abstract
Recently, the high-dimensional negative binomial regression (NBR) for count data has been widely used in many scientific fields. However, most studies assumed the dispersion parameter as a constant, which may not be satisfied in practice. This paper studies the variable selection and dispersion estimation for the heterogeneous NBR models, which model the dispersion parameter as a function. Specifically, we proposed a double regression and applied a double ℓ1-penalty to both regressions. Under the restricted eigenvalue conditions, we prove the oracle inequalities for the lasso estimators of two partial regression coefficients for the first time, using concentration inequalities of empirical processes. Furthermore, derived from the oracle inequalities, the consistency and convergence rate for the estimators are the theoretical guarantees for further statistical inference. Finally, both simulations and a real data analysis demonstrate that the new methods are effective.
Keywords