IEEE Access (Jan 2020)

Large-Scale Expectile Regression With Covariates Missing at Random

  • Yingli Pan
  • Zhan Liu
  • Wen Cai

DOI
https://doi.org/10.1109/ACCESS.2020.2970741
Journal volume & issue
Vol. 8
pp. 36502 – 36513

Abstract

Analyzing large volumes of data is difficult not only because the data are often highly skewed and heteroscedastic but also because some values are missing. Expectile regression is a popular alternative for analyzing heterogeneous data. In this paper, we fit a linear expectile regression model to estimate conditional expectiles from a large data set with covariates missing at random. We construct a communication-efficient surrogate loss (CSL) function to estimate the model parameters and establish the asymptotic normality of the proposed estimator. A proximal alternating direction method of multipliers (ADMM) algorithm is developed for distributed statistical optimization over the data. Simulation studies assess the finite-sample performance of the proposed method, and survey data from the Behavioral Risk Factor Surveillance System (BRFSS) are used to demonstrate its utility in practice.
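
For readers unfamiliar with expectile regression, the following minimal sketch may help. It is not the CSL estimator or the proximal ADMM algorithm of the paper, and it ignores missing covariates entirely: it only fits a linear expectile on complete, single-machine data by iteratively reweighted least squares, using the standard asymmetric squared loss. The function names, the simulated data, and the tolerance settings are all assumptions made for illustration.

```python
import numpy as np


def expectile_loss(residuals, tau):
    """Asymmetric squared loss: weight tau if r >= 0, else (1 - tau)."""
    weights = np.where(residuals >= 0, tau, 1.0 - tau)
    return weights * residuals ** 2


def fit_linear_expectile(X, y, tau, max_iter=100, tol=1e-10):
    """Fit a linear tau-expectile by iteratively reweighted least squares.

    Illustrative only: complete data, no distribution across machines.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from the OLS fit
    for _ in range(max_iter):
        r = y - X @ beta
        w = np.where(r >= 0, tau, 1.0 - tau)  # weights from current residual signs
        Xw = X * w[:, None]
        new_beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)  # weighted normal equations
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


# Hypothetical heteroscedastic data: the noise scale grows with x.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=500)
y = 1.0 + 2.0 * x + (0.5 + x) * rng.standard_normal(500)
X = np.column_stack([np.ones_like(x), x])

for tau in (0.1, 0.5, 0.9):
    print(tau, fit_linear_expectile(X, y, tau))
```

At tau = 0.5 the loss is symmetric and the fit coincides with ordinary least squares; other values of tau weight positive and negative residuals unequally, which is what lets expectiles describe the spread of heteroscedastic data.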

Keywords