Mathematics (Nov 2023)

Efficient Estimation and Validation of Shrinkage Estimators in Big Data Analytics

  • Salomi du Plessis,
  • Mohammad Arashi,
  • Gaonyalelwe Maribe,
  • Salomon M. Millard

DOI: https://doi.org/10.3390/math11224632
Journal volume & issue: Vol. 11, no. 22, p. 4632

Abstract

Shrinkage estimators are often used to mitigate the consequences of multicollinearity in linear regression models. Despite the ease with which these techniques can be applied to small- or moderate-size datasets, they encounter significant challenges in the big data domain. Two of these challenges are that the volume of data often exceeds the storage capacity of a single computer and that the time required to obtain results becomes infeasible due to the computational burden of a high volume of data. We propose an algorithm for the efficient model estimation and validation of various well-known shrinkage estimators to be used in scenarios where the volume of the data is large. Our proposed algorithm utilises sufficient statistics that can be computed and updated at the row level, thus minimising access to the entire dataset. A simulation study, as well as an application to a real-world dataset, illustrates the efficiency of the proposed approach.
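
The idea of row-level sufficient statistics can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the ordinary ridge estimator as the shrinkage estimator, assumes the data arrive as an iterable of (X, y) chunks, and uses hypothetical function names (accumulate_sufficient_stats, ridge_from_stats, sse_from_stats) chosen only for this example. The quantities X'X, X'y and y'y are additive over rows, so they can be updated chunk by chunk without holding the full dataset in memory.

```python
import numpy as np


def accumulate_sufficient_stats(chunks, p):
    """Accumulate X'X, X'y, y'y and the row count over (X, y) chunks.

    Hypothetical helper: each chunk is a pair (X, y) with X of shape
    (rows, p) and y of shape (rows,). Only p x p and length-p summaries
    are kept, never the raw rows.
    """
    xtx = np.zeros((p, p))
    xty = np.zeros(p)
    yty = 0.0
    n = 0
    for X, y in chunks:
        xtx += X.T @ X          # additive over rows
        xty += X.T @ y
        yty += float(y @ y)
        n += X.shape[0]
    return xtx, xty, yty, n


def ridge_from_stats(xtx, xty, k):
    """Ridge estimate (X'X + k I)^{-1} X'y from the accumulated statistics."""
    p = xtx.shape[0]
    return np.linalg.solve(xtx + k * np.eye(p), xty)


def sse_from_stats(xtx, xty, yty, beta):
    """Residual sum of squares ||y - X beta||^2 without revisiting the data.

    Uses the expansion y'y - 2 beta'X'y + beta'X'X beta, so the same
    statistics computed on a held-out set can be used for validation.
    """
    return yty - 2.0 * (beta @ xty) + beta @ xtx @ beta
```

Because the statistics are sums over rows, trying several values of the ridge parameter k, or scoring a fitted model on validation data summarised in the same way, requires only small matrix operations and no further pass over the dataset.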

Keywords