Axioms (Jun 2024)

Analysis of Fat Big Data Using Factor Models and Penalization Techniques: A Monte Carlo Simulation and Application

  • Faridoon Khan,
  • Olayan Albalawi

DOI
https://doi.org/10.3390/axioms13070418
Journal volume & issue
Vol. 13, no. 7
p. 418

Abstract

This article assesses the predictive accuracy of factor models based on Partial Least Squares (PLS) and Principal Component Analysis (PCA) in comparison with Autometrics and penalization techniques. The simulation exercise examines three scenarios by introducing multicollinearity, heteroscedasticity, and autocorrelation. The number of predictors and the sample size are varied to observe their effects. Model accuracy is evaluated using the Root Mean Square Error (RMSE) and the Mean Absolute Error (MAE). In the presence of severe multicollinearity, the PLS-based factor approach performs exceptionally well relative to the competing methods. Autometrics achieves the lowest RMSE and MAE values across all levels of heteroscedasticity. Autometrics also provides better forecasts under low and moderate autocorrelation, whereas Elastic Smoothly Clipped Absolute Deviation (E-SCAD) forecasts well under severe autocorrelation. In addition to the simulation, we apply the methods to a popular Pakistani macroeconomic dataset containing 79 monthly variables from January 2013 to December 2020. On this dataset the competing approaches perform differently than on the simulated data; the PLS factor approach outperforms its competitors in forecasting, with lower RMSE and MAE. This suggests that the actual dataset exhibits a high degree of multicollinearity.
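
For readers unfamiliar with the two accuracy measures named above, the following is a minimal sketch of how RMSE and MAE are commonly computed from forecasts and realized values. It uses the standard textbook definitions; the function names and the toy numbers are illustrative assumptions and are not taken from the article or its data.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared forecast error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute forecast errors."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute_errors) / len(absolute_errors)


# Hypothetical toy values, for illustration only (not from the article).
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [1.0, 5.0, 6.0, 10.0]

print("RMSE:", rmse(actual, predicted))
print("MAE:", mae(actual, predicted))
```

Because RMSE squares each error before averaging, it penalizes large forecast misses more heavily than MAE does, which is one reason the two measures are often reported together.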

Keywords