Problemi Ekonomiki (Mar 2019)
Ordinary Least Squares: the Adequacy of Linear Regression Solutions under Multicollinearity and without it
Abstract
The article addresses the economic adequacy of linear regression solutions obtained by the OLS method. The study uses the following definition of adequacy: a linear regression solution is considered adequate if it not only has correct signs but also correctly reflects the relationship between the regression coefficients in the population. If, in addition, the coefficient of determination is greater than 0.8, the solution is considered economically adequate. As an indicator of the adequacy of a linear regression solution, a 10% level of the coefficient of variability (CV) of the regression coefficients is proposed. It is shown that OLS solutions may be inadequate relative to the solution in the population even when they are physically correct (have correct signs) and statistically significant. This result is obtained using the artificial data population (ADP) algorithm. The ADP generates data of any size with known regression coefficients in the whole population, which can be calculated from the OLS solution for a very large sample. The ADP algorithm makes it possible to change the regular component of the influence of the regressors on the response. In addition, the random changes of the regressors in the ADP are divided into two parts: the first is coherent with the changes in the response, while the second is completely random (incoherent). This division makes it possible to change the near-collinearity level of the data by changing the variance of the incoherent noise in the regressors. Studies using the ADP have shown that, with high probability, the OLS solutions are physically incorrect if the sample size (n) is less than 23, and physically correct but not adequate for 23 < n < 400. Furthermore, it is noted that if the elimination of strongly correlated regressors is not economically justified but serves only to lower the variance inflation factor (VIF), the results may be far from reality.
In this regard, it is stated that using the MOLS eliminates the need to exclude strongly correlated regressors altogether, since the accuracy of the MOLS solution increases as the VIF increases.
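The simulation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' ADP algorithm, whose details the abstract does not give. It assumes a simplified data generator in which two regressors share a common (coherent) component plus independent (incoherent) noise. The function names, the true slopes (1.0 and 2.0), the noise levels, and the number of replications are all hypothetical choices made for illustration. The sketch estimates the CV of the OLS slopes over repeated samples and the VIF for the two-regressor case.

```python
import random
import statistics


def make_sample(n, incoherent_sd, rng, betas=(1.0, 2.0), noise_sd=1.0):
    """Draw n rows of (x1, x2, y) from an assumed, simplified population.

    Both regressors share a common component; each also gets independent
    noise with standard deviation incoherent_sd. A smaller incoherent_sd
    makes x1 and x2 more nearly collinear.
    """
    x1, x2, y = [], [], []
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)                # coherent part
        a = shared + rng.gauss(0.0, incoherent_sd)  # incoherent part of x1
        b = shared + rng.gauss(0.0, incoherent_sd)  # incoherent part of x2
        x1.append(a)
        x2.append(b)
        y.append(betas[0] * a + betas[1] * b + rng.gauss(0.0, noise_sd))
    return x1, x2, y


def _centered_sums(x1, x2):
    """Centered sums of squares and cross-products of the two regressors."""
    m1, m2 = statistics.fmean(x1), statistics.fmean(x2)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    return m1, m2, s11, s22, s12


def ols_two_regressors(x1, x2, y):
    """OLS slopes of y ~ 1 + x1 + x2, solved from the centered normal equations."""
    m1, m2, s11, s22, s12 = _centered_sums(x1, x2)
    my = statistics.fmean(y)
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


def vif_two_regressors(x1, x2):
    """With exactly two regressors, VIF = 1 / (1 - r^2), r being their correlation."""
    _, _, s11, s22, s12 = _centered_sums(x1, x2)
    r_squared = s12 ** 2 / (s11 * s22)
    return 1.0 / (1.0 - r_squared)


def coefficient_cv(n, incoherent_sd, replications=200, seed=0):
    """CV (standard deviation / |mean|) of each OLS slope over repeated samples."""
    rng = random.Random(seed)
    estimates = [
        ols_two_regressors(*make_sample(n, incoherent_sd, rng))
        for _ in range(replications)
    ]
    return tuple(
        statistics.stdev(column) / abs(statistics.fmean(column))
        for column in zip(*estimates)
    )
```

For example, comparing `coefficient_cv(20, 0.3)` with `coefficient_cv(400, 0.3)` shows how the variability of the estimated slopes shrinks as the sample grows, and the result can be checked against the 10% CV level proposed in the abstract. Calling `vif_two_regressors` on samples generated with different `incoherent_sd` values shows how the incoherent noise variance controls near-collinearity in this assumed setup.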
Keywords