Axioms (Oct 2022)
MEM and MEM4PP: New Tools Supporting the Parallel Generation of Critical Metrics in the Evaluation of Statistical Models
Abstract
This paper describes MEM and MEM4PP, two new Stata commands. They support the automatic reporting and selection of the best regression and classification models by adding supplemental performance metrics based on statistical post-estimation and custom computation. In particular, MEM reports the maximum acceptable variance inflation factor (maxAcceptVIF) together with the maximum computed variance inflation factor (maxComputVIF) for ordinary least squares (OLS) regression, the maximum absolute value of the correlation coefficients in the predictors’ correlation matrix (maxAbsVPMCC), the area under the receiver operating characteristic curve (AUC-ROC), the p-value and chi-squared statistic of the goodness-of-fit (GOF) test for logit and probit models, and the maximum probability thresholds (maxProbNlogPenultThrsh and maxProbNlogLastThrsh) from Zlotnik and Abraira risk-prediction nomograms (nomolog) for logistic regressions. When run after most regression commands, MEM also identifies the list of variables automatically. After simple successive invocations of MEM (in a .do file acting as a batch file), the collected results are displayed in the console or exported to specially designated files (one .csv for all models in a batch). MEM4PP is the parallel-processing version of MEM. It starts from the same batch (the same .do file, with its path provided as a parameter) and launches separate Stata instances that generate the same results in parallel (one .csv for each model in a batch). The paper also includes examples using real-world data from the World Values Survey (data covering 1981 to 2020, version 1.6). These examples show how MEM and MEM4PP support testing predictor independence, checking for reverse causality, selecting the best model based on these metrics, and, ultimately, replicating all of these steps.
Keywords