International Journal of Psychological Research (Jun 2010)
Robust analysis of the central tendency, simple and multiple regression and ANOVA: a step by step tutorial.
Abstract
After much exertion and care in running an experiment in social science, the results should not be ruined by an improper analysis. Classical methods, such as the mean, the usual simple and multiple linear regressions, and the ANOVA, often require normality and the absence of outliers, conditions that are rarely met in experimental data. To mitigate this problem, researchers often use ad hoc methods such as the detection and deletion of outliers. In this tutorial, we show the shortcomings of such an approach. In particular, we show that outliers can sometimes be very difficult to detect and that the full inferential procedure is somewhat distorted by this practice. A more appropriate and modern approach is to use a robust procedure that provides estimation, inference and testing that are not influenced by outlying observations but correctly describe the structure of the bulk of the data. Such a procedure can also provide a diagnostic of the distance of any point or subject relative to the central tendency. Robust procedures can also be viewed as methods to check the appropriateness of the classical methods. To provide a step-by-step tutorial, we first present descriptive analyses that allow researchers to make an initial check on whether the data meet the conditions of application of the classical methods. Next, we compare classical and robust alternatives to ANOVA and regression and discuss their advantages and disadvantages. Finally, we present indices and plots that are based on the residuals of the analysis and can be used to determine whether the conditions of application of the analyses are respected. Examples using data from psychological research illustrate each of these points, and for each analysis and plot, R code is provided so that readers can apply the techniques presented throughout the article.
Keywords