Cross-validation and permutations in MVPA: Validity of permutation strategies and power of cross-validation schemes
Giancarlo Valente,
Agustin Lage Castellanos,
Lars Hausfeld,
Federico De Martino,
Elia Formisano
Affiliations
Giancarlo Valente
Maastricht University, Department of Cognitive Neuroscience, The Netherlands; Maastricht Brain Imaging Center, M-Bic, The Netherlands; Corresponding author at: Faculty of Psychology and Neuroscience, Department of Cognitive Neurosciences, Maastricht University, P.O. Box 616, 6200 MD Maastricht, The Netherlands.
Agustin Lage Castellanos
Maastricht University, Department of Cognitive Neuroscience, The Netherlands; Maastricht Brain Imaging Center, M-Bic, The Netherlands; Department of NeuroInformatics, Cuban Center for Neuroscience, Cuba
Lars Hausfeld
Maastricht University, Department of Cognitive Neuroscience, The Netherlands; Maastricht Brain Imaging Center, M-Bic, The Netherlands
Federico De Martino
Maastricht University, Department of Cognitive Neuroscience, The Netherlands; Maastricht Brain Imaging Center, M-Bic, The Netherlands; Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, MN, United States
Elia Formisano
Maastricht University, Department of Cognitive Neuroscience, The Netherlands; Maastricht Brain Imaging Center, M-Bic, The Netherlands; Maastricht Center for Systems Biology (MaCSBio), The Netherlands
Abstract
Multi-Voxel Pattern Analysis (MVPA) is a well-established tool for disclosing weak, distributed effects in brain activity patterns. The generalization ability of a learning model is assessed by testing it on new, unseen data; however, when only limited data are available, decoding success is estimated using cross-validation. There is general consensus that the statistical significance of cross-validated accuracy should be assessed with non-parametric permutation tests. In this work, we focus on the false-positive control of different permutation strategies and on the statistical power of different cross-validation schemes. With simulations, we show that estimating the entire cross-validation error on each permuted dataset is the only statistically valid permutation strategy. Furthermore, using both simulations and real data from the HCP WU-Minn 3T fMRI dataset, we show that, among the different cross-validation schemes, repeated split-half cross-validation is the most powerful, despite achieving slightly lower classification accuracy than other schemes. Our findings provide additional insights into optimizing the experimental design for MVPA, highlighting the benefits of many short runs.
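The permutation strategy the abstract identifies as valid — re-estimating the entire cross-validation error on each permuted dataset, rather than permuting within a fixed fit — can be sketched as follows. This is a minimal NumPy illustration on synthetic data with a nearest-centroid classifier; the data sizes, effect size, classifier, and fold scheme are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: 40 trial patterns over 20 voxels, two balanced classes,
# with a weak distributed effect added to class 1 (all sizes are assumptions).
n_trials, n_voxels = 40, 20
X = rng.normal(size=(n_trials, n_voxels))
y = np.repeat([0, 1], n_trials // 2)
X[y == 1] += 0.3

def cv_accuracy(X, y, n_folds=4):
    """K-fold cross-validated accuracy of a nearest-centroid classifier."""
    idx = rng.permutation(len(y))
    correct = 0
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        # Class centroids estimated on the training folds only.
        centroids = np.stack([X[train][y[train] == c].mean(axis=0)
                              for c in (0, 1)])
        # Assign each test pattern to the nearest centroid.
        dists = np.linalg.norm(X[test][:, None, :] - centroids[None, :, :],
                               axis=2)
        correct += np.sum(dists.argmin(axis=1) == y[test])
    return correct / len(y)

observed = cv_accuracy(X, y)

# Valid permutation strategy: permute the labels, then redo the FULL
# cross-validation loop on each permuted dataset to build the null.
null = np.array([cv_accuracy(X, rng.permutation(y)) for _ in range(200)])
p_value = (1 + np.sum(null >= observed)) / (1 + len(null))
```

The key point is that `cv_accuracy` is called afresh for every permuted labeling, so the null distribution reflects the same cross-validation procedure that produced the observed accuracy.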