Technological and data sharing advances have led to a proliferation of high-resolution structural and functional maps of the brain. Modern neuroimaging research increasingly depends on identifying correspondences between the topographies of these maps; however, most standard methods for statistical inference fail to account for their spatial properties. Recently, multiple methods have been developed to generate null distributions that preserve the spatial autocorrelation of brain maps and yield more accurate statistical estimates. Here, we comprehensively assess the performance of ten published null frameworks in statistical analyses of neuroimaging data. To test the efficacy of these frameworks in situations with a known ground truth, we first apply them to a series of controlled simulations and examine the impact of data resolution and spatial autocorrelation on their family-wise error rates. Next, we use each framework with two empirical neuroimaging datasets, investigating their performance when testing (1) the correspondence between brain maps (e.g., correlating two activation maps) and (2) the spatial distribution of a feature within a partition (e.g., quantifying the specificity of an activation map within an intrinsic functional network). Finally, we investigate how differences in the implementation of these null models may impact their performance. In agreement with previous reports, we find that naive null models that do not preserve spatial autocorrelation consistently yield elevated false positive rates and unrealistically liberal statistical estimates. While spatially constrained null models yield more realistic, conservative estimates, even these frameworks suffer from inflated false positive rates and variable performance across analyses. Throughout our results, we observe minimal impact of parcellation and resolution on null model performance.
Altogether, our findings highlight the need for continued development of statistically rigorous methods for comparing brain maps. The present report provides a harmonised framework for benchmarking and comparing future advancements.