NeuroImage (May 2020)
Test-retest reliability and sample size estimates after MRI scanner relocation
Abstract
Objective: Many factors can contribute to the reliability and robustness of MRI-derived metrics. In this study, we assessed the reliability and reproducibility of three MRI modalities after an MRI scanner was relocated to a new hospital facility.
Methods: Twenty healthy volunteers (12 females; mean age (standard deviation) = 41 (11) years; age range 25–66 years) completed three MRI sessions. The first session (S1) took place one week before the 3T GE HDxt scanner was relocated. The second (S2) took place nine weeks after S1 at the new location, and the third (S3) four weeks after S2. At each session, we acquired structural T1-weighted, pseudo-continuous arterial spin labelled, and diffusion tensor imaging sequences. We used longitudinal processing streams to create 12 summary MRI metrics, including total gray matter (GM), cortical GM, subcortical GM, white matter (WM), and lateral ventricle volume; mean cortical thickness; total surface area; average gray matter perfusion; and average diffusion tensor metrics along principal white matter pathways. We compared mean MRI values and variance at the old scanner location to multiple sessions at the new location using Bayesian multi-level regression models. K-fold cross-validation allowed identification of important predictors. Whole-brain analyses were used to investigate any regional differences. Furthermore, we calculated the within-subject coefficient of variation (wsCV), intraclass correlation coefficient (ICC), and Dice similarity index (SI) of cortical segmentations across scanner relocation and within-site. Additionally, we estimated the sample sizes required to robustly detect a 4% difference between two groups across MRI metrics.
Results: All global MRI metrics exhibited little mean difference and, with the exception of cortical gray matter perfusion, small variability, both across scanner relocation and at within-site repeat. T1- and DTI-derived tissue metrics showed ICC > 0.80 and wsCV < 1.4%. Mean cortical gray matter perfusion had the highest between-session variability (6.7% [0.3, 16.7], estimate [95% uncertainty interval]), and hence the smallest ICC (0.71 [0.44, 0.92]) and largest wsCV (13.4% [5.4, 18.1]). No global metric exhibited evidence of a meaningful mean difference between scanner locations. However, surface area showed evidence of a mean difference at within-site repeat (between S2 and S3). Whole-brain analyses revealed no significant areas of difference either across scanner relocation or within-site. For all metrics, we found no support for a systematic difference in variance across scanner relocation compared to within-site test-retest variability. The sample sizes needed to detect a 4% difference between two independent groups ranged from n = 362 per group for cortical gray matter perfusion, through n = 114 for total gray matter volume, n = 23 for average fractional anisotropy, and n = 19 for total gray matter volume normalized by intracranial volume, to n = 3 per group for axial diffusivity.
Conclusion: Cortical gray matter perfusion was the most variable metric investigated, necessitating large sample sizes to identify group differences; the other metrics showed substantially less variability. Scanner relocation appeared to have a negligible effect on the variability of the global MRI metrics tested. We report within-site test-retest variability as a resource for calculating sample sizes in future investigations. Our results suggest that when all other parameters are held constant (e.g., sequence parameters and MRI processing), the effect of scanner relocation is indistinguishable from within-site variability, but may need to be considered depending on the question being investigated.
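To make the summary statistics named above concrete, the following is a minimal Python sketch of three of them: a root-mean-square within-subject coefficient of variation from paired sessions, a Dice similarity index between two segmentations, and a per-group sample size for a two-group comparison. These are common textbook definitions and are assumptions on our part; the abstract does not specify the exact estimators, and the study's own estimates came from Bayesian multi-level models, so this sketch is not expected to reproduce the reported values. The function names (`ws_cv`, `dice`, `n_per_group`) and the defaults of two-sided alpha = 0.05 and power = 0.80 are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist


def ws_cv(pairs):
    """Root-mean-square within-subject CV (%) from (session_a, session_b) pairs.

    Assumed definition: for each subject, the within-subject variance of two
    measurements is (a - b)^2 / 2; it is divided by the squared subject mean,
    averaged over subjects, and square-rooted.
    """
    squared_cvs = []
    for a, b in pairs:
        subject_mean = (a + b) / 2
        subject_var = (a - b) ** 2 / 2
        squared_cvs.append(subject_var / subject_mean ** 2)
    return 100 * math.sqrt(sum(squared_cvs) / len(squared_cvs))


def dice(voxels_a, voxels_b):
    """Dice similarity index, 2|A and B| / (|A| + |B|), for two sets of voxel indices."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0  # two empty segmentations are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def n_per_group(cv_percent, diff_percent=4.0, alpha=0.05, power=0.80):
    """Per-group n to detect a relative mean difference between two independent groups.

    Assumed normal-approximation formula:
        n = 2 * (z_{1-alpha/2} + z_{power})^2 * (CV / difference)^2
    where CV is the coefficient of variation of the metric across subjects.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 * (cv_percent / diff_percent) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Illustrative inputs only; these are not data from the study.
    print(ws_cv([(99.0, 101.0), (100.0, 100.0)]))
    print(dice({1, 2, 3}, {2, 3, 4}))
    print(n_per_group(cv_percent=10.0))
```

Under this approximation the required sample size grows with the square of the ratio of variability to effect size, which is consistent with the most variable metric (cortical gray matter perfusion) requiring by far the largest groups.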