NeuroImage (Aug 2023)
Generalizing prediction of task-evoked brain activity across datasets and populations
Abstract
Predictions of task-based functional magnetic resonance imaging (fMRI) from task-free resting-state (rs) fMRI have gained popularity over the past decade. This method holds great promise for studying individual variability in brain function without the need to perform highly demanding tasks. However, to be broadly used, prediction models must be shown to generalize beyond the dataset they were trained on. In this work, we test the generalizability of prediction of task-fMRI from rs-fMRI across sites, MRI vendors and age groups. Moreover, we investigate the data requirements for successful prediction. We use the Human Connectome Project (HCP) dataset to explore how different combinations of training sample size and number of fMRI timepoints affect prediction success in various cognitive tasks. We then apply models trained on HCP data to predict brain activations in data from a different site, a different MRI vendor (Philips vs. Siemens scanners) and a different age group (children from the HCP-Development project). We demonstrate that, depending on the task, a training set of approximately 20 participants with 100 fMRI timepoints each yields the largest gain in model performance. Nevertheless, further increasing sample size and number of timepoints results in significantly improved predictions, until reaching approximately 450–600 training participants and 800–1000 timepoints. Overall, the number of fMRI timepoints influences prediction success more than the sample size. We further show that models trained on adequate amounts of data successfully generalize across sites, vendors and age groups and provide predictions that are both accurate and individual-specific. These findings suggest that large-scale publicly available datasets may be utilized to study brain function in smaller, unique samples.