Scientific Data (Aug 2023)

A natural language fMRI dataset for voxelwise encoding models

  • Amanda LeBel,
  • Lauren Wagner,
  • Shailee Jain,
  • Aneesh Adhikari-Desai,
  • Bhavin Gupta,
  • Allyson Morgenthal,
  • Jerry Tang,
  • Lixiang Xu,
  • Alexander G. Huth

DOI
https://doi.org/10.1038/s41597-023-02437-z
Journal volume & issue
Vol. 10, no. 1
pp. 1–12

Abstract

Speech comprehension is a complex process that draws on humans’ abilities to extract lexical information, parse syntax, and form semantic understanding. These sub-processes have traditionally been studied using separate neuroimaging experiments that attempt to isolate specific effects of interest. More recently, it has become possible to study all stages of language comprehension in a single neuroimaging experiment using narrative natural language stimuli. The resulting data are richly varied at every level, enabling analyses that can probe everything from spectral representations to high-level representations of semantic meaning. We provide a dataset containing BOLD fMRI responses recorded while 8 participants each listened to 27 complete, natural, narrative stories (~6 hours). This dataset includes pre-processed and raw MRIs, as well as hand-constructed 3D cortical surfaces for each participant. To address the challenges of analyzing naturalistic data, this dataset is accompanied by a Python library containing basic code for creating voxelwise encoding models. Altogether, this dataset provides a large and novel resource for understanding speech and language processing in the human brain.
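For readers unfamiliar with the term, a voxelwise encoding model predicts each voxel's response from features of the stimulus. The sketch below illustrates the general idea with closed-form ridge regression on synthetic data. It does not use the dataset's accompanying library or its API; the function names `fit_ridge` and `voxelwise_correlation`, the array sizes, and the regularization value are all illustrative assumptions.

```python
import numpy as np

def fit_ridge(X, Y, alpha):
    """Closed-form ridge regression: one weight column per voxel (hypothetical helper)."""
    n_features = X.shape[1]
    # Solve (X^T X + alpha * I) W = X^T Y for all voxels at once.
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def voxelwise_correlation(Y_true, Y_pred):
    """Pearson correlation between measured and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom

# Synthetic stand-ins (assumed sizes): X holds stimulus features per time point,
# Y holds simulated responses per time point and voxel.
rng = np.random.default_rng(0)
n_train, n_test, n_features, n_voxels = 300, 100, 10, 5
true_weights = rng.standard_normal((n_features, n_voxels))
X_train = rng.standard_normal((n_train, n_features))
X_test = rng.standard_normal((n_test, n_features))
Y_train = X_train @ true_weights + 0.5 * rng.standard_normal((n_train, n_voxels))
Y_test = X_test @ true_weights + 0.5 * rng.standard_normal((n_test, n_voxels))

# Fit on training data, then score predictions on held-out data.
weights = fit_ridge(X_train, Y_train, alpha=1.0)
scores = voxelwise_correlation(Y_test, X_test @ weights)
print(scores.shape)
```

This sketch fixes the regularization strength and ignores the delay of the hemodynamic response; an analysis of real fMRI data would typically select the regularization by cross-validation and model that delay explicitly.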