Communications Biology (Aug 2024)

Crossmodal hierarchical predictive coding for audiovisual sequences in the human brain

  • Yiyuan Teresa Huang,
  • Chien-Te Wu,
  • Yi-Xin Miranda Fang,
  • Chin-Kun Fu,
  • Shinsuke Koike,
  • Zenas C. Chao

DOI
https://doi.org/10.1038/s42003-024-06677-6
Journal volume & issue
Vol. 7, no. 1
pp. 1–15

Abstract


Predictive coding theory suggests that the brain anticipates sensory information using prior knowledge. While this theory has been extensively studied within individual sensory modalities, evidence for predictive processing across sensory modalities is limited. Here, we examine how crossmodal knowledge is represented and learned in the brain by identifying the hierarchical networks underlying crossmodal predictions, in which information from one sensory modality leads to a prediction in another modality. We record electroencephalography (EEG) during a crossmodal audiovisual local-global oddball paradigm, in which the predictability of transitions between tones and images is manipulated at both the stimulus and sequence levels. To dissect the complex predictive signals in our EEG data, we employ a model-fitting approach to untangle neural interactions across modalities and hierarchies. The model-fitting results demonstrate that audiovisual integration occurs at the levels of both individual stimulus interactions and multi-stimulus sequences. Furthermore, we identify the spatio-spectro-temporal signatures of prediction-error signals across hierarchies and modalities, and reveal that auditory and visual prediction errors are rapidly redirected to the central-parietal electrodes during learning through alpha-band interactions. Our study suggests a crossmodal predictive coding mechanism in which unimodal predictions are processed by distributed brain networks to form crossmodal knowledge.
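
For readers unfamiliar with the design, the Python sketch below illustrates how stimulus-level (local) and sequence-level (global) predictability can be manipulated in a local-global oddball paradigm with audiovisual pairs. It is a minimal, hypothetical illustration: the pair labels ("A1-V1", "A1-V2"), sequence length, and deviant probability are placeholder assumptions and do not reflect the authors' actual stimulus parameters.

```python
import random

# Illustrative sketch of a crossmodal local-global oddball block (not the authors' code).
# Each trial is an audiovisual pair: a tone followed by an image.
# Local level:  whether the tone-to-image transition matches the common pairing.
# Global level: whether a sequence type is frequent or rare within the block.

def make_sequence(local_deviant: bool, n_pairs: int = 4) -> list[str]:
    """Build one sequence of audiovisual pairs.

    The first n_pairs - 1 pairs use the standard tone-to-image mapping ("A1-V1");
    the final pair either repeats it (local standard) or violates it
    ("A1-V2", local deviant).
    """
    standard, deviant = "A1-V1", "A1-V2"
    pairs = [standard] * (n_pairs - 1)
    pairs.append(deviant if local_deviant else standard)
    return pairs

def make_block(n_sequences: int = 100, p_global_deviant: float = 0.2,
               frequent_is_local_deviant: bool = False, seed: int = 0):
    """Build a block in which one sequence type is frequent (global standard)
    and the other is rare (global deviant)."""
    rng = random.Random(seed)
    block = []
    for _ in range(n_sequences):
        is_rare = rng.random() < p_global_deviant
        # The rare sequence type violates the block-level (global) regularity.
        local_deviant = frequent_is_local_deviant ^ is_rare
        block.append({
            "pairs": make_sequence(local_deviant),
            "local_deviant": local_deviant,   # stimulus-level violation
            "global_deviant": is_rare,        # sequence-level violation
        })
    return block

if __name__ == "__main__":
    block = make_block()
    print(block[0])  # e.g. {'pairs': ['A1-V1', 'A1-V1', 'A1-V1', 'A1-V1'], ...}
```

In such a design, crossing the two levels yields trials whose final pair can be predictable or surprising at the stimulus level, the sequence level, neither, or both, which is what allows stimulus-level and sequence-level prediction errors to be dissociated.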