Neural computations underlying inverse reinforcement learning in the human brain

Sven Collette; Wolfgang M Pauli; Peter Bossaerts; John O'Doherty

doi:10.7554/eLife.29718

eLife (Oct 2017)

Affiliations

Sven Collette: Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, United States; Computation and Neural Systems Program, California Institute of Technology, Pasadena, United States
Wolfgang M Pauli: Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, United States; Computation and Neural Systems Program, California Institute of Technology, Pasadena, United States
Peter Bossaerts: Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Melbourne, Australia; California Institute of Technology, Pasadena, United States
John O'Doherty: Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, United States; Computation and Neural Systems Program, California Institute of Technology, Pasadena, United States

DOI: https://doi.org/10.7554/eLife.29718
Journal volume & issue: Vol. 6

Abstract

In inverse reinforcement learning (RL), an observer infers the reward distribution available for actions in the environment solely through observing the actions implemented by another agent. To address whether this computational process is implemented in the human brain, participants underwent fMRI while learning about slot machines yielding hidden preferred and non-preferred food outcomes with varying probabilities, through observing the repeated slot choices of agents with similar and dissimilar food preferences. Using formal model comparison, we found that participants implemented inverse RL as opposed to a simple imitation strategy, in which the actions of the other agent are copied instead of inferring the underlying reward structure of the decision problem. Our computational fMRI analysis revealed that anterior dorsomedial prefrontal cortex encoded inferences about action-values within the value space of the agent as opposed to that of the observer, demonstrating that inverse RL is an abstract cognitive process divorceable from the values and concerns of the observer him/herself.
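
The contrast drawn in the abstract between inverse RL and imitation can be illustrated with a toy sketch. This is not the authors' model. It assumes only two slots ("A" and "B"), two foods ("X" and its alternative), two candidate outcome distributions with made-up probabilities (P_HIGH, P_LOW), and an agent that chooses by a softmax rule with a hypothetical inverse temperature beta. All function and variable names are illustrative.

```python
import math

# Assumed outcome probabilities for the toy example; not taken from the study.
P_HIGH, P_LOW = 0.8, 0.2


def choice_prob(v_chosen, v_other, beta):
    """Softmax (logistic) probability of picking the chosen slot over the other."""
    return 1.0 / (1.0 + math.exp(-beta * (v_chosen - v_other)))


def prob_food_x(hypothesis):
    """P(food X | slot) under each hypothesis about which slot is rich in X."""
    if hypothesis == "A_rich":
        return {"A": P_HIGH, "B": P_LOW}
    return {"A": P_LOW, "B": P_HIGH}


def value_of(p_x, prefers_x):
    """Value of a slot = probability that it delivers the preferred food."""
    return p_x if prefers_x else 1.0 - p_x


def posterior_over_slots(choices, agent_prefers_x, beta=3.0):
    """Infer the hidden outcome distribution from the agent's choices alone."""
    log_post = {"A_rich": math.log(0.5), "B_rich": math.log(0.5)}  # flat prior
    for choice in choices:
        other = "B" if choice == "A" else "A"
        for hyp in log_post:
            p_x = prob_food_x(hyp)
            v_c = value_of(p_x[choice], agent_prefers_x)
            v_o = value_of(p_x[other], agent_prefers_x)
            log_post[hyp] += math.log(choice_prob(v_c, v_o, beta))
    top = max(log_post.values())
    weights = {h: math.exp(lp - top) for h, lp in log_post.items()}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


def inverse_rl_choice(choices, agent_prefers_x, observer_prefers_x):
    """Infer outcomes in the agent's value space, then choose by own preferences."""
    post = posterior_over_slots(choices, agent_prefers_x)
    values = {}
    for slot in ("A", "B"):
        p_x = sum(post[h] * prob_food_x(h)[slot] for h in post)
        values[slot] = value_of(p_x, observer_prefers_x)
    return "A" if values["A"] >= values["B"] else "B"


def imitation_choice(choices):
    """Copy the agent's most frequent action without inferring outcomes."""
    return "A" if choices.count("A") >= choices.count("B") else "B"


observed = ["A", "A", "B", "A", "A"]  # agent mostly picks slot A
print(imitation_choice(observed))
print(inverse_rl_choice(observed, agent_prefers_x=True, observer_prefers_x=True))
print(inverse_rl_choice(observed, agent_prefers_x=True, observer_prefers_x=False))
```

Under these assumptions the two strategies agree when observer and agent share a preference, and diverge when they do not: the imitator still picks slot A, while the inverse-RL observer who prefers the other food picks slot B.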

Published in eLife

ISSN: 2050-084X (Online)
Publisher: eLife Sciences Publications Ltd
Country of publisher: United Kingdom
LCC subjects: Medicine; Science: Biology (General)
Website: https://elifesciences.org
