Measuring and characterizing generalization in deep reinforcement learning

Sam Witty; Jun K. Lee; Emma Tosch; Akanksha Atrey; Kaleigh Clary; Michael L. Littman; David Jensen

doi:10.1002/ail2.45

Applied AI Letters (Dec 2021)


Affiliations

Sam Witty: College of Information and Computer Sciences, University of Massachusetts Amherst, Amherst, Massachusetts, USA
Jun K. Lee: Department of Computer Science, Brown University, Providence, Rhode Island, USA
Emma Tosch: College of Engineering and Mathematical Sciences, University of Vermont, Burlington, Vermont, USA
Akanksha Atrey: College of Information and Computer Sciences, University of Massachusetts Amherst, Amherst, Massachusetts, USA
Kaleigh Clary: College of Information and Computer Sciences, University of Massachusetts Amherst, Amherst, Massachusetts, USA
Michael L. Littman: Department of Computer Science, Brown University, Providence, Rhode Island, USA
David Jensen: College of Information and Computer Sciences, University of Massachusetts Amherst, Amherst, Massachusetts, USA

DOI: https://doi.org/10.1002/ail2.45
Journal volume & issue: Vol. 2, no. 4

Abstract

Deep reinforcement learning (RL) methods have achieved remarkable performance on challenging control tasks. Observations of the resulting behavior give the impression that the agent has constructed a generalized representation that supports insightful action decisions. We re‐examine what is meant by generalization in RL, and propose several definitions based on an agent's performance in on‐policy, off‐policy, and unreachable states. We propose a set of practical methods for evaluating agents with these definitions of generalization. We demonstrate these techniques on a common benchmark task for deep RL, and we show that the learned networks make poor decisions for states that differ only slightly from on‐policy states, even though those states are not selected adversarially. We focus our analyses on the deep Q‐networks (DQNs) that kicked off the modern era of deep RL. Taken together, these results call into question the extent to which DQNs learn generalized representations, and suggest that more experimentation and analysis is necessary before claims of representation learning can be supported.

Published in Applied AI Letters

ISSN: 2689-5595 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://onlinelibrary.wiley.com/journal/26895595
