IEEE Access (Jan 2024)

HiER: Highlight Experience Replay for Boosting Off-Policy Reinforcement Learning Agents

  • Daniel Horvath,
  • Jesus Bujalance Martin,
  • Ferenc Gabor Erdos,
  • Zoltan Istenes,
  • Fabien Moutarde

DOI
https://doi.org/10.1109/ACCESS.2024.3427012
Journal volume & issue
Vol. 12
pp. 100102 – 100119

Abstract

Even though reinforcement-learning-based algorithms have achieved superhuman performance in many domains, robotics poses significant challenges: the state and action spaces are continuous, and the reward function is predominantly sparse. Furthermore, in many settings, the agent has no access to any form of demonstration. Inspired by human learning, in this work we propose a method named highlight experience replay (HiER) that creates a secondary highlight replay buffer for the most relevant experiences. For the weight updates, transitions are sampled from both the standard and the highlight experience replay buffers. The method can be applied with or without the techniques of hindsight experience replay (HER) and prioritized experience replay (PER). Our method significantly improves on the state-of-the-art performance, validated on eight tasks across three robotic benchmarks. Furthermore, to exploit the full potential of HiER, we propose HiER+, in which HiER is enhanced with an arbitrary data-collection curriculum learning method. Our implementation, the qualitative results, and a video presentation are available on the project site: http://www.danielhorvath.eu/hier/.
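The core idea described in the abstract, maintaining a secondary buffer of highlight transitions and drawing each training batch from both buffers, can be sketched as follows. Note that the highlight criterion (here, an episode-return threshold) and the mixing ratio `xi` are illustrative assumptions, not the paper's exact formulation.

```python
import random

class HiERBuffers:
    """Minimal sketch of dual-buffer sampling: a standard replay buffer
    plus a secondary 'highlight' buffer for the most relevant experiences.
    The criterion for what counts as a highlight and the batch mixing
    ratio are hypothetical choices for illustration."""

    def __init__(self, xi=0.25, highlight_threshold=0.0):
        self.standard = []       # all transitions
        self.highlight = []      # transitions deemed most relevant
        self.xi = xi             # fraction of each batch drawn from highlights
        self.threshold = highlight_threshold

    def store(self, transition, episode_return):
        self.standard.append(transition)
        # Hypothetical highlight criterion: keep transitions from episodes
        # whose return exceeds a threshold.
        if episode_return > self.threshold:
            self.highlight.append(transition)

    def sample(self, batch_size):
        # Draw up to xi * batch_size transitions from the highlight buffer,
        # and fill the rest of the batch from the standard buffer.
        n_hl = min(int(self.xi * batch_size), len(self.highlight))
        batch = random.sample(self.highlight, n_hl)
        batch += random.choices(self.standard, k=batch_size - n_hl)
        return batch
```

Because sampling from the two buffers is independent of how transitions were generated, a sketch like this composes naturally with HER-style goal relabeling or PER-style priority weights on the standard buffer.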

Keywords