Jisuanji kexue yu tansuo (May 2024)

Self-competitive Hindsight Experience Replay with Penalty Measures

  • WANG Zihao, QIAN Xuezhong, SONG Wei

DOI
https://doi.org/10.3778/j.issn.1673-9418.2303031
Journal volume & issue
Vol. 18, no. 5
pp. 1223–1231

Abstract


Self-competitive hindsight experience replay (SCHER) is an improved strategy built on the hindsight experience replay (HER) algorithm. HER copes with sparse environmental rewards by replaying experiences to generate virtually labeled data for optimizing the model. However, HER has two problems: first, it cannot handle the large amount of repetitive data generated under sparse rewards, which contaminates the experience pool; second, virtual goals may be randomly selected from intermediate states that do not help complete the task, leading to learning bias. To address these issues, the SCHER algorithm introduces two improvements: first, an adaptive reward signal that penalizes meaningless actions so that the agent quickly learns to avoid such operations; second, a self-competition strategy that generates two sets of data for the same task, compares them, and identifies the key steps that allow the agent to succeed in different environments, thereby improving the accuracy of the generated virtual goals. Experimental results show that the SCHER algorithm makes better use of the experience replay technique, raising the average task success rate by 5.7 percentage points, with higher accuracy and generalization ability.
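
The following is a minimal, hedged sketch of the two ideas described in the abstract, not the authors' implementation: all names (Transition, relabel_with_penalty, self_competitive_goal) and the concrete penalty and divergence rules are assumptions made only to illustrate penalized HER-style relabeling and self-competitive virtual-goal selection.

```python
# Hypothetical sketch based on the abstract; the penalty rule ("state barely
# changes") and the divergence-based goal choice are assumptions, not the
# paper's exact method.
from dataclasses import dataclass
import numpy as np


@dataclass
class Transition:
    state: np.ndarray
    action: int
    next_state: np.ndarray
    goal: np.ndarray
    reward: float


def relabel_with_penalty(episode, virtual_goal, penalty=-0.1, eps=1e-6):
    """HER-style relabeling plus an adaptive penalty for 'meaningless' steps.

    Assumption: a step whose state barely changes is treated as meaningless
    and receives `penalty` instead of the usual sparse reward.
    """
    relabeled = []
    for t in episode:
        if np.linalg.norm(t.next_state - t.state) < eps:  # no visible progress
            r = penalty
        elif np.allclose(t.next_state, virtual_goal):     # reached virtual goal
            r = 0.0
        else:                                              # ordinary sparse reward
            r = -1.0
        relabeled.append(Transition(t.state, t.action, t.next_state,
                                    virtual_goal, r))
    return relabeled


def self_competitive_goal(episode_a, episode_b):
    """Pick a virtual goal by comparing two rollouts of the same task.

    Assumption: the first state where the two trajectories diverge is taken
    as a 'key step' and reused as the virtual goal, instead of a randomly
    chosen intermediate state as in vanilla HER.
    """
    for ta, tb in zip(episode_a, episode_b):
        if not np.allclose(ta.next_state, tb.next_state):
            return ta.next_state
    return episode_a[-1].next_state  # identical rollouts: fall back to final state
```

Under these assumptions, a replay buffer would be filled with `relabel_with_penalty(episode_a, self_competitive_goal(episode_a, episode_b))` for each pair of rollouts of the same task.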

Keywords