IEEE Access (Jan 2020)
Policy Return: A New Method for Reducing the Number of Experimental Trials in Deep Reinforcement Learning
Abstract
Using the same algorithm and hyperparameter configuration, deep reinforcement learning (DRL) can produce drastically different results across experimental trials, and many of these results are unsatisfactory. Because of this instability, researchers must run many trials to validate an algorithm or a hyperparameter setting in DRL. In this article, we present the policy return method, a new design for reducing the number of trials required to train a DRL model. The method allows the learned policy to return to a previous state whenever training becomes divergent or stagnant. Upon returning, a certain proportion of stochastic noise is added to the weights of the neural networks to prevent a repeated decline. Extensive experiments on challenging tasks with various target scores demonstrate that the policy return method reduces the required number of trials by roughly 10% to 40% compared with the corresponding original algorithms, and by 10% to 30% compared with state-of-the-art algorithms.
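The abstract describes the mechanism only at a high level; the following is a minimal sketch of the policy-return idea as we read it, not the authors' reference implementation. The helper names (train_step, evaluate, is-divergent-or-stagnant check via score comparison), the flat weight vector, and parameters such as noise_scale and checkpoint_every are illustrative assumptions.

```python
# Hypothetical sketch: checkpoint the policy weights, and when learning
# diverges or stagnates, return to the last good checkpoint and add a small
# amount of random noise to the weights to avoid repeating the same decline.
import numpy as np

def train_with_policy_return(init_weights, train_step, evaluate,
                             total_steps=10_000, checkpoint_every=500,
                             noise_scale=0.05, rng=None):
    """Train with user-supplied `train_step` and `evaluate` callables,
    applying a policy-return step at every checkpoint interval."""
    rng = rng or np.random.default_rng(0)
    weights = np.asarray(init_weights, dtype=float)
    best_weights, best_score = weights.copy(), evaluate(weights)

    for step in range(1, total_steps + 1):
        weights = train_step(weights)            # one ordinary DRL update
        if step % checkpoint_every == 0:
            score = evaluate(weights)
            if score >= best_score:              # progress: keep new checkpoint
                best_weights, best_score = weights.copy(), score
            else:                                # divergent/stagnant: return
                weights = best_weights.copy()
                # perturb the restored weights with proportional stochastic noise
                noise = rng.standard_normal(weights.shape)
                weights += noise_scale * noise * np.abs(weights)
    return best_weights
```

The noise applied after a return is what distinguishes this from plain checkpoint restarting: without it, the same trajectory (and the same decline) could simply repeat.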
Keywords