Stochastic Double Deep Q-Network

Pingli Lv; Xuesong Wang; Yuhu Cheng; Ziming Duan

doi:10.1109/ACCESS.2019.2922706

IEEE Access (Jan 2019)

Affiliations

Pingli Lv: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
Xuesong Wang: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
Yuhu Cheng: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
Ziming Duan: School of Mathematics, China University of Mining and Technology, Xuzhou, China

DOI: https://doi.org/10.1109/ACCESS.2019.2922706
Journal volume & issue: Vol. 7
pp. 79446 – 79454

Abstract

Estimation bias seriously affects the performance of reinforcement learning algorithms. The maximum operation may result in overestimation, while the double estimator operation often leads to underestimation. To eliminate the estimation bias, we combine these two operations in our proposed algorithm, the stochastic double deep Q-learning network (SDDQN), which is based on the idea of random selection. We also give a tabular version of SDDQN, named stochastic double Q-learning (SDQ). Both SDDQN and SDQ are built on the double estimator framework. At each step, the algorithm applies either the maximum operation or the double estimator operation, chosen with a probability set by a random selection parameter. The theoretical analysis shows that there exists a value of the random selection parameter for which SDDQN and SDQ are unbiased. Experiments on Grid World and Atari 2600 games illustrate that the proposed algorithms balance the estimation bias effectively and improve performance.
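
The random selection step described in the abstract can be illustrated for the tabular case with a minimal Python sketch. It is not the authors' implementation: the function name `sdq_target`, the tables `q_a` and `q_b`, and the convention that the parameter `beta` is the probability of using the maximum operation are all assumptions made for illustration, since the abstract does not give the exact update rule.

```python
import random


def sdq_target(q_a, q_b, next_state, reward, gamma, beta, rng=random):
    """Bootstrapped target for updating table q_a (hypothetical sketch).

    q_a, q_b: dicts mapping a state to a list of action values.
    beta: assumed here to be the probability of using the maximum
          operation; with probability 1 - beta the double estimator
          operation is used instead.
    """
    values_a = q_a[next_state]
    # Greedy action according to the table being updated.
    best_action = max(range(len(values_a)), key=lambda a: values_a[a])

    if rng.random() < beta:
        # Maximum operation: q_a both selects and evaluates the action,
        # which tends toward overestimation.
        bootstrap = q_a[next_state][best_action]
    else:
        # Double estimator operation: q_a selects, q_b evaluates,
        # which tends toward underestimation.
        bootstrap = q_b[next_state][best_action]

    return reward + gamma * bootstrap


# Toy tables with one next state and two actions (made-up values).
q_a = {1: [1.0, 3.0]}
q_b = {1: [2.0, 0.5]}

target_max = sdq_target(q_a, q_b, next_state=1, reward=1.0, gamma=0.5, beta=1.0)
target_double = sdq_target(q_a, q_b, next_state=1, reward=1.0, gamma=0.5, beta=0.0)
```

With `beta=1.0` the sketch reduces to the ordinary maximum-based target, and with `beta=0.0` it reduces to the double estimator target; intermediate values mix the two at random from step to step.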

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
