IEEE Access (Jan 2021)
A Functional Clipping Approach for Policy Optimization Algorithms
Abstract
Proximal policy optimization (PPO) has yielded state-of-the-art results in policy search, a subfield of reinforcement learning, and one of its key features is the use of a surrogate objective function to restrict the step size at each policy update. Although such a restriction is helpful, the algorithm still suffers from performance instability and optimization inefficiency caused by the sudden flattening of the clipped objective curve. To address this issue, we present a novel functional clipping policy optimization algorithm, named the Proximal Policy Optimization Smoothed Algorithm (PPOS), whose critical improvement is the use of a functional clipping method instead of a flat clipping method. We compare our approach with PPO and PPORB, which adopts a rollback clipping method, and prove that our approach can conduct more accurate updates than other PPO methods. We show that it outperforms the latest PPO variants in both performance and stability on challenging continuous control tasks. Moreover, we provide an instructive guideline for tuning the main hyperparameter in our algorithm.
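For context, the flat clipping referred to above is the clipped surrogate objective of standard PPO. The minimal LaTeX sketch below restates that objective and, purely to illustrate the functional-clipping idea, substitutes a smooth bounded shaping of the probability ratio for the hard clip; the tanh form shown is an expository assumption, not the exact clipping function defined in the paper.

% Minimal LaTeX sketch (requires amsmath/amssymb): PPO's flat-clipped surrogate
% objective and an illustrative smooth alternative.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}

% Probability ratio between the new and old policies:
\[
  r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}
\]

% Standard PPO surrogate with flat clipping: outside [1-\epsilon, 1+\epsilon]
% the clipped term is constant in \theta, so its gradient vanishes abruptly.
\[
  L^{\text{CLIP}}(\theta) =
  \hat{\mathbb{E}}_t\!\left[
    \min\!\Big( r_t(\theta)\,\hat{A}_t,\;
    \operatorname{clip}\!\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t \Big)
  \right]
\]

% Illustrative functional clipping (an assumed tanh shaping, NOT the paper's
% exact function): the hard clip is replaced by a smooth, bounded function of
% the ratio, so the objective flattens gradually rather than becoming constant.
\[
  L^{\text{smooth}}(\theta) =
  \hat{\mathbb{E}}_t\!\left[
    \min\!\Big( r_t(\theta)\,\hat{A}_t,\;
    \Big(1 + \epsilon \tanh\!\frac{r_t(\theta) - 1}{\epsilon}\Big)\hat{A}_t \Big)
  \right]
\]

\end{document}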
Keywords