ICT Express (Sep 2020)

Implementing action mask in proximal policy optimization (PPO) algorithm

  • Cheng-Yen Tang,
  • Chien-Hung Liu,
  • Woei-Kae Chen,
  • Shingchern D. You

Journal volume & issue
Vol. 6, no. 3
pp. 200 – 203

Abstract

Proximal policy optimization (PPO) is a promising reinforcement learning algorithm. In this paper, we propose adding an action mask to the PPO algorithm. The mask indicates, for each state, whether each action is valid or invalid. Simulation results show that, compared with the original version, the proposed algorithm yields a much higher return with a moderate number of training steps. Incorporating such a mask is therefore useful and valuable wherever it is applicable.
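
The abstract does not say how the mask enters the policy, so the following is only a minimal sketch of one common way to do it, not necessarily the authors' implementation. It assumes a discrete action space and a policy that outputs one logit per action; the function names `masked_action_probs` and `sample_action` are hypothetical. Invalid actions receive zero probability and the remaining probabilities are renormalized.

```python
import math
import random


def masked_action_probs(logits, valid_mask):
    """Softmax restricted to valid actions; invalid actions get probability 0.

    Assumption: `logits` holds one policy output per discrete action and
    `valid_mask` holds one boolean per action for the current state.
    """
    if not any(valid_mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    max_valid = max(l for l, ok in zip(logits, valid_mask) if ok)
    weights = [
        math.exp(l - max_valid) if ok else 0.0
        for l, ok in zip(logits, valid_mask)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(logits, valid_mask, rng):
    """Sample a valid action and return it with its log-probability."""
    probs = masked_action_probs(logits, valid_mask)
    action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return action, math.log(probs[action])


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [1.0, 2.0, 3.0]
    mask = [True, False, True]  # action 1 is invalid in this state
    print(masked_action_probs(logits, mask))
    print(sample_action(logits, mask, rng))
```

In a PPO training loop, the same mask would presumably also be applied when the log-probabilities are recomputed during the policy update, so that the probability ratio compares like with like.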

Keywords