SICE Journal of Control, Measurement, and System Integration (Jul 2018)
Multi-Agent Cooperation Based on Reinforcement Learning with Internal Reward in Maze Problem
Abstract
This paper introduces a reinforcement learning technique with an internal reward for a multi-agent cooperation task. The proposed methods is an extension of Q-learning which changes the ordinary (external) reward to the internal reward for agent-cooperation. Specifically, we propose here two Q-learning methods, both of which employ the internal reward for the less or no communication. To guarantee the effectiveness of the proposed methods, we theoretically derived the mechanisms that solve the following questions: (1) how the internal rewards should be set to guarantee the cooperation among the agents under the condition of less and no communication; and (2) how the values of the cooperative behaviors types (i.e., the varieties of the cooperative behaviors of the agents) should be updated under the condition of no communication. The intensive simulations on the maze problem for the agent-cooperation task have been revealed that our two proposed methods successfully enable the agents to acquire their cooperative behaviors even in less or no communication, while the conventional method (Q-learning) always fails to acquire such behaviors.
Keywords