IEEE Access (Jan 2018)
A Deep Hierarchical Reinforcement Learning Algorithm in Partially Observable Markov Decision Processes
Abstract
In recent years, reinforcement learning (RL) has achieved remarkable success due to the growing adoption of deep learning techniques and the rapid growth of computing power. Nevertheless, it is well known that flat reinforcement learning algorithms often struggle to learn and are data-inefficient on tasks with hierarchical structures, e.g., those consisting of multiple subtasks. Hierarchical reinforcement learning is a principled approach for tackling such challenging tasks. Moreover, many real-world tasks involve partial observability, in which state measurements are imperfect or incomplete. RL problems in such settings can be formulated as partially observable Markov decision processes (POMDPs). In this paper, we study hierarchical RL in POMDP settings, where tasks are both partially observable and hierarchically structured. We propose a deep hierarchical reinforcement learning approach for learning in hierarchical POMDPs. The proposed algorithm is applicable to learning in both MDP and POMDP domains. We evaluate it on a variety of challenging hierarchical POMDPs.
Keywords