IEEE Access (Jan 2019)
Meta-Learning via Weighted Gradient Update
Abstract
Although deep reinforcement learning has surpassed human performance in many domains, including games, dialogue systems, and robotics, sample inefficiency still limits its application. This paper develops a novel and simple gradient-based meta-learning method for improving the learning efficiency of deep reinforcement learning methods. Rather than designing a complex network or adding extra fine-tuning parameters, the proposed method introduces no learned parameters for meta-learning. Specifically, this paper proposes to weight each trajectory in model-agnostic meta-learning according to its characteristics, so that the meta-update gradient is computed more effectively. The key idea is to exploit the relationship between individual trajectories and the direction of the parameter update. An end-to-end training approach is also introduced so that the proposed model attains good results on new tasks with a small amount of training data. Experimental results indicate that the proposed algorithm delivers state-of-the-art performance on both discrete and continuous control tasks.
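To make the idea of a weighted meta-update concrete, the following is a minimal sketch and not the paper's actual algorithm. The abstract does not state how the trajectory weights are computed, so the sketch assumes a softmax over trajectory returns, which adds no learned parameters. The function names (`softmax_weights`, `weighted_meta_gradient`, `meta_update`), the `temperature` parameter, and the plain-list representation of gradients are all hypothetical choices made for illustration.

```python
import math


def softmax_weights(returns, temperature=1.0):
    """ASSUMPTION: weight trajectories by a softmax over their returns.

    The source does not specify the weighting rule; this is one
    parameter-free choice that favors higher-return trajectories.
    """
    highest = max(returns)  # subtract the max for numerical stability
    exps = [math.exp((r - highest) / temperature) for r in returns]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_meta_gradient(traj_grads, returns, temperature=1.0):
    """Combine per-trajectory gradient vectors into one meta-gradient.

    traj_grads: list of gradient vectors, one per trajectory (hypothetical
    inputs; in practice they would come from backpropagation).
    """
    weights = softmax_weights(returns, temperature)
    dim = len(traj_grads[0])
    return [
        sum(w * g[k] for w, g in zip(weights, traj_grads))
        for k in range(dim)
    ]


def meta_update(theta, traj_grads, returns, meta_lr=0.1):
    """One gradient-ascent step on the meta-parameters theta."""
    grad = weighted_meta_gradient(traj_grads, returns)
    return [t + meta_lr * g for t, g in zip(theta, grad)]
```

With equal returns the weights are uniform and the update reduces to the ordinary average over trajectories; unequal returns shift the update direction toward the gradients of the higher-return trajectories.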
Keywords