IEEE Access (Jan 2020)
Reinforcement Learning Interpretation Methods: A Survey
Abstract
Reinforcement Learning (RL) systems achieved outstanding performance in different domains such as Atari games, finance, healthcare, and self-driving cars. However, their black-box nature complicates their use, especially in critical applications such as healthcare. To solve this problem, researchers have proposed different approaches to interpret RL models. Some of these methods were adopted from machine learning, while others were designed specifically for RL. The main objective of this paper is to show and explain RL interpretation methods, the metrics used to classify them, and how these metrics were applied to understand the internal details of RL models. We reviewed papers that propose new RL interpretation methods, improve the old ones, or discuss the pros and cons of the existing methods.
Keywords