IEEE Access (Jan 2020)

Reinforcement Learning Interpretation Methods: A Survey

  • Alnour Alharin,
  • Thanh-Nam Doan,
  • Mina Sartipi

DOI
https://doi.org/10.1109/ACCESS.2020.3023394
Journal volume & issue
Vol. 8
pp. 171058 – 171077

Abstract

Reinforcement Learning (RL) systems have achieved outstanding performance in domains such as Atari games, finance, healthcare, and self-driving cars. However, their black-box nature complicates their use, especially in critical applications such as healthcare. To address this problem, researchers have proposed different approaches to interpret RL models. Some of these methods were adopted from machine learning, while others were designed specifically for RL. The main objective of this paper is to present and explain RL interpretation methods, the metrics used to classify them, and how these metrics have been applied to understand the internal details of RL models. We review papers that propose new RL interpretation methods, improve existing ones, or discuss the pros and cons of existing methods.

Keywords