Zhejiang Daxue xuebao. Lixue ban (Jul 2024)

Research progress of causal inference in reinforcement learning framework(强化学习框架中因果推断研究进展)

  • 刘华玲(LIU Hualing),
  • 朱建亮(ZHU Jianliang),
  • 任青青(REN Qingqing)

DOI
https://doi.org/10.3785/j.issn.1008-9497.2024.04.001
Journal volume & issue
Vol. 51, no. 4
pp. 391 – 406

Abstract

Causal reasoning has been studied extensively across the sciences, and recent decades have seen a number of innovations in methods for determining causality. Reinforcement learning, meanwhile, is a branch of machine learning concerned with how agents act in an environment to maximize cumulative reward. Embedding causal inference into the reinforcement learning framework has been an important methodological advance for both fields in recent years. Against this background, this paper summarizes the background and development of cutting-edge deep reinforcement learning algorithms and introduces three types of reinforcement learning frameworks: value-function-based, policy-gradient-based, and model-based. Then, from the perspective of technical application, it reviews research on applying reinforcement learning to causal inference and causal identification in five combined scenarios. Finally, it emphasizes the interpretability of causal reinforcement learning and the necessity of application research, and highlights future research directions. (因果推断在科学领域备受关注。近年来,因果关系的确定方法有所创新。强化学习作为一种机器学习方法,主要关注智能体如何在环境中采取行动,以最大化累积奖励。将因果推断方法嵌套在强化学习框架中的思想是因果推断领域以及强化学习方法论中重要的学术进展。基于此,首先,梳理了深度强化学习算法的背景和发展,介绍了基于值函数、基于策略梯度和基于模型的3类强化学习算法框架,以及与因果推断相结合的方向;其次,从5个技术应用角度,对强化学习思想在因果推断和因果识别中的应用研究进行了综述;最后,强调了强化学习框架中因果推断的数据驱动效率、稳定性及应用研究的必要性,并对未来的研究方向进行了展望。)

Keywords