Entropy (Jan 2022)

Learn Quasi-Stationary Distributions of Finite State Markov Chain

  • Zhiqiang Cai,
  • Ling Lin,
  • Xiang Zhou

DOI
https://doi.org/10.3390/e24010133
Journal volume & issue
Vol. 24, no. 1
p. 133

Abstract


We propose a reinforcement learning (RL) approach to compute the expression of the quasi-stationary distribution. Based on the fixed-point formulation of the quasi-stationary distribution, we minimize the KL-divergence between two Markovian path distributions, one induced by the candidate distribution and the other by the true target distribution. To solve this challenging minimization problem by gradient descent, we apply a reinforcement learning technique, introducing reward and value functions. We derive the corresponding policy gradient theorem and design an actor-critic algorithm to learn the optimal solution and the value function. Numerical examples on finite-state Markov chains demonstrate the new method.
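
For orientation, the quantity being learned can be illustrated on a small finite-state chain. The sketch below is not the RL method of the paper. It is a plain normalized power iteration, assuming the standard characterization of the quasi-stationary distribution as the fixed point of alpha -> alpha Q / (alpha Q 1), where Q is the sub-stochastic transition matrix restricted to the non-absorbing states. The function name `qsd_power_iteration`, the tolerance, and the example matrix are illustrative assumptions, not taken from the article.

```python
def qsd_power_iteration(Q, tol=1e-12, max_iter=10_000):
    """Approximate the quasi-stationary distribution (QSD) of a killed chain.

    Q is a sub-stochastic matrix (list of rows) over the surviving states.
    Returns (alpha, survival), where alpha approximates the QSD and survival
    approximates the one-step survival probability under alpha.
    Hypothetical helper for illustration only.
    """
    n = len(Q)
    alpha = [1.0 / n] * n  # start from the uniform distribution
    survival = 0.0
    for _ in range(max_iter):
        # One step of the killed chain: beta = alpha Q (unnormalized).
        beta = [sum(alpha[i] * Q[i][j] for i in range(n)) for j in range(n)]
        survival = sum(beta)  # probability of not being absorbed in this step
        if survival <= 0.0:
            raise ValueError("chain is absorbed with probability one")
        # Condition on survival: renormalize to a probability vector.
        new_alpha = [b / survival for b in beta]
        diff = max(abs(a - b) for a, b in zip(alpha, new_alpha))
        alpha = new_alpha
        if diff < tol:
            break
    return alpha, survival


# Example (assumed data): two surviving states, each row sums to 0.8,
# so the chain is killed with probability 0.2 at every step.
Q = [[0.5, 0.3],
     [0.2, 0.6]]
alpha, survival = qsd_power_iteration(Q)
# For this Q the exact QSD is (0.4, 0.6) and the survival probability is 0.8.
```

Power iteration of this kind needs the full matrix Q, which is one reason a sampling-based approach such as the one proposed in the article may be of interest for larger state spaces.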

Keywords