Mathematics (May 2024)

Quadratic Tracking Control of Linear Stochastic Systems with Unknown Dynamics Using Average Off-Policy Q-Learning Method

  • Longyan Hao,
  • Chaoli Wang,
  • Yibo Shi

DOI
https://doi.org/10.3390/math12101533
Journal volume & issue
Vol. 12, no. 10
p. 1533

Abstract


This article investigates the data-based optimal tracking control problem for stochastic discrete-time linear systems. An average off-policy Q-learning algorithm is proposed to solve the optimal control problem in the presence of random disturbances. Compared with existing off-policy reinforcement learning (RL) algorithms, the proposed average off-policy Q-learning algorithm avoids the assumption that an initially stabilizing control is available. First, a pole placement strategy is used to design an initially stabilizing control for systems with unknown dynamics. Second, this initially stabilizing control is used to construct a data-based average off-policy Q-learning algorithm. Then, this algorithm is applied to solve the stochastic linear quadratic tracking (LQT) problem, and a convergence proof of the algorithm is provided. Finally, numerical examples show that the algorithm outperforms existing algorithms in simulation.
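To make the problem setting concrete, the sketch below illustrates the stochastic LQT setup and the role of an initially stabilizing gain. It is not the paper's data-driven algorithm: it assumes the system matrices A, B, C and the reference generator F are known, uses SciPy's pole placement in place of the paper's data-based design, and evaluates an empirical average per-step tracking cost by simulation. All matrices, pole locations, and weights are hypothetical choices for illustration.

```python
# Minimal sketch (assumed model, not the paper's data-driven method):
# stochastic LQT rollout under an initially stabilizing pole-placement gain.
import numpy as np
from scipy.signal import place_poles

rng = np.random.default_rng(0)

# Stochastic discrete-time linear system: x_{k+1} = A x_k + B u_k + w_k
A = np.array([[1.0, 0.1],
              [0.0, 0.9]])
B = np.array([[0.0],
              [0.1]])
C = np.array([[1.0, 0.0]])   # tracked output y_k = C x_k
F = np.array([[1.0]])        # reference generator r_{k+1} = F r_k (constant here)

# Step 1: an initially stabilizing state-feedback gain via pole placement.
# (The paper obtains such a gain from data when A, B are unknown.)
K0 = place_poles(A, B, [0.5, 0.6]).gain_matrix
Acl = A - B @ K0

# Textbook feedforward so that y_k -> r in steady state for a constant
# reference; used here only to make the rollout meaningful.
N = np.linalg.inv(C @ np.linalg.inv(np.eye(2) - Acl) @ B)

# Quadratic tracking cost weights (hypothetical values).
Q = np.eye(1)
R = 1e-2 * np.eye(1)

def average_tracking_cost(K, steps=2000, noise_std=0.02):
    """Empirical average per-step quadratic tracking cost under additive noise."""
    x = np.zeros((2, 1))
    r = np.ones((1, 1))
    total = 0.0
    for _ in range(steps):
        e = C @ x - r                      # tracking error
        u = -K @ x + N @ r                 # stabilizing feedback + feedforward
        total += float(e.T @ Q @ e + u.T @ R @ u)
        x = A @ x + B @ u + noise_std * rng.standard_normal((2, 1))
        r = F @ r                          # propagate the reference
    return total / steps

print("closed-loop eigenvalues:", np.linalg.eigvals(Acl))
print("empirical average tracking cost:", average_tracking_cost(K0))
```

Because the additive noise never vanishes, the per-step cost does not converge to zero; averaging it over the rollout mirrors the average-cost viewpoint that motivates the paper's average Q-learning formulation.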

Keywords