SICE Journal of Control, Measurement, and System Integration (Dec 2024)
Bias-free policy evaluation in the discrete-time adaptive linear quadratic optimal control in the presence of stochastic disturbances
Abstract
This study proposes an adaptive linear quadratic (LQ) optimal regulator for discrete-time linear systems subject to stochastic disturbances, based on policy iteration with an Actor/Critic structure. The existing deterministic policy iteration method realizes an adaptive LQ optimal regulator; however, when stochastic disturbances are present, its Critic parameter estimates retain a bias error that degrades control performance. To achieve bias-free policy evaluation, the study introduces a disturbance-influenced term into the Critic parameter estimation and employs a Recursive Instrumental Variable (RIV) method with a properly selected instrumental variable. The study further modifies the Critic parameter estimation so that the disturbance-influenced term is estimated simultaneously, by introducing an extended parameter representation. Finally, numerical examples demonstrate the effectiveness of the proposed method in comparison with existing methods.
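The bias issue and the role of an instrumental variable can be illustrated with a generic scalar sketch. This is not the paper's Critic estimator or its instrument choice, neither of which is given in the abstract. It assumes a toy model y = theta * x + w in which only a noise-corrupted regressor phi = x + v is observed, and it takes an independent second noisy measurement of x as the instrument z. All names (`recursive_estimate`, `simulate`, `phi`, `z`) are hypothetical.

```python
import random


def recursive_estimate(samples, use_instrument):
    """Scalar recursive estimate of theta in y = theta * x + w.

    samples: iterable of (phi, z, y), where phi is the noisy regressor and
    z is the instrument. With use_instrument=False the recursion reduces to
    ordinary recursive least squares (the regressor is its own instrument).
    """
    theta, p = 0.0, 1e3  # initial estimate and scalar gain "covariance"
    for phi, z, y in samples:
        zeta = z if use_instrument else phi
        gain = p * zeta / (1.0 + phi * p * zeta)
        theta += gain * (y - phi * theta)  # correct using the prediction error
        p -= gain * phi * p                # equivalent to p / (1 + phi * p * zeta)
    return theta


def simulate(n=20000, theta_true=2.0, seed=0):
    """Generate toy data (assumed setup, not taken from the paper)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)        # noise-free regressor (unobserved)
        phi = x + rng.gauss(0.0, 0.5)  # observed, noise-corrupted regressor
        z = x + rng.gauss(0.0, 0.5)    # instrument: independent second measurement
        y = theta_true * x + rng.gauss(0.0, 0.1)
        samples.append((phi, z, y))
    return samples


samples = simulate()
theta_rls = recursive_estimate(samples, use_instrument=False)
theta_riv = recursive_estimate(samples, use_instrument=True)
print(f"RLS estimate: {theta_rls:.3f}, RIV estimate: {theta_riv:.3f}")
```

In this toy setup the least-squares estimate is attenuated toward theta * var(x) / (var(x) + var(v)) = 2 / 1.25 = 1.6 as the sample size grows, because the regressor noise is correlated with the equation error. The instrument is correlated with x but not with that noise, so the RIV estimate converges to the true value 2.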
Keywords