Online Dual-Network-Based Adaptive Dynamic Programming for Solving Partially Unknown Multi-Player Non-Zero-Sum Games With Control Constraints

Pengda Liu; Huaguang Zhang; Chong Liu; Hanguang Su

doi:10.1109/ACCESS.2020.3029171

IEEE Access (Jan 2020)

Affiliations

Pengda Liu: College of Information Science and Engineering, Northeastern University, Shenyang, China
Huaguang Zhang: College of Information Science and Engineering, Northeastern University, Shenyang, China
Chong Liu: College of Information Science and Engineering, Northeastern University, Shenyang, China
Hanguang Su: College of Information Science and Engineering, Northeastern University, Shenyang, China

DOI: https://doi.org/10.1109/ACCESS.2020.3029171
Journal volume & issue: Vol. 8
pp. 182295 – 182306

Abstract

In this article, a novel online method based on neural networks (NNs) is developed for multi-player non-zero-sum (NZS) differential games of nonlinear, partially unknown continuous-time (CT) systems with control constraints. The multi-player NZS game with saturated actuators is analyzed in detail, and the unknown system dynamics are learned by an identifier NN. Unlike the standard identifier-actor-critic framework of adaptive dynamic programming (ADP), the proposed method uses only identifier networks and critic networks for all players to solve the coupled Hamilton-Jacobi (HJ) equations of the multi-player NZS game, which simplifies the algorithm and saves computing resources. A tuning law based on the gradient descent method is designed for each critic network. To remove the requirement for an initial stabilizing control, a novel stability term is designed to ensure system stability during the training phase of the critic NNs. By means of the Lyapunov approach, it is proven that the system states, the critic network weight estimation errors, and the obtained controls are all uniformly ultimately bounded (UUB). Finally, two numerical examples are simulated to illustrate the validity of the developed method for multi-player NZS games with control constraints.
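
As a rough illustration of two ingredients named in the abstract (bounded controls and a gradient-descent tuning law for a critic), the following Python sketch uses forms that are common in the constrained-ADP literature. It is not taken from the paper: the tanh-based control, the normalized gradient step, and all names (saturated_control, critic_gradient_step, lam, r, alpha, dt, sigma, cost) are assumptions made for illustration, and the paper's identifier NN and stability term are omitted.

```python
import numpy as np

def saturated_control(grad_V, g, lam=1.0, r=1.0):
    """Bounded control u = -lam * tanh(g^T grad_V / (2 * lam * r)).

    Assumed form (not from the paper): grad_V has shape (n,), g has
    shape (n, m), and every component of u satisfies |u_i| <= lam.
    """
    return -lam * np.tanh(g.T @ grad_V / (2.0 * lam * r))

def bellman_residual(W, sigma, cost):
    """Residual e = W^T sigma + cost for a critic that is linear in W.

    sigma stands for the regressor (basis gradient times state
    derivative) and cost for the running cost; both are assumed given.
    """
    return float(W @ sigma + cost)

def critic_gradient_step(W, sigma, cost, alpha=0.5, dt=0.01):
    """One normalized gradient-descent step on 0.5 * e**2 w.r.t. W."""
    e = bellman_residual(W, sigma, cost)
    norm = (1.0 + sigma @ sigma) ** 2  # normalization keeps the step bounded
    return W - alpha * dt * sigma * e / norm

# Hypothetical numbers, chosen only to exercise the functions.
W = np.array([0.5, -0.2, 0.1])
sigma = np.array([1.0, 2.0, -1.0])
cost = 0.8

e_before = bellman_residual(W, sigma, cost)
W_next = critic_gradient_step(W, sigma, cost)
e_after = bellman_residual(W_next, sigma, cost)

# A large value gradient drives the control toward, but not past, the bound.
u = saturated_control(np.array([10.0, -40.0]), np.array([[0.0], [1.0]]), lam=2.0)
```

In this sketch a single step shrinks the residual magnitude slightly, and the control stays within the assumed bound lam however large the value gradient is. In a multi-player setting, each player would hold its own weight vector and run its own update.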

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
