Zhihui kongzhi yu fangzhen (Dec 2023)

Research on multi-UAV cooperative pursuit and confrontation strategy based on P3C-MADDPG algorithm

  • GAO Jiabo, XIAO Wei, HE Zhijie

DOI
https://doi.org/10.3969/j.issn.1673-3819.2023.06.002
Journal volume & issue
Vol. 45, no. 6
pp. 7 – 18

Abstract

This paper addresses the cooperative pursuit and confrontation task in which multiple UAVs chase an escaping UAV whose strategy is unknown, and proposes a multi-UAV cooperative pursuit and confrontation strategy based on the P3C-MADDPG algorithm. First, to address the slow training speed and Q-value overestimation of the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, two components are incorporated into MADDPG: Prioritized Experience Replay (PER) with tree-structured storage, and a parallel Critic network model with 3 threads. The resulting algorithm is called P3C-MADDPG. Then, based on the kinematics model of the UAV, the training elements are designed: the state space, a reward function combining sparse and guided rewards, and a pursuit action space with different accelerations. Finally, using these training elements, the P3C-MADDPG algorithm generates the cooperative pursuit and confrontation strategy for multiple UAVs against an escaping UAV with an unknown strategy. Simulation experiments show that, on average, the P3C-MADDPG algorithm increases the training speed by 11.7% and reduces the estimated Q value by 6.06%. The generated multi-UAV cooperative pursuit and confrontation strategy effectively avoids obstacles and pursues UAVs with unknown strategies more intelligently.
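
The abstract does not give implementation details for the tree-structured PER storage. As an illustration only, the sketch below shows a generic sum tree, the data structure commonly used for prioritized experience replay. It is not the authors' implementation. The names SumTree and sample_batch, the use of a plain number as the priority (typically the absolute TD error), and the stratified sampling scheme are all assumptions made for this example.

```python
import random


class SumTree:
    """Binary sum tree (hypothetical helper, not from the paper).

    Leaves hold priorities; each internal node holds the sum of its children,
    so the root holds the total priority.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = [0.0] * (2 * capacity - 1)  # internal nodes, then leaves
        self.data = [None] * capacity           # stored transitions
        self.write = 0                          # next slot to overwrite

    def total(self):
        return self.tree[0]

    def add(self, priority, item):
        """Store an item with the given priority, overwriting the oldest slot."""
        leaf = self.write + self.capacity - 1
        self.data[self.write] = item
        self.update(leaf, priority)
        self.write = (self.write + 1) % self.capacity

    def update(self, leaf, priority):
        """Set a leaf's priority and propagate the change up to the root."""
        change = priority - self.tree[leaf]
        self.tree[leaf] = priority
        while leaf > 0:
            leaf = (leaf - 1) // 2
            self.tree[leaf] += change

    def get(self, s):
        """Return (leaf index, priority, item) for the cumulative value s."""
        idx = 0
        while 2 * idx + 1 < len(self.tree):
            left = 2 * idx + 1
            if s <= self.tree[left]:
                idx = left
            else:
                s -= self.tree[left]
                idx = left + 1
        return idx, self.tree[idx], self.data[idx - self.capacity + 1]


def sample_batch(tree, batch_size):
    """Stratified sampling: one draw from each equal slice of the total priority."""
    segment = tree.total() / batch_size
    return [
        tree.get(random.uniform(segment * i, segment * (i + 1)))
        for i in range(batch_size)
    ]
```

In this sketch, adding a transition and updating a priority each cost O(log n), and a transition is drawn with probability proportional to its priority. After a critic update, the replay buffer would call update with the new priority for each sampled leaf index.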

Keywords