Drones (Aug 2024)

UAV Confrontation and Evolutionary Upgrade Based on Multi-Agent Reinforcement Learning

  • Xin Deng,
  • Zhaoqi Dong,
  • Jishiyu Ding

DOI
https://doi.org/10.3390/drones8080368
Journal volume & issue
Vol. 8, no. 8
p. 368

Abstract

Unmanned aerial vehicle (UAV) confrontation scenarios play a crucial role in the study of agent behavior selection and decision planning. Multi-agent reinforcement learning (MARL) algorithms are a generally effective way to guide agents toward appropriate action strategies: they determine each subsequent action from the agent's state and the environmental information it receives. However, in traditional MARL settings, one side often consistently outperforms the other because of a superior strategy, or both sides reach a strategic stalemate with no further improvement. To solve this issue, we propose a semi-static deep deterministic policy gradient algorithm based on MARL. This algorithm uses centralized training with decentralized execution and dynamically adjusts the training intensity according to the relative strength of the two sides' strategies. Experimental results show that, during training, the winning team's strategy drives the losing team's strategy to upgrade continuously, and the winning and losing roles keep changing, so that the strategies of both teams improve together. Compared with the traditional reinforcement learning algorithm, the semi-static reinforcement learning algorithm improves the win-loss relationship conversion by 8% and reduces the training time by 40%.
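
The abstract does not spell out how training intensity is adjusted, so the following is only an illustrative sketch of one possible scheduling rule, not the authors' method. It assumes that "semi-static" means holding the currently stronger team's policy fixed while the weaker team keeps training. The names `WinTracker` and `update_counts`, the threshold of 0.55, and the base of 4 gradient updates per round are all hypothetical.

```python
from collections import deque


class WinTracker:
    """Rolling record of which team won recent episodes (hypothetical helper)."""

    def __init__(self, window=100):
        # Only the most recent `window` episode outcomes are kept.
        self.results = deque(maxlen=window)

    def record(self, team_a_won):
        self.results.append(1 if team_a_won else 0)

    def win_rate_a(self):
        # With no history yet, treat the two teams as evenly matched.
        if not self.results:
            return 0.5
        return sum(self.results) / len(self.results)


def update_counts(win_rate_a, threshold=0.55, base_updates=4):
    """Return (updates_for_team_a, updates_for_team_b) for the next round.

    Assumed rule: the team that is clearly ahead is held static (0 updates)
    while the team that is behind trains; near parity, both teams train.
    """
    if win_rate_a > threshold:
        return 0, base_updates
    if win_rate_a < 1.0 - threshold:
        return base_updates, 0
    return base_updates, base_updates


# Usage: team A wins 3 of the last 4 episodes, so only team B is updated.
tracker = WinTracker(window=4)
for outcome in (True, True, False, True):
    tracker.record(outcome)
schedule = update_counts(tracker.win_rate_a())
```

In a full system, each count would set how many actor-critic gradient steps that team's networks receive before the next batch of episodes. Under this assumed rule, the roles swap as soon as the trailing team pulls ahead, which is the kind of alternating win-loss relationship the abstract reports.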

Keywords