Hangkong gongcheng jinzhan (Apr 2023)
Research on air combat decision algorithm based on proximal policy optimization
Abstract
In future combat scenarios involving cooperation between manned and unmanned aerial vehicles, real-time and accurate air combat decision-making is the basis for winning. The complex air environment, rapidly changing situation data, and numerous cumbersome combat tasks are making coordinated combat with unmanned aerial vehicles the trend in future air combat, replacing single-aircraft combat. However, multi-agent modeling and training face difficulties in reward allocation and network convergence. For a 5v5 air combat scenario with manned and unmanned aerial vehicle cooperation, this paper abstracts a characteristic model of a single agent and proposes an algorithm based on proximal policy optimization that obtains the air combat decision sequence through reward and punishment incentives during real-time interaction with the environment. The simulation results show that, after training, the proposed algorithm adapts to the complex battlefield situation and yields a stable and reasonable decision-making strategy in a continuous action space.
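For readers unfamiliar with proximal policy optimization, the following is a minimal sketch of the standard clipped surrogate objective that the method family is built on. It is not taken from the paper: the function name `ppo_clip_objective`, its parameter names, and the clipping value 0.2 (a commonly used default) are illustrative assumptions, and the paper's actual network, reward design, and hyperparameters are not given in this abstract.

```python
import numpy as np

def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate objective (to be maximized).

    new_logp, old_logp: log-probabilities of the sampled actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    clip_eps: clipping range; 0.2 is an assumed default, not the paper's value.
    """
    # Probability ratio between the current and the old policy.
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    # Clipping removes the incentive to move the ratio far from 1.
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Taking the element-wise minimum gives a pessimistic bound.
    return float(np.mean(np.minimum(unclipped, clipped)))
```

In a continuous action space, the log-probabilities would typically come from a Gaussian policy head, although the abstract does not state which parameterization the authors used.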
Keywords