Drones (Sep 2024)

The Optimal Strategies of Maneuver Decision in Air Combat of UCAV Based on the Improved TD3 Algorithm

  • Xianzhong Gao,
  • Yue Zhang,
  • Baolai Wang,
  • Zhihui Leng,
  • Zhongxi Hou

DOI
https://doi.org/10.3390/drones8090501
Journal volume & issue
Vol. 8, no. 9
p. 501

Abstract

Unmanned aerial vehicles (UAVs) now pose a significant challenge to air defense systems, and unmanned combat aerial vehicles (UCAVs) have proven effective in practice at countering them. Maneuver decision-making has therefore become a crucial technology for autonomous UCAV air combat. To address the maneuver decision-making problem, this paper proposes an autonomous UCAV model based on deep reinforcement learning. First, a six-degree-of-freedom (DoF) dynamic model was built in three-dimensional space, with the continuous actions of tangential overload, normal overload, and roll angle selected as the maneuver inputs. Second, to improve the convergence speed of the deep reinforcement learning method, the idea of “scenario-transfer training” was introduced into the twin delayed deep deterministic policy gradient (TD3) algorithm; the results showed that the improved algorithm reduced training time by about 60%. Third, the optimal maneuver generated by the proposed method was analyzed for the “nose-to-nose turn”, one of the classical maneuvers used by experienced pilots. The results showed that the maneuver strategy obtained by the proposed method was highly consistent with that of experienced fighter pilots. This is also the first publicly available article to compare maneuver decisions made by a deep reinforcement learning method with those of experienced fighter pilots. This research can provide a useful reference for generating autonomous decision-making strategies for UCAVs.
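
For readers unfamiliar with TD3, the following is a minimal sketch of the standard TD3 critic target (target policy smoothing plus the minimum of two target critics). It is not the authors' implementation and does not include their scenario-transfer training. The function name, the callables passed in, and the default hyperparameters are assumptions chosen for illustration, and the three action components are assumed to be tangential overload, normal overload, and roll angle normalized to [-1, 1].

```python
import numpy as np

def td3_target(reward, next_state, done,
               actor_target, critic1_target, critic2_target, rng,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0):
    """Compute the TD3 critic target for a batch of transitions.

    actor_target(next_state) -> actions of shape (batch, action_dim)
    criticN_target(next_state, action) -> Q-values of shape (batch,)
    All three callables are hypothetical stand-ins for target networks.
    """
    # Target policy smoothing: perturb the target action with clipped noise.
    next_action = actor_target(next_state)
    noise = rng.normal(0.0, noise_std, size=next_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -max_action, max_action)

    # Clipped double Q-learning: take the smaller of the two target critics
    # to reduce overestimation of the action value.
    q_next = np.minimum(critic1_target(next_state, next_action),
                        critic2_target(next_state, next_action))

    # Bootstrapped target; terminal transitions (done = 1) do not bootstrap.
    return reward + gamma * (1.0 - done) * q_next
```

The "delayed" part of TD3, updating the actor and the target networks less often than the critics, lives in the training loop and is not shown here.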

Keywords