Research on Maneuvering Decision Algorithm Based on Improved Deep Deterministic Policy Gradient

Jing Xianyong; Manyi Hou; Gaolong Wu; Zongcheng Ma; Zhongxiang Tao

doi:10.1109/ACCESS.2022.3202918

IEEE Access (Jan 2022)

Affiliations

Jing Xianyong: Aviation Combat Service Academy, Air Force Aviation University, Changchun, China
Manyi Hou: Aviation Combat Service Academy, Air Force Aviation University, Changchun, China
Gaolong Wu: Aviation Combat Service Academy, Air Force Aviation University, Changchun, China
Zongcheng Ma: Aviation Combat Service Academy, Air Force Aviation University, Changchun, China
Zhongxiang Tao: Aviation Combat Service Academy, Air Force Aviation University, Changchun, China

DOI: https://doi.org/10.1109/ACCESS.2022.3202918
Journal volume & issue: Vol. 10, pp. 92426–92445

Abstract

Autonomous maneuvering decision-making for unmanned aerial vehicles (UAVs) in short-range air combat remains a challenging research topic. This paper proposes a decision method based on an improved deep deterministic policy gradient (DDPG) algorithm. First, the problem model is improved from the perspective of energy air combat, and a decision model with engine thrust, angle of attack, and roll angle as control variables is established. These control variables determine the normal and tangential overloads, and the decision is constrained by flight stability and threshold ranges. Next, a learning algorithm for the maneuver commands is designed on the DDPG framework. Following the energy air combat perspective, speed is introduced into the return function in some states so that the return value better reflects reality. To address the slow learning speed of DDPG, the winning rate is introduced into the $\varepsilon$-greedy strategy to adjust the exploration and exploitation probabilities in real time. To address the loss of computational efficiency caused by the large volume of experience data, similar experiences are excluded based on vector distance. Simulation results show that the DDPG-based algorithm makes autonomous decisions on engine thrust, roll angle, and angle of attack under the constraints, and comparative simulations show that the improvements are effective.
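
The two efficiency measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration: the linear mapping from win rate to $\varepsilon$, the bounds `eps_min` and `eps_max`, the use of Euclidean distance, and the hypothetical names `epsilon_from_win_rate`, `choose_action`, and `FilteredReplayBuffer`. It is not the authors' implementation.

```python
import math
import random
from collections import deque


def epsilon_from_win_rate(win_rate, eps_min=0.05, eps_max=0.9):
    """Map a recent win rate in [0, 1] to an exploration probability.

    Assumed linear schedule (not from the paper): a low win rate gives
    more exploration, a high win rate gives more exploitation.
    """
    win_rate = min(max(win_rate, 0.0), 1.0)
    return eps_min + (eps_max - eps_min) * (1.0 - win_rate)


def choose_action(policy_action, action_low, action_high, epsilon, rng=random):
    """With probability epsilon, sample a uniform random action inside the
    per-dimension bounds; otherwise return the actor's action unchanged."""
    if rng.random() < epsilon:
        return [rng.uniform(lo, hi) for lo, hi in zip(action_low, action_high)]
    return list(policy_action)


class FilteredReplayBuffer:
    """Replay buffer that rejects experiences too close to a stored one.

    Each experience is a flat sequence of floats (for example state,
    action, reward, and next state concatenated). Similarity is assumed
    to mean Euclidean distance below `threshold`.
    """

    def __init__(self, capacity, threshold):
        self.buffer = deque(maxlen=capacity)
        self.threshold = threshold

    def add(self, experience):
        """Store the experience unless a similar one exists; return True if stored."""
        candidate = tuple(experience)
        for stored in self.buffer:
            if math.dist(stored, candidate) < self.threshold:
                return False
        self.buffer.append(candidate)
        return True

    def sample(self, batch_size, rng=random):
        """Draw a uniform random minibatch without replacement."""
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)
```

In a training loop, `epsilon_from_win_rate` would be re-evaluated from the win rate over recent episodes before each action choice, and `FilteredReplayBuffer.add` would replace the plain append to the DDPG replay memory. The linear scan in `add` costs time proportional to the buffer size; a spatial index would be a natural refinement for large buffers.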

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
