Tactical intent-driven autonomous air combat behavior generation method

Xingyu Wang; Zhen Yang; Shiyuan Chai; Jichuan Huang; Yupeng He; Deyun Zhou

doi:10.1007/s40747-024-01685-9

Complex & Intelligent Systems (Dec 2024)

Affiliations

Xingyu Wang: School of Electronics and Information, Northwestern Polytechnical University
Zhen Yang: School of Electronics and Information, Northwestern Polytechnical University
Shiyuan Chai: School of Electronics and Information, Northwestern Polytechnical University
Jichuan Huang: School of Electronics and Information, Northwestern Polytechnical University
Yupeng He: School of Electronics and Information, Northwestern Polytechnical University
Deyun Zhou: School of Electronics and Information, Northwestern Polytechnical University

DOI: https://doi.org/10.1007/s40747-024-01685-9
Journal volume & issue: Vol. 11, no. 1
pp. 1 – 22

Abstract

With the rapid development and deep application of artificial intelligence, modern air combat is gradually evolving towards intelligent combat. Although deep reinforcement learning algorithms have contributed to dramatic advances in air combat, they still face challenges such as poor interpretability and weak transferability of adversarial strategies. To address these challenges, this paper proposes a tactical intent-driven method for autonomous air combat behavior generation. First, the paper explores the mapping relationship between optimal strategies and rewards, demonstrating the detrimental effect on the policy of combining sparse and dense rewards. Building on this, the decision-making process of pilot behavior is analyzed, and a reward mapping model from intent to behavior is established. Finally, to address the poor stability and slow convergence of deep reinforcement learning algorithms in large-scale state-action spaces, the dueling-noisy-multi-step DQN algorithm is devised, which not only improves the accuracy of value function approximation but also enhances the efficiency of space exploration and network generalization. Experiments demonstrate the conflict between sparse and dense rewards, and the empirical results show that the proposed algorithm achieves better performance and stability than the other algorithms compared. More intuitively, the strategies under different intents exhibit strong interpretability and flexibility, which can provide tactical support for intelligent decision-making in air combat.

Published in Complex & Intelligent Systems

ISSN: 2199-4536 (Print); 2198-6053 (Online)
Publisher: Springer
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science; Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: https://www.springer.com/journal/40747
