Pursuing Benefits or Avoiding Threats: Realizing Regional Multi-Target Electronic Reconnaissance With Deep Reinforcement Learning

Yongle Xu; Ming Zhang; Boyin Jin

doi:10.1109/ACCESS.2023.3289077

IEEE Access (Jan 2023)

Affiliations

Yongle Xu: School of Computer, Jiangsu University of Science and Technology, Zhenjiang, China
Ming Zhang: School of Computer, Jiangsu University of Science and Technology, Zhenjiang, China
Boyin Jin: School of Computer, Jiangsu University of Science and Technology, Zhenjiang, China

Journal volume & issue: Vol. 11
pp. 63972 – 63984

Abstract

Unmanned combat aerial vehicles (UCAVs) are preferred for regional electronic reconnaissance because of their versatility and stealth. This paper proposes a deep reinforcement learning (DRL) method that enables UCAVs to complete regional multi-target electronic reconnaissance (MER) tasks with continuous autonomous maneuvers. In contrast to traditional heuristic search algorithms, we first derive the objective function of MER and identify sufficient conditions for improving the success rate of reconnaissance recognition. Then, building on the original cognitive electronic warfare framework, we create a three-dimensional MER simulator named Scouer-N to meet the requirements of dynamic-environment training for DRL-based agents. To process sequential situation awareness, we introduce a partially observable Markov decision process (POMDP) model and construct a generative network that helps the UCAV filter sensor observations and predict the actual states. Finally, we propose a priority-driven state reward shaping method that provides a normalized state representation and dense rewards to the agent during training, improving the agent’s behavioral knowledge of MER. The experimental results show a considerable improvement in the task success rate of the trained UCAV relative to the benchmark, demonstrating the efficacy of our approach in helping agents learn the optimal reconnaissance strategy from the potential state space.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
