IEEE Access (Jan 2023)
Pursuing Benefits or Avoiding Threats: Realizing Regional Multi-Target Electronic Reconnaissance With Deep Reinforcement Learning
Abstract
Unmanned combat aerial vehicles (UCAVs) are preferred for regional electronic reconnaissance due to their versatility and stealth. This paper proposes a deep reinforcement learning (DRL) method that enables UCAVs to complete regional multi-target electronic reconnaissance (MER) tasks with continuous autonomous maneuvers. In contrast to traditional heuristic search algorithms, we first derive the objective function of MER and elucidate sufficient conditions for improving the success rate of reconnaissance recognition. Then, using the original cognitive electronic warfare framework, we create a three-dimensional MER simulator named Scouer-N to meet the dynamic-environment training requirements of DRL-based agents. To process sequential situation awareness, we construct a generative network by introducing a partially observable Markov decision process (POMDP) model, which helps the UCAV filter sensor observations and predict the actual states. Finally, we propose a priority-driven state reward shaping method that provides a normalized state representation and dense rewards to the agent during training, improving the agent’s behavioral knowledge for MER. The experimental results show a considerable improvement in the task success rate of the trained UCAV relative to the benchmark, demonstrating the efficacy of our approach in helping agents learn the optimal reconnaissance strategy from the potential state space.
Keywords