IEEE Access (Jan 2024)
Hybrid-Pursuit Strategies in Multiple Pursuer-Evader Games Using Reinforcement Learning
Abstract
This paper presents a comprehensive learning strategy for the collaborative pursuit of evaders by multiple pursuers in environments with dynamic obstacles. Using a variational autoencoder framework for obstacle detection, we train the pursuers with the multiagent twin delayed deep deterministic policy gradient algorithm and the evaders with the proximal policy optimization algorithm, forming a complete pursuit-evasion strategy. In addition to collaborative pursuit strategies, our approach incorporates a scheme that lets individual pursuers directly capture nearby evaders, enhancing the flexibility and robustness of the overall system. The reward mechanism of these hybrid-pursuit strategies balances cooperative and individual rewards, informed by the states of the agents and the obstacles, to optimize overall performance. Simulation results demonstrate the efficacy of the proposed algorithm, which achieves successful collaborative and individual pursuits as well as dynamic obstacle avoidance.
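As a rough illustration of how a reward might balance cooperative and individual terms while accounting for obstacles, the following minimal Python sketch may help. It is not the reward function of the paper: the function name `hybrid_reward`, the weighting parameter `coop_weight`, the distance-based terms, and all numeric constants are assumptions chosen only for illustration.

```python
import math


def hybrid_reward(pursuer, teammates, evader, obstacles,
                  coop_weight=0.5, capture_radius=0.5,
                  safe_radius=1.0, capture_bonus=10.0,
                  collision_penalty=5.0):
    """Hypothetical per-pursuer reward mixing team and individual terms.

    All positions are (x, y) tuples. Every constant here is an
    illustrative assumption, not a value taken from the paper.
    """
    # Individual term: the closer this pursuer is to the evader, the better,
    # with a bonus when the evader is within the capture radius.
    own_dist = math.dist(pursuer, evader)
    individual = -own_dist
    if own_dist <= capture_radius:
        individual += capture_bonus

    # Cooperative term: negative mean distance of the whole team to the evader.
    team = [pursuer] + list(teammates)
    cooperative = -sum(math.dist(p, evader) for p in team) / len(team)

    # Obstacle term: fixed penalty if any obstacle is inside the safety radius.
    too_close = any(math.dist(pursuer, o) < safe_radius for o in obstacles)
    penalty = collision_penalty if too_close else 0.0

    # Weighted blend of the two pursuit terms, minus the obstacle penalty.
    return coop_weight * cooperative + (1.0 - coop_weight) * individual - penalty


if __name__ == "__main__":
    r = hybrid_reward(pursuer=(0.0, 0.0), teammates=[(6.0, 8.0)],
                      evader=(3.0, 4.0), obstacles=[(0.5, 0.0)])
    print(r)
```

In this sketch, setting `coop_weight` closer to 1 favors team-level pursuit, while values closer to 0 favor a single pursuer chasing a nearby evader on its own.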
Keywords