IEEE Access (Jan 2023)
Confrontation and Obstacle-Avoidance of Unmanned Vehicles Based on Progressive Reinforcement Learning
Abstract
The core technique of unmanned vehicle systems is the autonomous maneuvering decision, which not only determines the applications of unmanned vehicles but also is the critical technique many countries are competing to develop. Reinforcement Learning (RL) is the potential design method for autonomous maneuvering decision-making systems. Nevertheless, in the face of complex decision-making tasks, it is still challenging to master the optimal policy due to the low learning efficiency caused by the complex environment, high dimensional state, and sparse reward. Inspired by the human learning process from simple to complex, we propose a novel progressive deep RL algorithm for policy optimization in unmanned autonomous decision-making systems in this paper. The proposed algorithm divides the training of the autonomous maneuvering decision into a sequence of curricula with learning tasks from simple to complex. Finally, through the self-play stage, the iterative optimization of the policy is realized. Furthermore, the confrontation environment with two unmanned vehicles with obstacles is analyzed and modeled. Finally, the simulation leads to the one-to-one adversarial tasks demonstrate the effectiveness and applicability of the proposed design algorithm.
Keywords