Applied Sciences (May 2024)
Hybrid Centralized Training and Decentralized Execution Reinforcement Learning in Multi-Agent Path-Finding Simulations
Abstract
In this paper, we propose a hybrid centralized training and decentralized execution neural network architecture based on deep reinforcement learning (DRL) for multi-agent path-finding simulation. When physical robots are trained directly, collisions and other accidents are very likely in multi-agent cases, so the networks are instead trained with a deep deterministic policy gradient method in the virtual environment of a simulator. The simple multi-agent particle simulator designed by OpenAI (San Francisco, CA, USA) is used as the training platform, and the state information of the environment can be obtained easily from it. The training cycle uses a self-designed reward function and follows a progressive learning approach that moves from simple to complex environments. Finally, we carried out multi-agent path-finding simulation experiments and present the results. In these experiments, the proposed methodology outperforms the multi-agent model-based policy optimization (MAMBPO) and model-free multi-agent soft actor–critic models.
Keywords