Hybrid Centralized Training and Decentralized Execution Reinforcement Learning in Multi-Agent Path-Finding Simulations

Hua-Ching Chen; Shih-An Li; Tsung-Han Chang; Hsuan-Ming Feng; Yun-Chien Chen

doi:10.3390/app14103960

Applied Sciences (May 2024)

Affiliations

Hua-Ching Chen: School of Information Engineering, Xiamen Ocean Vocational College, Xiamen 361100, China
Shih-An Li: Department of Electrical and Computer Engineering, Tamkang University, New Taipei City 10650, Taiwan
Tsung-Han Chang: Department of Electrical and Computer Engineering, Tamkang University, New Taipei City 10650, Taiwan
Hsuan-Ming Feng: Department of Computer Science and Information Engineering, National Quemoy University, Kinmen County 892, Taiwan
Yun-Chien Chen: Department of Electrical and Computer Engineering, Tamkang University, New Taipei City 10650, Taiwan

DOI: https://doi.org/10.3390/app14103960
Journal volume & issue: Vol. 14, no. 10, p. 3960

Abstract

In this paper, we propose a hybrid centralized training and decentralized execution neural network architecture based on deep reinforcement learning (DRL) for multi-agent path-finding simulation. Training physical robots in multi-agent settings is very likely to cause collisions and other unintended accidents, so the networks are instead trained with a deep deterministic policy gradient method in the virtual environment of a simulator. The simple multi-agent particle simulator designed by OpenAI (San Francisco, CA, USA) serves as the training platform because the state information of the environment is easy to obtain from it. The training cycle uses a self-designed reward function and follows a progressive learning approach that moves from simple to complex environments. Finally, we present experiments on multi-agent path-finding simulations. In these experiments, the proposed method performs better than the multi-agent model-based policy optimization (MAMBPO) and model-free multi-agent soft actor–critic models.
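
The general idea of centralized training with decentralized execution can be illustrated with a minimal sketch. It is not the architecture from the paper: the class names `Actor` and `CentralCritic`, the single linear layer, the dimensions, and the random inputs are all assumptions made for illustration, and no training step is shown. Each actor sees only its own local observation, while the critic scores the joint observations and actions of all agents and would be used only during training.

```python
import math
import random


class Actor:
    """Decentralized policy: maps ONE agent's local observation to an action.

    Hypothetical single linear layer with tanh, chosen only for brevity.
    """

    def __init__(self, obs_dim, act_dim, rng):
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(obs_dim)]
                  for _ in range(act_dim)]

    def act(self, obs):
        # Deterministic action bounded to [-1, 1], as in DDPG-style actors.
        return [math.tanh(sum(w * o for w, o in zip(row, obs)))
                for row in self.w]


class CentralCritic:
    """Centralized Q-function: scores the joint observation and joint action.

    Only needed while training; at execution time the actors run alone.
    """

    def __init__(self, joint_dim, rng):
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(joint_dim)]

    def q_value(self, all_obs, all_actions):
        # Flatten every agent's observation and action into one vector.
        joint = [x for obs in all_obs for x in obs]
        joint += [x for act in all_actions for x in act]
        return sum(w * x for w, x in zip(self.w, joint))


# Assumed sizes for illustration only.
N_AGENTS, OBS_DIM, ACT_DIM = 3, 4, 2
rng = random.Random(0)

actors = [Actor(OBS_DIM, ACT_DIM, rng) for _ in range(N_AGENTS)]
critic = CentralCritic(N_AGENTS * (OBS_DIM + ACT_DIM), rng)

# Placeholder observations; a simulator would supply these in practice.
observations = [[rng.uniform(-1.0, 1.0) for _ in range(OBS_DIM)]
                for _ in range(N_AGENTS)]

# Decentralized execution: each actor uses only its own observation.
actions = [actor.act(obs) for actor, obs in zip(actors, observations)]

# Centralized training signal: the critic sees everything at once.
q = critic.q_value(observations, actions)
```

In a full deep deterministic policy gradient setup, the critic's output would drive gradient updates of each actor, and after training only the actors would be deployed.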

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
