IEEE Access (Jan 2021)
A Novel Hierarchical Soft Actor-Critic Algorithm for Multi-Logistics Robots Task Allocation
Abstract
In intelligent unmanned warehouse goods-to-man systems, the allocation of tasks has an important influence on the efficiency because of the dynamic performance of AGV robots and orders. The paper presents a hierarchical Soft Actor-Critic algorithm to solve the dynamic scheduling problem of orders picking. The method proposed is based on the classic Soft Actor-Critic and hierarchical reinforcement learning algorithm. In this paper, the model is trained at different time scales by introducing sub-goals, with the top-level learning a policy and the bottom level learning a policy to achieve the sub-goals. The actor of the controller aims to maximize expected intrinsic reward while also maximizing entropy. That is, to succeed at the sub-goals while moving as randomly as possible. Finally, experimental results for simulation experiments in different scenes show that the method can make multi-logistics AGV robots work together and improves the reward in sparse environments about 2.61 times compared to the SAC algorithm.
Keywords