IEEE Access (Jan 2024)
Farthest Agent Selection With Episode-Wise Observations for Real-Time Multi-Agent Reinforcement Learning Applications
Abstract
Multi-agent reinforcement learning (MARL) algorithms are widely used in applications that require sequential decision-making to maximize expected rewards through multi-agent cooperation. However, MARL faces significant challenges, particularly in resource-limited real-time computing environments. To address this problem, this paper considers selecting a subset of agents for training, which reduces computational overhead. For this selection, a farthest agent selection (FAS) method is proposed, inspired by farthest point sampling for representative sample selection in 3D point cloud processing. The proposed FAS method selects agents in real time based on their episode-specific observations, and the number of selected agents is determined adaptively from the real-time variances of each agent's observations. The proposed FAS method is rigorously evaluated on the StarCraft Multi-Agent Challenge (SMAC) and Predator-Prey (PP) tasks, demonstrating superior performance compared to existing MARL algorithms. The proposed method is scalable and can therefore contribute to more efficient MARL training methodologies for applications such as real-time strategy games and human-robot cooperation scenarios that require multi-agent cooperation under partial observability.
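The farthest point sampling procedure that inspires FAS can be sketched as follows. This is an illustrative greedy implementation on toy 2-D "observation" vectors, not the authors' exact algorithm; the function name, feature dimensionality, and starting index are assumptions for the example.

```python
import numpy as np

def farthest_point_sampling(points, k, start=0):
    """Greedy farthest point sampling (illustrative sketch).

    Repeatedly picks the point farthest from the already-selected
    set, yielding a spatially diverse subset of k points.
    """
    selected = [start]
    # Distance of every point to its nearest selected point so far.
    dists = np.linalg.norm(points - points[start], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))  # farthest from the selected set
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(points - points[nxt], axis=1))
    return selected

# Toy example: 6 agents, each summarized by a 2-D observation feature.
obs = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0],
                [5.1, 0.1], [0.0, 5.0], [2.5, 2.5]])
chosen = farthest_point_sampling(obs, k=3)  # a diverse subset of 3 agents
```

In a FAS-like setting, `points` would instead hold per-agent episode observation features, so the selected subset covers the most dissimilar agent experiences rather than training on all agents.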
Keywords