A Low-Cost Q-Learning-Based Approach to Handle Continuous Space Problems for Decentralized Multi-Agent Robot Navigation in Cluttered Environments

Vahid Babaei Ajabshir; Mehmet Serdar Guzel; Erkan Bostanci

doi:10.1109/ACCESS.2022.3163393

IEEE Access (Jan 2022)

Affiliations

Vahid Babaei Ajabshir: Computer Engineering Department, Ankara University, Ankara, Turkey
Mehmet Serdar Guzel: Computer Engineering Department, Ankara University, Ankara, Turkey
Erkan Bostanci: Computer Engineering Department, Ankara University, Ankara, Turkey

DOI: https://doi.org/10.1109/ACCESS.2022.3163393
Journal volume & issue: Vol. 10
pp. 35287 – 35301

Abstract

This paper addresses the problem of navigating decentralized multi-agent systems in partially cluttered environments and proposes a new machine-learning-based approach to solve it. On the basis of this approach, a new robust and flexible Q-learning-based model is proposed to handle a continuous space problem. As a reinforcement learning (RL) algorithm, Q-learning (QL) does not require a model of the environment. QL also has the advantages of being fast and easy to design. However, one disadvantage of QL is that it needs a massive amount of memory, which grows exponentially with each extra feature introduced to the state space. In this research, we introduce a low-cost, agent-level decentralized collision avoidance model for solving a continuous space problem in partially cluttered environments. We then introduce a method to merge non-overlapping QL features, which reduces the memory QL requires by about 70% and makes it possible to solve more complicated scenarios with the same memory size. Additionally, another method is proposed for minimizing the sensory data used by the controller. A combination of these methods is able to handle swarm navigation at low memory cost with at least 18 robots. These methods can also be adapted for deep Q-learning architectures to increase their approximation performance and decrease their learning time. Experiments reveal that the proposed method also achieves a high degree of accuracy for multi-agent systems in complex scenarios.
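
To illustrate the memory concern raised in the abstract, the following is a minimal sketch of a tabular Q-learning update together with a count of how many Q-values a dense table needs as features are added. It is not the authors' model: the function names, the 5-bin discretization, the 3 actions, and the values of `alpha` and `gamma` are all assumptions chosen for illustration, and the feature-merging method itself is not reproduced here because the abstract does not describe it.

```python
def q_table_entries(bins_per_feature, n_actions):
    """Count the Q-values in a dense table: product of all bin counts times actions."""
    n_states = 1
    for bins in bins_per_feature:
        n_states *= bins
    return n_states * n_actions


def q_update(q, state, action, reward, next_state, n_actions,
             alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update to the dict `q` in place.

    States are tuples of discretized feature values; missing entries count as 0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in range(n_actions))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


# Hypothetical setup: every feature is split into 5 bins, with 3 actions.
for n_features in (2, 4, 6):
    print(n_features, q_table_entries([5] * n_features, n_actions=3))
```

Under these assumed settings, each additional feature multiplies the table size by its number of bins, which is the exponential growth the abstract refers to.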

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
