Multi-Agent Reinforcement Learning with Optimal Equivalent Action of Neighborhood

Haixing Wang; Yi Yang; Zhiwei Lin; Tian Wang

doi:10.3390/act11040099

Actuators (Mar 2022)

Affiliations

Haixing Wang: Henan Key Laboratory of Intelligent Detection and Control of Coal Mine Equipment, School of Electrical Engineering and Automation, Henan Polytechnic University, Shiji Road, Jiaozuo 454003, China
Yi Yang: Henan Key Laboratory of Intelligent Detection and Control of Coal Mine Equipment, School of Electrical Engineering and Automation, Henan Polytechnic University, Shiji Road, Jiaozuo 454003, China
Zhiwei Lin: School of Mathematics and Physics, Queen’s University Belfast, University Road, 10587, Belfast BT7 1NN, UK
Tian Wang: Institute of Artificial Intelligence, Beihang University, Xueyuan Road, Beijing 100083, China

Journal volume & issue: Vol. 11, no. 4
p. 99

Abstract

In a multi-agent system, the complex interactions among agents are one of the main obstacles to optimal decision-making. This paper proposes a new action-value function and a learning mechanism based on the optimal equivalent action of the neighborhood (OEAN) of a multi-agent system, so that the agents can reach the optimal decision. In the new Q-value function, the OEAN represents the equivalent interaction between the current agent and the other agents. To cope with the non-stationary environment that arises when agents act, the OEAN of the current agent is inferred simultaneously by maximum a posteriori (MAP) estimation based on a hidden Markov random field (HMRF) model. A convergence analysis of the proposed method proves that the Q-value function approaches the global Nash equilibrium value under the iteration mechanism. The effectiveness of the method is verified in a case study of top-coal caving. The experimental results show that the OEAN reduces the complexity of describing the agents' interactions while significantly improving top-coal caving performance.
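As a rough illustration of the idea of a Q-value function that depends on an equivalent neighborhood action, the following is a minimal sketch and not the algorithm from the paper. It assumes a tabular setting in which the Q-table is indexed by state, the agent's own action, and a single discrete equivalent action of the neighborhood. The function name `q_update`, the table layout, the learning rate `alpha`, and the discount `gamma` are all hypothetical. The equivalent neighborhood action is simply passed in here; the paper infers it by maximum a posteriori estimation on a hidden Markov random field, which this sketch does not attempt.

```python
import numpy as np


def q_update(Q, s, a, a_nbr, r, s_next, a_nbr_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step for a single agent (illustrative only).

    Q has shape (n_states, n_own_actions, n_neighborhood_actions).
    a_nbr and a_nbr_next are assumed to be given; they stand in for the
    equivalent action of the neighborhood at the current and next step.
    """
    # Bootstrap from the agent's best own action in the next state,
    # holding the neighborhood's equivalent action fixed.
    target = r + gamma * Q[s_next, :, a_nbr_next].max()
    # Move the current estimate toward the target.
    Q[s, a, a_nbr] += alpha * (target - Q[s, a, a_nbr])
    return Q


# Hypothetical toy problem: 2 states, 2 own actions, 2 neighborhood actions.
Q = np.zeros((2, 2, 2))
Q = q_update(Q, s=0, a=1, a_nbr=0, r=1.0, s_next=1, a_nbr_next=1)
```

Collapsing the neighbors' joint action into one index keeps the table size independent of the number of neighbors, which is the kind of reduction in interaction complexity the abstract describes.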

Published in Actuators

ISSN: 2076-0825 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Materials of engineering and construction. Mechanics of materials; Technology: Electrical engineering. Electronics. Nuclear engineering: Production of electric energy or power. Powerplants. Central stations
Website: http://www.mdpi.com/journal/actuators