A semi-independent policies training method with shared representation for heterogeneous multi-agents reinforcement learning

Biao Zhao; Weiqiang Jin; Zhang Chen; Yucheng Guo

doi:10.3389/fnins.2023.1201370

Frontiers in Neuroscience (Jun 2023)

Affiliations

Biao Zhao: School of Information and Communications Engineering, Xi'an Jiaotong University, Xi'an, China
Weiqiang Jin: School of Information and Communications Engineering, Xi'an Jiaotong University, Xi'an, China
Zhang Chen: School of Software Engineering, Xi'an Jiaotong University, Xi'an, China
Yucheng Guo: Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, College of Stomatology, Xi'an Jiaotong University, Xi'an, China
Yucheng Guo: Department of Orthodontics, Stomatological Hospital of Xi'an Jiaotong University, Xi'an, China

Journal volume & issue: Vol. 17

Abstract

Humans do not learn everything from scratch; they connect and associate incoming information with exchanged experience and existing knowledge. This idea can be extended to cooperative multi-agent reinforcement learning, where it has succeeded for homogeneous agents by means of parameter sharing. However, parameter sharing is difficult to apply directly to heterogeneous agents because of their individual forms of input/output and their diverse functions and targets. Neuroscience has provided evidence that the brain creates several levels of experience- and knowledge-sharing mechanisms that not only exchange similar experiences but also allow abstract concepts to be shared, helping to handle unfamiliar situations that others have already encountered. Inspired by these brain functions, we propose a semi-independent policy training method that can effectively tackle the conflict between parameter sharing and specialized training for heterogeneous agents. It employs a shared common representation for both observation and action, enabling the integration of various input and output sources. Additionally, a shared latent space is used to maintain a balanced relationship between the upstream policy and downstream functions, benefiting each individual agent's target. Experiments show that the proposed method outperforms current mainstream algorithms, especially when handling heterogeneous agents. Empirically, the proposed method can also be extended into a more general and fundamental structure for heterogeneous-agent reinforcement learning, supporting curriculum learning and representation transfer. All our code is openly released at https://gitlab.com/reinforcement/ntype.
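
The abstract gives no architectural details, so the following is only a minimal illustrative sketch of the general idea it describes: agents with different observation and action sizes pass through agent-specific input adapters into a common representation, share one set of latent-space parameters, and then use agent-specific output heads. It is not the authors' implementation. The agent names, layer sizes, activation functions, and the `policy` helper are all assumptions made for illustration, and no training loop is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

COMMON_DIM = 8   # assumed size of the common observation representation
LATENT_DIM = 16  # assumed size of the shared latent space

# Hypothetical heterogeneous agents: name -> (observation size, number of actions)
AGENT_SPECS = {"scout": (5, 3), "carrier": (12, 6)}


def linear(in_dim, out_dim):
    """Return a randomly initialised weight matrix and a zero bias vector."""
    return rng.normal(0.0, 0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


# Shared block: a single set of parameters used by every agent.
shared_w, shared_b = linear(COMMON_DIM, LATENT_DIM)

# Agent-specific blocks: one input adapter and one output head per agent.
adapters = {name: linear(obs, COMMON_DIM) for name, (obs, _) in AGENT_SPECS.items()}
heads = {name: linear(LATENT_DIM, act) for name, (_, act) in AGENT_SPECS.items()}


def policy(name, obs):
    """Map one agent's observation to a probability distribution over its actions."""
    w_in, b_in = adapters[name]
    common = np.tanh(obs @ w_in + b_in)             # agent-specific input -> common representation
    latent = np.tanh(common @ shared_w + shared_b)  # parameters shared by all agents
    w_out, b_out = heads[name]
    logits = latent @ w_out + b_out                 # shared latent -> agent-specific action logits
    exp = np.exp(logits - logits.max())             # numerically stable softmax
    return exp / exp.sum()


# Each agent gets a distribution over its own action set despite the shared block.
probs = {name: policy(name, rng.normal(size=obs)) for name, (obs, _) in AGENT_SPECS.items()}
```

In this sketch only `shared_w` and `shared_b` are common to all agents, which is one simple way to read "semi-independent": the adapters and heads stay specialized while the middle of the network is shared.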

Published in Frontiers in Neuroscience

ISSN: 1662-4548 (Print); 1662-453X (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Medicine: Internal medicine: Neurosciences. Biological psychiatry. Neuropsychiatry
Website: http://www.frontiersin.org/neuroscience
