A Survey on Population-Based Deep Reinforcement Learning

Weifan Long; Taixian Hou; Xiaoyi Wei; Shichao Yan; Peng Zhai; Lihua Zhang

doi:10.3390/math11102234

Mathematics (May 2023)

Affiliations

Weifan Long: Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
Taixian Hou: Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
Xiaoyi Wei: Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
Shichao Yan: Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
Peng Zhai: Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
Lihua Zhang: Academy for Engineering and Technology, Fudan University, Shanghai 200433, China

DOI: https://doi.org/10.3390/math11102234
Journal volume & issue: Vol. 11, no. 10, p. 2234

Abstract

Many real-world applications can be described as large-scale games of imperfect information, and solving them typically requires extensive prior domain knowledge, especially in competitive or human–AI cooperation settings. Population-based training methods have become a popular way to learn robust policies, without any prior knowledge, that can generalize to the policies of other players or humans. In this survey, we shed light on population-based deep reinforcement learning (PB-DRL) algorithms, their applications, and general frameworks. We introduce several independent subject areas, including naive self-play, fictitious self-play, population-play, evolution-based training methods, and the policy-space response oracle family. These methods provide a variety of approaches to solving multi-agent problems and are useful in designing robust multi-agent reinforcement learning algorithms that can handle complex real-life situations. Finally, we discuss challenges and hot topics in PB-DRL algorithms. We hope that this brief survey can provide guidance and insights for researchers interested in PB-DRL algorithms.

Published in Mathematics

ISSN: 2227-7390 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/mathematics
