IEEE Access (Jan 2022)

Controllable Swarm Animation Using Deep Reinforcement Learning With a Rule-Based Action Generator

  • Zong-Sheng Wang,
  • Chang Geun Song,
  • Jung Lee,
  • Jong-Hyun Kim,
  • Sun-Jeong Kim

DOI
https://doi.org/10.1109/ACCESS.2022.3172492
Journal volume & issue
Vol. 10
pp. 48472 – 48485

Abstract

Swarm behavior in nature is a fascinating and complex phenomenon that has been studied extensively for decades. State-of-the-art rule-based methods can produce visually natural swarm animation; however, under user control they still suffer from low control accuracy and unstable swarm behavior quality. This study proposes a deep reinforcement learning (DRL) based approach to generate high-quality swarm animation that reacts to real-time user control. A rule-based action generator (RAG) adapted to the actor-critic DRL method is presented to enhance DRL’s action exploration strategy. Various practical dynamic reward functions are also designed to train agents by rewarding swarm behaviors and penalizing misbehavior. The user controls the swarm by interacting with its leader agent, for example by directly changing its speed or orientation, or by specifying a path consisting of waypoints. The second aim of this study is to improve the scalability of the trained policy. To this end, it introduces a new DRL state observation quantity, called the embedded features of swarm (EFS), which allows the trained policy to scale to a larger system than the one it was trained on. In the experiments, four scenarios are designed to evaluate the control accuracy and the quality of the generated swarm behavior through metrics and visualization. The experiments also compare the performance of the proposed dynamic reward functions with that of fixed reward functions. The results show that the proposed approach outperforms state-of-the-art methods in terms of swarm behavior quality and control accuracy, and that the proposed dynamic reward functions are more effective than the existing reward functions.

Keywords