UAV Control Method Combining Reptile Meta-Reinforcement Learning and Generative Adversarial Imitation Learning

Shui Jiang; Yanning Ge; Xu Yang; Wencheng Yang; Hui Cui

doi:10.3390/fi16030105

Future Internet (Mar 2024)

Affiliations

Shui Jiang: College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350007, China
Yanning Ge: College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350007, China
Xu Yang: College of Computer and Control Engineering, Minjiang University, Fuzhou 350108, China
Wencheng Yang: School of Mathematics, Physics and Computing, University of Southern Queensland, Darling Heights, QLD 4350, Australia
Hui Cui: Department of Software Systems & Cybersecurity, Monash University, Melbourne, VIC 3800, Australia

DOI: https://doi.org/10.3390/fi16030105
Journal volume & issue: Vol. 16, no. 3, p. 105

Abstract

Reinforcement learning (RL) is pivotal in empowering Unmanned Aerial Vehicles (UAVs) to navigate and make decisions efficiently and intelligently within complex and dynamic surroundings. Despite its significance, RL is hampered by inherent limitations such as low sample efficiency, restricted generalization capabilities, and a heavy reliance on the intricacies of reward function design. These challenges often render single-method RL approaches inadequate, particularly in the context of UAV operations where high costs and safety risks in real-world applications cannot be overlooked. To address these issues, this paper introduces a novel RL framework that synergistically integrates meta-learning and imitation learning. By leveraging the Reptile algorithm from meta-learning and Generative Adversarial Imitation Learning (GAIL), coupled with state normalization techniques for processing state data, this framework significantly enhances the model’s adaptability. It achieves this by identifying and leveraging commonalities across various tasks, allowing for swift adaptation to new challenges without the need for complex reward function designs. To ascertain the efficacy of this integrated approach, we conducted simulation experiments in two-dimensional environments. The empirical results clearly indicate that our GAIL-enhanced Reptile method surpasses conventional single-method RL algorithms in terms of training efficiency. This evidence underscores the potential of combining meta-learning and imitation learning to surmount the traditional barriers faced by reinforcement learning in UAV trajectory planning and decision-making processes.
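
For readers unfamiliar with Reptile, the following is a minimal sketch of its meta-update, not the authors' implementation. It assumes a toy quadratic loss per task in place of the GAIL-derived RL objective, which the abstract does not specify; the function names, step counts, and learning rates are hypothetical and chosen only for illustration.

```python
import numpy as np

def inner_sgd(theta, target, steps=5, lr=0.1):
    """Adapt a copy of theta to one task with a few gradient steps.

    The task loss is a toy stand-in, 0.5 * ||phi - target||^2
    (an assumption; the paper's objective comes from RL with GAIL).
    """
    phi = theta.copy()
    for _ in range(steps):
        grad = phi - target          # gradient of the toy quadratic loss
        phi = phi - lr * grad
    return phi

def reptile_step(theta, task_targets, meta_lr=0.5):
    """One batched Reptile update: move theta toward the mean adapted weights."""
    adapted = [inner_sgd(theta, t) for t in task_targets]
    return theta + meta_lr * (np.mean(adapted, axis=0) - theta)

# Hypothetical task family: targets scattered around (1, 1).
rng = np.random.default_rng(0)
task_targets = [np.array([1.0, 1.0]) + 0.1 * rng.standard_normal(2)
                for _ in range(4)]

theta = np.zeros(2)                  # meta-initialization
for _ in range(50):
    theta = reptile_step(theta, task_targets)
```

In this toy setting the meta-initialization settles near the centre of the task targets, which is the sense in which Reptile exploits commonalities across tasks: a new task drawn from the same family then needs only a few inner steps to adapt.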

Published in Future Internet

ISSN: 1999-5903 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/futureinternet/
