Safe Reinforcement Learning for Transition Control of Ducted-Fan UAVs

Yanbo Fu; Wenjie Zhao; Liu Liu

doi:10.3390/drones7050332

Drones (May 2023)

Affiliations

Yanbo Fu: School of Aeronautics and Astronautics, Zhejiang University, Hangzhou 310027, China
Wenjie Zhao: School of Aeronautics and Astronautics, Zhejiang University, Hangzhou 310027, China
Liu Liu: School of Aeronautics and Astronautics, Zhejiang University, Hangzhou 310027, China

DOI: https://doi.org/10.3390/drones7050332
Journal volume & issue: Vol. 7, no. 5, p. 332

Abstract

Ducted-fan tail-sitter unmanned aerial vehicles (UAVs) provide versatility and unique benefits, attracting significant attention in various applications. This study develops a safe reinforcement learning method for back-transition control between level flight mode and hover mode for ducted-fan tail-sitter UAVs. Our method enables transition control with minimal altitude change and transition time while adhering to the velocity constraint. We train controllers with the Trust Region Policy Optimization, Proximal Policy Optimization with Lagrangian, and Constrained Policy Optimization (CPO) algorithms, and the comparison shows the superiority of the CPO algorithm and the necessity of the velocity constraint. The transition trajectory achieved using the CPO algorithm closely resembles the optimal trajectory obtained via the GPOPS-II software with the SNOPT solver. The CPO algorithm also exhibits strong robustness under unknown perturbations of UAV model parameters and wind disturbance.
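
As a rough illustration of how a constraint such as the velocity limit can enter training in a Lagrangian-style method like the Proximal Policy Optimization with Lagrangian named above, the sketch below shows a generic dual-ascent multiplier update. It is not taken from the paper: the function names, the per-episode cost values, the cost limit, and the learning rate are all hypothetical, and CPO itself enforces the constraint differently, through a trust-region step.

```python
# Generic Lagrangian-constraint sketch; all names and numbers are illustrative
# assumptions, not values or code from the paper.

def update_multiplier(lam, episode_cost, cost_limit, lr=0.05):
    """Dual-ascent step: raise lam when cost exceeds the limit, lower it otherwise.

    The multiplier is clipped at zero so the penalty never rewards incurring cost.
    """
    return max(0.0, lam + lr * (episode_cost - cost_limit))


def penalized_return(reward_return, episode_cost, lam):
    """Surrogate objective the policy would maximize for a fixed multiplier."""
    return reward_return - lam * episode_cost


lam = 0.0
# Hypothetical per-episode constraint costs, e.g. accumulated velocity-limit violation.
for cost in [4.0, 3.0, 1.0, 0.5]:
    lam = update_multiplier(lam, cost, cost_limit=1.0)
```

The multiplier grows while episodes violate the limit and decays once they satisfy it, so the penalty weight adapts during training instead of being hand-tuned.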

Published in Drones

ISSN: 2504-446X (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Motor vehicles. Aeronautics. Astronautics
Website: http://www.mdpi.com/journal/drones
