Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings

Vidura Sumanasena; Heshan Fernando; Daswin De Silva; Beniel Thileepan; Amila Pasan; Jayathu Samarawickrama; Evgeny Osipov; Damminda Alahakoon

doi:10.3390/s24010185

Sensors (Dec 2023)

Affiliations

Vidura Sumanasena: Research Centre for Data Analytics and Cognition, La Trobe University, Bundoora, VIC 3083, Australia
Heshan Fernando: Department of Computer Engineering, Rensselaer Polytechnic Institute, New York, NY 12180, USA
Daswin De Silva: Research Centre for Data Analytics and Cognition, La Trobe University, Bundoora, VIC 3083, Australia
Beniel Thileepan: Department of Computer Science, University of Warwick, Coventry CV4 7AL, UK
Amila Pasan: Centre for Wireless Communications, University of Oulu, 90570 Oulu, Finland
Jayathu Samarawickrama: Department of Electronic and Telecom Engineering, University of Moratuwa, Moratuwa 10400, Sri Lanka
Evgeny Osipov: Department of Computer Science, Electrical and Space Engineering, Luleå University of Technology, 97187 Luleå, Sweden
Damminda Alahakoon: Research Centre for Data Analytics and Cognition, La Trobe University, Bundoora, VIC 3083, Australia

DOI: https://doi.org/10.3390/s24010185
Journal volume & issue: Vol. 24, no. 1, p. 185

Abstract

Direct policy learning (DPL) is a widely used approach in imitation learning for time-efficient and effective convergence when training mobile robots. However, the use of DPL in real-world applications has not been sufficiently explored, owing to the inherent challenges of mobilizing direct human expertise and the difficulty of measuring comparative performance. Furthermore, autonomous systems are often resource-constrained, which limits the application and implementation of highly effective deep learning models. In this work, we present a lightweight DPL-based approach to train mobile robots in navigational tasks. We integrated a safety policy alongside the navigational policy to safeguard the robot and the environment. The approach was evaluated in simulations and real-world settings and compared with recent work in this space. The results of these experiments, together with the efficient transfer from simulations to real-world settings, demonstrate that our approach improves on the performance of its hardware-intensive counterparts. We show that, using the proposed methodology, the training agent achieves performance close to that of the expert within the first 15 training iterations in both simulation and real-world settings.

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
