Applied Sciences (Nov 2024)
Expert-Trajectory-Based Features for Apprenticeship Learning via Inverse Reinforcement Learning for Robotic Manipulation
Abstract
This paper explores the application of Inverse Reinforcement Learning (IRL) in robotics, focusing on inferring reward functions from expert demonstrations of robot arm manipulation tasks. By leveraging IRL, we aim to develop efficient and adaptable techniques for learning robust solutions to complex tasks in continuous state spaces. Our approach combines Apprenticeship Learning via IRL with Proximal Policy Optimization (PPO), expert-trajectory-based features, and a reverse discounting scheme. The feature space is constructed by sampling expert trajectories to capture essential task characteristics, enhancing learning efficiency and generalizability by concentrating on critical states. To prevent feature expectations from vanishing in goal states, we apply reverse discounting, which prioritizes feature expectations in final states. We validate our methodology through experiments in a simple GridWorld environment, demonstrating that reverse discounting improves the alignment of the agent’s feature expectations with those of the expert. Additionally, we analyze how the parameters of the proposed feature definition influence performance. Further experiments on robotic manipulation tasks with the TIAGo robot compare our approach against state-of-the-art methods, confirming its effectiveness and adaptability in complex continuous state spaces across diverse manipulation tasks.
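To make the vanishing-feature-expectation problem concrete, the first equation below gives the standard discounted feature expectations from Apprenticeship Learning via IRL (Abbeel and Ng, 2004), under which goal-state features are suppressed by the factor $\gamma^{t}$. The second equation is a minimal sketch of a reverse-discounted variant over a finite horizon $T$; the specific weighting $\gamma^{T-t}$ is our illustrative assumption, not necessarily the exact formulation used in the paper.

% Standard feature expectations (Abbeel & Ng, 2004): with gamma < 1,
% the contribution of late (goal) states decays as gamma^t and vanishes.
\mu(\pi) = \mathbb{E}\left[ \sum_{t=0}^{T} \gamma^{t} \, \phi(s_t) \;\middle|\; \pi \right]

% Hypothetical reverse-discounted variant over a finite horizon T:
% the weight gamma^{T-t} grows toward 1 at the final step, so goal-state
% features dominate the expectation instead of vanishing.
\tilde{\mu}(\pi) = \mathbb{E}\left[ \sum_{t=0}^{T} \gamma^{T-t} \, \phi(s_t) \;\middle|\; \pi \right]

Under this reading, both definitions coincide for $\gamma = 1$, and decreasing $\gamma$ shifts weight from early states (standard discounting) to final states (reverse discounting), matching the abstract’s stated goal of prioritizing feature expectations in final states.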
Keywords