Imitation Learning from a Single Demonstration Leveraging Vector Quantization for Robotic Harvesting

Antonios Porichis; Myrto Inglezou; Nikolaos Kegkeroglou; Vishwanathan Mohan; Panagiotis Chatzakos

doi:10.3390/robotics13070098

Robotics (Jun 2024)

Affiliations

Antonios Porichis: AI Innovation Centre, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK
Myrto Inglezou: AI Innovation Centre, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK
Nikolaos Kegkeroglou: TWI-Hellas, 280 Kifisias Ave., 152 32 Halandri, Greece
Vishwanathan Mohan: AI Innovation Centre, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK
Panagiotis Chatzakos: AI Innovation Centre, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK

DOI: https://doi.org/10.3390/robotics13070098
Journal volume & issue: Vol. 13, no. 7, p. 98

Abstract

The ability of robots to tackle complex, non-repetitive tasks will be key to bringing a new level of automation to agricultural applications that, because of their high cognitive requirements, still involve labor-intensive, menial, and physically demanding activities. Harvesting is one such example: it requires a combination of motions that can generally be broken down into a visual servoing phase and a manipulation phase, with the latter often being straightforward to pre-program. In this work, we focus on the task of fresh mushroom harvesting, which is still conducted manually by human pickers due to its high complexity. A key challenge is to enable harvesting with low-cost hardware and mechanical systems, such as soft grippers, which present additional challenges compared to their rigid counterparts. We devise an Imitation Learning model pipeline utilizing Vector Quantization to learn quantized embeddings directly from visual inputs. We test this approach in a realistic environment designed based on recordings of human experts harvesting real mushrooms. Our models can control a Cartesian robot with a soft, pneumatically actuated gripper to successfully replicate the mushroom outrooting sequence. We achieve 100% success in picking mushrooms among distractors with less than 20 min of data collection, comprising a single expert demonstration and auxiliary, non-expert trajectories. The entire model pipeline requires less than 40 min of training on a single A4000 GPU and approx. 20 ms for inference on a standard laptop GPU.
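
For readers unfamiliar with the term, the core operation of Vector Quantization can be sketched as a nearest-neighbor lookup: a continuous embedding is replaced by the closest entry of a finite codebook. The snippet below is a generic illustration only, not the authors' model; the function name `quantize`, the 2-D codebook, and its values are hypothetical, and in a learned pipeline the codebook would be trained rather than hand-written.

```python
import math

def quantize(embedding, codebook):
    """Return (index, code vector) of the codebook entry nearest to `embedding`.

    Nearest is measured by Euclidean distance; this is the generic
    vector-quantization lookup, not a specific published architecture.
    """
    best_index = min(
        range(len(codebook)),
        key=lambda i: math.dist(embedding, codebook[i]),
    )
    return best_index, codebook[best_index]

# Hypothetical 2-D codebook with three entries (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 0.5)]

# A continuous embedding is snapped to its closest codebook entry.
index, code = quantize((0.9, 1.2), codebook)
```

The discrete index, rather than the raw continuous vector, is what a downstream model would consume in such a scheme.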

Published in Robotics

ISSN: 2218-6581 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Mechanical engineering and machinery
Website: http://www.mdpi.com/journal/robotics
