IEEE Access (Jan 2024)

GraspLDM: Generative 6-DoF Grasp Synthesis Using Latent Diffusion Models

  • Kuldeep R. Barad,
  • Andrej Orsula,
  • Antoine Richard,
  • Jan Dentler,
  • Miguel A. Olivares-Mendez,
  • Carol Martinez

DOI
https://doi.org/10.1109/ACCESS.2024.3492118
Journal volume & issue
Vol. 12
pp. 164621 – 164633

Abstract


Vision-based grasping of unknown objects in unstructured environments is a key challenge for autonomous robotic manipulation. A practical grasp synthesis system is required to generate a diverse set of 6-DoF grasps from which a task-relevant grasp can be executed. Although generative models are suitable for learning such complex data distributions, existing models suffer from limited grasp quality, long training times, and a lack of flexibility for task-specific generation. In this work, we present GraspLDM, a modular generative framework for 6-DoF grasp synthesis that uses diffusion models as priors in the latent space of a VAE. GraspLDM learns a generative model of object-centric $SE(3)$ grasp poses conditioned on point clouds. GraspLDM’s architecture enables us to train task-specific models efficiently by re-training only a small denoising network in the low-dimensional latent space, whereas existing models need expensive re-training. Our framework provides robust and scalable models on both full and partial point clouds. GraspLDM models trained with simulation data transfer well to the real world without any further fine-tuning. Our models achieve an 80% success rate over 80 grasp attempts on diverse test objects across two real-world robotic setups.
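
As a rough illustration of the sampling flow the abstract describes (a diffusion prior in a VAE latent space, decoded to grasp poses and conditioned on a point cloud), the following is a minimal sketch only. It assumes a standard DDPM-style noise-prediction sampler, which the abstract does not specify, and every function name, dimension, and network stand-in here (`encode_point_cloud`, `toy_denoiser`, `toy_decoder`, the 4-dimensional latent, the 50-step schedule) is hypothetical and not taken from the GraspLDM implementation.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM noise schedule (an assumed choice, not from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def encode_point_cloud(cloud):
    """Hypothetical stand-in for a learned point cloud encoder: (N, 3) -> (6,)."""
    return np.concatenate([cloud.mean(axis=0), cloud.std(axis=0)])


def toy_denoiser(z_t, t, cond):
    """Untrained placeholder for a noise-prediction network eps(z_t, t, cond)."""
    return 0.1 * z_t


def toy_decoder(z, cond):
    """Untrained placeholder for a VAE decoder: latent -> translation + unit quaternion."""
    weights = np.random.default_rng(42).standard_normal((z.shape[1] + cond.shape[0], 7))
    features = np.concatenate([z, np.tile(cond, (z.shape[0], 1))], axis=1)
    out = features @ weights
    translation = out[:, :3]
    quat = out[:, 3:]
    # Normalise so the rotation part is a valid unit quaternion.
    quat = quat / (np.linalg.norm(quat, axis=1, keepdims=True) + 1e-12)
    return np.concatenate([translation, quat], axis=1)


def sample_grasps(cloud, num_grasps=8, latent_dim=4, num_steps=50, rng=None):
    """Ancestral DDPM sampling in latent space, then decoding to grasp poses."""
    rng = np.random.default_rng() if rng is None else rng
    betas, alphas, alpha_bars = make_schedule(num_steps)
    cond = encode_point_cloud(cloud)

    # Start from Gaussian noise in the low-dimensional latent space.
    z = rng.standard_normal((num_grasps, latent_dim))
    for t in reversed(range(num_steps)):
        eps = toy_denoiser(z, t, cond)
        mean = (z - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is added at the final step.
        noise = rng.standard_normal(z.shape) if t > 0 else np.zeros_like(z)
        z = mean + np.sqrt(betas[t]) * noise

    # Each row: [x, y, z, qw, qx, qy, qz] in the object frame (assumed layout).
    return toy_decoder(z, cond)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.standard_normal((128, 3))
    print(sample_grasps(cloud, rng=rng).shape)
```

In this sketch only `toy_denoiser` operates inside the sampling loop, which mirrors the abstract's point that a task-specific model could be obtained by re-training the small latent denoiser while leaving the encoder and decoder untouched.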

Keywords