Alexandria Engineering Journal (Sep 2024)
Enhancing robotic grasping with attention mechanism and advanced UNet architectures in generative grasping convolutional neural networks
Abstract
This research presents a novel approach to robotic grasping that integrates an attention mechanism and advanced UNet architectures, specifically UNet and UNet++, into the Generative Grasping Convolutional Neural Network (GG-CNN). The proposed method, Attention UNet++ GG-CNN, aims to improve the accuracy of robotic grasp predictions. The attention mechanism allows the model to focus on the most relevant features in the depth image, enhancing the quality of grasp predictions. The UNet and UNet++ structures, widely used for semantic segmentation, are adapted to generate a pixel-wise grasp quality map, providing a one-to-one mapping from depth images. The UNet++ structure, with its nested and dense skip pathways, further improves the model's ability to capture low-level details and propagate them to higher-level features. Because a grasp is predicted for every pixel, the approach avoids the discrete sampling of grasp candidates used by traditional deep learning grasping techniques and reduces computation time. Preliminary results indicate that the Attention UNet++ GG-CNN achieves Intersection over Union (IoU) scores of up to 95.86%, with average performance across the 25%, 50%, and 75% thresholds reaching as high as 88.29%. These results suggest its potential for more efficient and reliable robotic manipulation in varied and dynamic environments.
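To make the reported metrics concrete, the following is a minimal sketch of how an IoU score and a success rate at the 25%, 50%, and 75% thresholds could be computed. It is not the authors' evaluation code. The abstract does not specify the grasp representation or the averaging procedure, so the axis-aligned rectangles, the `>=` comparison, and the helper names `rect_iou` and `success_rates` are all assumptions made for illustration (grasp rectangles in practice are usually rotated).

```python
def rect_iou(a, b):
    """IoU of two axis-aligned rectangles given as (x_min, y_min, x_max, y_max).

    Assumption: axis-aligned boxes stand in for grasp rectangles here.
    """
    # Overlap extent along each axis, clamped at zero when the boxes are disjoint.
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def success_rates(pairs, thresholds=(0.25, 0.50, 0.75)):
    """Fraction of (predicted, ground_truth) pairs whose IoU meets each threshold."""
    ious = [rect_iou(pred, truth) for pred, truth in pairs]
    return {t: sum(iou >= t for iou in ious) / len(ious) for t in thresholds}


# Toy data, invented for illustration only.
pairs = [
    ((0, 0, 4, 2), (0, 0, 4, 2)),  # identical rectangles
    ((0, 0, 4, 2), (2, 0, 6, 2)),  # partial overlap
    ((0, 0, 4, 2), (5, 5, 6, 6)),  # no overlap
]

rates = success_rates(pairs)
# Assumption: the "average across thresholds" is a plain mean of the three rates.
mean_rate = sum(rates.values()) / len(rates)
print(rates, mean_rate)
```

A stricter threshold can only lower or preserve the success rate, which is why an average over several thresholds gives a more demanding summary than the 25% criterion alone.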