Applied Sciences (Apr 2024)

Instrument Detection and Descriptive Gesture Segmentation on a Robotic Surgical Maneuvers Dataset

  • Irene Rivas-Blanco,
  • Carmen López-Casado,
  • Juan M. Herrera-López,
  • José Cabrera-Villa,
  • Carlos J. Pérez-del-Pulgar

DOI: https://doi.org/10.3390/app14093701
Journal volume & issue: Vol. 14, no. 9, p. 3701

Abstract

Large datasets play a crucial role in the progression of surgical robotics, facilitating advancements in surgical task recognition and automation. Moreover, public datasets enable the comparative analysis of different algorithms and methodologies, allowing their effectiveness and performance to be assessed. The ROSMA (Robotics Surgical Maneuvers) dataset provides 206 trials of common surgical training tasks performed with the da Vinci Research Kit (dVRK). In this work, we extend the ROSMA dataset with two annotated subsets: ROSMAT24, which contains bounding box annotations for instrument detection, and ROSMAG40, which contains high- and low-level gesture annotations. We propose an annotation method that provides independent labels for the right-handed and left-handed tools. For instrument detection, we validate our proposal with a YOLOv4 model in two experimental scenarios and demonstrate the network's ability to generalize to instruments in unseen scenarios. For gesture segmentation, we propose two label categories: high-level annotations that describe gestures at the maneuver level, and low-level annotations that describe gestures at a fine-grained level. To validate this proposal, we designed a recurrent neural network based on a bidirectional long short-term memory (LSTM) layer. We present results for four cross-validation experimental setups, reaching a mean average precision (mAP) of up to 77.35%.
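To make the gesture-segmentation architecture concrete, the sketch below shows a minimal PyTorch model built around a single bidirectional LSTM layer with a per-frame classification head, as the abstract describes. It is not the authors' code: the input feature size, hidden size, and number of gesture classes are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch (not the authors' implementation) of a frame-wise gesture
# segmenter based on one bidirectional LSTM layer. All dimensions below
# (76 input features, 128 hidden units, 10 gesture classes) are assumed
# for illustration only.
import torch
import torch.nn as nn

class BiLSTMGestureSegmenter(nn.Module):
    def __init__(self, in_features=76, hidden=128, num_classes=10):
        super().__init__()
        # Bidirectional LSTM over the kinematic feature sequence.
        self.lstm = nn.LSTM(in_features, hidden,
                            batch_first=True, bidirectional=True)
        # Per-time-step classifier: one gesture label per frame.
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):
        # x: (batch, time, features) -> (batch, time, num_classes)
        out, _ = self.lstm(x)
        return self.head(out)

model = BiLSTMGestureSegmenter()
dummy = torch.randn(4, 500, 76)   # 4 trials, 500 frames each
logits = model(dummy)             # (4, 500, 10) frame-wise class scores
print(logits.shape)
```

Because the LSTM is bidirectional, each frame's prediction can draw on both past and future context within the trial, which suits offline segmentation of recorded maneuvers.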

Keywords