IEEE Access (Jan 2024)

A Novel Hybrid Deep Learning Architecture for Dynamic Hand Gesture Recognition

  • David Richard Tom Hax,
  • Pascal Penava,
  • Samira Krodel,
  • Liliya Razova,
  • Ricardo Buettner

DOI
https://doi.org/10.1109/ACCESS.2024.3365274
Journal volume & issue
Vol. 12
pp. 28761 – 28774

Abstract

Hand gestures are a form of natural communication used in human-computer interaction; however, when gestures are video-based, extracting features for classification is complex. Current machine learning models struggle to achieve high accuracy on videos recorded in realistic environments. In this work, we propose a hybrid architecture consisting of a recurrent neural network (RNN), including a long short-term memory (LSTM) layer, on top of a convolutional neural network (CNN), to recognize dynamic hand gestures recorded in realistic environments. We used a dataset of six dynamic hand gestures: scroll-left, scroll-right, scroll-up, scroll-down, zoom-in, and zoom-out. Our Inception-v3 model extracts features and provides the wrapped frame-feature map as input to the RNN, which performs the final classification. The proposed model classifies gestures with an average accuracy of 83.66%. With this result, we aim to narrow the gap between recognition in realistic environments and high accuracy. Finally, we compare the accuracy of our proposed dynamic hand gesture recognition model with that of the benchmark.
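
The data flow described in the abstract (per-frame feature extraction, then a recurrent layer over the frame sequence, then a six-way classification) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the feature extractor here is a random linear projection standing in for Inception-v3, all weights are untrained, and the layer sizes, function names, and parameter names are hypothetical choices made for illustration only.

```python
import numpy as np

# The six gesture classes named in the abstract.
GESTURES = ["scroll-left", "scroll-right", "scroll-up",
            "scroll-down", "zoom-in", "zoom-out"]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def extract_frame_features(frames, proj):
    """Stand-in for the CNN stage (assumption: NOT Inception-v3).

    frames: array of shape (T, H, W, C); proj: (H*W*C, feat_dim).
    Returns one feature vector per frame, shape (T, feat_dim).
    """
    flat = frames.reshape(frames.shape[0], -1)
    return np.tanh(flat @ proj)


def lstm_last_state(features, W, U, b):
    """Run a single LSTM layer over the frame features.

    W: (feat_dim, 4*hidden), U: (hidden, 4*hidden), b: (4*hidden,).
    Returns the hidden state after the last frame, shape (hidden,).
    """
    hidden = U.shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in features:
        z = x @ W + h @ U + b
        i, f, g, o = np.split(z, 4)  # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h


def init_params(rng, frame_size, feat_dim=32, hidden=16):
    """Random, untrained weights; all sizes are hypothetical."""
    return {
        "proj": rng.normal(0.0, 0.05, (frame_size, feat_dim)),
        "W": rng.normal(0.0, 0.1, (feat_dim, 4 * hidden)),
        "U": rng.normal(0.0, 0.1, (hidden, 4 * hidden)),
        "b": np.zeros(4 * hidden),
        "V": rng.normal(0.0, 0.1, (hidden, len(GESTURES))),
        "d": np.zeros(len(GESTURES)),
    }


def classify_clip(frames, params):
    """Map one video clip to (predicted gesture, class probabilities)."""
    feats = extract_frame_features(frames, params["proj"])
    h = lstm_last_state(feats, params["W"], params["U"], params["b"])
    logits = h @ params["V"] + params["d"]
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    probs = exp / exp.sum()
    return GESTURES[int(np.argmax(probs))], probs


# Usage: a synthetic clip of 16 frames, each 8x8 pixels with 3 channels.
rng = np.random.default_rng(0)
clip = rng.random((16, 8, 8, 3))
params = init_params(rng, frame_size=8 * 8 * 3)
label, probs = classify_clip(clip, params)
print(label, probs.round(3))
```

Because the weights are random, the predicted label is meaningless; the sketch only shows how a sequence of frame features is reduced to one probability vector over the six gestures. In a real pipeline the stand-in extractor would be replaced by a pretrained CNN and the recurrent and output weights would be learned from labeled clips.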

Keywords