IEEE Access (Jan 2024)

EnGesto: An Ensemble Learning Approach for Classification of Hand Gestures

  • Amrutha Raj V,
  • Malu G

DOI
https://doi.org/10.1109/ACCESS.2024.3411155
Journal volume & issue
Vol. 12
pp. 85709 – 85723

Abstract


Recognizing intricate hand gestures is a challenge in applications such as human-computer interaction and e-learning. Accurate classification is crucial for preserving cultural practices and for enabling intuitive interaction with machines. However, existing models often struggle with dynamic lighting, complex backgrounds, varying camera angles, intricate hand poses, noise, and diverse hand attributes. To address these challenges, we introduce EnGesto, an ensemble learning model for classifying intricate hand gestures. EnGesto comprises three major modules: Data Augmentation (DAug), Extract Visual Geometry Group (EVGG), and Multistage Hand Gesture Classification (MuGest). DAug simulates real-world imaging conditions, which improves the accuracy of hand movement detection and the model's resilience to unexpected events. EVGG extracts feature maps using a customized VGG16 model. The MuGest module employs a Fully Convolutional Network (FCN), a Region Proposal Network (RPN), a Convolutional Neural Network (CNN), Global Max Pooling, an Attention layer, and a Fully Connected (FC) layer to select relevant features from the EVGG output and achieve precise, robust hand gesture classification. The proposed model achieves a classification accuracy of 97.85%, outperforming VGGNet, ResNet, EfficientNet, and CNN even under demanding image conditions, such as those of the Indian classical dance Bharatanatyam, whose core mudras are precise gestures conveying a range of emotions and ideas. EnGesto's accurate gesture classification supports the preservation and e-learning of treasured art forms, enables natural, intuitive interaction with machines, and opens avenues for further research and development in this domain.

Keywords