Applied Sciences (Nov 2023)
Fusion of Attention-Based Convolution Neural Network and HOG Features for Static Sign Language Recognition
Abstract
The deaf and hearing-impaired community expresses their emotions, communicates with society, and enhances the interaction between humans and computers using sign language gestures. This work presents a strategy for efficient feature extraction that uses a combination of two different methods that are the convolutional block attention module (CBAM)-based convolutional neural network (CNN) and standard handcrafted histogram of oriented gradients (HOG) feature descriptor. The proposed framework aims to enhance accuracy by extracting meaningful features and resolving issues like rotation, similar hand orientation, etc. The HOG feature extraction technique provides a compact feature representation that signifies meaningful information about sign gestures. The CBAM attention module is incorporated into the structure of CNN to enhance feature learning using spatial and channel attention mechanisms. Then, the final feature vector is formed by concatenating these features. This feature vector is provided to the classification layers to predict static sign gestures. The proposed approach is validated on two publicly available static Massey American Sign Language (ASL) and Indian Sign Language (ISL) databases. The model’s performance is evaluated using precision, recall, F1-score, and accuracy. Our proposed methodology achieved 99.22% and 99.79% accuracy for the ASL and ISL datasets. The acquired results signify the efficiency of the feature fusion and attention mechanism. Our network performed better in accuracy compared to the earlier studies.
Keywords