IEEE Access (Jan 2024)
Motion Images With Positioning Information and Deep Learning for Continuous Arabic Sign Language Recognition in Signer Dependent and Independent Modes
Abstract
While recognition of sign language alphabets and isolated words has matured in recent years, recognition of sign language sentences, or continuous signing, remains an open research topic in computer vision, especially in signer independent mode. Existing state-of-the-art solutions for continuous Arabic Sign Language Recognition (ArSLR) are promising; however, their accuracies drop noticeably in signer independent mode. In this paper, we propose a solution for recognizing continuous Arabic signing in signer dependent and independent modes through the use of motion images with positioning information. Initially, each sign video is converted into several motion images, each emphasizing a different segment of the sentence. This is achieved by a weighted sum of residual images after applying optical flow and motion compensation. Each motion image is computed from the whole sentence video, hence putting the emphasized segment into the context of the preceding and succeeding sign words. Thereafter, hand-crafted features are calculated from each resultant image, including numerical summaries of the horizontal, vertical and diagonal profiles. Such features allow a simplified architecture for model generation and testing, consisting of a single Bi-LSTM layer followed by dropout, softmax, and classification layers. This paper makes use of a recent Arabic Sign Language dataset known as ArabSign. The dataset contains 6 signers and 93 words arranged into 50 sentences with 30 repetitions each. Experimental results revealed that the proposed solution is suitable for both signer dependent and signer independent modes of continuous sign language recognition. Using a Leave-One-Signer-Out policy, the proposed solution achieved word-recognition rates of 99.8% and 75.3% in signer dependent and signer independent modes, respectively. These results noticeably surpass relevant state-of-the-art solutions.
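The feature pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions: plain frame differences stand in for the motion-compensated residuals, the segment weights are Gaussian, and the "numerical summaries" are taken to be the mean and standard deviation of each profile. The function names and the parameters `center`, `sigma`, and `n_segments` are hypothetical.

```python
import numpy as np


def motion_image(frames, center, sigma=2.0):
    """Weighted sum of absolute residuals between consecutive frames.

    frames: grayscale video as an array of shape (T, H, W).
    center: index of the residual to emphasize (hypothetical parameter).
    """
    frames = np.asarray(frames, dtype=np.float64)
    # Assumption: plain frame differences stand in for the
    # motion-compensated residuals described in the abstract.
    residuals = np.abs(np.diff(frames, axis=0))        # (T-1, H, W)
    idx = np.arange(residuals.shape[0])
    # Assumption: Gaussian weights that peak at the emphasized segment,
    # so the whole sentence still contributes as context.
    weights = np.exp(-0.5 * ((idx - center) / sigma) ** 2)
    weights /= weights.sum()
    return np.tensordot(weights, residuals, axes=1)    # (H, W)


def profile_features(image):
    """Mean and standard deviation of four projection profiles."""
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape
    offsets = range(-(h - 1), w)                       # all h + w - 1 diagonals
    profiles = [
        image.sum(axis=1),                             # horizontal (one value per row)
        image.sum(axis=0),                             # vertical (one value per column)
        np.array([np.trace(image, offset=k) for k in offsets]),           # diagonals
        np.array([np.trace(image[:, ::-1], offset=k) for k in offsets]),  # anti-diagonals
    ]
    # Assumption: two summaries per profile, giving an 8-value vector.
    return np.array([s for p in profiles for s in (p.mean(), p.std())])


def sentence_features(frames, n_segments=4):
    """One feature vector per emphasized segment: shape (n_segments, 8)."""
    n_residuals = len(frames) - 1
    centers = np.linspace(0, n_residuals - 1, n_segments)
    return np.stack([profile_features(motion_image(frames, c)) for c in centers])
```

Calling `sentence_features(video)` on a `(T, H, W)` array yields a short sequence of fixed-length vectors, one per motion image, which is the kind of input a single Bi-LSTM layer could consume. The number of segments and the choice of summaries would need to be taken from the paper body.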
Keywords