IEEE Access (Jan 2021)
Arabic Sign Language Recognition System Using 2D Hands and Body Skeleton Data
Abstract
This paper presents a novel Arabic Sign Language (ArSL) recognition system that uses selected 2D hand and body key points extracted from successive video frames. The system recognizes recorded video signs, in both signer-dependent and signer-independent modes, by concatenating a 3D CNN skeleton network with a 2D point convolution network. To accomplish this, we built a new ArSL video-based sign database and present the detailed methodology used to record it. The dataset comprises 80 static and dynamic signs, each repeated five times by 40 signers, covering the Arabic alphabet, numbers, and common daily-use signs. To facilitate building an online sign recognition system, we introduce the inverse efficiency score to determine an optimal number of successive frames sufficient for a recognition decision; in a near real-time automatic ArSL system, the tradeoff between accuracy and speed is crucial to avoid delayed sign classification. In the signer-dependent mode, the best results were accuracies of 98.39% for dynamic signs and 88.89% for static signs; in the signer-independent mode, we obtained 96.69% for dynamic signs and 86.34% for static signs. When the static and dynamic signs were mixed and the system was trained on all signs, accuracies of 89.62% and 88.09% were obtained in the signer-dependent and signer-independent modes, respectively.
Keywords