An improved hand gesture recognition system using keypoints and hand bounding boxes
Tuan Linh Dang,
Sy Dat Tran,
Thuy Hang Nguyen,
Suntae Kim,
Nicolas Monet
Affiliations
Tuan Linh Dang
School of Information and Communications Technology, Hanoi University of Science and Technology, 1 Dai Co Viet road, Hai Ba Trung district, Hanoi, Vietnam; Corresponding author.
Sy Dat Tran
School of Information and Communications Technology, Hanoi University of Science and Technology, 1 Dai Co Viet road, Hai Ba Trung district, Hanoi, Vietnam
Thuy Hang Nguyen
School of Information and Communications Technology, Hanoi University of Science and Technology, 1 Dai Co Viet road, Hai Ba Trung district, Hanoi, Vietnam
Suntae Kim
Avatar, NAVER CLOVA, Buljeong-ro, Bundang-gu, Seongnam-si, South Korea
Nicolas Monet
Avatar, NAVER CLOVA, Buljeong-ro, Bundang-gu, Seongnam-si, South Korea
Hand gesture recognition is an important problem in human–computer interaction, and static hand gesture recognition is one of its forms. This study develops a static hand gesture recognition system consisting of three modules: a feature extraction module, a processing module, and a classification module. The feature extraction module uses top-down human pose estimation to extract not only keypoints but also body and hand bounding boxes. These features are normalized and processed in the processing module, and the result is used as the input to the classification module, for which we propose an architecture called the Two-pipeline architecture. In this module, we also evaluate different methods to find the most suitable one for this task. Experiments were conducted on three datasets: HANDS, OUHANDS, and SHAPE. The proposed Two-pipeline architecture with 2.5 million parameters obtained accuracies of 94%, 98%, and 94% on the three datasets, respectively. In addition, a lightweight version with 0.22 million parameters achieved accuracies of 91%, 94%, and 96%.
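The abstract does not say how keypoints and bounding boxes are normalized, so the following is only a minimal sketch of one plausible processing step, not the authors' method. It assumes that hand keypoints are rescaled to the hand bounding box and that the hand box is expressed relative to the body box. The names `normalize_keypoints` and `build_feature_vector`, and the `(x_min, y_min, x_max, y_max)` box layout, are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def normalize_keypoints(keypoints: List[Point], box: Box) -> List[Point]:
    """Map pixel coordinates into the unit square spanned by `box`."""
    x_min, y_min, x_max, y_max = box
    width = x_max - x_min
    height = y_max - y_min
    if width <= 0 or height <= 0:
        raise ValueError("bounding box must have positive width and height")
    return [((x - x_min) / width, (y - y_min) / height) for x, y in keypoints]


def build_feature_vector(
    keypoints: List[Point], hand_box: Box, body_box: Box
) -> List[float]:
    """Concatenate box-normalized keypoints with the hand box position.

    Hypothetical feature layout: 2 values per keypoint, followed by the
    4 hand-box coordinates expressed relative to the body box.
    """
    features: List[float] = []
    for x, y in normalize_keypoints(keypoints, hand_box):
        features.extend([x, y])
    # Treat the two hand-box corners as points inside the body box.
    corners = [(hand_box[0], hand_box[1]), (hand_box[2], hand_box[3])]
    for x, y in normalize_keypoints(corners, body_box):
        features.extend([x, y])
    return features


if __name__ == "__main__":
    # Illustrative values only, not taken from any dataset.
    kps = [(10.0, 20.0), (20.0, 40.0), (30.0, 60.0)]
    hand = (10.0, 20.0, 30.0, 60.0)
    body = (0.0, 0.0, 100.0, 200.0)
    print(build_feature_vector(kps, hand, body))
```

Normalizing against the boxes makes the features independent of image resolution and of where the hand appears in the frame, which is one reason a classifier might take bounding boxes alongside keypoints.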