IEEE Access (Jan 2024)
Efficient Multimodal Fusion for Hand Pose Estimation With Hourglass Network
Abstract
Hand pose estimation is vital for applications such as virtual reality (VR), augmented reality (AR), gesture recognition, human-computer interaction (HCI), and robotics. Achieving accurate, real-time hand pose estimation is challenging due to the high degree of articulation of the human hand and the variability in hand shapes and sizes. While multimodal data offers advantages, developing a fast and resource-efficient hand pose estimation system remains difficult. Current state-of-the-art methods often require powerful graphics processing units (GPUs) for high performance, limiting deployment on edge platforms with limited computational resources. There is a critical need for higher efficiency without compromising accuracy, especially in real-world settings such as mobile devices and embedded systems. Real-time performance is also essential in practice, since systems must respond immediately to user interactions; yet most current methods struggle to reach real-time speeds even on powerful GPUs, let alone on resource-constrained devices. To address these challenges, we propose an efficient hand pose estimation system that leverages both red-green-blue (RGB) and depth data (RGB-D) through a unified fusion strategy. Our method combines appearance and geometric information early in the processing pipeline, significantly reducing computational complexity while maintaining real-time performance on resource-constrained devices. Experimental results show that the proposed model runs at over 110 frames per second (fps) on a desktop GPU and at 30 fps on the NVIDIA Jetson Xavier NX edge platform, 4 to 5 times faster than existing methods, while achieving competitive accuracy.
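To make the early-fusion idea concrete, the following is a minimal PyTorch sketch of RGB-D fusion feeding a single hourglass backbone that regresses per-joint heatmaps. All names, channel widths, and the hourglass depth (EarlyFusionHandNet, channels=64, depth=3, 21 joints) are illustrative assumptions for exposition, not the paper's exact architecture.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # Simple residual block used at every hourglass stage (assumed design).
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)

class Hourglass(nn.Module):
    # One recursive hourglass: downsample, recurse, upsample, add skip branch.
    def __init__(self, depth, channels):
        super().__init__()
        self.skip = ResidualBlock(channels)
        self.down = ResidualBlock(channels)
        self.inner = (Hourglass(depth - 1, channels) if depth > 1
                      else ResidualBlock(channels))
        self.up = ResidualBlock(channels)

    def forward(self, x):
        skip = self.skip(x)
        y = nn.functional.max_pool2d(x, 2)
        y = self.up(self.inner(self.down(y)))
        y = nn.functional.interpolate(y, scale_factor=2, mode='nearest')
        return y + skip

class EarlyFusionHandNet(nn.Module):
    # Early fusion: depth joins RGB as a 4th input channel before any
    # convolution, so one shared backbone processes both modalities.
    def __init__(self, num_joints=21, channels=64):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(4, channels, 7, stride=2, padding=3),  # 4 = RGB + depth
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.hourglass = Hourglass(depth=3, channels=channels)
        self.head = nn.Conv2d(channels, num_joints, 1)  # per-joint heatmaps

    def forward(self, rgb, depth):
        x = torch.cat([rgb, depth], dim=1)  # (N, 4, H, W): fuse at the input
        x = self.hourglass(self.stem(x))    # (N, C, H/2, W/2)
        return self.head(x)                 # (N, J, H/2, W/2) heatmaps

# Example: one 256x256 RGB-D frame mapped to 21 joint heatmaps.
net = EarlyFusionHandNet()
rgb = torch.randn(1, 3, 256, 256)
depth = torch.randn(1, 1, 256, 256)
print(net(rgb, depth).shape)  # torch.Size([1, 21, 128, 128])

Fusing at the input in this way is what keeps the cost low: a late-fusion design would run two full backbones (one per modality), whereas here the extra modality adds only one input channel to the first convolution.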
Keywords