IEEE Access (Jan 2024)

Multiple-Hand 2D Pose Estimation From a Monocular RGB Image

  • Purnendu Mishra,
  • Kishor Sarawadekar

DOI
https://doi.org/10.1109/ACCESS.2024.3376426
Journal volume & issue
Vol. 12
pp. 40722 – 40735

Abstract

Deep learning models and algorithms make hand pose estimation from monocular RGB images easier than traditional approaches do. Despite this, the majority of available algorithms use multiple-stage models to perform hand pose estimation. Moreover, single-stage methods are mainly limited to a single hand and are difficult to scale to multiple hands. To this end, we propose an approach that reuses the features of the saliency map extracted for hand region of interest (ROI) localization: an integrated network uses these features for pose estimation. This arrangement of layers forms an end-to-end pipeline that allows simultaneous pose estimation for multiple hands. The model is designed to run on multiple CPU/GPU cores and to perform inference independently for each detected hand's pose, which makes inference faster and hence suitable for real-time applications. In addition, a new grid-based design for estimating hand keypoint positions with high precision is proposed. Both proposed designs are validated on multiple datasets to demonstrate their feasibility and effectiveness. The probability of correct keypoint (PCK) at a threshold of 0.2 is above 95% on the test sets of the Interhand dataset and the Rendered HandPose Dataset (RHD).
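For readers unfamiliar with the PCK metric quoted above, the following is a minimal sketch of how such a score is commonly computed. It is not taken from the paper: the function name `pck`, the array shapes, and the choice of a per-hand normalization length `norm` (for example, a hand bounding-box size) are assumptions made for illustration, and the abstract does not state which normalization the authors use.

```python
import numpy as np

def pck(pred, gt, norm, threshold=0.2):
    """Fraction of keypoints whose error is within threshold * norm.

    pred, gt : arrays of shape (N, K, 2), N hands with K 2D keypoints each.
    norm     : array of shape (N,), a per-hand reference length
               (assumed here; e.g. bounding-box size).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    norm = np.asarray(norm, dtype=float)

    # Euclidean distance between predicted and true keypoints, shape (N, K).
    dist = np.linalg.norm(pred - gt, axis=-1)

    # A keypoint counts as correct if its normalized error is within the threshold.
    correct = dist <= threshold * norm[:, None]
    return float(correct.mean())

# Toy example: one hand, four keypoints, reference length 1.0.
gt = np.zeros((1, 4, 2))
pred = np.array([[[0.0, 0.0], [0.1, 0.0], [0.3, 0.0], [0.0, 0.19]]])
score = pck(pred, gt, norm=[1.0], threshold=0.2)
```

In this toy example three of the four keypoints fall within the 0.2 threshold, so `score` is 0.75. A reported PCK above 95% at a threshold of 0.2 would mean that more than 95% of test keypoints satisfy the same condition under the authors' normalization.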

Keywords