Al-Khawarizmi Engineering Journal (Sep 2023)

Tracked Robot Control with Hand Gesture Based on MediaPipe

  • Marthed Wameed,
  • Ahmed M. ALKAMACHI,
  • Ergun Erçelebi

DOI
https://doi.org/10.22153/kej.2023.04.004
Journal volume & issue
Vol. 19, no. 3

Abstract

Hand gestures are currently considered one of the most accurate ways to communicate in many applications, such as sign language, robot control, virtual worlds, smart homes, and video games. Several techniques are used to detect and classify hand gestures, for instance gloves fitted with several sensors or computer vision. In this work, computer vision is used instead of gloves to control the robot's movement, because gloves need complicated electrical connections that limit user mobility, their sensors may be costly to replace, and they can spread skin illnesses between users. The computer vision approach used here is MediaPipe (MP), a modern method developed by Google. It detects 21 three-dimensional points on the hand and classifies hand gestures by comparing the coordinates of those points. After the hand gestures are detected and classified, the system controls the tracked robot in real time, with each hand gesture mapped to a specific movement of the robot. This work concludes that the MP method is more accurate and faster in response than the Deep Learning (DL) method, specifically the Convolutional Neural Network (CNN). The experimental results show that the real-time accuracy of this method decreases in some cases when environmental factors change, such as light intensity, distance, and tilt angle (between the hand gesture and the camera). The reason is that in some cases the fingers are held close together, some fingers are not fully closed or opened, and the camera used does not perform well under changing environmental factors. The system then cannot determine whether a finger is closed or opened, so the algorithm fails to classify hand gestures correctly (the classification accuracy decreases), and the response time of the tracked robot's movement increases.
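
The idea of classifying a gesture by comparing landmark coordinates can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the 21 landmarks arrive as (x, y, z) tuples in normalized image coordinates (y grows downward), indexed as in MediaPipe Hands, and that the hand is upright and facing the camera. The finger-count rule, the COMMANDS table, and the synthetic_hand helper are hypothetical and exist only for illustration; the thumb is ignored to keep the rule simple.

```python
# Landmark indices follow the 21-point MediaPipe Hands layout.
FINGER_TIPS = (8, 12, 16, 20)  # index, middle, ring, pinky fingertips
FINGER_PIPS = (6, 10, 14, 18)  # the PIP joint of each of those fingers

# Hypothetical gesture-to-command table (assumption, not taken from the paper).
COMMANDS = {0: "stop", 1: "forward", 2: "backward", 3: "left", 4: "right"}


def count_extended_fingers(landmarks):
    """Count raised non-thumb fingers by comparing tip and PIP heights.

    landmarks: sequence of 21 (x, y, z) tuples, y increasing downward.
    A finger counts as extended when its tip lies above its PIP joint,
    which only holds for an upright hand facing the camera.
    """
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    return sum(
        1
        for tip, pip in zip(FINGER_TIPS, FINGER_PIPS)
        if landmarks[tip][1] < landmarks[pip][1]
    )


def gesture_to_command(landmarks):
    """Map the number of raised fingers to a robot command string."""
    return COMMANDS[count_extended_fingers(landmarks)]


def synthetic_hand(extended):
    """Build fake landmarks with the first `extended` fingers raised."""
    points = [(0.5, 0.9, 0.0)] * 21  # unused joints sit near the wrist
    for i, (tip, pip) in enumerate(zip(FINGER_TIPS, FINGER_PIPS)):
        points[pip] = (0.5, 0.5, 0.0)
        # Raised tip is above the PIP joint (smaller y); curled tip is below.
        points[tip] = (0.5, 0.3, 0.0) if i < extended else (0.5, 0.7, 0.0)
    return points


if __name__ == "__main__":
    for n in range(5):
        print(n, gesture_to_command(synthetic_hand(n)))
```

A rule this simple depends on hand orientation and on fingers being clearly open or closed, so it would be expected to misclassify tilted or half-closed hands, the same kind of failure case described above.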