FM‐YOLOv8: Lightweight gesture recognition algorithm

Fanghai Li; Xitai Na; Jinshuo Shi; Qingbin Sun

doi:10.1049/ipr2.13229

IET Image Processing (Nov 2024)

Affiliations

Fanghai Li: Electronic Information Engineering Inner Mongolia University Hohhot China
Xitai Na: Electronic Information Engineering Inner Mongolia University Hohhot China
Jinshuo Shi: Electronic Information Engineering Inner Mongolia University Hohhot China
Qingbin Sun: Electronic Information Engineering Inner Mongolia University Hohhot China

DOI: https://doi.org/10.1049/ipr2.13229
Journal volume & issue: Vol. 18, no. 13
pp. 4023 – 4031

Abstract

In practical production applications, the efficiency and success rate of gesture recognition directly affect user experience and work efficiency. However, existing gesture recognition models have large parameter counts and high computational complexity, which prevents them from meeting the needs of end‐to‐end industrial deployment. To solve these problems, this article proposes a gesture recognition model based on YOLOv8. First, FasterNet is adopted as the backbone of YOLOv8, which significantly decreases the number of parameters and makes the model more lightweight. Reducing the number of parameters lowers the computational complexity and improves the operating efficiency of the model while maintaining its performance. Second, the recombination convolution ScConv replaces common convolution operations to further improve the model's efficiency; it reduces computation and compensates for the loss of precision to some extent. Finally, the MDPIoU loss function is used to optimize target localization and prediction and thus improve the accuracy of the model. It handles bounding-box positioning and prediction better, so the model can locate and predict gestures more accurately in gesture recognition tasks. Experiments on a data set containing 10 types of gestures show that the number of parameters and floating-point operations of the improved network model are reduced by 45% and 42.7%, respectively, while the accuracy is unchanged. The improved model can be deployed on edge terminals, providing efficient and accurate gesture recognition.
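
To illustrate why a FasterNet-style backbone can reduce parameters and computation, the sketch below compares a standard convolution with the partial convolution (PConv) that FasterNet is built on, in which only a fraction of the channels is convolved and the rest pass through unchanged. The layer sizes and the 1/4 channel ratio are assumptions chosen for illustration, not values from the article, and the count covers a single spatial convolution only (the pointwise layers that follow it in a full block are ignored). It does not reproduce the 45% and 42.7% reductions reported above.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def conv_macs(c_in, c_out, k, h, w):
    """Multiply-accumulate count for a stride-1 convolution producing an h x w map."""
    return h * w * conv_params(c_in, c_out, k)


def pconv_params(channels, k, ratio=4):
    """Weight count of a partial convolution.

    Only channels // ratio channels are convolved; the remaining channels
    are passed through untouched and therefore need no weights.
    The ratio of 4 is an illustrative assumption.
    """
    c_p = channels // ratio
    return conv_params(c_p, c_p, k)


def pconv_macs(channels, k, h, w, ratio=4):
    """Multiply-accumulate count of the partial convolution on an h x w map."""
    return h * w * pconv_params(channels, k, ratio)


if __name__ == "__main__":
    # Hypothetical layer: 64 channels, 3 x 3 kernel, 80 x 80 feature map.
    channels, k, h, w = 64, 3, 80, 80
    full = conv_params(channels, channels, k)
    partial = pconv_params(channels, k)
    print("standard conv weights:", full)
    print("partial conv weights: ", partial)
    print("reduction factor:     ", full // partial)
    print("standard conv MACs:   ", conv_macs(channels, channels, k, h, w))
    print("partial conv MACs:    ", pconv_macs(channels, k, h, w))
```

With a 1/4 channel ratio, both the weights and the multiply-accumulates of this single layer shrink by a factor of 16, because the cost of a convolution scales with the product of input and output channel counts.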

Published in IET Image Processing

ISSN: 1751-9659 (Print); 1751-9667 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Technology: Photography; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519667
