Jisuanji kexue yu tansuo (Nov 2023)

HSKDLR: Lightweight Lip Reading Method Based on Homogeneous Self-Knowledge Distillation

  • MA Jinlin, LIU Yuhao, MA Ziping, GONG Yuanwen, ZHU Yanbin

DOI
https://doi.org/10.3778/j.issn.1673-9418.2208032
Journal volume & issue
Vol. 17, no. 11
pp. 2689–2702

Abstract

To address the low recognition rate and heavy computational cost of lip reading, this paper proposes a lightweight lip reading model named HSKDLR (homogeneous self-knowledge distillation for lip reading). Firstly, an S-SE (spatial SE) attention module is designed to attend to the spatial features of the lip image; it is used to construct the i-Ghost Bottleneck (improved Ghost Bottleneck) module, which extracts both the channel and spatial features of the lip image and thereby improves the accuracy of the lip reading model. Secondly, a lip reading model is built on the i-Ghost Bottleneck, which reduces model computation to a certain extent by optimizing the combination of bottleneck structures. Then, to improve accuracy and reduce training time, a model optimization method, homogeneous self-knowledge distillation (HSKD), is proposed. Finally, HSKD is employed to train the lip reading model and its recognition performance is verified. Experimental results show that HSKDLR achieves higher recognition accuracy and lower computational complexity than the compared methods: its accuracy on the LRW dataset is 87.3%, with as few as 2.564 GFLOPs of floating-point computation and as few as 3.8723×10⁷ parameters. Moreover, HSKD can be applied to most lip reading models to effectively improve recognition accuracy and reduce training time.
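The abstract describes the S-SE module only at a high level. For illustration, below is a minimal PyTorch sketch of a spatial squeeze-and-excitation block of the kind the abstract names: a 1×1 convolution collapses the channel dimension into a single-channel spatial attention map that reweights each position. The class name SpatialSE and the exact layer choices are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SpatialSE(nn.Module):
    """Spatial squeeze-and-excitation (sketch): a 1x1 conv squeezes the
    channel axis into one spatial attention map, which then reweights
    every spatial position of the input feature map."""

    def __init__(self, channels: int):
        super().__init__()
        self.squeeze = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        attn = torch.sigmoid(self.squeeze(x))  # (batch, 1, H, W)
        return x * attn                        # broadcast over channels
```

Per the abstract, this spatial branch is combined with channel-feature extraction inside the i-Ghost Bottleneck; the sketch above shows only the spatial attention part.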
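Likewise, the abstract does not spell out the exact form of homogeneous self-knowledge distillation. A common formulation of self-knowledge distillation blends hard-label cross-entropy with a KL term against the same network's softened predictions from an earlier training stage, so no separate teacher model is needed. The sketch below follows that generic pattern under those assumptions; the function name, the weighting alpha, and the temperature T are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(logits: torch.Tensor,
                           prev_logits: torch.Tensor,
                           targets: torch.Tensor,
                           alpha: float = 0.5,
                           T: float = 4.0) -> torch.Tensor:
    """Self-knowledge distillation (sketch): combine cross-entropy on the
    hard labels with a KL term against the model's own softened predictions
    from an earlier snapshot, which acts as the 'teacher'."""
    ce = F.cross_entropy(logits, targets)
    # Soft targets come from the same model's earlier predictions, detached
    # so no gradient flows into the teacher side.
    soft_teacher = F.softmax(prev_logits.detach() / T, dim=1)
    soft_student = F.log_softmax(logits / T, dim=1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (T * T)
    return (1 - alpha) * ce + alpha * kd
```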

Keywords