Gong-kuang zidonghua (Oct 2022)

Unsafe action recognition in underground coal mine based on cross-attention mechanism

  • RAO Tianrong,
  • PAN Tao,
  • XU Huijun

DOI
https://doi.org/10.13272/j.issn.1671-251x.17949
Journal volume & issue
Vol. 48, no. 10
pp. 48 – 54

Abstract

Read online

The real-time video monitoring and alarming of unsafe actions of coal mine personnel is an important means to improve the level of safety in production. The coal mine underground environment is complex, and the monitoring video quality is poor. The conventional action recognition method based on image features or human body key point features is limited in application in the underground coal mine. An action recognition model of multi-feature fusion based on cross-attention mechanism is proposed to recognize unsafe actions of coal mine personnel. For segment video images, the 3D ResNet101 model is adopted to extract image features. The openpose algorithm and ST-GCN (space-time graph convolutional network) are adopted to extract human body key point features. The cross-attention mechanism is used to fuse the image features and human key point features. The fused features are spliced respectively with the image features or human key point features processed by the self-attention mechanism to obtain the final action recognition features. The recognition features is processed by the fully connected layer and the normalized exponential function softmax to obtain action recognition result. Based on the public data sets HMDB51 and UCF101, and the self-built coal mine video dataset, the action recognition experiment is carried out. The results show that the cross-attention mechanism can make the action recognition model more effective in fusing image features and human key point features, and greatly improve the recognition accuracy. At present, SlowFast is the most widely used action recognition model. Compared with the SlowFast, the recognition accuracy of the action recognition model of multi-feature fusion based on cross-attention mechanism has been improved by 1.8% and 0.9% for HMDB51 and UCF101 datasets respectively. The recognition accuracy on the self-built dataset has increased by 6.7%. It is verified that the action recognition model of multi-feature fusion based on cross-attention mechanism is more suitable for the recognition of unsafe actions in the complex coal mine environment.

Keywords