IEEE Access (Jan 2020)

Temporal Segment Connection Network for Action Recognition

  • Qian Li,
  • Wenzhu Yang,
  • Xiangyang Chen,
  • Tongtong Yuan,
  • Yuxia Wang

DOI
https://doi.org/10.1109/ACCESS.2020.3027386
Journal volume & issue
Vol. 8
pp. 179118 – 179127

Abstract

Two-stream convolutional neural networks have shown excellent performance in video action recognition. Most existing works train each sampling group independently or fuse the groups only at the final level, which ignores the temporal continuity of an action and the complementary information between action fragments. In this paper, a temporal segment connection network is proposed to overcome these limitations. On the one hand, the forget gate module of the long short-term memory (LSTM) network is used to establish feature-level connections between sampling groups. This not only strengthens information transmission between the sampling groups, enhancing temporal connectivity, but also extracts the complementary information between them, enhancing the overall representation of the action. On the other hand, a bi-directional long short-term memory (Bi-LSTM) network is used to automatically evaluate the importance weight of each sampling group based on the deep feature sequence. Experimental results on the UCF101 and HMDB51 datasets show that the proposed model effectively improves the utilization of temporal information and the overall action representation, and thus significantly improves the accuracy of human action recognition.
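
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no equations, so the gating rule, the function names (forget_gate_connect, fuse_scores) and all parameters below are assumptions made for illustration. In particular, the Bi-LSTM that produces the importance weights is replaced here by externally supplied importance logits, and a softmax is assumed for turning them into weights.

```python
import math


def sigmoid(z):
    """Logistic function used for the gate activation."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forget_gate_connect(features, gate_weights, gate_bias):
    """Link per-group features with an LSTM-style forget gate (assumed form).

    features:     list of T feature vectors, each of length D
    gate_weights: D rows of length 2*D (hypothetical learned parameters)
    gate_bias:    list of length D
    Returns T vectors; each mixes the current group's feature with a
    gated carry-over from the preceding groups.
    """
    dim = len(features[0])
    carried = [0.0] * dim
    connected = []
    for x in features:
        concat = carried + x  # [previous state, current feature]
        gate = [
            sigmoid(sum(w * v for w, v in zip(row, concat)) + b)
            for row, b in zip(gate_weights, gate_bias)
        ]
        # The gate decides how much of the earlier groups to keep.
        carried = [g * c + xi for g, c, xi in zip(gate, carried, x)]
        connected.append(carried)
    return connected


def fuse_scores(class_scores, importance_logits):
    """Weighted fusion of per-group class scores.

    importance_logits stands in for the output of the Bi-LSTM scorer,
    which is not modelled in this sketch.
    """
    weights = softmax(importance_logits)
    num_classes = len(class_scores[0])
    return [
        sum(w * s[k] for w, s in zip(weights, class_scores))
        for k in range(num_classes)
    ]


if __name__ == "__main__":
    # Three sampling groups with 2-dimensional toy features.
    feats = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
    zero_w = [[0.0] * 4 for _ in range(2)]
    print(forget_gate_connect(feats, zero_w, [0.0, 0.0]))
    print(fuse_scores([[0.9, 0.1], [0.2, 0.8]], [2.0, 0.0]))
```

With equal importance logits the fusion reduces to a plain average of the group scores, which corresponds to the "fuse at the last level" baseline the abstract contrasts against.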

Keywords