Complex & Intelligent Systems (Feb 2024)

TAN: a temporal-aware attention network with context-rich representation for boosting proposal generation

  • Yanyan Jiao,
  • Wenzhu Yang,
  • Wenjie Xing,
  • Shuang Zeng,
  • Lei Geng

DOI
https://doi.org/10.1007/s40747-024-01343-0
Journal volume & issue
Vol. 10, no. 3
pp. 3691 – 3708

Abstract

Temporal action proposal generation in untrimmed videos is challenging, and comprehensive context exploration is critical for generating accurate candidate action instances. This paper proposes a Temporal-aware Attention Network (TAN) that localizes context-rich proposals by enhancing the temporal representations of boundaries and proposals. First, we observe that precisely locating action instances requires long-distance temporal context. To this end, we propose a Global-Aware Attention (GAA) module for boundary-level interaction. Specifically, we introduce two novel gating mechanisms into the top-down interaction structure to incorporate multi-level semantics into video features effectively. Second, we design an efficient task-specific Adaptive Temporal Interaction (ATI) module to learn proposal associations. By using multi-scale interaction modules, TAN enhances proposal-level contextual representations over a wide range. Extensive experiments on ActivityNet-1.3 and THUMOS-14 demonstrate the effectiveness of the proposed method: TAN achieves 73.43% AR@1000 on THUMOS-14 and 69.01% AUC on ActivityNet-1.3. Moreover, TAN significantly improves temporal action detection performance when combined with existing action classification frameworks.
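
For readers unfamiliar with the AR@N metric quoted in the abstract, the following is a minimal sketch of how recall over temporal IoU (tIoU) thresholds can be computed for a single video. It is not the authors' evaluation code. The function names, the `(start, end, score)` proposal layout, and the per-video simplification are assumptions made for illustration; benchmark toolkits average over all videos and over a fixed grid of tIoU thresholds.

```python
def temporal_iou(a, b):
    """tIoU of two (start, end) segments, e.g. in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_n(proposals, ground_truth, n, threshold):
    """Fraction of ground-truth segments covered by any of the top-n proposals.

    proposals: list of (start, end, score); ground_truth: list of (start, end).
    Hypothetical layout chosen for this sketch.
    """
    # Keep the n highest-scoring proposals.
    top = sorted(proposals, key=lambda p: p[2], reverse=True)[:n]
    hits = sum(
        any(temporal_iou(gt, (s, e)) >= threshold for s, e, _ in top)
        for gt in ground_truth
    )
    return hits / len(ground_truth) if ground_truth else 0.0


def average_recall_at_n(proposals, ground_truth, n, thresholds):
    """Mean of recall_at_n over the given tIoU thresholds (single video)."""
    recalls = [recall_at_n(proposals, ground_truth, n, t) for t in thresholds]
    return sum(recalls) / len(recalls)
```

Called as `average_recall_at_n(proposals, ground_truth, n=1000, thresholds=[...])`, this gives a per-video analogue of AR@1000. The threshold grid is left to the caller because it depends on the benchmark.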

Keywords