IET Computer Vision (Oct 2023)

Improving multispectral pedestrian detection with scale‐aware permutation attention and adjacent feature aggregation

  • Xin Zuo,
  • Zhi Wang,
  • Jifeng Shen,
  • Wankou Yang

DOI
https://doi.org/10.1049/cvi2.12159
Journal volume & issue
Vol. 17, no. 7
pp. 726 – 738

Abstract

A high‐quality feature fusion module is one of the key components of a multispectral pedestrian detection system in challenging situations, such as large scale variance and occlusion. Although the attention mechanism is one of the most effective ways to refine features, the correlation between attention and scales in the feature pyramid remains unknown. Therefore, a scale‐aware permutated attention module is proposed to adaptively enhance the features of objects at different scales in the feature pyramid. Specifically, four different local and global attention sub‐modules are investigated to refine feature maps with different permutations in the Feature Pyramid Network, improving the quality of the feature fusion. In addition, to address the high miss rate for small pedestrians, an adjacent‐branch feature aggregation module is proposed to aggregate features across different scales, taking both semantic context and spatial resolution into consideration. The two modules benefit from each other, yielding significant improvements in efficiency and accuracy when integrated into a dual‐branch CenterNet detection framework. Experiments on the KAIST and FLIR datasets demonstrate superior performance compared with other state‐of‐the‐art methods.

Keywords