IET Computer Vision (Oct 2020)

Dual attention module and multi‐label based fully convolutional network for crowd counting

  • Suyu Wang,
  • Bin Yang,
  • Bo Liu,
  • Guanghui Zheng

DOI
https://doi.org/10.1049/iet-cvi.2019.0674
Journal volume & issue
Vol. 14, no. 7
pp. 443 – 451

Abstract

High‐density crowd counting in natural scenes is a challenging research subject in computer vision. Although algorithms based on convolutional neural networks have achieved significantly better results than traditional algorithms, most of them tend to focus on the local features of images and have difficulty capturing rich global contextual dependencies. To solve this problem, a fully convolutional network based on a dual attention module and a multi‐label mechanism is proposed in this study. The authors improve the algorithm from several perspectives. Firstly, by introducing the dual attention module, global context and long‐range dependencies are adaptively integrated in both the spatial and channel dimensions, which improves the expressive ability of the network. Then, a multi‐label mechanism is designed to reduce the prediction error: the crowd‐counting task is additionally cast as a foreground and background segmentation task, which assists the regression of the density map. Furthermore, in addition to the traditional Euclidean distance loss and cross‐entropy loss, the structural similarity index is introduced to further improve the training of the model. Test results on the UCF_CC_50, ShanghaiTech, and UCF‐QNRF datasets indicate that the proposed method is superior to current mainstream algorithms.
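
To make the loss design described in the abstract concrete, the following is a minimal NumPy sketch of one way a Euclidean distance term, a cross‐entropy term, and a structural similarity (SSIM) term could be combined. It is an illustration under stated assumptions, not the authors' implementation: the function names `global_ssim` and `combined_loss`, the weights `w_ce` and `w_ssim`, the constants `c1` and `c2`, and the use of a single global SSIM window instead of a sliding local window are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def global_ssim(x, y, c1=1e-4, c2=9e-4):
    """SSIM over the whole map as one window (a simplification; assumed)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return num / den


def combined_loss(pred_density, gt_density, pred_fg_prob, gt_fg_mask,
                  w_ce=0.1, w_ssim=0.1):
    """Euclidean + cross-entropy + (1 - SSIM); the weights are assumptions."""
    eps = 1e-7
    # Euclidean (L2) distance between predicted and ground-truth density maps.
    euclid = 0.5 * np.sum((pred_density - gt_density) ** 2)
    # Binary cross-entropy for the foreground/background segmentation label.
    p = np.clip(pred_fg_prob, eps, 1 - eps)
    ce = -np.mean(gt_fg_mask * np.log(p) + (1 - gt_fg_mask) * np.log(1 - p))
    # Structural dissimilarity between the two density maps.
    ssim_term = 1.0 - global_ssim(pred_density, gt_density)
    return euclid + w_ce * ce + w_ssim * ssim_term
```

In this sketch the segmentation label is a binary mask of pixels with non‐zero ground‐truth density, and the predicted count would be the sum of `pred_density`. A perfect prediction drives all three terms towards zero, while a uniform offset in the density map is penalised by both the Euclidean and the SSIM terms.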

Keywords