Jisuanji kexue yu tansuo (Aug 2021)

Crowd Density Detection Technology Based on Deep Semantic Segmentation

  • MA Yu, DU Huimin, MAO Zhili, ZHANG Xia

DOI
https://doi.org/10.3778/j.issn.1673-9418.2005066
Journal volume & issue
Vol. 15, no. 8
pp. 1469 – 1475

Abstract

As more people travel and gather in public, crowded scenes are increasingly common, which makes crowd density detection particularly important. To address the multi-scale problem in crowd images, where the camera viewing angle causes people to appear at different scales, a crowd density detection method based on deep semantic segmentation is proposed. The front end of the network uses an improved VGG network to extract crowd features, producing an output feature map 1/8 the size of the original image to improve the accuracy of the predicted density map. The back end contains two atrous (dilated) convolution modules with different dilation rates to capture the multi-scale features of the crowd, allowing the network to retain more scale detail and edge information. Finally, a 1×1 convolution combines the outputs to produce a high-quality predicted density map. To counter the gridding effect caused by atrous convolution, a zigzag network structure is designed so that every pixel takes part in the convolution computation after zero filling, which preserves the continuity of information and improves the accuracy of the network. Network performance is evaluated on the ShanghaiTech and UCF_CC_50 datasets, and the results are better than those of current mainstream crowd density detection methods. In terms of MAE, the method improves on the MCNN network by 42.4% and 38.1%; compared with SANet, performance improves by 5.3% and 9.6%.
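
The gridding effect mentioned in the abstract can be illustrated with a small one-dimensional sketch. It is not the authors' network: the function names and the dilation rates below are hypothetical, chosen only to show why stacking atrous convolutions with a single repeated rate leaves input positions unused, while varying the rate from layer to layer can fill those holes.

```python
def receptive_offsets(dilations, kernel_size=3):
    """Input offsets that can influence one output position after
    stacking 1-D dilated convolutions with the given rates."""
    half = kernel_size // 2
    offsets = {0}
    for rate in dilations:
        # Tap positions of one dilated kernel, relative to its center.
        taps = [k * rate for k in range(-half, half + 1)]
        offsets = {o + t for o in offsets for t in taps}
    return sorted(offsets)


def coverage(dilations, kernel_size=3):
    """Fraction of positions inside the receptive-field span that are used."""
    offsets = receptive_offsets(dilations, kernel_size)
    span = offsets[-1] - offsets[0] + 1
    return len(offsets) / span


# Hypothetical rates for illustration only (not taken from the paper).
repeated_rates = [2, 2, 2]  # same rate in every layer
varied_rates = [1, 2, 3]    # rate changes from layer to layer

print(receptive_offsets(repeated_rates))  # only even offsets are reached
print(coverage(repeated_rates))           # 7 of 13 positions are used
print(coverage(varied_rates))             # all 13 positions are used
```

With the repeated rate, only every second input position contributes to an output, which is the discontinuity the abstract describes. With the varied rates, every position in the same span contributes.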

Keywords