Redesigned Skip-Network for Crowd Counting with Dilated Convolution and Backward Connection

Sorn Sooksatra; Toshiaki Kondo; Pished Bunnun; Atsuo Yoshitaka

doi:10.3390/jimaging6050028

Journal of Imaging (May 2020)

Affiliations

Sorn Sooksatra: School of Information and Communication Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathum Thani 12120, Thailand
Toshiaki Kondo: School of Information and Communication Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathum Thani 12120, Thailand
Pished Bunnun: National Electronic and Computer Technology Center, National Science and Technology Development Agency, Pathum Thani 12120, Thailand
Atsuo Yoshitaka: School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa 923-1211, Japan

DOI: https://doi.org/10.3390/jimaging6050028
Journal volume & issue: Vol. 6, no. 5, p. 28

Abstract

Crowd counting is a challenging task because object scale and crowd density vary widely. Existing works have emphasized skip connections that integrate shallower layers with deeper layers, where each layer extracts features at a different object scale and crowd density. However, these designs emphasize only high-level features and ignore low-level features. This paper proposes an estimation network that passes high-level features back to shallow layers to emphasize their low-level features. Because the estimation network is hierarchical, the improved low-level features in turn strengthen the high-level features. The estimation network consists of two identical networks, one for extracting high-level features and one for estimating the final result. To preserve semantic information, dilated convolution is employed without resizing the feature map. The method was tested on three datasets for counting humans and vehicles in crowd images. Counting performance is evaluated by mean absolute error and root mean squared error, which indicate the accuracy and robustness of an estimation network, respectively. The experimental results show that the network outperforms related works at high crowd density and is effective at reducing over-counting error overall.
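
The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The function names and the sample counts below are hypothetical and are not taken from the paper; the sketch assumes the standard per-image definitions of mean absolute error and root mean squared error over ground-truth and estimated counts.

```python
import math


def mean_absolute_error(true_counts, predicted_counts):
    """Average absolute difference between estimated and true counts."""
    errors = [abs(p - t) for t, p in zip(true_counts, predicted_counts)]
    return sum(errors) / len(errors)


def root_mean_squared_error(true_counts, predicted_counts):
    """Square root of the average squared count difference.

    Large errors on individual images are penalized more heavily than in
    MAE, which is why the metric is commonly read as a robustness measure.
    """
    squared = [(p - t) ** 2 for t, p in zip(true_counts, predicted_counts)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical per-image counts, for illustration only.
true_counts = [100, 250, 40]
predicted_counts = [110, 240, 70]

mae = mean_absolute_error(true_counts, predicted_counts)
rmse = root_mean_squared_error(true_counts, predicted_counts)
print(f"MAE:  {mae:.2f}")
print(f"RMSE: {rmse:.2f}")
```

In this made-up sample, a single image with a large error raises the RMSE above the MAE, which is the behavior that makes the pair useful for separating average accuracy from robustness.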

Published in Journal of Imaging

ISSN: 2313-433X (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Photography; Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://www.mdpi.com/journal/jimaging
