IEEE Access (Jan 2020)
Scale-Aware Hierarchical Detection Network for Pedestrian Detection
Abstract
Spatial scale variation of several or even dozens of times is one of the major bottlenecks for pedestrian detection. Although the Region-based Convolutional Neural Network (R-CNN) family has shown promising results for object detection, it is still limited in detecting pedestrians with large scale variations due to the fixed receptive field sizes of a single convolutional output layer. In contrast to previous methods that simply combine pedestrian predictions on feature maps of different resolutions, we propose a scale-aware hierarchical detection network for pedestrian detection under large scale variations. First, we introduce a cross-scale feature aggregation module that augments pedestrian representations by merging lateral connections with top-down and bottom-up paths. Specifically, the cross-scale feature aggregation module adaptively fuses hierarchical features to enhance the feature pyramid representation with robust semantics and accurate localization. Further, we design a scale-aware hierarchical detection network that effectively integrates multiscale pedestrian detection into a unified framework by adaptively selecting the augmented feature level for scale-specific pedestrian detection. Experimentally, the proposed scale-aware hierarchical detection network forms a more robust and discriminative model for pedestrian instances of different scales on the widely used ETH and Caltech benchmarks. In particular, compared with the state-of-the-art method FasterRCNN+ATT, it reduces the log-average miss rate by 11.98% for medium-scale pedestrians (30-80 pixels in height) and by 14.12% for pedestrians across the whole scale range (above 20 pixels in height) on the Caltech benchmark.
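To make the described aggregation concrete, the following is a minimal PyTorch sketch (not the authors' code) of a cross-scale feature aggregation module that merges lateral connections with top-down and bottom-up paths, in the spirit of FPN/PANet-style pyramids. All class names, channel widths, and the use of simple addition for fusion (rather than the paper's adaptive weighting) are illustrative assumptions.

    # Hypothetical sketch of cross-scale feature aggregation; fusion by
    # elementwise addition stands in for the paper's adaptive fusion.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CrossScaleAggregation(nn.Module):
        def __init__(self, in_channels=(256, 512, 1024), out_channels=256):
            super().__init__()
            # 1x1 lateral convs project each backbone level to a common width.
            self.lateral = nn.ModuleList(
                nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels
            )
            # 3x3 convs smooth the fused maps after the top-down pass.
            self.smooth = nn.ModuleList(
                nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
                for _ in in_channels
            )
            # Stride-2 convs realize the bottom-up augmentation path.
            self.downsample = nn.ModuleList(
                nn.Conv2d(out_channels, out_channels, kernel_size=3,
                          stride=2, padding=1)
                for _ in range(len(in_channels) - 1)
            )

        def forward(self, feats):
            # feats: backbone maps ordered fine-to-coarse, e.g. [C3, C4, C5].
            laterals = [l(f) for l, f in zip(self.lateral, feats)]

            # Top-down path: upsample coarse semantics, add lateral features.
            for i in range(len(laterals) - 2, -1, -1):
                laterals[i] = laterals[i] + F.interpolate(
                    laterals[i + 1], size=laterals[i].shape[-2:], mode="nearest"
                )
            pyramid = [s(x) for s, x in zip(self.smooth, laterals)]

            # Bottom-up path: propagate fine localization cues back upward.
            for i in range(len(pyramid) - 1):
                pyramid[i + 1] = pyramid[i + 1] + self.downsample[i](pyramid[i])
            return pyramid  # one augmented map per scale, fine-to-coarse

    # Example: three feature levels as produced from a 512x512 input.
    feats = [torch.randn(1, c, s, s)
             for c, s in [(256, 64), (512, 32), (1024, 16)]]
    outs = CrossScaleAggregation()(feats)
    print([o.shape for o in outs])

A scale-aware detection head would then assign each pedestrian proposal to one of the returned pyramid levels according to its height, so that small instances are predicted from the fine maps and large instances from the coarse ones.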
Keywords