G-CNN: Object Detection via Grid Convolutional Neural Network

Qishuo Lu; Chonghua Liu; Zhuqing Jiang; Aidong Men; Bo Yang

doi:10.1109/ACCESS.2017.2770178

IEEE Access (Jan 2017)

Affiliations

Qishuo Lu: Multimedia Technology Center, School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing, China
Chonghua Liu: China Academy of Space Technology, Beijing, China
Zhuqing Jiang: Multimedia Technology Center, School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing, China
Aidong Men: Multimedia Technology Center, School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing, China
Bo Yang: Multimedia Technology Center, School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing, China

DOI: https://doi.org/10.1109/ACCESS.2017.2770178
Journal volume & issue: Vol. 5, pp. 24023–24031

Abstract

We propose an object detection system that depends on position-sensitive grid feature maps. State-of-the-art object detection networks rely on convolutional neural networks pre-trained on a large auxiliary data set (e.g., ILSVRC 2012) designed for an image-level classification task. Image-level classification favors translation invariance, whereas object detection needs localization representations that are, to an extent, translation variant. To address this dilemma, we construct position-sensitive convolutional layers, called grid convolutional layers, that activate the object’s specific locations in the feature maps in the form of grids. With end-to-end training, the region-of-interest (RoI) grid pooling layer shepherds the last set of convolutional layers to learn specialized grid feature maps. Experiments on the PASCAL VOC 2007 data set show that our method outperforms two strong baselines, its Faster Region-based Convolutional Neural Network (Faster R-CNN) counterpart and Region-based Fully Convolutional Networks (R-FCN), by a large margin. Applied to ResNet-50, our method improves the mean average precision from 74.8%/74.2% to 79.4% without any other tricks. In addition, our approach achieves similar results on different networks (ResNet-101) and data sets (PASCAL VOC 2012 and MS COCO).
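
The abstract does not spell out the pooling rule, so the following is only a rough illustration of what "position-sensitive" pooling over a grid can look like, not the authors' implementation. It assumes a scheme in which an RoI is split into a k x k grid and each grid cell is average-pooled from its own dedicated feature map, so a map only contributes when the matching part of the object falls in the matching cell. The names `grid_pool` and `roi_score`, the one-map-per-cell layout, and the averaging vote are all assumptions made for the sketch.

```python
import math


def grid_pool(feature_maps, roi, k):
    """Pool one RoI into a k x k grid (hypothetical helper).

    feature_maps: list of k*k 2-D lists (H x W); map i*k + j is assumed
                  to be dedicated to grid cell (i, j).
    roi: (x0, y0, x1, y1) in feature-map coordinates, end-exclusive,
         assumed to lie inside the maps and to be at least 1 x 1.
    """
    x0, y0, x1, y1 = roi
    bin_h = (y1 - y0) / k
    bin_w = (x1 - x0) / k
    pooled = [[0.0] * k for _ in range(k)]
    for i in range(k):        # grid row
        for j in range(k):    # grid column
            channel = feature_maps[i * k + j]
            # Spatial extent of cell (i, j); every cell covers at least one pixel.
            r0 = y0 + math.floor(i * bin_h)
            r1 = max(y0 + math.ceil((i + 1) * bin_h), r0 + 1)
            c0 = x0 + math.floor(j * bin_w)
            c1 = max(x0 + math.ceil((j + 1) * bin_w), c0 + 1)
            values = [channel[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            pooled[i][j] = sum(values) / len(values)
    return pooled


def roi_score(pooled):
    """Average the grid cells into a single score (assumed voting rule)."""
    cells = [v for row in pooled for v in row]
    return sum(cells) / len(cells)


# Toy input: four 4 x 4 maps, where map m holds the constant value m + 1.
k = 2
maps = [[[float(m + 1)] * 4 for _ in range(4)] for m in range(k * k)]
pooled = grid_pool(maps, (0, 0, 4, 4), k)
score = roi_score(pooled)
```

In this toy input each grid cell reads only from its own map, so the pooled grid reproduces the per-map constants. Shifting an object relative to the RoI would change which maps respond, which is the translation-variant behavior the abstract argues detection needs.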

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
