IET Computer Vision (Mar 2020)
Scale specified single shot multibox detector
Abstract
Detecting objects at vastly different scales is a fundamental challenge in computer vision. To address it, some approaches (e.g. TridentNet) investigate the effect of receptive fields, whereas others (e.g. SNIP, SNIPER) are based on the image pyramid strategy. In this study, a novel single‐shot detector, called the scale specified single‐shot multibox detector (4SD), is proposed. It aims to predict objects of each specific scale range separately by using feature maps of different sizes. First, a scale specific inference module generates a parallel multi‐branch architecture with feature maps of different sizes. Then, the authors propose a scale specific training scheme that specialises each branch by sampling object instances of appropriate scales for training. Results are reported on both the PASCAL VOC and MS‐COCO detection benchmarks. The proposed method achieves a mean average precision of 83.1% on PASCAL VOC 2007 and 36.9% on MS‐COCO at a speed of 28 frames per second, which is superior to most single‐stage detectors.
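The idea of sampling object instances of appropriate scales for each branch can be illustrated with a minimal sketch. This is not the authors' implementation: the branch names, the pixel thresholds, the use of the square root of box area as the scale measure, and the choice of non-overlapping ranges are all assumptions made for illustration only.

```python
import math

# Hypothetical scale ranges per branch, measured as sqrt(box area) in pixels.
# These thresholds are illustrative assumptions, not values from the paper.
BRANCH_RANGES = {
    "small": (0.0, 64.0),
    "medium": (64.0, 192.0),
    "large": (192.0, math.inf),
}


def box_scale(box):
    """Return sqrt(area) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    width = max(x2 - x1, 0.0)
    height = max(y2 - y1, 0.0)
    return math.sqrt(width * height)


def assign_to_branches(boxes, ranges=BRANCH_RANGES):
    """Group ground-truth boxes by the branch whose scale range contains them.

    Each branch would then be trained only on the boxes assigned to it.
    A range is treated as half-open: low <= scale < high.
    """
    assigned = {name: [] for name in ranges}
    for box in boxes:
        scale = box_scale(box)
        for name, (low, high) in ranges.items():
            if low <= scale < high:
                assigned[name].append(box)
    return assigned


if __name__ == "__main__":
    ground_truth = [(0, 0, 32, 32), (0, 0, 100, 100), (0, 0, 300, 300)]
    for branch, boxes in assign_to_branches(ground_truth).items():
        print(branch, boxes)
```

Under these assumed thresholds, each branch sees only the instances whose scale falls within its own range, which is one simple way to specialise parallel branches during training.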
Keywords