IEEE Access (Jan 2019)
Auto-Selecting Receptive Field Network for Visual Tracking
Abstract
Recently, Convolutional Neural Networks (CNNs) have shown tremendous potential in the visual tracking community. It is well known that the receptive field is a critical factor affecting CNN performance. However, standard CNN-based tracking methods give the artificial neurons in each layer receptive fields of the same size. We identify these regular receptive fields as the main bottleneck limiting tracking accuracy. To address this problem, we propose an Auto-Selecting Receptive Field Network (ASRF) that dynamically selects receptive field information and effective clues. In particular, a Selective Receptive Field Block (SRFB) is designed to adaptively adjust the receptive field size of each neuron according to multiple scales of input information. Additionally, we develop a Multi-Scale Receptive Field module (MSRF) that goes a step further by selecting effective clues from receptive fields of different scales. The proposed ASRF method performs favorably against state-of-the-art trackers on five benchmarks (OTB-2013, OTB-2015, UAV-123, VOT-2015, and VOT-2017) while running at beyond real-time speed.
Keywords