IEEE Access (Jan 2020)
Deep Ensemble Object Tracking Based on Temporal and Spatial Networks
Abstract
In recent years, correlation filtering and deep learning have achieved good performance in object tracking. Correlation filtering is an efficient and real-time method because its formula provides a fast solution in the Fourier domain, but it does not benefit from end-to-end training. Although deep learning is an effective method for learning object representations, training deep networks online with one or a few examples is challenging. To address these problems, we propose a deep ensemble object tracking algorithm that fuses temporal and spatial information to improve algorithm precision and robustness. The framework of our algorithm includes four aspects: feature extraction, a baseline network, a branch network and adaptive ensemble learning. Feature extraction extracts the general object representation. The baseline network integrates feature extraction and a correlation filtering algorithm into a convolutional neural network for end-to-end training. The branch network is composed of a temporal network and a spatial network. The temporal and spatial networks capture the object temporal and spatial information and further refine the object position. Our algorithm only needs an initial frame to train all networks. Adaptive ensemble learning compensates for the object information deficiency and improves tracking accuracy. Many experiments on tracking benchmark datasets demonstrate that our algorithm performs favourably compared with state-of-the-art tracking algorithms.
Keywords