Robust visual tracking using very deep generative model

Eman R. AlBasiouny; Abdel-Fattah Attia; Hossam E. Abdelmunim; Hazem M. Abbas

doi:10.1186/s40537-022-00682-4

Journal of Big Data (Jan 2023)

Affiliations

Eman R. AlBasiouny: Computer Science and Engineering Department, Faculty of Engineering, Kafrelsheikh University
Abdel-Fattah Attia: Computer Science and Engineering Department, Faculty of Engineering, Kafrelsheikh University
Hossam E. Abdelmunim: Computer and Systems Engineering Department, Faculty of Engineering, Ain Shams University
Hazem M. Abbas: Computer and Systems Engineering Department, Faculty of Engineering, Ain Shams University

Journal volume & issue: Vol. 10, no. 1, pp. 1–26

Abstract

Deep learning algorithms provide visual tracking robustness at an unprecedented level, but achieving acceptable performance remains challenging because the features of foreground and background objects change continuously over the course of a video. One of the factors that most affects the robustness of tracking algorithms is the choice of network architecture parameters, especially depth. This study proposes a robust visual tracking model that uses a very deep generator (RTDG). We built the model on an ordinary convolutional neural network (CNN) consisting of a feature extraction network and a binary classifier network. We integrated a generative adversarial network (GAN) into the CNN to enhance the tracking results through an adversarial learning process performed during the training phase. The discriminator serves as the classifier, and the generator serves as a source of unlabeled feature-level data with different appearances, produced by applying masks to the extracted features. We investigated how increasing the number of fully connected (FC) layers in adversarial generative networks affects robustness. For the first time, we used a very deep FC network with 22 layers as a high-performance generator. Through adversarial learning, this generator augments the positive samples, narrowing the gap between data-hungry deep learning algorithms and the available training data in order to achieve robust visual tracking. The experiments showed that the proposed framework performed well against state-of-the-art trackers on the OTB-100, VOT2019, LaSOT, and UAVDT benchmark datasets.
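The mask-based augmentation described in the abstract can be illustrated with a rough forward-pass sketch. This is not the authors' implementation: the feature-map shape (512, 3, 3), the hidden width of 64, the He initialization, the sigmoid output, and all function names are assumptions chosen for illustration. Only the 22-layer fully connected depth and the idea of applying a generated mask to extracted features come from the abstract. The adversarial training loop is omitted.

```python
import numpy as np

def init_generator(in_dim, hidden_dim, out_dim, num_layers=22, seed=0):
    """Create (weight, bias) pairs for a stack of fully connected layers.

    num_layers=22 follows the depth stated in the abstract; the layer
    widths and the initialization scheme are assumptions.
    """
    rng = np.random.default_rng(seed)
    dims = [in_dim] + [hidden_dim] * (num_layers - 1) + [out_dim]
    # He initialization (assumed) helps keep activations from vanishing
    # across many ReLU layers.
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(dims[:-1], dims[1:])]

def generate_mask(params, features):
    """Map a (C, H, W) feature map to an (H, W) mask with values in (0, 1)."""
    _, h, w = features.shape
    x = features.reshape(-1)
    for weight, bias in params[:-1]:
        x = np.maximum(x @ weight + bias, 0.0)  # FC layer followed by ReLU
    weight, bias = params[-1]
    logits = np.clip(x @ weight + bias, -30.0, 30.0)  # avoid overflow in exp
    return (1.0 / (1.0 + np.exp(-logits))).reshape(h, w)  # sigmoid

def apply_mask(features, mask):
    """Apply one spatial mask to every channel of the feature map."""
    return features * mask[None, :, :]

# Hypothetical usage with random stand-in features.
features = np.random.default_rng(1).normal(size=(512, 3, 3))
params = init_generator(features.size, 64, 3 * 3)
mask = generate_mask(params, features)
augmented = apply_mask(features, mask)
```

In an adversarial setup of this kind, the generator weights would be trained against the classifier (discriminator) rather than left at their random initial values as they are here.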

Published in Journal of Big Data

ISSN: 2196-1115 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://journalofbigdata.springeropen.com
