Multi-Channel Generative Framework and Supervised Learning for Anomaly Detection in Surveillance Videos

Tuan-Hung Vu; Jacques Boonaert; Sebastien Ambellouis; Abdelmalik Taleb-Ahmed

doi:10.3390/s21093179

Sensors (May 2021)

Affiliations

Tuan-Hung Vu: CERI SN, IMT Lille Douai, 941 Rue Charles Bourseul, 59500 Douai, France
Jacques Boonaert: CERI SN, IMT Lille Douai, 941 Rue Charles Bourseul, 59500 Douai, France
Sebastien Ambellouis: COSYS Department, LEOST, Gustave Eiffel University, 59666 Villeneuve d’Ascq, France
Abdelmalik Taleb-Ahmed: Opto-Acoustic-Electronics Department, IEMN, CNRS, UMR 8520, Université Polytechnique Hauts de France, 59313 Valenciennes, France

DOI: https://doi.org/10.3390/s21093179
Journal volume & issue: Vol. 21, no. 9
p. 3179

Abstract

Most recent state-of-the-art anomaly detection methods are based on apparent-motion and appearance reconstruction networks and use the estimated error between generated and real information as detection features. These approaches achieve promising results using only normal samples for training. In this paper, our contributions are two-fold. On the one hand, we propose a flexible multi-channel framework to generate multi-type frame-level features. On the other hand, we study how supervised learning can improve detection performance. The multi-channel framework is based on four Conditional GANs (CGANs) that take various types of appearance and motion information as input and produce prediction information as output. These CGANs provide a better feature space to represent the distinction between normal and abnormal events. The difference between the generated and ground-truth information is then encoded by the Peak Signal-to-Noise Ratio (PSNR). We propose to classify those features in a classical supervised scenario by building a small training set that includes some abnormal samples from the dataset's original test set. A binary Support Vector Machine (SVM) is applied for frame-level anomaly detection. Finally, we use Mask R-CNN as a detector to perform object-centric anomaly localization. Our solution is extensively evaluated on the Avenue, Ped1, Ped2, and ShanghaiTech datasets. Our experimental results demonstrate that PSNR features combined with a supervised SVM outperform the error maps computed by previous methods. We achieve state-of-the-art performance for frame-level AUC on Ped1 and ShanghaiTech. In particular, on the most challenging ShanghaiTech dataset, a supervised training model outperforms the state-of-the-art unsupervised strategy by up to 9%.
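
The abstract does not spell out how the PSNR features are computed, so the following is only a minimal sketch of the idea, assuming the standard PSNR definition (10 log10 of the squared peak value over the mean squared error) and frames flattened to sequences of pixel values. The names `psnr` and `frame_features` and the `max_value` parameter are hypothetical and chosen for illustration; the paper's exact normalization and channel layout may differ.

```python
import math
from typing import List, Sequence


def psnr(predicted: Sequence[float], target: Sequence[float],
         max_value: float = 1.0) -> float:
    """Standard PSNR in dB between two equally sized, flattened frames.

    max_value is the peak pixel value (1.0 for normalized frames,
    255 for 8-bit frames); this choice is an assumption of the sketch.
    """
    if len(predicted) != len(target) or len(predicted) == 0:
        raise ValueError("frames must be non-empty and equally sized")
    # Mean squared error between the generated and ground-truth frame.
    mse = sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)
    if mse == 0.0:
        return math.inf  # identical frames: the error term vanishes
    return 10.0 * math.log10(max_value ** 2 / mse)


def frame_features(predictions: Sequence[Sequence[float]],
                   targets: Sequence[Sequence[float]],
                   max_value: float = 1.0) -> List[float]:
    """One PSNR value per generative channel, giving a frame-level feature vector."""
    if len(predictions) != len(targets):
        raise ValueError("need one ground-truth frame per predicted frame")
    return [psnr(p, t, max_value) for p, t in zip(predictions, targets)]
```

With four channels, `frame_features` would return a four-element vector per frame. A lower PSNR means a larger prediction error, which is the kind of cue a binary classifier (for example scikit-learn's `sklearn.svm.SVC`, named here as an assumption rather than the authors' stated tool) could use to separate normal from abnormal frames.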

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
