IEEE Access (Jan 2024)
Two-Stream Spatial-Temporal Auto-Encoder With Adversarial Training for Video Anomaly Detection
Abstract
Auto-encoders have been widely used in video anomaly detection, which aims to detect abnormal segments in surveillance video. However, owing to the lack of abnormal samples, previous auto-encoder methods model the normal event by training only on normal samples, which may lead to self-reconstruction and cannot guarantee a larger reconstruction error for an abnormal event. In this paper, a novel Two-Stream spatial-temporal auto-encoder (Two-Stream STAE) network with adversarial training is designed for video anomaly detection. The Two-Stream STAE network is composed of a spatial auto-encoder stream and a temporal auto-encoder stream. The spatial stream reconstructs the appearance model of the normal event by extracting features of detected objects, and the temporal stream encodes the stacked optical flow maps to reconstruct the temporal model of the normal event. To enlarge the reconstruction errors of abnormal events in the network, an adversarial training branch that takes abnormal samples as input is proposed to train the model. A pseudo-abnormal dataset is also built to address the lack of abnormal samples. Experimental results on three public benchmark datasets demonstrate that the proposed method is strongly competitive with state-of-the-art methods.
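The scoring idea summarized above can be illustrated with a minimal sketch. It is not the formulation used in the paper, since the abstract does not give the loss or the fusion rule. The function names, the equal-weight fusion of the two streams, and the hinge-style margin term are all assumptions made for illustration only.

```python
def reconstruction_error(x, x_hat):
    """Mean squared error between a flat feature vector and its reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("input and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def two_stream_score(spatial_err, temporal_err, weight=0.5):
    """Fuse the spatial and temporal errors into one anomaly score.

    The weighted sum and the default weight are assumptions, not the
    paper's fusion rule.
    """
    return weight * spatial_err + (1.0 - weight) * temporal_err


def adversarial_objective(normal_errs, abnormal_errs, margin=1.0):
    """Hypothetical training objective with an adversarial term.

    Normal samples are pushed toward low reconstruction error, while
    (pseudo-)abnormal samples are penalized whenever their error falls
    below the margin. The hinge form and the margin value are assumptions.
    """
    normal_term = sum(normal_errs) / len(normal_errs)
    abnormal_term = sum(max(0.0, margin - e) for e in abnormal_errs) / len(abnormal_errs)
    return normal_term + abnormal_term
```

Under these assumptions, the objective reaches zero only when normal samples are reconstructed exactly and every abnormal sample has an error of at least the margin, which is one way to express the goal of enlarging the reconstruction error of abnormal events.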
Keywords