IEEE Access (Jan 2024)

Spatially Aware Fusion in 3D Convolutional Autoencoders for Video Anomaly Detection

  • Asim Niaz,
  • Sareer Ul Amin,
  • Shafiullah Soomro,
  • Hamza Zia,
  • Kwang Nam Choi

DOI
https://doi.org/10.1109/ACCESS.2024.3435144
Journal volume & issue
Vol. 12
pp. 104770–104784

Abstract


Surveillance videos are crucial for crime prevention and public safety, yet the difficulty of defining abnormal events limits the applicability of supervised methods. This paper introduces an unsupervised end-to-end architecture for video anomaly detection that exploits spatial and temporal features to identify anomalies in surveillance footage. The model employs a three-dimensional (3D) convolutional autoencoder whose encoder-decoder structure learns spatiotemporal representations and reconstructs the input through the latent space. Skip connections linking the encoder and decoder blocks transfer information across multiple scales of feature representation, enhancing the reconstruction process and improving overall performance. The architecture incorporates spatial attention modules that highlight informative regions of the input, enabling improved anomaly detection. Spatial and contextual dependencies are further captured by the 3D convolutional filters. The proposed model is evaluated on four benchmark datasets: UCSD Pedestrian 1, UCSD Pedestrian 2, CUHK Avenue, and ShanghaiTech. Notably, it achieves frame-level Area Under the Curve (AUC) scores of 94.6% on UCSD Ped 1, 96.7% on UCSD Ped 2, 84.7% on CUHK Avenue, and 74.8% on ShanghaiTech. These results demonstrate state-of-the-art performance, highlighting the approach's efficacy in real-world anomaly detection scenarios.
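
To make the described pipeline concrete, the following is a minimal PyTorch sketch of such an architecture. The abstract does not give implementation details, so the layer counts, channel widths, additive skip connections, the pooling-based spatial attention, and the per-frame reconstruction-error score are illustrative assumptions, not the authors' exact configuration.

# Minimal sketch (not the authors' code): a 3D convolutional autoencoder
# with an encoder-decoder skip connection and spatial attention, as
# described in the abstract. All hyperparameters are assumptions.
import torch
import torch.nn as nn

class SpatialAttention3D(nn.Module):
    """Assumed attention design: weight each spatiotemporal location by a
    sigmoid mask computed from channel-wise average- and max-pooled maps."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv3d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)           # (N, 1, T, H, W)
        mx, _ = x.max(dim=1, keepdim=True)          # (N, 1, T, H, W)
        mask = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * mask                             # highlight informative regions

class Conv3DAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: two 3D conv blocks, each halving T, H, and W.
        self.enc1 = nn.Sequential(nn.Conv3d(1, 32, 3, stride=2, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv3d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.att1 = SpatialAttention3D()
        self.att2 = SpatialAttention3D()
        # Decoder: transposed 3D convs mirroring the encoder.
        self.dec2 = nn.Sequential(
            nn.ConvTranspose3d(64, 32, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU())
        self.dec1 = nn.ConvTranspose3d(32, 1, 3, stride=2, padding=1, output_padding=1)

    def forward(self, x):                           # x: (N, 1, T, H, W) clip
        e1 = self.att1(self.enc1(x))
        e2 = self.att2(self.enc2(e1))               # latent spatiotemporal code
        d2 = self.dec2(e2) + e1                     # skip connection (additive, assumed)
        return torch.sigmoid(self.dec1(d2))         # reconstructed clip

# Anomaly scoring: per-frame reconstruction error; poorly reconstructed
# (high-error) frames are flagged as anomalous.
clip = torch.rand(1, 1, 16, 64, 64)                 # toy 16-frame grayscale clip
model = Conv3DAutoencoder()
recon = model(clip)
score = ((recon - clip) ** 2).mean(dim=(1, 3, 4))   # error per frame, shape (1, 16)

At inference time, these per-frame errors would typically be normalized per video and thresholded (or swept to compute the frame-level AUC reported above).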

Keywords