Shanghai Jiaotong Daxue xuebao (May 2021)

Video Abnormal Detection Combining FCN with LSTM

  • WU Guangli, GUO Zhenzhou, LI Leiting, WANG Chengxiang

DOI
https://doi.org/10.16183/j.cnki.jsjtu.2020.120
Journal volume & issue
Vol. 55, no. 5
pp. 607 – 614

Abstract


To address the shortcomings of traditional video anomaly detection models, a network structure combining a fully convolutional network (FCN) with a long short-term memory (LSTM) network is proposed. The network performs pixel-level prediction and can accurately locate abnormal areas. It first uses a convolutional neural network to extract image features at different depths from the video frames. These features are then fed into the memory network, which analyzes semantic information along the time series. The image features and the semantic information are fused through a residual structure. At the same time, a skip structure integrates the fused features in a multi-mode fashion, and upsampling is applied to obtain a prediction image of the same size as the original video frame. The proposed model is tested on the Ped2 subset of the University of California, San Diego (UCSD) anomaly detection dataset and on the University of Minnesota (UMN) crowd activity dataset, and it achieves good results on both. On the UCSD dataset, the equal error rate (EER) is as low as 6.6%, the area under the curve (AUC) reaches 98.2%, and the F1 score reaches 94.96%. On the UMN dataset, the EER is as low as 7.1%, the AUC reaches 93.7%, and the F1 score reaches 94.46%.
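
The three metrics quoted in the abstract (equal error rate, area under the curve, and F1 score) can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code: the abstract does not say whether scores are computed per frame or per pixel, so the sketch simply assumes a list of anomaly scores with binary labels (1 = abnormal, 0 = normal). The function names, the toy scores, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_curve(scores: Sequence[float], labels: Sequence[int]) -> List[Point]:
    """ROC points obtained by sweeping the decision threshold from high to low."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one abnormal and one normal sample")

    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points: List[Point] = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every sample sharing this score so ties yield a single point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points: Sequence[Point]) -> float:
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def equal_error_rate(points: Sequence[Point]) -> float:
    """Error rate at which FPR equals the miss rate (1 - TPR).

    The crossing is located by linear interpolation between ROC points.
    """
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d0 = x0 - (1.0 - y0)  # FPR minus miss rate at the left point
        d1 = x1 - (1.0 - y1)  # FPR minus miss rate at the right point
        if d0 <= 0.0 <= d1:
            if d1 == d0:
                return x0
            t = -d0 / (d1 - d0)
            return x0 + t * (x1 - x0)
    raise ValueError("ROC curve does not cross the FPR = 1 - TPR line")


def f1_score(scores: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """F1 score when samples with score >= threshold are predicted abnormal."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (made up for illustration, not taken from UCSD or UMN).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
toy_roc = roc_curve(toy_scores, toy_labels)
toy_auc = area_under_curve(toy_roc)
toy_eer = equal_error_rate(toy_roc)
toy_f1 = f1_score(toy_scores, toy_labels, threshold=0.5)
```

A lower EER and a higher AUC both indicate better separation between normal and abnormal samples. The F1 score additionally depends on the chosen threshold, which the abstract does not specify.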

Keywords