IET Image Processing (Dec 2021)

Crowd activity recognition in live video streaming via 3D‐ResNet and region graph convolution network

  • Junpeng Kang,
  • Jing Zhang,
  • Wensheng Li,
  • Li Zhuo

DOI
https://doi.org/10.1049/ipr2.12239
Journal volume & issue
Vol. 15, no. 14
pp. 3476 – 3486

Abstract

Since the advent of the we‐media era, the live video industry has grown explosively. For large‐scale live video streaming, especially streams containing crowd events that may have great social impact, effectively identifying and supervising crowd activity is of great value for the healthy development of the live video industry. Existing crowd activity recognition methods rely mainly on visual information and rarely exploit the correlations between crowd content or external knowledge. Therefore, a crowd activity recognition method for live video streaming is proposed, based on 3D‐ResNet and a region graph convolution network (ReGCN). (1) After deep spatiotemporal features are extracted from the live video stream with 3D‐ResNet, region proposals are generated by a region proposal network. (2) A weakly supervised ReGCN is constructed, with region proposals as graph nodes and their correlations as edges. (3) Crowd activity in live video streaming is recognised by combining the output of the ReGCN, the deep spatiotemporal features, and the crowd motion intensity as external knowledge. Four experiments are conducted on the public collective activity extended dataset and a real‐world dataset, BJUT‐CAD. The competitive results demonstrate that our method can effectively recognise crowd activity in live video streaming.
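
Steps (2) and (3) can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not say how region correlations are measured or how the three information sources are fused, so the following are assumptions made only for illustration: cosine similarity as the edge weight, a single graph convolution with symmetric normalisation, and mean pooling followed by concatenation. All function names, feature values, and weights are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_adjacency(regions):
    """Assumed edge weight: non-negative cosine similarity; diagonal = self-loops."""
    n = len(regions)
    return [[1.0 if i == j else max(0.0, cosine(regions[i], regions[j]))
             for j in range(n)] for i in range(n)]

def normalize(adj):
    """Symmetric normalisation D^{-1/2} A D^{-1/2} (degrees > 0 due to self-loops)."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj_norm, feats, weight):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(adj_norm, feats), weight)
    return [[max(0.0, x) for x in row] for row in out]

def fuse(node_out, global_feat, motion_intensity):
    """Assumed fusion: mean-pool nodes, then concatenate the other two inputs."""
    n = len(node_out)
    pooled = [sum(col) / n for col in zip(*node_out)]
    return pooled + list(global_feat) + [motion_intensity]

# Toy data: four region proposals with 3-D features (hypothetical values).
regions = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # 3 -> 2 projection, hypothetical
adj_norm = normalize(build_adjacency(regions))
node_out = gcn_layer(adj_norm, regions, weight)
fused = fuse(node_out, global_feat=[0.2, 0.4], motion_intensity=0.7)
```

In this sketch `fused` stands in for the vector that a classifier would take as input. In the described method the region features would come from the 3D‐ResNet backbone and region proposal network, and the weights would be learned rather than fixed.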

Keywords