Journal of Big Data (Mar 2020)
A comparison on visual prediction models for MAMO (multi activity-multi object) recognition using deep learning
Abstract
Multi activity-multi object (MAMO) recognition is a challenging task in visual systems for monitoring, recognizing and alerting in public places such as universities, hospitals and airports. Both academic and commercial researchers are working towards automatic tracking of human activities in intelligent video surveillance using deep learning frameworks. Such tracking is required by many real-time applications that detect unusual or suspicious activities, for example suspicious behaviour in crime events. The primary purpose of this paper is to perform multi-class activity prediction for individuals as well as groups from video sequences by using the state-of-the-art object detector You Only Look Once (YOLOv3). By making optimum use of the geographical information of the cameras and the YOLO object detection framework, a Deep Landmark model recognizes simple to complex human actions in image frames of video sequences ranging from grayscale to RGB. The model is tested and compared on various benchmark datasets and is found to be the most precise model for detecting human activities in video streams. Analysis of the experimental results shows that the proposed method achieves superior performance and high accuracy.
Keywords