IEEE Access (Jan 2021)
Vision-Based Human Detection Techniques: A Descriptive Review
Abstract
Cameras are used everywhere for the safety and security of citizens in different countries. Using a machine to detect humans in a photo or a video frame is a complicated and challenging task. Various techniques have been developed for this purpose, most of which rely on Artificial Intelligence. This article aims to provide a comprehensive review and analysis of the literature from a descriptive perspective, which is its main differentiator from existing survey papers in this area. First, vision-based human detection techniques and classifiers are explained in conjunction with the variants of feature extraction techniques. Second, the pros and cons of these techniques are discussed. Then, an investigation is conducted and reported based on the state-of-the-art human detection descriptors (e.g., Log-Average Miss Rate and accuracy). Although techniques such as Viola-Jones and Speeded-Up Robust Features (SURF) can detect objects in real time and overcome the limitations of the Scale-Invariant Feature Transform (SIFT), they remain sensitive to illumination conditions. Other techniques, such as SIFT, Bag of Words, Orthogonal Moments, and Histogram of Oriented Gradients, offer other benefits, including insensitivity to occlusion and clutter, simplicity, low-order element construction, and invariance to illumination conditions; nevertheless, they are computationally expensive and sensitive to image rotation. A review along similar lines revealed that the Deformable Part-based Model performs relatively better because it can handle particular pose variations, multiple views, and partial occlusion, and because it is application-free, whereas its counterparts focus on only a single aspect. This article also provides a brief description of each available dataset for human detection research. Various use cases of human detection systems are also elaborated.
Finally, conclusions are drawn from the review, followed by recommendations for future directions and possibilities to further improve the speed and accuracy of human detection systems.
Keywords