IEEE Access (Jan 2024)
Design of an Iterative Method for CCTV Video Analysis Integrating Enhanced Person Detection and Dynamic Mask Graph Networks
Abstract
In the rapidly evolving field of video surveillance, there is a critical need for new methods that can accurately analyze the large and complex volume of data produced by CCTV systems. Most existing methods for person detection and video analysis lack accuracy, particularly in highly dynamic scenes such as crowded city centers, hospitals, and industrial sites. These methods typically suffer from poor adaptability to occlusions and lighting changes and from low precision in continuous action prediction. This paper presents an integrated model for CCTV video analysis that improves person detection, action prediction, and video synthesis. Concretely, it integrates several state-of-the-art methods: Cascade R-CNN for person detection, Dynamic Mask Graph Network (DMGN) for semantic segmentation, Hierarchical Deep Dyna-Q Network (HDDQN) for action prediction, Multimodal Attention Fusion Network (MAFN) with fuzzy logic for attention refinement, and Generative Transformer-based Video Synthesis Model (GT-VSM) for video generation. Attention mechanisms added to Cascade R-CNN mitigate the common pitfalls of occlusion and scale variation, yielding better mAP scores. Introducing temporal consistency in DMGN raises segmentation accuracy. HDDQN predicts complex actions accurately, MAFN fuses multimodal data to enhance context awareness, and GT-VSM synthesizes high-fidelity video sequences. The proposed system offers several key improvements over existing models in detection accuracy, action prediction, and video synthesis quality, all of which are empirically validated in real-world surveillance scenarios. This work creates opportunities to improve safety, security, and behavior analysis across diverse domains by enabling a real-time video surveillance tool that is scalable, efficient, and accurate.
Keywords