Data in Brief (Dec 2024)
VID: A comprehensive dataset for violence detection in various contexts
Abstract
Security professionals and law enforcement agencies, who currently must review hours of footage manually to find minor incidents, are increasingly interested in automated crime and violence detection. Advances in computer vision and machine learning enable automated systems to scan footage quickly and flag relevant incidents, reducing the workload of human investigators. Most existing datasets focus on overly specific situations (e.g., vehicle violence) and typically consist of a limited number of video clips gathered from sources such as YouTube. Furthermore, their lack of distinct action-based categories and of sufficient diversity may limit their broader usefulness. To overcome these constraints, we have developed a balanced dataset of 3020 video clips, evenly divided between 1510 non-violent and 1510 violent clips, that captures a diverse range of real-world situations. The clips range from 3 to 12 seconds in duration and were recorded by non-professional actors. This comprehensive and balanced approach aims to offer an extended resource for training and evaluating automated video analysis systems.
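The composition stated above (3020 clips, 1510 per class, durations of 3 to 12 seconds) can be sanity-checked once clip metadata is available. The following is a minimal sketch only: the `Clip` record, the label names `violent` and `non_violent`, and the function `check_composition` are hypothetical and are not taken from the dataset's actual file layout or tooling.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Clip(NamedTuple):
    # Hypothetical record; the dataset's real metadata format is not described here.
    label: str         # assumed label names: "violent" or "non_violent"
    duration_s: float  # clip length in seconds


def check_composition(clips: Iterable[Clip],
                      min_s: float = 3.0,
                      max_s: float = 12.0) -> dict:
    """Summarize class balance and count clips outside the stated duration range."""
    clips = list(clips)
    counts = Counter(c.label for c in clips)
    out_of_range = sum(1 for c in clips if not (min_s <= c.duration_s <= max_s))
    return {
        "total": len(clips),
        "counts": dict(counts),
        # Balanced here means exactly two classes with equal clip counts.
        "balanced": len(counts) == 2 and len(set(counts.values())) == 1,
        "out_of_range": out_of_range,
    }


if __name__ == "__main__":
    # Synthetic metadata matching the figures in the abstract, for illustration only.
    demo = [Clip("violent", 3 + i % 10) for i in range(1510)]
    demo += [Clip("non_violent", 3 + i % 10) for i in range(1510)]
    print(check_composition(demo))
```

A check of this kind is useful before training, since a silently unbalanced or mis-trimmed copy of the data would distort evaluation results for a binary violence classifier.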