Data Brief. 2024 Apr 20:54:110448.
doi: 10.1016/j.dib.2024.110448. eCollection 2024 Jun.

Weapon Violence Dataset 2.0: A synthetic dataset for violence detection


Muhammad Shahroz Nadeem et al. Data Brief.

Abstract

Satisfying the appetite of data-hungry models is becoming an increasingly challenging task. The challenge is magnified in sensitive research areas, where genuine data is hard to obtain. The study of violence is a clear example: it raises ethical considerations, and authentic real-world data is scarce and predominantly accessible only to law enforcement agencies. Existing datasets in this field often rely on content from movies or open video platforms such as YouTube, which further underlines the scarcity of authentic data. To address this, we present the first synthetic virtual dataset for violence detection, named the Weapon Violence Dataset (WVD). The dataset is generated by creating virtual violence scenarios inside the photo-realistic video game Grand Theft Auto-V (GTA-V). It includes carefully selected video clips of person-to-person fights captured from a frontal view, featuring various weapons, both hot and cold, at different times of the day. Specifically, WVD contains three categories: Hot violence and Cold violence (together representing the violence category) and No violence (the control class). The dataset is designed so that the research community can train deep models on such synthetic data and extend the data corpus if the need arises. The dataset is publicly available on Kaggle and comprises normal RGB videos and optical flow videos.

Keywords: GTA-V; Hot and Cold weapons; Synthetic virtual violence; Violence detection; WVD.


Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1. WVD with three classes: Hot, Cold, and No Violence, allowing two of these to form the conventional two-label system for violence classification: Violence vs. No Violence.
Fig. 2. Folder structure of WVD on Kaggle.
Fig. 3. Frame sequences capturing fights with hot weapons.
Fig. 4. Frame sequences capturing fights with cold weapons.
Fig. 5. Frame sequences capturing No Violence scenes.
Fig. 6. Folder structure of GTA-V, highlighting folders in green. The remaining files are necessary for successful mod installation.
Fig. 7. The three-stage process to develop the synthetic images for the proposed WVD dataset.
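
The relationship between the three WVD classes and the conventional two-label task (Fig. 1) can be illustrated with a minimal Python sketch. The class identifiers, the clip file names, and the helper `to_binary_label` are hypothetical and chosen only for illustration; they are not taken from the dataset's actual folder or file naming on Kaggle.

```python
from enum import Enum


class WVDClass(Enum):
    """The three WVD categories (identifier strings are assumptions)."""
    HOT = "hot"
    COLD = "cold"
    NO_VIOLENCE = "no_violence"


def to_binary_label(cls: WVDClass) -> int:
    """Collapse the three classes into two labels.

    Hot and Cold both belong to the violence category, so they map to 1;
    No Violence, the control class, maps to 0.
    """
    return 0 if cls is WVDClass.NO_VIOLENCE else 1


# Hypothetical clip names, used only to show the mapping.
clips = [
    ("clip_a.mp4", WVDClass.HOT),
    ("clip_b.mp4", WVDClass.COLD),
    ("clip_c.mp4", WVDClass.NO_VIOLENCE),
]

binary_labels = [(name, to_binary_label(cls)) for name, cls in clips]
print(binary_labels)
```

Keeping the three-class label alongside each clip and deriving the binary label on demand would let the same annotations serve both the three-class and the two-class setting.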

