2022 Dec 15;8:e1128.
doi: 10.7717/peerj-cs.1128. eCollection 2022.

Addressing religious hate online: from taxonomy creation to automated detection

Alan Ramponi et al. PeerJ Comput Sci.

Abstract

Abusive language in online social media is a pervasive and harmful phenomenon that calls for automated computational approaches to contain it. Previous studies have introduced corpora and natural language processing approaches for specific kinds of online abuse, mainly focusing on misogyny and racism. A currently underexplored area in this context is religious hate, for which efforts in data and methods to date have been rather scattered. This is exacerbated by the differing annotation schemes that available datasets use, which inevitably lead to poor repurposing of data in wider contexts. Furthermore, religious hate is strongly dependent on country-specific factors, including the presence and visibility of religious minorities, societal issues, historical background, and current political decisions. Motivated by the lack of annotated data specifically tailored to religion and the poor interoperability of current datasets, in this article we propose a fine-grained labeling scheme for religious hate speech detection. The scheme builds on a broader, highly interoperable taxonomy of abusive language and covers the three main monotheistic religions: Judaism, Christianity, and Islam. Moreover, we introduce a Twitter dataset in two languages, English and Italian, annotated following the proposed scheme. We experiment with several classification algorithms on the annotated dataset, from traditional machine learning classifiers to recent transformer-based language models, assessing the difficulty of two tasks: abusive language detection and religious hate speech detection. Finally, we investigate the cross-lingual transferability of multilingual models on these tasks, shedding light on the viability of repurposing our dataset for religious hate speech detection in low-resource languages. We release the annotated data and publicly distribute the code for our classification experiments at https://github.com/dhfbk/religious-hate-speech.
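The experiments summarized above range from traditional machine learning classifiers to transformer-based models; the authors' actual code is in the linked repository. As a purely illustrative sketch of the simpler end of that spectrum, the following implements a minimal bag-of-words multinomial Naive Bayes classifier for the binary abusive/non-abusive task, trained on invented toy examples (not drawn from the paper's dataset):

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """Train a multinomial Naive Bayes model on (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for w in text.lower().split():
            word_counts[label][w] += 1
            vocab.add(w)
    return label_counts, word_counts, vocab

def predict_nb(model, text):
    """Return the most probable label, using Laplace (add-one) smoothing."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_lp = None, float("-inf")
    for label in label_counts:
        # Log prior plus smoothed log likelihood of each token.
        lp = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in text.lower().split():
            lp += math.log((word_counts[label][w] + 1) / denom)
        if lp > best_lp:
            best_label, best_lp = label, lp
    return best_label

# Invented toy training examples for the binary abusive-language task.
train = [
    ("i hate all of them they are vermin", "abusive"),
    ("those people are disgusting and should leave", "abusive"),
    ("had a lovely interfaith dinner tonight", "not_abusive"),
    ("the new mosque downtown is beautiful", "not_abusive"),
]
model = train_nb(train)
print(predict_nb(model, "they are disgusting vermin"))  # prints "abusive"
```

In practice such lexical baselines are outperformed by the transformer-based models the abstract mentions, which capture context rather than word counts alone; the baseline mainly serves to calibrate task difficulty.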

Keywords: Abusive language detection; Natural language processing; Religious hate speech detection.


Conflict of interest statement

The authors declare there are no competing interests.

Figures

Figure 1. Abusive language annotation taxonomy with a focus on religious hate.
