J Med Internet Res. 2020 Dec 8;22(12):e22609.

doi: 10.2196/22609.

Detection of Hate Speech in COVID-19-Related Tweets in the Arab Region: Deep Learning and Topic Modeling Approach

Raghad Alshalan^#¹, Hend Al-Khalifa^#¹, Duaa Alsaeed^#¹, Heyam Al-Baity^#¹, Shahad Alshalan^#¹

PMID: 33207310
PMCID: PMC7725497
DOI: 10.2196/22609

Affiliation

¹ King Saud University, Riyadh, Saudi Arabia.

^# Contributed equally.

Abstract

Background: The massive scale of social media platforms requires automatic solutions for detecting hate speech, which reduce the need for manual analysis of content. Most previous literature has cast hate speech detection as a supervised text classification task using classical machine learning methods or, more recently, deep learning methods. However, work investigating this problem in Arabic cyberspace is still limited compared to the published work on English text.

Objective: This study aims to identify hate speech related to the COVID-19 pandemic posted by Twitter users in the Arab region and to discover the main issues discussed in tweets containing hate speech.

Methods: We used the ArCOV-19 dataset, an ongoing collection of Arabic tweets related to COVID-19, starting from January 27, 2020. Tweets were analyzed for hate speech using a pretrained convolutional neural network (CNN) model; each tweet was given a score between 0 and 1, with 1 being the most hateful text. We also used nonnegative matrix factorization to discover the main issues and topics discussed in hate tweets.
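To make the topic-modeling step more concrete, the following is a minimal sketch of nonnegative matrix factorization using the standard multiplicative update rules on a toy document-term matrix. It is not the authors' implementation: the abstract does not state which NMF library, term weighting, or settings were used. The function names (`nmf`, `top_terms`), the placeholder vocabulary, and the toy counts are all illustrative assumptions.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of V - WH (the reconstruction error)."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh)) ** 0.5


def nmf(V, k, iterations=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (documents x terms) into W (documents x topics)
    and H (topics x terms) with multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start so no entry is stuck at zero.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    errors = [frobenius_error(V, W, H)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
        errors.append(frobenius_error(V, W, H))
    return W, H, errors


def top_terms(H, terms, n=3):
    """Return the n highest-weighted terms for each topic (row of H)."""
    return [[terms[j] for j in sorted(range(len(terms)), key=lambda j: -row[j])[:n]]
            for row in H]


# Toy data (hypothetical, NOT from the study): 4 tweets x 6 placeholder terms.
terms = ["term_a", "term_b", "term_c", "term_d", "term_e", "term_f"]
V = [
    [2, 3, 1, 0, 0, 0],
    [1, 2, 2, 0, 0, 0],
    [0, 0, 0, 3, 2, 1],
    [0, 0, 0, 2, 1, 2],
]
W, H, errors = nmf(V, k=2)
print(top_terms(H, terms))
print(f"reconstruction error: {errors[0]:.3f} -> {errors[-1]:.3f}")
```

Each row of `H` is read as a topic (a weighting over terms) and each row of `W` gives how strongly a tweet loads on each topic. In practice a library implementation such as scikit-learn's `NMF` on TF-IDF features would usually be preferred over hand-written updates.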

Results: Non-hate tweets greatly outnumbered hate tweets in Twitter data from the Arab region: hate tweets made up 3.2% (11,743/547,554) of COVID-19-related tweets. Most hate tweets (8385/11,743, 71.4%) contained a low level of hate according to the score provided by the CNN. Saudi Arabia was the Arab country from which the most COVID-19 hate tweets originated during the pandemic. The largest number of hate tweets appeared during March 1-30, 2020, representing 51.9% of all hate tweets (6095/11,743). Contrary to expectations, the spread of COVID-19-related hate speech on Twitter in the Arab region was only weakly correlated with the spread of the pandemic (Pearson correlation coefficient r=0.1982, P=.50). The study also identified the topics commonly discussed in hate tweets during the pandemic: of the 7 extracted topics, 6 were related to hate speech against China and Iran. Arab users also discussed topics related to political conflicts in the Arab region during the COVID-19 pandemic.
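As an illustration of the correlation analysis mentioned in the Results, here is a minimal sketch of how a Pearson correlation coefficient between two per-period series can be computed. The function name `pearson_r` and the example counts are hypothetical; the study's per-period data are not given in the abstract, so this does not reproduce the reported r=0.1982.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-period counts, NOT the study's data.
hate_tweets = [12, 40, 35, 18, 9]
new_cases = [5, 10, 30, 60, 90]
print(f"r = {pearson_r(hate_tweets, new_cases):.4f}")
```

A value of r near 0 indicates a weak linear relationship. To obtain a P value alongside r, a statistics library such as SciPy (`scipy.stats.pearsonr`) can be used instead of the hand-written function.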

Conclusions: The COVID-19 pandemic poses serious public health challenges to nations worldwide. During the pandemic, frequent use of social media can contribute to the spread of hate speech. Hate speech on the web can harm society and may correlate directly with real hate crimes, which increases the threat faced by those targeted by hate speech and abusive language. This study is the first to analyze hate speech in the context of Arabic COVID-19-related tweets in the Arab region.

Keywords: CNN; COVID-19; NMF; Twitter; convolutional neural network; coronavirus; deep learning; hate speech; non-negative matrix factorization; pandemic; public health; social media; social network analysis.

©Raghad Alshalan, Hend Al-Khalifa, Duaa Alsaeed, Heyam Al-Baity, Shahad Alshalan. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 08.12.2020.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

**Figure 1**
Methodology workflow. CNN: convolutional neural network; NMF: nonnegative matrix factorization.

**Figure 2**
Number of hate tweets (red) and total tweets (black) in each Arab country, where a darker color depicts a higher number of hate tweets in that country (MA: Morocco, MR: Mauritania, DZ: Algeria, TN: Tunisia, LY: Libya, EG: Egypt, JO: Jordan, LB: Lebanon, SY: Syria, IQ: Iraq, SA: Saudi Arabia, YE: Yemen, KW: Kuwait, QA: Qatar, AE: United Arab Emirates, OM: Oman).

**Figure 3**
Number of COVID-19–related hate tweets per country with the average hate level scores in brackets (low: 0.50-0.67; average: 0.68-0.85; high: 0.86-1.00). UAE: United Arab Emirates.

**Figure 4**
Numbers of hate tweets and numbers of COVID-19 cases and deaths per time period with the average hate level scores in brackets (low: 0.50-0.67; average: 0.68-0.85; high: 0.86-1.00).
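The hate level bands quoted in the captions of Figures 3 and 4 can be expressed as a small lookup. The sketch below is illustrative only: the function name `hate_level`, the rounding to two decimals, and the treatment of scores below 0.50 as non-hate are assumptions inferred from the bands, not details confirmed by the abstract.

```python
def hate_level(score):
    """Map a classifier score in [0, 1] to the hate level bands from the captions."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    # Assumption: bands are read at two-decimal precision, as printed in the captions.
    s = round(score, 2)
    if s < 0.50:
        return "non-hate"  # assumption: scores below 0.50 are not counted as hate
    if s <= 0.67:
        return "low"       # 0.50-0.67
    if s <= 0.85:
        return "average"   # 0.68-0.85
    return "high"          # 0.86-1.00


for example_score in (0.20, 0.55, 0.70, 0.95):
    print(example_score, hate_level(example_score))
```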

Cited by

Toxicity on Social Media During the 2022 Mpox Public Health Emergency: Quantitative Study of Topical and Network Dynamics.
Fan L, Li L, Hemphill L. J Med Internet Res. 2024 Dec 12;26:e52997. doi: 10.2196/52997. PMID: 39666969. Free PMC article.
Asian hate speech detection on Twitter during COVID-19.
Toliyat A, Levitan SI, Peng Z, Etemadpour R. Front Artif Intell. 2022 Aug 15;5:932381. doi: 10.3389/frai.2022.932381. eCollection 2022. PMID: 36046150. Free PMC article.
A high-resolution temporal and geospatial content analysis of Twitter posts related to the COVID-19 pandemic.
Ntompras C, Drosatos G, Kaldoudi E. J Comput Soc Sci. 2022;5(1):687-729. doi: 10.1007/s42001-021-00150-8. Epub 2021 Oct 20. PMID: 34697602. Free PMC article.
Making sense of COVID-19 over time in New Zealand: Assessing the public conversation using Twitter.
Jafarzadeh H, Pauleen DJ, Abedin E, Weerasinghe K, Taskin N, Coskun M. PLoS One. 2021 Dec 15;16(12):e0259882. doi: 10.1371/journal.pone.0259882. eCollection 2021. PMID: 34910732. Free PMC article.
A Systematic Review of the Outcomes of Utilization of Artificial Intelligence Within the Healthcare Systems of the Middle East: A Thematic Analysis of Findings.
Khosravi M, Mojtabaeian SM, Demiray EKD, Sayar B. Health Sci Rep. 2024 Dec 24;7(12):e70300. doi: 10.1002/hsr2.70300. eCollection 2024 Dec. PMID: 39720235. Free PMC article. Review.
