Hate speech and abusive language detection in Indonesian social media: Progress and challenges
- PMID: 37636475
- PMCID: PMC10447929
- DOI: 10.1016/j.heliyon.2023.e18647
Abstract
Hate Speech and Abusive Language (HSAL) now spreads widely over social media, whose ease of use makes the platforms easy to abuse for this purpose. HSAL in social media must be detected because it can trigger conflict among citizens, both online and in real life. In recent years, many scholars have studied HSAL detection in various languages and media. However, much work remains before a better HSAL detection system can be built. This paper summarizes Indonesian HSAL detection research, following the Kitchenham systematic literature review method. We found that most Indonesian HSAL research still uses classic machine-learning approaches with classic text representation features, evaluated on Twitter text datasets. We also identified several challenges and tasks that need to be addressed to build a better HSAL detection system for Indonesian social media, one that can detect the hate speech target, category, and level, as well as the hate speech buzzer, thread starter, and fake account spreader.
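As a rough illustration of what a "classic machine-learning approach with classic text representation features" can look like, the sketch below pairs a bag-of-words representation with a multinomial Naive Bayes classifier using only the Python standard library. It is not taken from any of the reviewed studies: the class name `NaiveBayes`, the labels `abusive` and `neutral`, the placeholder token `BADWORD`, and the four toy training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; real systems typically add
    # normalization and stemming, which this sketch omits.
    return text.lower().split()


class NaiveBayes:
    """Hypothetical multinomial Naive Bayes over bag-of-words counts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing constant
        self.class_counts = Counter()            # documents per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][token]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data; BADWORD stands in for an abusive term.
train_texts = [
    "kamu BADWORD sekali",
    "dasar BADWORD",
    "selamat pagi semua",
    "terima kasih banyak",
]
train_labels = ["abusive", "abusive", "neutral", "neutral"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("dasar kamu BADWORD"))
print(model.predict("selamat pagi"))
```

A sketch this small only shows the mechanics of counting tokens and comparing smoothed log probabilities; it says nothing about the accuracy such models reach on real Indonesian Twitter data.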
Keywords: Abusive language; Hate speech; Indonesian social media.
© 2023 Published by Elsevier Ltd.
Conflict of interest statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.