Abstract
With the proliferation of social networks and their growing impact on human life, verbal and written insults and hatred have become an increasing problem in this environment. Twitter, one of the major platforms for distributing text-based content, frequently carries abusive remarks posted by its users. The first stage in recognizing objectionable phrases is building a model, which requires a comprehensive collection of offensive sentences. Moreover, despite the abundance of resources in English and other languages, resources and studies on identifying hateful and offensive statements in Persian are limited. In this study, we compiled a 38K-tweet dataset of Persian hate and offensive language using keyword-based data selection strategies. For this purpose, a Persian offensive lexicon and nine hatred target group lexicons were gathered through crowdsourcing. The dataset was annotated manually, with each tweet examined by at least two annotators. In addition, to analyze the effect of the lexicons used on language model behavior, we employed two assessment criteria (FPED and pAUCED) to measure the dataset’s potential bias. Then, by configuring the dataset based on the results of the bias measurement, we mitigated the effect of word bias in tweets on language model performance. The results indicate that bias is significantly diminished, while the F1 score is reduced by less than a hundredth.
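As an illustration of the first bias criterion, the sketch below computes a False Positive Equality Difference (FPED) in the form commonly used in the unintended-bias literature: the sum, over lexicon terms, of the absolute gap between the overall false positive rate and the false positive rate on tweets containing that term. The function names, the whitespace-token matching rule, and the toy data are assumptions made for this example and are not taken from the study's code.

```python
from typing import List, Sequence


def false_positive_rate(labels: Sequence[int], preds: Sequence[int]) -> float:
    """FPR = FP / (FP + TN), computed over the non-offensive (label 0) examples."""
    negatives = [p for y, p in zip(labels, preds) if y == 0]
    if not negatives:
        # Assumption: with no negative examples there is nothing to misclassify.
        return 0.0
    return sum(negatives) / len(negatives)


def fped(texts: Sequence[str], labels: Sequence[int],
         preds: Sequence[int], terms: Sequence[str]) -> float:
    """Sum over terms of |overall FPR - FPR on tweets containing the term|."""
    overall = false_positive_rate(labels, preds)
    total = 0.0
    for term in terms:
        # Assumption: a tweet belongs to a term's subgroup if the term
        # appears as a whitespace-separated token.
        idx: List[int] = [i for i, t in enumerate(texts) if term in t.split()]
        if not idx:
            continue  # terms that never occur contribute nothing
        sub_fpr = false_positive_rate([labels[i] for i in idx],
                                      [preds[i] for i in idx])
        total += abs(overall - sub_fpr)
    return total


# Hypothetical toy data: 1 = offensive, 0 = not offensive.
toy_texts = ["a x", "b x", "c y", "d y"]
toy_labels = [0, 0, 0, 1]
toy_preds = [1, 0, 0, 1]
toy_score = fped(toy_texts, toy_labels, toy_preds, ["x", "y"])
```

A value of zero would mean that no lexicon term changes the false positive rate relative to the whole dataset; larger values indicate that the classifier over- or under-flags tweets containing particular terms. pAUCED follows the same pattern with a per-term pinned AUC in place of the per-term false positive rate.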
Availability of data and materials
The data that support the findings of this study are available from the corresponding author, upon reasonable request.
Notes
SemEval-2022.
TWINT.
Acknowledgements
This research was in part supported by a grant from the School of Computer Science, Institute for Research in Fundamental Sciences, IPM, Iran (No. CS1402-4-237).
Funding
This research was in part supported by a grant from the School of Computer Science, Institute for Research in Fundamental Sciences, IPM (No. CS1400-4-237).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Emad Kebriaei, Ali Homayouni, Roghayeh Faraji, and Armita Razavi. The first draft of the manuscript was written by Emad Kebriaei. The manuscript was reviewed and edited by Azadeh Shakery, Heshaam Faili, and Yadollah Yaghoobzadeh. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.
Ethics approval
We confirm that the manuscript will not be submitted for publication in any other journal or magazine until the journal editors have made a decision.
Consent to participate
We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us.
Consent for publication
Not applicable.
Code availability
Code for data cleaning and analysis is provided as part of the replication package. It is available at https://www.dropbox.com/s/z09fjb84wcaqqvn/HSD.zip?dl=0 for review.
Additional information
Editors: Nuno Moniz, Paula Branco, Luís Torgo, Nathalie Japkowicz, Michal Wozniak, and Shuo Wang.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Identity Keywords
The full list of identity keywords in Persian, with English translations (obtained using Google Translate) and transliterations, is shown in Fig. 7.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kebriaei, E., Homayouni, A., Faraji, R. et al. Persian offensive language detection. Mach Learn 113, 4359–4379 (2024). https://doi.org/10.1007/s10994-023-06370-5
DOI: https://doi.org/10.1007/s10994-023-06370-5