Review of Offensive Language Detection on Social Media: Current Trends and Opportunities

  • Conference paper
Emerging Trends and Applications in Artificial Intelligence (ICETAI 2023)

Abstract

Offensive language is defined as derogatory or obscene language and takes various forms, such as hate speech and cyberbullying. Automated detection of offensive language is gaining traction because of the large and growing volume of user-generated content on social media. In this paper, we provide an overview of the field, including background and recent research, with a focus on natural language processing. We present a synopsis of the ambiguity in the definition and categorization of offensive language, the application areas of automated systems, shared tasks organized in this field, dataset creation, and the evolution of models over time through machine learning and deep learning algorithms. Finally, challenges and gaps in research are discussed.

Acknowledgments

We acknowledge support from the Maria de Maeztu Units of Excellence Programme CEX2021-001195-M, funded by MICIU/AEI/10.13039/50110001103, and from the Departament de Recerca i Universitats de la Generalitat de Catalunya (ajuts SGR-Cat 2021).

Author information

Corresponding author

Correspondence to Lütfiye Seda Mut Altın.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Mut Altın, L.S., Saggion, H. (2024). Review of Offensive Language Detection on Social Media: Current Trends and Opportunities. In: García Márquez, F.P., Jamil, A., Hameed, A.A., Segovia Ramírez, I. (eds) Emerging Trends and Applications in Artificial Intelligence. ICETAI 2023. Lecture Notes in Networks and Systems, vol 960. Springer, Cham. https://doi.org/10.1007/978-3-031-56728-5_6

  • DOI: https://doi.org/10.1007/978-3-031-56728-5_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56727-8

  • Online ISBN: 978-3-031-56728-5

  • eBook Packages: Engineering, Engineering (R0)
