Applied Intelligence, Volume 48, Issue 12, pp 4730–4742

Effective hate-speech detection in Twitter data using recurrent neural networks

  • Georgios K. Pitsilis
  • Heri Ramampiaro
  • Helge Langseth
Article

Abstract

This paper addresses the problem of detecting hateful content in social media. We propose a detection scheme built on an ensemble of Recurrent Neural Network (RNN) classifiers that incorporates features derived from user-related information, such as a user’s tendency towards racism or sexism. These features are fed to the classifiers together with word frequency vectors derived from the textual content. We evaluate the approach on a publicly available corpus of 16k tweets. The results show that the scheme can distinguish racist and sexist messages from normal text, and that it achieves higher classification quality than current state-of-the-art algorithms.
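
As an illustration only, the data flow described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the ensemble combines its members' outputs, so majority voting is assumed here; the word frequency vector is assumed to be a plain bag-of-words count; and all function names (user_tendency, word_frequency_vector, build_input, majority_vote) are hypothetical. The RNN classifiers themselves are not modelled; only their predicted labels are consumed.

```python
from collections import Counter
from typing import List, Sequence

# Class labels taken from the abstract: racism, sexism, and normal (neutral) text.
LABELS = ("neutral", "racism", "sexism")


def user_tendency(history: Sequence[str]) -> List[float]:
    """Assumed user feature: share of a user's past tweets labelled with each class."""
    if not history:
        return [0.0] * len(LABELS)
    counts = Counter(history)
    return [counts[label] / len(history) for label in LABELS]


def word_frequency_vector(tokens: Sequence[str], vocabulary: Sequence[str]) -> List[float]:
    """Assumed text feature: count of each vocabulary word in the tweet."""
    counts = Counter(tokens)
    return [float(counts[word]) for word in vocabulary]


def build_input(tokens: Sequence[str], vocabulary: Sequence[str],
                history: Sequence[str]) -> List[float]:
    """Concatenate user features and text features into one classifier input."""
    return user_tendency(history) + word_frequency_vector(tokens, vocabulary)


def majority_vote(predictions: Sequence[str]) -> str:
    """Assumed aggregation rule: the label predicted by most ensemble members.

    Ties resolve to the label that appears first in `predictions`.
    """
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    vocab = ["a", "b", "c"]
    features = build_input(["a", "b", "a"], vocab, ["racism", "neutral"])
    print(features)
    # Each string stands in for the output of one trained classifier.
    print(majority_vote(["racism", "neutral", "racism"]))
```

In this sketch each ensemble member would receive the same concatenated input vector; a different aggregation rule, such as averaging class probabilities, could be substituted for majority_vote without changing the rest.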

Keywords

Text classification; Micro-blogging; Hate-speech; Deep learning; Recurrent neural networks; Twitter

Notes

Acknowledgements

This work has been supported by Telenor Research, Norway, through the collaboration project between NTNU and Telenor. It has been carried out at the Telenor – NTNU AI-Lab.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Georgios K. Pitsilis (1), corresponding author
  • Heri Ramampiaro (1)
  • Helge Langseth (1)
  1. Department of Computer Science, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
