Investigating the effect of combining GRU neural networks with handcrafted features for religious hatred detection on Arabic Twitter space


Religious hatred is a serious problem in the Arabic Twitter space and has the potential to ignite terrorism and hate crimes beyond cyberspace. To the best of our knowledge, this is the first research effort to investigate the problem of recognizing Arabic tweets that use inflammatory and dehumanizing language to promote hatred and violence against people on the basis of religious beliefs. In this work, we create the first public Arabic dataset of tweets annotated for religious hate speech detection. We also create three public Arabic lexicons of religion-related terms along with their hate scores. We then present a thorough analysis of the labeled dataset, reporting the most targeted religious groups and the countries of origin of hateful and non-hateful tweets. The labeled dataset is then used to train seven classification models using lexicon-based, n-gram-based, and deep-learning-based approaches. These models are evaluated on a new, unseen dataset to assess the generalization ability of the developed classifiers. While using Gated Recurrent Units with pre-trained word embeddings provides the best precision (0.76) and \(F_1\) score (0.77), training that same neural network on additional temporal, user, and content features provides state-of-the-art performance in terms of recall (0.84).
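For readers less familiar with the reported metrics, the following is a minimal sketch of how precision, recall, and \(F_1\) can be computed for the hateful class of a binary classifier. The function name `precision_recall_f1` and the toy labels are hypothetical and serve illustration only; they are not the authors' evaluation code or data, and the paper may average its scores differently.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the positive (here: hateful) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical toy labels: 1 = hateful, 0 = not hateful.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

p, r, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Precision penalizes non-hateful tweets flagged as hateful, while recall penalizes hateful tweets that are missed, which is why one model can lead on precision and \(F_1\) while another leads on recall.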







Author information



Corresponding author

Correspondence to Nuha Albadi.


About this article


Cite this article

Albadi, N., Kurdi, M. & Mishra, S. Investigating the effect of combining GRU neural networks with handcrafted features for religious hatred detection on Arabic Twitter space. Soc. Netw. Anal. Min. 9, 41 (2019).

Keywords

  • Cyberhate
  • Religious hate speech
  • Online radicalization
  • Social media mining
  • Arabic NLP
  • Twitter data analysis