Abstract
Over the last decade, the growing use of social media has been accompanied by a rise in hateful activity on social networks. Hate speech is among the most dangerous of these activities, and users need to protect themselves from it on platforms such as YouTube, Facebook, and Twitter. This paper introduces a method that combines natural language processing with machine learning techniques to predict hate speech on social media websites. After the hate speech data is collected, stemming, token splitting, character removal, and inflection elimination are performed before the hate speech recognition process. The collected data is then examined using a killer natural language processing optimization ensemble deep learning approach (KNLPEDNN). This method detects hate speech on social media websites using an effective learning process that classifies text as neutral, offensive, or hate language. The performance of the system is evaluated using overall accuracy, F-score, precision, and recall. The system attained minimum deviations, with a mean square error of 0.019, a cross-entropy loss of 0.015, and a logarithmic loss of 0.0238, and reached 98.71% accuracy.
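The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' KNLPEDNN implementation: the function names `preprocess`, `crude_stem`, and `evaluate` are hypothetical, the suffix-stripping rule is a simple stand-in for a real stemmer, and the three class labels are taken from the abstract. No classifier is included; the sketch only shows how text might be cleaned and how accuracy, precision, recall, and F-score might be computed per class.

```python
import re

# Class labels as named in the abstract.
LABELS = ("neutral", "offensive", "hate")


def crude_stem(token):
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, remove URLs, mentions and non-letters, split, then stem."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # drop URLs and user mentions
    text = re.sub(r"[^a-z\s]", " ", text)           # character removal
    tokens = text.split()                            # token splitting
    return [crude_stem(t) for t in tokens]           # stemming / inflection removal


def evaluate(y_true, y_pred, labels=LABELS):
    """Return overall accuracy plus per-class precision, recall and F-score."""
    pairs = list(zip(y_true, y_pred))
    report = {"accuracy": sum(t == p for t, p in pairs) / len(pairs)}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_score = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f_score": f_score,
        }
    return report


if __name__ == "__main__":
    print(preprocess("Stop posting @user http://x.co NOW!!!"))
    truth = ["neutral", "offensive", "hate", "hate"]
    predicted = ["neutral", "hate", "hate", "hate"]
    print(evaluate(truth, predicted))
```

In practice the crude suffix rule would likely be replaced by an established stemmer, and the predicted labels would come from the trained model rather than a hand-written list.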
Acknowledgements
The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through Research Group No. RG-1439-088.
Cite this article
Al-Makhadmeh, Z., Tolba, A. Automatic hate speech detection using killer natural language processing optimizing ensemble deep learning approach. Computing 102, 501–522 (2020). https://doi.org/10.1007/s00607-019-00745-0
DOI: https://doi.org/10.1007/s00607-019-00745-0
Keywords
- Social media
- YouTube
- Hate speech
- Killer natural language processing optimizing ensemble deep learning approach
Mathematics Subject Classification
- 01-00
- 01-02
- 11Axx
- 03-04
- 03Bxx
- 39-00