Abstract
Social platforms receive a large volume of hate speech posts. Such posts disrupt social order and harm readers' mental and emotional state, in some cases even leading to suicide. Detecting hate speech posts on social media in a timely manner is therefore crucial to restraining their spread. This paper presents a multimodal architecture that concatenates a transfer learning model with an LSTM-based model to classify social media posts as hate speech or non-hate speech. The proposed model considers text and images simultaneously to capture the context and intent of a post, and fuses the image and text features to make the prediction. Separate text-only and image-only models were also investigated; the fusion of image and text information gave promising prediction outcomes, outperforming the base model.
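The fusion step described in the abstract can be illustrated with a minimal sketch. It assumes feature-level fusion by simple concatenation followed by a logistic classification head. The pre-trained image network and the LSTM text encoder are not reproduced here; the feature vectors, weights, bias, and function names below are hypothetical values chosen for illustration and are not taken from the paper.

```python
import math
from typing import List, Sequence


def fuse_features(image_features: Sequence[float],
                  text_features: Sequence[float]) -> List[float]:
    """Feature-level fusion: concatenate the two modality vectors."""
    return list(image_features) + list(text_features)


def predict_hate_probability(fused: Sequence[float],
                             weights: Sequence[float],
                             bias: float) -> float:
    """Logistic classification head applied to the fused vector."""
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical encoder outputs: in a real system these would come from
# a pre-trained image model and an LSTM over the post text.
image_features = [0.2, -0.1, 0.7]
text_features = [0.9, 0.4]

fused = fuse_features(image_features, text_features)

# Hypothetical classifier parameters (these would normally be learned).
weights = [0.5, -0.3, 0.8, 1.2, -0.6]
probability = predict_hate_probability(fused, weights, bias=-0.5)
label = "hate" if probability >= 0.5 else "non-hate"
print(label, round(probability, 3))
```

Concatenation keeps both modalities' features intact and leaves it to the classifier to weigh them, which is one common way to combine image and text representations; other fusion schemes are possible and may be what the full paper uses in detail.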
Ethics declarations
Conflict of Interests
The authors declared that they have no conflicts of interest.
About this article
Cite this article
Dwivedy, V., Roy, P.K. Deep feature fusion for hate speech detection: a transfer learning approach. Multimed Tools Appl 82, 36279–36301 (2023). https://doi.org/10.1007/s11042-023-14850-y
DOI: https://doi.org/10.1007/s11042-023-14850-y