
A Bi-GRU with attention and CapsNet hybrid model for cyberbullying detection on social media


As a constructive medium for information sharing, collaboration and communication, social media platforms offer users limitless opportunities. The same hypermedia can turn into a synthetic, toxic milieu that provides an anonymous, destructive platform for online bullying and harassment. Automatic cyberbullying detection on social media, using synthetic or real-world datasets, is a well-known natural language processing problem. Analyzing a given text requires capturing its semantic, syntactic and spatial relationships. Deep learning models learn representative features automatically and efficiently capture contextual semantics and word order, which helps build robust predictive models. This work puts forward a hybrid model, Bi-GRU-Attention-CapsNet (Bi-GAC), for cyberbullying detection in the textual content of social media. The model learns sequential semantic representations and spatial location information using a Bi-GRU with self-attention followed by a CapsNet. The proposed Bi-GAC model is evaluated using the F1-score and the ROC-AUC curve as metrics. The results show superior performance to existing techniques on the benchmark and MySpace datasets. Compared with conventional models, an improvement of nearly 9% and 3% in F-score is observed for MySpace and dataset respectively.
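The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `f1_score` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. The sketch computes F1 for the positive (bullying) class as the harmonic mean of precision and recall, and ROC-AUC as the fraction of positive/negative pairs in which the positive example receives the higher score.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (label 1): 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # With no true positives, false positives or false negatives, return 0.0.
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC via pairwise comparison: the share of (positive, negative)
    pairs where the positive scores higher; tied pairs count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data, not results from the paper.
    y_true = [1, 0, 1, 1, 0, 0]
    scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
    # Assumed decision threshold of 0.5 to turn scores into hard labels.
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print("F1:", f1_score(y_true, y_pred))
    print("ROC-AUC:", roc_auc(y_true, scores))
```

The pairwise formulation is quadratic in the number of examples, so it suits small illustrations only. F1 depends on the chosen threshold, whereas ROC-AUC is computed from the raw scores and is threshold-independent, which is one reason the two metrics are often reported together.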


Data availability

Benchmark publicly available datasets have been used.

Code availability

Can be made available on request.



Author information




All authors contributed equally to the research and manuscript preparation.

Corresponding author

Correspondence to Akshi Kumar.

Ethics declarations

Ethics approval

The work conducted is not plagiarized. No one has been harmed in this work.

Consent to participate

All the authors have given consent to submit the manuscript.

Consent for publication

The authors consent to publication.

Conflict of interest

The authors certify that there is no conflict of interest in the subject matter discussed in the manuscript.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Synthetic Media on the Web

Guest Editors: Huimin Lu, Xing Xu, Jože Guna, and Gautam Srivastava


About this article


Cite this article

Kumar, A., Sachdeva, N. A Bi-GRU with attention and CapsNet hybrid model for cyberbullying detection on social media. World Wide Web (2021).



Keywords

  • CapsNet
  • Bi-GRU
  • Cyberbullying
  • Deep learning
  • Social media