Comparative Analysis of Various Machine Learning Algorithms to Detect Cyberbullying on Twitter Dataset

  • Conference paper
Inventive Communication and Computational Technologies (ICICCT 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 757)

Abstract

Social media has become a widespread channel of communication in the digital era, and in many settings it has displaced conventional modes of contact. The growing prevalence of cyberbullying on these platforms is a significant problem that must be addressed, because the platforms give bullies many ways to harass and threaten members of their communities. Early identification and alerting have been proposed as a way to locate and protect victims of cyberbullying, and machine learning (ML) methods are widely used to detect the language patterns that bullies use to harm their victims. This paper compares standard supervised learning algorithms with ensemble machine learning algorithms. The ensemble methods are the random forest (RF) and AdaBoost classifiers; the standard supervised methods are Gaussian Naive Bayes (GNB), logistic regression (LR), and decision tree (DT). Using a Twitter dataset downloaded from Kaggle, we extract features with term frequency-inverse document frequency (TF-IDF) and train and evaluate binary classifiers that label abusive language as bullying or non-bullying. In this analysis, the ensemble algorithms outperformed the standard supervised algorithms: on this dataset, the random forest classifier performed best with 92% accuracy, while the Naive Bayes classifier performed worst with 62% accuracy.
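
The TF-IDF feature extraction mentioned in the abstract can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the three example tweets, the whitespace tokenizer, and the function names `tokenize` and `tfidf_vectors` are hypothetical, and the smoothed IDF formula, ln((1 + N) / (1 + df)) + 1, is one common variant assumed here because the abstract does not say which weighting was used.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace. A real pipeline would
    # likely also strip URLs, mentions, and punctuation.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (sorted vocabulary, one {term: weight} dict per document)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    # Smoothed inverse document frequency (assumed variant):
    # idf(t) = ln((1 + N) / (1 + df(t))) + 1
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency is the term's share of the tokens in the document.
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return sorted(df), vectors


# Hypothetical example tweets, not taken from the paper's dataset.
tweets = [
    "you are a great friend",
    "you are a pathetic loser",
    "nobody likes you loser",
]

vocab, vectors = tfidf_vectors(tweets)

# Print the terms of the first tweet, highest weight first.
for term, weight in sorted(vectors[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {weight:.4f}")
```

Terms that occur in every tweet, such as "you", receive the lowest IDF, while terms confined to a single tweet, such as "great" or "pathetic", are weighted more heavily. Weight vectors of this kind are what a classifier such as random forest or logistic regression would take as input.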



Author information

Corresponding author

Correspondence to Milind Shah.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Shah, M., Vasant, A., Patel, K.A. (2023). Comparative Analysis of Various Machine Learning Algorithms to Detect Cyberbullying on Twitter Dataset. In: Ranganathan, G., Papakostas, G.A., Rocha, Á. (eds) Inventive Communication and Computational Technologies. ICICCT 2023. Lecture Notes in Networks and Systems, vol 757. Springer, Singapore. https://doi.org/10.1007/978-981-99-5166-6_52
