Comparative Analysis of Various Machine Learning Algorithms to Detect Cyberbullying on Twitter Dataset

Shah, Milind; Vasant, Avani; Patel, Kinjal A.

doi:10.1007/978-981-99-5166-6_52

Milind Shah¹²,
Avani Vasant¹² &
Kinjal A. Patel¹³

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 757)

Included in the following conference series:

International Conference on Information, Communication and Computing Technology


Abstract

The advent of the digital era has seen the rise of social media as an alternative mode of communication. Social media platforms are now widely used for contact between individuals, and as a direct result digital modes of communication have replaced conventional ones. The growing prevalence of cyberbullying on these platforms is a significant problem that must be addressed: the platforms already available give bullies various ways to harass and threaten individuals in their communities. A number of tactics and approaches have been proposed to combat cyberbullying through early identification and alerts that locate and/or protect victims. Machine learning (ML) methods have seen widespread use in the search for language patterns that bullies use to harm their victims. This paper analyzes standard supervised learning algorithms and ensemble machine learning algorithms. The ensemble technique uses random forest (RF) and AdaBoost classifiers, whereas the standard supervised method uses Gaussian Naive Bayes (GNB), logistic regression (LR), and decision tree (DT). We extract features from the tweets using term frequency-inverse document frequency (TF-IDF), then train and evaluate binary classification models that label abusive language as bullying or non-bullying. The dataset was downloaded from Kaggle. In the analysis, the ensemble algorithms outperformed the standard supervised algorithms. On this dataset, the random forest classifier performed best with 92% accuracy, while the Naive Bayes classifier performed worst with 62% accuracy.
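The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name a library, hyperparameters, or a train/test split, so the use of scikit-learn, the default model settings, the split size, and the toy tweets below are all assumptions made for the example. The function name `compare_classifiers` is hypothetical.

```python
# Illustrative sketch only: toy data and default settings, not the authors' setup.
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Hypothetical tweets standing in for the Kaggle data (1 = bullying, 0 = non-bullying).
TWEETS = [
    "you are so stupid and ugly",
    "nobody likes you loser",
    "shut up you idiot",
    "you are a worthless loser",
    "everyone hates you, just leave",
    "what an ugly stupid idiot",
    "had a great day at the park",
    "thanks for the lovely birthday wishes",
    "looking forward to the weekend trip",
    "this new recipe turned out great",
    "congrats on the new job, well done",
    "lovely weather for a walk today",
]
LABELS = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]


def compare_classifiers(texts, labels, test_size=4, seed=0):
    """Return {model name: test accuracy} for the five classifiers in the abstract."""
    train_txt, test_txt, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, stratify=labels, random_state=seed
    )
    # Fit TF-IDF on the training split only, so no test vocabulary leaks in.
    vectorizer = TfidfVectorizer(lowercase=True)
    # GaussianNB requires dense arrays, so the sparse matrices are converted.
    x_train = vectorizer.fit_transform(train_txt).toarray()
    x_test = vectorizer.transform(test_txt).toarray()

    models = {
        "Gaussian Naive Bayes": GaussianNB(),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "AdaBoost": AdaBoostClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(x_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(x_test))
    return scores


if __name__ == "__main__":
    for name, acc in compare_classifiers(TWEETS, LABELS).items():
        print(f"{name}: {acc:.2f}")
```

With only a dozen toy tweets, the printed accuracies say nothing about the 92% and 62% figures reported above; the sketch only shows the shape of the pipeline (TF-IDF features, five classifiers, accuracy on a held-out split).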



Author information

Authors and Affiliations

Department of Computer Science and Engineering, Krishna School of Emerging Technology and Applied Research (KSET), Drs. Kiran and Pallavi Patel Global University (KPGU), Vadodara, Gujarat, India
Milind Shah & Avani Vasant
Faculty of Computer Applications and Information Technology, Gujarat Law Society University, Ahmedabad, Gujarat, India
Kinjal A. Patel


Corresponding author

Correspondence to Milind Shah.

Editor information

Editors and Affiliations

Department of Electronics and Communication Engineering, Gnanamani College of Technology, Namakkal, Tamil Nadu, India
G. Ranganathan
Department of Computer Science (HUMAIN-Lab), International Hellenic University, Thessaloniki, Greece
George A. Papakostas
Information Systems and Operations Management (ISEG), University of Lisbon, Lisboa, Portugal
Álvaro Rocha


About this paper

Cite this paper

Shah, M., Vasant, A., Patel, K.A. (2023). Comparative Analysis of Various Machine Learning Algorithms to Detect Cyberbullying on Twitter Dataset. In: Ranganathan, G., Papakostas, G.A., Rocha, Á. (eds) Inventive Communication and Computational Technologies. ICICCT 2023. Lecture Notes in Networks and Systems, vol 757. Springer, Singapore. https://doi.org/10.1007/978-981-99-5166-6_52


DOI: https://doi.org/10.1007/978-981-99-5166-6_52
Published: 04 October 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-5165-9
Online ISBN: 978-981-99-5166-6
eBook Packages: Intelligent Technologies and Robotics (R0)
