Abstract
Cyberbullying is a growing problem that has emerged as the Internet and social media networks connect people around the globe. Existing studies focus mostly on cyberbullying detection in English; the main objective of this paper is therefore to develop an approach to detect cyberbullying in Hindi-English code-mixed language (Hinglish), which is widely used by Indian users. Because no Hinglish dataset was available, we created the Hinglish Cyberbullying Comments (HCC) labeled dataset, consisting of comments from social media networks such as Instagram and YouTube. We also developed eight different machine learning models for sentiment classification in order to automatically detect incidents of cyberbullying. Accuracy, precision, recall and F1-score are used to evaluate these models. Finally, a hybrid model is developed based on the top performers among these eight baseline classifiers; it performs better than the baselines, with an accuracy of 80.26% and an F1-score of 82.96%.
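As a rough illustration of the evaluation described in the abstract, the sketch below computes accuracy, precision, recall and F1-score for binary labels and combines several classifiers' predictions into a hybrid prediction. The abstract does not say how the top performers are combined, so the majority vote used here is an assumption, not the authors' method. The function names (`majority_vote`, `binary_metrics`) and the toy labels are hypothetical and are not drawn from the HCC dataset.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine several models' label lists into one list by majority vote.

    Assumption: the paper's hybrid model may be combined differently.
    With an odd number of binary classifiers there are no ties.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data only: 1 = bullying, 0 = non-bullying (made-up labels).
y_true = [1, 0, 1, 1, 0, 0]
model_a = [1, 0, 0, 1, 0, 1]
model_b = [1, 1, 1, 1, 0, 0]
model_c = [0, 0, 1, 1, 0, 0]

hybrid = majority_vote([model_a, model_b, model_c])
print(binary_metrics(y_true, model_a))
print(binary_metrics(y_true, hybrid))
```

In this toy case each individual model makes at least one error while the vote recovers every label, which is the intuition behind building a hybrid from the strongest baselines; real data will not be this clean.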
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Tarwani, S., Jethanandani, M., Kant, V. (2019). Cyberbullying Detection in Hindi-English Code-Mixed Language Using Sentiment Classification. In: Singh, M., Gupta, P., Tyagi, V., Flusser, J., Ören, T., Kashyap, R. (eds) Advances in Computing and Data Sciences. ICACDS 2019. Communications in Computer and Information Science, vol 1046. Springer, Singapore. https://doi.org/10.1007/978-981-13-9942-8_51
DOI: https://doi.org/10.1007/978-981-13-9942-8_51
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9941-1
Online ISBN: 978-981-13-9942-8
eBook Packages: Computer Science, Computer Science (R0)