Vulgarity Classification in Comments Using SVM and LSTM

Dias, Crystal; Jangid, Mahesh

doi:10.1007/978-981-13-8406-6_52


Conference paper
First Online: 27 October 2019


Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 141)

Abstract

A vast amount of text is posted online every day. Exercising their freedom of speech, people often offend the sentiments of readers. Numerous accounts of online harassment, defamation, and bullying are reported on various social networking sites. Posting such content cannot be controlled, but machine learning and deep learning make it possible to identify the content and then remove it. Jigsaw and Google have built tools to identify this kind of profanity online, but these tools have not succeeded in identifying the type of toxicity a comment contains. Kaggle therefore put forth a challenge in which, besides identifying whether a comment is toxic, the comment is classified into kinds of toxicity. The challenge considers categories such as threats, insults, identity hate, and obscenity. To address this challenge, machine learning and deep learning models such as SVM and RNN-LSTM are applied. Our main aim is to study the results of using RNN-LSTM for toxic comment classification. The data is first vectorized using TF-IDF and bag of words. This paper also discusses the nature of the dataset. The results are promising and suggest that this approach can help solve the problem.
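
The TF-IDF vectorization step mentioned in the abstract can be illustrated with a minimal, standard-library sketch. This is not the authors' code: the abstract does not state which TF-IDF variant, tokenizer, or library was used, so the smoothed IDF formula, the whitespace tokenizer, the function names (`tokenize`, `fit_idf`, `tfidf_vector`), and the three toy comments are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's tokenizer is not stated.
    return text.lower().split()


def fit_idf(docs):
    # Document frequency: number of documents in which each term occurs.
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    # Assumption: smoothed IDF, ln((1 + N) / (1 + df)) + 1.
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }


def tfidf_vector(doc, idf):
    # Term counts (the bag-of-words part), restricted to the fitted vocabulary.
    counts = Counter(t for t in tokenize(doc) if t in idf)
    weights = {t: c * idf[t] for t, c in counts.items()}
    # L2-normalize so that comment length does not dominate the vector.
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


# Hypothetical toy comments, not taken from the Kaggle dataset.
comments = [
    "you are an idiot",
    "thanks for the helpful edit",
    "i will hurt you",
]
idf = fit_idf(comments)
vec = tfidf_vector("you are an idiot", idf)
```

In this sketch a term that occurs in fewer comments (here "idiot") receives a larger weight than one that occurs in several (here "you"). Sparse vectors of this kind are the sort of input a linear classifier such as an SVM could consume, with one binary decision per toxicity category.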



Author information

Authors and Affiliations

Department of Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India
Crystal Dias & Mahesh Jangid


Corresponding author

Correspondence to Crystal Dias.

Editor information

Editors and Affiliations

College of Engineering, Iowa State University, Ames, IA, USA
Arun K. Somani
School of Computing and Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India
Rajveer Singh Shekhawat
Department of Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India
Ankit Mundra
Department of Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India
Sumit Srivastava
School of Computing and Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India
Vivek Kumar Verma


About this paper

Cite this paper

Dias, C., Jangid, M. (2020). Vulgarity Classification in Comments Using SVM and LSTM. In: Somani, A.K., Shekhawat, R.S., Mundra, A., Srivastava, S., Verma, V.K. (eds) Smart Systems and IoT: Innovations in Computing. Smart Innovation, Systems and Technologies, vol 141. Springer, Singapore. https://doi.org/10.1007/978-981-13-8406-6_52


DOI: https://doi.org/10.1007/978-981-13-8406-6_52
Published: 27 October 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8405-9
Online ISBN: 978-981-13-8406-6
eBook Packages: Intelligent Technologies and Robotics (R0)
