Abstract
We propose a method for transforming hateful online comments into non-hateful ones without losing their understandability or original meaning. To accomplish this, we retrieve and classify 301,153 hateful and 1,041,490 non-hateful comments from the Facebook and YouTube channels of a large international media organization that is a target of considerable online hate. We supplement this dataset with 10,000 Reddit comments manually labeled for hatefulness. Using these two datasets, we train a neural network to distinguish the linguistic patterns of hateful and non-hateful comments. The model we develop, Neural Network Hate Deletion (NNHD), computes how hateful the sentences of a social media comment are and, if they are above a given threshold, deletes them using a language dependency tree. We evaluate the results by comparing crowd workers’ perceptions of hatefulness and understandability before and after the transformation and find that our method reduces hatefulness without a significant loss of understandability. In some cases, removing hateful elements improves understandability by reducing the linguistic complexity of the comment. In addition, we find that NNHD satisfactorily retains the original meaning on average but is not perfect in this regard. In terms of practical implications, NNHD could be used on social media platforms to suggest more neutral language to agitated online users.
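The score-then-delete step described above can be illustrated with a minimal sketch. It is not the NNHD implementation: a toy word list stands in for the trained neural network, whole sentences are dropped rather than pruned along a dependency tree, and every name in it (`TOY_LEXICON`, `toy_hate_score`, `delete_hateful_sentences`, the default threshold of 0.2) is a hypothetical choice made for illustration.

```python
import re
from typing import List

# Hypothetical stand-in lexicon. The paper's scorer is a trained neural
# network; this word list only makes the thresholding step concrete.
TOY_LEXICON = {"idiot", "idiots", "stupid", "scum"}


def split_sentences(comment: str) -> List[str]:
    """Naively split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", comment.strip())
    return [p for p in parts if p]


def toy_hate_score(sentence: str) -> float:
    """Return the fraction of tokens found in the toy lexicon (0.0 to 1.0)."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return 0.0
    return sum(t in TOY_LEXICON for t in tokens) / len(tokens)


def delete_hateful_sentences(comment: str, threshold: float = 0.2) -> str:
    """Drop every sentence whose score is above the threshold."""
    kept = [s for s in split_sentences(comment)
            if toy_hate_score(s) <= threshold]
    return " ".join(kept)


if __name__ == "__main__":
    text = ("I disagree with this article. You are all stupid idiots! "
            "The data is wrong.")
    print(delete_hateful_sentences(text))
```

In this sketch the threshold sets how aggressive the deletion is: a lower value removes more sentences and risks losing meaning, which mirrors the trade-off between hatefulness, understandability and retained meaning that the evaluation examines.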
Notes
1. We compiled the list of hate words by combining open coding of our dataset with lists of profanity and swear words available online: http://www.bannedwordlist.com/lists/swearWords.txt; https://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words/blob/master/en; https://www.frontgatemedia.com/a-list-of-723-bad-words-to-blacklist-and-how-to-use-facebooks-moderation-tool/; http://onlineslangdictionary.com/lists/most-vulgar-words/.
2. fastText is a library, created by Facebook, for learning word embeddings and classifying sentences.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Salminen, J., Luotolahti, J., Almerekhi, H., Jansen, B.J., Jung, Sg. (2018). Neural Network Hate Deletion: Developing a Machine Learning Model to Eliminate Hate from Online Comments. In: Bodrunova, S. (ed.) Internet Science. INSCI 2018. Lecture Notes in Computer Science, vol 11193. Springer, Cham. https://doi.org/10.1007/978-3-030-01437-7_3
DOI: https://doi.org/10.1007/978-3-030-01437-7_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01436-0
Online ISBN: 978-3-030-01437-7
eBook Packages: Computer Science, Computer Science (R0)