Neural Network Hate Deletion: Developing a Machine Learning Model to Eliminate Hate from Online Comments

Salminen, Joni; Luotolahti, Juhani; Almerekhi, Hind; Jansen, Bernard J.; Jung, Soon-gyo

doi:10.1007/978-3-030-01437-7_3

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11193)

Included in the following conference series:

International Conference on Internet Science

Abstract

We propose a method for transforming hateful online comments into non-hateful comments without losing their understandability or original meaning. To accomplish this, we retrieve and classify 301,153 hateful and 1,041,490 non-hateful comments from the Facebook and YouTube channels of a large international media organization that is a target of considerable online hate. We supplement this dataset with 10,000 Reddit comments manually labeled for hatefulness. Using these two datasets, we train a neural network to distinguish linguistic patterns. The model we develop, Neural Network Hate Deletion (NNHD), computes how hateful the sentences of a social media comment are and, if they exceed a given threshold, deletes them using a language dependency tree. We evaluate the results by comparing crowd workers' perceptions of hatefulness and understandability before and after transformation, and find that our method reduces hatefulness without a significant loss of understandability. In some cases, removing hateful elements improves understandability by reducing the linguistic complexity of the comment. In addition, we find that NNHD retains the original meaning satisfactorily on average but is not perfect in this regard. In terms of practical implications, NNHD could be used on social media platforms to suggest more neutral language to agitated online users.
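The score-and-threshold step described in the abstract can be illustrated with a minimal sketch. This is not the authors' NNHD implementation: the trained neural network is replaced here by a hypothetical word-list scorer (toy_hate_score), the sentence splitter is a naive regular expression, the threshold value is arbitrary, and the dependency-tree deletion is omitted entirely, so whole sentences are dropped instead. All function names and the lexicon are assumptions made for illustration.

```python
import re
from typing import Callable, List

# Hypothetical stand-in lexicon. The paper's scorer is a trained neural
# network; this set exists only to make the sketch runnable.
TOY_HATE_WORDS = {"idiot", "stupid", "scum"}


def split_sentences(comment: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", comment.strip())
    return [p for p in parts if p]


def toy_hate_score(sentence: str) -> float:
    """Fraction of tokens found in the toy lexicon (placeholder for a classifier)."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return 0.0
    return sum(t in TOY_HATE_WORDS for t in tokens) / len(tokens)


def delete_hateful_sentences(
    comment: str,
    score: Callable[[str], float] = toy_hate_score,
    threshold: float = 0.2,  # arbitrary value chosen for this sketch
) -> str:
    """Keep only the sentences whose score does not exceed the threshold."""
    kept = [s for s in split_sentences(comment) if score(s) <= threshold]
    return " ".join(kept)


example = (
    "The article makes a fair point. You are a stupid idiot! "
    "I still disagree with the conclusion."
)
print(delete_hateful_sentences(example))
```

Because the scorer is passed in as a parameter, a real classifier that returns a hatefulness probability per sentence could be substituted without changing the filtering logic.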

Notes

1. We compiled the list of hate words by combining open coding done on our dataset with lists of profanity or swear words available online: http://www.bannedwordlist.com/lists/swearWords.txt; https://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words/blob/master/en; https://www.frontgatemedia.com/a-list-of-723-bad-words-to-blacklist-and-how-to-use-facebooks-moderation-tool/; http://onlineslangdictionary.com/lists/most-vulgar-words/.
2. fastText is a library created by Facebook for learning word embeddings and classifying sentences.

Author information

Authors and Affiliations

Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
Joni Salminen, Hind Almerekhi, Bernard J. Jansen & Soon-gyo Jung
University of Turku, Turku, Finland
Joni Salminen & Juhani Luotolahti

Corresponding author

Correspondence to Joni Salminen.

Editor information

Editors and Affiliations

School of Journalism and Mass Communications, Saint Petersburg State University, St. Petersburg, Russia
Svetlana S. Bodrunova

About this paper

Cite this paper

Salminen, J., Luotolahti, J., Almerekhi, H., Jansen, B.J., Jung, Sg. (2018). Neural Network Hate Deletion: Developing a Machine Learning Model to Eliminate Hate from Online Comments. In: Bodrunova, S. (eds) Internet Science. INSCI 2018. Lecture Notes in Computer Science, vol 11193. Springer, Cham. https://doi.org/10.1007/978-3-030-01437-7_3

DOI: https://doi.org/10.1007/978-3-030-01437-7_3
Published: 25 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01436-0
Online ISBN: 978-3-030-01437-7
eBook Packages: Computer Science, Computer Science (R0)
