Offensive Language Detection Using Multi-level Classification

Razavi, Amir H.; Inkpen, Diana; Uritsky, Sasha; Matwin, Stan

doi:10.1007/978-3-642-13059-5_5

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6085)

Included in the following conference series:

Canadian Conference on Artificial Intelligence

Abstract

Text messaging through the Internet or cellular phones has become a major medium of personal and commercial communication. At the same time, flames (such as rants, taunts, and squalid phrases) are offensive or abusive phrases that may attack or offend users for a variety of reasons. Automatic discriminative software with a sensitivity parameter for detecting flames or abusive language would therefore be a useful tool. Although a human can recognize these useless, annoying texts among the useful ones, doing so is not an easy task for computer programs. In this paper, we describe an automatic flame detection method that extracts features at different conceptual levels and applies multi-level classification. While the system takes advantage of a variety of statistical models and rule-based patterns, an auxiliary weighted pattern repository improves accuracy by matching the text against its graded entries.
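
As a rough illustration of the last idea in the abstract, matching text against a repository of graded entries with an adjustable sensitivity, the following minimal Python sketch may help. It is not the authors' implementation: the repository contents, the weights, the scoring rule (a plain sum of matched weights), and the names `PATTERN_WEIGHTS`, `flame_score`, and `is_flame` are all assumptions made for this example, and the statistical classifiers described in the paper are not modelled here.

```python
import re

# Hypothetical graded repository: phrase -> weight (1 = mild, 5 = severe).
# Entries and weights are illustrative assumptions, not taken from the paper.
PATTERN_WEIGHTS = {
    "idiot": 3,
    "shut up": 2,
    "get lost": 2,
    "moron": 4,
}


def flame_score(text, repository=PATTERN_WEIGHTS):
    """Sum the weights of repository phrases that occur in the text."""
    # Lowercase and keep only word characters so punctuation cannot
    # hide a match ("Shut up, you..." -> "shut up you ...").
    normalized = " ".join(re.findall(r"[a-z']+", text.lower()))
    score = 0
    for phrase, weight in repository.items():
        # Word boundaries avoid matching a phrase inside a longer word.
        if re.search(r"\b" + re.escape(phrase) + r"\b", normalized):
            score += weight
    return score


def is_flame(text, sensitivity=3, repository=PATTERN_WEIGHTS):
    """Flag the text when its score reaches the sensitivity threshold.

    A lower threshold makes the filter stricter (more messages flagged).
    """
    return flame_score(text, repository) >= sensitivity
```

In this sketch the sensitivity parameter is just a score threshold, so the same message can pass under a lenient setting and be flagged under a strict one, for example `is_flame("Please get lost", sensitivity=3)` versus `is_flame("Please get lost", sensitivity=2)`.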

Author information

Authors and Affiliations

School of Information Technology and Engineering (SITE), University of Ottawa, Ottawa, ON, Canada, K1N 6N5
Amir H. Razavi, Diana Inkpen & Stan Matwin
Natural Semantic Modules co., 5 Tangreen Court, Suite 510, Toronto, ON, M2M 4A7
Sasha Uritsky
Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
Stan Matwin

Editor information

Editors and Affiliations

NLP Technologies Inc., 1255 University Street, H3B 3W9, Montreal, Quebec, Canada
Atefeh Farzindar
Dalhousie University, Faculty of Computer Science, 6050 University Ave, Halifax, B3H 1W5, Nova Scotia, Canada
Vlado Kešelj

Cite this paper

Razavi, A.H., Inkpen, D., Uritsky, S., Matwin, S. (2010). Offensive Language Detection Using Multi-level Classification. In: Farzindar, A., Kešelj, V. (eds) Advances in Artificial Intelligence. Canadian AI 2010. Lecture Notes in Computer Science, vol 6085. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13059-5_5

DOI: https://doi.org/10.1007/978-3-642-13059-5_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13058-8
Online ISBN: 978-3-642-13059-5
eBook Packages: Computer Science, Computer Science (R0)
