Offensive Language Detection Using Multi-level Classification

  • Conference paper
Advances in Artificial Intelligence (Canadian AI 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6085)

Abstract

Text messaging through the Internet or cellular phones has become a major medium of personal and commercial communication. At the same time, flames (such as rants, taunts, and squalid phrases) are offensive or abusive phrases that may attack or offend users for a variety of reasons. Automatic discriminative software with a sensitivity parameter for detecting flames or abusive language would therefore be a useful tool. Although a human can recognize such useless, annoying texts among the useful ones, this is not an easy task for computer programs. In this paper, we describe an automatic flame detection method that extracts features at different conceptual levels and applies multi-level classification for flame detection. While the system takes advantage of a variety of statistical models and rule-based patterns, it also uses an auxiliary weighted pattern repository that improves accuracy by matching the text against its graded entries.
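To make the idea of graded pattern matching with a sensitivity parameter concrete, the following is a minimal sketch and not the system described in the paper. The repository entries, their weights, the single direct-address rule, the fixed rule bonus of 0.3, and the names `PATTERN_WEIGHTS`, `lexical_level`, `rule_level`, and `detect_flame` are all hypothetical and chosen only for illustration. The statistical classifiers mentioned in the abstract are not modeled here.

```python
import re
from dataclasses import dataclass

# Hypothetical graded entries (phrase -> weight in [0, 1]); illustrative only,
# not taken from the paper's pattern repository.
PATTERN_WEIGHTS = {
    "idiot": 0.6,
    "shut up": 0.5,
    "moron": 0.7,
    "stupid": 0.4,
}

# Hypothetical rule: second-person address directly followed by a graded term.
DIRECT_ADDRESS = re.compile(
    r"\byou(?:'re| are)?\s+(?:an?\s+)?(idiot|moron|stupid)\b"
)


@dataclass
class Verdict:
    lexical_score: float  # capped sum of matched entry weights
    rule_hit: bool        # whether the direct-address rule fired
    is_flame: bool        # final decision at the chosen sensitivity


def lexical_level(text: str) -> float:
    """Level 1: sum the weights of repository entries found in the text, capped at 1."""
    lowered = text.lower()
    total = sum(
        weight
        for phrase, weight in PATTERN_WEIGHTS.items()
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
    )
    return min(total, 1.0)


def rule_level(text: str) -> bool:
    """Level 2: check the (assumed) rule-based pattern for a direct insult."""
    return DIRECT_ADDRESS.search(text.lower()) is not None


def detect_flame(text: str, sensitivity: float = 0.5) -> Verdict:
    """Combine both levels; a higher sensitivity (strictly between 0 and 1) lowers the threshold."""
    score = lexical_level(text)
    hit = rule_level(text)
    # Assumption: a rule hit adds a fixed bonus of 0.3 to the lexical score.
    combined = min(score + (0.3 if hit else 0.0), 1.0)
    threshold = 1.0 - sensitivity
    return Verdict(score, hit, combined >= threshold)
```

In this sketch, raising `sensitivity` makes the detector flag milder text: a message that contains only a low-weight entry passes at the default setting but is flagged at a higher one. A real system would learn or curate the weights rather than hard-code them.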

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Razavi, A.H., Inkpen, D., Uritsky, S., Matwin, S. (2010). Offensive Language Detection Using Multi-level Classification. In: Farzindar, A., Kešelj, V. (eds) Advances in Artificial Intelligence. Canadian AI 2010. Lecture Notes in Computer Science, vol 6085. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13059-5_5

  • DOI: https://doi.org/10.1007/978-3-642-13059-5_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13058-8

  • Online ISBN: 978-3-642-13059-5

  • eBook Packages: Computer Science, Computer Science (R0)