Performance Measures in Classification of Human Communications

  • Conference paper
Advances in Artificial Intelligence (Canadian AI 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4509)

Abstract

This study emphasizes the importance of using appropriate measures in particular text classification settings. We focus on methods that evaluate how well a classifier performs. The effects of transformations of the confusion matrix are considered for eleven well-known and recently introduced classification measures. We analyze each measure’s ability to retain its value under changes in a confusion matrix. We discuss the benefits of using invariant and non-invariant measures with respect to characteristics of data classes.
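To make the notion of invariance concrete, the following minimal Python sketch checks whether a measure retains its value when a binary confusion matrix is transformed. The three measures (accuracy, precision, recall) and the two transformations (adding true negatives, exchanging the positive and negative classes) are illustrative assumptions chosen here; the abstract does not list the eleven measures or the transformations the paper studies, and all names in the code are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix (hypothetical helper type for this sketch)."""
    tp: int  # true positives
    fn: int  # false negatives
    fp: int  # false positives
    tn: int  # true negatives


def accuracy(cm):
    return (cm.tp + cm.tn) / (cm.tp + cm.fn + cm.fp + cm.tn)


def precision(cm):
    return cm.tp / (cm.tp + cm.fp)


def recall(cm):
    return cm.tp / (cm.tp + cm.fn)


def add_true_negatives(cm, extra=50):
    """Illustrative transformation: only the TN count changes."""
    return replace(cm, tn=cm.tn + extra)


def swap_classes(cm):
    """Illustrative transformation: exchange positive and negative labels."""
    return ConfusionMatrix(tp=cm.tn, fn=cm.fp, fp=cm.fn, tn=cm.tp)


def is_invariant(measure, cm, transform, tol=1e-12):
    """True if the measure keeps its value on this matrix under the transform.

    This checks a single matrix only; it is an empirical spot check,
    not a proof of invariance.
    """
    return abs(measure(cm) - measure(transform(cm))) <= tol


cm = ConfusionMatrix(tp=40, fn=10, fp=20, tn=30)
for measure in (accuracy, precision, recall):
    print(
        measure.__name__,
        "TN change:", is_invariant(measure, cm, add_true_negatives),
        "class swap:", is_invariant(measure, cm, swap_classes),
    )
```

On this example matrix, precision and recall are unaffected by extra true negatives while accuracy changes; under the class swap, accuracy keeps its value while precision and recall do not. Which behaviour is preferable depends on the characteristics of the data classes, for instance whether the negative class is large and of little interest.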

Editor information

Ziad Kobti, Dan Wu

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Sokolova, M., Lapalme, G. (2007). Performance Measures in Classification of Human Communications. In: Kobti, Z., Wu, D. (eds) Advances in Artificial Intelligence. Canadian AI 2007. Lecture Notes in Computer Science, vol 4509. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72665-4_14

  • DOI: https://doi.org/10.1007/978-3-540-72665-4_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72664-7

  • Online ISBN: 978-3-540-72665-4

  • eBook Packages: Computer Science, Computer Science (R0)
