Data Mining and Knowledge Discovery, Volume 29, Issue 3, pp 820–865

Evaluation measures for hierarchical classification: a unified view and novel approaches

  • Aris Kosmopoulos
  • Ioannis Partalas
  • Eric Gaussier
  • Georgios Paliouras
  • Ion Androutsopoulos


Hierarchical classification addresses the problem of classifying items into a hierarchy of classes. An important issue in hierarchical classification is the evaluation of different classification algorithms, which is complicated by the hierarchical relations among the classes. Several evaluation measures have been proposed for hierarchical classification, each using the hierarchy in a different way, but none provides a unified view of the problem. This paper studies the problem of evaluation in hierarchical classification by analysing and abstracting the key components of the existing performance measures. It also proposes two alternative generic views of hierarchical evaluation and introduces two corresponding novel measures. The proposed measures, along with the state-of-the-art ones, are empirically tested on three large datasets from the domain of text classification. The empirical results illustrate the undesirable behaviour of existing approaches and show how the proposed methods overcome most of these problems across a range of cases.


Evaluation · Evaluation measures · Hierarchical classification · Tree-structured class hierarchies · DAG-structured class hierarchies



Copyright information

© The Author(s) 2014

Authors and Affiliations

  • Aris Kosmopoulos (1, 2)
  • Ioannis Partalas (3)
  • Eric Gaussier (3)
  • Georgios Paliouras (1)
  • Ion Androutsopoulos (2)

  1. National Center for Scientific Research “Demokritos”, Athens, Greece
  2. Athens University of Economics and Business, Athens, Greece
  3. Laboratoire d’Informatique de Grenoble, Université Joseph Fourier, Grenoble, France
