A survey of hierarchical classification across different application domains

Data Mining and Knowledge Discovery

Abstract

In this survey we discuss the task of hierarchical classification. The literature on this field is scattered across very different application domains, and as a result research in one domain is often carried out without awareness of methods developed in other domains. We define the task of hierarchical classification and discuss why some related tasks should not be considered hierarchical classification. We also present a new perspective on some existing hierarchical classification approaches and, based on that perspective, propose a new unifying framework for classifying the existing approaches. Finally, we review empirical comparisons of the existing methods reported in the literature and give a conceptual comparison of those methods at a high level of abstraction, discussing their advantages and disadvantages.

Author information

Corresponding author

Correspondence to Carlos N. Silla Jr.

About this article

Cite this article

Silla, C.N., Freitas, A.A. A survey of hierarchical classification across different application domains. Data Min Knowl Disc 22, 31–72 (2011). https://doi.org/10.1007/s10618-010-0175-9

  • DOI: https://doi.org/10.1007/s10618-010-0175-9
