Data Mining and Knowledge Discovery, Volume 22, Issue 1–2, pp 31–72

A survey of hierarchical classification across different application domains

Article

Abstract

In this survey we discuss the task of hierarchical classification. The literature on this field is scattered across very different application domains, so research in one domain is often carried out without awareness of methods developed in others. We define the task of hierarchical classification and discuss why some related tasks should not be considered hierarchical classification. We also present a new perspective on some existing hierarchical classification approaches and, based on that perspective, propose a new unifying framework for classifying them. Finally, we review empirical comparisons of existing methods reported in the literature and give a conceptual comparison of those methods at a high level of abstraction, discussing their advantages and disadvantages.

Keywords

Hierarchical classification; Tree-structured class hierarchies; DAG-structured class hierarchies

Copyright information

© The Author(s) 2010

Authors and Affiliations

  1. School of Computing, University of Kent, Canterbury, UK
