Journal of Intelligent Information Systems, Volume 52, Issue 1, pp 141–164

Improving large-scale hierarchical classification by rewiring: a data-driven filter based approach

  • Azad Naik
  • Huzefa Rangwala


Hierarchical Classification (HC) is a supervised learning problem where unlabeled instances are classified into a taxonomy of classes. Several methods that utilize the hierarchical structure have been developed to improve HC performance. However, in most cases the hierarchical structure defined a priori by domain experts is inconsistent; as a consequence, the performance improvement over flat classification methods is not noticeable. We propose a scalable, data-driven, filter-based rewiring approach to modify an expert-defined hierarchy. Experimental comparisons of top-down hierarchical classification with our modified hierarchy on a wide range of datasets show improved classification performance over the baseline hierarchy (i.e., defined by experts), clustered hierarchy, and flattening-based hierarchy modification approaches. In comparison to existing rewiring approaches, our method (rewHier) is computationally efficient, enabling it to scale to datasets with large numbers of classes, instances, and features. We also show that our modified hierarchy leads to improved classification performance for classes with few training samples in comparison to flat and state-of-the-art hierarchical classification approaches. Source Code:
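To make the top-down setting concrete, the following is a minimal sketch of top-down hierarchical classification on a toy taxonomy. It is not rewHier and does not implement the rewiring step; the taxonomy, the 2-D training points, and the nearest-centroid rule used at each node are all hypothetical stand-ins chosen for illustration. Any per-node classifier could take the place of the centroid rule.

```python
from math import dist

# Hypothetical toy taxonomy: each internal node maps to its children.
HIERARCHY = {
    "root": ["science", "sports"],
    "science": ["physics", "biology"],
    "sports": ["soccer", "tennis"],
}

# Hypothetical 2-D training points for each leaf class.
TRAIN = {
    "physics": [(0.0, 0.0), (0.2, 0.1)],
    "biology": [(1.0, 0.0), (1.1, 0.2)],
    "soccer": [(0.0, 5.0), (0.1, 5.2)],
    "tennis": [(1.0, 5.0), (1.2, 5.1)],
}


def leaves_under(node):
    """Return all leaf classes in the subtree rooted at `node`."""
    if node not in HIERARCHY:
        return [node]
    return [leaf for child in HIERARCHY[node] for leaf in leaves_under(child)]


def centroid(node):
    """Mean of all training points belonging to leaves under `node`."""
    points = [p for leaf in leaves_under(node) for p in TRAIN[leaf]]
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def predict_top_down(x, node="root"):
    """Route `x` from the root to a leaf, picking the closest child at each level.

    Assumption: nearest-centroid stands in for the per-node classifier.
    Returns the list of nodes visited below the root.
    """
    path = []
    while node in HIERARCHY:
        node = min(HIERARCHY[node], key=lambda child: dist(x, centroid(child)))
        path.append(node)
    return path


if __name__ == "__main__":
    print(predict_top_down((1.05, 0.1)))
    print(predict_top_down((0.1, 5.1)))
```

Because each decision restricts the candidates at the next level, a wrong choice near the root cannot be corrected further down. This is the error propagation that an inconsistent hierarchy makes worse, and it is why changing where a class sits in the taxonomy can change the final prediction.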


Top-down hierarchical classification, Inconsistency, Error propagation, Flattening, Clustering, Rewiring



The authors gratefully acknowledge support for the work described in this paper from NSF grants #1447489 and #1252318.

Compliance with Ethical Standards

Conflict of interests

On behalf of all authors, the corresponding author states that there is no conflict of interest.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Microsoft Corporation, Redmond, USA
  2. George Mason University, Fairfax, USA
