Hierarchical Multilabel Classification with Optimal Path Prediction

Neural Processing Letters

Abstract

We consider multilabel classification problems in which the labels are arranged hierarchically in a tree or directed acyclic graph (DAG). In this context, it is of great interest to select a well-connected subset of nodes that best preserves the label dependencies according to the learned models. Top-down and bottom-up procedures for labelling the nodes in the hierarchy have recently been proposed, but they rely largely on pairwise interactions and are therefore susceptible to getting stuck in local optima. In this paper, we remedy this problem by directly finding a small number of label paths that cover the desired subgraph in a tree or DAG. To estimate the high-dimensional label vector, we exploit partial least squares techniques, which project the feature and label spaces simultaneously while constructing sound linear models between them. We then show that the optimal label prediction problem with hierarchy constraints can be reasonably transformed into an optimal path prediction problem with structured sparsity penalties. Introducing path selection models further allows us to leverage efficient network flow solvers with polynomial time complexity. The experimental results demonstrate the promising performance of the proposed algorithm in comparison with state-of-the-art algorithms on both tree- and DAG-structured data sets.
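The contrast drawn above between local labelling procedures and direct path selection can be illustrated with a toy sketch. This is not the method of the paper: it assumes a tree rather than a DAG, selects a single root-to-leaf path rather than a small set of paths, and replaces the network flow formulation with plain recursion. The hierarchy, the node scores, and both function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Tree = Dict[str, List[str]]  # node -> list of children (hypothetical toy hierarchy)


def best_root_to_leaf_path(children: Tree, score: Dict[str, float],
                           root: str) -> Tuple[float, List[str]]:
    """Exhaustively find the root-to-leaf path with the highest total score."""
    kids = children.get(root, [])
    if not kids:  # leaf: the path is the node itself
        return score[root], [root]
    best_total, best_path = max(
        (best_root_to_leaf_path(children, score, k) for k in kids),
        key=lambda t: t[0],
    )
    return score[root] + best_total, [root] + best_path


def greedy_top_down_path(children: Tree, score: Dict[str, float],
                         root: str) -> Tuple[float, List[str]]:
    """Descend from the root, always taking the locally best-scoring child."""
    path, total, node = [root], score[root], root
    while children.get(node):
        node = max(children[node], key=lambda k: score[k])
        path.append(node)
        total += score[node]
    return total, path


# Assumed per-node scores, standing in for the outputs of learned label models.
children = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
score = {"root": 0.0, "a": 0.25, "b": 0.5, "a1": 1.0, "a2": -0.5, "b1": 0.125}

optimal = best_root_to_leaf_path(children, score, "root")
greedy = greedy_top_down_path(children, score, "root")
print(optimal)  # (1.25, ['root', 'a', 'a1'])
print(greedy)   # (0.625, ['root', 'b', 'b1'])
```

In this toy case the greedy descent commits to node b because it scores higher than a, and so misses the stronger leaf a1. Scoring whole paths avoids that local optimum.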

Acknowledgments

The authors thank the anonymous reviewers for their valuable comments. This research work is funded by the National Natural Science Foundation of China under Grant No. 61303179.

Author information

Corresponding author

Correspondence to Zhengya Sun.

Cite this article

Sun, Z., Zhao, Y., Cao, D. et al. Hierarchical Multilabel Classification with Optimal Path Prediction. Neural Process Lett 45, 263–277 (2017). https://doi.org/10.1007/s11063-016-9526-x

  • DOI: https://doi.org/10.1007/s11063-016-9526-x