Learning Gradient Boosted Multi-label Classification Rules

  • Conference paper
  • In: Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2020)

Abstract

In multi-label classification, where the evaluation of predictions is less straightforward than in single-label classification, various meaningful, though different, loss functions have been proposed. Ideally, the learning algorithm should be customizable to a specific choice of performance measure. Modern implementations of boosting, most prominently gradient boosted decision trees, appear appealing from this point of view. However, they are mostly limited to single-label classification, and hence not amenable to multi-label losses unless these are label-wise decomposable. In this work, we develop a generalization of the gradient boosting framework to multi-output problems and propose an algorithm for learning multi-label classification rules that is able to minimize decomposable as well as non-decomposable loss functions. Using the well-known Hamming loss and subset 0/1 loss as representatives, we analyze the abilities and limitations of our approach on synthetic data and evaluate its predictive performance on multi-label benchmarks.
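To make the distinction drawn in the abstract concrete, the following is a minimal sketch (not taken from the paper's implementation; function and variable names are illustrative) of the two representative losses. Hamming loss decomposes into a sum of per-label errors, so a label-wise learner can minimize it; subset 0/1 loss depends on the entire label vector jointly and is therefore not label-wise decomposable.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label assignments that are wrong.
    Decomposes into a sum of independent per-label errors."""
    n, k = len(y_true), len(y_true[0])
    errors = sum(yt != yp
                 for row_t, row_p in zip(y_true, y_pred)
                 for yt, yp in zip(row_t, row_p))
    return errors / (n * k)

def subset_zero_one_loss(y_true, y_pred):
    """Fraction of examples whose entire label vector is wrong.
    Depends on all labels jointly, hence non-decomposable."""
    n = len(y_true)
    return sum(row_t != row_p for row_t, row_p in zip(y_true, y_pred)) / n

# Two examples with three labels each; one wrong label in the first row.
Y_true = [[1, 0, 1], [0, 1, 0]]
Y_pred = [[1, 0, 0], [0, 1, 0]]

print(hamming_loss(Y_true, Y_pred))          # 1 wrong label out of 6 -> 1/6
print(subset_zero_one_loss(Y_true, Y_pred))  # 1 wrong row out of 2 -> 0.5
```

Note how a single wrong label costs only 1/6 under Hamming loss but already incurs the full per-example penalty under subset 0/1 loss, which is why minimizing the latter requires modeling label dependencies rather than fitting each label independently.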


Notes

  1. An implementation is available at https://www.github.com/mrapp-ke/Boomer.

  2. Data sets are available at http://mulan.sourceforge.net/datasets-mlc.html and https://sourceforge.net/projects/meka/files/Datasets.


Acknowledgments

This work was supported by the German Research Foundation (DFG) under grant number 400845550. Computations were conducted on the Lichtenberg high performance computer of the TU Darmstadt.

Author information

Correspondence to Michael Rapp.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Rapp, M., Mencía, E.L., Fürnkranz, J., Nguyen, VL., Hüllermeier, E. (2021). Learning Gradient Boosted Multi-label Classification Rules. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science(), vol 12459. Springer, Cham. https://doi.org/10.1007/978-3-030-67664-3_8

  • DOI: https://doi.org/10.1007/978-3-030-67664-3_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67663-6

  • Online ISBN: 978-3-030-67664-3
