Abstract
While a variety of ensemble methods for multilabel classification have been proposed in the literature, the question of how to aggregate the predictions of the individual ensemble members has received little attention so far. In this paper, we introduce a formal framework of ensemble multilabel classification, in which we distinguish two principal approaches: “predict then combine” (PTC), where the ensemble members first make loss-minimizing predictions that are subsequently combined, and “combine then predict” (CTP), which first aggregates information, such as marginal label probabilities, from the individual ensemble members and then derives a prediction from this aggregation. Both approaches generalize the voting techniques commonly used for multilabel ensembles, but they additionally allow the target performance measure to be taken into account explicitly, so that instantiations of CTP and PTC can be tailored to specific loss functions. Experimentally, we show that standard voting techniques are indeed outperformed by suitable instantiations of CTP and PTC, and we provide evidence that CTP performs well for decomposable loss functions, whereas PTC is the better choice for non-decomposable losses.
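The distinction between the two approaches can be illustrated with a minimal sketch (not the authors' implementation). The example below assumes each ensemble member outputs marginal label probabilities and targets the Hamming loss, for which thresholding the marginals at 0.5 yields the loss-minimizing prediction; all function names are hypothetical.

```python
# Hypothetical sketch contrasting PTC and CTP for Hamming loss,
# where thresholding marginal probabilities at 0.5 is the
# risk-minimizing prediction for each individual label.

def predict_then_combine(member_probs, threshold=0.5):
    """PTC: each member first makes its own loss-minimizing prediction
    (thresholding its marginals), then predictions are majority-voted."""
    member_preds = [[1 if p >= threshold else 0 for p in probs]
                    for probs in member_probs]
    n = len(member_preds)
    # Per-label majority vote over the members' binary predictions.
    return [1 if 2 * sum(col) >= n else 0 for col in zip(*member_preds)]

def combine_then_predict(member_probs, threshold=0.5):
    """CTP: marginal label probabilities are first averaged across the
    members, and a single prediction is derived from the aggregate."""
    n = len(member_probs)
    avg = [sum(col) / n for col in zip(*member_probs)]
    return [1 if p >= threshold else 0 for p in avg]

# Three members, two labels: each row holds one member's P(label = 1).
probs = [[0.6, 0.2], [0.55, 0.4], [0.1, 0.9]]
print(predict_then_combine(probs))  # -> [1, 0]
print(combine_then_predict(probs))  # -> [0, 1]
```

Note that the two schemes can disagree even in this tiny example: two members weakly favor the first label (so its vote wins under PTC), yet the averaged marginal falls below 0.5 (so it is rejected under CTP).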
Notes
- 1. \(\llbracket \cdot \rrbracket \) is the indicator function, i.e., \(\llbracket A \rrbracket = 1\) if the predicate \(A\) is true and \(\llbracket A \rrbracket = 0\) otherwise.
- 2. http://mulan.sourceforge.net/datasets.html. The source code will be available at https://github.com/nvlml/DS2020-EMLC.
Acknowledgements
This work was supported by the German Research Foundation (DFG) under grant number 400845550.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Nguyen, VL., Hüllermeier, E., Rapp, M., Loza Mencía, E., Fürnkranz, J. (2020). On Aggregation in Ensembles of Multilabel Classifiers. In: Appice, A., Tsoumakas, G., Manolopoulos, Y., Matwin, S. (eds) Discovery Science. DS 2020. Lecture Notes in Computer Science(), vol 12323. Springer, Cham. https://doi.org/10.1007/978-3-030-61527-7_35
Print ISBN: 978-3-030-61526-0
Online ISBN: 978-3-030-61527-7