Improved Multi-label Propagation for Small Data with Multi-objective Optimization

Musayeva, Khadija; Binois, Mickaël

doi:10.1007/978-3-031-43421-1_17

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14172)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

This paper focuses on multi-label learning from small amounts of labelled data. We demonstrate that the binary-relevance extension of the interpolated label propagation algorithm, the harmonic function, is a competitive learning method with respect to many widely used evaluation measures. This is achieved by a new transition matrix that better captures the underlying structure useful for classification, coupled with data-dependent thresholding strategies. Furthermore, we show that, in the case of label dependence, the outputs of a competitive learning model can be used as part of the input to the harmonic function to improve the performance of that model. Finally, since we use multiple measures to thoroughly evaluate the performance of the algorithm, we propose to use the game-theoretic method of Kalai and Smorodinsky to output a single compromise solution for all measures. This method can be applied to any learning model irrespective of the number of evaluation metrics used.
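
As a rough illustration of the two ingredients named in the abstract, the sketch below shows (a) the classical harmonic-function solution of Zhu et al. applied to each label independently (binary relevance) and (b) one simple way to pick a compromise among candidates scored on several measures in the spirit of Kalai and Smorodinsky. It rests on assumptions that are not taken from the paper: the transition matrix is the standard row-normalised affinity matrix rather than the new one the authors propose, the disagreement point is the worst observed value per measure, and the function names `harmonic_function` and `kalai_smorodinsky_pick` are hypothetical.

```python
import numpy as np


def harmonic_function(W, Y_l):
    """Harmonic solution of label propagation, solved once per label column.

    W   : (n, n) non-negative affinity matrix, labelled points listed first.
    Y_l : (l, q) binary label matrix for the l labelled points.
    Returns an (n - l, q) matrix of soft scores for the unlabelled points.
    """
    W = np.asarray(W, dtype=float)
    Y_l = np.asarray(Y_l, dtype=float)
    l = Y_l.shape[0]
    # Assumption: standard random-walk transition matrix P = D^{-1} W.
    P = W / W.sum(axis=1, keepdims=True)
    P_uu, P_ul = P[l:, l:], P[l:, :l]
    # Solve (I - P_uu) F_u = P_ul Y_l; each column is one binary problem.
    return np.linalg.solve(np.eye(P_uu.shape[0]) - P_uu, P_ul @ Y_l)


def kalai_smorodinsky_pick(scores):
    """Index of the candidate maximising the smallest normalised gain.

    scores : (m, k) array, m candidates by k measures, larger is better.
    Assumption: the disagreement point is the per-measure minimum and the
    ideal point the per-measure maximum over the given candidates.
    """
    scores = np.asarray(scores, dtype=float)
    nadir = scores.min(axis=0)
    ideal = scores.max(axis=0)
    # Guard against measures on which all candidates tie.
    span = np.where(ideal > nadir, ideal - nadir, 1.0)
    gains = (scores - nadir) / span
    return int(np.argmax(gains.min(axis=1)))


# Toy graph: two labelled nodes (0 and 1) and one unlabelled node (2).
W = np.array([[0, 1, 3],
              [1, 0, 1],
              [3, 1, 0]])
Y_l = np.array([[1, 0],
                [0, 1]])
F_u = harmonic_function(W, Y_l)
print(F_u)

# Three candidate models scored on two measures.
candidates = [[0.9, 0.2], [0.6, 0.6], [0.1, 0.95]]
print(kalai_smorodinsky_pick(candidates))
```

In the toy graph the unlabelled node is tied three times more strongly to the first labelled node, so its score for the first label is higher. Turning the soft scores into label sets still requires a thresholding rule, which this sketch leaves open.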

Notes

1. The appendices are available at github.com/kmusayeva/M-LP.
2. To see this, let (1, 0, 1, 1) be the true label set. Both (1, 0, 0, 0) and (1, 1, 0, 1) predict two labels incorrectly, but the latter attains a higher F1 value (2/3 versus 1/2) because its gain in recall outweighs its loss in precision.
3. The code and the data are available at github.com/kmusayeva/M-LP.
4. All datasets except Fungi are taken from https://www.uco.es/kdis/mllresources/. The Fungi dataset was kindly provided to us by C. Averill [1].
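
The arithmetic behind the F1 example in note 2 can be checked with a few lines of Python. This is a minimal sketch using the standard definition F1 = 2TP / (2TP + FP + FN); the helper names are hypothetical and not taken from the authors' code.

```python
def confusion_counts(y_true, y_pred):
    """Return (true positives, false positives, false negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, fn


def f1(y_true, y_pred):
    """F1 = 2TP / (2TP + FP + FN); 0.0 when undefined (conventions vary)."""
    tp, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def hamming_errors(y_true, y_pred):
    """Number of label positions predicted incorrectly."""
    return sum(t != p for t, p in zip(y_true, y_pred))


y_true = (1, 0, 1, 1)
sparse_pred = (1, 0, 0, 0)   # precision 1, recall 1/3
dense_pred = (1, 1, 0, 1)    # precision 2/3, recall 2/3

for name, pred in [("sparse", sparse_pred), ("dense", dense_pred)]:
    print(name, hamming_errors(y_true, pred), round(f1(y_true, pred), 3))
```

Both predictions make two errors, yet the second scores higher on F1, which is why F1 tends to favour predicting more labels than Hamming loss does.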

References

Averill, C., Werbin, Z., Atherton, K., Bhatnagar, J., Dietze, M.: Soil microbiome predictability increases with spatial and taxonomic scale. Nature Ecol. Evol. 5(6), 747–756 (2021)
Belkin, M., Niyogi, P.: Semi-supervised learning on Riemannian manifolds. Mach. Learn. 56(1), 209–239 (2004)
Belkin, M., Niyogi, P., Sindhwani, V.: Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res. 7(11) (2006)
Binois, M., Picheny, V., Taillandier, P., Habbal, A.: The Kalai-Smorodinsky solution for many-objective Bayesian optimization. J. Mach. Learn. Res. 21(150), 1–42 (2020)
Chapelle, O., Weston, J., Schölkopf, B.: Cluster kernels for semi-supervised learning. In: Advances in Neural Information Processing Systems. Citeseer (2002)
Cheng, W., Hüllermeier, E.: Combining instance-based learning and logistic regression for multilabel classification. Mach. Learn. 76, 211–225 (2009)
Chung, F.: Spectral graph theory, vol. 92. American Mathematical Soc. (1997)
Dembczyński, K., Jachnik, A., Kotlowski, W., Waegeman, W., Hüllermeier, E.: Optimizing the F-measure in multi-label classification: Plug-in rule approach versus structured loss minimization. In: International Conference on Machine Learning, pp. 1130–1138. PMLR (2013)
Dembczyński, K., Waegeman, W., Cheng, W., Hüllermeier, E.: Regret analysis for performance metrics in multi-label classification: the case of Hamming and subset zero-one loss. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 280–295. Springer (2010)
Dembczyński, K.K., Cheng, W., Hüllermeier, E.: Bayes optimal multilabel classification via probabilistic classifier chains. In: ICML (2010)
Dembczyński, K.K., Waegeman, W., Cheng, W., Hüllermeier, E.: On label dependence in multilabel classification. In: LastCFP: ICML Workshop on learning from multi-label data. Ghent University, KERMIT, Department of Applied Mathematics, Biometrics (2010)
Doyle, P., Snell, J.: Random walks and electric networks, vol. 22. American Mathematical Soc. (1984)
Fan, R., Lin, C.: A study on threshold selection for multi-label classification, pp. 1–23. Department of Computer Science, National Taiwan University (2007)
Hastie, T., Tibshirani, R., Friedman, J.H.: The elements of statistical learning: data mining, inference, and prediction, vol. 2. Springer (2009)
Kalai, E., Smorodinsky, M.: Other solutions to Nash’s bargaining problem. Econometrica: J. Econometric Soc., 513–518 (1975)
Kong, X., Ng, M., Zhou, Z.: Transductive multilabel learning via label set propagation. IEEE Trans. Knowl. Data Eng. 25(3), 704–719 (2011)
Leathart, T., Frank, E., Holmes, G., Pfahringer, B.: Probability calibration trees. In: Asian Conference on Machine Learning, pp. 145–160 (2017)
Legendre, P., Gallagher, E.: Ecologically meaningful transformations for ordination of species data. Oecologia 129(2), 271–280 (2001)
Ng, A., Jordan, M., Weiss, Y.: On spectral clustering: analysis and an algorithm. Advances in neural information processing systems 14 (2001)
Picheny, V., Binois, M.: GPGame: Solving Complex Game Problems using Gaussian Processes (2022). www.github.com/vpicheny/GPGame, R package version 1.2.0
Platt, J.: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Adv. Large Margin Classifiers 10(3), 61–74 (1999)
R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2021). www.R-project.org/
Read, J., Pfahringer, B., Holmes, G.: Generating synthetic multi-label data streams. In: ECML/PKDD 2009 Workshop on Learning from Multi-label Data (MLD 2009), pp. 69–84. Citeseer (2009)
Read, J., Pfahringer, B., Holmes, G., Frank, E.: Classifier chains for multi-label classification. Mach. Learn. 85(3), 333–359 (2011)
Rivolli, A., de Carvalho, A.: The utiml package: Multi-label classification in R. R J. 10(2), 24 (2018)
Roweis, S., Saul, L.: Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326 (2000)
Sechidis, K., Tsoumakas, G., Vlahavas, I.: On the stratification of multi-label data. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds.) ECML PKDD 2011. LNCS (LNAI), vol. 6913, pp. 145–158. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-23808-6_10
Shi, C., Kong, X., Yu, P., Wang, B.: Multi-objective multi-label classification. In: Proceedings of the 2012 SIAM International Conference on Data Mining, pp. 355–366. SIAM (2012)
Stellato, B., Banjac, G., Goulart, P., Bemporad, A., Boyd, S.: OSQP: an operator splitting solver for quadratic programs. Math. Program. Comput. 12(4), 637–672 (2020)
Tsoumakas, G., Katakis, I.: Multi-label classification: an overview. Int. J. Data Warehousing Mining (IJDWM) 3(3), 1–13 (2007)
Tsoumakas, G., Katakis, I., Vlahavas, I.: Mining multi-label data. In: Data Mining and Knowledge Discovery Handbook, pp. 667–685. Springer (2009). https://doi.org/10.1007/978-0-387-09823-4_34
Tsoumakas, G., Katakis, I., Vlahavas, I.: Random k-labelsets for multilabel classification. IEEE Trans. Knowl. Data Eng. 23(7), 1079–1089 (2010)
Wang, B., Tu, Z., Tsotsos, J.: Dynamic label propagation for semi-supervised multi-class multi-label classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 425–432 (2013)
Wang, F., Zhang, C.: Label propagation through linear neighborhoods. IEEE Trans. Knowl. Data Eng. 20(1), 55–67 (2007)
Yang, Y.: A study of thresholding strategies for text categorization. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 137–145 (2001)
Zadrozny, B., Elkan, C.: Transforming classifier scores into accurate multiclass probability estimates. In: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 694–699 (2002)
Zhang, M., Li, Y., Yang, H., Liu, X.: Towards class-imbalance aware multi-label learning. IEEE Trans. Cybern. (2020)
Zhang, M., Zhou, Z.: ML-KNN: a lazy learning approach to multi-label learning. Pattern Recogn. 40(7), 2038–2048 (2007)
Zhang, M., Zhou, Z.: A review on multi-label learning algorithms. IEEE Trans. Knowl. Data Eng. 26(8), 1819–1837 (2013)
Zhou, D., Bousquet, O., Lal, T., Weston, J., Schölkopf, B.: Learning with local and global consistency. In: Advances in Neural Information Processing Systems, pp. 321–328 (2004)
Zhou, D., Schölkopf, B.: Learning from labeled and unlabeled data using random walks. In: Rasmussen, C.E., Bülthoff, H.H., Schölkopf, B., Giese, M.A. (eds.) DAGM 2004. LNCS, vol. 3175, pp. 237–244. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-28649-3_29
Zhu, X., Ghahramani, Z.: Learning from labeled and unlabeled data with label propagation (2002)
Zhu, X., Ghahramani, Z., Lafferty, J.: Semi-supervised learning using Gaussian fields and harmonic functions. In: Proceedings of the 20th International Conference on Machine Learning (ICML-03), pp. 912–919 (2003)

Author information

Authors and Affiliations

Université Côte d’Azur, Inria, CNRS, LJAD, Sophia Antipolis, France
Khadija Musayeva & Mickaël Binois

Corresponding author

Correspondence to Khadija Musayeva.

Editor information

Editors and Affiliations

University of Michigan, Ann Arbor, MI, USA
Danai Koutra
University of Vienna, Vienna, Austria
Claudia Plant
Max Planck Institute for Software Systems, Kaiserslautern, Germany
Manuel Gomez Rodriguez
Politecnico di Torino, Turin, Italy
Elena Baralis
CENTAI, Turin, Italy
Francesco Bonchi

About this paper

Cite this paper

Musayeva, K., Binois, M. (2023). Improved Multi-label Propagation for Small Data with Multi-objective Optimization. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol 14172. Springer, Cham. https://doi.org/10.1007/978-3-031-43421-1_17

DOI: https://doi.org/10.1007/978-3-031-43421-1_17
Published: 18 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43420-4
Online ISBN: 978-3-031-43421-1
eBook Packages: Computer Science, Computer Science (R0)
