Abstract
Multi-label classification is a branch of machine learning that effectively reflects many real-world problems. Among multi-label classification methods, stacked binary relevance (2BR) is a classic approach, and a series of optimized algorithms have been derived from it. Although these algorithms adopt complex optimization processes and show remarkable performance, their core ideas are rather similar, mainly involving restructuring the feature spaces of the meta-classifiers. Existing research has rarely discussed the negative impact that inappropriate two-level predictions have on 2BR structures. In this study, we propose a 2BR-based label correction method named LabCor, which focuses on identifying and correcting unreliable two-level predictions. We first analyze the mechanism by which 2BR-based algorithms obtain their two-level outputs and identify a marker that reflects the reliability of sample predictions. Based on this mechanism, we then introduce a graph-based output determination method that uses the training samples to generate dimensional decision patterns. The global label count distribution is also used to reflect the goal of the classification problem. In the prediction phase, LabCor uses the decision patterns and label count constraints to identify and correct misclassified labels. According to the evaluation results, the proposed method effectively reduces the impact of problematic two-level predictions and yields superior or competitive performance versus well-established 2BR-based algorithms.
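For readers unfamiliar with the stacked binary relevance (2BR) structure that LabCor builds on, the following is a minimal sketch: one binary classifier per label at the first level, then one meta-classifier per label trained on the original features augmented with the first-level outputs. The `Centroid` base learner and all helper names here are illustrative stand-ins, not part of LabCor itself.

```python
class Centroid:
    """Trivial binary classifier: predicts by the nearer class centroid.

    A stand-in for any real base learner (SVM, decision tree, etc.).
    """
    def fit(self, X, y):
        pos = [x for x, t in zip(X, y) if t == 1]
        neg = [x for x, t in zip(X, y) if t == 0]
        self.c1 = [sum(col) / len(pos) for col in zip(*pos)] if pos else None
        self.c0 = [sum(col) / len(neg) for col in zip(*neg)] if neg else None
        return self

    def predict_one(self, x):
        if self.c1 is None:
            return 0
        if self.c0 is None:
            return 1
        d1 = sum((a - b) ** 2 for a, b in zip(x, self.c1))
        d0 = sum((a - b) ** 2 for a, b in zip(x, self.c0))
        return 1 if d1 <= d0 else 0


def train_2br(X, Y):
    """X: list of feature vectors; Y: list of 0/1 label vectors."""
    n_labels = len(Y[0])
    # Level 1: one binary classifier per label on the original features.
    level1 = [Centroid().fit(X, [row[j] for row in Y]) for j in range(n_labels)]
    # Level 2: meta-features are the original features plus all
    # first-level label predictions, letting each meta-classifier
    # exploit correlations between labels.
    meta = [x + [clf.predict_one(x) for clf in level1] for x in X]
    level2 = [Centroid().fit(meta, [row[j] for row in Y]) for j in range(n_labels)]
    return level1, level2


def predict_2br(model, x):
    level1, level2 = model
    z = x + [clf.predict_one(x) for clf in level1]
    return [clf.predict_one(z) for clf in level2]
```

LabCor's contribution, per the abstract, is a further step applied on top of this structure: identifying which of the second-level outputs are unreliable and correcting them, rather than accepting them all as final.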
Code Availability
The source code used in this work is available on GitHub (https://github.com/KingsWoo/LabCor).
Acknowledgements
This work was supported by the National Key Research and Development Program of China (No. 2018YFC0116901), the National Natural Science Foundation of China (No. 81771936), the Major Scientific Project of Zhejiang Lab, China (No. 2020ND8AD01), the Fundamental Research Funds for the Central Universities (No. 2021FZZX002-18) and the Youth Innovation Team Project of the College of Biomedical Engineering & Instrument Science, Zhejiang University.
Funding
This work was supported by the National Key Research and Development Program of China (No. 2018YFC0116901) and the National Natural Science Foundation of China (No. 81771936), which provided financial support for the design of the study, the analysis of data, and the writing of the manuscript; and by the Major Scientific Project of Zhejiang Lab (No. 2018DG0ZX01), the Fundamental Research Funds for the Central Universities (No. 2021FZZX002-18), and the Youth Innovation Team Project of the College of Biomedical Engineering & Instrument Science, Zhejiang University, which provided financial support for the analysis.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
Availability of data and material
Datasets used in this work are available on the Mulan (http://mulan.sourceforge.net/datasets-mlc.html) and Meka (http://waikato.github.io/meka/datasets/) websites.
The authors Chengkai Wu and Tianshu Zhou contributed equally to this article.
Cite this article
Wu, C., Zhou, T., Wu, J. et al. LabCor: Multi-label classification using a label correction strategy. Appl Intell 52, 5414–5434 (2022). https://doi.org/10.1007/s10489-021-02674-y