LabCor: Multi-label classification using a label correction strategy

Applied Intelligence

Abstract

Multi-label classification is a branch of machine learning that can effectively reflect real-world problems. Among multi-label classification methods, stacked binary relevance (2BR) is a classic approach, and a series of optimized algorithms have been derived from it. Although these algorithms adopt complex optimization processes and show remarkable performance, their core concepts are rather similar and mainly involve restructuring the feature spaces of the meta-classifiers. Existing research has rarely discussed the negative impact that inappropriate two-level predictions have on 2BR structures. In this study, we propose a 2BR-based label correction method named LabCor, which focuses on identifying and correcting unreliable two-level predictions. We first discuss the inner mechanism by which 2BR-based algorithms obtain their two-level outputs and find a marker that reflects the reliability of each sample's predictions. Based on this mechanism, we then introduce a graph-based output determination method that uses the training samples to generate dimensional decision patterns. The global label count distribution is also used to reflect the goal of the classification problem. In the prediction phase, LabCor uses the decision patterns and label count constraints to identify and correct misclassified labels. According to the evaluation results, the proposed method effectively reduces the impact of unreliable two-level predictions and yields superior or competitive performance compared with well-established 2BR-based algorithms.
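For readers unfamiliar with the 2BR structure the abstract builds on, the following is a minimal sketch of generic stacked binary relevance, not the LabCor method itself. It rests on several assumptions that are not taken from the paper: the class and function names (`CentroidClassifier`, `StackedBinaryRelevance`, `label_count_distribution`) are hypothetical, the base learner is a toy nearest-centroid rule chosen only to keep the example dependency-free, and the meta-level is trained on in-sample first-level predictions appended to the original features (2BR variants differ on both points). The last function shows one plausible reading of a "global label count distribution": the fraction of training samples carrying each number of labels.

```python
from collections import Counter


class CentroidClassifier:
    """Toy binary base learner (an assumption): picks the class with the nearer mean."""

    def fit(self, X, y):
        self.centroids = {}
        for cls in (0, 1):
            rows = [x for x, t in zip(X, y) if t == cls]
            if rows:  # a class absent from training is simply never predicted
                self.centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def sq_dist(x, c):
            return sum((a - b) ** 2 for a, b in zip(x, c))

        return [
            min(self.centroids, key=lambda cls: sq_dist(x, self.centroids[cls]))
            for x in X
        ]


class StackedBinaryRelevance:
    """Generic 2BR sketch: one binary model per label at each of two levels."""

    def __init__(self, base=CentroidClassifier):
        self.base = base

    def fit(self, X, Y):
        columns = [list(col) for col in zip(*Y)]  # one target column per label
        # Level 1: independent binary relevance on the raw features.
        self.level1 = [self.base().fit(X, col) for col in columns]
        # Level 2: meta-classifiers see features plus all level-1 outputs.
        X_meta = self._augment(X)
        self.level2 = [self.base().fit(X_meta, col) for col in columns]
        return self

    def _augment(self, X):
        per_label = [clf.predict(X) for clf in self.level1]
        first_level = [list(row) for row in zip(*per_label)]
        return [list(x) + p for x, p in zip(X, first_level)]

    def predict(self, X):
        per_label = [clf.predict(self._augment(X)) for clf in self.level2]
        return [list(row) for row in zip(*per_label)]


def label_count_distribution(Y):
    """Fraction of samples that carry exactly k labels, for each observed k."""
    counts = Counter(sum(row) for row in Y)
    return {k: counts[k] / len(Y) for k in sorted(counts)}


# Tiny illustrative dataset (made up): three clusters, two labels.
X_train = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1], [1.0, 0.0], [0.9, 0.1]]
Y_train = [[0, 0], [0, 0], [1, 1], [1, 1], [1, 0], [1, 0]]

model = StackedBinaryRelevance().fit(X_train, Y_train)
predictions = model.predict([[0.95, 0.95], [1.0, 0.05]])
distribution = label_count_distribution(Y_train)
```

In this sketch the second level can revise a first-level output because it sees the first-level predictions for all labels at once. A correction step of the kind the abstract describes would operate on those two sets of outputs, but its details are not given in the abstract and are not reproduced here.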

Code Availability

The source code used in this work is available on GitHub (https://github.com/KingsWoo/LabCor).

Acknowledgements

This work was supported by the National Key Research and Development Program of China (No. 2018YFC0116901), the National Natural Science Foundation of China (No. 81771936), the Major Scientific Project of Zhejiang Lab, China (No. 2020ND8AD01), the Fundamental Research Funds for the Central Universities (No. 2021FZZX002-18) and the Youth Innovation Team Project of the College of Biomedical Engineering & Instrument Science, Zhejiang University.

Funding

This work was supported by the National Key Research and Development Program of China (No. 2018YFC0116901) and the National Natural Science Foundation of China (No. 81771936), which provided financial support for the design of the study, the analysis of data and the writing of the manuscript. It was also supported by the Major Scientific Project of Zhejiang Lab (No. 2018DG0ZX01), the Fundamental Research Funds for the Central Universities (No. 2021FZZX002-18) and the Youth Innovation Team Project of the College of Biomedical Engineering & Instrument Science, Zhejiang University, which provided financial support for the analysis.

Author information

Corresponding author

Correspondence to Jingsong Li.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Availability of data and material

Datasets used in this work are available on the Mulan (http://mulan.sourceforge.net/datasets-mlc.html) and Meka (http://waikato.github.io/meka/datasets/) websites.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors Chengkai Wu and Tianshu Zhou contributed equally to this article.

About this article

Cite this article

Wu, C., Zhou, T., Wu, J. et al. LabCor: Multi-label classification using a label correction strategy. Appl Intell 52, 5414–5434 (2022). https://doi.org/10.1007/s10489-021-02674-y

  • DOI: https://doi.org/10.1007/s10489-021-02674-y
