Abstract
In conventional multi-label learning, each training instance is associated with multiple available labels. Real-world data, however, usually exhibit more complicated properties, such as abundant irrelevant features, incomplete labels, noisy labels, and class imbalance. Most existing multi-label learning algorithms address only one of these issues and fail to consider their confounding effects, which degrade the accuracy of multi-label classification. In this paper, we propose an integrated multi-label learning framework, ML-INC, that trains the multi-label model while addressing all of these issues simultaneously. Specifically, we first decompose the observed label matrix into an incomplete ground-truth label matrix and a noisy label matrix by employing a low-rank and sparse decomposition scheme. Second, a label confidence matrix is learned to supplement the incomplete label matrix by utilizing high-order label correlation and label consistency; a low-rank structure is adopted to capture the label correlation. Third, a label regularization matrix is introduced to alleviate the effects of class imbalance in the label matrix, and a sparse constraint is imposed on the feature mapping matrix to select relevant discriminative features. Finally, the alternating direction method of multipliers (ADMM) is employed to solve the optimization problem, and comprehensive experiments are conducted to verify the effectiveness of the proposed method.
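The first step of the abstract, splitting an observed label matrix into a low-rank part and a sparse part with ADMM, can be illustrated with a generic sketch. This is not the ML-INC objective: it omits the label confidence matrix, the label regularization matrix, and the feature mapping, and it solves only the textbook problem min ||L||_* + lam ||S||_1 subject to Y = L + S. The function names, the toy data, and the values of `lam` and `mu` are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(X, tau):
    # Proximal operator of tau * ||.||_1: shrink every entry toward zero.
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    # Proximal operator of tau * ||.||_*: shrink the singular values.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt

def low_rank_sparse_split(Y, lam=None, mu=1.0, n_iter=300, tol=1e-7):
    # Generic ADMM for  min ||L||_* + lam * ||S||_1  s.t.  Y = L + S.
    # L plays the role of the (low-rank) clean label part and
    # S the role of the (sparse) noisy label part.
    Y = np.asarray(Y, dtype=float)
    if lam is None:
        lam = 1.0 / np.sqrt(max(Y.shape))  # common default, an assumption here
    L = np.zeros_like(Y)
    S = np.zeros_like(Y)
    Z = np.zeros_like(Y)  # Lagrange multiplier for the constraint Y = L + S
    for _ in range(n_iter):
        L = svt(Y - S + Z / mu, 1.0 / mu)             # low-rank update
        S = soft_threshold(Y - L + Z / mu, lam / mu)  # sparse update
        R = Y - L - S                                 # constraint residual
        Z = Z + mu * R                                # dual ascent step
        if np.linalg.norm(R) <= tol * max(np.linalg.norm(Y), 1.0):
            break
    return L, S

# Toy observed label matrix: a rank-one label pattern plus a few spurious positives.
rng = np.random.default_rng(0)
clean = np.outer(rng.integers(0, 2, size=30), np.array([1, 1, 0, 1, 0])).astype(float)
noise = (rng.random(clean.shape) < 0.05).astype(float)
Y = np.clip(clean + noise, 0.0, 1.0)
L, S = low_rank_sparse_split(Y, lam=0.2, mu=1.0)
```

In this sketch `lam` trades off how much of `Y` is attributed to the sparse part versus the low-rank part, and `mu` is the ADMM penalty parameter; both would need tuning on real data.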
Data Availability
The datasets generated and/or analysed during the current study are available in the Mulan and MEKA repositories, http://mulan.sourceforge.net/datasets and http://meka.sourceforge.net.
Notes
Let \(\mathcal {S}_{i}=\left \{j \mid y_{ij}=1,\ j=1,\dots ,q\right \}\) denote the candidate label set of the i-th example, which contains both ground-truth labels and noisy labels. The main difference between this concept and the one in [37] is that not all ground-truth labels are required to be in the candidate label set.
In MLL, a large-scale dataset usually refers to a dataset with more than 5000 instances [20].
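The candidate label set defined in the first note can be made concrete with a small sketch. The helper name `candidate_label_sets` and the toy matrix are hypothetical; label indices are 1-based to match j = 1, ..., q.

```python
import numpy as np

def candidate_label_sets(Y):
    # For each row i of a binary label matrix, collect S_i = {j | y_ij = 1},
    # using 1-based label indices as in the note.
    Y = np.asarray(Y)
    return [set((np.flatnonzero(row == 1) + 1).tolist()) for row in Y]

# Toy observed label matrix with 3 examples and q = 3 labels.
Y_obs = np.array([[1, 0, 1],
                  [0, 0, 0],
                  [0, 1, 1]])
sets = candidate_label_sets(Y_obs)
```

Because the observed labels may be both incomplete and noisy, a set produced this way can miss some ground-truth labels and can contain labels that are not ground truth.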
References
Tsoumakas G, Katakis I, Vlahavas I (2009) Mining multi-label data. In: Data mining and knowledge discovery handbook. Springer, pp 667–685
Zhang ML, Zhou ZH (2013) A review on multi-label learning algorithms. IEEE Trans Knowl Data Eng 26(8):1819–1837
Zhang Y, Wu J, Cai Z, Philip SY (2020) Multi-view multi-label learning with sparse feature selection for image annotation. IEEE Trans Multimed 22(11):2844–2857
Wang R, Ridley R, Qu W, Dai X et al (2021) A novel reasoning mechanism for multi-label text classification. Inf Process Manag 58(2):102441
Cerri R, Barros RC, de Carvalho ACPLF, Jin Y (2016) Reduction strategies for hierarchical multi-label classification in protein function prediction. BMC Bioinf 17(1):1–24
Zhang ML, Wu L (2014) Lift: multi-label learning with label-specific features. IEEE Trans Pattern Anal Mach Intell 37(1):107–120
Zhang ML, Zhou ZH (2007) ML-KNN: a lazy learning approach to multi-label learning. Pattern Recognit 40(7):2038–2048
Liu XW, Zhu XZ, Li MM, Wang L, Zhu E, Liu TL, Kloft M, Shen DG, Yin JP, Gao W (2019) Multiple kernel k-means with incomplete kernels. IEEE Trans Pattern Anal Mach Intell 42(5):1191–1204
Huang J, Li GR, Huang QM, Wu XD (2017) Joint feature selection and classification for multilabel learning. IEEE Trans Cybern 48(3):876–889
Huang J, Li GR, Huang QM, Wu XD (2016) Learning label-specific features and class-dependent labels for multi-label classification. IEEE Trans Knowl Data Eng 28(12):3309–3323
Zhu Y, Kwok JT, Zhou ZH (2017) Multi-label learning with global and local label correlation. IEEE Trans Knowl Data Eng 30(6):1081–1094
Teisseyre P (2021) Classifier chains for positive unlabelled multi-label learning. Knowl-Based Syst 213:106709
Huang J, Qin F, Zheng X, Cheng Z, Yuan Z, Zhang W, Huang Q (2019) Improving multi-label classification with missing labels by learning label-specific features. Inform Sci 492:124–146
Guan YY, Li WH, Zhang BX, Han B, Ji ML (2021) Multi-label classification by formulating label-specific features from simultaneous instance level and feature level. Appl Intell 51(6):3375–3390
Sun L, Yin TY, Ding WP, Qian YH, Xu JC (2021) Feature selection with missing labels using multilabel fuzzy neighborhood rough sets and maximum relevance minimum redundancy. IEEE Trans Fuzzy Syst 30(5):1197–1211. https://ieeexplore.ieee.org/abstract/document/9333666/
Charte F, Rivera AJ, Del Jesus MJ (2016) Multilabel classification: problem analysis, metrics and techniques. Springer
Boutell MR, Luo J, Shen X, Brown CM (2004) Learning multi-label scene classification. Pattern Recognit 37(9):1757–1771
Elisseeff A, Weston J (2001) A kernel method for multi-labelled classification. In: International conference on neural information processing systems: natural and synthetic, pp 681–687
Sun LJ, Ye P, Lyu GY, Feng SH, Dai GJ, Zhang H (2020) Weakly-supervised multi-label learning with noisy features and incomplete labels. Neurocomputing 413:61–71
Tan AH, Ji XW, Liang JY, Tao YZ, Wu WZ, Pedrycz W (2022) Weak multi-label learning with missing labels via instance granular discrimination. Inform Sci 594:200–216. https://doi.org/10.1016/j.ins.2022.02.011
Zhang J, Li SZ, Jiang M, Tan KC (2020) Learning from weakly labeled data based on manifold regularized sparse model. IEEE Trans Cybern 52(5):3841–3854
Tan AH, Liang JY, Wu WZ, Zhang J (2022) Semi-supervised partial multi-label classification via consistency learning. Pattern Recognit:108839
Sun LJ, Lyu GY, Feng SH, Huang XK (2021) Beyond missing: weakly-supervised multi-label learning with incomplete and noisy labels. Appl Intell 51(3):1552–1564
Sun L, Li MM, Ding WP, Zhang E, Mu XX, Xu JC (2022) AFNFS: adaptive fuzzy neighborhood-based feature selection with adaptive synthetic over-sampling for imbalanced data. Inform Sci 612:724–744. https://doi.org/10.1016/j.ins.2022.08.118
Braytee A, Liu W, Anaissi A, Kennedy PJ (2019) Correlated multi-label classification with incomplete label space and class imbalance. ACM Trans Intell Syst Technol (TIST) 10(5):1–26
Han M, Zhang H (2022) Multiple kernel learning for label relation and class imbalance in multi-label learning. Inform Sci 613:344–356
Bucak SS, Jin R, Jain AK (2011) Multi-label learning with incomplete class assignments. In: CVPR 2011. IEEE, pp 2801–2808
Kong XN, Wu ZM, Li LJ, Zhang RF, Yu PS, Wu H, Fan W (2014) Large-scale multi-label learning with incomplete label assignments. In: Proceedings of the 2014 SIAM international conference on data mining. SIAM, pp 920–928
He Z-F, Yang M, Gao Y, Liu H-D, Yin Y (2019) Joint multi-label classification and label correlations with missing labels and feature selection. Knowl-Based Syst 163:145–158
Wu BY, Jia F, Liu W, Ghanem B, Lyu SW (2018) Multi-label learning with missing labels using mixed dependency graphs. Int J Comput Vis 126(8):875–896
Xu M, Jin R, Zhou ZH (2013) Speedup matrix completion with side information: application to multi-label learning. Adv Neural Inf Process Syst 26:2301–2309
Liu B, Li Y, Xu Z (2018) Manifold regularized matrix completion for multi-label learning with ADMM. Neural Netw 101:57–67
Tan QY, Yu GX, Domeniconi C, Wang J, Zhang ZL (2018) Multi-view weak-label learning based on matrix completion. In: Proceedings of the 2018 SIAM international conference on data mining, pp 450–458
Wang R, Ye S, Li K, Kwong S (2021) Bayesian network based label correlation analysis for multi-label classifier chain. Inform Sci 554:256–275
Vasisht D, Damianou A, Varma M, Kapoor A (2014) Active learning for sparse Bayesian multilabel classification. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, pp 472–481
Li X, Zhao FP, Guo YH (2015) Conditional restricted Boltzmann machines for multi-label learning with incomplete labels. In: Artificial intelligence and statistics. PMLR, pp 635–643
Xie MK, Huang SJ (2018) Partial multi-label learning. In: Proceedings of the AAAI conference on artificial intelligence, pp 6454–6461
Yu GX, Chen X, Domeniconi C, Wang J, Li Z, Zhang ZL, Wu XD (2018) Feature-induced partial multi-label learning. In: 2018 IEEE international conference on data mining (ICDM), pp 1398–1403
Xie MK, Huang SJ (2021) Partial multi-label learning with noisy label identification. IEEE Trans Pattern Anal Mach Intell 44(7):3676–3687. https://doi.org/10.1109/TPAMI.2021.3059290
Li ZW, Lyu GY, Feng SH (2020) Partial multi-label learning via multi-subspace representation. In: IJCAI, pp 2612–2618
Zhang ML, Fang JP (2020) Partial multi-label learning via credible label elicitation. IEEE Trans Pattern Anal Mach Intell 43(10):3587–3599
Liu WW, Wang HB, Shen XB, Tsang I (2021) The emerging trends of multi-label learning. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2021.3119334
Weng W, Lin YJ, Wu SX, Li YW, Kang Y (2018) Multi-label learning based on label-specific features and local pairwise label correlation. Neurocomputing 273:385–394
Dembczynski K, Jachnik A, Kotlowski W, Waegeman W, Hüllermeier E (2013) Optimizing the F-measure in multi-label classification: plug-in rule approach versus structured loss minimization. In: International conference on machine learning. PMLR, pp 1130–1138
Guo HX, Li YJ, Shang J, Gu MY, Huang YY, Gong B (2017) Learning from class-imbalanced data: review of methods and applications. Expert Syst Appl 73:220–239
Chen K, Lu BL, Kwok JT (2006) Efficient classification of multi-label and imbalanced data using min-max modular classifiers. In: The 2006 IEEE international joint conference on neural network proceedings. IEEE, pp 1770–1775
Wu BY, Lyu SW, Ghanem B (2016) Constrained submodular minimization for missing labels and class imbalance in multi-label learning. In: Proceedings of the AAAI conference on artificial intelligence, vol 30
Zhang ML, Li YK, Yang H, Liu XY (2020) Towards class-imbalance aware multi-label learning. IEEE Trans Cybern. https://doi.org/10.1109/TCYB.2020.3027509
Xu LL, Wang Z, Shen ZF, Wang YB, Chen EH (2014) Learning low-rank label correlations for multi-label classification with missing labels. In: 2014 IEEE international conference on data mining. IEEE, pp 1067–1072
Wang J, Jebara T, Chang S-F (2008) Graph transduction via alternating minimization. In: Proceedings of the 25th international conference on machine learning, pp 1144–1151
Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2010) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122
Lin ZC, Liu RS, Su ZX (2011) Linearized alternating direction method with adaptive penalty for low-rank representation. Adv Neural Inf Process Syst 24:612–620
Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imag Sci 2(1):183–202
Wang L, Hu JF, Chen CZ (2014) On accelerated singular value thresholding algorithm for matrix completion. Appl Math 5(21):3445
Charte F, Rivera AJ, del Jesus MJ, Herrera F (2015) Addressing imbalance in multilabel classification: measures and random resampling algorithms. Neurocomputing 163:3–16
Hosseini Akbarnejad A, Soleymani Baghshah M (2019) An efficient large-scale semi-supervised multi-label classifier capable of handling missing labels. IEEE Trans Knowl Data Eng 31(2):229–242
Ma ZC, Chen SC (2021) Expand globally, shrink locally: discriminant multi-label learning with missing labels. Pattern Recognit 111:107675
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grants 62076221 and 61976194.
Author information
Contributions
Xiaowan Ji: Methodology, Original draft preparation, Investigation, Software, Validation, Review and editing.
Anhui Tan: Conceptualization, Software, Supervision, Review and editing.
Wei-Zhi Wu: Resources, Review and editing.
Shenming Gu: Conceptualization, Supervision, Formal analysis.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ji, X., Tan, A., Wu, WZ. et al. Multi-label classification with weak labels by learning label correlation and label regularization. Appl Intell 53, 20110–20133 (2023). https://doi.org/10.1007/s10489-023-04562-z
DOI: https://doi.org/10.1007/s10489-023-04562-z