Multi-label feature selection via spectral clustering-based label enhancement and manifold distribution consistency

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Multi-label feature selection improves the performance and efficiency of subsequent learning tasks by selecting the important features in multi-label data. When handling multiple labels, however, many approaches group the labels to support label fusion but treat all label groups equally, ignoring that the groups differ in importance. Moreover, when modeling the relationship between features and labels, many multi-label feature selection methods use manifold learning to fit features to labels linearly but do not fit the spatial distribution of the feature space to that of the label space. Motivated by these two limitations, this paper integrates label distribution learning and spectral clustering to evaluate the significance of each label group and construct an enhanced label space, which is then aligned with the feature space through manifold distribution consistency for multi-label feature selection. First, we propose a hypothetical model of the relationship among labels, in which subordinate labels cluster around a central core label. On this basis, we employ spectral clustering integrated with density peaks to generate distinct label clusters, and combine it with label distribution learning to assess the significance of each cluster. Then, we design a manifold distribution consistency evaluation that quantifies the structural disparity between the feature space and the enhanced label space produced by the spectral clustering-based label enhancement strategy, so as to obtain a low-dimensional feature space and the optimal feature subset. Finally, experimental results on several datasets from diverse domains show that the proposed multi-label feature selection algorithm outperforms five other algorithms.
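
The core-label hypothesis described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm, whose details are not given on this page. It assumes a plain density-peak heuristic in the style of Rodriguez and Laio (no spectral embedding), Jaccard distance between binary label columns, and each group's share of positive label entries as a simple stand-in for the significance that the paper derives from label distribution learning. The names `density_peak_label_groups`, `group_weights`, `cutoff`, and `n_groups` are hypothetical and chosen only for illustration.

```python
def jaccard_distance(a, b):
    """Distance between two binary label columns: 1 - |a AND b| / |a OR b|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 1.0


def density_peak_label_groups(Y, cutoff, n_groups):
    """Group the label columns of Y (instances x labels) around core labels.

    Illustrative assumption: cores are the labels with the largest
    rho * delta score, as in density-peak clustering.
    """
    n = len(Y[0])
    cols = [[row[j] for row in Y] for j in range(n)]
    dist = [[jaccard_distance(cols[i], cols[j]) for j in range(n)]
            for i in range(n)]
    # Local density: number of other labels closer than the cutoff.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    # Visit labels from densest to sparsest (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    pos = {i: r for r, i in enumerate(order)}
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest label: delta is its largest distance to any label.
            delta[i] = max(dist[i])
            continue
        # Nearest label among those visited earlier (higher density).
        parent[i] = min(order[:rank], key=lambda j: dist[i][j])
        delta[i] = dist[i][parent[i]]
    # Core labels: highest rho * delta, ties resolved in density order.
    cores = sorted(range(n), key=lambda i: (-rho[i] * delta[i], pos[i]))
    cores = cores[:n_groups]
    group = {c: g for g, c in enumerate(cores)}
    # Subordinate labels inherit the group of their denser neighbor.
    for i in order:
        if i not in group:
            group[i] = group[parent[i]]
    return cores, [group[i] for i in range(n)]


def group_weights(Y, groups, n_groups):
    """Stand-in significance: each group's share of positive label entries."""
    totals = [0] * n_groups
    for row in Y:
        for j, value in enumerate(row):
            totals[groups[j]] += value
    s = sum(totals)
    return [t / s for t in totals] if s else [1.0 / n_groups] * n_groups
```

Calling `density_peak_label_groups(Y, cutoff, n_groups)` on a binary label matrix returns the indices of the core labels and a group index for every label; `group_weights` then gives each group a weight that sums to one across groups. A faithful reproduction would replace both the distance and the weighting with the definitions in the full article.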

Data availability

Data is provided within the manuscript.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Nos. 62266018 and 61966016), the Natural Science Foundation of Jiangxi Province (No. 20232BAB202052), and the Jiangxi Postgraduate Innovation Fund Project (YC2022-s547).

Author information

Contributions

Wenhao Shu: Conceptualization, Formal analysis; Dongtao Cao: Data curation, Software, Writing; Wenbin Qian: Visualization, Review.

Corresponding author

Correspondence to Dongtao Cao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shu, W., Cao, D. & Qian, W. Multi-label feature selection via spectral clustering-based label enhancement and manifold distribution consistency. Int. J. Mach. Learn. & Cyber. (2024). https://doi.org/10.1007/s13042-024-02181-9

  • DOI: https://doi.org/10.1007/s13042-024-02181-9
