A bipartite matching-based feature selection for multi-label learning

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Many real-world datasets are multi-label: each instance carries several class labels, and these labels are correlated rather than independent. Because such data are usually high-dimensional and existing multi-label feature selection methods are not sufficiently precise, a new feature selection method is needed. In this paper, we model multi-label feature selection, for the first time, as a bipartite graph matching problem. The proposed method constructs a bipartite graph, called the Feature-Label Graph (FLG), with features as the left vertices and labels as the right vertices. Each feature is connected to the labels, and the weight of the edge between a feature and a label equals their correlation. The Hungarian algorithm then finds the best matching in the FLG. The features selected in each matching are sorted by weighted correlation distance and added to the ranking vector. To select discriminative features, the proposed method considers both the redundancy among features and the relevance of each feature to the class labels. The results indicate that the proposed method outperforms the compared methods on classification measures.
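
The matching-and-ranking loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the edge weight is taken to be the absolute Pearson correlation, the matched edge weight stands in for the "weighted correlation distance" used to sort each round, and the redundancy term is omitted because the abstract does not define it. All function names (`abs_pearson`, `hungarian_min`, `max_weight_matching`, `rank_features`) are hypothetical.

```python
from math import sqrt


def abs_pearson(x, y):
    """Absolute Pearson correlation of two equal-length sequences (assumed edge weight)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def hungarian_min(cost):
    """Min-cost assignment for an n x m matrix with n <= m; returns (row, col) pairs."""
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)   # dual potentials
    p, way = [0] * (m + 1), [0] * (m + 1)     # p[j] = row matched to column j (1-based)
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the augmenting path back to the virtual column 0
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j]]


def max_weight_matching(w):
    """Maximum-weight matching on w[feature][label]; returns (feature, label) pairs."""
    nf, nl = len(w), len(w[0])
    if nf >= nl:  # put the smaller side (labels) on the rows and negate to maximise
        cost = [[-w[f][l] for f in range(nf)] for l in range(nl)]
        return [(f, l) for l, f in hungarian_min(cost)]
    return hungarian_min([[-w[f][l] for l in range(nl)] for f in range(nf)])


def rank_features(features, labels):
    """Rank features by repeated matching; features and labels are lists of columns."""
    w = [[abs_pearson(f, l) for l in labels] for f in features]
    remaining, ranking = list(range(len(features))), []
    while remaining:
        sub = [w[f] for f in remaining]
        matched = max_weight_matching(sub)
        # Assumption: sort each round by matched edge weight, strongest first.
        picked = sorted(((remaining[i], sub[i][l]) for i, l in matched),
                        key=lambda t: -t[1])
        ranking.extend(f for f, _ in picked)
        chosen = {f for f, _ in picked}
        remaining = [f for f in remaining if f not in chosen]
    return ranking


# Toy data: 4 feature columns and 2 label columns over 6 instances.
toy_labels = [[0, 0, 0, 1, 1, 1], [0, 1, 0, 1, 0, 1]]
toy_features = [toy_labels[0][:], toy_labels[1][:], [1] * 6, [0, 0, 1, 1, 1, 1]]
print(rank_features(toy_features, toy_labels))
```

Each round matches at most as many features as there are labels, so the loop repeats on the unmatched features until every feature has a rank. A library routine for the assignment problem could replace `hungarian_min`; the pure-Python version is used here only to keep the sketch dependency-free.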

Notes

  1. https://mulan.sourceforge.net/datasets.html

Author information

Corresponding author

Correspondence to Mohammad Bagher Dowlatshahi.

About this article

Cite this article

Hashemi, A., Dowlatshahi, M.B. & Nezamabadi-Pour, H. A bipartite matching-based feature selection for multi-label learning. Int. J. Mach. Learn. & Cyber. 12, 459–475 (2021). https://doi.org/10.1007/s13042-020-01180-w

  • DOI: https://doi.org/10.1007/s13042-020-01180-w
