Abstract
Multi-label learning is widely applied in machine learning and data mining. The purpose of feature selection is to select an approximately optimal feature subset that characterizes the original feature space. As with single-label data, feature selection is an important preprocessing step for enhancing the performance of multi-label classification models. In this paper, we propose a multi-label feature selection approach with Pareto optimality for continuous data, called MLFSPO. It maps multi-label features into a high-dimensional space and evaluates the correlation between features and labels using the Hilbert-Schmidt Independence Criterion (HSIC). The feature subset is then obtained by combining Pareto optimization with a feature ordering criterion and label weighting. Finally, extensive experimental results on publicly available data sets demonstrate the effectiveness of the proposed algorithm on multi-label tasks.
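The HSIC dependence measure mentioned in the abstract can be sketched in a few lines. The following is a minimal illustration (not the authors' implementation) of the standard empirical HSIC estimator: kernel matrices over the feature samples and label samples are centered, and their trace product measures statistical dependence. The Gaussian kernel and the bandwidth `sigma` are illustrative choices.

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix for rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # pairwise squared distances
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    """Empirical HSIC between samples X (n x d_x) and Y (n x d_y).

    Computes trace(K H L H) / (n - 1)^2, where K and L are kernel
    matrices on X and Y, and H is the centering matrix.
    """
    n = X.shape[0]
    K = rbf_kernel(X, sigma)
    L = rbf_kernel(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

A feature strongly associated with the label matrix yields a larger HSIC value than an independent one, which is the basis for ranking features by their dependence on the (weighted) labels.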
Acknowledgments
We are greatly indebted to colleagues at the Data and Knowledge Engineering Center, School of Information Technology and Electrical Engineering, the University of Queensland, Australia. We thank Prof. Xiaofang Zhou, Prof. Xue Li, Dr. Shuo Shang and Dr. Kai Zheng for their valuable suggestions and many interesting discussions.
This work is partly supported by the National Natural Science Foundation of China under Grants (Nos. 60473125 and 61701213), the Science Foundation of China University of Petroleum-Beijing at Karamay under Grant (No. RCYJ2016B-03-001), the Karamay Science & Technology Research Project (No. 2020CGZH0009), the Natural Science Foundation of Fujian Province (Nos. 2018J01546 and 2019J01748), and the Research Fund for the Educational Department of Fujian Province (No. JAT190392).
Cite this article
Li, G., Li, Y., Zheng, Y. et al. A novel feature selection approach with Pareto optimality for multi-label data. Appl Intell 51, 7794–7811 (2021). https://doi.org/10.1007/s10489-021-02228-2