Abstract
Multi-label learning is widely applied in machine learning and data mining. The purpose of feature selection is to select an approximately optimal feature subset that characterizes the original feature space. As with single-label data, feature selection is an important preprocessing step for enhancing the performance of multi-label classification models. In this paper, we propose a multi-label feature selection approach with Pareto optimality for continuous data, called MLFSPO. It maps multi-label features into a high-dimensional space and evaluates the correlation between features and labels using the Hilbert-Schmidt Independence Criterion (HSIC). The feature subset is then obtained by combining Pareto optimization with a feature ordering criterion and label weighting. Finally, extensive experimental results on publicly available data sets demonstrate the effectiveness of the proposed algorithm on multi-label tasks.
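The HSIC dependence measure mentioned in the abstract can be sketched in a few lines. The following is a minimal illustration (not the authors' implementation) of the standard empirical HSIC estimator: kernel matrices over the feature samples and label samples are centered, and their trace product measures statistical dependence. The Gaussian kernel and the bandwidth `sigma` are illustrative choices.

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix for rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # pairwise squared distances
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    """Empirical HSIC between samples X (n x d_x) and Y (n x d_y).

    Computes trace(K H L H) / (n - 1)^2, where K and L are kernel
    matrices on X and Y, and H is the centering matrix.
    """
    n = X.shape[0]
    K = rbf_kernel(X, sigma)
    L = rbf_kernel(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

A feature strongly associated with the label matrix yields a larger HSIC value than an independent one, which is the basis for ranking features by their dependence on the (weighted) labels.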
Acknowledgments
We are greatly indebted to colleagues at the Data and Knowledge Engineering Center, School of Information Technology and Electrical Engineering, the University of Queensland, Australia. We thank Prof. Xiaofang Zhou, Prof. Xue Li, Dr. Shuo Shang and Dr. Kai Zheng for their valuable suggestions and many interesting discussions.
This work is partly supported by the National Natural Science Foundation of China under Grants (Nos. 60473125 and 61701213), the Science Foundation of China University of Petroleum-Beijing at Karamay under Grant (No. RCYJ2016B-03-001), the Karamay Science & Technology Research Project (No. 2020CGZH0009), the Natural Science Foundation of Fujian Province (Nos. 2018J01546 and 2019J01748), and the Research Fund for the Educational Department of Fujian Province (No. JAT190392).
Cite this article
Li, G., Li, Y., Zheng, Y. et al. A novel feature selection approach with Pareto optimality for multi-label data. Appl Intell 51, 7794–7811 (2021). https://doi.org/10.1007/s10489-021-02228-2