Abstract
In multi-label learning, feature selection is an indispensable preprocessing step that alleviates the negative effects of high dimensionality. A number of effective information-theoretic feature selection algorithms have been proposed for multi-label learning. However, these existing algorithms assume that the label space of the multi-label training data is complete. In practice, this assumption does not always hold, owing to ambiguity among class labels or the cost of fully annotating instances. In this paper, we first define the new concepts of multi-label information entropy and multi-label mutual information. We then define feature redundancy, feature independence, and feature interaction, where feature interaction is used to select valuable features that might otherwise be ignored because of the incomplete label space. On this basis, we propose a multi-label feature selection method for data with missing labels. Finally, extensive experiments on eight publicly available data sets verify the effectiveness of the proposed algorithm in comparison with state-of-the-art methods.
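To make the information-theoretic setting concrete, the following is a minimal illustrative sketch (not the paper's proposed algorithm): it ranks discrete features by their summed mutual information with each label column, a common baseline criterion in mutual-information-based multi-label feature selection. The function names `mutual_information` and `rank_features` are illustrative choices, not from the paper.

```python
# Illustrative sketch: mutual-information-based feature ranking for
# multi-label data. Each feature is scored by the sum of its mutual
# information with every label; features are returned best-first.
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """I(X; Y) in bits for two discrete sequences of equal length."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def rank_features(X, Y):
    """Rank feature indices by summed MI between each column of X
    (features) and each column of Y (labels)."""
    n_feat, n_lab = len(X[0]), len(Y[0])
    col = lambda M, j: [row[j] for row in M]
    scores = [
        sum(mutual_information(col(X, f), col(Y, l)) for l in range(n_lab))
        for f in range(n_feat)
    ]
    return sorted(range(n_feat), key=lambda f: -scores[f])

# Toy data: feature 0 determines both labels exactly; feature 1 is noise.
X = [[0, 1], [1, 0], [0, 0], [1, 1]]
Y = [[0, 1], [1, 0], [0, 1], [1, 0]]
print(rank_features(X, Y))  # → [0, 1]: the informative feature ranks first
```

Note that this simple sum-of-relevance criterion ignores feature redundancy and feature interaction, which is precisely the gap the paper's definitions are designed to address.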
Acknowledgments
The authors would like to thank the anonymous reviewers and the editor for their constructive and valuable comments. This work was supported by grants from the National Natural Science Foundation of China (No. 61672272), the Natural Science Foundation of Fujian Province (Nos. 2018J01548, 2016J01314, and 2018J01547), and the Department of Education of Fujian Province (No. JT180318).
Cite this article
Wang, C., Lin, Y. & Liu, J. Feature selection for multi-label learning with missing labels. Appl Intell 49, 3027–3042 (2019). https://doi.org/10.1007/s10489-019-01431-6