Abstract
In recent years, multi-label learning has received extensive attention in many fields. Multi-label data sets are high-dimensional, yet they contain a large amount of noise as well as irrelevant and redundant features. This not only leads to large storage and time overhead but also brings a serious curse-of-dimensionality problem, which makes multi-label learning tasks very difficult. How to select multi-label features effectively is therefore an important research topic in multi-label learning. However, most current methods are adapted from single-label feature selection and rely on heuristic search strategies that easily fall into local optima. Time complexity has always been the biggest problem of such methods. Based on these considerations, this paper proposes a fast and effective multi-label feature selection method that replaces the search strategy with an optimization strategy, transforming the search problem into a convex optimization problem. As a result, running time improves on that of traditional methods by two to three orders of magnitude. Finally, experimental results on four data sets under five evaluation metrics show that our method is superior to many popular feature selection methods.
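The abstract does not state the formulation itself, so the following is only a minimal sketch of the general idea of scoring features by solving a convex problem instead of running a greedy search. It is not the method proposed in the paper. Every modelling choice here is an assumption made for illustration: the redundancy term Q (the feature-feature correlation matrix), the relevance term r (the mean absolute feature-label correlation), the simplex constraint, and the function names project_to_simplex and convex_feature_scores.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort in descending order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)


def convex_feature_scores(X, Y, n_iter=500):
    """Score features by minimising 0.5 * w'Qw - r'w over the simplex.

    Q is a correlation matrix and hence positive semidefinite, so the
    objective is convex. Both Q and r are illustrative assumptions,
    not the terms used in the paper.
    """
    n = X.shape[0]
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    Ys = (Y - Y.mean(axis=0)) / (Y.std(axis=0) + 1e-12)
    Q = Xs.T @ Xs / n                         # feature-feature redundancy
    r = np.abs(Xs.T @ Ys / n).mean(axis=1)    # feature-label relevance
    # Step size 1/L, where L is the largest eigenvalue of Q.
    step = 1.0 / max(np.linalg.eigvalsh(Q)[-1], 1e-12)
    w = np.full(X.shape[1], 1.0 / X.shape[1])
    for _ in range(n_iter):                   # projected gradient descent
        w = project_to_simplex(w - step * (Q @ w - r))
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 6))             # synthetic features
    Y = (X[:, :2] > 0).astype(float)          # two labels driven by features 0 and 1
    w = convex_feature_scores(X, Y)
    print(np.argsort(w)[::-1])                # feature indices ranked by score
```

Because the problem is convex, projected gradient descent approaches a global minimiser from any starting point, and all feature scores are obtained jointly rather than one feature at a time. Features are then ranked by their weight in w.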
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Nos. 61876159, 61806172, 61572409, U1705286, and 61571188), the National Key Research and Development Program of China (No. 2018YFC0831402), the Fujian Province 2011 Collaborative Innovation Center of TCM Health Management, and the Collaborative Innovation Center of Chinese Oolong Tea Industry-Collaborative Innovation Center (2011) of Fujian Province.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Lin, P., Sun, Z., Zhang, J., Luo, Z., Li, S. (2019). A Simple and Convex Formulation for Multi-label Feature Selection. In: Sun, Y., Lu, T., Yu, Z., Fan, H., Gao, L. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2019. Communications in Computer and Information Science, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-15-1377-0_42
DOI: https://doi.org/10.1007/978-981-15-1377-0_42
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1376-3
Online ISBN: 978-981-15-1377-0
eBook Packages: Computer Science, Computer Science (R0)