A Simple and Convex Formulation for Multi-label Feature Selection

  • Conference paper
  • First Online:
Computer Supported Cooperative Work and Social Computing (ChineseCSCW 2019)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1042))

Abstract

In recent years, multi-label learning has received extensive attention in many fields. Multi-label data sets are typically high-dimensional and contain considerable noise as well as irrelevant and redundant features. This not only incurs large storage and time overhead, but also causes the curse of dimensionality, making multi-label learning tasks very difficult. Effective multi-label feature selection is therefore an important research topic in multi-label learning. However, most existing methods are adapted from single-label feature selection and rely on heuristic search strategies that easily fall into local optima, and their time complexity has long been their biggest weakness. Motivated by these observations, this paper proposes a fast and effective multi-label feature selection method that replaces the traditional search strategy with an optimization strategy, transforming the search problem into a convex optimization problem. As a result, the running time of traditional methods is improved by two to three orders of magnitude. Finally, experimental results on four data sets under five evaluation metrics show that our method outperforms many popular feature selection methods.
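The paper itself does not include code; the sketch below only illustrates the general idea of casting multi-label feature selection as a convex program instead of a combinatorial search. It assumes a mutual-information-style objective (a relevance vector r from feature-label mutual information and a redundancy matrix Q from feature-feature mutual information) minimized over the probability simplex by projected gradient descent; the function names and the concrete objective are illustrative assumptions, not the authors' exact formulation.

# Hedged sketch: convex relaxation of multi-label feature selection.
# Objective: min_w  w^T Q w - r^T w   subject to  w >= 0, sum(w) = 1,
# where r holds feature-label relevance and Q feature-feature redundancy,
# both estimated with mutual information on discretized features.
import numpy as np
from sklearn.metrics import mutual_info_score


def relevance_redundancy(X, Y):
    """Relevance vector r and redundancy matrix Q from mutual information.

    X: (n_samples, n_features) discretized feature matrix.
    Y: (n_samples, n_labels) binary label matrix.
    """
    d = X.shape[1]
    r = np.array([
        sum(mutual_info_score(X[:, j], Y[:, k]) for k in range(Y.shape[1]))
        for j in range(d)
    ])
    Q = np.array([[mutual_info_score(X[:, i], X[:, j]) for j in range(d)]
                  for i in range(d)])
    # Symmetrize and add a small ridge so the quadratic term is positive
    # semi-definite, keeping the problem convex.
    Q = 0.5 * (Q + Q.T) + 1e-6 * np.eye(d)
    return r, Q


def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u - css / np.arange(1, len(v) + 1) > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)


def select_features(X, Y, k=10, lr=0.01, iters=500):
    """Solve the convex program by projected gradient; return top-k feature indices."""
    r, Q = relevance_redundancy(X, Y)
    w = np.full(X.shape[1], 1.0 / X.shape[1])
    for _ in range(iters):
        grad = 2.0 * Q @ w - r          # gradient of w^T Q w - r^T w
        w = project_simplex(w - lr * grad)
    return np.argsort(w)[::-1][:k]

Because the feasible set is a simplex and the objective is a convex quadratic, projected gradient converges to a global optimum, which is what removes the local-optima problem of heuristic search strategies.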



Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. 61876159, No. 61806172, No. 61572409, No. U1705286 & No. 61571188), the National Key Research and Development Program of China (No. 2018YFC0831402), the Fujian Province 2011 Collaborative Innovation Center of TCM Health Management, and the Collaborative Innovation Center of Chinese Oolong Tea Industry-Collaborative Innovation Center (2011) of Fujian Province.

Author information


Corresponding author

Correspondence to Shaozi Li.


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Lin, P., Sun, Z., Zhang, J., Luo, Z., Li, S. (2019). A Simple and Convex Formulation for Multi-label Feature Selection. In: Sun, Y., Lu, T., Yu, Z., Fan, H., Gao, L. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2019. Communications in Computer and Information Science, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-15-1377-0_42


  • DOI: https://doi.org/10.1007/978-981-15-1377-0_42

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1376-3

  • Online ISBN: 978-981-15-1377-0

  • eBook Packages: Computer Science, Computer Science (R0)
