A Simple and Convex Formulation for Multi-label Feature Selection

  • Conference paper
  • First Online:
Computer Supported Cooperative Work and Social Computing (ChineseCSCW 2019)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1042))

Abstract

In recent years, multi-label learning has received extensive attention in many fields. Multi-label data sets are typically high-dimensional and contain considerable noise as well as irrelevant and redundant features. This not only incurs large storage and time overhead, but also causes the curse of dimensionality, making multi-label learning tasks very difficult. Effective multi-label feature selection is therefore an important research topic in multi-label learning. However, most existing methods are adapted from single-label feature selection and rely on heuristic search strategies that easily fall into local optima, and their time complexity has long been their biggest weakness. Motivated by these observations, this paper proposes a fast and effective multi-label feature selection method that replaces the traditional search strategy with an optimization strategy, transforming the search problem into a convex optimization problem. As a result, the running time of traditional methods is improved by two to three orders of magnitude. Finally, experimental results on four data sets under five evaluation metrics show that our method outperforms many popular feature selection methods.
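The paper itself does not include code; the sketch below only illustrates the general idea of casting multi-label feature selection as a convex program instead of a combinatorial search. It assumes a mutual-information-style objective (a relevance vector r from feature-label mutual information and a redundancy matrix Q from feature-feature mutual information) minimized over the probability simplex by projected gradient descent; the function names and the concrete objective are illustrative assumptions, not the authors' exact formulation.

# Hedged sketch: convex relaxation of multi-label feature selection.
# Objective: min_w  w^T Q w - r^T w   subject to  w >= 0, sum(w) = 1,
# where r holds feature-label relevance and Q feature-feature redundancy,
# both estimated with mutual information on discretized features.
import numpy as np
from sklearn.metrics import mutual_info_score


def relevance_redundancy(X, Y):
    """Relevance vector r and redundancy matrix Q from mutual information.

    X: (n_samples, n_features) discretized feature matrix.
    Y: (n_samples, n_labels) binary label matrix.
    """
    d = X.shape[1]
    r = np.array([
        sum(mutual_info_score(X[:, j], Y[:, k]) for k in range(Y.shape[1]))
        for j in range(d)
    ])
    Q = np.array([[mutual_info_score(X[:, i], X[:, j]) for j in range(d)]
                  for i in range(d)])
    # Symmetrize and add a small ridge so the quadratic term is positive
    # semi-definite, keeping the problem convex.
    Q = 0.5 * (Q + Q.T) + 1e-6 * np.eye(d)
    return r, Q


def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u - css / np.arange(1, len(v) + 1) > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)


def select_features(X, Y, k=10, lr=0.01, iters=500):
    """Solve the convex program by projected gradient; return top-k feature indices."""
    r, Q = relevance_redundancy(X, Y)
    w = np.full(X.shape[1], 1.0 / X.shape[1])
    for _ in range(iters):
        grad = 2.0 * Q @ w - r          # gradient of w^T Q w - r^T w
        w = project_simplex(w - lr * grad)
    return np.argsort(w)[::-1][:k]

Because the feasible set is a simplex and the objective is a convex quadratic, projected gradient converges to a global optimum, which is what removes the local-optima problem of heuristic search strategies.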



Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. 61876159, No. 61806172, No. 61572409, No. U1705286 & No. 61571188), the National Key Research and Development Program of China (No. 2018YFC0831402), the Fujian Province 2011 Collaborative Innovation Center of TCM Health Management, and the Collaborative Innovation Center of Chinese Oolong Tea Industry-Collaborative Innovation Center (2011) of Fujian Province.

Author information


Corresponding author

Correspondence to Shaozi Li.


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Lin, P., Sun, Z., Zhang, J., Luo, Z., Li, S. (2019). A Simple and Convex Formulation for Multi-label Feature Selection. In: Sun, Y., Lu, T., Yu, Z., Fan, H., Gao, L. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2019. Communications in Computer and Information Science, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-15-1377-0_42


  • DOI: https://doi.org/10.1007/978-981-15-1377-0_42

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1376-3

  • Online ISBN: 978-981-15-1377-0

  • eBook Packages: Computer Science, Computer Science (R0)
