Multi-label Feature Selection Using Particle Swarm Optimization: Novel Initialization Mechanisms

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11919)


In standard single-label classification, feature selection is an important but challenging task because of its large and complex search space. Feature selection for multi-label classification is even more challenging, since it must consider not only interactions among features but also interactions among labels. Particle Swarm Optimization (PSO) has been widely applied to feature selection for single-label classification, but its potential for multi-label classification has not been investigated. This work therefore proposes PSO-based multi-label feature selection algorithms to investigate the importance of population initialization in multi-label feature selection. In particular, discriminative information is used to let the swarm start from more promising feature combinations. Results on eight real-world datasets show that the new strategies reduce the number of features and improve classification performance compared with using all features and with standard PSO-based multi-label feature selection.
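The initialization idea described in the abstract can be illustrated with a small sketch: instead of seeding a binary PSO swarm uniformly at random, each feature's inclusion probability is biased by a per-feature discriminative score (e.g. ReliefF-style weights). The function below is a hypothetical minimal sketch of that seeding step under these assumptions, not the paper's exact mechanism.

```python
import random

def weighted_init(num_particles, scores, rng=None):
    """Seed a binary PSO swarm so that features with higher
    discriminative scores are more likely to start selected.

    `scores` is a hypothetical per-feature relevance measure
    (assumed non-negative); this sketches the biased-initialization
    idea only, not the paper's exact method.
    """
    rng = rng or random.Random(42)
    max_s = max(scores) or 1.0
    probs = [s / max_s for s in scores]  # normalize scores to (0, 1]
    # Each bit of a particle: 1 = feature included in the subset.
    return [[1 if rng.random() < p else 0 for p in probs]
            for _ in range(num_particles)]

if __name__ == "__main__":
    scores = [0.9, 0.8, 0.1, 0.05]  # two strong, two weak features
    swarm = weighted_init(30, scores)
    counts = [sum(col) for col in zip(*swarm)]
    print(counts)  # strong features appear in many more initial subsets
```

A swarm seeded this way starts closer to promising regions of the search space, while randomness preserves the diversity PSO needs to explore other feature combinations.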


Keywords: Particle Swarm Optimization · Feature selection · Multi-label classification



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand
