Improved PSO for Feature Selection on High-Dimensional Datasets

  • Binh Tran
  • Bing Xue
  • Mengjie Zhang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8886)

Abstract

Classification on high-dimensional (i.e. thousands of dimensions) data typically requires feature selection (FS) as a pre-processing step to reduce the dimensionality. However, FS is a challenging task even on datasets with hundreds of features. This paper proposes a new particle swarm optimisation (PSO) based FS approach to classification problems with thousands or tens of thousands of features. The proposed algorithm is examined and compared with three other PSO based methods on five high-dimensional problems of varying difficulty. The results show that the proposed algorithm can successfully select a much smaller number of features and significantly increase the classification accuracy over using all features. The proposed algorithm outperforms the other three PSO methods in terms of both the classification performance and the number of features. Meanwhile, the proposed algorithm is computationally more efficient than the other three PSO methods because it selects a smaller number of features and employs a new fitness evaluation strategy.
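For orientation, a generic PSO-based wrapper for feature selection can be sketched as below. This is not the algorithm proposed in the paper, whose details (including its new fitness evaluation strategy) are not given in the abstract. Everything here is an illustrative assumption: the function names, the 0.5 selection threshold, clamping positions to [0, 1], the leave-one-out 1-NN evaluator, and the weighting `alpha` between accuracy and subset size.

```python
import random

def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the selected features."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:  # an empty feature subset cannot classify anything
        return 0.0
    correct = 0
    for i in range(len(X)):
        best_d, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in idx)
            if d < best_d:
                best_d, best_label = d, y[k]
        correct += int(best_label == y[i])
    return correct / len(X)

def fitness(X, y, position, threshold=0.5, alpha=0.9):
    """Assumed fitness: weighted sum of accuracy and the fraction of features dropped."""
    mask = [p > threshold for p in position]
    acc = loo_1nn_accuracy(X, y, mask)
    ratio = sum(mask) / len(mask)
    return alpha * acc + (1 - alpha) * (1 - ratio)

def pso_feature_selection(X, y, n_particles=10, n_iters=20,
                          w=0.7298, c1=1.49618, c2=1.49618,
                          threshold=0.5, seed=0):
    """Standard global-best PSO over continuous positions; a feature is selected if its
    position value exceeds `threshold`. Returns (boolean mask, best fitness)."""
    rng = random.Random(seed)
    dim = len(X[0])
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(X, y, p, threshold) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + cognitive pull (personal best) + social pull (global best)
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # clamp to [0, 1] (an assumption of this sketch)
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            f = fitness(X, y, pos[i], threshold)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return [p > threshold for p in gbest], gbest_fit
```

Calling `pso_feature_selection(X, y)` with `X` as a list of numeric feature rows and `y` as class labels returns a boolean mask over the features and the fitness of that subset. On datasets with thousands of features the wrapper evaluation dominates the cost, which is why the number of selected features and the fitness evaluation strategy matter for runtime.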

Keywords

Particle swarm optimisation; Feature selection; Classification; High-dimensional data

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Binh Tran (1)
  • Bing Xue (1)
  • Mengjie Zhang (1)
  1. Victoria University of Wellington, Wellington, New Zealand
