
An approach for feature selection using local searching and global optimization techniques

  • New Trends in data pre-processing methods for signal and image classification
  • Published in Neural Computing and Applications

Abstract

Classification problems such as gene expression array analysis, text processing of Internet documents, combinatorial chemistry, software defect prediction and image retrieval involve datasets with tens or hundreds of thousands of features. Many of these features are irrelevant or redundant: they degrade the accuracy and increase the computation time of a learning algorithm, and they can lead to overfitting. Selecting relevant, nonredundant features is therefore an important preprocessing step for any classification problem. Global optimization techniques can converge to a solution quickly, but they begin by initializing a population randomly, and the choice of that initial population is an important step. In this paper, local search algorithms are first used to generate a subset of relevant, nonredundant features; a global optimization algorithm is then applied to this subset, which mitigates, to some extent, the usual limitations of global optimization for feature selection, namely inconsistent classification results and very high time complexity. Computation time and classification accuracy are improved by feeding the feature set obtained from sequential backward selection and mutual information maximization into a global optimization technique (genetic algorithm, differential evolution or particle swarm optimization). The computation time of these global optimization techniques is further reduced by using the variance of the population fitness as a stopping criterion. The proposed approach has been tested on the publicly available Sonar, Wdbc and German datasets.
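The two-stage pipeline the abstract outlines, a local search (sequential backward selection) whose output seeds a global optimizer that halts on a variance-based stopping criterion, can be sketched roughly as follows. This is an illustrative sketch, not the paper's implementation: the toy fitness function, the choice of relevant features {0, 3}, and the tolerance value are all hypothetical.

```python
# Illustrative sketch (not the authors' code): sequential backward
# selection (SBS) plus a variance-based stopping rule of the kind the
# abstract describes for the global optimization stage.

def sbs(features, fitness):
    """Greedily drop one feature at a time while fitness does not drop."""
    selected = list(features)
    best = fitness(selected)
    improved = True
    while improved and len(selected) > 1:
        improved = False
        for f in list(selected):
            candidate = [x for x in selected if x != f]
            score = fitness(candidate)
            if score >= best:  # removal is no worse: accept it
                best, selected, improved = score, candidate, True
                break
    return selected, best

def variance_stop(scores, tol=1e-3):
    """Stop the optimizer when population-fitness variance falls below tol."""
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores) / len(scores) < tol

# Toy fitness: features 0 and 3 are the relevant ones; every selected
# feature pays a small redundancy penalty (purely hypothetical numbers).
def fitness(subset):
    return len({0, 3} & set(subset)) - 0.1 * len(subset)

subset, score = sbs(range(6), fitness)
print(subset, round(score, 2))  # the irrelevant features are pruned away
```

In the paper's scheme, the subset returned by a local search like this would initialize the population of the genetic algorithm, differential evolution or particle swarm optimizer, and `variance_stop` would end the run once the population's fitness values have converged.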




Author information

Corresponding author

Correspondence to Manpreet Kaur.

Ethics declarations

Conflict of interest

The authors have no financial or personal relationships with other people or organizations that could inappropriately influence this work.


About this article

Cite this article

Tiwari, S., Singh, B. & Kaur, M. An approach for feature selection using local searching and global optimization techniques. Neural Comput & Applic 28, 2915–2930 (2017). https://doi.org/10.1007/s00521-017-2959-y
