Abstract
Feature selection remains one of the most important steps in making a model usable, for both supervised and unsupervised classification. For a dataset with n features, the number of possible feature subsets is 2^n, so even a moderate n causes a combinatorial explosion in the search space; n = 30 already gives more than a billion subsets. Feature selection is an NP-hard problem; hence finding the optimal solution is generally not feasible. Instead, various intelligent and metaheuristic search techniques are typically employed. Hill climbing is arguably the simplest of these. It has many variants based on (a) the trade-off between greediness and randomness, (b) the direction of the search, and (c) the size of the neighborhood. Consequently, it may not be trivial for the practitioner to choose a suitable method for the task at hand. In this paper, we address this issue in the context of feature selection. Descriptions of the methods are followed by an extensive empirical study over 20 publicly available datasets. Finally, a comparison with a genetic algorithm shows the effectiveness of hill climbing methods in the context of feature selection.
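To make the idea concrete, here is a minimal sketch of one hill climbing variant for filter-based feature selection: steepest ascent with a neighborhood of single-feature flips, started from the empty subset. The scoring function is an assumption, not the chapter's own criterion: it uses a correlation-based merit in the style of CFS, which rewards features correlated with the target and penalises features correlated with each other. The names `pearson`, `merit`, and `hill_climb`, and the toy data, are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Absolute Pearson correlation of two equal-length sequences (0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy / math.sqrt(sxx * syy))


def merit(subset, columns, target):
    """Assumed CFS-style filter score: relevance to the target, discounted by redundancy."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(pearson(columns[i], target) for i in subset) / k
    pairs = list(combinations(sorted(subset), 2))
    r_ff = sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def hill_climb(columns, target, start=frozenset()):
    """Steepest-ascent hill climbing over feature subsets with a one-flip neighborhood."""
    current = frozenset(start)
    current_score = merit(current, columns, target)
    while True:
        # Each neighbor adds or removes exactly one feature from the current subset.
        neighbors = [current ^ {i} for i in range(len(columns))]
        best = max(neighbors, key=lambda s: merit(s, columns, target))
        best_score = merit(best, columns, target)
        if best_score <= current_score:
            return current, current_score  # no neighbor improves: local optimum
        current, current_score = best, best_score


# Toy data: feature 0 tracks the target, feature 1 is constant, feature 2 alternates.
target = [1, 2, 3, 4, 5, 6, 7, 8]
columns = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [5, 5, 5, 5, 5, 5, 5, 5],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
selected, score = hill_climb(columns, target)
print(sorted(selected), round(score, 3))
```

On this toy data the search keeps only feature 0 and then stops, because neither adding another feature nor removing feature 0 raises the merit. The three design axes named in the abstract map onto this sketch roughly as follows: replacing `max` with a random improving neighbor trades greediness for randomness, changing `start` to the full feature set reverses the search direction, and allowing two or more flips per step enlarges the neighborhood.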
Keywords
- Hill climbing
- Filter
- Feature selection
- Heuristic
- Classification
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Goswami, S., Chakraborty, S., Guha, P., Tarafdar, A., Kedia, A. (2019). Filter-Based Feature Selection Methods Using Hill Climbing Approach. In: Li, X., Wong, KC. (eds) Natural Computing for Unsupervised Learning. Unsupervised and Semi-Supervised Learning. Springer, Cham. https://doi.org/10.1007/978-3-319-98566-4_10
DOI: https://doi.org/10.1007/978-3-319-98566-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98565-7
Online ISBN: 978-3-319-98566-4
eBook Packages: Engineering, Engineering (R0)