Soft Computing, Volume 18, Issue 7, pp 1373–1382

A hybrid genetic algorithm for feature subset selection in rough set theory

  • Si-Yuan Jing
Methodologies and Application


Rough set theory has proven to be an effective tool for feature subset selection. Current research usually employs hill-climbing as the search strategy for selecting a feature subset. However, hill-climbing methods are inadequate for finding the optimal feature subset, since no heuristic can guarantee optimality. For this reason, many researchers have studied stochastic methods. Because previous work combining genetic algorithms with rough set theory has not shown competitive performance compared with some other stochastic methods, we propose a hybrid genetic algorithm for feature subset selection, called HGARSTAR. Unlike previous work, HGARSTAR embeds a novel local search operation based on rough set theory to fine-tune the search, which aims to enhance the GA’s intensification ability. Moreover, all candidates (i.e. feature subsets) generated during the evolutionary process are forced to include the core features in order to accelerate convergence. To verify the proposed algorithm, experiments are performed on several standard UCI datasets. The experimental results demonstrate the efficiency of the algorithm.
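The abstract does not spell out HGARSTAR's operators, so the following is only a minimal sketch of the rough-set ingredients it mentions, written under the standard textbook definitions: the dependency degree of the decision on a feature subset, the core (features whose removal lowers the dependency of the full feature set), a repair step that forces a candidate bit mask to contain the core, and a simple pruning pass standing in for a local search. All function names, the toy decision table, and the pruning rule itself are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def positive_region_size(rows, decisions, features):
    """Count objects whose equivalence class under `features` is decision-consistent."""
    blocks = defaultdict(list)
    for row, decision in zip(rows, decisions):
        key = tuple(row[f] for f in features)
        blocks[key].append(decision)
    return sum(len(ds) for ds in blocks.values() if len(set(ds)) == 1)


def dependency(rows, decisions, features):
    """Standard rough-set dependency degree: |POS_B(D)| / |U|."""
    return positive_region_size(rows, decisions, features) / len(rows)


def core_features(rows, decisions):
    """Features whose removal shrinks the positive region of the full feature set."""
    all_feats = list(range(len(rows[0])))
    full = positive_region_size(rows, decisions, all_feats)
    core = set()
    for f in all_feats:
        rest = [g for g in all_feats if g != f]
        if positive_region_size(rows, decisions, rest) < full:
            core.add(f)
    return core


def enforce_core(mask, core):
    """Repair a candidate bit mask so that every core feature is selected."""
    return [1 if i in core else bit for i, bit in enumerate(mask)]


def prune_redundant(mask, rows, decisions, core):
    """Hypothetical local search: drop non-core features that leave dependency unchanged."""
    selected = [i for i, bit in enumerate(mask) if bit]
    target = positive_region_size(rows, decisions, selected)
    for f in list(selected):
        if f in core:
            continue
        trial = [g for g in selected if g != f]
        if positive_region_size(rows, decisions, trial) == target:
            selected = trial
    return [1 if i in selected else 0 for i in range(len(mask))]


# Toy decision table (assumed data): the decision is the XOR of features 0 and 1,
# and feature 2 duplicates feature 0.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
decisions = [0, 1, 1, 0]

core = core_features(rows, decisions)
candidate = enforce_core([1, 0, 0], core)
pruned = prune_redundant([1, 1, 1], rows, decisions, core)
```

In a genetic algorithm, a repair step such as `enforce_core` would typically run after crossover and mutation, and a pass such as `prune_redundant` would be applied to selected individuals before fitness evaluation. Whether HGARSTAR arranges its steps this way is not stated in the abstract.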


Feature subset selection; Hybrid genetic algorithm; Rough set theory; Local search operation; Core



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. School of Computer Science, Leshan Normal University, Leshan, China
