
Feature selection for support vector machines with RBF kernel


Support Vector Machine Recursive Feature Elimination (SVM-RFE) with a linear kernel is known as an excellent feature selection algorithm. A nonlinear SVM, however, is a black-box classifier whose mapping function \({\Phi}\) is not known explicitly, so the weight vector w cannot be computed explicitly. In this paper, we propose a feature selection algorithm based on recursive feature elimination for Support Vector Machines with an RBF kernel (SVM-RBF-RFE). It expands the nonlinear RBF kernel into its Maclaurin series and computes the weight vector w from that series according to the contribution each feature makes to the classification hyperplane. Using \({w_i^2}\) as the ranking criterion, SVM-RBF-RFE starts with all the features and, at each step, eliminates the one feature with the smallest squared weight until all the features are ranked. We use SVM and KNN classifiers to evaluate the nested subsets of features selected by SVM-RBF-RFE. Experimental results on three UCI and three microarray datasets show that SVM-RBF-RFE generally performs better than information gain and SVM-RFE.
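The two ingredients named in the abstract can be illustrated with a minimal sketch. It assumes the common parameterisation \(K(x,z)=\exp(-\gamma\lVert x-z\rVert^2)\), which factors as \(\exp(-\gamma\lVert x\rVert^2)\exp(-\gamma\lVert z\rVert^2)\exp(2\gamma\, x\cdot z)\), so the last factor can be expanded as a Maclaurin series in \(x\cdot z\). The elimination loop follows the abstract's description, but the weight function `mean_gap_weights` is a hypothetical stand-in (a per-feature gap between class means) and is not the paper's series-derived weight vector. All function names and the toy data are assumptions made for illustration.

```python
import math


def dot(x, z):
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, gamma):
    """Exact RBF kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def rbf_maclaurin(x, z, gamma, order):
    """Truncated expansion: exp(-gamma(|x|^2 + |z|^2)) * sum_n (2*gamma*x.z)^n / n!."""
    t = 2.0 * gamma * dot(x, z)
    series = sum(t ** n / math.factorial(n) for n in range(order + 1))
    return math.exp(-gamma * (dot(x, x) + dot(z, z))) * series


def mean_gap_weights(X, y, active):
    """Stand-in weights (NOT the paper's): gap between class means per feature."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label != 1]
    return {
        j: sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
        for j in active
    }


def rfe_rank(X, y, weight_fn):
    """Backward elimination: drop the feature with the smallest w_i^2 each round.

    Returns feature indices ordered from most to least important.
    """
    active = list(range(len(X[0])))
    eliminated = []
    while active:
        w = weight_fn(X, y, active)  # weights are recomputed on the surviving features
        worst = min(active, key=lambda j: w[j] ** 2)
        active.remove(worst)
        eliminated.append(worst)
    return eliminated[::-1]


# Toy data: feature 0 separates the classes, feature 1 does not, feature 2 barely does.
X = [[1.0, 0.2, 5.0], [0.9, 0.1, 5.1], [-1.0, 0.3, 5.0], [-1.1, 0.0, 4.9]]
y = [1, 1, -1, -1]
ranking = rfe_rank(X, y, mean_gap_weights)
print(ranking)  # [0, 2, 1]

# The truncated series converges to the exact kernel value as the order grows.
exact = rbf_kernel([0.5, -0.2], [0.1, 0.4], 0.5)
approx = rbf_maclaurin([0.5, -0.2], [0.1, 0.4], 0.5, order=10)
print(abs(approx - exact))
```

With the stand-in weights, recomputing after each elimination changes nothing because each weight depends on a single feature. With an SVM-derived weight vector, the model would be retrained on the surviving features at every step, which is what makes the elimination recursive.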




Author information



Corresponding author

Correspondence to Quanzhong Liu.


About this article

Cite this article

Liu, Q., Chen, C., Zhang, Y. et al. Feature selection for support vector machines with RBF kernel. Artif Intell Rev 36, 99–115 (2011).



Keywords

  • Feature selection
  • RBF kernel
  • Information gain
  • Recursive Feature Elimination