SVM-RFE with Relevancy and Redundancy Criteria for Gene Selection

  • Piyushkumar A. Mundra
  • Jagath C. Rajapakse
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4774)


This paper introduces a novel gene selection method that incorporates mutual information into support vector machine recursive feature elimination (SVM-RFE). In addition to the feature weights computed by the SVM algorithm, the ranking criterion includes a term based on the mutual-information minimum redundancy maximum relevancy criterion. We tested the proposed method on colon cancer and leukemia gene expression datasets. The results show that the proposed method performs better than the original SVM-RFE method: the selected gene subsets achieve better classification accuracy and better generalization capability.
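The idea of ranking genes by SVM weight plus a mutual-information term can be illustrated with a minimal sketch. The abstract does not give the exact criterion, so everything specific here is an assumption made for illustration: the convex combination controlled by a hypothetical parameter `beta`, the use of relevance minus mean redundancy, the mean-threshold discretization, and the small subgradient-descent linear SVM standing in for a real SVM solver. All function names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def discretize(col):
    """Binarize a feature at its mean (assumed; the source does not say how)."""
    mean = sum(col) / len(col)
    return [1 if v > mean else 0 for v in col]


def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=200):
    """Subgradient descent on the L2-regularized hinge loss (stand-in solver).

    Labels in y must be +1 or -1. Returns the weight vector only.
    """
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1.0:  # sample violates the margin: hinge step
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w


def svm_rfe_mrmr(X, y, n_keep, beta=0.5):
    """Remove one feature per round until n_keep (>= 1) features remain.

    Assumed score: beta * normalized |w_i|
                   + (1 - beta) * (relevance_i - mean redundancy_i).
    Returns (surviving feature indices, eliminated indices in removal order).
    """
    cols = [discretize([row[j] for row in X]) for j in range(len(X[0]))]
    surviving, eliminated = list(range(len(X[0]))), []
    while len(surviving) > n_keep:
        w = train_linear_svm([[row[j] for j in surviving] for row in X], y)
        w_max = max(abs(v) for v in w) or 1.0  # avoid division by zero
        scores = []
        for k, j in enumerate(surviving):
            relevance = mutual_information(cols[j], y)
            others = [i for i in surviving if i != j]
            redundancy = sum(mutual_information(cols[j], cols[i])
                             for i in others) / len(others)
            scores.append(beta * abs(w[k]) / w_max
                          + (1.0 - beta) * (relevance - redundancy))
        worst = surviving[scores.index(min(scores))]  # lowest-ranked feature
        surviving.remove(worst)
        eliminated.append(worst)
    return surviving, eliminated
```

Setting `beta=1.0` in this sketch reduces the score to the normalized SVM weight magnitude alone, which makes it easy to compare against a weight-only ranking on the same data.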


Gene selection, mutual information, minimum redundancy maximum relevancy, SVM-RFE, cancer classification



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Piyushkumar A. Mundra (1)
  • Jagath C. Rajapakse (1, 2)
  1. Bioinformatics Research Center, School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798
  2. Singapore-MIT Alliance, N2-B2C-15, 50 Nanyang Avenue, Singapore
