Abstract
This paper introduces a novel gene selection method that incorporates mutual information into support vector machine recursive feature elimination (SVM-RFE). In addition to the feature weights computed by the SVM algorithm, the ranking criterion includes a term based on the mutual-information minimum-redundancy maximum-relevancy (mRMR) criterion. We tested the proposed method on colon cancer and leukemia gene expression datasets. The results show that the proposed method performs better than the original SVM-RFE method: the selected gene subsets yield better classification accuracy and better generalization capability.
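The abstract does not state the exact form of the ranking criterion, so the following is only a minimal sketch of how an SVM weight and a mutual-information term might be combined in a single elimination step. Everything specific here is an assumption made for illustration: the weighted-sum form, the trade-off parameter `beta`, the "relevance minus mean redundancy" version of mRMR, the normalization of the weights, and the function names. The SVM weights are taken as an input rather than computed, and expression values are assumed to be already discretized.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x,y) * log2(p(x,y) / (p(x) p(y))), written in terms of raw counts.
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def combined_scores(weights, genes, labels, beta=0.5):
    """Score each surviving gene; a low score marks an elimination candidate.

    weights: SVM weight of each gene (assumed to be supplied by the caller)
    genes:   one discretized expression vector per gene
    labels:  class label of each sample
    beta:    hypothetical trade-off between the SVM term and the mRMR term
    """
    # Scale weights to [0, 1] so both terms are on a comparable range (assumption).
    max_w = max(abs(w) for w in weights) or 1.0
    scores = []
    for i, gene in enumerate(genes):
        relevance = mutual_information(gene, labels)
        others = [mutual_information(gene, h) for j, h in enumerate(genes) if j != i]
        redundancy = sum(others) / len(others) if others else 0.0
        mrmr = relevance - redundancy  # difference form, chosen for illustration
        scores.append(beta * abs(weights[i]) / max_w + (1 - beta) * mrmr)
    return scores


def gene_to_eliminate(weights, genes, labels, beta=0.5):
    """Index of the gene with the lowest combined score."""
    scores = combined_scores(weights, genes, labels, beta)
    return min(range(len(scores)), key=scores.__getitem__)


# Toy data: genes 0 and 2 track the labels, gene 1 carries no class information.
labels = [0, 0, 1, 1]
genes = [[0, 0, 1, 1], [0, 1, 0, 1], [0, 0, 1, 1]]
weights = [1.0, 1.0, 1.0]
candidate = gene_to_eliminate(weights, genes, labels)
```

In a full recursive procedure, this step would be repeated after retraining the SVM on the surviving genes. With `beta=1.0` the score reduces to the scaled absolute weight alone, so the mutual-information term only influences the ranking when `beta` is below 1.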
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mundra, P.A., Rajapakse, J.C. (2007). SVM-RFE with Relevancy and Redundancy Criteria for Gene Selection. In: Rajapakse, J.C., Schmidt, B., Volkert, G. (eds) Pattern Recognition in Bioinformatics. PRIB 2007. Lecture Notes in Computer Science, vol 4774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75286-8_24
DOI: https://doi.org/10.1007/978-3-540-75286-8_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75285-1
Online ISBN: 978-3-540-75286-8
eBook Packages: Computer Science (R0)