Abstract
Gene expression data gathered from tissue samples are expected to significantly help the development of efficient tumor diagnosis and classification platforms. Because DNA microarray experiments produce a huge amount of gene expression data while only a few genes are related to tumors, gene selection algorithms are needed to extract those informative, tumor-related genes from the data. We therefore propose a novel feature selection approach that projects high-dimensional data onto a lower-dimensional feature space to further improve SVM-based classification of gene expression data. We examine a set of gene expression data comprising tumor and normal clinical samples by means of an SVM classifier. Experiments show that the SVM achieves superior classification performance on gene expression data as long as the selected features represent the principal components of all gene expression samples.
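The abstract does not spell out the projection step, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the projection is plain PCA (suggested by the mention of principal components), that the classifier is a linear soft-margin SVM trained by batch subgradient descent, and that the data are synthetic. All function names, the data dimensions, and the hyperparameters (`k`, `lam`, `lr`, `epochs`) are hypothetical choices made for illustration.

```python
import numpy as np


def pca_fit(X, k):
    """Return the mean and the top-k principal directions of X (samples x genes)."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]


def pca_transform(X, mean, components):
    """Project samples onto the principal directions found by pca_fit."""
    return (X - mean) @ components.T


def train_linear_svm(Z, y, lam=0.01, lr=0.01, epochs=500):
    """Minimize lam/2 * ||w||^2 + mean hinge loss by batch subgradient descent.

    Labels y must be in {-1, +1}.
    """
    n = len(y)
    w = np.zeros(Z.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (Z @ w + b)
        viol = margins < 1  # samples inside the margin or misclassified
        grad_w = lam * w - (y[viol][:, None] * Z[viol]).sum(axis=0) / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(Z, w, b):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return np.where(Z @ w + b >= 0, 1, -1)


# Synthetic stand-in for microarray data (an assumption, not the paper's data):
# 40 samples, 500 genes, of which the first 40 are shifted in the "tumor" class.
rng = np.random.default_rng(0)
n_samples, n_genes, n_informative = 40, 500, 40
y = np.where(np.arange(n_samples) % 2 == 0, 1, -1)  # +1 tumor, -1 normal
X = rng.normal(size=(n_samples, n_genes))
X[y == 1, :n_informative] += 3.0

X_train, y_train = X[:30], y[:30]
X_test, y_test = X[30:], y[30:]

# Fit the projection on training samples only, then apply it to both splits.
mean, components = pca_fit(X_train, k=5)
Z_train = pca_transform(X_train, mean, components)
Z_test = pca_transform(X_test, mean, components)

w, b = train_linear_svm(Z_train, y_train)
train_acc = float(np.mean(predict(Z_train, w, b) == y_train))
test_acc = float(np.mean(predict(Z_test, w, b) == y_test))
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Fitting the projection on the training split alone keeps the held-out samples from influencing the selected features. The sketch only separates the classes well because the synthetic class difference is large compared with the noise. With far more genes than samples, a weak signal can be swamped by noise in the leading components.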
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, S., Wang, J., Chen, H., Zhang, B. (2006). SVM-Based Tumor Classification with Gene Expression Data. In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_94
DOI: https://doi.org/10.1007/11811305_94
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37025-3
Online ISBN: 978-3-540-37026-0
eBook Packages: Computer Science, Computer Science (R0)