Support Vector Machine Classification for High Dimensional Microarray Data Analysis, With Applications in Cancer Research

Zhang, Hao Helen

doi:10.1007/978-0-387-69765-9_6

Part of the book series: Applied Bioinformatics and Biostatistics in Cancer Research (ABB)

Author information

Authors and Affiliations

Department of Statistics, 2501 Founders Drive, Raleigh, NC, 27613, USA
Hao Helen Zhang

Corresponding author

Correspondence to Hao Helen Zhang.

Editor information

Editors and Affiliations

School of Medicine, Division of Biostatistics, Indiana University, West 10th Street 410, Indianapolis, 46202, U.S.A.
Xiaochun Li
Dept. Mathematics, University of California, San Diego, Gilman Dr. 9500, La Jolla, 92093-0112, U.S.A.
Ronghui Xu

About this chapter

Cite this chapter

Zhang, H.H. (2009). Support Vector Machine Classification for High Dimensional Microarray Data Analysis, With Applications in Cancer Research. In: Li, X., Xu, R. (eds) High-Dimensional Data Analysis in Cancer Research. Applied Bioinformatics and Biostatistics in Cancer Research. Springer, New York, NY. https://doi.org/10.1007/978-0-387-69765-9_6

DOI: https://doi.org/10.1007/978-0-387-69765-9_6
Published: 28 November 2008
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-69763-5
Online ISBN: 978-0-387-69765-9
eBook Packages: Biomedical and Life Sciences, Biomedical and Life Sciences (R0)
