Support Vector Machine Classification for High Dimensional Microarray Data Analysis, With Applications in Cancer Research

High-Dimensional Data Analysis in Cancer Research

Author information

Corresponding author

Correspondence to Hao Helen Zhang.

Copyright information

© 2009 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Zhang, H.H. (2009). Support Vector Machine Classification for High Dimensional Microarray Data Analysis, With Applications in Cancer Research. In: Li, X., Xu, R. (eds) High-Dimensional Data Analysis in Cancer Research. Applied Bioinformatics and Biostatistics in Cancer Research. Springer, New York, NY. https://doi.org/10.1007/978-0-387-69765-9_6