A User’s Guide to Support Vector Machines

  • Protocol
Data Mining Techniques for the Life Sciences

Part of the book series: Methods in Molecular Biology (MIMB, volume 609)

Abstract

The Support Vector Machine (SVM) is a widely used classifier in bioinformatics. Obtaining the best results with SVMs requires an understanding of their workings and the various ways a user can influence their accuracy. We provide the user with a basic understanding of the theory behind SVMs and focus on their use in practice. We describe the effect of the SVM parameters on the resulting classifier, how to select good values for those parameters, data normalization, factors that affect training time, and software for training SVMs.

Acknowledgments

The authors would like to thank William Noble for comments on the manuscript.

Copyright information

© 2010 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Ben-Hur, A., Weston, J. (2010). A User’s Guide to Support Vector Machines. In: Carugo, O., Eisenhaber, F. (eds) Data Mining Techniques for the Life Sciences. Methods in Molecular Biology, vol 609. Humana Press. https://doi.org/10.1007/978-1-60327-241-4_13

  • DOI: https://doi.org/10.1007/978-1-60327-241-4_13

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60327-240-7

  • Online ISBN: 978-1-60327-241-4

  • eBook Packages: Springer Protocols
