A User’s Guide to Support Vector Machines

  • Asa Ben-Hur
  • Jason Weston
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 609)

Abstract

The Support Vector Machine (SVM) is a widely used classifier in bioinformatics. Obtaining the best results with SVMs requires an understanding of their workings and the various ways a user can influence their accuracy. We provide the user with a basic understanding of the theory behind SVMs and focus on their use in practice. We describe the effect of the SVM parameters on the resulting classifier, how to select good values for those parameters, data normalization, factors that affect training time, and software for training SVMs.

Key words

Kernel methods; Support Vector Machines (SVM)

Notes

Acknowledgments

The authors would like to thank William Noble for comments on the manuscript.

Copyright information

© Humana Press, a part of Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Asa Ben-Hur (1)
  • Jason Weston (2)
  1. Department of Computer Science, Colorado State University, Fort Collins, USA
  2. NEC Labs America, Princeton, USA
