Abstract
The Support Vector Machine (SVM) is a widely used classifier in bioinformatics. Obtaining the best results with SVMs requires an understanding of their workings and the various ways a user can influence their accuracy. We provide the user with a basic understanding of the theory behind SVMs and focus on their use in practice. We describe the effect of the SVM parameters on the resulting classifier, how to select good values for those parameters, data normalization, factors that affect training time, and software for training SVMs.
Acknowledgments
The authors would like to thank William Noble for comments on the manuscript.
Copyright information
© 2010 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Ben-Hur, A., Weston, J. (2010). A User’s Guide to Support Vector Machines. In: Carugo, O., Eisenhaber, F. (eds) Data Mining Techniques for the Life Sciences. Methods in Molecular Biology, vol 609. Humana Press. https://doi.org/10.1007/978-1-60327-241-4_13
DOI: https://doi.org/10.1007/978-1-60327-241-4_13
Publisher Name: Humana Press
Print ISBN: 978-1-60327-240-7
Online ISBN: 978-1-60327-241-4
eBook Packages: Springer Protocols