Kernel Methods

Reference work entry in the Encyclopedia of Machine Learning

Definition

Kernel methods refer to a class of techniques that employ positive definite kernels. At an algorithmic level, their basic idea is quite intuitive: implicitly map objects into a high-dimensional feature space, and specify the inner product there directly via the kernel. As a more principled interpretation, kernel methods formulate learning and estimation problems in a reproducing kernel Hilbert space (RKHS), which is advantageous in a number of ways:

  • It induces a rich feature space and admits a large class of (nonlinear) functions.

  • It can be flexibly applied to a wide range of domains including both Euclidean and non-Euclidean spaces.

  • Searching in this infinite-dimensional space of functions can be performed efficiently: one only needs to consider the finite-dimensional subspace spanned by the data.

  • Working in linear spaces of functions lends significant convenience to the construction and analysis of learning algorithms.
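The implicit-mapping idea above can be made concrete with a small sketch (not part of the original entry, and the specific kernel is chosen only for illustration): for the degree-2 polynomial kernel k(x, y) = (x · y)² on R², evaluating the kernel directly in input space gives the same value as an ordinary inner product after an explicit feature map into R³ — so algorithms that only need inner products never have to construct the feature space.

```python
import numpy as np

def poly2_kernel(x, y):
    """Degree-2 polynomial kernel, evaluated directly in input space."""
    return np.dot(x, y) ** 2

def phi(x):
    """Explicit feature map into R^3 such that
    <phi(x), phi(y)> = (x . y)^2 for x, y in R^2."""
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2.0) * x[0] * x[1]])

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

# Both routes give the same number: (1*3 + 2*4)^2 = 121.
assert np.isclose(poly2_kernel(x, y), np.dot(phi(x), phi(y)))
```

For kernels such as the Gaussian RBF kernel the corresponding feature space is infinite-dimensional, so the explicit map phi cannot be written down at all — yet the kernel value remains a single cheap expression, which is exactly the convenience the entry describes.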

Motivation and Background

Over the past decade, kernel methods have gained much popularity...




Copyright information

© 2011 Springer Science+Business Media, LLC

Cite this entry

Zhang, X. (2011). Kernel Methods. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30164-8_430
