Abstract
This chapter addresses the study of kernel methods, a class of techniques that play a major role in machine learning and nonparametric statistics. Among others, these methods include support vector machines (SVMs) and least squares SVMs, kernel principal component analysis, kernel Fisher discriminant analysis, and Gaussian processes. The use of kernel methods is systematic and properly motivated by statistical principles. In practical applications, kernel methods lead to flexible predictive models that often outperform competing approaches in terms of generalization performance. The core idea consists of mapping data into a high-dimensional space by means of a feature map. Since the feature map is normally chosen to be nonlinear, a linear model in the feature space corresponds to a nonlinear rule in the original domain. This suits many real-world data analysis problems, which often require nonlinear models to describe their structure.
In Sect. 32.1 we present historical notes and summarize the main ingredients of kernel methods. In Sect. 32.2 we present the core ideas of statistical learning and show how regularization can be employed to devise practical learning algorithms. In Sect. 32.3 we show a selection of techniques that are representative of a large class of kernel methods; these techniques – termed primal–dual methods – use Lagrange duality as their main mathematical tool. Section 32.4 discusses Gaussian processes, a class of kernel methods that uses a Bayesian approach to perform inference and learning. Section 32.5 recalls different approaches for the tuning of parameters. In Sect. 32.6 we review the mathematical properties of different yet equivalent notions of kernels and recall a number of specialized kernels for learning problems involving structured data. We conclude the chapter by presenting applications in Sect. 32.7.
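The chapter itself contains no code; as a purely illustrative sketch of the core idea stated above (all names below are ours, not the authors'), the following NumPy fragment shows that a kernel evaluates inner products in a feature space without constructing the feature map explicitly, and that a linear model in that feature space (here, kernel ridge regression) yields a nonlinear rule in the input domain.

```python
# Minimal sketch (not from the chapter): the kernel trick and kernel
# ridge regression, assuming 2-D inputs and a degree-2 polynomial kernel.
import numpy as np

def phi(x):
    # Explicit degree-2 feature map: phi(x) = (x1^2, x2^2, sqrt(2) x1 x2).
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

def poly2_kernel(x, z):
    # k(x, z) = (x . z)^2 equals <phi(x), phi(z)> for the map above.
    return float(np.dot(x, z)) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])
assert np.isclose(poly2_kernel(x, z), phi(x) @ phi(z))

def krr_fit(X, y, kernel, lam=1e-2):
    # Solve (K + lam I) alpha = y; K is the Gram matrix of the training set.
    K = np.array([[kernel(a, b) for b in X] for a in X])
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, kernel, x_new):
    # Prediction is a kernel expansion: f(x) = sum_i alpha_i k(x, x_i).
    return sum(a * kernel(x_new, xi) for a, xi in zip(alpha, X_train))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
y = X[:, 0] * X[:, 1]  # a nonlinear target, linear in the feature space
alpha = krr_fit(X, y, poly2_kernel)
print(krr_predict(X, alpha, poly2_kernel, np.array([0.5, -1.0])))  # approx -0.5
```

The assertion makes the kernel trick concrete: the 3-dimensional feature map is never needed at prediction time, only kernel evaluations between input points.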
Abbreviations
- ERM: empirical risk minimization
- GACV: generalized approximate cross-validation
- GP: Gaussian process
- HS: Hilbert space
- i.i.d.: independent, identically distributed
- KKT: Karush–Kuhn–Tucker
- LASSO: least absolute shrinkage and selection operator
- LOO: leave-one-out
- LS: least squares
- MAP: maximum a posteriori
- MEG: magnetoencephalography
- MKL: multiple kernel learning
- ML: maximum likelihood
- MLP: multilayer perceptron
- PCA: principal component analysis
- QP: quadratic programming
- r.k.: reproducing kernel
- RBF: radial basis function
- RKHS: reproducing kernel Hilbert space
- SMO: sequential minimal optimization
- SRM: structural risk minimization
- SVC: support vector classification
- SVD: singular value decomposition
- SVM: support vector machine
- VC: Vapnik–Chervonenkis
© 2015 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Signoretto, M., Suykens, J.A.K. (2015). Kernel Methods. In: Kacprzyk, J., Pedrycz, W. (eds) Springer Handbook of Computational Intelligence. Springer Handbooks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43505-2_32
DOI: https://doi.org/10.1007/978-3-662-43505-2_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43504-5
Online ISBN: 978-3-662-43505-2
eBook Packages: Engineering