Kernel Methods

Abstract

This chapter introduces the basics of the kernel method and describes kernel-based extensions of several traditional methods. The support vector machine (SVM) is described in the next chapter.

Author information

Correspondence to Ke-Lin Du.

Copyright information

© 2019 Springer-Verlag London Ltd., part of Springer Nature

About this chapter

Cite this chapter

Du, KL., Swamy, M.N.S. (2019). Kernel Methods. In: Neural Networks and Statistical Learning. Springer, London. https://doi.org/10.1007/978-1-4471-7452-3_20
