Nonlinear Adaptive Filtering in Kernel Spaces

  • Badong Chen
  • Lin Li
  • Weifeng Liu
  • José C. Príncipe


Recently, a family of online kernel-learning algorithms, known as kernel adaptive filtering (KAF) algorithms, has emerged as an active area of research. KAF algorithms are developed in reproducing kernel Hilbert spaces (RKHS), exploiting the linear structure of these spaces to implement well-established linear adaptive algorithms that correspond to nonlinear filters in the original input space. The family includes the kernel least mean squares (KLMS), kernel affine projection algorithms (KAPA), kernel recursive least squares (KRLS), and extended kernel recursive least squares (EX-KRLS) algorithms, among others. When the kernel is radial (such as the Gaussian kernel), these algorithms naturally build a growing radial basis function (RBF) network, in which the weights are directly related to the prediction errors at each sample. The aim of this chapter is to give a brief introduction to kernel adaptive filters, with a focus on KLMS, the simplest KAF algorithm, which is easy to implement yet efficient. Several key aspects of the algorithm are discussed, including self-regularization, sparsification, quantization, and mean-square convergence. Application examples are also presented, including, in particular, an adaptive neural decoder for spike trains.
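To make the KLMS recursion concrete, below is a minimal Python sketch, not taken from the chapter: it maintains the growing Gaussian RBF network described above, where each new input becomes a center whose weight is the step size times the current prediction error, and an optional quantization size (epsilon > 0) illustrates the quantized KLMS (QKLMS) idea of merging a nearby input into an existing center instead of growing the network. All names and parameter values here are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(u, v, sigma):
    """Gaussian (radial) kernel: exp(-||u - v||^2 / (2 sigma^2))."""
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))

class KLMS:
    """Kernel least mean squares as a growing RBF network.

    epsilon == 0 gives plain KLMS: every sample adds one center whose
    weight is eta * (prediction error).  epsilon > 0 sketches the
    quantized-KLMS (QKLMS) idea: an input within epsilon of an existing
    center merges its update into that center's weight instead.
    """

    def __init__(self, eta=0.5, sigma=1.0, epsilon=0.0):
        self.eta = eta            # step size
        self.sigma = sigma        # kernel bandwidth
        self.epsilon = epsilon    # quantization size (0 disables merging)
        self.centers = []         # stored inputs u_j
        self.weights = []         # accumulated coefficients a_j

    def predict(self, u):
        # f(u) = sum_j a_j * kappa(u_j, u)
        return sum(a * gaussian_kernel(c, u, self.sigma)
                   for c, a in zip(self.centers, self.weights))

    def update(self, u, d):
        """One online step on the sample (u, d); returns the a priori error."""
        e = d - self.predict(u)
        if self.centers and self.epsilon > 0.0:
            dists = [np.linalg.norm(np.asarray(u, dtype=float) - c)
                     for c in self.centers]
            j = int(np.argmin(dists))
            if dists[j] <= self.epsilon:   # merge rather than grow
                self.weights[j] += self.eta * e
                return e
        self.centers.append(np.asarray(u, dtype=float))
        self.weights.append(self.eta * e)
        return e

# Toy usage: learn d = sin(3u) + noise from a stream of samples.
rng = np.random.default_rng(0)
filt = KLMS(eta=0.5, sigma=0.4, epsilon=0.1)
for _ in range(500):
    u = rng.uniform(-1.0, 1.0, size=1)
    d = np.sin(3.0 * u[0]) + 0.05 * rng.standard_normal()
    filt.update(u, d)
print(f"{len(filt.centers)} centers, f(0.3) ~ {filt.predict([0.3]):.3f}")
```

With epsilon = 0 the network grows by one unit per sample, which is exactly why the sparsification and quantization schemes discussed in the chapter matter in practice; the quantized variant above settles on roughly 2/epsilon centers in this one-dimensional toy example.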


Abbreviations

  • approximate linear dependency
  • aesthetic measure
  • bit error rate
  • cross intensity
  • circuit simulator
  • energy conservation relation
  • excess mean square error
  • Gaussian process
  • kernel adaptive filtering
  • kernel affine projection algorithm
  • kernel Fisher discriminant analysis
  • kernel least mean squares
  • kernel principal component analysis
  • kernel recursive least squares
  • leaky integrate-and-fire neuron
  • least mean square
  • multilayer perceptron
  • mean square error
  • NC kernel least mean square
  • novelty criterion
  • principal component analysis
  • persistence of excitation
  • Poisson process
  • QKLMS with global update
  • quantized KLMS
  • quantized regressor
  • radial basis function
  • reproducing kernel Hilbert space
  • recursive least-squares
  • regularization network
  • Schwartz's criterion kernel least mean squares
  • Schwartz's criterion
  • signal-to-noise ratio
  • singular value decomposition
  • support vector machine
  • ventral posterolateral nucleus
  • vector quantization
  • weight error power
  • logistic regression
  • memoryless CI kernel
  • nonlinear cross intensity kernel



Copyright information

© Springer-Verlag 2014

Authors and Affiliations

  1. Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, P. R. China
  2. Philips Research North America, Briarcliff Manor, USA
  3. Jump Trading, Chicago, USA
  4. Department of Electrical and Computer Engineering, University of Florida, Gainesville, USA