Sparse Kernel Modelling: A Unified Approach

  • S. Chen
  • X. Hong
  • C. J. Harris
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4881)


A unified approach is proposed for sparse kernel data modelling that encompasses regression, classification and probability density function estimation. Within this unified data-modelling framework, an orthogonal-least-squares forward selection method based on the leave-one-out test criterion is presented for constructing sparse kernel models that generalise well. Examples from regression, classification and density estimation applications illustrate the effectiveness of this generic sparse kernel data modelling approach.
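The abstract's core idea can be sketched in code: greedily add one kernel regressor at a time and use the leave-one-out (PRESS) error of the least-squares fit both to pick the next term and to decide when to stop. This is only a minimal illustrative sketch, not the authors' algorithm: the paper uses an efficient orthogonal decomposition with incremental LOO updates, whereas this naive version refits from scratch; the function names, the Gaussian kernel choice and the width parameter are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel_matrix(X, centres, width):
    # One candidate regressor (column) per centre: Gaussian kernels.
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def press_mse(P, y):
    # Leave-one-out (PRESS) mean squared error of the LS fit y ~ P w:
    # e_loo_i = e_i / (1 - h_ii), with H the hat matrix of P.
    w, *_ = np.linalg.lstsq(P, y, rcond=None)
    H = P @ np.linalg.pinv(P.T @ P) @ P.T
    e = y - P @ w
    loo = e / (1.0 - np.clip(np.diag(H), 0.0, 1.0 - 1e-9))
    return np.mean(loo ** 2)

def ofs_loo(X, y, width=1.0, max_terms=20):
    # Greedy forward selection of kernel centres; the LOO criterion
    # both ranks candidates and terminates the selection automatically.
    K = gaussian_kernel_matrix(X, X, width)
    selected, best_mse = [], np.inf
    while len(selected) < max_terms:
        trial = [(press_mse(K[:, selected + [j]], y), j)
                 for j in range(K.shape[1]) if j not in selected]
        mse, j = min(trial)
        if mse >= best_mse:      # adding another term no longer helps
            break
        selected.append(j)
        best_mse = mse
    return selected, best_mse
```

Because the stopping rule is the LOO error itself, no separate validation set or regularisation sweep is needed to control model size, which is the practical appeal of this family of methods.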


Keywords: Mean Square Error, Kernel Density Estimate, Kernel Weight, Kernel Width, Support Vector Machine Algorithm





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • S. Chen (1)
  • X. Hong (2)
  • C. J. Harris (1)

  1. School of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, U.K.
  2. School of Systems Engineering, University of Reading, Reading RG6 6AY, U.K.
