Neural Processing Letters, Volume 16, Issue 3, pp 293–302

Reduced Rank Kernel Ridge Regression

  • Gavin C. Cawley
  • Nicola L. C. Talbot

Abstract

Ridge regression is a classical statistical technique that attempts to address the bias-variance trade-off in the design of linear regression models. A reformulation of ridge regression in dual variables permits a non-linear form of ridge regression via the well-known ‘kernel trick’. Unfortunately, unlike support vector regression models, the resulting kernel expansion is typically fully dense. In this paper, we introduce a reduced rank kernel ridge regression (RRKRR) algorithm, capable of generating an optimally sparse kernel expansion that is functionally identical to that resulting from conventional kernel ridge regression (KRR). The proposed method is demonstrated to outperform an alternative sparse kernel ridge regression algorithm on the Motorcycle and Boston Housing benchmarks.

Keywords: Ridge regression, Sparse kernel approximation
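The dual formulation mentioned in the abstract is standard, so a minimal sketch may help readers see why the conventional kernel expansion is dense. The following NumPy code implements plain kernel ridge regression, not the paper's RRKRR algorithm; the RBF kernel, the function names, and the parameter values (gamma, lam) are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian RBF kernel matrix: K[i, j] = exp(-gamma * ||a_i - b_j||^2)
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * d2)

def fit_krr(X, y, lam=1e-2, gamma=1.0):
    # Ridge regression in dual variables: solve (K + lam * I) alpha = y
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict_krr(X_train, alpha, X_test, gamma=1.0):
    # f(x) = sum_i alpha_i k(x_i, x); alpha is generally fully dense,
    # so every training point contributes to every prediction
    return rbf_kernel(X_test, X_train, gamma) @ alpha

# Toy usage: fit a noisy sine curve
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
alpha = fit_krr(X, y)
y_hat = predict_krr(X, alpha, X)
```

In contrast to support vector regression, where many dual coefficients are exactly zero, the alpha vector above has no zero entries in general; the paper's contribution is a sparse expansion that reproduces the same function.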



Copyright information

© Kluwer Academic Publishers 2002

Authors and Affiliations

  • Gavin C. Cawley (1)
  • Nicola L. C. Talbot (1)

  1. School of Information Systems, University of East Anglia, Norwich, UK
