Machine Learning, Volume 46, Issue 1–3, pp 11–19

On a Connection between Kernel PCA and Metric Multidimensional Scaling

  • Christopher K.I. Williams


In this note we show that the kernel PCA algorithm of Schölkopf, Smola, and Müller (1998, Neural Computation, 10, 1299–1319) can be interpreted as a form of metric multidimensional scaling (MDS) when the kernel function k(x, y) is isotropic, i.e. it depends only on ‖x − y‖. This leads to a metric MDS algorithm in which the desired configuration of points is found by solving an eigenproblem rather than by iteratively optimizing the stress objective function. The question of kernel choice is also discussed.
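The connection can be checked numerically with a minimal sketch, which is not taken from the paper itself. It assumes a Gaussian (RBF) kernel as the isotropic kernel and NumPy for the eigendecomposition; the helper names `rbf_kernel`, `center`, and `top_eig_embedding` are hypothetical. The sketch compares the kernel PCA coordinates of the training points with the classical-scaling solution computed from the feature-space squared distances k(x, x) + k(y, y) − 2k(x, y), which for an isotropic kernel depend only on ‖x − y‖.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Isotropic kernel: k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.maximum(d2, 0.0, out=d2)  # clip tiny negative values from round-off
    return np.exp(-gamma * d2)

def center(M):
    """Double-centre a square matrix: H M H with H = I - (1/n) 1 1^T."""
    n = M.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ M @ H

def top_eig_embedding(B, p):
    """Coordinates from the p largest eigenpairs: V_p * sqrt(lambda_p)."""
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:p]
    w, V = w[idx], V[:, idx]
    return V * np.sqrt(np.maximum(w, 0.0)), w

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))       # illustrative data, not from the paper
K = rbf_kernel(X, gamma=0.5)

# Kernel PCA: eigenproblem on the centred kernel matrix.
Y_kpca, w_kpca = top_eig_embedding(center(K), p=2)

# Classical scaling on feature-space squared distances
# d2_ij = k(x_i, x_i) + k(x_j, x_j) - 2 k(x_i, x_j).
diag = np.diag(K)
D2 = diag[:, None] + diag[None, :] - 2.0 * K
B = -0.5 * center(D2)
Y_mds, w_mds = top_eig_embedding(B, p=2)

print(np.allclose(center(K), B))    # the two eigenproblems share one matrix
print(np.allclose(w_kpca, w_mds))   # and therefore the same eigenvalues
```

Double-centring removes the row and column terms of `D2`, so `B` equals the centred kernel matrix and both routes solve the same eigenproblem. The two configurations agree up to a sign flip of each coordinate axis, and no iterative stress minimization is involved.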

metric multidimensional scaling, MDS, kernel PCA, eigenproblem


  1. Berg, C., Christensen, J. P. R., & Ressel, P. (1984). Harmonic analysis on semigroups. New York, NY: Springer-Verlag.
  2. Cox, T. F. & Cox, M. A. A. (1994). Multidimensional scaling. London: Chapman and Hall.
  3. Critchley, F. (1978). Multidimensional scaling: A short critique and a new method. In L. C. A. Corsten & J. Hermans (Eds.), COMPSTAT 1978. Vienna: Physica-Verlag.
  4. Kruskal, J. B. & Wish, M. (1978). Multidimensional scaling. Beverly Hills: Sage Publications.
  5. Mardia, K. V., Kent, J. T., & Bibby, J. M. (1979). Multivariate analysis. London: Academic Press.
  6. Sammon, J. W. (1969). A nonlinear mapping for data structure analysis. IEEE Trans. on Computers, 18, 401–409.
  7. Schölkopf, B., Smola, A., & Müller, K.-R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10, 1299–1319.
  8. Vapnik, V. N. (1995). The nature of statistical learning theory. New York: Springer-Verlag.
  9. Wahba, G. (1990). Spline models for observational data. Philadelphia, PA: Society for Industrial and Applied Mathematics. CBMS-NSF Regional Conference series in applied mathematics.
  10. Williams, C. K. I. & Barber, D. (1998). Bayesian classification with Gaussian processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(12), 1342–1351.
  11. Yaglom, A. M. (1987). Correlation theory of stationary and related random functions, vol. I: Basic results. Berlin: Springer-Verlag.

Copyright information

© Kluwer Academic Publishers 2002

Authors and Affiliations

  • Christopher K.I. Williams (1)
  1. Division of Informatics, The University of Edinburgh, Edinburgh, UK
