Abstract
In this paper, we deal with the construction of lower-dimensional manifolds from high-dimensional data, which is an important task in data mining, machine learning and statistics. Here, we consider principal manifolds as the minimizers of a regularized, non-linear empirical quantization error functional. For the discretization, we use a sparse grid method in the latent parameter space. This approach avoids, to some extent, the curse of dimensionality that affects conventional grids such as those used in the generative topographic mapping (GTM) approach. The arising non-linear problem is solved by a descent method which resembles the expectation-maximization algorithm. We present our sparse grid principal manifold approach, discuss its properties and report on the results of numerical experiments for one-, two- and three-dimensional model problems.
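To make the remark about the curse of dimensionality concrete, the following minimal sketch compares the number of interior grid points of a full tensor-product grid with that of a regular sparse grid at the same refinement level. It assumes the standard construction with a piecewise linear hierarchical basis and no boundary points; the abstract does not state which sparse grid variant is used, so this choice and the function names `full_grid_points`, `sparse_grid_points` and `sparse_grid_points_bruteforce` are illustrative assumptions, not the authors' implementation.

```python
from itertools import product
from math import comb


def full_grid_points(level: int, dim: int) -> int:
    """Interior points of a full tensor grid with mesh width 2**-level per direction."""
    return (2**level - 1) ** dim


def sparse_grid_points(level: int, dim: int) -> int:
    """Interior points of a regular sparse grid (assumed variant: no boundary points).

    Sums the sizes 2**(|l|_1 - dim) of all hierarchical increment spaces W_l
    with l_i >= 1 and |l|_1 <= level + dim - 1, grouped by j = |l|_1 - dim.
    """
    return sum(comb(j + dim - 1, dim - 1) * 2**j for j in range(level))


def sparse_grid_points_bruteforce(level: int, dim: int) -> int:
    """Same count, obtained by explicitly enumerating the level multi-indices l."""
    total = 0
    for l in product(range(1, level + 1), repeat=dim):
        if sum(l) <= level + dim - 1:
            # W_l contributes prod_i 2**(l_i - 1) = 2**(|l|_1 - dim) points.
            total += 2 ** (sum(l) - dim)
    return total


if __name__ == "__main__":
    level = 6
    for dim in (1, 2, 3):
        print(dim, full_grid_points(level, dim), sparse_grid_points(level, dim))
```

In one dimension both counts coincide. The gap opens as the latent dimension grows, which is why the saving matters for two- and three-dimensional latent spaces.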
Additional information
Communicated by W. Hackbusch.
Cite this article
Feuersänger, C., Griebel, M. Principal manifold learning by sparse grids. Computing 85, 267–299 (2009). https://doi.org/10.1007/s00607-009-0045-8
DOI: https://doi.org/10.1007/s00607-009-0045-8