A Sparse Grid Based Generative Topographic Mapping for the Dimensionality Reduction of High-Dimensional Data

Conference paper

Abstract

Most high-dimensional data exhibit some correlation such that data points are not distributed uniformly in the data space but lie approximately on a lower-dimensional manifold. A major problem in many data-mining applications is the detection of such a manifold from given data, if present at all. The generative topographic mapping (GTM) finds a lower-dimensional parameterization for the data and thus allows for nonlinear dimensionality reduction. We will show how a discretization based on sparse grids can be employed for the mapping between latent space and data space. This leads to efficient computations and avoids the ‘curse of dimensionality’ of the embedding dimension. We will use our modified, sparse grid based GTM for problems from dimensionality reduction and data classification.
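To give a rough sense of why a sparse grid discretization can sidestep the curse of dimensionality, the following minimal sketch counts the interior grid points of a full tensor-product grid and of a regular sparse grid at the same refinement level. It is not the authors' implementation. It assumes the standard hierarchical construction from the sparse grid literature, with no boundary points, in which the subspace with level multi-index l contributes 2^(|l|_1 - d) points and the sparse grid keeps only subspaces with |l|_1 <= n + d - 1. The function names `full_grid_points` and `sparse_grid_points` are hypothetical and chosen only for this illustration.

```python
from itertools import product


def full_grid_points(level: int, dim: int) -> int:
    """Interior points of a full tensor-product grid with mesh width 2**-level."""
    return (2 ** level - 1) ** dim


def sparse_grid_points(level: int, dim: int) -> int:
    """Interior points of a regular sparse grid (assumed standard construction).

    Each hierarchical subspace with level multi-index l (all l_i >= 1)
    contributes 2**(|l|_1 - dim) points; only subspaces with
    |l|_1 <= level + dim - 1 are kept.
    """
    total = 0
    # Brute-force enumeration of all multi-indices; fine for small level/dim.
    for l in product(range(1, level + 1), repeat=dim):
        if sum(l) <= level + dim - 1:
            total += 2 ** (sum(l) - dim)
    return total


if __name__ == "__main__":
    level = 5
    for dim in range(1, 7):
        print(dim, full_grid_points(level, dim), sparse_grid_points(level, dim))
```

In one dimension both counts coincide; as the dimension grows, the full grid count grows exponentially in the dimension while the sparse grid count grows far more slowly, which is the effect the abstract refers to.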

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Institute for Numerical Simulation, University of Bonn, Bonn, Germany
