
A General Framework for Dimensionality Reduction for Large Data Sets

  • Barbara Hammer
  • Michael Biehl
  • Kerstin Bunte
  • Bassam Mokbel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6731)

Abstract

With electronic data increasing dramatically in almost all areas of research, a plethora of new techniques for automatic dimensionality reduction and data visualization has become available in recent years. These offer an interface that allows humans to rapidly scan through large volumes of data. As data sets grow ever larger, however, the standard methods can no longer be applied directly. While random subsampling and prior clustering remain among the most popular remedies in this situation, we discuss a principled alternative and formalize these approaches within a general perspective of dimensionality reduction as cost optimization. We also take a first look at the question of whether these techniques can be accompanied by theoretical guarantees.
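As an illustration of the large-data strategy the abstract contrasts against, the following sketch shows the popular two-stage baseline: first compress the data by prior clustering, then embed only the resulting prototypes instead of all points. This is a minimal sketch assuming Python with numpy and scikit-learn; the data X, cluster count, and all parameter values are hypothetical placeholders, not the authors' framework.

    # Sketch: prior clustering followed by nonlinear embedding of the
    # prototypes only -- a common workaround when a data set is too
    # large for direct dimensionality reduction (assumed setup, not
    # the method proposed in the paper).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.manifold import TSNE

    rng = np.random.default_rng(0)
    X = rng.normal(size=(10_000, 50))       # placeholder high-dimensional data

    # Step 1: compress the data to a manageable number of prototypes.
    kmeans = KMeans(n_clusters=200, n_init=10, random_state=0).fit(X)
    prototypes = kmeans.cluster_centers_     # shape (200, 50)

    # Step 2: embed the prototypes with a standard cost-based method
    # (t-SNE minimizes a Kullback-Leibler divergence between
    # neighborhood distributions in the original and embedding space).
    embedding = TSNE(n_components=2, perplexity=30, random_state=0)
    Y = embedding.fit_transform(prototypes)  # shape (200, 2)

    # Each original point can then be displayed at the low-dimensional
    # position of its assigned prototype.
    point_positions = Y[kmeans.labels_]      # shape (10_000, 2)

Viewing dimensionality reduction as cost optimization, as the paper does, suggests that the two stages above could in principle be merged into a single objective defined on prototypes; the sketch merely spells out the two-stage baseline that the abstract refers to.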

Keywords

Dimensionality Reduction, Explicit Mapping, Locally Linear Embedding, Learning Vector Quantization, Dimensionality Reduction Method



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Barbara Hammer (1)
  • Michael Biehl (2)
  • Kerstin Bunte (2)
  • Bassam Mokbel (1)

  1. CITEC Centre of Excellence, Bielefeld University, Bielefeld, Germany
  2. Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen, Groningen, The Netherlands
