Abstract

This chapter gives an overview of the book. Section 1 briefly introduces high-dimensional data and the need for dimensionality reduction. Section 2 discusses how high-dimensional data are acquired. When the dimension of the data is very high, we encounter the so-called curse of dimensionality, which is discussed in Section 3. Section 4 discusses the concepts of the extrinsic and intrinsic dimensions of data and points out that most high-dimensional data have a low intrinsic dimension; it is this observation that makes dimensionality reduction possible. Finally, Section 5 gives an outline of the book.
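
Both of the chapter's central observations invite a quick numerical check. The sketch below is a minimal illustration, not code from the book, and assumes only NumPy. It first exhibits the curse of dimensionality as distance concentration: for points drawn uniformly from the unit cube, the ratio of the largest to the smallest pairwise distance approaches 1 as the dimension grows, so distance-based notions such as "nearest neighbor" lose contrast. It then generates data whose extrinsic dimension is 100 but whose intrinsic dimension is 2, and verifies via the singular value decomposition that nearly all variance lies in two directions.

```python
# Minimal numerical illustration (assumes only NumPy; not code from the book).
import numpy as np

rng = np.random.default_rng(0)

# (1) Curse of dimensionality as distance concentration: the spread between
# the nearest and farthest pairwise distances collapses as d grows.
n = 200
for d in (2, 10, 100, 1000):
    X = rng.random((n, d))                                  # uniform in [0,1]^d
    sq = (X ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    dist = np.sqrt(d2[np.triu_indices(n, k=1)])             # all pairwise distances
    print(f"d = {d:4d}: max/min pairwise distance = {dist.max() / dist.min():.2f}")

# (2) Extrinsic vs. intrinsic dimension: sample from a random 2-dimensional
# subspace of R^100 plus small noise. The extrinsic dimension is 100, but a
# PCA (via the SVD of the centered data) reveals the intrinsic dimension 2.
n_pts, D, d_int = 1000, 100, 2
basis, _ = np.linalg.qr(rng.standard_normal((D, d_int)))    # orthonormal 2-D basis
Y = rng.standard_normal((n_pts, d_int)) @ basis.T           # points on the plane
Y += 0.01 * rng.standard_normal((n_pts, D))                 # mild ambient noise
s = np.linalg.svd(Y - Y.mean(axis=0), compute_uv=False)     # singular values
var = s ** 2 / (s ** 2).sum()
print(f"variance captured by the top 2 principal components: {var[:2].sum():.3f}")
```

Run as-is, the first loop's ratio drops sharply toward 1 as d grows, and the final line reports a variance fraction near 1, consistent with the chapter's point that data of high extrinsic dimension often have a low intrinsic dimension.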

Copyright information

© 2012 Higher Education Press, Beijing and Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Wang, J. (2012). Introduction. In: Geometric Structure of High-Dimensional Data and Dimensionality Reduction. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27497-8_1

  • DOI: https://doi.org/10.1007/978-3-642-27497-8_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-27496-1

  • Online ISBN: 978-3-642-27497-8

  • eBook Packages: Computer Science (R0)
