Abstract
This chapter gives an overview of the book. Section 1 briefly introduces high-dimensional data and the need for dimensionality reduction. Section 2 discusses how high-dimensional data are acquired. When the dimension of the data is very high, we encounter the so-called curse of dimensionality, which is discussed in Section 3. Section 4 discusses the extrinsic and intrinsic dimensions of data and points out that most high-dimensional data have low intrinsic dimension; it is this observation that makes dimensionality reduction possible. Finally, Section 5 gives an outline of the book.
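The contrast the abstract draws between the curse of dimensionality (Section 3) and low intrinsic dimension (Section 4) can be seen numerically in a few lines. Below is a minimal NumPy sketch, not taken from the chapter: it first shows pairwise distances concentrating as the ambient dimension grows, then applies a Levina-Bickel-style nearest-neighbor maximum-likelihood estimator to points on a curve embedded in R^100. The sample sizes and the choice k = 10 are illustrative assumptions, not values from the book.

```python
import numpy as np

def pairwise_dist(X):
    """Euclidean distance matrix, computed without an O(n^2 * D) broadcast blow-up."""
    sq = np.sum(X * X, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.sqrt(d2)

rng = np.random.default_rng(0)

# 1. Distance concentration: as the ambient dimension D grows, the ratio of the
#    farthest to the nearest pairwise distance among random points approaches 1,
#    so "nearest neighbor" loses its discriminating power.
for D in (2, 10, 100, 1000):
    d = pairwise_dist(rng.standard_normal((200, D)))
    d = d[np.triu_indices_from(d, k=1)]            # off-diagonal distances only
    print(f"D = {D:4d}: max/min distance ratio = {d.max() / d.min():6.2f}")

# 2. Low intrinsic dimension: a closed curve embedded in R^100 is extrinsically
#    100-dimensional but intrinsically 1-dimensional, and a Levina-Bickel-style
#    nearest-neighbor MLE recovers that from the sample alone.
t = rng.uniform(0.0, 2.0 * np.pi, 500)
B = rng.standard_normal((2, 100))                  # random 2-plane in R^100
X = np.cos(t)[:, None] * B[0] + np.sin(t)[:, None] * B[1]

k = 10
T = np.sort(pairwise_dist(X), axis=1)[:, 1:k + 1]  # k nearest-neighbor distances
inv_m = np.mean(np.log(T[:, -1:] / T[:, :-1]), axis=1)   # 1/m_hat at each point
print("estimated intrinsic dimension:", 1.0 / inv_m.mean())  # close to 1, far from 100
```

The first loop makes the Section 3 phenomenon concrete, and the estimate in the second part is the kind of evidence behind the Section 4 claim that dimensionality reduction is possible for most high-dimensional data.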
Copyright information
© 2012 Higher Education Press, Beijing and Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Wang, J. (2012). Introduction. In: Geometric Structure of High-Dimensional Data and Dimensionality Reduction. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27497-8_1
DOI: https://doi.org/10.1007/978-3-642-27497-8_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27496-1
Online ISBN: 978-3-642-27497-8