Encyclopedia of Database Systems

Living Edition
| Editors: Ling Liu, M. Tamer Özsu

Dimension Reduction Techniques for Clustering

Living reference work entry
DOI: https://doi.org/10.1007/978-1-4899-7993-3_612-2

Definition

High-dimensional datasets are frequently encountered in data mining and statistical learning. Dimension reduction eliminates noisy data dimensions, improving accuracy in classification and clustering while also reducing computational cost. Here the focus is on unsupervised dimension reduction. The most widely used technique is principal component analysis (PCA), which is closely related to K-means clustering. Another popular method is Laplacian embedding, which is closely related to spectral clustering.
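The PCA/K-means pipeline described above can be illustrated with a minimal NumPy sketch (not from this entry; the `pca` and `kmeans` functions below are illustrative implementations, with K-means initialized by a simple farthest-first heuristic): project the data onto its top principal components, then cluster in the reduced space.

```python
import numpy as np

def pca(X, k):
    """Project X (n samples x d features) onto its top-k principal components."""
    Xc = X - X.mean(axis=0)                    # center each feature
    cov = Xc.T @ Xc / (len(Xc) - 1)            # sample covariance matrix
    vals, vecs = np.linalg.eigh(cov)           # eigh returns ascending eigenvalues
    top = vecs[:, np.argsort(vals)[::-1][:k]]  # eigenvectors of the k largest
    return Xc @ top                            # reduced-dimension coordinates

def kmeans(X, k, iters=50):
    """Lloyd's algorithm with a simple farthest-first initialization."""
    centers = [X[0]]
    for _ in range(k - 1):                     # pick points far from chosen centers
        dist = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[dist.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)              # assign each point to nearest center
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels

# Two well-separated Gaussian blobs in 10 dimensions
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 10)),
               rng.normal(3.0, 0.3, (50, 10))])
Z = pca(X, 2)          # reduce 10 -> 2 dimensions
labels = kmeans(Z, 2)  # cluster in the reduced space
```

On such well-separated data, the two-dimensional projection preserves the cluster structure, so K-means in the reduced space recovers the two blobs at a fraction of the cost of clustering in the original ten dimensions.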

Historical Background

Principal component analysis (PCA) was introduced by Pearson in 1901 and formalized by Hotelling in 1933. PCA is the foundation of modern dimension reduction. A large number of linear dimension reduction techniques were developed from the 1950s through the 1970s.

Laplacian graph embedding (also called quadratic placement) was developed by Hall [8] in 1971. Spectral graph partitioning [6] was initially studied in the 1970s; it is...
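Hall's quadratic placement embeds graph vertices using eigenvectors of the graph Laplacian L = D − W, and the eigenvector of the second-smallest eigenvalue (Fiedler's vector [6]) also yields a spectral bipartition. A minimal NumPy sketch (illustrative, not from the entry; the affinity graph and variable names are assumptions for the demo):

```python
import numpy as np

# Twenty points on a line forming two tight groups
rng = np.random.default_rng(0)
pts = np.concatenate([rng.normal(0.0, 0.1, 10), rng.normal(5.0, 0.1, 10)])

W = np.exp(-(pts[:, None] - pts[None, :]) ** 2)  # Gaussian affinity graph
np.fill_diagonal(W, 0.0)                         # no self-loops
D = np.diag(W.sum(axis=1))                       # degree matrix
L = D - W                                        # unnormalized graph Laplacian

vals, vecs = np.linalg.eigh(L)                   # eigenvalues in ascending order
fiedler = vecs[:, 1]                             # second-smallest eigenvector
partition = (fiedler > 0).astype(int)            # its sign gives a 2-way cut
```

Because the two groups are far apart, the affinity graph is nearly disconnected, the second eigenvalue is close to zero, and the sign pattern of the Fiedler vector separates the groups, which is the essence of spectral clustering.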

Keywords

Manifold; Covariance

Recommended Reading

  1. Alpert CJ, Kahng AB. Recent directions in netlist partitioning: a survey. Integr VLSI J. 1995;19:1–81.
  2. Belkin M, Niyogi P. Laplacian eigenmaps and spectral techniques for embedding and clustering. In: Advances in Neural Information Processing Systems 14, 2001.
  3. Chan PK, Schlag M, Zien JY. Spectral k-way ratio-cut partitioning and clustering. IEEE Trans CAD Integr Circuits Syst. 1994;13:1088–96.
  4. Ding C, He X. K-means clustering and principal component analysis. In: Proc. 21st Int. Conf. on Machine Learning, 2004.
  5. Ding C, He X, Zha H, Simon H. Unsupervised learning: self-aggregation in scaled principal component space. In: Principles of Data Mining and Knowledge Discovery, 6th European Conf., 2002. p. 112–24.
  6. Fiedler M. Algebraic connectivity of graphs. Czech Math J. 1973;23:298–305.
  7. Hagen M, Kahng AB. New spectral methods for ratio cut partitioning and clustering. IEEE Trans Comput Aided Des. 1992;11:1074–85.
  8. Hall KM. An r-dimensional quadratic placement algorithm. Manage Sci. 1971;17:219–29.
  9. Ng AY, Jordan MI, Weiss Y. On spectral clustering: analysis and an algorithm. In: Advances in Neural Information Processing Systems 14, 2001.
  10. Pothen A, Simon HD, Liou KP. Partitioning sparse matrices with eigenvectors of graphs. SIAM J Matrix Anal Appl. 1990;11:430–52.
  11. Shi J, Malik J. Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell. 2000;22:888–905.

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. University of Texas at Arlington, Arlington, USA