K-means clustering in a low-dimensional Euclidean space

  • Geert De Soete
  • J. Douglas Carroll
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Summary

A procedure is developed for clustering objects in a low-dimensional subspace of the column space of an objects-by-variables data matrix. The method is based on the K-means criterion and seeks the subspace that is maximally informative about the clustering structure in the data. In this low-dimensional representation, the objects, the variables, and the cluster centroids are displayed jointly. The advantages of the new method are discussed, an efficient alternating least-squares algorithm is described, and the procedure is illustrated on artificial data.
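The summary gives no algorithmic detail, so the following is only a minimal sketch of one common way to combine the K-means criterion with a low-dimensional subspace. It is not necessarily the authors' algorithm. It assumes the loss is the squared distance between each centered object and its cluster centroid reconstructed from the subspace, and it alternates between three steps: updating the centroids, updating an orthonormal loading matrix, and reassigning objects. The function name `reduced_kmeans` and its parameters are hypothetical and chosen for illustration.

```python
import numpy as np

def reduced_kmeans(X, n_clusters, n_dims, n_iter=50, seed=0):
    """Illustrative alternating least-squares sketch (assumed formulation)."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)                      # center the variables
    n, p = X.shape
    labels = rng.integers(n_clusters, size=n)   # random initial partition
    for _ in range(n_iter):
        # Step 1: cluster centroids in the full variable space.
        counts = np.bincount(labels, minlength=n_clusters)
        centroids = np.zeros((n_clusters, p))
        for k in range(n_clusters):
            if counts[k] > 0:
                centroids[k] = X[labels == k].mean(axis=0)
            else:
                # Empty cluster: reseed at a random object (an assumption).
                centroids[k] = X[rng.integers(n)]
        # Step 2: subspace spanned by the leading eigenvectors of the
        # size-weighted between-cluster scatter matrix.
        between = (centroids * counts[:, None]).T @ centroids
        _, eigvecs = np.linalg.eigh(between)    # eigenvalues ascending
        A = eigvecs[:, ::-1][:, :n_dims]        # keep the top n_dims
        # Step 3: reassign each object to the nearest projected centroid.
        F = centroids @ A                       # centroids in the subspace
        Y = X @ A                               # objects in the subspace
        d2 = ((Y[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, A, F
```

In this sketch the rows of `X @ A` (objects), the rows of `A` (variables), and the rows of `F` (centroids) all live in the same low-dimensional space, which is what would allow them to be plotted jointly. Like ordinary K-means, the procedure can stop at a local optimum, so several random starts are advisable.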

Keywords

Singular Value Decomposition; Cluster Structure; Cluster Centroid; Artificial Data; Column Space
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 1994

Authors and Affiliations

  • Geert De Soete (1)
  • J. Douglas Carroll (2)
  1. Department of Data Analysis, University of Ghent, Ghent, Belgium
  2. Graduate School of Management, Rutgers University, Newark, USA
