Order-Constrained Solutions in K-Means Clustering: Even Better Than Being Globally Optimal

Abstract

This paper proposes an order-constrained K-means cluster analysis strategy, and implements that strategy through an auxiliary quadratic assignment optimization heuristic that identifies an initial object order. A subsequent dynamic programming recursion is applied to optimally subdivide the object set subject to the order constraint. We show that although the usual K-means sum-of-squared-error criterion is not guaranteed to be minimal, a true underlying cluster structure may be more accurately recovered. Also, substantive interpretability seems generally improved when constrained solutions are considered. We illustrate the procedure with several data sets from the literature.

This is a preview of subscription content, log in to check access.

References

  1. Al-Sultan, K. (1995). A tabu search approach to the clustering problem. Pattern Recognition, 28, 1443–1451.

    Article  Google Scholar 

  2. Babu, G.P., & Murty, M.N. (1994). Simulated annealing for selecting optimal initial seeds in the K-means algorithm. Indian Journal of Pure and Applied Mathematics, 25, 85–94.

    Google Scholar 

  3. Baker, F.B., & Hubert, L.J. (1977). Applications of combinatorial programming to data analysis: Seriation using asymmetric proximity measures. British Journal of Mathematical and Statistical Psychology, 30, 154–164.

    Google Scholar 

  4. Brusco, M.J., & Cradit, J.D. (2004). Graph coloring, mimimum-diameter partitioning, and the analysis of confusion matrices. Journal of Mathematical Psychology, 48, 301–309.

    Article  Google Scholar 

  5. Brusco, M.J., & Cradit, J.D. (2005). Bicriterion methods for partitioning dissimilarity matrices. British Journal of Mathematical and Statistical Psychology, 58, 319–332.

    Article  PubMed  Google Scholar 

  6. Brusco, M.J., & Stahl, S. (2001a). An interactive multiobjective programming approach to combinatorial data analysis. Psychometrika, 66, 5–24.

    Article  Google Scholar 

  7. Brusco, M.J., & Stahl, S. (2001b). Compact integer-programming models for extracting subsets of stimuli from confusion matrices. Psychometrika, 66, 405–420.

    Article  Google Scholar 

  8. Brusco, M.J., & Stahl, S. (2005). Branch-and-bound applications in combinatorial data analysis. New York: Springer.

    Google Scholar 

  9. Chang, W.C. (1983). On using principal components before separating a mixture of two multivariate normal distributions. Applied Statistics, 32, 267–275.

    Article  Google Scholar 

  10. Delattre, M., & Hansen, P. (1980). Bicriterion cluster analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2, 277–291.

    Article  Google Scholar 

  11. Fisher, R.A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, 179–188.

    Google Scholar 

  12. Fisher, W.D. (1958). On grouping for maximum heterogeneity. Journal of the American Statistical Association, 53, 789–798.

    Article  Google Scholar 

  13. Forgy, E.W. (1965). Cluster analysis of multivariate data: Efficiency versus interpretability of classifications. Biometrics, 21, 768–769.

    Google Scholar 

  14. Hansen, P., & Mladenovic, N. (2001). J-Means: A new local search heuristic for minimum sum of squares clustering. Pattern Recognition, 34, 405–413.

    Article  Google Scholar 

  15. Hartigan, J. (1975). Clustering algorithms. New York: Wiley.

    Google Scholar 

  16. Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2, 193–218.

    Article  Google Scholar 

  17. Hubert, L., Arabie, P., & Meulman, J. (2001). Combinatorial data analysis: Optimization by dynamic programming. Philadelphia: SIAM.

    Google Scholar 

  18. Hubert, L., Arabie, P., & Meulman, J. (2006). The structural representation of proximity matrices with MATLAB. Philadelphia: SIAM.

    Google Scholar 

  19. Kiplinger’s personal finance. In Kiplinger’s personal finance (Vol. 57, pp. 104–123).

  20. MacQueen, J. (1967). Some methods of classification and analysis of multivariate observations. In L.M. Le Cam & J. Neyman (Eds.), Proceedings of the fifth Berkeley symposium on mathematical statistics and probability (Vol. 1, pp. 281–297). University of California Press: Berkeley.

    Google Scholar 

  21. Milligan, G.W. (1980). An examination of the effect of six types of error perturbation on fifteen clustering algorithms. Psychometrika, 45, 325–342.

    Article  Google Scholar 

  22. Milligan, G.W., & Cooper, M.C. (1985). An examination of procedures for determining the number of clusters in a data set. Psychometrika, 50, 159–179.

    Article  Google Scholar 

  23. Milligan, G.W., & Cooper, M.C. (1988). A study of standardization of variables in cluster analysis. Journal of Classification, 5, 181–204.

    Article  Google Scholar 

  24. Mirkin, B. (2005). Clustering for data mining. New York: Chapman & Hall.

    Google Scholar 

  25. Pacheco, J., & Valencia, O. (2003). Design of hybrids for the minimum sum-of-squares clustering problem. Computational Statistics and Data Analysis, 43, 235–248.

    Google Scholar 

  26. Späth, H. (1980). Cluster analysis algorithms. Chichester: Ellis Horwood.

    Google Scholar 

  27. Steinley, D. (2003). K-means clustering: What you don’t know may hurt you. Psychological Methods, 8, 294–304.

    Article  PubMed  Google Scholar 

  28. Steinley, D. (2004a). Standardizing variables in K-means clustering. In D. Banks, L. House, F.R. McMorris, P. Arabie, & W. Gaul (Eds.), Classification, clustering, and data mining applications (pp. 53–60). New York: Springer.

    Google Scholar 

  29. Steinley, D. (2004b). Properties of the Hubert-Arabie adjusted Rand index. Psychological Methods, 9, 386–396.

    Article  PubMed  Google Scholar 

  30. Steinley, D. (2006a). K-means clustering: A half-century synthesis. British Journal of Mathematical and Statistical Psychology, 59, 1–34.

    Article  PubMed  Google Scholar 

  31. Steinley, D. (2006b). Profiling local optima in K-means clustering: Developing a diagnostic technique. Psychological Methods, 11, 178–192.

    Article  PubMed  Google Scholar 

  32. Steinley, D., & Brusco, M.J. (2007). Initializing K-means batch clustering: A critical evaluation of several techniques. Journal of Classification, 24, 99–121.

    Article  Google Scholar 

  33. Steinley, D., & Brusco, M.J. (2008, in press). A new variable weighting and selection procedure for K-means cluster analysis. Multivariate Behavioral Research.

  34. Steinley, D., & Henson, R. (2005). OCLUS: An analytic method for generating clusters with known overlap. Journal of Classification, 22, 221–250.

    Article  Google Scholar 

  35. Theise, E.S. (1989). Finding a subset of stimulus-response pairs with minimum total confusion: A binary integer programming approach. Human Factors, 31, 291–305.

    Google Scholar 

  36. Thorndike, R.L. (1953). Who belongs in the family? Psychometrika, 18, 267–276.

    Article  Google Scholar 

  37. Van Ness, J.W. (1973). Admissible clustering procedures II. Biometrika, 60, 422–424.

    Article  Google Scholar 

Download references

Author information

Affiliations

Authors

Corresponding author

Correspondence to Douglas Steinley.

Rights and permissions

Reprints and Permissions

About this article

Cite this article

Steinley, D., Hubert, L. Order-Constrained Solutions in K-Means Clustering: Even Better Than Being Globally Optimal. Psychometrika 73, 647 (2008). https://doi.org/10.1007/s11336-008-9058-z

Download citation

Keywords

  • K-means cluster analysis
  • dynamic programming
  • quadratic assignment
  • constrained optimization
  • multicriterion optimization