Abstract
Minimization of the within-cluster sums of squares (WCSS) is one of the most important optimization criteria in cluster analysis. Although cluster analysis modules in commercial software packages typically use heuristic methods for this criterion, optimal approaches can be computationally feasible for problems of modest size. This paper presents a new branch-and-bound algorithm for minimizing WCSS. Algorithmic enhancements include an effective reordering of objects and a repetitive solution approach that precludes the need for splitting the data set, while maintaining strong bounds throughout the solution process. The new algorithm provided optimal solutions for problems with up to 240 objects and eight well-separated clusters. Poorly separated problems with no inherent cluster structure were optimally solved for up to 60 objects and six clusters. The repetitive branch-and-bound algorithm was also successfully applied to three empirical data sets from the classification literature.
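As a point of reference, the WCSS of a partition is the sum, over clusters, of the squared Euclidean distances between each object and its cluster centroid. The sketch below illustrates the criterion and a plain depth-first branch-and-bound search for it. It is not the repetitive algorithm presented in the paper: it has no object reordering, and it uses only the weakest valid bound, namely that the WCSS of a partial assignment cannot decrease when further objects are added. The function names `wcss` and `min_wcss_partition` are hypothetical, chosen for illustration.

```python
def wcss(points, labels, k):
    """Sum over clusters of squared Euclidean distances to the cluster centroid."""
    total = 0.0
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        if not members:
            continue
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


def min_wcss_partition(points, k):
    """Illustrative branch and bound (an assumption, not the paper's method).

    Objects are assigned one at a time. A branch is pruned when the WCSS of
    the objects assigned so far already matches or exceeds the best complete
    solution, which is valid because adding an object never lowers WCSS.
    """
    n = len(points)
    best = {"value": float("inf"), "labels": None}
    labels = []

    def branch(used):
        i = len(labels)
        # Prune: too few unassigned objects to make every cluster non-empty.
        if n - i < k - used:
            return
        # Prune: the partial WCSS is a lower bound on any completion.
        if wcss(points[:i], labels, k) >= best["value"]:
            return
        if i == n:
            best["value"] = wcss(points, labels, k)
            best["labels"] = labels.copy()
            return
        # Try each cluster in use, plus at most one new cluster, so that
        # relabelings of the same partition are not explored twice.
        for c in range(min(used + 1, k)):
            labels.append(c)
            branch(max(used, c + 1))
            labels.pop()

    branch(0)
    return best["value"], best["labels"]


if __name__ == "__main__":
    data = [(0.0,), (1.0,), (10.0,), (11.0,)]
    print(min_wcss_partition(data, 2))  # (1.0, [0, 0, 1, 1])
```

With a bound this weak the search is practical only for a handful of objects; the problem sizes reported in the abstract depend on the stronger bounds and reordering that the paper describes.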
About this article
Cite this article
Brusco, M.J. A Repetitive Branch-and-Bound Procedure for Minimum Within-Cluster Sums of Squares Partitioning. Psychometrika 71, 347–363 (2006). https://doi.org/10.1007/s11336-004-1218-1
DOI: https://doi.org/10.1007/s11336-004-1218-1