Marketing Letters, Volume 21, Issue 1, pp 83–101

Evaluation of structure and reproducibility of cluster solutions using the bootstrap

  • Sara Dolnicar
  • Friedrich Leisch


Segmentation results derived using cluster analysis depend on (1) the structure of the data and (2) algorithm parameters. Typically, neither the data structure nor the sensitivity of the analysis to changes in algorithm parameters is assessed in advance of clustering. We propose a benchmarking framework based on bootstrapping techniques that accounts for sample and algorithm randomness. This provides much-needed guidance to both data analysts and users of clustering solutions regarding the choice of the final clusters from computations that are exploratory in nature.
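
As a rough illustration of the general idea, the sketch below draws pairs of bootstrap samples, clusters each sample, assigns the original observations to the resulting centroids, and measures how well the two partitions agree. It is a minimal sketch under stated assumptions, not the procedure from the article: the use of k-means, the adjusted Rand index as the agreement measure, the synthetic data, and all function names (`kmeans`, `assign`, `adjusted_rand`, `bootstrap_stability`) are illustrative choices.

```python
import random
from collections import Counter
from math import comb


def assign(p, centroids):
    """Index of the centroid closest to p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))


def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's k-means on a list of tuples; returns k centroids."""
    centroids = rng.sample(points, k)  # random start: the algorithm randomness
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[assign(p, centroids)].append(p)
        # Empty groups keep their previous centroid.
        new = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
               for i, g in enumerate(groups)]
        if new == centroids:
            break
        centroids = new
    return centroids


def adjusted_rand(a, b):
    """Adjusted Rand index between two label lists of equal length."""
    pairs = sum(comb(v, 2) for v in Counter(zip(a, b)).values())
    pa = sum(comb(v, 2) for v in Counter(a).values())
    pb = sum(comb(v, 2) for v in Counter(b).values())
    expected = pa * pb / comb(len(a), 2)
    max_index = (pa + pb) / 2
    if max_index == expected:  # both partitions trivial
        return 1.0
    return (pairs - expected) / (max_index - expected)


def bootstrap_stability(points, k, n_boot=20, seed=0):
    """Agreement scores between partitions fitted on pairs of bootstrap samples."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_boot):
        labelings = []
        for _ in range(2):
            # Resample with replacement: the sample randomness.
            sample = [rng.choice(points) for _ in points]
            centroids = kmeans(sample, k, rng)
            labelings.append([assign(p, centroids) for p in points])
        scores.append(adjusted_rand(*labelings))
    return scores


# Synthetic data (an assumption for illustration): three separated 2-D groups.
gen = random.Random(1)
data = [(gen.gauss(cx, 0.5), gen.gauss(cy, 0.5))
        for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(30)]

for k in (2, 3, 4, 5):
    scores = bootstrap_stability(data, k, n_boot=10)
    print(k, round(sum(scores) / len(scores), 3))
```

Comparing the distribution of agreement scores across candidate numbers of clusters gives a sense of which solutions are reproducible and which depend heavily on the particular sample or starting values.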


Keywords: Cluster analysis, Mixture models, Bootstrap



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Marketing Research Innovation Centre (MRIC), School of Management and Marketing, University of Wollongong, Wollongong, Australia
  2. Department of Statistics, Ludwig-Maximilians-Universität München, Munich, Germany
