Visualizing Cluster Analysis and Finite Mixture Models

Handbook of Data Visualization

Part of the book series: Springer Handbooks Comp.Statistics (SHCS)

Abstract

Data visualization can greatly enhance our understanding of multivariate data structures, so it is no surprise that cluster analysis and data visualization often go hand in hand, and that textbooks like Gordon (1999) or Everitt et al. (2001) are full of figures. In particular, hierarchical cluster analysis is almost always accompanied by a dendrogram. Results from partitioning cluster analysis can be visualized by projecting the data into two-dimensional space or by using parallel coordinates. Cluster membership is usually represented by different colors and glyphs, or by dividing clusters into several panels of a trellis display (Becker et al., 1996). In addition, silhouette plots (Rousseeuw, 1987) provide a popular tool for diagnosing the quality of a partition. Some of the popularity of self-organizing feature maps (Kohonen, 1989) with practitioners in various fields can be explained by the fact that the results can be “easily” visualized.
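As a rough illustration of the silhouette diagnostic mentioned above, the following sketch computes silhouette widths in the sense of Rousseeuw (1987): for observation i, a(i) is the mean distance to the other members of its own cluster, b(i) is the smallest mean distance to any other cluster, and s(i) = (b(i) - a(i)) / max(a(i), b(i)). The function names `silhouette_widths` and `text_silhouette_plot`, the use of Euclidean distance, and the toy data are assumptions made for this example; they are not taken from the chapter.

```python
import math


def silhouette_widths(points, labels):
    """Return the silhouette width s(i) of every observation.

    Assumes Euclidean distance; any dissimilarity could be substituted.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouettes need at least two clusters")

    def mean_dist(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Usual convention: a singleton cluster gets s(i) = 0.
            widths.append(0.0)
            continue
        a = mean_dist(i, own)
        b = min(mean_dist(i, members)
                for other, members in clusters.items() if other != lab)
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths


def text_silhouette_plot(widths, labels, bar_width=40):
    """Render a crude text version of a silhouette plot, one bar per point."""
    lines = []
    for lab in sorted(set(labels)):
        members = sorted((w for w, l in zip(widths, labels) if l == lab),
                         reverse=True)
        mean_s = sum(members) / len(members)
        lines.append(f"cluster {lab} (n={len(members)}, mean s={mean_s:.2f})")
        for w in members:
            # Negative widths are printed but drawn with an empty bar.
            lines.append(f"  {w:+.2f} " + "#" * round(max(w, 0.0) * bar_width))
    return "\n".join(lines)


# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels = [0, 0, 0, 1, 1, 1]
widths = silhouette_widths(points, labels)
print(text_silhouette_plot(widths, labels))
```

Widths close to 1 indicate observations that sit well inside their cluster, while negative widths flag observations that are on average closer to another cluster; sorting the bars within each cluster is what gives the silhouette plot its characteristic shape.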

References

  • Becker, R., Cleveland, W. and Shyu, M.-J. (1996). The visual design and control of trellis display, Journal of Computational and Graphical Statistics 5:123–155.

  • Everitt, B.S., Landau, S. and Leese, M. (2001). Cluster Analysis, 4th edn, Arnold, London, UK.

  • Fraley, C. and Raftery, A.E. (2002). Model-based clustering, discriminant analysis and density estimation, Journal of the American Statistical Association 97:611–631.

  • Friendly, M. (2000). Visualizing Categorical Data, SAS Press, Cary, NC. ISBN 1-58025-660-0.

  • Gordon, A.D. (1999). Classification, 2nd edn, Chapman & Hall / CRC, Boca Raton, FL, USA.

  • Hartigan, J.A. (1975). Clustering Algorithms, Wiley, New York.

  • Hartigan, J.A. and Kleiner, B. (1984). A mosaic of television ratings, The American Statistician 38(1):32–35.

  • Hartigan, J.A. and Wong, M.A. (1979). Algorithm AS136: A k-means clustering algorithm, Applied Statistics 28(1):100–108.

  • Hennig, C. (2004). Asymmetric linear dimension reduction for classification, Journal of Computational and Graphical Statistics 13(4):1–17.

  • Kaufman, L. and Rousseeuw, P.J. (1990). Finding Groups in Data, Wiley, New York.

  • Kohonen, T. (1989). Self-organization and Associative Memory, 3rd edn, Springer, New York.

  • Lance, G.N. and Williams, W.T. (1967). A general theory of classificatory sorting strategies I. Hierarchical systems, Computer Journal 9:373–380.

  • Leisch, F. (2004). Exploring the structure of mixture model components, in J. Antoch (ed), Compstat 2004 – Proceedings in Computational Statistics, Physica Verlag, Heidelberg, pp. 1405–1412. ISBN 3-7908-1554-3.

  • Leisch, F. (2006). A toolbox for k-centroids cluster analysis, Computational Statistics and Data Analysis 51(2):526–544.

  • Mächler, M., Rousseeuw, P., Struyf, A. and Hubert, M. (2005). cluster: Cluster Analysis. R package version 1.10.0.

  • MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations, in Le Cam, L.M. and Neyman, J. (eds), Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, CA, pp. 281–297.

  • Martinetz, T. and Schulten, K. (1994). Topology representing networks, Neural Networks 7(3):507–522.

  • Meyer, D., Zeileis, A. and Hornik, K. (2005). vcd: Visualizing Categorical Data. R package version 0.9-5.

  • Milligan, G.W. and Cooper, M.C. (1985). An examination of procedures for determining the number of clusters in a data set, Psychometrika 50(2):159–179.

  • Murrell, P. (2005). R Graphics, Chapman & Hall / CRC, Boca Raton, FL.

  • Pison, G., Struyf, A. and Rousseeuw, P.J. (1999). Displaying a clustering with CLUSPLOT, Computational Statistics and Data Analysis 30:381–392.

  • R Development Core Team (2007). R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. http://www.R-project.org

  • Rousseeuw, P.J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis, Journal of Computational and Applied Mathematics 20:53–65.

  • Rousseeuw, P.J., Ruts, I. and Tukey, J.W. (1999). The bagplot: A bivariate boxplot, The American Statistician 53(4):382–387.

  • Tantrum, J., Murua, A. and Stuetzle, W. (2003). Assessment and pruning of hierarchical model based clustering, Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York, pp. 197–205. ISBN 1-58113-737-0.

  • Warnes, G.R. (2005). gplots: Various R programming tools for plotting data. R package version 2.0.8.

  • Wedel, M. and DeSarbo, W.S. (1995). A mixture likelihood approach for generalized linear models, Journal of Classification 12:21–55.

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Leisch, F. (2008). Visualizing Cluster Analysis and Finite Mixture Models. In: Handbook of Data Visualization. Springer Handbooks Comp.Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-33037-0_22
