Statistics and Computing, Volume 20, Issue 4, pp 457–469

Neighborhood graphs, stripes and shadow plots for cluster visualization

  • Friedrich Leisch
Article

Abstract

Centroid-based partitioning cluster analysis is a popular method for segmenting data into more homogeneous subgroups. Visualization can help tremendously to understand the positions of these subgroups relative to each other in higher dimensional spaces and to assess the quality of partitions. In this paper we present several improvements on existing cluster displays using neighborhood graphs with edge weights based on cluster separation and convex hulls of inner and outer cluster regions. A new display called shadow-stars can be used to diagnose pairwise cluster separation with respect to the distribution of the original data. Artificial data and two case studies with real data are used to demonstrate the techniques.
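The abstract does not give formulas, so the following is only a minimal sketch of how a neighborhood graph with separation-based edge weights might be computed. It assumes Euclidean distance, a per-point "shadow value" s(x) = 2 d(x, c1) / (d(x, c1) + d(x, c2)), where c1 and c2 are the closest and second-closest centroids, and an edge weight from cluster i to cluster j equal to the sum of s(x) over points with closest centroid i and second-closest centroid j, divided by the size of cluster i. The function names `shadow_values` and `neighborhood_weights` are hypothetical and chosen for illustration only.

```python
import numpy as np

def shadow_values(X, centroids):
    """Return closest index, second-closest index, and shadow value per point.

    Assumption: Euclidean distance and distinct centroids, so d1 + d2 > 0.
    """
    # Pairwise distances between points and centroids, shape (n, k).
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    first, second = order[:, 0], order[:, 1]
    rows = np.arange(len(X))
    d1, d2 = d[rows, first], d[rows, second]
    # Near 0: point sits on its centroid; near 1: point is on the border
    # between its two closest centroids.
    s = 2 * d1 / (d1 + d2)
    return first, second, s

def neighborhood_weights(first, second, s, k):
    """Directed edge weights W[i, j]; larger means clusters i and j are less separated."""
    W = np.zeros((k, k))
    # Accumulate shadow values over points whose closest centroid is i
    # and whose second-closest centroid is j.
    np.add.at(W, (first, second), s)
    sizes = np.bincount(first, minlength=k)
    nonempty = sizes > 0
    # Normalize each row by the size of cluster i (empty clusters stay 0).
    W[nonempty] /= sizes[nonempty, None]
    return W

# Toy data: two well-separated groups on a line, with given centroids.
X = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [5.0, 0.0]])
centroids = np.array([[0.5, 0.0], [4.5, 0.0]])
first, second, s = shadow_values(X, centroids)
W = neighborhood_weights(first, second, s, k=2)
print(W)
```

Under these assumptions, a weight close to 0 suggests that few points of cluster i lie near the border with cluster j, while a weight close to 1 suggests that the two clusters are poorly separated. The centroids here are supplied by hand; in practice they would come from whatever centroid-based partitioning method is used.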

Keywords

Cluster analysis, Partition, Centroid, Convex hull

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Institut für Statistik, Ludwig-Maximilians-Universität München, Munich, Germany
