Visualizing non-hierarchical and hierarchical cluster analyses with clustergrams

Computational Statistics

Summary

In hierarchical cluster analysis, dendrograms are used to visualize how clusters are formed. Because each observation is displayed, dendrograms are impractical when the data set is large. For non-hierarchical cluster algorithms (e.g. Kmeans), a graph like the dendrogram does not exist. This paper discusses a graph named “clustergram” that shows how cluster members are assigned to clusters as the number of clusters increases. The clustergram can also give insight into the algorithms themselves. For example, it can easily be seen that the “single linkage” algorithm tends to form clusters that consist of just one observation. The clustergram is also useful in distinguishing between random and deterministic implementations of the Kmeans algorithm. A data set related to asbestos claims and the Thailand Landmine Data are used throughout to illustrate the clustergram.
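The central idea of the summary, following how observations are reassigned as the number of clusters grows, can be illustrated with a small numerical sketch. The Python code below is not the author's Stata implementation and does not draw a clustergram. It only tabulates, for successive numbers of clusters, how many observations move from each cluster to each cluster at the next step. The function names `kmeans_1d` and `clustergram_flows`, the one-dimensional toy data, and the rank-based choice of starting centers are all assumptions made for illustration.

```python
from collections import Counter


def kmeans_1d(values, k, iters=100):
    """Deterministic 1-D k-means (illustrative, not the paper's algorithm)."""
    pts = sorted(values)
    n = len(pts)
    # Assumed initialization: k starting centers taken at evenly spaced ranks.
    centers = [pts[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    labels = []
    for _ in range(iters):
        # Assign each value to its nearest center (ties go to the lower index).
        new_labels = [min(range(k), key=lambda j: abs(v - centers[j]))
                      for v in values]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each center to the mean of its members; empty clusters stay put.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels, centers


def clustergram_flows(values, max_k):
    """Count observations moving from each cluster at k to each cluster at k + 1."""
    labelings = {k: kmeans_1d(values, k)[0] for k in range(1, max_k + 1)}
    return {k: Counter(zip(labelings[k], labelings[k + 1]))
            for k in range(1, max_k)}


# Toy data with three well-separated groups (made up for this sketch).
data = [1.0, 1.2, 0.8, 4.0, 4.3, 3.9, 9.1, 9.0, 8.8]
flows = clustergram_flows(data, 3)
for k, table in flows.items():
    for (src, dst), count in sorted(table.items()):
        print(f"k={k} cluster {src} -> k={k + 1} cluster {dst}: {count} observations")
```

On this toy data, the single cluster at k=1 splits into groups of six and three observations at k=2, and the group of six splits again into two groups of three at k=3. Counts of this kind are one possible input for a plot that follows cluster membership across the number of clusters.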



Notes

  1. The Stata ado files can be obtained from www.schonlau.net/clustergram.html or by emailing Matthias_Schonlau@rand.org.


Acknowledgement

I am grateful for support from the RAND statistics group. I am grateful for discussions with Brad Efron, members of the RAND statistics group, participants of the 2002 Augsburg (Germany) workshop on data visualization and for comments from two anonymous referees. I am grateful to Steve Carroll at RAND for involving me in the Asbestos project, which prompted this work. I am grateful to Aldo Benini who was part of the Landmine Impact Project and gave me access to the data.


Cite this article

Schonlau, M. Visualizing non-hierarchical and hierarchical cluster analyses with clustergrams. Computational Statistics 19, 95–111 (2004). https://doi.org/10.1007/BF02915278


  • DOI: https://doi.org/10.1007/BF02915278
