Solving Non-Uniqueness in Agglomerative Hierarchical Clustering Using Multidendrograms

Journal of Classification

Abstract

In agglomerative hierarchical clustering, pair-group methods suffer from non-uniqueness when two or more distances between different clusters coincide during the amalgamation process. The traditional remedy has been to break ties between distances by some arbitrary criterion, which yields different hierarchical classifications depending on the criterion chosen. In this article we propose a variable-group algorithm that groups more than two clusters at the same time when ties occur. We give a tree representation for the results of the algorithm, which we call a multidendrogram, as well as a generalization of the Lance and Williams formula that enables a recursive implementation of the algorithm.
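To make the variable-group idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes complete linkage computed directly from the original distances (rather than through the generalized Lance and Williams recursion), and it assumes that tied clusters are grouped by taking connected components of the graph whose edges are the tied minimum-distance pairs. The function name `variable_group_clustering` and the tolerance parameter `tol` are hypothetical.

```python
from itertools import combinations


def variable_group_clustering(dist, tol=1e-12):
    """Agglomerate clusters, merging ALL clusters tied at the minimum distance.

    dist: symmetric matrix (list of lists) of pairwise distances.
    Returns a list of (height, merged_groups) steps, where merged_groups
    lists the member clusters joined at that step.
    Assumption: complete linkage; ties are distances within `tol` of the minimum.
    """
    clusters = [frozenset([i]) for i in range(len(dist))]

    def linkage(a, b):
        # Complete linkage: largest original distance between the two clusters.
        return max(dist[i][j] for i in a for j in b)

    steps = []
    while len(clusters) > 1:
        pairs = {(a, b): linkage(a, b) for a, b in combinations(clusters, 2)}
        d_min = min(pairs.values())
        tied = [p for p, d in pairs.items() if d - d_min <= tol]

        # Union-find over clusters: tied pairs are edges, and each connected
        # component becomes one merged cluster.
        parent = {c: c for c in clusters}

        def find(c):
            while parent[c] != c:
                c = parent[c]
            return c

        for a, b in tied:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[rb] = ra

        groups = {}
        for c in clusters:
            groups.setdefault(find(c), []).append(c)

        new_clusters = []
        for members in groups.values():
            if len(members) > 1:
                steps.append((d_min, sorted(sorted(m) for m in members)))
                new_clusters.append(frozenset().union(*members))
            else:
                new_clusters.append(members[0])
        clusters = new_clusters
    return steps


# Three objects with a tie: d(0,1) = d(1,2) = 1, d(0,2) = 2.
tied_example = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
print(variable_group_clustering(tied_example))
```

With the tied input above, a pair-group method would have to choose between merging {0, 1} or {1, 2} first and would produce two different dendrograms, whereas this sketch merges all three objects in a single step. When no ties occur, each step merges exactly two clusters, as in the ordinary pair-group procedure.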

References

  • ARNAU, V., MARS, S., and MARÍN, I. (2005), “Iterative Cluster Analysis of Protein Interaction Data,” Bioinformatics, 21(3), 364–378.

  • BACKELJAU, T., DE BRUYN, L., DE WOLF, H., JORDAENS, K., VAN DONGEN, S., and WINNEPENNINCKX, B. (1996), “Multiple UPGMA and Neighbor-Joining Trees and the Performance of Some Computer Packages,” Molecular Biology and Evolution, 13(2), 309–313.

  • CORMACK, R.M. (1971), “A Review of Classification” (with discussion), Journal of the Royal Statistical Society, Ser. A, 134, 321–367.

  • GORDON, A.D. (1999), Classification (2nd ed.), London/Boca Raton, FL: Chapman & Hall/CRC.

  • HART, G. (1983), “The Occurrence of Multiple UPGMA Phenograms,” in Numerical Taxonomy, ed. J. Felsenstein, Berlin Heidelberg: Springer-Verlag, pp. 254–258.

  • LANCE, G.N., and WILLIAMS, W.T. (1966), “A Generalized Sorting Strategy for Computer Classifications,” Nature, 212, 218.

  • MACCUISH, J., NICOLAOU, C., and MACCUISH, N.E. (2001), “Ties in Proximity and Clustering Compounds,” Journal of Chemical Information and Computer Sciences, 41, 134–146.

  • MORGAN, B.J.T., and RAY, A.P.G. (1995), “Non-uniqueness and Inversions in Cluster Analysis,” Applied Statistics, 44(1), 117–134.

  • SNEATH, P.H.A., and SOKAL, R.R. (1973), Numerical Taxonomy: The Principles and Practice of Numerical Classification, San Francisco: W. H. Freeman and Company.

  • SZÉKELY, G.J., and RIZZO, M.L. (2005), “Hierarchical Clustering via Joint Between-Within Distances: Extending Ward’s Minimum Variance Method,” Journal of Classification, 22, 151–183.

  • VAN DER KLOOT, W.A., SPAANS, A.M.J., and HEISER, W.J. (2005), “Instability of Hierarchical Cluster Analysis Due to Input Order of the Data: The PermuCLUSTER Solution,” Psychological Methods, 10(4), 468–476.

  • WARD, J.H., Jr. (1963), “Hierarchical Grouping to Optimize an Objective Function,” Journal of the American Statistical Association, 58, 236–244.

Author information

Corresponding author

Correspondence to Sergio Gómez.

Additional information

The authors thank A. Arenas for discussion and helpful comments. This work was partially supported by DGES of the Spanish Government Project No. FIS2006–13321–C02–02 and by a grant from Universitat Rovira i Virgili.

About this article

Cite this article

Fernández, A., Gómez, S. Solving Non-Uniqueness in Agglomerative Hierarchical Clustering Using Multidendrograms. J Classif 25, 43–65 (2008). https://doi.org/10.1007/s00357-008-9004-x

  • DOI: https://doi.org/10.1007/s00357-008-9004-x
