Comparison tests for dendrograms: A comparative evaluation

Lapointe, François-Joseph; Legendre, Pierre

doi:10.1007/BF03040858

Published: September 1995

Volume 12, pages 265–282 (1995)

Journal of Classification

Abstract

Classifications are generally pictured in the form of hierarchical trees, also called dendrograms. A dendrogram is the graphical representation of an ultrametric (=cophenetic) matrix, so dendrograms can be compared to one another by comparing their cophenetic matrices. Three methods used to test the correlation between matrices corresponding to dendrograms are evaluated. The three permutational procedures use different aspects of the information in a dendrogram: the Mantel procedure permutes label positions only; the binary tree methods randomize the topology as well; the double-permutation procedure is based on all the information included in a dendrogram, that is, topology, label positions, and cluster heights. Theoretical and empirical investigations of these methods are carried out to evaluate their relative performance. Simulations show that the Mantel test is too conservative when applied to the comparison of dendrograms; the binary tree methods do slightly better; only the double-permutation test provides unbiased type I error.
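
As an illustration of the simplest of the three procedures, the following is a minimal sketch of a Mantel-type permutation test applied to the cophenetic matrices of two dendrograms. It is not the authors' implementation: the function names `cophenetic_matrix` and `mantel_test`, the use of average linkage, the one-sided (upper-tail) p-value, and the simulated data are all assumptions made for the example. Only the label positions are permuted, so topology and cluster heights stay fixed, which is the restriction the abstract associates with the Mantel procedure.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import pdist, squareform


def cophenetic_matrix(data, method="average"):
    """Cluster the rows of `data` and return the square cophenetic matrix."""
    tree = linkage(pdist(data), method=method)
    # cophenet(tree) returns the condensed cophenetic distances.
    return squareform(cophenet(tree))


def mantel_test(a, b, n_perm=999, seed=0):
    """Permutation test of the correlation between two square matrices.

    Rows and columns of `b` are permuted together, so only the label
    positions change; topology and cluster heights are left untouched.
    Returns the observed correlation and a one-sided p-value.
    """
    rng = np.random.default_rng(seed)
    n = a.shape[0]
    upper = np.triu_indices(n, k=1)  # each pair of objects counted once
    a_vec = a[upper]
    observed = np.corrcoef(a_vec, b[upper])[0, 1]
    count = 1  # the observed value is part of the reference distribution
    for _ in range(n_perm):
        p = rng.permutation(n)
        permuted = b[np.ix_(p, p)]
        if np.corrcoef(a_vec, permuted[upper])[0, 1] >= observed:
            count += 1
    return observed, count / (n_perm + 1)


# Hypothetical data: two noisy views of the same 12 objects.
rng = np.random.default_rng(1)
x = rng.normal(size=(12, 4))
y = x + rng.normal(scale=0.1, size=(12, 4))
r, p_value = mantel_test(cophenetic_matrix(x), cophenetic_matrix(y))
print(f"matrix correlation = {r:.3f}, p = {p_value:.3f}")
```

According to the abstract, this kind of test is too conservative for dendrograms; the double-permutation procedure would also randomize the topology and the cluster heights, which this sketch does not attempt.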

Summary

The trees used to illustrate groupings are generally drawn as hierarchical classifications, or dendrograms. A dendrogram is a graphical representation of the information contained in the ultrametric (=cophenetic) matrix corresponding to the classification. Dendrograms can therefore be compared through their corresponding ultrametric matrices. We compare three methods for assessing the statistical significance of the correlation coefficient measured between two ultrametric matrices. These three permutation tests take different aspects into account when comparing dendrograms: the Mantel test permutes the leaves of the tree; the binary tree methods permute the leaves and the topology; and the double-permutation procedure permutes the leaves, the topology, and the fusion levels of the dendrograms being compared. The relative efficiency of the three methods is evaluated empirically and theoretically. Our results suggest that the double-permutation test should be preferred for comparing dendrograms: the Mantel test proves too conservative, while the binary tree methods are not always adequate.

Author information

François-Joseph Lapointe & Pierre Legendre
Present address: Département de Sciences biologiques, Université de Montréal, C.P. 6128, Succ. Centre-ville, H3C 3J7, Montréal, Québec, Canada

Additional information

This work was supported by NSERC grant no. A7738 to Pierre Legendre and by an NSERC scholarship to F.-J. Lapointe.

About this article

Cite this article

Lapointe, FJ., Legendre, P. Comparison tests for dendrograms: A comparative evaluation. Journal of Classification 12, 265–282 (1995). https://doi.org/10.1007/BF03040858

Issue Date: September 1995
DOI: https://doi.org/10.1007/BF03040858
