Comparison tests for dendrograms: A comparative evaluation

Journal of Classification

Abstract

Classifications are generally pictured in the form of hierarchical trees, also called dendrograms. A dendrogram is the graphical representation of an ultrametric (=cophenetic) matrix; so dendrograms can be compared to one another by comparing their cophenetic matrices. Three methods used in testing the correlation between matrices corresponding to dendrograms are evaluated. The three permutational procedures make use of different aspects of the information to compare dendrograms: the Mantel procedure permutes label positions only; the binary tree methods randomize the topology as well; the double-permutation procedure is based on all the information included in a dendrogram, that is, topology, label positions, and cluster heights. Theoretical and empirical investigations of these methods are carried out to evaluate their relative performance. Simulations show that the Mantel test is too conservative when applied to the comparison of dendrograms; the methods of binary tree comparisons do slightly better; only the double-permutation test provides unbiased type I error.
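As an illustration of the simplest of the three procedures, the sketch below runs a Mantel-style permutation test on two small cophenetic matrices: the labels of one matrix are permuted (rows and columns together) and the matrix correlation is recomputed each time. It is a minimal sketch and not the authors' implementation. The function names (`upper_triangle`, `pearson`, `mantel_test`), the example matrices `A` and `B`, the one-sided test, and the default of 999 permutations are all assumptions made for the example. The binary tree and double-permutation procedures are not shown, because the abstract does not describe them in enough detail.

```python
import random
from itertools import combinations
from math import sqrt


def upper_triangle(matrix, order):
    """Return the above-diagonal entries of `matrix` after relabelling by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i, j in combinations(range(n), 2)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(a, b, n_perm=999, seed=0):
    """One-sided Mantel test: permute the labels of `b` only (hypothetical helper)."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    x = upper_triangle(a, identity)
    observed = pearson(x, upper_triangle(b, identity))
    hits = 1  # the observed value is counted as one of the permutations
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)  # new label positions; topology and heights are unchanged
        if pearson(x, upper_triangle(b, order)) >= observed - 1e-12:
            hits += 1
    return observed, hits / (n_perm + 1)


# Two illustrative cophenetic (ultrametric) matrices on five objects (made-up data).
A = [[0, 1, 3, 3, 4],
     [1, 0, 3, 3, 4],
     [3, 3, 0, 2, 4],
     [3, 3, 2, 0, 4],
     [4, 4, 4, 4, 0]]
B = [[0, 3, 1, 4, 3],
     [3, 0, 3, 4, 2],
     [1, 3, 0, 4, 3],
     [4, 4, 4, 0, 4],
     [3, 2, 3, 4, 0]]

r_ab, p_ab = mantel_test(A, B)
```

Because only label positions are shuffled, every permuted matrix keeps the topology and cluster heights of the original dendrogram, which is the restriction the abstract contrasts with the other two procedures.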

Résumé

The trees used to illustrate groupings are generally represented as hierarchical classifications, or dendrograms. A dendrogram graphically represents the information contained in the ultrametric (=cophenetic) matrix corresponding to the classification. Dendrograms can therefore be compared through their corresponding ultrametric matrices. We compare three methods for assessing the statistical significance of the correlation coefficient measured between two ultrametric matrices. These three permutation tests take different aspects into account when comparing dendrograms: the Mantel test permutes the leaves of the tree; the methods for binary trees permute the leaves and the topology; the double-permutation procedure permutes the leaves, the topology, and the fusion levels of the dendrograms being compared. The relative efficiency of the three methods is evaluated empirically and theoretically. Our results suggest that the double-permutation test should be preferred for comparing dendrograms: the Mantel test proves too conservative, while the methods for binary trees are not always adequate.

Additional information

This work was supported by NSERC grant no. A7738 to Pierre Legendre and by an NSERC scholarship to F.-J. Lapointe.

Cite this article

Lapointe, FJ., Legendre, P. Comparison tests for dendrograms: A comparative evaluation. Journal of Classification 12, 265–282 (1995). https://doi.org/10.1007/BF03040858

  • DOI: https://doi.org/10.1007/BF03040858
