Abstract
Classifications are generally pictured in the form of hierarchical trees, also called dendrograms. A dendrogram is the graphical representation of an ultrametric (= cophenetic) matrix, so dendrograms can be compared to one another by comparing their cophenetic matrices. Three methods used in testing the correlation between matrices corresponding to dendrograms are evaluated. The three permutational procedures make use of different aspects of the information to compare dendrograms: the Mantel procedure permutes label positions only; the binary tree methods randomize the topology as well; the double-permutation procedure is based on all the information included in a dendrogram, that is, topology, label positions, and cluster heights. Theoretical and empirical investigations of these methods are carried out to evaluate their relative performance. Simulations show that the Mantel test is too conservative when applied to the comparison of dendrograms; the methods of binary tree comparisons do slightly better; only the double-permutation test provides unbiased type I error.
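As an illustration of the simplest of the three procedures, the following is a minimal sketch of a Mantel-type permutation test applied to two cophenetic matrices. It is not the authors' implementation: the function names (`pearson`, `upper_triangle`, `mantel_test`), the two small example matrices `A` and `B`, the number of permutations, and the one-tailed test direction are all assumptions made for this example. Only label positions are permuted here; the double-permutation procedure described above would additionally randomize topology and cluster heights, which this sketch does not attempt.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(m, order):
    """Off-diagonal entries of m after relabelling the objects by `order`."""
    pairs = combinations(range(len(order)), 2)
    return [m[order[i]][order[j]] for i, j in pairs]


def mantel_test(a, b, n_perm=999, seed=0):
    """One-tailed Mantel test: labels of b are permuted, a is held fixed."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    x = upper_triangle(a, identity)
    r_obs = pearson(x, upper_triangle(b, identity))
    hits = 1  # the observed statistic is counted among the permutations
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)
        if pearson(x, upper_triangle(b, order)) >= r_obs:
            hits += 1
    return r_obs, hits / (n_perm + 1)


# Hypothetical cophenetic matrices for five objects: same topology
# ((0,1),(2,(3,4))) but different cluster heights.
A = [
    [0, 1, 3, 3, 3],
    [1, 0, 3, 3, 3],
    [3, 3, 0, 2, 2],
    [3, 3, 2, 0, 1],
    [3, 3, 2, 1, 0],
]
B = [
    [0, 2, 4, 4, 4],
    [2, 0, 4, 4, 4],
    [4, 4, 0, 3, 3],
    [4, 4, 3, 0, 1],
    [4, 4, 3, 1, 0],
]

r, p = mantel_test(A, B)
print(f"matrix correlation r = {r:.3f}, permutational p = {p:.3f}")
```

Because the permutation only relabels the objects of one matrix, every permuted matrix keeps the topology and cluster heights of the original dendrogram, which is the restriction the abstract attributes to the Mantel procedure.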
Résumé
The trees used to illustrate groupings are generally represented as hierarchical classifications, or dendrograms. A dendrogram graphically represents the information contained in the ultrametric (= cophenetic) matrix corresponding to the classification. Dendrograms can therefore be compared by comparing the corresponding ultrametric matrices. We compare three methods for assessing the statistical significance of the correlation coefficient measured between two ultrametric matrices. These three permutation tests take different aspects into account when comparing dendrograms: the Mantel test permutes the leaves of the tree; the methods for binary trees permute the leaves and the topology; and the double-permutation procedure permutes the leaves, the topology, and the fusion levels of the dendrograms being compared. The relative efficiency of the three methods is evaluated empirically and theoretically. Our results suggest that the double-permutation test should be preferred for comparing dendrograms: the Mantel test proves too conservative, whereas the methods for binary trees are not always adequate.
Additional information
This work was supported by NSERC grant no. A7738 to Pierre Legendre and by an NSERC scholarship to F.-J. Lapointe.
About this article
Cite this article
Lapointe, FJ., Legendre, P. Comparison tests for dendrograms: A comparative evaluation. Journal of Classification 12, 265–282 (1995). https://doi.org/10.1007/BF03040858
DOI: https://doi.org/10.1007/BF03040858