Abstract
Many methods and algorithms to generate random trees of many kinds have been proposed in the literature. No procedure exists however for the generation of dendrograms with randomized fusion levels. Randomized dendrograms can be obtained by randomizing the associated cophenetic matrix. Two algorithms are described. The first one generates completely random dendrograms, i.e., trees with a random topology, random fusion level values, and random assignment of the labels. The second algorithm uses a double-permutation procedure to randomize a given dendrogram; it proceeds by randomization of the fixed fusion levels, instead of using random fusion level values. A proof is presented that the double-permutation procedure is a Uniform Random Generation Algorithmsensu Furnas (1984), and a complete example is given.
Résumé
On retrouve dans la littérature plusieurs méthodes et algorithmes destinés à générer des arbres aléatoires de toutes sortes. Il n'existe cependant aucune procédure permettant la génération de dendrogrammes comportant des niveaux de fusion aléatoires. De tels dendrogrammes peuvent être obtenus à partir des matrices cophénétiques associées. Nous décrivons deux algorithmes pour ce faire. Le premier permet de générer des dendrogrammes complètement aléatoires, c'est-à-dire des arbres possédant une topologie aléatoire, des niveaux de fusion aléatoires ainsi que des feuilles étiquetées de façon aléatoire. Le deuxième algo- rithme utilise une procédure à double permutation afin de randomiser un dendrogramme donné; on proc de dans ce cas à la permutation des véritables niveaux de fusion au lieu de génèrer des niveaux aléatoires. Nous présentons la preuve démontrant que la procédure à double permutation représente un Algorithme de Génération Aléatoire Uniformesensu Furnas (1984). Un exemple complet est également fourni.
Similar content being viewed by others
References
COLLESS, D. H. (1980), “Congruence Between Morphometric and Allozyme Data forMenidia Species: a Reappraisal,”Systematic Zoology, 29, 288–299.
DE SOETE, G. (1984), “Ultrametric Tree Representations of Incomplete Dissimilarity Data,”Journal of Classification, 1, 235–242.
FAITH, D.P., and BELBIN, L. (1986), “Comparison of Classifications Using Measures Intermediate Between Metric Dissimilarity and Consensus Similarity,”Journal of Classification, 3, 257–280.
FELSENSTEIN, J. (1978), “The Number of Evolutionary Trees,”Systematic Zoology, 27, 27–33.
FELSENSTEIN, J. (1985), “Confidence Limits on Phylogenies: An Approach Using the Bootstrap,”Evolution, 39, 783–791.
FRANK, O., and SVENSSON, K. (1981), “On Probability Distributions of Single-Linkage Dendrograms,”Journal of Statistics and Computer Simulation, 12, 121–131.
FURNAS, G. W. (1984), “The Generation of Random, Binary Unordered Trees,”Journal of Classification, 1, 187–233.
GÖBEL, F. (1980), “On a 1-1 Correspondence Between Rooted Trees and Natural Numbers,”Journal of Combinatorial Theory, Series B, 29, 141–143.
GOWER, J. C., and LEGENDRE, P. (1986), “Metric and Euclidean Properties of Dissimilarity Coefficients,”Journal of Classification, 3, 5–48.
GUENOCHE, A. (1983), “Random Spanning Trees,”Journal of Algorithms, 4, 214–220.
HAJDU L. J. (1981), “Graphical Comparison of Resemblance Measures in Phytosociology,”Vegetatio, 48, 47–59.
HARDING, E. F. (1971), “The Probabilities of Rooted Tree-Shapes Generated by Random Bifurcation,”Advances in Applied Probability, 3, 44–77.
HARTIGAN, J. A. (1967), “Representation of Similarity Matrices by Trees,”Journal of the American Statistical Association, 62, 1140–1158.
KNOTT, G. D. (1977), “A Numbering System for Binary Trees,”Communication of the Association for Computing Machinery,20(2).
LAPOINTE, F.-J., and LEGENDRE, P. (1990), “A Statistical Framework to Test the Consensus of Two Nested Classifications,”Systematic Zoology, 39, 1–14.
LAPOINTE, F.-J., and LEGENDRE, P. (submitted), “A Statistical Framework to Test the Consensus among Additive Trees (Cladograms).”
MICKEVICH, M. F. (1978), “Taxonomic Congruence,”Systematic Zoology, 27, 143–158.
MURTAGH, F. (1983), “A Probability Theory of Hierarchic Clustering Using Random Dendrograms,”Journal of Statistics and Computer Simulation, 18, 145–157.
MURTAGH, F. (1984), “Counting Dendrograms: A Survey,”Discrete Applied Mathematics, 7, 191–199.
NEMEC, A. F. L., and BRINKHURST, R. O. (1988), “Using the Bootstrap to Assess Statistical Significance in the Cluster Analysis of Species Abundance Data,”Canadian Journal of Fisheries and Aquatic Sciences, 45, 965–970.
NIJENHUIS, A., and WILF, H. S. (Eds.) (1978),Combinatorial Algorithms for Computers and Calculators, Second Edition, New York: Academic Press.
ODEN, N. L., and SHAO, K. T. (1984), “An Algorithm to Equiprobably Generate All Directed Trees With k Labeled Terminal Nodes and Unlabeled Interior Nodes,”Bulletin of Mathematical Biology, 46, 379–387.
PAGE, R. D. M. (1988), “Quantitative Cladistic Biogeography: Constructing and Comparing Area Cladograms,”Systematic Zoology, 37, 254–270.
PAGE, R. D. M. (1990), “Temporal Congruence and Cladistic Analysis of Biogeography and Cospeciation,”Systematic Zoology, 39, 205–226.
PHIPPS J. B. (1975), “The Numbers of Classifications,”Canadian Journal of Botany, 54, 686–688.
PROSKUROWSKI, A. (1980), “On the Generation of Binary Trees,”Journal of the Association for Computing Machinery, 27, 1–2.
QUIROZ, A. J. (1989), “Fast Random Generation of Binary, t-ary and Other Types of Trees,”Journal of Classification, 6, 223–231.
ROHLF, F. J. (1983), “Numbering Binary Trees With Labeled Terminal Vertices,”Bulletin of Mathematical Biology, 45, 33–40.
ROSEN, D. E. (1978), “Vicariant Patterns and Historical Explanation in Biogeography,”Systematic Zoology, 27, 159–188.
ROTEM, D., and VAROL, Y. L. (1978), “Generation of Binary Trees from Ballot Sequences,”Journal of the Association for Computing Machinery, 25, 396–404.
SHAO, K., and ROHLF, F. J. (1983), “Sampling Distributions of Consensus Indices when all Bifurcating Trees are Equally Likely” inNumerical Taxonomy, ed., J. Felsenstein, NATO Advanced Studies Institute, Ser. G. (Ecological Sciences) 1, Berlin: Springer Verlag, 132–137.
SHAO, K., and SOKAL, R. R. (1986), “Significance Tests of Consensus Indices,”Systematic Zoology, 35, 582–590.
SIBSON, R. (1972), “Order Invariant Methods for Data Analysis,”Journal of the Royal Statistical Society, B 34, 311–349.
SNEATH, P. H. A., and SOKAL, R. R. (1973),Numerical Taxonomy, San Francisco: Freeman.
SOKAL, R. R., and ROHLF, F. J. (1962), “The Comparison of Dendrograms by Objective Methods,”Taxon, 11, 33–40.
SOLOMON, M., and FINKEL, R. A. (1980), “A Note on Enumerating Binary Trees,”Journal of the Association for Computing Machinery, 27, 3–5.
Author information
Authors and Affiliations
Additional information
This work was supported by NSERC Grant No. A7738 to P. Legendre and by a NSERC scholarship to F.-J. Lapointe.
Rights and permissions
About this article
Cite this article
Lapointe, FJ., Legendre, P. The generation of random ultrametric matrices representing dendrograms. Journal of Classification 8, 177–200 (1991). https://doi.org/10.1007/BF02616238
Issue Date:
DOI: https://doi.org/10.1007/BF02616238