Journal of Classification, Volume 29, Issue 3, pp 321–340

Recognizing Treelike k-Dissimilarities

  • Sven Herrmann
  • Katharina T. Huber
  • Vincent Moulton
  • Andreas Spillner

Abstract

A k-dissimilarity D on a finite set X, |X| ≥ k, is a map from the set of size-k subsets of X to the real numbers. Such maps naturally arise from edge-weighted trees T with leaf-set X: given a subset Y of X of size k, D(Y) is defined to be the total length of the smallest subtree of T with leaf-set Y. In case k = 2, it is well known that 2-dissimilarities arising in this way can be characterized by the so-called “4-point condition”. However, in case k > 2, Pachter and Speyer (2004) recently posed the following question: given an arbitrary k-dissimilarity, how do we test whether this map comes from a tree? In this paper, we provide an answer to this question, showing that for k ≥ 3 a k-dissimilarity on a set X arises from a tree if and only if its restriction to every 2k-element subset of X arises from some tree, and that 2k is the least possible subset size to ensure that this is the case. As a corollary, we show that there exists a polynomial-time algorithm to determine when a k-dissimilarity arises from a tree. We also give a 6-point condition for determining when a 3-dissimilarity arises from a tree that is similar to the aforementioned 4-point condition.
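To make the definitions in the abstract concrete, the following is a minimal Python sketch, not taken from the paper. It computes D(Y) for an edge-weighted tree by summing the weights of the edges that separate Y, and it tests the 4-point condition on a 2-dissimilarity. The function names, the edge-list representation, the tolerance parameter, and the example tree are all assumptions made for illustration. The 4-point check only covers quadruples of distinct elements, and the recognition procedure for k ≥ 3 described in the paper is not implemented here.

```python
from itertools import combinations

def subtree_weight(edges, subset):
    """Total weight of the smallest subtree of the tree connecting `subset`.

    `edges` is a list of (u, v, weight) triples describing a tree
    (hypothetical representation chosen for this sketch).
    An edge lies in the smallest subtree exactly when removing it
    leaves members of `subset` on both sides.
    """
    adj = {}
    for u, v, _ in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    subset = set(subset)
    total = 0.0
    for u, v, w in edges:
        # Collect the vertices on u's side once edge {u, v} is removed.
        side, stack = {u}, [u]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in side and not (x == u and y == v):
                    side.add(y)
                    stack.append(y)
        inside = len(subset & side)
        if 0 < inside < len(subset):
            total += w
    return total

def k_dissimilarity(edges, leaves, k):
    """Map every size-k subset of `leaves` to its subtree weight."""
    return {frozenset(Y): subtree_weight(edges, Y)
            for Y in combinations(leaves, k)}

def satisfies_four_point(D, leaves, tol=1e-9):
    """Check the 4-point condition on quadruples of distinct elements:
    among the three pairings, the two largest sums must coincide."""
    def d(a, b):
        return D[frozenset((a, b))]
    for x, y, z, w in combinations(leaves, 4):
        sums = sorted([d(x, y) + d(z, w),
                       d(x, z) + d(y, w),
                       d(x, w) + d(y, z)])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True

# Illustrative quartet tree: leaves a, b, c, d and internal vertices u, v.
EDGES = [("a", "u", 1), ("b", "u", 2), ("u", "v", 3),
         ("c", "v", 4), ("d", "v", 5)]
LEAVES = ["a", "b", "c", "d"]

D2 = k_dissimilarity(EDGES, LEAVES, 2)  # pairwise path lengths
D3 = k_dissimilarity(EDGES, LEAVES, 3)  # a 3-dissimilarity from the same tree
```

In this example D2 is induced by a tree and therefore passes `satisfies_four_point`, whereas changing a single value of D2 arbitrarily will in general break the condition.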

Keywords

k-dissimilarity · Phylogenetic tree · Dissimilarity · Metric · 4-point condition · Ultrametric condition · Equidistant tree

References

  1. BANDELT, H.-J. (1990), “Recognition of Tree Metrics,” SIAM Journal on Discrete Mathematics, 3, 1–6.
  2. BANDELT, H.-J., and DRESS, A.W.M. (1994), “An Order-Theoretic Framework for Overlapping Clustering,” Discrete Mathematics, 136, 21–37.
  3. BOCCI, C., and COOLS, F. (2009), “A Tropical Interpretation of m-Dissimilarity Maps,” Applied Mathematics and Computation, 212, 349–356.
  4. BUNEMAN, P. (1971), “The Recovery of Trees from Measures of Dissimilarity,” in Mathematics in the Archaeological and Historical Sciences, eds. D.G. Kendall and P. Tautu, Edinburgh: Edinburgh University Press, pp. 387–395.
  5. CHEPOI, V., and FICHET, B. (2007), “A Note on Three-Way Dissimilarities and Their Relationship with Two-Way Dissimilarities,” in Selected Contributions in Data Analysis and Classification, ed. P. Brito et al., Berlin: Springer, pp. 465–475.
  6. CULBERSON, J., and RUDNICKI, P. (1989), “A Fast Algorithm for Constructing Trees from Distance Matrices,” Information Processing Letters, 30, 215–220.
  7. DE SOETE, G. (1983), “A Least Squares Algorithm for Fitting Additive Trees to Proximity Data,” Psychometrika, 48, 621–626.
  8. DEZA, M.-M., and ROSENBERG, I.G. (2000), “n-Semimetrics,” European Journal of Combinatorics, 21, 797–806.
  9. DRESS, A.W.M., HUBER, K.T., KOOLEN, J., MOULTON, V., and SPILLNER, A. (2011), Basic Phylogenetic Combinatorics, Cambridge: Cambridge University Press.
  10. DRESS, A.W.M., and STEEL, M. (2007), “Phylogenetic Diversity over an Abelian Group,” Annals of Combinatorics, 11, 143–160.
  11. FAITH, D.P. (1992), “Conservation Evaluation and Phylogenetic Diversity,” Biological Conservation, 61, 1–10.
  12. FELSENSTEIN, J. (2003), Inferring Phylogenies, Sunderland, Massachusetts: Sinauer Associates.
  13. GORDON, A.D. (1987), “A Review of Hierarchical Classification,” Journal of the Royal Statistical Society. Series A. General, 150, 119–137.
  14. GRISHIN, N. (1999), “A Novel Approach to Phylogeny Reconstruction from Protein Sequences,” Journal of Molecular Evolution, 48, 264–273.
  15. HAYASHI, C. (1972), “Two Dimensional Quantification Based on the Measure of Dissimilarity Among Three Elements,” Annals of the Institute of Statistical Mathematics, 24, 251–257.
  16. HEISER, W.J., and BENNANI, M. (1997), “Triadic Distance Models: Axiomatization and Least Squares Representation,” Journal of Mathematical Psychology, 41, 189–206.
  17. JOLY, S., and LE CALVÉ, G. (1995), “Three-Way Distances,” Journal of Classification, 12, 191–205.
  18. LEVY, D., YOSHIDA, R., and PACHTER, L. (2006), “Beyond Pairwise Distances: Neighbor-Joining with Phylogenetic Diversity Estimates,” Molecular Biology and Evolution, 23, 491–498.
  19. PACHTER, L., and SPEYER, D. (2004), “Reconstructing Trees from Subtree Weights,” Applied Mathematics Letters, 17, 615–621.
  20. RUBEI, E. (2011), “Sets of Double and Triple Weights of Trees,” Annals of Combinatorics, 15, 723–734.
  21. SCHRIJVER, A. (1986), Theory of Linear and Integer Programming, Wiley-Interscience Series in Discrete Mathematics, Chichester: John Wiley & Sons Ltd.
  22. SEMPLE, C., and STEEL, M. (2003), Phylogenetics (Vol. 24), Oxford Lecture Series in Mathematics and Its Applications, Oxford: Oxford University Press.
  23. SMOLENSKII, Y.A. (1962), “A Method for the Linear Recording of Graphs,” U.S.S.R. Computational Mathematics and Mathematical Physics, 2, 396–397.
  24. STEEL, M. (2005), “Phylogenetic Diversity and the Greedy Algorithm,” Systematic Biology, 54, 527–529.
  25. WARRENS, M.J. (2010), “n-Way Metrics,” Journal of Classification, 27, 173–190.
  26. ZARETSKY, K. (1965), “Reconstruction of a Tree from the Distances Between Its Pendant Vertices,” Uspekhi Matematicheskikh Nauk (Russian Mathematical Surveys), 20, 90–92.

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Sven Herrmann (1)
  • Katharina T. Huber (1)
  • Vincent Moulton (1)
  • Andreas Spillner (2)

  1. School of Computing Sciences, University of East Anglia, Norwich, UK
  2. Institut für Mathematik und Informatik, Universität Greifswald, Greifswald, Germany