Computing Distances between Evolutionary Trees

  • Bhaskar DasGupta
  • Xin He
  • Tao Jiang
  • Ming Li
  • John Tromp
  • Lusheng Wang
  • Louxin Zhang


Comparing objects to find their similarities or, equivalently, their dissimilarities is a fundamental issue in many fields, including pattern recognition, image analysis, drug design, the study of thermodynamic costs of computing, and cognitive science. Various models have been introduced in the literature to measure the degree of similarity or dissimilarity; in the latter case, the degree of dissimilarity is often referred to as the distance. Some distances are straightforward to compute, e.g. the Hamming distance for binary strings and the Euclidean distance for geometric objects. Others are formulated as combinatorial optimization problems and thus pose nontrivial algorithmic challenges; some are even uncomputable, such as the universal information distance between two objects [4].
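As a small illustration of the "straightforward" end of this spectrum, the two distances named above can each be computed in a single pass. The sketch below is not taken from the chapter; the function names `hamming_distance` and `euclidean_distance` are illustrative choices, and inputs are assumed to have equal length.

```python
import math
from typing import Sequence


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def euclidean_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Straight-line distance between two points of the same dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


if __name__ == "__main__":
    print(hamming_distance("10110", "11100"))
    print(euclidean_distance((0.0, 0.0), (3.0, 4.0)))
```

Both functions run in time linear in the input size, in contrast to the tree distances surveyed in the chapter, which are posed as optimization problems over sequences of tree operations.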


Keywords: Leaf Node, Binary Tree, Evolutionary Tree, Internal Edge, Weighted Tree




References

  1. [1]
    D. Aldous, Triangulating the circle, at random, Amer. Math. Monthly, 89, pp. 223–234, 1994.
  2. [2]
    M.A. Armstrong, Groups and Symmetry, Springer Verlag, New York Inc., 1988.
  3. [3]
    D. Barry and J.A. Hartigan, Statistical analysis of hominoid molecular evolution, Stat. Sci., 2, pp. 191–210, 1987.
  4. [4]
    C.H. Bennett, P. Gács, M. Li, P. Vitányi, and W. Zurek, Information Distance, to appear in IEEE Trans. Inform. Theory.
  5. [5]
    R. P. Boland, E. K. Brown and W. H. E. Day, Approximating minimum-length-sequence metrics: a cautionary note, Math. Soc. Sci., 4, pp. 261–270, 1983.
  6. [6]
    K. Culik II and D. Wood, A note on some tree similarity measures, Inform. Proc. Let., 15, pp. 39–42, 1982.
  7. [7]
    B. DasGupta, X. He, T. Jiang, M. Li, J. Tromp and L. Zhang, On distances between phylogenetic trees, Proc. 8th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 427–436, 1997.
  8. [8]
    B. DasGupta, X. He, T. Jiang, M. Li, and J. Tromp, On the linear-cost subtree-transfer distance, Algorithmica, submitted, 1997.
  9. [9]
    B. DasGupta, X. He, T. Jiang, M. Li, J. Tromp, and L. Zhang, On computing the nearest neighbor interchange distance, Preprint, 1997.
  10. [10]
    W. H. E. Day, Properties of the nearest neighbor interchange metric for trees of small size, Journal of Theoretical Biology, 101, pp. 275–288, 1983.
  11. [11]
    A. K. Dewdney, Wagner’s theorem for torus graphs, Discrete Math., 4, pp. 139–149, 1973.
  12. [12]
    A.W.F. Edwards and L.L. Cavalli-Sforza, The reconstruction of evolution, Ann. Hum. Genet., 27, 105, 1964. (Also in Heredity 18, 553.)
  13. [13]
    J. Felsenstein, Evolutionary trees for DNA sequences: a maximum likelihood approach, J. Mol. Evol., 17, pp. 368–376, 1981.
  14. [14]
    J. Felsenstein, personal communication, 1996.
  15. [15]
    W.M. Fitch, Toward defining the course of evolution: minimum change for a specified tree topology, Syst. Zool., 20, pp. 406–416, 1971.
  16. [16]
    W.M. Fitch and E. Margoliash, Construction of phylogenetic trees, Science, 155, pp. 279–284, 1967.
  17. [17]
    M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, 1979.
  18. [18]
    L. Guibas and J. Hershberger, Morphing simple polygons, Proceeding of the ACM 10th Annual Sym. of Comput. Geometry, pp. 267–276, 1994.
  19. [19]
    J. Hein, Reconstructing evolution of sequences subject to recombination using parsimony, Math. Biosci., 98, pp. 185–200, 1990.
  20. [20]
    J. Hein, A heuristic method to reconstruct the history of sequences subject to recombination, J. Mol. Evol., 36, pp. 396–405, 1993.
  21. [21]
    J. Hein, personal email communication, 1996.
  22. [22]
    J. Hein, T. Jiang, L. Wang, and K. Zhang, On the complexity of comparing evolutionary trees, Discrete Applied Mathematics, 71, pp. 153–169, 1996.
  23. [23]
    J. Hershberger and S. Suri, Morphing binary trees, Proceeding of the ACM-SIAM 6th Annual Symposium of Discrete Algorithms, pp. 396–404, 1995.
  24. [24]
    F. Hurtado, M. Noy, and J. Urrutia, Flipping edges in triangulations, Proc. of the ACM 12th Annual Sym. of Comput. Geometry, pp. 214–223, 1996.
  25. [25]
    J. P. Jarvis, J. K. Luedeman and D. R. Shier, Counterexamples in measuring the distance between binary trees, Mathematical Social Sciences, 4, pp. 271–274, 1983.
  26. [26]
    J. P. Jarvis, J. K. Luedeman and D. R. Shier, Comments on computing the similarity of binary trees, Journal of Theoretical Biology, 100, pp. 427–433, 1983.
  27. [27]
    J. Kececioglu and D. Gusfield, Reconstructing a history of recombinations from a set of sequences, Proc. 5th Annual ACM-SIAM Symp. Discrete Algorithms, 1994.
  28. [28]
    M. Kuhner and J. Felsenstein, A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates, Mol. Biol. Evol., 11 (3), pp. 459–468, 1994.
  29. [29]
    M. Křivánek, Computing the nearest neighbor interchange metric for unlabeled binary trees is NP-complete, Journal of Classification, 3, pp. 55–60, 1986.
  30. [30]
    V. King and T. Warnow, On Measuring the nni distance between two evolutionary trees, DIMACS mini workshop on combinatorial structures in molecular biology, Rutgers University, Nov 4, 1994.
  31. [31]
    S. Khuller, Open Problems: 10, SIGACT News, 24 (4), p. 46, Dec 1994.
  32. [32]
    W.J. Le Quesne, The uniquely evolved character concept and its cladistic application, Syst. Zool., 23, pp. 513–517, 1974.
  33. [33]
    M. Li, J. Tromp, and L.X. Zhang, On the nearest neighbor interchange distance between evolutionary trees, Journal of Theoretical Biology, 182, pp. 463–467, 1996.
  34. [34]
    M. Li and L. Zhang, Better Approximation of Diagonal-Flip Transformation and Rotation Transformation, Manuscript, 1997.
  35. [35]
    G. W. Moore, M. Goodman and J. Barnabas, An iterative approach from the standpoint of the additive hypothesis to the dendrogram problem posed by molecular data sets, Journal of Theoretical Biology, 38, pp. 423–457, 1973.
  36. [36]
    J. Pallo, On rotation distance in the lattice of binary trees, Infor. Proc. Letters, 25, pp. 369–373, 1987.
  37. [37]
    D. F. Robinson, Comparison of labeled trees with valency three, Journal of Combinatorial Theory, Series B, 11, pp. 105–119, 1971.
  38. [38]
    N. Saitou and M. Nei, The neighbor-joining method: a new method for reconstructing phylogenetic trees, Mol. Biol. Evol., 4, pp. 406–425, 1987.
  39. [39]
    D. Sankoff, Minimal mutation trees of sequences, SIAM J. Appl. Math., 28, pp. 35–42, 1975.
  40. [40]
    D. Sankoff and J. Kruskal (Eds), Time Warps, String Edits, and Macromolecules: the Theory and Practice of Sequence Comparison, Addison Wesley, Reading Mass., 1983.
  41. [41]
    D. Sleator, R. Tarjan, W. Thurston, Rotation distance, triangulations, and hyperbolic geometry, J. Amer. Math. Soc., 1, pp. 647–681, 1988.
  42. [42]
    D. Sleator, R. Tarjan, W. Thurston, Short encodings of evolving structures, SIAM J. Discr. Math., 5, pp. 428–450, 1992.
  43. [43]
    K.C. Tai, The tree-to-tree correction problem, J. ACM, 26, pp. 422–433, 1979.
  44. [44]
    A. von Haeseler and G.A. Churchill, Network models for sequence evolution, J. Mol. Evol., 37, pp. 77–85, 1993.
  45. [45]
    K. Wagner, Bemerkungen zum Vierfarbenproblem, J. Deutschen Math.-Verein., 46, pp. 26–32, 1936.
  46. [46]
    M. S. Waterman, Introduction to computational biology: maps, sequences and genomes, Chapman & Hall, 1995.
  47. [47]
    M. S. Waterman and T. F. Smith, On the similarity of dendrograms, Journal of Theoretical Biology, 73, pp. 789–800, 1978.
  48. [48]
    K. Zhang and D. Shasha, Simple fast algorithms for the editing distance between trees and related problems, SIAM J. Comput., 18, pp. 1245–1262, 1989.
  49. [49]
    K. Zhang, J. Wang and D. Shasha, On the editing distance between undirected acyclic graphs, International J. of Foundations of Computer Science, 7 (13), March 1996.

Copyright information

© Kluwer Academic Publishers 1998

Authors and Affiliations

  • Bhaskar DasGupta (1)
  • Xin He (2)
  • Tao Jiang (3)
  • Ming Li (4)
  • John Tromp (5)
  • Lusheng Wang (6)
  • Louxin Zhang (7)

  1. Rutgers University, USA
  2. SUNY at Buffalo, USA
  3. McMaster University, Canada
  4. City University of Hong Kong and University of Waterloo, China
  5. CWI, USA
  6. City University of Hong Kong, China
  7. National University of Singapore, Singapore
