Bulletin of Mathematical Biology, Volume 49, Issue 4, pp 461–467

Computational complexity of inferring phylogenies from dissimilarity matrices

  • William H. E. Day


Molecular biologists strive to infer evolutionary relationships from quantitative macromolecular comparisons obtained by immunological, DNA hybridization, electrophoretic or amino acid sequencing techniques. The problem is to find unrooted phylogenies that best approximate a given dissimilarity matrix according to a goodness-of-fit measure, for example the least-squares-fit criterion or Farris's f statistic. Computational costs of known algorithms guaranteeing optimal solutions to these problems increase exponentially with problem size; practical computational considerations limit the algorithms to analyzing small problems. It is established here that problems of phylogenetic inference based on the least-squares-fit criterion and the f statistic are NP-complete and thus are so difficult computationally that efficient optimal algorithms are unlikely to exist for them.
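To make the two goodness-of-fit measures concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's own formulation: the least-squares fit is taken to be the sum of squared differences between each dissimilarity and the corresponding path length in the tree, and the f statistic is assumed to be the sum of absolute differences. The function names, the four-taxon tree, and the dissimilarity values are all hypothetical. The last function counts unrooted binary tree shapes on n labelled leaves, (2n-5)!!, which hints at why exhaustive search is feasible only for small problems.

```python
from itertools import combinations


def path_lengths(edges, leaves):
    """Path length between every pair of leaves in a weighted tree.

    edges  -- list of (u, v, weight) triples describing the tree
    leaves -- labels of the evolutionary units (leaf vertices)
    """
    adjacency = {}
    for u, v, weight in edges:
        adjacency.setdefault(u, []).append((v, weight))
        adjacency.setdefault(v, []).append((u, weight))

    def distances_from(source):
        # Depth-first traversal; in a tree each vertex is reached once.
        dist = {source: 0.0}
        stack = [source]
        while stack:
            node = stack.pop()
            for neighbour, weight in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + weight
                    stack.append(neighbour)
        return dist

    return {(a, b): distances_from(a)[b] for a, b in combinations(leaves, 2)}


def least_squares_fit(d, p):
    """Sum of squared differences (assumed form of the criterion)."""
    return sum((d[pair] - p[pair]) ** 2 for pair in d)


def f_statistic(d, p):
    """Sum of absolute differences (assumed form of the f statistic)."""
    return sum(abs(d[pair] - p[pair]) for pair in d)


def count_unrooted_binary_trees(n):
    """Number of unrooted binary trees on n >= 3 labelled leaves: (2n-5)!!."""
    count = 1
    for k in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n-5
        count *= k
    return count


# Hypothetical four-taxon example with interior vertices "x" and "y".
leaves = ["A", "B", "C", "D"]
tree_edges = [("A", "x", 1), ("B", "x", 2), ("x", "y", 1),
              ("C", "y", 3), ("D", "y", 1)]
dissimilarity = {("A", "B"): 3, ("A", "C"): 5, ("A", "D"): 5,
                 ("B", "C"): 6, ("B", "D"): 4, ("C", "D"): 3}

tree_distances = path_lengths(tree_edges, leaves)
print(least_squares_fit(dissimilarity, tree_distances))
print(f_statistic(dissimilarity, tree_distances))
print(count_unrooted_binary_trees(10))
```

Evaluating a single candidate tree is cheap; the difficulty the abstract describes lies in searching over tree shapes and edge lengths for the best fit.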







Copyright information

© Society for Mathematical Biology 1987

Authors and Affiliations

  • William H. E. Day, Department of Computer Science, Memorial University of Newfoundland, St. John's, Canada
