
Computational complexity of inferring phylogenies from dissimilarity matrices

Bulletin of Mathematical Biology

Abstract

Molecular biologists strive to infer evolutionary relationships from quantitative macromolecular comparisons obtained by immunological, DNA hybridization, electrophoretic or amino acid sequencing techniques. The problem is to find unrooted phylogenies that best approximate a given dissimilarity matrix according to a goodness-of-fit measure, for example the least-squares-fit criterion or Farris's f statistic. Computational costs of known algorithms guaranteeing optimal solutions to these problems increase exponentially with problem size; practical computational considerations limit the algorithms to analyzing small problems. It is established here that problems of phylogenetic inference based on the least-squares-fit criterion and the f statistic are NP-complete and thus are so difficult computationally that efficient optimal algorithms are unlikely to exist for them.
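
As a rough illustration of the two goodness-of-fit measures named above, the following Python sketch scores one candidate unrooted tree against a small dissimilarity matrix. It assumes the least-squares-fit criterion is the sum of squared differences between each dissimilarity and the corresponding leaf-to-leaf path length in the tree, and that the f statistic is the sum of absolute differences; the abstract does not state either definition, so both should be read as assumptions. The function names, the four-taxon tree and the matrix values are hypothetical. The last function uses the standard count of unrooted binary tree shapes on n labelled leaves, (2n-5)!!, to suggest why exhaustive search is limited to small problems.

```python
from itertools import combinations


def path_lengths(edges, leaves):
    """Leaf-to-leaf path lengths in a weighted tree given as {(u, v): length}."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {}
    for src in leaves:
        # Depth-first walk from src; in a tree each node is reached by one path.
        seen = {src: 0.0}
        stack = [src]
        while stack:
            node = stack.pop()
            for nxt, w in adj[node]:
                if nxt not in seen:
                    seen[nxt] = seen[node] + w
                    stack.append(nxt)
        for dst in leaves:
            dist[(src, dst)] = seen[dst]
    return dist


def fit(dissim, edges, leaves):
    """Return (least-squares fit, f statistic) under the assumed definitions."""
    p = path_lengths(edges, leaves)
    pairs = list(combinations(leaves, 2))
    least_squares = sum((dissim[a, b] - p[a, b]) ** 2 for a, b in pairs)
    f_stat = sum(abs(dissim[a, b] - p[a, b]) for a, b in pairs)
    return least_squares, f_stat


def count_unrooted_binary_trees(n):
    """Number of unrooted binary trees on n >= 3 labelled leaves: (2n-5)!!."""
    count = 1
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count


# Hypothetical four-taxon example; x and y are internal nodes.
leaves = ["a", "b", "c", "d"]
tree = {("a", "x"): 1.0, ("b", "x"): 2.0, ("x", "y"): 1.0,
        ("c", "y"): 3.0, ("d", "y"): 1.0}
dissim = {("a", "b"): 3.0, ("a", "c"): 5.0, ("a", "d"): 4.0,
          ("b", "c"): 6.0, ("b", "d"): 4.0, ("c", "d"): 3.0}

scores = fit(dissim, tree, leaves)
n_trees_10 = count_unrooted_binary_trees(10)
```

Here the tree reproduces four of the six dissimilarities exactly and misses the other two by one unit each. An exact method would have to compare such scores across every tree shape, and optimize edge lengths for each one, which is where the cost grows so quickly with the number of taxa.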

Additional information

The Natural Sciences and Engineering Research Council of Canada partially supported this research through an individual operating grant (A4142) to W.H.E. Day.

About this article

Cite this article

Day, W.H.E. Computational complexity of inferring phylogenies from dissimilarity matrices. Bltn Mathcal Biology 49, 461–467 (1987). https://doi.org/10.1007/BF02458863

  • DOI: https://doi.org/10.1007/BF02458863
