Journal of Classification

, Volume 9, Issue 1, pp 91–116 | Cite as

The complexity of reconstructing trees from qualitative characters and subtrees

  • Michael Steel


In taxonomy and other branches of classification it is useful to know when tree-like classifications on overlapping sets of labels can be consistently combined into a parent tree. This paper considers the computation complexity of this problem. Recognizing when a consistent parent tree exists is shown to be intractable (NP-complete) for sets of unrooted trees, even when each tree in the set classifies just four labels. Consequently determining the compatibility of qualitative characters and partial binary characters is, in general, also NP-complete. However for sets of rooted trees an algorithm is described which constructs the “strict consensus tree” of all consistent parent trees (when they exist) in polynomial time. The related question of recognizing when a set of subtrees uniquely defines a parent tree is also considered, and a simple necessary and sufficient condition is described for rooted trees.


Trees Qualitative characters Compatibility Resolved quartets Clusters Strict consensus tree 


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. AHO, A. V., SAVIG, Y., SZYMANSKI, T. G., and ULLMAN, J.D. (1981), “Inferring a Tree from the Lowest Common Ancestors with an Application to the Optimization of Relational Expressions”,SIAM Journal on Computing, 10(3), 405–421.zbMATHCrossRefMathSciNetGoogle Scholar
  2. BANDELT, H-J., and DRESS, A. (1986), “Reconstructing the Shape of a Tree from Observed Dissimilarity Data”,Advances in Applied Mathematics, 7, 309–343.zbMATHCrossRefMathSciNetGoogle Scholar
  3. BANDELT, H-J., VON HAESELER, A., BOLICK, J., and SCHÜTTE, H. (1990), “A Comparative Study of Sequence Dissimilarities and Evolutionary Distances Derived from Sets of Aligned RNA Sequences”, preprint.Google Scholar
  4. BONDY, J. A., and MURTY, U. S. R. (1976),Graph Theory with Applications, London: Macmillan.Google Scholar
  5. BROSSEIER, G. (1990), “Piecewise Heirarchical Clustering”,Journal of Classification, 7, 197–216.CrossRefMathSciNetGoogle Scholar
  6. BUNEMAN, P. (1971), “The Recovery of Trees from Measures of Dissimilarily”, inMathematics in the Archaeological and Historical Sciences, Eds., F. R. Hodson, D. G. Kendall, and P. Tautu, Edinburgh: Edinburgh University Press, 387–395.Google Scholar
  7. BUNEMAN, P. (1974), “A Characterization of Rigid Circuit Graphs”,Discrete Mathematics, 9, 205–212.zbMATHCrossRefMathSciNetGoogle Scholar
  8. CARTER, M., HENDY, M. D., PENNY, D., SZÉKELY, L. A., and WORMALD, N. C. (1990), “On the Distribution of Lengths of Evolutionary Trees”,SIAM Journal on Discrete Mathematics, 3, 38–47.zbMATHCrossRefMathSciNetGoogle Scholar
  9. CAVALLI-SFORZA, L. L., and EDWARDS, A. W. F. (1967), “Phylogenetic Analysis: Models and Estimation Procedures”,Evolution, 21, 550–570.CrossRefGoogle Scholar
  10. CAVENDER, J. A. and FELSENSTEIN, J. (1987), “Invariants of Phylogenies: Simple Cases with Discrete States”,Journal of Classification, 4, 57–71.zbMATHCrossRefGoogle Scholar
  11. COLONIUS, H., and SCHULZE, H. H. (1981), “Tree Structures for Proximity Data”,British Journal of Mathematical and Statistical Psychology, 34, 167–180.zbMATHMathSciNetGoogle Scholar
  12. CONSTANTINESCU, M., and SANKOFF, D. (1986), “Tree Enumeration Modulo a Consensus”,Journal of Classification, 3, 349–356.zbMATHCrossRefMathSciNetGoogle Scholar
  13. DAY, W. H. E. (1985), “Optimal Algorithms for Comparing Trees with Labeled Leaves”,Journal of Classification, 2, 7–28.zbMATHCrossRefMathSciNetGoogle Scholar
  14. DAY, W. H. E. (1985), “Analysis of Quartet Dissimilarity Measures between Undirected Phylogenetic Trees”,Systematic Zoology, 35(3), 325–333.CrossRefGoogle Scholar
  15. DAY, W. H. E., and SANKOFF, D. (1986), “Computational Complexity of Inferring Phylogenies by Compatibility”,Systematic Zoology, 35(2), 224–229.CrossRefGoogle Scholar
  16. DEKKER, M. C. H. (1986),Reconstruction Methods for Derivation Trees, Masters thesis, Vrije Universitiet, Amsterdam.Google Scholar
  17. DRESS, A., VON HAESELER, A., and KRUEGER, M. (1986), “Reconstructing Phylogenetic Trees Using Variants of the ‘Four-Point-Condition’,”Studien zur Klassifikation, 17, 299–305.Google Scholar
  18. DRESS, A., and STEEL, M. A. (1991) “Convex Tree Realizations of Partitions,”Applied Mathematics Letters (in press).Google Scholar
  19. DROLET, S., and SANKOFF, D. (1990), “Quadratic Tree Invariants for Multivalued Characters,”Journal of Theoretical Biology, 144, 117–129.CrossRefMathSciNetGoogle Scholar
  20. ERDÖS, P., and SZÉKELY, L. A. (1989) “Applications of Antilexicographic Order 1. An Enumerative Theory of Trees,”Advances in Applied Mathematics, 10, 488–496.zbMATHCrossRefMathSciNetGoogle Scholar
  21. ESTABROOK, G. F., JOHNSON, C. S., Jr., and MCMORRIS, F. R. (1976), “An Algebraic Analysis of Cladistic Characters,”Discrete Mathematics, 16, 141–147.zbMATHCrossRefMathSciNetGoogle Scholar
  22. ESTABROOK, G. F., and MCMORRIS, F. R. (1977), “When are Two Taxonomic Characters Compatible?,”Journal of Mathematical Biology, 4, 195–299.zbMATHMathSciNetGoogle Scholar
  23. ESTABROOK, G. F., and MEACHAM, C. A. (1979), “How to Determine the Compatibility of Undirected Character State Trees,”Mathematical Biosciences, 46, 251–256.CrossRefMathSciNetGoogle Scholar
  24. ESTABROOK, G. F., McMORRIS, F. R., and MEACHAM C. A. (1985), “Comparison of Undirected Phylogenetic Trees based on Subtrees of Four Evolutionary Units,”Systematic Zoology, 34(2), 193–200.CrossRefGoogle Scholar
  25. FELSENSTEIN, J. (1988), “Phylogenies from Molecular Sequences: Inference and Reliability,”Annual Review of Genetics, 22, 521–565.CrossRefGoogle Scholar
  26. FITCH, W. M. (1975), “Towards Finding the Tree of Maximum Parsimony,” inThe Eighth International Conference on Numerical Taxonomy, Ed., G. F. Estabrook, San Francisco: W.H. Freeman, 189–230.Google Scholar
  27. FOULDS, L. R., and GRAHAM, R. L. (1982), “The Steiner Problem in Phylogeny is NP-complete,”Advances in Applied Mathematics, 3, 43–49.zbMATHCrossRefMathSciNetGoogle Scholar
  28. GAREY, M. R., and JOHNSON, D.S., (1979),Computers and Intractibility, New Jersey: Bell Telephone Laboratories Ltd.Google Scholar
  29. GAVRIL, F. (1974), “The Intersection Graphs of Subtrees in Trees are Exactly the Chordal Graphs,”Journal of Combinatorial Theory (B), 16, 47–56.zbMATHCrossRefMathSciNetGoogle Scholar
  30. GOLUMBIC, M. C. (1980),Algorithmic Graph Theory and Perfect Graphs, New York: Academic Press, 92.zbMATHGoogle Scholar
  31. GORDON, A. D. (1986), “Consensus Supertrees: The Synthesis of Rooted Trees Containing Overlapping Sets of Labeled Leaves,”Journal of Classification, 3, 335–348.zbMATHCrossRefMathSciNetGoogle Scholar
  32. GUSFIELD, D. (1991), “Efficient Algorithms for Inferring Evolutionary Trees,”Networks, 21, 19–28.zbMATHMathSciNetGoogle Scholar
  33. KANNAN, S., and WARNOW, T. (1990) “Inferring Evolutionary Trees from DNA Sequences,” in31st Annual Symposium on Foundations of Computer Science (Proceedings), Los Alamitos, California: IEEE Computer Society Press, 362–371.CrossRefGoogle Scholar
  34. HALL, M. H. Jr. (1967),Combinatorial Theory, Waltham: Blaisdell, 175.zbMATHGoogle Scholar
  35. HENDY, M. D. (1989), “The Relationship Between Simple Evolutionary Tree Models and Observable Sequence Data,”Systematic Zoology, 38, 310–321.CrossRefGoogle Scholar
  36. KANT-ANTONESCU, M., and SANKOFF, D. (1991), “Efficient Construction of Supertrees,” manuscript.Google Scholar
  37. MCMORRIS, F. R. (1975), “Compatibility Criteria for Cladistic and Qualitative Taxonomic Characters,” inThe Eighth International Conference on Numerical Taxonomy, Ed., E. A. Estabrook, San Fransisco: W.H. Freeman, 399–415.Google Scholar
  38. MCMORRIS, F. R. (1977), “On the Compatibility of Binary Qualitative Taxonomic Characters,”Bulletin of Mathematical Biology, 39, 133–138.zbMATHMathSciNetGoogle Scholar
  39. MCMORRIS, F. R., WARNOW, T., and WIMER, T. (1991), “Chordal Completion of Coloured Graphs for a Fixed Number of Colours”, manuscript.Google Scholar
  40. MARGUSH, T., and MCMORRIS, F. R. (1981), “Consensus n-trees,”Bulletin of Mathematical Biology, 43, 239–244.zbMATHMathSciNetGoogle Scholar
  41. MEACHAM, C. A. (1981), “A Manual Method for Constructing Trees and Hierarchical Classifications,”Journal of Molecular Evolution, 18, 30–37.CrossRefGoogle Scholar
  42. MEACHAM, C. A. (1983), “Theoretical and Computational Considerations of the Compatibility of Qualitative Taxonomic Characters,” inNumerical Taxonomy, Ed., J. Felsenstein, NATO ASI Series Vol. G1, Berlin Heidelberg: Springer-Verlag, 304–314.Google Scholar
  43. MEACHAM, C. A., and DUNCAN, T. (1987), “The Necessity of Convex Groups in Biological Classification,”Systematic Botany, 12, 78–90.CrossRefGoogle Scholar
  44. SOKAL, R. R., and ROHLF, F. J. (1981), “Taxonomic Congruence in the Leptopodomorpha Re-examined,”Systematic Zoology, 30, 309–325.CrossRefGoogle Scholar
  45. WALTER, J. R. (1972),Representations of Rigid Cycle Graphs, Ph.D. thesis, Wayne State University, Detroit, Michigan.Google Scholar
  46. WARNOW, T. (1991),Combinatorial Algorithms for Constructing Phylogenetic Trees, PhD thesis, University of California-Berkeley.Google Scholar
  47. WATERMAN, M. S., and SMITH, T. S. (1978), “On the Similarity of Dendograms,”Journal of Theoretical Biology, 73, 789–800.CrossRefMathSciNetGoogle Scholar
  48. YANNAKAKIS, M. (1981), “Computing the Minimum Fill-in is NP-complete,”SIAM Journal on Algebraic and Discrete Methods, 2, 77–79.zbMATHMathSciNetCrossRefGoogle Scholar

Copyright information

© Springer-Verlag 1992

Authors and Affiliations

  • Michael Steel
    • 1
  1. 1.Z.I.F. der Universität BielefeldBielefeld 1Germany

Personalised recommendations