Journal of Classification, Volume 2, Issue 1, pp 7–28

Optimal algorithms for comparing trees with labeled leaves

  • William H. E. Day


Let R_n denote the set of rooted trees with n leaves in which the leaves are labeled by the integers in {1, ..., n}, and among interior vertices only the root may have degree two. Associated with each interior vertex v in such a tree is the subset, or cluster, of leaf labels in the subtree rooted at v. The cluster {1, ..., n} is called trivial. Clusters are used in quantitative measures of similarity, dissimilarity and consensus among trees. For any k trees in R_n, the strict consensus tree C(T_1, ..., T_k) is that tree in R_n containing exactly those clusters common to every one of the k trees. Similarity between trees T_1 and T_2 in R_n is measured by the number S(T_1, T_2) of nontrivial clusters in both T_1 and T_2; dissimilarity, by the number D(T_1, T_2) of clusters in T_1 or T_2 but not in both. Algorithms are known to compute C(T_1, ..., T_k) in O(kn^2) time, and S(T_1, T_2) and D(T_1, T_2) in O(n^2) time. I propose a special representation of the clusters of any tree T ∈ R_n, one that permits testing in constant time whether a given cluster exists in T. I describe algorithms that exploit this representation to compute C(T_1, ..., T_k) in O(kn) time, and S(T_1, T_2) and D(T_1, T_2) in O(n) time. These algorithms are optimal in a technical sense. They enable well-known indices of consensus between two trees to be computed in O(n) time. All these results apply as well to comparable problems involving unrooted trees with labeled leaves.
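
The definitions of cluster, S, D and strict consensus above can be illustrated with a small sketch. It is not the linear-time representation proposed in the article: it simply stores each cluster as a Python frozenset and compares sets, which is closer in spirit to the older, slower approaches. The nested-tuple tree encoding and the function names (clusters, similarity, dissimilarity, strict_consensus_clusters) are assumptions made for this illustration only.

```python
def clusters(tree):
    """Return the clusters of a rooted tree as a set of frozensets.

    Assumed encoding (not from the article): a leaf is an int label,
    an interior vertex is a tuple of its children.
    """
    found = set()

    def visit(node):
        if isinstance(node, int):
            return frozenset([node])
        # The cluster of an interior vertex is the union of its children's leaves.
        leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def strict_consensus_clusters(*trees):
    """Clusters common to every tree, i.e. the clusters of C(T_1, ..., T_k)."""
    return set.intersection(*(clusters(t) for t in trees))


def similarity(t1, t2):
    """S(T_1, T_2): number of nontrivial clusters in both trees."""
    c1, c2 = clusters(t1), clusters(t2)
    trivial = max(c1, key=len)  # the root cluster {1, ..., n}
    return len((c1 & c2) - {trivial})


def dissimilarity(t1, t2):
    """D(T_1, T_2): number of clusters in one tree but not in both."""
    return len(clusters(t1) ^ clusters(t2))


# Two hypothetical trees on the leaf labels 1..5.
t1 = (((1, 2), 3), (4, 5))
t2 = ((1, 2), (3, (4, 5)))
print(similarity(t1, t2), dissimilarity(t1, t2))
```

For these two trees the shared nontrivial clusters are {1, 2} and {4, 5}, while {1, 2, 3} and {3, 4, 5} each occur in only one tree. Building every cluster explicitly costs more than linear time in general, which is the overhead the constant-time cluster test described in the abstract is meant to avoid.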


Algorithm complexity; Algorithm design; Comparing hierarchical classifications; Comparing phylogenetic trees; Consensus index; Strict consensus tree



Copyright information

© Springer-Verlag New York Inc 1985

Authors and Affiliations

  • William H. E. Day (1)
  1. Department of Computer Science, Memorial University of Newfoundland, St. John's, Canada
