# Constructing a Tree from Homeomorphic Subtrees, with Applications to Computational Evolutionary Biology

## Abstract.

We are given a set \(\cal T\) = {*T*_{ 1 }*,T*_{ 2 }*, . . .,T*_{ k } } of rooted binary trees, each *T*_{ i } leaf-labeled by a subset \( {\cal L}(T_i) \subset \{1,2, . . ., n\} \) . If *T* is a tree on {1,2, . . .,*n* }, we let \( T|{\cal L} \) denote the minimal subtree of *T* induced by the nodes of \( \cal L \) and all their ancestors. The *consensus**tree**problem* asks whether there exists a tree *T*^{ * } such that, for every *i* , \( T^* |{\cal L}(T_i) \) is homeomorphic to *T*_{ i } .

We present algorithms which test if a given set of trees has a consensus tree and if so, construct one. The deterministic algorithm takes time min{*O(N n*^{ 1/2 }*), O(N+ n*^{ 2 } log *n* )}, where \( N = \sum_i | T_i | \) , and uses linear space. The randomized algorithm takes time *O(N* log^{ 3 }* n)* and uses linear space. The previous best for this problem was a 1981 *O(Nn)* algorithm by Aho et al. Our faster deterministic algorithm uses a new efficient algorithm for the following interesting dynamic graph problem: Given a graph *G* with *n* nodes and *m* edges and a sequence of *b* batches of one or more edge deletions, then, after each batch, either find a new component that has just been created or determine that there is no such component. For this problem, we have a simple algorithm with running time *O(n*^{ 2 } log *n + b*_{ 0 } min{*n*^{ 2 }*, m* log *n* }), where *b*_{ 0 } is the number of batches which do not result in a new component. For our particular application, \( b_0 \leq1 \) . If all edges are deleted, then the best previously known deterministic algorithm requires time \( O(m \sqrt n) \) to solve this problem. We also present two applications of these consensus tree algorithms which solve other problems in computational evolutionary biology.

## Preview

Unable to display preview. Download preview PDF.