On the Generality of Phylogenies from Incomplete Directed Characters
We study a problem that arises in computational biology when reconstructing the phylogeny of a set of species. In Incomplete Directed Perfect Phylogeny (IDP), the characters are binary and directed (i.e., species can only gain characters), and the states of some characters are unknown. The goal is to complete the missing states in a way that is consistent with a perfect phylogenetic tree. This problem arises in classical phylogenetic studies, when some states are missing or undetermined, and in recent phylogenetic studies based on repeat elements in DNA. The problem was recently shown to be solvable in polynomial time. Because different completions induce different trees, it is desirable to find a general solution tree: a solution that is consistent with the data and from which every other consistent solution can be obtained by node splitting. Unlike the situation for complete datasets, a general solution may not exist for an IDP instance. We provide a polynomial-time algorithm that finds a general solution for an IDP instance, or determines that none exists.
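To make the notion of a consistent completion concrete, the following Python sketch may help. It relies on the standard characterization that a complete binary matrix admits a directed perfect phylogeny exactly when the species sets of its characters are pairwise disjoint or nested. The function names `is_directed_perfect` and `completions`, and the use of `None` for an unknown state, are assumptions made for this illustration. The enumeration is brute force and exponential in the number of unknown states, so it is not the polynomial algorithm referred to above.

```python
from itertools import product


def is_directed_perfect(matrix):
    """Return True if a complete 0/1 matrix (rows = species, columns =
    characters) admits a directed perfect phylogeny."""
    n_chars = len(matrix[0]) if matrix else 0
    # Species set of each character: the rows in which its state is 1.
    sets = [{i for i, row in enumerate(matrix) if row[j] == 1}
            for j in range(n_chars)]
    for a in range(n_chars):
        for b in range(a + 1, n_chars):
            common = sets[a] & sets[b]
            # Two characters conflict if their species sets overlap
            # without one containing the other.
            if common and common != sets[a] and common != sets[b]:
                return False
    return True


def completions(matrix):
    """Yield every consistent completion of a matrix whose unknown
    states are marked None (hypothetical encoding, brute force)."""
    holes = [(i, j) for i, row in enumerate(matrix)
             for j, value in enumerate(row) if value is None]
    for values in product((0, 1), repeat=len(holes)):
        filled = [list(row) for row in matrix]
        for (i, j), value in zip(holes, values):
            filled[i][j] = value
        if is_directed_perfect(filled):
            yield filled
```

For example, `list(completions([[1, None], [1, 0], [0, 1]]))` contains a single matrix, since setting the unknown state to 1 would make the two characters overlap without nesting. An instance with several consistent completions yields several matrices, and these may correspond to different trees, which is the motivation for seeking a general solution.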
Keywords: General Solution; Binary Matrix; Recursive Call; Node Splitting; Character Species