Abstract
We study a character-based phylogeny reconstruction problem when the given data are incomplete. More specifically, we consider the directed perfect phylogeny setting with binary characters, in which the states of some characters are missing for some species. Our main objective is to give an efficient algorithm to enumerate (or list) all perfect phylogenies that can be obtained when the missing entries are completed. While a simple branch-and-bound algorithm (B&B) shows theoretically good performance, we propose another approach based on a zero-suppressed binary decision diagram (ZDD). Experimental results on randomly generated data show that the ZDD approach outperforms B&B. We also prove that counting the number of phylogenetic trees consistent with the given data is #P-complete, which provides evidence that efficient random sampling is likely to be hard.
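To make the problem setting concrete, the following is a minimal brute-force sketch, not the B&B or ZDD algorithm of the paper. It assumes the input is a matrix whose rows are species and whose columns are binary characters, with `None` marking a missing state. It relies on the standard characterization that a complete binary matrix admits a directed perfect phylogeny exactly when, for every pair of characters, the sets of species having state 1 are either disjoint or nested. The function names are hypothetical and chosen for illustration only. Note that it lists completions of the matrix rather than trees, and it takes time exponential in the number of missing entries.

```python
from itertools import product


def admits_directed_perfect_phylogeny(matrix):
    """Check the pairwise disjoint-or-nested condition on a complete 0/1 matrix.

    Rows are species, columns are binary characters.
    """
    n_chars = len(matrix[0]) if matrix else 0
    # For each character, collect the set of species whose state is 1.
    one_sets = [
        frozenset(i for i, row in enumerate(matrix) if row[j] == 1)
        for j in range(n_chars)
    ]
    for a in range(n_chars):
        for b in range(a + 1, n_chars):
            s, t = one_sets[a], one_sets[b]
            # A conflict: the sets overlap but neither contains the other.
            if s & t and not (s <= t or t <= s):
                return False
    return True


def enumerate_completions(matrix):
    """Yield every 0/1 completion of the None entries that passes the check.

    Illustrative brute force only: tries all 2^k fillings of k missing entries.
    """
    holes = [
        (i, j)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value is None
    ]
    for bits in product((0, 1), repeat=len(holes)):
        filled = [list(row) for row in matrix]
        for (i, j), bit in zip(holes, bits):
            filled[i][j] = bit
        if admits_directed_perfect_phylogeny(filled):
            yield tuple(tuple(row) for row in filled)


if __name__ == "__main__":
    # Three species, two characters, one missing state (hypothetical data).
    data = [
        [1, None],
        [1, 1],
        [0, 1],
    ]
    for completion in enumerate_completions(data):
        print(completion)
```

In the small example, filling the missing entry with 0 makes the two characters overlap on the second species without nesting, so only the completion with 1 survives. The exponential cost of this naive loop is what motivates more careful enumeration methods such as those studied in the paper.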
Partially supported by Grant-in-Aid for Scientific Research from Ministry of Education, Science and Culture, Japan, and Japan Society for the Promotion of Science, and by Exploratory Research for Advanced Technology (ERATO) from Japan Science and Technology Agency.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kiyomi, M., Okamoto, Y., Saitoh, T. (2012). Efficient Enumeration of the Directed Binary Perfect Phylogenies from Incomplete Data. In: Klasing, R. (eds) Experimental Algorithms. SEA 2012. Lecture Notes in Computer Science, vol 7276. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30850-5_22
DOI: https://doi.org/10.1007/978-3-642-30850-5_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30849-9
Online ISBN: 978-3-642-30850-5
eBook Packages: Computer Science, Computer Science (R0)