Abstract
We consider the problem of reconstructing near-perfect phylogenetic trees using binary character states (referred to as BNPP). A perfect phylogeny assumes that every character mutates at most once in the evolutionary tree, yielding an algorithm for binary character states that is computationally efficient but not robust to imperfections in real data. A near-perfect phylogeny relaxes the perfect phylogeny assumption by allowing at most a constant number q of additional mutations. In this paper, we develop an algorithm for constructing optimal phylogenies and provide empirical evidence of its performance. The algorithm runs in time O((72κ)^q nm + nm^2), where n is the number of taxa, m is the number of characters, and κ is the number of characters that share four gametes with some other character. This is fixed-parameter tractable when q and κ are constants and significantly improves on the previous asymptotic bounds by reducing the exponent to q. Furthermore, the complexity of the previous work makes it impractical, and in fact no known implementation of it exists. We implement our algorithm and demonstrate it on a selection of real data sets, showing that it substantially outperforms its worst-case bounds and yields far superior results to a commonly used heuristic method in at least one case. Our results therefore describe the first practical phylogenetic tree reconstruction algorithm that finds guaranteed optimal solutions while being easily implemented and computationally feasible for data sets of biologically meaningful size and complexity.
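The parameter κ in the running-time bound can be made concrete with a small sketch. The code below is not the reconstruction algorithm of the paper; it only illustrates the four-gamete condition on a binary taxa-by-character matrix and counts the characters that share all four gametes (0,0), (0,1), (1,0), (1,1) with some other character. The function names and the toy matrix are hypothetical, chosen purely for illustration.

```python
from itertools import combinations

# All four possible state pairs for two binary characters.
ALL_FOUR = {(0, 0), (0, 1), (1, 0), (1, 1)}


def gametes(matrix, i, j):
    """Return the set of (state_i, state_j) pairs observed across all taxa."""
    return {(row[i], row[j]) for row in matrix}


def conflicting_characters(matrix):
    """Return indices of characters that share four gametes with another character.

    `matrix` is a list of rows (taxa), each a list of 0/1 character states.
    """
    if not matrix:
        return set()
    m = len(matrix[0])
    conflicting = set()
    for i, j in combinations(range(m), 2):
        if gametes(matrix, i, j) == ALL_FOUR:
            conflicting.update((i, j))
    return conflicting


# Hypothetical toy input: 5 taxa (rows) by 3 characters (columns).
taxa = [
    [0, 0, 0],
    [0, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]

kappa = len(conflicting_characters(taxa))
print(kappa)
```

In this toy matrix only the first two characters exhibit all four gametes together, so they are the ones counted toward κ. An input with no such pair has κ = 0, which is the case a perfect phylogeny requires for binary characters.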
Supported in part by NSF grant CCR-0105548 and ITR grant CCR-0122581 (The ALADDIN project).
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sridhar, S., Dhamdhere, K., Blelloch, G.E., Halperin, E., Ravi, R., Schwartz, R. (2006). Simple Reconstruction of Binary Near-Perfect Phylogenetic Trees. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds) Computational Science – ICCS 2006. ICCS 2006. Lecture Notes in Computer Science, vol 3992. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11758525_107
DOI: https://doi.org/10.1007/11758525_107
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34381-3
Online ISBN: 978-3-540-34382-0
eBook Packages: Computer Science, Computer Science (R0)