A Fundamental Decomposition Theory for Phylogenetic Networks and Incompatible Characters
Phylogenetic networks are models of evolution that go beyond trees, allowing biological operations that are not consistent with tree-like evolution. One of the most important of these biological operations is recombination between two sequences (homologous chromosomes). The algorithmic problem of reconstructing a history of recombinations, or determining the minimum number of recombinations needed, has been studied in a number of papers [10, 11, 12, 23, 24, 25, 16, 13, 14, 6, 9, 8, 18, 19, 15, 1]. In [9, 6, 10, 8, 1] we introduced and used “conflict graphs” and “incompatibility graphs” to compute lower bounds on the minimum number of recombinations needed, and to efficiently solve constrained cases of the minimization problem. In those results, the non-trivial connected components of the graphs were the key features used.
In this paper we more fully develop the structural importance of non-trivial connected components of the incompatibility graph, to establish a fundamental decomposition theorem about phylogenetic networks. The result applies to phylogenetic networks where cycles reflect biological phenomena other than recombination, such as recurrent mutation and lateral gene transfer. The proof leads to an efficient O(nm^2) time algorithm to find the underlying maximal tree structure defined by the decomposition, for any set of n sequences of length m each. An implementation of that algorithm is available. We also report on progress towards resolving the major open problem in this area.
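For readers unfamiliar with the incompatibility graph, the following is a minimal sketch of how it and its non-trivial connected components might be computed. It assumes binary sequences given as equal-length strings of '0' and '1', and it uses the standard four-gamete test to decide whether two sites are incompatible. The function names are hypothetical, and this is only the graph-building step, not the decomposition algorithm of the paper.

```python
from itertools import combinations

def incompatible(seqs, i, j):
    # Four-gamete test (assumes binary characters): sites i and j are
    # incompatible when all four pairs 00, 01, 10, 11 occur in the data.
    return len({(s[i], s[j]) for s in seqs}) == 4

def incompatibility_graph(seqs):
    # One node per site; an edge joins every incompatible pair of sites.
    m = len(seqs[0])
    adj = {i: set() for i in range(m)}
    for i, j in combinations(range(m), 2):
        if incompatible(seqs, i, j):
            adj[i].add(j)
            adj[j].add(i)
    return adj

def nontrivial_components(adj):
    # Depth-first search; keep only components with more than one site.
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            v = stack.pop()
            comp.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        if len(comp) > 1:
            comps.append(sorted(comp))
    return comps

# Hypothetical data: sites 0 and 1 show all four gametes; sites 2 and 3 do not.
seqs = ["0000", "0100", "1000", "1100", "0001"]
components = nontrivial_components(incompatibility_graph(seqs))
```

Testing every pair of the m sites against n sequences takes O(nm^2) time in this sketch, which happens to match the order of the bound stated above, though the paper's algorithm does more than build the graph.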
Keywords: Molecular Evolution, Phylogenetic Networks, Perfect Phylogeny, Ancestral Recombination Graph, Recombination, Gene-Conversion, SNP
- 3. Felsenstein, J.: Inferring Phylogenies. Sinauer, Sunderland (2004)
- 6. Gusfield, D.: Optimal, efficient reconstruction of Root-Unknown phylogenetic networks with constrained recombination. Technical report, Department of Computer Science, University of California, Davis, CA (2004)
- 7. Gusfield, D.: On the decomposition optimality conjecture for phylogenetic networks. Technical report, UC Davis, Department of Computer Science (2005)
- 10. Gusfield, D., Hickerson, D.: A new lower bound on the number of needed recombination nodes in both unrooted and rooted phylogenetic networks. Report UCD-ECS-06. Technical report, University of California, Davis (2004)
- 13. Hudson, R., Kaplan, N.: Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111, 147–164 (1985)
- 15. Moret, B., Nakhleh, L., Warnow, T., Linder, C.R., Tholse, A., Padolina, A., Sun, J., Timme, R.: Phylogenetic networks: Modeling, reconstructibility, and accuracy. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 13–23 (2004)
- 16. Myers, S.R., Griffiths, R.C.: Bounds on the minimum number of recombination events in a sample history. Genetics 163, 375–394 (2003)
- 17. Myers, S.: The detection of recombination events using DNA sequence data. PhD thesis, University of Oxford, Oxford, England, Department of Statistics (2003)
- 18. Nakhleh, L., Sun, J., Warnow, T., Linder, C.R., Moret, B.M.E., Tholse, A.: Towards the development of computational tools for evaluating phylogenetic network reconstruction methods. In: Proc. of 8th Pacific Symposium on Biocomputing (PSB 2003), pp. 315–326 (2003)
- 19. Nakhleh, L., Warnow, T., Linder, C.R.: Reconstructing reticulate evolution in species - theory and practice. In: Proc. of 8th Annual International Conference on Computational Molecular Biology, pp. 337–346 (2004)
- 22. Song, Y.: Personal Communication