A New Linear-Time Heuristic Algorithm for Computing the Parsimony Score of Phylogenetic Networks: Theoretical Bounds and Empirical Performance

  • Guohua Jin
  • Luay Nakhleh
  • Sagi Snir
  • Tamir Tuller
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4463)


Phylogenies play a major role in representing the interrelationships among biological entities. Many methods for reconstructing and studying such phylogenies have been proposed, almost all of which assume that the underlying history of a given set of species can be represented by a binary tree. Although many biological processes can be effectively modeled and summarized in this fashion, others cannot: recombination, hybrid speciation, and horizontal gene transfer result in networks, rather than trees, of relationships.

In a series of papers, we have extended the maximum parsimony (MP) criterion to phylogenetic networks, demonstrated its appropriateness, and established the intractability of the problem of scoring the parsimony of a phylogenetic network. In this work we show the hardness of approximation for the general case of the problem, devise a very fast (linear-time) heuristic algorithm for it, and implement it on simulated as well as biological data.


Keywords (added by machine, not by the authors): Approximation Algorithm; Maximum Parsimony; Horizontal Gene Transfer; Exact Algorithm; Vertex Cover





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Guohua Jin (1)
  • Luay Nakhleh (1)
  • Sagi Snir (2)
  • Tamir Tuller (3)

  1. Department of Computer Science, Rice University, Houston, TX 77005, USA
  2. Department of Mathematics, University of California, Berkeley, CA 94720, USA
  3. School of Computer Science, Tel Aviv University, Tel Aviv, Israel
