
Constructing Camin-Sokal Phylogenies Via Answer Set Programming

  • Jonathan Kavanagh
  • David Mitchell
  • Eugenia Ternovska
  • Ján Maňuch
  • Xiaohong Zhao
  • Arvind Gupta
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4246)

Abstract

Constructing parsimonious phylogenetic trees from species data is a central problem in phylogenetics and has diverse applications, even outside biology. Many variations of the problem, including the cladistic Camin-Sokal (CCS) version, are NP-complete. We present Answer Set Programming (ASP) models for the binary CCS problem, as well as for a simpler perfect phylogeny version, along with experimental results from applying the models to biological data. Our contribution is threefold. First, we solve phylogeny problems that have not previously been tackled by ASP. Second, we report on variants of our CCS model that significantly affect run time, including the interesting case of making the program “slightly tighter”. This version exhibits some of the best performance, in contrast with a tight version of the model, which performed poorly. Third, we find proven-optimal solutions for larger instances of the CCS problem than the widely used, branch-and-bound-based PHYLIP package does.
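To make the "simpler perfect phylogeny version" concrete, here is a minimal Python sketch of the textbook test for the binary, rooted case. It is not the ASP model from the paper. It assumes the usual Camin-Sokal convention that the root has state 0 for every character and that only 0 → 1 changes are allowed. Under that assumption, a binary matrix fits a tree in which every character arises at most once exactly when the sets of species carrying each character are pairwise disjoint or nested. The function name and the example matrices are hypothetical.

```python
from itertools import combinations


def admits_directed_perfect_phylogeny(matrix):
    """Check whether a binary species-by-character matrix fits a rooted tree
    with an all-zero root in which each character changes 0 -> 1 at most once.

    Rows are species, columns are characters (assumed layout).
    """
    if not matrix:
        return True
    n_chars = len(matrix[0])
    # For each character, collect the species (row indices) that have state 1.
    carriers = [
        {i for i, row in enumerate(matrix) if row[c] == 1}
        for c in range(n_chars)
    ]
    # The carrier sets must be pairwise disjoint or nested (a laminar family).
    for a, b in combinations(carriers, 2):
        if a & b and not (a <= b or b <= a):
            return False
    return True


# Hypothetical data: character 1 is nested inside character 0, character 2 is disjoint.
compatible = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]

# Hypothetical data: the two characters overlap on species 2 without nesting.
conflicting = [
    [1, 0],
    [0, 1],
    [1, 1],
]

print(admits_directed_perfect_phylogeny(compatible))
print(admits_directed_perfect_phylogeny(conflicting))
```

When the test fails, as for the second matrix, no tree lets every character arise only once, so under the irreversibility assumption at least one character must arise more than once. Minimising the total number of such 0 → 1 changes is the harder optimisation that the CCS problem poses.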

Keywords

phylogeny, maximum parsimony, Camin-Sokal, answer set programming

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jonathan Kavanagh (1)
  • David Mitchell (1)
  • Eugenia Ternovska (1)
  • Ján Maňuch (1)
  • Xiaohong Zhao (1)
  • Arvind Gupta (1)

  1. Simon Fraser University, Burnaby, Canada
