
A Linear-Time Algorithm for the Perfect Phylogeny Haplotyping (PPH) Problem

  • Zhihong Ding
  • Vladimir Filkov
  • Dan Gusfield
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3500)

Abstract

Since the introduction of the Perfect Phylogeny Haplotyping (PPH) Problem at RECOMB 2002 [15], the problem of finding a linear-time (deterministic, worst-case) solution for it has remained open, despite broad interest in the PPH problem and a series of papers on various aspects of it. In this paper we solve the open problem, giving a practical, deterministic linear-time algorithm based on a simple data structure and simple operations on it. The method is straightforward to program and has been fully implemented. Simulations show that it is much faster in practice than prior methods. The value of a linear-time solution to the PPH problem is partly conceptual and partly practical: it can be used in the inner loop of algorithms for more complex problems, where the PPH problem must be solved repeatedly.

Keywords

Column Number; Maximal Path; Tree Edge; Haplotype Inference; Class Root
These keywords were added by machine and not by the authors.

References

  1. Bafna, V., Gusfield, D., Hannenhalli, S., Yooseph, S.: A note on efficient computation of haplotypes via perfect phylogeny. J. Computational Biology 11(5), 858–866 (2004)
  2. Bafna, V., Gusfield, D., Lancia, G., Yooseph, S.: Haplotyping as perfect phylogeny: A direct approach. J. Computational Biology 10, 323–340 (2003)
  3. Barzuza, T., Beckmann, J.S., Shamir, R., Pe’er, I.: Computational Problems in Perfect Phylogeny Haplotyping: Xor-Genotypes and Tag SNPs. In: Proc. of CPM 2004 (2004)
  4. Bixby, R.E., Wagner, D.K.: An almost linear-time algorithm for graph realization. Mathematics of Operations Research 13, 99–123 (1988)
  5. Bonizzoni, P., Vedova, G.D., Dondi, R., Li, J.: The haplotyping problem: Models and solutions. J. Computer Science and Technology 18, 675–688 (2003)
  6. Chung, R.H., Gusfield, D.: Perfect phylogeny haplotyper: Haplotype inferral using a tree model. Bioinformatics 19(6), 780–781 (2003)
  7. Chung, R.H., Gusfield, D.: Empirical Exploration of Perfect Phylogeny Haplotyping and Haplotypers. In: Warnow, T.J., Zhu, B. (eds.) COCOON 2003. LNCS, vol. 2697, pp. 5–9. Springer, Heidelberg (2003)
  8. Damaschke, P.: Fast perfect phylogeny haplotype inference. In: Lingas, A., Nilsson, B.J. (eds.) FCT 2003. LNCS, vol. 2751, pp. 183–194. Springer, Heidelberg (2003)
  9. Damaschke, P.: Incremental haplotype inference, phylogeny and almost bipartite graphs. In: 2nd RECOMB Satellite Workshop on Computational Methods for SNPs and Haplotypes, pre-proceedings, pp. 1–11 (2004)
  10. Eskin, E., Halperin, E., Karp, R.M.: Efficient Reconstruction of Haplotype Structure via Perfect Phylogeny. J. Bioinformatics and Computational Biology 1(1), 1–20 (2003)
  11. Eskin, E., Halperin, E., Sharan, R.: Optimally Phasing Long Genomic Regions using Local Haplotype Predictions. In: Proc. of the Second RECOMB Satellite Workshop on Computational Methods for SNPs and Haplotypes, Pittsburgh, USA, February 20–21 (2004)
  12. Gramm, J., Nierhoff, T., Tantau, T., Sharan, R.: On the Complexity of Haplotyping Via Perfect Phylogeny. Presented at the Second RECOMB Satellite Workshop on Computational Methods for SNPs and Haplotypes, Pittsburgh, USA, February 20–21. Proceedings to appear in LNBI. Springer, Heidelberg (2004)
  13. Gramm, J., Nierhoff, T., Tantau, T.: Perfect Path Phylogeny Haplotyping with Missing Data is Fixed-Parameter Tractable. In: Downey, R.G., Fellows, M.R., Dehne, F. (eds.) IWPEC 2004. LNCS, vol. 3162, pp. 174–186. Springer, Heidelberg (2004)
  14.
  15. Gusfield, D.: Haplotyping as perfect phylogeny: Conceptual framework and efficient solutions (extended abstract). In: Proc. of RECOMB 2002, pp. 166–175 (2002)
  16. Gusfield, D.: An overview of combinatorial methods for haplotype inference. In: Istrail, S., Waterman, M.S., Clark, A. (eds.) DIMACS/RECOMB Satellite Workshop 2002. LNCS (LNBI), vol. 2983, pp. 9–25. Springer, Heidelberg (2004)
  17. Halldórsson, B.V., Bafna, V., Edwards, N., Lippert, R., Yooseph, S., Istrail, S.: A survey of computational methods for determining haplotypes. In: Istrail, S., Waterman, M.S., Clark, A. (eds.) DIMACS/RECOMB Satellite Workshop 2002. LNCS (LNBI), vol. 2983, pp. 26–47. Springer, Heidelberg (2004)
  18. Halldórsson, B.V., Bafna, V., Edwards, N., Lippert, R., Yooseph, S., Istrail, S.: Combinatorial problems arising in SNP and haplotype analysis. In: Calude, C.S., Dinneen, M.J., Vajnovszki, V. (eds.) DMTCS 2003. LNCS, vol. 2731. Springer, Heidelberg (2003)
  19. Halperin, E., Eskin, E.: Haplotype reconstruction from genotype data using Imperfect Phylogeny. Bioinformatics 20, 1842–1849 (2004)
  20. Halperin, E., Karp, R.M.: Perfect Phylogeny and Haplotype Assignment. In: Proc. of RECOMB 2004, pp. 10–19 (2004)
  21. Helmuth, L.: Genome research: Map of the human genome 3.0. Science 293(5530), 583–585 (2001)
  22. Hudson, R.: Gene genealogies and the coalescent process. Oxford Survey of Evolutionary Biology 7, 1–44 (1990)
  23. Hudson, R.: Generating samples under the Wright-Fisher neutral model of genetic variation. Bioinformatics 18(2), 337–338 (2002)
  24. Kimmel, G., Shamir, R.: The Incomplete Perfect Phylogeny Haplotype Problem. Presented at the Second RECOMB Satellite Workshop on Computational Methods for SNPs and Haplotypes, Pittsburgh, USA, February 20–21 (2004). To appear in J. Bioinformatics and Computational Biology
  25. Tavaré, S.: Calibrating the clock: Using stochastic processes to measure the rate of evolution. In: Lander, E., Waterman, M. (eds.) Calculating the Secrets of Life. National Academy Press, Washington (1995)
  26. Wiuf, C.: Inference on Recombination and Block Structure Using Unphased Data. Genetics 166(1), 537–545 (2004)

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Zhihong Ding (1)
  • Vladimir Filkov (1)
  • Dan Gusfield (1)
  1. Department of Computer Science, University of California, Davis
