COCOON 2003: Computing and Combinatorics, pp 5-19
Empirical Exploration of Perfect Phylogeny Haplotyping and Haplotypers
Abstract
The next high-priority phase of human genomics will involve the development of a full Haplotype Map of the human genome [15]. It will be used in large-scale screens of populations to associate specific haplotypes with specific complex, genetically influenced diseases. A key, and perhaps bottleneck, problem is to computationally determine haplotype pairs from genotype data. An approach to this problem based on viewing it in the context of perfect phylogeny was introduced in [14], along with an efficient solution. A variation of that method, slower in the worst case, was implemented in [3]. Two simpler methods for the perfect phylogeny approach, also slower in the worst case than the first algorithm, were later developed [1, 7]. We have implemented and tested all three of these approaches in order to compare and explain their practical efficiencies. We discuss two other empirical observations: a strong phase transition in the frequency of obtaining a unique solution as a function of the number of individuals in the input; and results of using the method to find non-overlapping intervals where the haplotyping solution is highly reliable, as a function of the level of recombination in the data. Finally, we discuss the biological basis for the size of these tests.
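To make the problem statement concrete, the following is a minimal sketch of the two checks that underlie perfect phylogeny haplotyping. It is not any of the three implementations compared in the paper. It assumes a common encoding that the abstract does not specify: haplotypes are 0/1 tuples, and a genotype site is 0 or 1 when homozygous and 2 when heterozygous. The function names `explains` and `has_perfect_phylogeny` are hypothetical, and the second uses the standard four-gamete condition for binary sequences with no fixed ancestral state.

```python
from itertools import combinations


def explains(h1, h2, genotype):
    """Return True if the haplotype pair (h1, h2) is consistent with genotype.

    Assumed encoding: genotype site 0 or 1 means both haplotypes carry that
    value; genotype site 2 means the two haplotypes differ at that site.
    """
    for a, b, g in zip(h1, h2, genotype):
        if g == 2:
            if a == b:
                return False
        elif not (a == g and b == g):
            return False
    return True


def has_perfect_phylogeny(haplotypes):
    """Four-gamete test on a collection of 0/1 haplotypes.

    The haplotypes fit a perfect phylogeny (ancestral state not fixed) only
    if no pair of sites shows all four combinations 00, 01, 10 and 11.
    """
    haplotypes = list(haplotypes)
    if not haplotypes:
        return True
    num_sites = len(haplotypes[0])
    for i, j in combinations(range(num_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True


if __name__ == "__main__":
    genotype = (2, 2)
    # Both pairs below explain the genotype, which is why haplotyping is ambiguous.
    print(explains((0, 0), (1, 1), genotype))
    print(explains((0, 1), (1, 0), genotype))
    # Using all four haplotypes together violates the four-gamete condition.
    print(has_perfect_phylogeny([(0, 0), (1, 1), (0, 1), (1, 0)]))
```

A genotype with k heterozygous sites has 2^(k-1) explaining pairs, so a brute-force search over pairs does not scale; the algorithms cited in the abstract avoid that enumeration.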
Keywords
High Linkage Disequilibrium; Maximal Interval; Empirical Exploration; Haplotype Inference; Haplotype Pair
References
- 1. V. Bafna, D. Gusfield, G. Lancia, and S. Yooseph. Haplotyping as perfect phylogeny: A direct approach. Technical report, UC Davis, Department of Computer Science. July 17, 2002.
- 2. R. E. Bixby and D. K. Wagner. An almost linear-time algorithm for graph realization. Mathematics of Operations Research, 13:99–123, 1988.
- 3. R.H. Chung and D. Gusfield. Perfect phylogeny haplotyper: Haplotype inferral using a tree model. Bioinformatics, 19(6):780–781, 2003.
- 4. A. Clark. Inference of haplotypes from PCR-amplified samples of diploid populations. Mol. Biol. Evol., 7:111–122, 1990.
- 5. A. Clark, K. Weiss, D. Nickerson, et al. Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase. Am. J. Human Genetics, 63:595–612, 1998.
- 6. M. Daly, J. Rioux, S. Schaffner, T. Hudson, and E. Lander. High-resolution haplotype structure in the human genome. Nature Genetics, 29:229–232, 2001.
- 7. E. Eskin, E. Halperin, and R. Karp. Efficient reconstruction of haplotype structure via perfect phylogeny. Technical report, UC Berkeley, Computer Science Division (EECS), August, 2002.
- 8. M. Fullerton, A. Clark, Charles Sing, et al. Apolipoprotein E variation at the sequence haplotype level: implications for the origin and maintenance of a major human polymorphism. Am. J. of Human Genetics, pages 881–900, 2000.
- 9. S. Cleary and K. St. John. Analysis of Haplotype Inference Data Requirements. Preprint, 2003.
- 10. F. Gavril and R. Tamari. An algorithm for constructing edge-trees from hypergraphs. Networks, 13:377–388, 1983.
- 11. D. Gusfield. Efficient algorithms for inferring evolutionary history. Networks, 21:19–28, 1991.
- 12. D. Gusfield. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997.
- 13. D. Gusfield. Inference of haplotypes from samples of diploid populations: complexity and algorithms. Journal of Computational Biology, 8(3), 2001.
- 14. D. Gusfield. Haplotyping as Perfect Phylogeny: Conceptual Framework and Efficient Solutions (Extended Abstract). In Proceedings of RECOMB 2002: The Sixth Annual International Conference on Computational Biology, pages 166–175, 2002.
- 15. L. Helmuth. Genome research: Map of the human genome 3.0. Science, 293(5530):583–585, 2001.
- 16. R. Hudson. Gene genealogies and the coalescent process. Oxford Survey of Evolutionary Biology, 7:1–44, 1990.
- 17. R. Hudson. Generating samples under the Wright-Fisher neutral model of genetic variation. Bioinformatics, 18(2):337–338, 2002.
- 18. C. Langley. U.C. Davis Dept. of Evolution and Ecology. Personal Communication, 2003.
- 19. J.Z. Lin, A. Brown, and M. T. Clegg. Heterogeneous geographic patterns of nucleotide sequence diversity between two alcohol dehydrogenase genes in wild barley (Hordeum vulgare subspecies spontaneum). PNAS, 98:531–536, 2001.
- 20. S. Lin, D. Cutler, M. Zwick, and A. Chakravarti. Haplotype inference in random population samples. Am. J. of Hum. Genet., 71:1129–1137, 2003.
- 21. T. Niu, Z. Qin, X. Xu, and J.S. Liu. Bayesian haplotype inference for multiple linked single-nucleotide polymorphisms. Am. J. Hum. Genet., 70:157–169, 2002.
- 22. S. Orzack, D. Gusfield, and V. Stanton. The absolute and relative accuracy of haplotype inferral methods and a consensus approach to haplotype inferral. Abstract Nr 115 in Am. Society of Human Genetics, Supplement 2001.
- 23. M. Stephens, N. Smith, and P. Donnelly. A new statistical method for haplotype reconstruction from population data. Am. J. Human Genetics, 68:978–989, 2001.
- 24. S. Tavare. Calibrating the clock: Using stochastic processes to measure the rate of evolution. In E. Lander and M. Waterman, editors, Calculating the Secrets of Life. National Academy Press, 1995.
- 25. W.T. Tutte. An algorithm for determining whether a given binary matroid is graphic. Proc. of Amer. Math. Soc., 11:905–917, 1960.
- 26. C. Wade, M. Daly, et al. The mosaic structure of variation in the laboratory mouse genome. Nature, 420:574–578, 2002.
- 27. Shibu Yooseph. Personal Communication, 2003.