Abstract
The perfect phylogeny model for haplotype evolution has been successfully applied to haplotype resolution from genotype data. In this study we explore the application of the perfect phylogeny model to other problems in the design and analysis of genetic studies. We consider a novel type of data, xor-genotypes, which distinguish heterozygote from homozygote sites but do not identify the homozygote alleles. We show how to resolve xor-genotypes under the perfect phylogeny model, and study the degrees of freedom in such resolutions. Interestingly, given xor-genotypes that produce a single possible resolution, we show that the full genotypes of at most three individuals suffice to determine all haplotypes across the phylogeny. Our experiments with xor-genotyping data indicate that the approach requires only slightly more individuals than full genotyping does, at a potentially reduced typing cost.
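To make the xor-genotype notion concrete, here is a minimal Python sketch. It assumes haplotypes are encoded as 0/1 lists, one entry per SNP; that encoding and the function names are illustrative assumptions, not taken from the paper. The second function is the standard four-gamete test for whether a set of binary haplotypes fits an (unrooted) perfect phylogeny. It is not the resolution algorithm the paper describes.

```python
from itertools import combinations

def xor_genotype(h1, h2):
    """Xor-genotype of a haplotype pair: 1 marks a heterozygous site,
    0 marks a homozygous site (the shared allele itself is not recorded)."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must cover the same SNPs")
    return [a ^ b for a, b in zip(h1, h2)]

def admits_perfect_phylogeny(haplotypes):
    """Four-gamete test: binary haplotypes fit an unrooted perfect phylogeny
    iff no pair of SNPs shows all four combinations 00, 01, 10, 11."""
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True

# Two different individuals with the same xor-genotype: the second pair is
# the first with SNPs 0 and 2 flipped, so homozygous alleles stay hidden.
pair_a = ([0, 1, 1, 0], [0, 0, 1, 1])
pair_b = ([1, 1, 0, 0], [1, 0, 0, 1])
print(xor_genotype(*pair_a))  # [0, 1, 0, 1]
print(xor_genotype(*pair_b))  # [0, 1, 0, 1]
```

The two pairs print the same vector, which illustrates why xor-genotypes alone leave degrees of freedom in the resolution and why a few full genotypes may be needed to pin down the haplotypes.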
We also consider selection of minimum-cost sets of tag SNPs, i.e., polymorphisms whose alleles suffice to recover the haplotype diversity. We show that this problem lends itself to a divide-and-conquer, linear-time solution. Finally, we study genotype tags, i.e., genotype calls that suffice to recover the alleles of all other SNPs. Since most genetic studies are genotype-based, such tags are more relevant in these studies than haplotype tags. We show that under the perfect phylogeny model a SNP subset of haplotype tags, as it is usually defined, tags the haplotypes by genotype calls as well.
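The tag SNP definition above can be illustrated with a small Python sketch. It assumes 0/1 haplotype lists and a per-SNP cost list, and it finds a minimum-cost tag set by exhaustive search. This brute-force search is exponential in the number of SNPs and is only meant to show the definition; it is not the divide-and-conquer linear-time method the paper refers to. All names are hypothetical.

```python
from itertools import combinations

def is_tag_set(haplotypes, sites):
    """True if the chosen SNPs alone tell all distinct haplotypes apart."""
    distinct = {tuple(h) for h in haplotypes}
    projected = {tuple(h[s] for s in sites) for h in distinct}
    return len(projected) == len(distinct)

def min_cost_tags(haplotypes, costs):
    """Exhaustive search over SNP subsets (illustration only).
    Returns (total_cost, sites) for a cheapest tag set."""
    n = len(costs)
    best = None
    for k in range(n + 1):
        for sites in combinations(range(n), k):
            if is_tag_set(haplotypes, sites):
                cost = sum(costs[s] for s in sites)
                if best is None or cost < best[0]:
                    best = (cost, sites)
    return best

# Four haplotypes over four SNPs; SNP 3 is a copy of SNP 1 but cheaper to type.
haplotypes = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 1],
]
costs = [1, 5, 1, 2]
print(min_cost_tags(haplotypes, costs))  # (4, (0, 2, 3))
```

In this toy input SNPs 1 and 3 carry the same information, so the search keeps the cheaper one and drops the other.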
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Barzuza, T., Beckmann, J.S., Shamir, R., Pe’er, I. (2004). Computational Problems in Perfect Phylogeny Haplotyping: Xor-Genotypes and Tag SNPs. In: Sahinalp, S.C., Muthukrishnan, S., Dogrusoz, U. (eds) Combinatorial Pattern Matching. CPM 2004. Lecture Notes in Computer Science, vol 3109. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27801-6_2
DOI: https://doi.org/10.1007/978-3-540-27801-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22341-2
Online ISBN: 978-3-540-27801-6
eBook Packages: Springer Book Archive