On the Complexity of SNP Block Partitioning Under the Perfect Phylogeny Model

  • Jens Gramm
  • Tzvika Hartman
  • Till Nierhoff
  • Roded Sharan
  • Till Tantau
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4175)


Recent technologies for typing single nucleotide polymorphisms (SNPs) across a population are producing genome-wide genotype data for tens of thousands of SNP sites. The emergence of such large data sets underscores the importance of algorithms for large-scale haplotyping. Common haplotyping approaches first partition the SNPs into blocks of high linkage-disequilibrium, and then infer haplotypes for each block separately. We investigate an integrated haplotyping approach where a partition of the SNPs into a minimum number of non-contiguous subsets is sought, such that each subset can be haplotyped under the perfect phylogeny model. We show that finding an optimum partition is NP-hard even if we are guaranteed that two subsets suffice. On the positive side, we show that a variant of the problem, in which each subset is required to admit a perfect path phylogeny haplotyping, is solvable in polynomial time.
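As a rough illustration of the partitioning idea, consider the much simpler case of already phased, binary haplotype data, which is not the genotype setting studied in the paper. By the classical four-gamete condition, a set of binary SNP columns admits a perfect phylogeny exactly when no two of its columns exhibit all four combinations 00, 01, 10 and 11. In this simplified setting, partitioning the SNPs into subsets that each admit a perfect phylogeny amounts to coloring a conflict graph. The sketch below uses a greedy heuristic, which does not guarantee a minimum number of subsets; the function names `conflicts` and `greedy_partition` are hypothetical and this is not the authors' algorithm.

```python
def conflicts(matrix, i, j):
    """Return True if columns i and j of a binary haplotype matrix
    contain all four gametes 00, 01, 10 and 11 (four-gamete test)."""
    return len({(row[i], row[j]) for row in matrix}) == 4


def greedy_partition(matrix):
    """Greedily place each SNP column into the first subset in which it
    conflicts with no other column. Heuristic only: the number of subsets
    returned may exceed the minimum."""
    num_snps = len(matrix[0]) if matrix else 0
    subsets = []
    for col in range(num_snps):
        for subset in subsets:
            if not any(conflicts(matrix, col, other) for other in subset):
                subset.append(col)
                break
        else:
            # No existing subset is compatible, so open a new one.
            subsets.append([col])
    return subsets


# Rows are haplotypes, columns are SNP sites (illustrative data only).
haplotypes = [
    [0, 0, 0],
    [0, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
]
print(greedy_partition(haplotypes))  # [[0, 2], [1]]
```

Here SNP 1 conflicts with both SNP 0 and SNP 2, while SNPs 0 and 2 are compatible, so two subsets are needed. For unphased genotype data the compatibility test is more involved, and the abstract's hardness result applies.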


Polynomial Time, Chromatic Number, Haplotype Inference, Perfect Phylogeny, Computational Molecular Biology
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jens Gramm (1)
  • Tzvika Hartman (2)
  • Till Nierhoff (3)
  • Roded Sharan (4)
  • Till Tantau (5)

  1. Wilhelm-Schickard-Institut für Informatik, Universität Tübingen, Germany
  2. Dept. of Computer Science, Bar-Ilan University, Ramat-Gan, Israel
  3. International Computer Science Institute, Berkeley, USA
  4. School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
  5. Institut für Theoretische Informatik, Universität zu Lübeck, Germany
