Journal of Computer Science and Technology, Volume 18, Issue 6, pp 675–688

The Haplotyping problem: An overview of computational models and solutions

  • Paola Bonizzoni
  • Gianluca Della Vedova
  • Riccardo Dondi
  • Jing Li


The investigation of genetic differences among humans has given evidence that mutations in DNA sequences are responsible for some genetic diseases. The most common mutation is the one that involves only a single nucleotide of the DNA sequence, which is called a single nucleotide polymorphism (SNP). As a consequence, computing a complete map of all SNPs occurring in human populations is one of the primary goals of recent studies in human genomics. The construction of such a map requires determining the DNA sequences that form all chromosomes. In diploid organisms like humans, each chromosome consists of two sequences called haplotypes. Distinguishing the information contained in both haplotypes when analyzing chromosome sequences poses several new computational issues, which collectively form an emerging topic of Computational Biology known as Haplotyping.

This paper is a comprehensive study of some new combinatorial approaches proposed in this research area and it mainly focuses on the formulations and algorithmic solutions of some basic biological problems. Three statistical approaches are briefly discussed at the end of the paper.
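
As a small illustration of why separating the two haplotypes is a combinatorial problem, the sketch below enumerates every haplotype pair that could explain one observed genotype. The encoding is an assumption made for this example and is not taken from the paper: each SNP site is 0 or 1 when both haplotypes carry the same allele, and 2 when they differ. The function name `consistent_haplotype_pairs` is likewise hypothetical.

```python
from itertools import product


def consistent_haplotype_pairs(genotype):
    """Return the set of unordered haplotype pairs compatible with a genotype.

    Assumed encoding (illustrative, not from the paper):
      0 or 1 -> both haplotypes carry that allele at the site
      2      -> the two haplotypes carry different alleles at the site
    """
    # Sites where the two haplotypes differ are the only ambiguous ones.
    het_sites = [i for i, g in enumerate(genotype) if g == 2]
    pairs = set()
    # Try every assignment of alleles to the first haplotype at those sites;
    # the second haplotype must carry the complementary allele.
    for bits in product((0, 1), repeat=len(het_sites)):
        h1 = list(genotype)
        h2 = list(genotype)
        for site, b in zip(het_sites, bits):
            h1[site] = b
            h2[site] = 1 - b
        # frozenset makes the pair unordered, so (h1, h2) and (h2, h1) merge.
        pairs.add(frozenset((tuple(h1), tuple(h2))))
    return pairs


if __name__ == "__main__":
    for pair in consistent_haplotype_pairs((0, 2, 1, 2)):
        print(sorted(pair))
```

Under this encoding, a genotype with k ambiguous sites (k ≥ 1) admits 2^(k-1) distinct haplotype pairs, so the data alone cannot identify the true pair. Additional criteria are needed to choose among the candidates.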


bioinformatics, combinatorial algorithms, haplotypes





Copyright information

© Science Press, Beijing China and Allerton Press Inc. 2003

Authors and Affiliations

  • Paola Bonizzoni (1), corresponding author
  • Gianluca Della Vedova (2)
  • Riccardo Dondi (1)
  • Jing Li (3)

  1. DISCo, University of Milano-Bicocca, Milano, Italy
  2. Dip. Statistica, University of Milano-Bicocca, Milano, Italy
  3. Department of Computer Science, University of California at Riverside, Riverside, USA
