A Faster Haplotyping Algorithm Based on Block Partition, and Greedy Ligation Strategy

  • Xiaohui Yao
  • Yun Xu
  • Jiaoyun Yang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6840)


Haplotypes have played a very important role in the study of disease genes and drug response in recent years. However, obtaining haplotypes experimentally is both time-consuming and very costly. Haplotype inference, which deduces haplotypes from genotypes by computational methods, was therefore proposed. Several genetic models have been presented to solve the haplotype inference problem, and the Maximum Parsimony model is one of them. At present, however, the methods based on this principle are either simple greedy heuristics or exact methods, which are adequate only for moderate-size instances. In this paper, we present a faster greedy algorithm named FHBPGL that applies a partition and ligation strategy. Theoretical analysis shows that this strategy can reduce the running time for large-scale datasets, and experiments demonstrate that our algorithm achieves accuracy comparable to that of exact haplotyping algorithms in less time.
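To make the problem statement concrete, the following is a minimal Python sketch of haplotype inference under a parsimony criterion. It is not the FHBPGL algorithm, whose details are not given on this page. The genotype encoding (0 and 1 for homozygous sites, 2 for a heterozygous site) and the function names `resolutions` and `greedy_parsimony` are assumptions made for illustration only. The sketch enumerates the haplotype pairs that resolve a genotype and then applies a simple greedy rule: for each genotype, pick the pair that adds the fewest new haplotypes.

```python
from itertools import product


def resolutions(genotype):
    """Yield every unordered haplotype pair that resolves `genotype`.

    Assumed encoding: 0 and 1 are homozygous sites, 2 is heterozygous.
    """
    het = [i for i, g in enumerate(genotype) if g == 2]
    if not het:
        h = tuple(genotype)
        yield (h, h)
        return
    # Fix the first heterozygous site to 0 in h1 so that each
    # unordered pair is produced exactly once.
    for bits in product((0, 1), repeat=len(het) - 1):
        h1 = list(genotype)
        h2 = list(genotype)
        for site, b in zip(het, (0,) + bits):
            h1[site] = b
            h2[site] = 1 - b
        yield (tuple(h1), tuple(h2))


def greedy_parsimony(genotypes):
    """Greedy heuristic: resolve each genotype with the pair that
    introduces the fewest haplotypes not already chosen."""
    chosen = set()
    phasing = {}
    # Handle genotypes with fewer heterozygous sites first, since
    # they have fewer possible resolutions.
    for g in sorted(genotypes, key=lambda g: g.count(2)):
        best = min(resolutions(g), key=lambda pair: len(set(pair) - chosen))
        chosen.update(best)
        phasing[g] = best
    return chosen, phasing


genotypes = [(0, 0, 1), (2, 0, 1), (2, 2, 1)]
haplotypes, phasing = greedy_parsimony(genotypes)
```

A genotype with k heterozygous sites has 2^(k-1) candidate resolutions, so enumeration over whole genotypes grows exponentially. Working on short blocks of sites and joining the partial results is one way to limit that growth, which is the general idea behind partition and ligation strategies.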


Maximum Parsimony; Heterozygous Locus; Haplotype Inference; Block Partition; Perfect Phylogeny
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Xiaohui Yao (1, 2)
  • Yun Xu (1, 2)
  • Jiaoyun Yang (1, 2)
  1. Department of Computer Science, University of Science and Technology of China, Hefei, China
  2. Anhui Province-MOST Co-Key Laboratory of High Performance Computing and Its Application, University of Science and Technology of China, Hefei, China
