Efficient Inference of Haplotypes from Genotypes on a Pedigree with Mutations and Missing Alleles (Extended Abstract)

  • Wei-Bung Wang
  • Tao Jiang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5577)


Driven by the international HapMap project, haplotype inference has become an important topic in the computational biology community. In this paper, we study how to efficiently infer haplotypes from the genotypes of related individuals as given by a pedigree. We assume that the input pedigree data may contain de novo mutations and missing alleles but is free of genotyping errors and recombinants, which is usually true for tightly linked markers. We formulate the task as a combinatorial optimization problem, called the minimum mutation haplotype configuration (MMHC) problem, in which we seek haplotypes that are consistent with the given genotypes, incur no recombinants, and require the minimum number of mutations. This extends the well-studied zero-recombinant haplotype configuration (ZRHC) problem. Although ZRHC is polynomial-time solvable, MMHC is NP-hard. We construct an integer linear program (ILP) for MMHC using the system of linear equations over the field F(2) that was developed recently to solve ZRHC. Since the number of constraints in the ILP is large (exponentially large in the general case), we present an incremental approach for solving the ILP in which we gradually add constraints to a standard ILP solver until a feasible haplotype configuration is found. Our preliminary experiments on simulated data demonstrate that the method is very efficient on large pedigrees, infers haplotypes very accurately, and recovers most of the mutations and missing alleles correctly.
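The incremental idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not spell out the MMHC constraints, so toy covering constraints ("at least one variable in this set is 1") stand in for them, a brute-force search (`solve_min_ones`, a hypothetical helper) stands in for a standard ILP solver, and the 0/1 variables loosely play the role of mutation indicators whose sum is minimized. Only the control flow, which solves with the constraints added so far, looks for a violated constraint, adds it, and repeats, reflects the approach named above.

```python
from itertools import product


def solve_min_ones(n_vars, constraints):
    """Stand-in for an ILP solver (assumption: brute force, tiny inputs only).

    Returns a 0/1 tuple with the fewest ones such that every constraint
    (a tuple of variable indices) contains at least one variable set to 1.
    """
    best = None
    for x in product((0, 1), repeat=n_vars):
        if all(sum(x[i] for i in c) >= 1 for c in constraints):
            if best is None or sum(x) < sum(best):
                best = x
    return best


def solve_incrementally(n_vars, all_constraints):
    """Add violated constraints one at a time until the solution is feasible.

    The solver only ever sees the 'active' subset. Once its optimum violates
    nothing in the full set, that optimum is also optimal for the full
    problem, because the active subset is a relaxation of it.
    """
    active = []
    while True:
        x = solve_min_ones(n_vars, active)
        violated = [c for c in all_constraints if sum(x[i] for i in c) < 1]
        if not violated:
            return x, active
        active.append(violated[0])  # add a single violated constraint per round


if __name__ == "__main__":
    toy_constraints = [(0, 1), (1, 2), (2, 3), (0, 1, 2)]
    solution, used = solve_incrementally(4, toy_constraints)
    print("solution:", solution)
    print("constraints actually given to the solver:", used)
```

In this toy instance the loop can stop before every constraint has been handed to the solver, which is the motivation for adding constraints gradually when the full constraint set is very large.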


Keywords (added by machine, not by the authors): Incremental Approach; Pedigree Data; Consistency Constraint; Path Constraint; International HapMap Project





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Wei-Bung Wang (1)
  • Tao Jiang (1)
  1. Department of Computer Science, University of California, Riverside, USA
