Haplotype Inference on Pedigrees with Recombinations and Mutations

  • Yuri Pirola
  • Paola Bonizzoni
  • Tao Jiang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6293)

Abstract

Haplotype Inference (HI) is a computational challenge of crucial importance in a range of genetic studies, such as functional genomics, pharmacogenetics and population genetics. Pedigrees have been shown to be a valuable source of data that allows us to infer haplotypes from genotypes more accurately than population data, since Mendelian inheritance restricts the set of possible solutions. In order to overcome the limitations of classic statistical haplotyping methods, a combinatorial formulation of the HI problem on pedigrees has been proposed in the literature, called the Minimum-Recombinant Haplotype Configuration (MRHC) problem, that allows a single type of genetic variation event, namely recombinations. In this work, we define a new problem, called Minimum-Change Haplotype Configuration (MCHC), that extends the MRHC formulation by also allowing a second type of natural variation event: mutations. We propose an efficient and accurate heuristic algorithm for MCHC based on an L-reduction to a well-known coding problem. Our heuristic can also be used to solve the original MRHC problem, and it can take advantage of additional knowledge about the input genotypes, such as the presence of recombination hotspots and different rates of recombinations and mutations. Finally, we present an extensive experimental evaluation and comparison of our heuristic algorithm with several other state-of-the-art methods for HI on pedigrees under several simulated scenarios.
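To make the notion of "changes" concrete, the following is a minimal sketch of how recombinations and mutations could be counted for a single parent-to-child transmission when the haplotypes are already known. It is not the heuristic proposed in the paper (which relies on an L-reduction to a coding problem); it is a small dynamic program written only for illustration. The function name `min_change_cost`, the 0/1 allele encoding, and the `rec_cost` and `mut_cost` weights are all assumptions introduced here, with the weights standing in for different rates of recombinations and mutations.

```python
def min_change_cost(parent_haps, child_hap, rec_cost=1.0, mut_cost=1.0):
    """Cheapest explanation of child_hap as a copy of the two parental
    haplotypes (hypothetical helper, not taken from the paper).

    A switch between parental strands at consecutive loci costs rec_cost
    (a recombination); a copied allele that differs from the child's
    allele costs mut_cost (a mutation).
    """
    p0, p1 = parent_haps
    if not (len(p0) == len(p1) == len(child_hap)):
        raise ValueError("all haplotypes must have the same number of loci")
    if not child_hap:
        return 0.0
    strands = (p0, p1)
    # cost[s] = best cost so far with the current locus copied from strand s
    cost = [mut_cost * (strands[s][0] != child_hap[0]) for s in (0, 1)]
    for i in range(1, len(child_hap)):
        new_cost = []
        for s in (0, 1):
            stay = cost[s]                   # keep copying the same strand
            switch = cost[1 - s] + rec_cost  # recombine from the other strand
            mismatch = mut_cost * (strands[s][i] != child_hap[i])
            new_cost.append(min(stay, switch) + mismatch)
        cost = new_cost
    return min(cost)


if __name__ == "__main__":
    parents = ([0, 0, 0, 0], [1, 1, 1, 1])
    print(min_change_cost(parents, [0, 0, 1, 1]))  # one strand switch
    print(min_change_cost(parents, [0, 1, 0, 0]))  # one mismatch is cheapest
```

With equal weights, a single mismatching locus is explained by one mutation rather than by two recombinations; raising `mut_cost` makes the double recombination the cheaper explanation. This shows why the relative rates of the two event types matter for which configuration is considered optimal.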



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Yuri Pirola (1)
  • Paola Bonizzoni (1)
  • Tao Jiang (2)
  1. DISCo, Univ. degli Studi di Milano-Bicocca, Milan, Italy
  2. Department of Computer Science and Engineering, University of California, Riverside, USA
