
Haplotype Inference Via Hierarchical Genotype Parsing

  • Conference paper
Algorithms in Bioinformatics (WABI 2007)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4645)


Abstract

Within-species genetic variation due to recombination gives DNA a mosaic-like structure. This structure can be modeled, e.g., by parsing sample sequences of current DNA with respect to a small number of founders. The founders represent the ancestral sequence material from which the sample was created in a sequence of recombination steps. This scenario has recently been applied successfully to developing probabilistic Hidden Markov Methods for haplotyping genotypic data. In this paper we introduce a combinatorial method for haplotyping that is based on a similar parsing idea. We formulate a polynomial-time parsing algorithm that finds a minimum cross-over parse in a simplified ‘flat’ parsing model that ignores the historical hierarchy of recombinations. The problem of constructing optimal founders that would give the minimum possible parse for given genotypic sequences is shown to be NP-hard. A heuristic, locally optimal algorithm is given for founder construction. Combined with flat parsing, this already gives quite good haplotyping results. Improved haplotyping is obtained by using a hierarchical parsing that properly models the natural recombination process. For finding short hierarchical parses, a greedy polynomial-time algorithm is given. Empirical haplotyping results on HapMap data are reported.
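To make the flat parsing idea concrete, the following is a minimal sketch of a dynamic program that finds a minimum cross-over parse of one genotype against a fixed founder set. It is not taken from the paper and is not claimed to be the authors' algorithm. The genotype encoding (0 and 1 for homozygous sites, 2 for heterozygous), the requirement of exact matching with no missing data or mutations, and the names `compatible` and `min_crossover_parse` are all assumptions made for illustration.

```python
from itertools import product

HET = 2  # assumed encoding: 0 = homozygous 0, 1 = homozygous 1, 2 = heterozygous


def compatible(g, a, b):
    """True if founder alleles a and b (each 0 or 1) can produce genotype code g."""
    if g == HET:
        return a != b
    return a == g and b == g


def min_crossover_parse(genotype, founders):
    """Flat-model parse of a non-empty genotype as a pair of founder mosaics.

    Returns (cost, hap1, hap2), where cost is the minimum total number of
    founder switches over both haplotypes, or None if no exact parse exists.
    A state is an ordered pair (a, b) of founder indices, so this naive
    version takes O(m * K^4) time for m sites and K founders.
    """
    m, k = len(genotype), len(founders)
    inf = float("inf")
    states = list(product(range(k), repeat=2))

    def ok(state, j):
        a, b = state
        return compatible(genotype[j], founders[a][j], founders[b][j])

    # cost[s] = cheapest parse of sites 0..j that ends in state s
    cost = {s: (0 if ok(s, 0) else inf) for s in states}
    back = []  # back[j - 1][s] = best predecessor of state s at site j
    for j in range(1, m):
        new_cost, ptr = {}, {}
        for s in states:
            best, arg = inf, None
            if ok(s, j):
                for p in states:
                    # one cross-over for each haplotype that changes founder
                    c = cost[p] + (p[0] != s[0]) + (p[1] != s[1])
                    if c < best:
                        best, arg = c, p
            new_cost[s], ptr[s] = best, arg
        cost = new_cost
        back.append(ptr)

    end = min(states, key=cost.get)
    if cost[end] == inf:
        return None

    # Trace the optimal state sequence backwards, then read off the alleles.
    path = [end]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    hap1 = [founders[a][j] for j, (a, _) in enumerate(path)]
    hap2 = [founders[b][j] for j, (_, b) in enumerate(path)]
    return cost[end], hap1, hap2
```

For example, with the hypothetical founders `[[0, 0, 1, 1], [1, 1, 0, 0]]` and the all-homozygous genotype `[0, 0, 0, 0]`, both haplotypes must switch founders once, so the sketch reports a cost of 2. The hierarchical model described in the abstract is not covered by this sketch.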




Editor information

Raffaele Giancarlo, Sridhar Hannenhalli


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rastas, P., Ukkonen, E. (2007). Haplotype Inference Via Hierarchical Genotype Parsing. In: Giancarlo, R., Hannenhalli, S. (eds) Algorithms in Bioinformatics. WABI 2007. Lecture Notes in Computer Science, vol 4645. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74126-8_9


  • DOI: https://doi.org/10.1007/978-3-540-74126-8_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74125-1

  • Online ISBN: 978-3-540-74126-8

  • eBook Packages: Computer Science, Computer Science (R0)
