Algorithms for Imperfect Phylogeny Haplotyping (IPPH) with a Single Homoplasy or Recombination Event

Song, Yun S.; Wu, Yufeng; Gusfield, Dan

doi:10.1007/11557067_13

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3692)

Included in the following conference series:

International Workshop on Algorithms in Bioinformatics

Abstract

The haplotype inference (HI) problem is the problem of inferring n haplotype pairs (2n haplotypes in total) from n observed genotype vectors. This is a key problem that arises in studying genetic variation in populations, for example in the ongoing HapMap project [5]. In order to have a hope of finding the haplotypes that actually generated the observed genotypes, we must use some (implicit or explicit) genetic model of the evolution of the underlying haplotypes. The Perfect Phylogeny Haplotyping (PPH) model was introduced in 2002 [9] to reflect the “neutral coalescent” or “perfect phylogeny” model of haplotype evolution. The PPH problem (which can be solved in polynomial time) is to determine whether there is an HI solution where the inferred haplotypes can be derived on a perfect phylogeny (tree).
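
To make these definitions concrete, the following is a minimal sketch, not taken from the paper. It assumes the common encoding in which a haplotype is a 0/1 vector and a genotype is a 0/1/2 vector, with 2 marking a heterozygous site. All function names are hypothetical. It checks whether a haplotype pair explains a genotype, and it uses the four-gamete test, a standard characterization of when binary sequences fit a perfect phylogeny whose root sequence is not fixed in advance, to verify a candidate HI solution. It only verifies a given solution; it does not search for one.

```python
from itertools import combinations

def explains(h1, h2, g):
    """True if haplotypes h1 and h2 (0/1 tuples) combine into genotype g (0/1/2 tuple)."""
    if not (len(h1) == len(h2) == len(g)):
        return False
    for a, b, site in zip(h1, h2, g):
        if site == 2:
            # Heterozygous site: the two haplotypes must differ.
            if a == b:
                return False
        elif not (a == b == site):
            # Homozygous site: both haplotypes must carry that state.
            return False
    return True

def passes_four_gamete_test(haplotypes):
    """True if no pair of sites shows all four gametes 00, 01, 10, 11."""
    if not haplotypes:
        return True
    num_sites = len(haplotypes[0])
    for i, j in combinations(range(num_sites), 2):
        if len({(h[i], h[j]) for h in haplotypes}) == 4:
            return False
    return True

def is_pph_solution(genotypes, pairs):
    """True if pairs[k] explains genotypes[k] for all k and the
    inferred haplotypes pass the four-gamete test."""
    if len(genotypes) != len(pairs):
        return False
    if not all(explains(h1, h2, g) for (h1, h2), g in zip(pairs, genotypes)):
        return False
    haplotypes = [h for pair in pairs for h in pair]
    return passes_four_gamete_test(haplotypes)

# Toy data (assumed for illustration): three genotypes over two sites.
genotypes = [(2, 2), (0, 0), (1, 1)]
good = [((0, 0), (1, 1)), ((0, 0), (0, 0)), ((1, 1), (1, 1))]
bad = [((0, 1), (1, 0)), ((0, 0), (0, 0)), ((1, 1), (1, 1))]
```

Both `good` and `bad` explain every genotype, but only `good` passes the four-gamete test: resolving the first genotype as 01/10 alongside the haplotypes 00 and 11 produces all four gametes.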

Since the introduction of the PPH model, several extensions and modifications of the PPH model have been examined. The most important modification, to model biological reality better, is to allow a limited number of biological events that violate the perfect phylogeny model. This was accomplished implicitly in [7,12] with the inclusion of several heuristics into an algorithm for the PPH problem [8]. Those heuristics are invoked when the genotype data cannot be explained with haplotypes that fit the perfect phylogeny model. In this paper, we address the issue explicitly, by allowing one recombination or homoplasy event in the model of haplotype evolution. We formalize the problems and provide a polynomial time solution for one problem, using an additional, empirically-supported assumption. We present a related framework for the second problem which gives a practical algorithm. We believe the second problem can be solved in polynomial time.
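
As a toy illustration of why a single such event breaks the perfect phylogeny model, the sketch below starts from haplotypes that fit a perfect phylogeny and adds one haplotype produced either by a single-crossover recombination or by a recurrent mutation (one form of homoplasy). This is an illustrative assumption-laden example with hypothetical function names, not the algorithm developed in the paper.

```python
def recombine(prefix_parent, suffix_parent, breakpoint):
    """Single-crossover recombinant: sites before breakpoint come from
    prefix_parent, the remaining sites from suffix_parent."""
    return prefix_parent[:breakpoint] + suffix_parent[breakpoint:]

def mutate(haplotype, site):
    """Flip the state at one site, modelling a recurrent or back mutation."""
    flipped = 1 - haplotype[site]
    return haplotype[:site] + (flipped,) + haplotype[site + 1:]

def has_four_gametes(haplotypes, i, j):
    """True if sites i and j together show all of 00, 01, 10, 11."""
    return len({(h[i], h[j]) for h in haplotypes}) == 4

# Three haplotypes over two sites; only three gametes appear,
# so they fit a perfect phylogeny.
base = [(0, 0), (0, 1), (1, 0)]

# One recombination: site 0 from (1, 0), site 1 from (0, 1).
recombinant = recombine((1, 0), (0, 1), 1)

# One homoplasy: site 1 mutates a second time, on the lineage of (1, 0).
recurrent = mutate((1, 0), 1)
```

In this toy case both events yield the same new haplotype, 11, which completes the fourth gamete. This shows that the sequences alone do not always reveal which kind of event occurred.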

Research partially supported by grant EIA-0220154 from the National Science Foundation.

References

1. Allen, B.L., Steel, M.: Subtree transfer operations and their induced metrics on evolutionary trees. Ann. Combin. 5, 1–13 (2001)
2. Bafna, V., Gusfield, D., Lancia, G., Yooseph, S.: Haplotyping as perfect phylogeny: A direct approach. J. Comput. Biol. 10, 323–340 (2003)
3. Barzuza, T., Beckman, J.S., Shamir, R., Pe’er, I.: Computational problems in perfect phylogeny haplotyping: XOR genotypes and tag SNPs. In: Proc. of CPM, pp. 14–31 (2004)
4. Chung, R.H., Gusfield, D.: Empirical exploration of perfect phylogeny haplotyping and haplotypers. In: Warnow, T.J., Zhu, B. (eds.) COCOON 2003. LNCS, vol. 2697, pp. 5–19. Springer, Heidelberg (2003)
5. International HapMap Consortium: The HapMap project. Nature 426, 789–796 (2003)
6. Ding, Z., Filkov, V., Gusfield, D.: A linear-time algorithm for the perfect phylogeny haplotyping problem. In: Miyano, S., Mesirov, J., Kasif, S., Istrail, S., Pevzner, P.A., Waterman, M. (eds.) RECOMB 2005. LNCS (LNBI), vol. 3500, pp. 585–600. Springer, Heidelberg (2005)
7. Eskin, E., Halperin, E., Karp, R.: Large scale reconstruction of haplotypes from genotype data. In: Proc. of RECOMB, pp. 104–113 (2003)
8. Eskin, E., Halperin, E., Karp, R.M.: Efficient reconstruction of haplotype structure via perfect phylogeny. J. Bioinf. Comput. Biol. 1, 1–20 (2003)
9. Gusfield, D.: Haplotyping as perfect phylogeny: Conceptual framework and efficient solutions (Extended Abstract). In: Proc. of RECOMB, pp. 166–175 (2002)
10. Gusfield, D.: Optimal, efficient reconstruction of Root-Unknown phylogenetic networks with constrained recombination. J. Comput. Sys. Sci. 70, 381–398 (2005)
11. Gusfield, D., Eddhu, S., Langley, C.: Optimal, efficient reconstruction of phylogenetic networks with constrained recombination. J. Bioinf. Comput. Biol. 2(1), 173–213 (2004)
12. Halperin, E., Eskin, E.: Haplotype reconstruction from genotype data using Imperfect Phylogeny. Bioinformatics 20, 1842–1849 (2004)
13. Hein, J.: Reconstructing evolution of sequences subject to recombination using parsimony. Math. Biosci. 98, 185–200 (1990)
14. Hudson, R.: Gene genealogies and the coalescent process. Oxford Survey of Evolutionary Biology 7, 1–44 (1990)
15. Hudson, R.: Generating samples under the Wright-Fisher neutral model of genetic variation. Bioinformatics 18, 337–338 (2002)
16. Lin, S., Cutler, D.J., Zwick, M.E., Chakravarti, A.: Haplotype inference in random population samples. Am. J. Hum. Genet. 71, 1129–1137 (2002)
17. Semple, C., Steel, M.: Phylogenetics. Oxford University Press, Oxford (2003)
18. Song, Y.S.: On the combinatorics of rooted binary phylogenetic trees. Ann. Combin. 7, 365–379 (2003)
19. Song, Y.S., Hein, J.: Constructing minimal ancestral recombination graphs. J. Comput. Biol. 12, 147–169 (2005)
20. Stephens, M., Smith, N.J., Donnelly, P.: A new statistical method for haplotype reconstruction from population data. Am. J. Hum. Genet. 68, 978–989 (2001)
21. Tavaré, S.: Calibrating the clock: Using stochastic processes to measure the rate of evolution. In: Lander, E., Waterman, M. (eds.) Calculating the Secrets of Life. National Academy Press, Washington (1995)

Author information

Authors and Affiliations

Department of Computer Science, University of California, Davis, CA, 95616, USA
Yun S. Song, Yufeng Wu & Dan Gusfield

Editor information

Editors and Affiliations

Biocomputing Group, University of Bologna, Italy
Rita Casadio
Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, Virginia, USA
Gene Myers

Cite this paper

Song, Y.S., Wu, Y., Gusfield, D. (2005). Algorithms for Imperfect Phylogeny Haplotyping (IPPH) with a Single Homoplasy or Recombination Event. In: Casadio, R., Myers, G. (eds) Algorithms in Bioinformatics. WABI 2005. Lecture Notes in Computer Science, vol 3692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11557067_13

DOI: https://doi.org/10.1007/11557067_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29008-7
Online ISBN: 978-3-540-31812-5
eBook Packages: Computer Science, Computer Science (R0)
