An Improved (and Practical) Parameterized Algorithm for the Individual Haplotyping Problem MFR with Mate-Pairs

Algorithmica

Abstract

The Individual Haplotyping MFR problem is a computational problem that, given a set of DNA sequence fragment data of an individual, induces the corresponding haplotypes by dropping the minimum number of fragments. Bafna, Istrail, Lancia, and Rizzi proposed an algorithm of time \(O(2^{2k}m^{2}n+2^{3k}m^{3})\) for the problem, where m is the number of fragments, n is the number of SNP sites, and k is the maximum number of holes in a fragment. When there are mate-pairs in the input data, the parameter k can be as large as 100, which would make the Bafna-Istrail-Lancia-Rizzi algorithm impracticable. The current paper introduces a new algorithm PM-MFR of running time \(O(nk_{2}3^{k_{2}}+m\log m+mk_{1})\), where \(k_{1}\) is the maximum number of SNP sites that a fragment covers (\(k_{1}\) is smaller than n), and \(k_{2}\) is the maximum number of fragments that cover a SNP site (\(k_{2}\) is usually about 10). Since the time complexity of the algorithm PM-MFR is not directly related to the parameter k, the algorithm solves the Individual Haplotyping MFR problem with mate-pairs more efficiently and is more practical in real biological applications.
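To make the three parameters in the abstract concrete, the following is a minimal sketch that computes k, \(k_{1}\), and \(k_{2}\) from a small fragment matrix. It is not taken from the paper. The function name `fragment_parameters`, the string encoding of fragments (one row per fragment, one character per SNP site, `-` for a missing entry), and the toy data are all assumptions made for illustration. The sketch also assumes that a fragment "covers" a site when its entry there is not `-`, and that a "hole" is a `-` lying between the first and last non-missing entries of a fragment.

```python
from typing import List, Tuple

HOLE = "-"  # assumed encoding for a missing entry


def fragment_parameters(fragments: List[str]) -> Tuple[int, int, int]:
    """Return (k, k1, k2) for equal-length fragment rows.

    k  : maximum number of holes in a fragment
    k1 : maximum number of SNP sites covered by a fragment
    k2 : maximum number of fragments covering a SNP site
    """
    n = len(fragments[0]) if fragments else 0
    k = k1 = 0
    site_cover = [0] * n  # how many fragments cover each site
    for row in fragments:
        called = [j for j, c in enumerate(row) if c != HOLE]
        if not called:
            continue  # an all-missing row contributes nothing
        span = called[-1] - called[0] + 1
        k = max(k, span - len(called))  # holes inside the span
        k1 = max(k1, len(called))
        for j in called:
            site_cover[j] += 1
    k2 = max(site_cover, default=0)
    return k, k1, k2


# Hypothetical toy input: the first row mimics a mate-pair, two short
# reads separated by a long gap; the other rows are ordinary fragments.
fragments = [
    "01--------10",
    "-011--------",
    "--1100------",
    "------0011--",
    "--------1101",
]
print(fragment_parameters(fragments))  # (8, 4, 2)
```

In this toy input the single mate-pair row pushes k to 8, while \(k_{1}\) and \(k_{2}\) stay small. The same effect at realistic scale is what separates the two bounds: with k = 100 the factor \(2^{3k}\) is \(2^{300}\), whereas with \(k_{2}=10\) the factor \(3^{k_{2}}\) is 59049.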

References

  1. Venter, J.C., Adams, M.D., Myers, E.W., et al.: The sequence of the Human Genome. Science 291, 1304–1351 (2001)

  2. The International HapMap Consortium: A haplotype map of the human genome. Nature 437, 1299–1320 (2005)

  3. Gabriel, S.B., Schaffner, S.F., Nguyen, H., et al.: The structure of haplotype blocks in the Human Genome. Science 296, 2225–2229 (2002)

  4. Stephens, J.C., Schneider, J.A., Tanguay, D.A., et al.: Haplotype variation and linkage disequilibrium in 313 human genes. Science 293, 489–493 (2001)

  5. Horikawa, Y., Oda, N., Cox, N.J., et al.: Genetic variation in the gene encoding calpain-10 is associated with type 2 diabetes mellitus. Nat. Genet. 26, 163–175 (2000)

  6. Lancia, G., Bafna, V., Istrail, S., Lippert, R., Schwartz, R.: SNPs problems, complexity and algorithms. In: Proc. ESA, pp. 182–193 (2001)

  7. Bafna, V., Istrail, S., Lancia, G., Rizzi, R.: Polynomial and APX-hard cases of the individual haplotyping problem. Theor. Comput. Sci. 335, 109–125 (2005)

  8. Roach, J.C., Wang, K., Hood, L.: Pairwise end sequencing: A unified approach to genomic mapping and sequencing. Genomics 26(2), 345–353 (1995)

  9. International Human Genome Sequencing Consortium: Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001)

  10. The International SNP Map Working Group: A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 409, 928–933 (2001)

  11. Hinds, D.A., Stuve, L.L., Nilsen, G.B., et al.: Whole-genome patterns of common DNA variation in three human populations. Science 307, 1072–1079 (2005)

  12. Sanger, F., Nicklen, S., Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. 74(12), 5463–5467 (1977)

  13. Huson, D.H., Halpern, A.L., Lai, Z., Myers, E.W., Reinert, K., Sutton, G.G.: Comparing assemblies using fragments and mate-pairs. In: Proc. WABI. Lecture Notes in Computer Science, vol. 2149, pp. 294–306. Springer, New York (2001)

  14. Li, L., Khuri, S.: A comparison of DNA fragment assembly algorithms. In: Proc. METMBS, pp. 329–335 (2004)

  15. Wernicke, S.: On the algorithmic tractability of single nucleotide polymorphism (SNP) analysis and related problems. Ph.D. thesis, University of Tübingen (2003)

  16. Panconesi, A., Sozio, M.: Fast hare: A fast heuristic for single individual SNP haplotype reconstruction. In: Proc. WABI. Lecture Notes in Computer Science, vol. 3240, pp. 266–277. Springer, New York (2004)

  17. Hüffner, F.: Algorithm engineering for optimal graph bipartization. In: Proc. WEA. Lecture Notes in Computer Science, vol. 3503, pp. 240–252. Springer, New York (2005)

  18. Myers, G.: A dataset generator for whole genome shotgun sequencing. In: Proc. ISMB, pp. 202–210 (1999)

Author information

Corresponding author

Correspondence to Jianxin Wang.

Additional information

This research was supported in part by the National Natural Science Foundation of China under Grant Nos. 60433020 and 60773111, the Program for New Century Excellent Talents in University No. NCET-05-0683, the Program for Changjiang Scholars and Innovative Research Team in University No. IRT0661, and the Scientific Research Fund of Hunan Provincial Education Department under Grant No. 06C526.

About this article

Cite this article

Xie, M., Wang, J. An Improved (and Practical) Parameterized Algorithm for the Individual Haplotyping Problem MFR with Mate-Pairs. Algorithmica 52, 250–266 (2008). https://doi.org/10.1007/s00453-007-9150-2

  • DOI: https://doi.org/10.1007/s00453-007-9150-2
