A Practical Exact Algorithm for the Individual Haplotyping Problem MEC/GI

Haplotypes play an important role in genetic association studies of complex diseases. Recently, computational techniques that help determine human haplotypes have been studied extensively. Given the genotype and the aligned single nucleotide polymorphism (SNP) fragments of an individual, Minimum Error Correction with Genotype Information (MEC/GI) is an important computational model for inferring a pair of haplotypes compatible with the genotype by correcting the minimum number of SNPs in the given SNP fragments. The MEC/GI problem has been proven NP-hard, and there is no practical exact algorithm for it. Despite rapid advances in molecular biological techniques, modern high-throughput sequencers cannot directly sequence a DNA fragment that contains more than 1200 nucleotide bases. Because SNP density is low, currently available data reveal that the number k of SNP sites that a DNA fragment covers is usually smaller than 10. Based on this observation, we develop a new dynamic programming algorithm with running time O(mk2^k + m log m + mk), where m is the number of fragments. Since k is small in real biological applications, the algorithm is practical and efficient.
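To make the MEC/GI objective concrete, here is a minimal brute-force sketch. It is not the dynamic programming algorithm of the paper; it only illustrates what is being minimized. The encodings are assumptions made for illustration: each fragment is a string over '0', '1' and '-' (a site the fragment does not cover), and the genotype is a string in which '0' or '1' marks a homozygous site and '2' marks a heterozygous site. The function names are hypothetical.

```python
from itertools import product


def distance(fragment, haplotype):
    """Count sites where the fragment disagrees with the haplotype (gaps ignored)."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != '-' and f != h)


def mec_gi_brute_force(fragments, genotype):
    """Return (minimum corrections, haplotype pair) by exhaustive search.

    Assumed encoding: genotype[j] in {'0', '1'} means both haplotypes carry
    that allele at site j; genotype[j] == '2' means the haplotypes differ there.
    """
    het_sites = [j for j, g in enumerate(genotype) if g == '2']
    best_cost, best_pair = None, None
    # Try every assignment of alleles to h1 at the heterozygous sites;
    # h2 is forced to the complementary allele so the pair matches the genotype.
    for bits in product('01', repeat=len(het_sites)):
        h1 = list(genotype)
        h2 = list(genotype)
        for j, b in zip(het_sites, bits):
            h1[j] = b
            h2[j] = '1' if b == '0' else '0'
        h1, h2 = ''.join(h1), ''.join(h2)
        # Each fragment is charged the corrections needed to match its closer haplotype.
        cost = sum(min(distance(f, h1), distance(f, h2)) for f in fragments)
        if best_cost is None or cost < best_cost:
            best_cost, best_pair = cost, (h1, h2)
    return best_cost, best_pair
```

This search is exponential in the number of heterozygous sites of the whole genotype, so it is usable only on toy inputs. The algorithm described in the abstract instead has an exponential term that depends only on k, the number of SNP sites a single fragment covers.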

Correspondence to Minzhu Xie.

A preliminary version of this paper was presented at The 14th Annual International Computing and Combinatorics Conference (COCOON 2008), June 27–29, 2008, Dalian, China. This research was supported in part by the National Natural Science Foundation of China under Grant No. 60773111, the National Basic Research 973 Program of China No. 2008CB317107, Postdoctoral Science Foundation of Central South University, the Program for New Century Excellent Talents in University No. NCET-05-0683, and the Program for Changjiang Scholars and Innovative Research Team in University No. IRT0661.

Wang, J., Xie, M. & Chen, J. A Practical Exact Algorithm for the Individual Haplotyping Problem MEC/GI. Algorithmica 56, 283–296 (2010).

