Identification of Deletion Polymorphisms from Haplotypes

  • Erik Corona
  • Benjamin Raphael
  • Eleazar Eskin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4453)


Numerous efforts are underway to catalog genetic variation in human populations. While the majority of studies of genetic variation have focused on single base pair differences between individuals, i.e., single nucleotide polymorphisms (SNPs), several recent studies have demonstrated that larger-scale structural variation, including copy number polymorphisms and inversion polymorphisms, is also common. However, direct techniques for detecting and validating structural variants are generally much more expensive than those for SNPs. For some types of structural variation, in particular deletions, the polymorphism produces a distinct signature in the SNP data. In this paper, we describe a new probabilistic method for detecting deletion polymorphisms from SNP data. The key idea of our method is to estimate the frequencies of the haplotypes in a region of the genome both with and without the possibility of a deletion in the region, and then to apply a generalized likelihood ratio test to assess the significance of a deletion. Application of our method to the HapMap Phase I data revealed 319 candidate deletions, of which 142 overlap with variants identified in earlier studies and 177 are novel. Using Phase II HapMap data, we predict 6730 deletions.
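The idea of fitting frequencies with and without a deletion and comparing the two fits can be illustrated on a much smaller problem than the one the paper addresses. The sketch below is not the authors' method: it assumes a single SNP with alleles A and B instead of multi-SNP haplotypes, assumes Hardy-Weinberg proportions, and assumes that an individual carrying one deleted copy is called as a homozygote while an individual carrying two deleted copies produces no call. All function names are hypothetical. The deletion model is fitted with a gene-counting EM iteration, and the two maximized log-likelihoods are combined into a generalized likelihood ratio statistic.

```python
import math


def log_lik(counts, probs):
    """Multinomial log-likelihood (up to a constant), skipping empty classes."""
    total = 0.0
    for n, p in zip(counts, probs):
        if n == 0:
            continue
        if p <= 0.0:
            return float("-inf")
        total += n * math.log(p)
    return total


def genotype_probs(a, b, d):
    """Probabilities of the observed classes (AA, AB, BB, no-call).

    Assumption: A/del is called AA, B/del is called BB, del/del is a no-call.
    """
    return (a * a + 2 * a * d, 2 * a * b, b * b + 2 * b * d, d * d)


def fit_without_deletion(counts):
    """Closed-form fit with the deletion frequency fixed at zero."""
    n_aa, n_ab, n_bb, _ = counts
    a = (2 * n_aa + n_ab) / (2 * (n_aa + n_ab + n_bb))
    freqs = (a, 1.0 - a, 0.0)
    return freqs, log_lik(counts, genotype_probs(*freqs))


def fit_with_deletion(counts, iterations=2000):
    """Gene-counting EM for allele frequencies (A, B, deletion)."""
    n_aa, n_ab, n_bb, n_null = counts
    n = sum(counts)
    a = b = d = 1.0 / 3.0
    for _ in range(iterations):
        # E-step: expected number of deleted copies hidden in AA and BB calls.
        del_in_aa = n_aa * 2 * d / (a + 2 * d)
        del_in_bb = n_bb * 2 * d / (b + 2 * d)
        exp_a = 2 * n_aa - del_in_aa + n_ab
        exp_b = 2 * n_bb - del_in_bb + n_ab
        exp_d = del_in_aa + del_in_bb + 2 * n_null
        # M-step: frequencies are expected allele counts over 2n chromosomes.
        a, b, d = exp_a / (2 * n), exp_b / (2 * n), exp_d / (2 * n)
    return (a, b, d), log_lik(counts, genotype_probs(a, b, d))


def deletion_glrt(counts):
    """Return (statistic, fitted frequencies) for counts (AA, AB, BB, no-call)."""
    _, ll0 = fit_without_deletion(counts)
    freqs, ll1 = fit_with_deletion(counts)
    # The models are nested, so the exact statistic is non-negative; EM only
    # approaches the maximum, so tiny negative values are clamped to zero.
    return max(0.0, 2.0 * (ll1 - ll0)), freqs


if __name__ == "__main__":
    for counts in [(25, 50, 25, 0), (45, 10, 45, 0)]:
        stat, freqs = deletion_glrt(counts)
        print(counts, round(stat, 3), [round(f, 3) for f in freqs])
```

A sample with a strong deficit of heterozygotes, such as the second input, yields a large statistic, while a sample in exact Hardy-Weinberg proportions yields a statistic near zero. How the statistic is turned into a significance level is left out of this sketch; the deletion frequency sits on the boundary of the parameter space under the null model, so the usual chi-square reference distribution should not be assumed without checking.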


False Positive Rate; Haplotype Frequency; HapMap Data; Deletion Polymorphism; Generalized Likelihood Ratio Test
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Erik Corona (1)
  • Benjamin Raphael (2)
  • Eleazar Eskin (3)
  1. Dept. of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92092
  2. Dept. of Computer Science & Center for Computational Molecular Biology, Brown University, Providence, RI 02912
  3. Dept. of Computer Science, Dept. of Human Genetics, University of California, Los Angeles, CA 90095
