Frontiers of Mathematics in China, Volume 6, Issue 6, pp 1203–1216

A study of biases of DNA copy number estimation based on PICR model

  • Quan Wang
  • Jianghan Qu
  • Xiaoxing Cheng
  • Yongjian Kang
  • Lin Wan
  • Minping Qian
  • Minghua Deng
Research Article


Affymetrix single-nucleotide polymorphism (SNP) arrays have been widely used for SNP genotype calling and copy number variation (CNV) studies, both of which depend heavily on accurate DNA copy number estimation. However, methods for copy number estimation face several difficulties: probe-dependent binding affinity, cross-hybridization of probes, and whole genome amplification (WGA) of DNA sequences. The probe intensity composite representation (PICR) model, a previously established approach, can cope with most of these complexities and achieves high accuracy in SNP genotyping. Nevertheless, the copy numbers estimated by the PICR model still show array-dependent and site-dependent biases in CNV studies. In this paper, we propose a procedure to adjust for these biases and then make CNV inferences based on both the PICR model and our method. The comparison indicates that our correction of copy numbers is necessary for CNV studies.


single-nucleotide polymorphism (SNP) array; copy number variation (CNV); cross-hybridization


62P10 68U01 92D20 





Copyright information

© Higher Education Press and Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Quan Wang (1)
  • Jianghan Qu (2)
  • Xiaoxing Cheng (3)
  • Yongjian Kang (2)
  • Lin Wan (1, 3, 4)
  • Minping Qian (1, 3)
  • Minghua Deng (1, 3, 5)

  1. Center for Theoretical Biology, Peking University, Beijing, China
  2. Yuanpei College, Peking University, Beijing, China
  3. School of Mathematical Sciences, Peking University, Beijing, China
  4. Molecular and Computational Biology, University of Southern California, Los Angeles, USA
  5. Center for Statistical Sciences, Peking University, Beijing, China
