Algorithm Implementation for CNV Discovery Using Affymetrix and Illumina SNP Array Data

Part of the Methods in Molecular Biology book series (MIMB, volume 838)


SNP array data can be analysed not only to call SNP alleles but also to determine the absolute copy number of a genomic segment. This chapter focuses on detecting copy number (CN) change using intensity data from SNP arrays. Methods incorporating data from the two main genotyping platforms, Affymetrix and Illumina, are described, and the options and problems that may be encountered are examined. We discuss the importance of quality control when using this analysis method and present guidelines for implementation, both before and after the algorithm is run. A discussion of the algorithms available for CN detection is included, along with ideas for further analysis protocols.
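
As a rough illustration of what detecting CN change from intensity data involves, the sketch below computes a log2 ratio of observed to expected probe intensity and groups consecutive probes that pass a threshold into candidate segments. It is a simplified stand-in and not the method of this chapter: the function names, the thresholds (-0.3 and 0.3), and the minimum of three probes are all hypothetical choices made for the example, and published tools typically rely on statistical models such as hidden Markov models rather than fixed cut-offs.

```python
import math


def log_r_ratio(observed, expected):
    """Return log2(observed / expected) for each probe intensity pair."""
    return [math.log2(o / e) for o, e in zip(observed, expected)]


def call_segments(lrr, loss=-0.3, gain=0.3, min_probes=3):
    """Group consecutive probes passing a threshold into candidate segments.

    The thresholds and min_probes are illustrative assumptions, not
    recommended values. Returns (start_index, end_index, state) tuples,
    with end_index inclusive.
    """
    segments = []
    start, state = 0, None
    # A trailing neutral value acts as a sentinel that closes the last run.
    for i, value in enumerate(list(lrr) + [0.0]):
        if value <= loss:
            current = "loss"
        elif value >= gain:
            current = "gain"
        else:
            current = None
        if current != state:
            # Keep the finished run only if it spans enough probes.
            if state is not None and i - start >= min_probes:
                segments.append((start, i - 1, state))
            start, state = i, current
    return segments


# Made-up intensities: a three-probe drop followed by a four-probe rise.
observed = [100, 100, 50, 50, 50, 100, 150, 150, 150, 150]
expected = [100] * len(observed)
print(call_segments(log_r_ratio(observed, expected)))
# [(2, 4, 'loss'), (6, 9, 'gain')]
```

Requiring several consecutive probes is one simple way to reduce calls driven by single noisy probes, which is part of why quality control before and after calling matters.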

Key words

SNP array; Copy number; Detection algorithm; Copy number variant; Illumina; Affymetrix



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Wellcome Trust Centre for Human Genetics, Oxford University, Oxford, UK
