Copy Number Variation

  • Aurélien Macé
  • Zoltán Kutalik
  • Armand Valsesia
Part of the Methods in Molecular Biology book series (MIMB, volume 1793)


Differences between genomes can be due to single nucleotide polymorphisms (SNPs), translocations, inversions, and copy number variants (CNVs; gains or losses of DNA). CNVs range from sub-microscopic events to complete chromosomal aneuploidies. Small CNVs are often benign, but those larger than 250 kb are strongly associated with morbid consequences such as developmental disorders and cancer. Detecting CNVs within and between populations is essential to better understand the plasticity of our genome and to elucidate the possible contribution of CNVs to disease or phenotypic traits.

While the link between SNPs and disease susceptibility has been well studied, there are still very few published CNV genome-wide association studies. This is probably because CNV analysis remains a somewhat more complex task than SNP analysis, both in terms of the bioinformatics workflow and of the uncertainty in CNV calling, which leads to high false positive rates and unknown false negative rates. This chapter explains computational methods for the analysis of CNVs, from study design, data processing, and quality control through to genome-wide association studies with clinical traits.

Key words

Copy number variation; DNA; Duplication; Deletion; Structural variation; Genome-wide association studies; Human genetics; Human disease


  157. 157.
    Coin LJ, Cao D, Ren J et al (2012) An exome sequencing pipeline for identifying and genotyping common CNVs associated with disease with application to psoriasis. Bioinformatics 28(18):i370–i3i4PubMedPubMedCentralCrossRefGoogle Scholar
  158. 158.
    Fromer M, Moran JL, Chambert K et al (2012) Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth. Am J Hum Genet 91(4):597–607PubMedPubMedCentralCrossRefGoogle Scholar
  159. 159.
    Krumm N, Sudmant PH, Ko A et al (2012) Copy number variation detection and genotyping from exome sequence data. Genome Res 22(8):1525–1532PubMedPubMedCentralCrossRefGoogle Scholar
  160. 160.
    Plagnol V, Curtis J, Epstein M et al (2012) A robust model for read count data in exome sequencing experiments and implications for copy number variant calling. Bioinformatics 28(21):2747–2754PubMedPubMedCentralCrossRefGoogle Scholar
  161. 161.
    Korn JM, Kuruvilla FG, McCarroll SA et al (2008) Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nat Genet 40(10):1253–1260PubMedPubMedCentralCrossRefGoogle Scholar
  162. 162.
    Purcell S, Neale B, Todd-Brown K et al (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81(3):559–575PubMedPubMedCentralCrossRefGoogle Scholar
  163. 163.
    Palta P, Kaplinski L, Nagirnaja L et al (2015) Haplotype phasing and inheritance of copy number variants in nuclear families. PLoS One 10(4):e0122713PubMedPubMedCentralCrossRefGoogle Scholar
  164. 164.
    Chettier R, Ward K, Albertsen HM (2014) Endometriosis is associated with rare copy number variants. PLoS One 9(8):e103968PubMedPubMedCentralCrossRefGoogle Scholar

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Aurélien Macé (1, 2, 3)
  • Zoltán Kutalik (1, 3)
  • Armand Valsesia (4, corresponding author)

  1. Institute of Social and Preventive Medicine, University Hospital of Lausanne, Lausanne, Switzerland
  2. Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  3. Swiss Institute of Bioinformatics, Lausanne, Switzerland
  4. Nestlé Institute of Health Sciences, Lausanne, Switzerland
