Identification of Copy Number Variants from SNP Arrays Using PennCNV

  • Li Fang
  • Kai Wang
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 1833)

Abstract

High-resolution single-nucleotide polymorphism (SNP) genotyping arrays offer a sensitive and affordable method for genome-wide detection of copy number variants (CNVs). PennCNV is a hidden Markov model (HMM)-based CNV caller for SNP arrays, first released in 2007. A typical CNV calling procedure with PennCNV consists of preparing input files, calling CNVs, filtering the calls, annotating them, and visualizing them. Here we describe several protocols for CNV calling using PennCNV, together with descriptions of several recent improvements to the software tool.

Key words

Copy number variants; SNP array; Hidden Markov model; PennCNV

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Li Fang (1)
  • Kai Wang (1)
  1. Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, USA
