Whole-Genome Shotgun Sequence CNV Detection Using Read Depth

  • Fatma Kahveci
  • Can Alkan
Part of the Methods in Molecular Biology book series (MIMB, volume 1833)


With the development of high-throughput sequencing (HTS) technologies, researchers have gained a powerful tool for identifying structural variants (SVs) in genomes at substantially lower cost than before. SVs can be broadly classified into two main categories: balanced rearrangements and copy number variations (CNVs). Many algorithms have been developed to characterize CNVs from HTS data, each focusing on different variant types and size ranges and using different read signatures. Read depth (RD) based tools are more commonly used to characterize large (>10 kb) CNVs because the RD strategy does not depend on fragment size or read length, which are limiting factors in read-pair and split-read analysis. Here we provide a guide to a user-friendly tool for detecting large segmental duplications and deletions that can also predict integer copy numbers for duplicated genes.
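As a rough illustration of the read depth idea summarized above, the sketch below converts per-window read depths into integer copy numbers by scaling against windows assumed to be diploid. It is a minimal sketch of the general RD principle only, not the tool's actual procedure: the function name `estimate_copy_numbers`, the `control_indices` argument, the example depth values, and the choice to skip steps such as GC correction are all assumptions made for illustration.

```python
from statistics import mean


def estimate_copy_numbers(window_depths, control_indices, ploidy=2):
    """Estimate an integer copy number for each window from read depth.

    window_depths   -- average read depth per genomic window (hypothetical input)
    control_indices -- indices of windows assumed to have the normal copy
                       number (an assumption of this sketch)
    ploidy          -- copy number assigned to the control windows
    """
    # Mean depth of the control windows defines what `ploidy` copies look like.
    control_mean = mean(window_depths[i] for i in control_indices)
    if control_mean <= 0:
        raise ValueError("control windows have no coverage")

    # Depth scales roughly linearly with copy number, so rescale and round.
    return [round(ploidy * depth / control_mean) for depth in window_depths]


# Made-up depths: three control windows near 30x, then a duplicated window,
# a window with one copy lost, and a window with both copies lost.
depths = [30.0, 31.0, 29.0, 61.0, 14.0, 0.5]
copy_numbers = estimate_copy_numbers(depths, control_indices=[0, 1, 2])
print(copy_numbers)  # [2, 2, 2, 4, 1, 0]
```

Rounding to the nearest integer is the simplest possible choice here; a real analysis would also need to account for sequencing biases before trusting the scaled depth.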

Key words

Copy number variation, Whole genome shotgun sequencing, Read depth, mrFAST, mrsFAST



Abbreviations

CNV: Copy number variation
mrCaNaVaR: Micro read Copy Number Variant Regions
mrFAST: Micro read Fast Alignment Search Tool
mrsFAST: Micro read substitution only Fast Alignment Search Tool
RD: Read depth
TRF: Tandem repeat finder
WGS: Whole genome sequencing



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Engineering, Bilkent University, Ankara, Turkey
