Use of RAPTR-SV to Identify SVs from Read Pairing and Split Read Signatures

  • Derek M. Bickhart
Part of the Methods in Molecular Biology book series (MIMB, volume 1833)


High-throughput short-read sequencing technologies remain the leading cost-effective means of assessing variation in individual samples. While such technologies are eminently capable of detecting single nucleotide polymorphisms (SNPs) and small insertions and deletions, the detection of large copy number variants (CNVs) with them is prone to numerous false positives. CNV detection tools that incorporate multiple variant signals and exclude regions of systemic bias in the genome tend to reduce the probability of false positive calls and therefore represent the best means of ascertaining true CNV regions. To this end, we provide instructions and details on the use of the RAPTR-SV CNV detection pipeline, a tool that incorporates read-pair and split-read signals to identify high-confidence CNV regions in a sequenced sample. By combining two different structural variant (SV) signals in variant calling, RAPTR-SV enables the easy filtration of artifact CNV calls from large datasets.
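The idea of requiring agreement between two SV signals can be illustrated with a minimal sketch. This is not RAPTR-SV's algorithm or interface: the `CandidateCall` record, the `filter_by_combined_support` function, and the threshold values are all hypothetical and chosen only to show how a call backed by a single signal type might be discarded as a likely artifact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateCall:
    """A candidate CNV region with hypothetical per-signal support counts."""
    chrom: str
    start: int
    end: int
    read_pair_support: int   # discordant read pairs consistent with the call
    split_read_support: int  # split reads consistent with the breakpoints


def filter_by_combined_support(calls, min_read_pairs=2, min_split_reads=1):
    """Keep only calls supported by both signal types.

    The thresholds are illustrative assumptions, not RAPTR-SV defaults.
    """
    return [
        c for c in calls
        if c.read_pair_support >= min_read_pairs
        and c.split_read_support >= min_split_reads
    ]


# Made-up candidates: only the first has support from both signals.
calls = [
    CandidateCall("chr1", 10_000, 15_000, read_pair_support=6, split_read_support=3),
    CandidateCall("chr1", 40_000, 41_200, read_pair_support=5, split_read_support=0),
    CandidateCall("chr2", 7_500, 9_000, read_pair_support=0, split_read_support=4),
]

kept = filter_by_combined_support(calls)
for c in kept:
    print(f"{c.chrom}:{c.start}-{c.end}")
```

In this toy input, the two candidates supported by only one signal type are dropped, which mirrors the general rationale for combined-signal filtering described above.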

Key words

Read pair, Split-read, Combined detection, RAPTR-SV, Whole genome sequencing



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Research Microbiologist/Bioinformatician, USDA ARS DFRC, Madison, USA
