Structural Variant Breakpoint Detection with novoBreak

  • Zechen Chong
  • Ken Chen
Part of the Methods in Molecular Biology book series (MIMB, volume 1833)


Structural variations (SVs) are an important class of genomic variants and often play a critical role in cancer development and progression. Even in the cancer genomics era, detecting SVs from short-read sequencing data remains challenging. We developed a novel algorithm, novoBreak (Chong et al. Nat Methods 14:65–67, 2017), which achieved the highest balanced accuracy (mean of sensitivity and precision) in the ICGC-TCGA DREAM 8.5 Somatic Mutation Calling Challenge. Here we give detailed instructions for applying novoBreak, an open-source software tool, to somatic SV detection. We also briefly describe how to detect germline SVs with the novoBreak pipeline and how to use the novoBreak Workflow on the Seven Bridges Cancer Genomics Cloud.

Key words

Structural variations; Algorithm; Next generation sequencing data analysis; DNA sequence analysis; Genomic rearrangement; De novo assembly; k-mer; Genetic variation


References

  1. Kloosterman WP, Francioli LC, Hormozdiari F et al (2015) Characteristics of de novo structural changes in the human genome. Genome Res 25:792–801
  2. Berger MF, Lawrence MS, Demichelis F et al (2011) The genomic complexity of primary human prostate cancer. Nature 470:214–220
  3. Hillmer AM, Yao F, Inaki K et al (2011) Comprehensive long-span paired-end-tag mapping reveals characteristic patterns of structural variations in epithelial cancer genomes. Genome Res 21:665–675
  4. Campbell PJ, Yachida S, Mudie LJ et al (2010) The patterns and dynamics of genomic instability in metastatic pancreatic cancer. Nature 467:1109–1113
  5. Mertens F, Johansson B, Fioretos T, Mitelman F (2015) The emerging complexity of gene fusions in cancer. Nat Rev Cancer 15:371–381
  6. Chen K, Wallis JW, McLellan MD et al (2009) BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nat Methods 6:677–681
  7. Hormozdiari F, Alkan C, Eichler EE, Sahinalp SC (2009) Combinatorial algorithms for structural variation detection in high-throughput sequenced genomes. Genome Res 19:1270–1278
  8. Abyzov A, Urban AE, Snyder M, Gerstein M (2011) CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing. Genome Res 21:974–984
  9. Ye K, Schulz MH, Long Q et al (2009) Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics 25:2865–2871
  10. Hajirasouliha I, Hormozdiari F, Alkan C et al (2010) Detection and characterization of novel sequence insertions using paired-end next-generation sequencing. Bioinformatics 26:1277–1283
  11. Rausch T, Zichner T, Schlattl A et al (2012) DELLY: structural variant discovery by integrated paired-end and split-read analysis. Bioinformatics 28:i333–i339
  12. Chen K, Chen L, Fan X et al (2014) TIGRA: a targeted iterative graph routing assembler for breakpoint assembly. Genome Res 24:310–317
  13. Alkan C, Coe BP, Eichler EE (2011) Genome structural variation discovery and genotyping. Nat Rev Genet 12:363–376
  14. Medvedev P, Stanciu M, Brudno M (2009) Computational methods for discovering structural variation with next-generation sequencing. Nat Methods 6:S13–S20
  15. Stephens PJ, Greenman CD, Fu B et al (2011) Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 144:27–40
  16. Baca SC, Prandi D, Lawrence MS et al (2013) Punctuated evolution of prostate cancer genomes. Cell 153:666–677
  17. Li Y, Zheng H, Luo R et al (2011) Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly. Nat Biotechnol 29:723–730
  18. Earl D, Bradnam K, John JS, Darling A, Lin D, Fass J, Yu HOK, Buffalo V, Zerbino DR, Diekhans M et al (2011) Assemblathon 1: A competitive assessment of de novo short read assembly methods. Genome Res 21:2224–2241
  19. Chong Z, Ruan J, Gao M et al (2017) novoBreak: local assembly for breakpoint detection in cancer genomes. Nat Methods 14:65–67
  20. Li H (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997 [q-bio.GN]
  21. Warren RL, Sutton GG, Jones SJM, Holt RA (2007) Assembling millions of short DNA sequences using SSAKE. Bioinformatics 23:500–501
  22. Li H, Handsaker B, Wysoker A et al (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Genetics and Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, USA
  2. Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, USA
