Reproducible, Scalable Fusion Gene Detection from RNA-Seq

  • Protocol
Cancer Gene Profiling

Part of the book series: Methods in Molecular Biology (MIMB, volume 1381)

Abstract

Chromosomal rearrangements that create novel gene products, termed fusion genes, have been identified as driving events in the development of multiple types of cancer. Because these gene products typically do not exist in normal cells, they represent valuable prognostic and therapeutic targets. Advances in next-generation sequencing and computational approaches have greatly improved our ability to detect and identify fusion genes. Nevertheless, these approaches require significant computational resources. Here we describe an approach that leverages cloud computing technologies to perform fusion gene detection from RNA sequencing data at any scale. We additionally highlight methods to enhance the reproducibility of bioinformatics analyses, which may be applied to any next-generation sequencing experiment.

Author information

Corresponding author

Correspondence to Brandi N. Davis-Dusenbery.

Copyright information

© 2016 Springer Science+Business Media New York

About this protocol

Cite this protocol

Arsenijevic, V., Davis-Dusenbery, B.N. (2016). Reproducible, Scalable Fusion Gene Detection from RNA-Seq. In: Grützmann, R., Pilarsky, C. (eds) Cancer Gene Profiling. Methods in Molecular Biology, vol 1381. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3204-7_13

  • DOI: https://doi.org/10.1007/978-1-4939-3204-7_13

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-3203-0

  • Online ISBN: 978-1-4939-3204-7

  • eBook Packages: Springer Protocols
