Abstract
Chromosomal rearrangements that create novel gene products, termed fusion genes, have been identified as driving events in the development of multiple types of cancer. Because these gene products typically do not exist in normal cells, they represent valuable prognostic and therapeutic targets. Advances in next-generation sequencing and computational approaches have greatly improved our ability to detect and identify fusion genes. Nevertheless, these approaches require significant computational resources. Here we describe an approach that leverages cloud computing technologies to perform fusion gene detection from RNA sequencing data at any scale. We additionally highlight methods to enhance the reproducibility of bioinformatics analyses, which may be applied to any next-generation sequencing experiment.
© 2016 Springer Science+Business Media New York
About this protocol
Cite this protocol
Arsenijevic, V., Davis-Dusenbery, B.N. (2016). Reproducible, Scalable Fusion Gene Detection from RNA-Seq. In: Grützmann, R., Pilarsky, C. (eds) Cancer Gene Profiling. Methods in Molecular Biology, vol 1381. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3204-7_13
DOI: https://doi.org/10.1007/978-1-4939-3204-7_13
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-3203-0
Online ISBN: 978-1-4939-3204-7
eBook Packages: Springer Protocols