Co-fuse: a new class discovery analysis tool to identify and prioritize recurrent fusion genes from RNA-sequencing data

  • Original Article
  • Molecular Genetics and Genomics

Abstract

Recurrent oncogenic fusion genes play a critical role in the development of various cancers and other diseases and, in some cases, provide excellent therapeutic targets. To date, analysis tools that can identify and compare recurrent fusion genes across multiple samples have not been available to researchers. To address this deficiency, we developed Co-occurrence Fusion (Co-fuse), a new and easy-to-use software tool that enables biologists to merge RNA-seq information and identify recurrent fusion genes without the need for exhaustive data processing. Notably, Co-fuse is based on pattern mining and statistical analysis, which enables the identification of hidden patterns of recurrent fusion genes. In this report, we show that Co-fuse can be used to identify two distinct groups within a set of 49 leukemic cell lines based on their recurrent fusion genes: a cluster enriched for multiple myeloma (MM) samples and a cluster enriched for acute myeloid leukemia (AML) samples. Our experimental results further demonstrate that Co-fuse can identify known driver fusion genes (e.g., IGH-MYC, IGH-WHSC1) in MM when compared to AML samples, indicating the potential of Co-fuse to aid the discovery of as yet unknown driver fusion genes through cohort comparisons. Additionally, using an RNA-seq dataset of 272 primary glioma samples, Co-fuse was able to validate recurrent fusion genes, further demonstrating the power of this analysis tool. Taken together, Co-fuse is a powerful new analysis tool that can be readily applied to large RNA-seq datasets and may lead to the discovery of new disease subgroups and, potentially, new driver genes for which targeted therapies could be developed. The Co-fuse R source code is publicly available at https://github.com/sakrapee/co-fuse.
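The idea of a cohort comparison on recurrent fusion genes can be illustrated with a minimal sketch. It is not the Co-fuse implementation: the abstract does not specify the pattern-mining or statistical procedure, so the sketch assumes a simple per-fusion sample count followed by a two-sided Fisher's exact test. It is written in Python rather than R purely for illustration. The sample IDs, the fusion calls assigned to each sample, and the placeholder names GENE_A-GENE_B and GENE_C-GENE_D are all invented toy data, not results from the study.

```python
from collections import Counter
from math import comb

# Toy input, invented for illustration: sample ID -> fusion genes called in it.
mm_samples = {
    "MM_1": {"IGH-MYC", "IGH-WHSC1"},
    "MM_2": {"IGH-WHSC1"},
    "MM_3": {"IGH-MYC"},
    "MM_4": {"IGH-WHSC1", "GENE_A-GENE_B"},  # GENE_* names are placeholders
}
aml_samples = {
    "AML_1": {"GENE_A-GENE_B"},
    "AML_2": set(),
    "AML_3": {"GENE_C-GENE_D"},
    "AML_4": {"GENE_A-GENE_B"},
}


def recurrence(samples):
    """Count, for each fusion gene, how many samples carry it."""
    counts = Counter()
    for fusions in samples.values():
        counts.update(fusions)  # sets: a fusion counts once per sample
    return counts


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x carriers falling in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))


def compare_cohorts(group1, group2):
    """Return (fusion, carriers in group1, carriers in group2, p) sorted by p."""
    c1, c2 = recurrence(group1), recurrence(group2)
    n1, n2 = len(group1), len(group2)
    rows = []
    for fusion in set(c1) | set(c2):
        a, c = c1[fusion], c2[fusion]
        rows.append((fusion, a, c, fisher_exact_two_sided(a, n1 - a, c, n2 - c)))
    return sorted(rows, key=lambda r: (r[3], r[0]))


if __name__ == "__main__":
    for fusion, n_mm, n_aml, p in compare_cohorts(mm_samples, aml_samples):
        print(f"{fusion}\tMM={n_mm}\tAML={n_aml}\tp={p:.3f}")
```

With cohorts this small no fusion reaches conventional significance; the sketch only shows how per-cohort recurrence counts can be turned into a ranked list. A real analysis would also need multiple-testing correction and filtering of likely false-positive fusion calls.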

Acknowledgements

C. H. K. is a recipient of the Mary Overton research fellowship.

Author information

Corresponding author

Correspondence to Chung Hoow Kok.

Ethics declarations

Conflict of interest

Sakrapee Paisitkriangkrai declares that he has no conflict of interest. Kelly Quek declares that she has no conflict of interest. Eva Nievergall declares that she has no conflict of interest. Anissa Jabbour declares that she has no conflict of interest. Andrew Zannettino declares that he has no conflict of interest. Chung H Kok declares that he has no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Data availability

The RNA-seq leukemic cell line dataset analysed during the current study is publicly available at the Cancer Genomics Hub (http://cghub.ucsc.edu) and the NCI Genomic Data Commons (https://gdc.nci.nih.gov/). The raw sequencing data for the 272 glioma clinical samples analysed during the current study are publicly available at the NCBI Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE48865. The raw FusionCatcher analysis results generated during this study are included in this published article as supplementary information files.

Additional information

Communicated by S. Hohmann.

Cite this article

Paisitkriangkrai, S., Quek, K., Nievergall, E. et al. Co-fuse: a new class discovery analysis tool to identify and prioritize recurrent fusion genes from RNA-sequencing data. Mol Genet Genomics 293, 1217–1229 (2018). https://doi.org/10.1007/s00438-018-1454-1

  • DOI: https://doi.org/10.1007/s00438-018-1454-1
