Abstract
The identification of cancer driver genes through the analysis of mutations detected with high-throughput sequencing is both a useful tool and a key challenge in cancer genomics. The workflow presented here relies on unpaired tumor RNA-seq samples, thus leveraging already available RNA-seq data and providing the intrinsic benefits of directly targeting the transcriptome. Based on well-established methods for variant detection, this workflow also involves thorough data cleaning and extensive annotation, which enable the selection of somatic mutations with functional impact and the prioritization of genes relevant to the carcinogenic processes in the input samples.
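The final two steps named above (selecting likely somatic mutations with functional impact, then prioritizing genes) can be illustrated with a minimal Python sketch. It is not the chapter's implementation. The `Variant` fields, the consequence set, the 1% population-frequency cutoff, and the ranking by number of mutated samples are all assumptions chosen for illustration, and the gene and sample names are placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One annotated variant call (hypothetical record layout)."""
    sample: str
    gene: str
    consequence: str          # annotated effect, e.g. "missense_variant"
    population_af: float      # allele frequency in a germline population database
    in_somatic_catalog: bool  # reported in a catalog of somatic mutations


# Assumed set of consequences treated as functionally impactful.
IMPACTFUL = {"missense_variant", "stop_gained", "frameshift_variant"}
# Assumed cutoff: without a matched normal sample, variants that are common
# in the population are treated as germline polymorphisms and discarded.
MAX_POPULATION_AF = 0.01


def is_putative_somatic(v: Variant) -> bool:
    """Heuristic stand-in for a matched normal sample."""
    return v.in_somatic_catalog or v.population_af < MAX_POPULATION_AF


def prioritize_genes(variants):
    """Rank genes by the number of distinct samples carrying a retained variant."""
    samples_by_gene = defaultdict(set)
    for v in variants:
        if v.consequence in IMPACTFUL and is_putative_somatic(v):
            samples_by_gene[v.gene].add(v.sample)
    counts = ((gene, len(samples)) for gene, samples in samples_by_gene.items())
    # Most recurrently mutated genes first; ties broken alphabetically.
    return sorted(counts, key=lambda item: (-item[1], item[0]))


# Toy input with placeholder names, for illustration only.
example = [
    Variant("S1", "GENE_A", "missense_variant", 0.0, False),
    Variant("S2", "GENE_A", "stop_gained", 0.0, False),
    Variant("S1", "GENE_B", "missense_variant", 0.20, False),   # common polymorphism
    Variant("S2", "GENE_C", "synonymous_variant", 0.0, False),  # no functional impact
    Variant("S3", "GENE_D", "frameshift_variant", 0.0, True),
]
ranked = prioritize_genes(example)
```

On the toy input, `GENE_B` is dropped as a common polymorphism and `GENE_C` for lacking functional impact, so only `GENE_A` and `GENE_D` are ranked. A real analysis would draw these fields from the annotation step and would likely weigh additional evidence, such as read support and mappability.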
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Mosen-Ansorena, D. (2019). Identification of Mutated Cancer Driver Genes in Unpaired RNA-Seq Samples. In: Krasnitz, A. (eds) Cancer Bioinformatics. Methods in Molecular Biology, vol 1878. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8868-6_5
DOI: https://doi.org/10.1007/978-1-4939-8868-6_5
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-8866-2
Online ISBN: 978-1-4939-8868-6
eBook Packages: Springer Protocols