Big Data and Cancer Research



The advent of high-throughput technology has revolutionized biological sciences in the last two decades enabling experiments on the whole genome scale. Data from such large-scale experiments are interpreted at system’s level to understand the interplay among genome, transcriptome, epigenome, proteome, metabolome, and regulome.


Single Tumor Cell Variant Call Format International Cancer Genome Consortium Cancer Genome Project Variant Call Format File 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Research in Ganit Labs, Bio-IT Centre is funded by grants from the Government of India agencies (Department of Electronics and Information Technology; Department of Biotechnology; Department of Science and Technology; and the Council of Scientific and Industrial Research) and Department of Information Technology, Biotechnology and Science & Technology, Government of Karnataka, India. I thank Saurabh Gupta for helping in making Fig. 2, and Saurabh Gupta and Neeraja Krishnan for critically reading the manuscript. Ganit Labs is an initiative of Institute of Bioinformatics and Applied Biotechnology and Strand Life Sciences, both located in Bangalore, India.


  1. 1.
    Maxam AM, Gilbert W (1977) A new method for sequencing DNA. Proc Natl Acad Sci USA 74:560–564CrossRefGoogle Scholar
  2. 2.
    Sanger F, Nicklen S, Coulson AR (1977) DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA 74:5463–5467CrossRefGoogle Scholar
  3. 3.
    Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA et al (2001) The sequence of the human genome. Science 291:1304–1351CrossRefGoogle Scholar
  4. 4.
    Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W et al (2001) Initial sequencing and analysis of the human genome. Nature 409:860–921CrossRefGoogle Scholar
  5. 5.
    Metzker ML (2010) Sequencing technologies - the next generation. Nat Rev Genet 11:31–46CrossRefGoogle Scholar
  6. 6.
    Wheeler DA, Srinivasan M, Egholm M, Shen Y, Chen L, McGuire A, He W, Chen YJ, Makhijani V, Roth GT et al (2008) The complete genome of an individual by massively parallel DNA sequencing. Nature 452:872–876CrossRefGoogle Scholar
  7. 7.
    Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, Hall KP, Evers DJ, Barnes CL, Bignell HR et al (2008) Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456:53–59CrossRefGoogle Scholar
  8. 8.
    Homer N, Merriman B, Nelson SF (2009) BFAST: an alignment tool for large scale genome resequencing. PLoS ONE 4:e7767CrossRefGoogle Scholar
  9. 9.
    Ning Z, Cox AJ, Mullikin JC (2001) SSAHA: a fast search method for large DNA databases. Genome Res 11:1725–1729CrossRefGoogle Scholar
  10. 10.
  11. 11.
    Lunter G, Goodson M (2011) Stampy: a statistical algorithm for sensitive and fast mapping of Illumina sequence reads. Genome Res 21:936–939CrossRefGoogle Scholar
  12. 12.
    Novoalign (
  13. 13.
    Langmead B (2010) Aligning short sequencing reads with Bowtie. Curr Protoc Bioinform., Chap 11:Unit 11–17Google Scholar
  14. 14.
    Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359CrossRefGoogle Scholar
  15. 15.
    Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760CrossRefGoogle Scholar
  16. 16.
    Liu Y, Schmidt B, Maskell DL (2012) CUSHAW: a CUDA compatible short read aligner to large genomes based on the Burrows-Wheeler transform. Bioinformatics 28:1830–1837CrossRefGoogle Scholar
  17. 17.
    Klus P, Lam S, Lyberg D, Cheung MS, Pullan G, McFarlane I, Yeo G, Lam BY (2012) BarraCUDA—a fast short read sequence aligner using graphics processing units. BMC Res Notes 5:27CrossRefGoogle Scholar
  18. 18.
    Gupta S, Choudhury S, Panda B (2014) MUSIC: A hybrid-computing environment for Burrows-Wheeler alignment for massive amount of short read sequence data. MECBME 2014 (indexed in IEEE Xplore)Google Scholar
  19. 19.
    Schatz MC, Trapnell C, Delcher AL, Varshney A (2007) High-throughput sequence alignment using graphics processing units. BMC Bioinform 8:474CrossRefGoogle Scholar
  20. 20.
    Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079CrossRefGoogle Scholar
  21. 21.
    McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010) The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20:1297–1303CrossRefGoogle Scholar
  22. 22.
    DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M et al (2011) A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 43:491–498CrossRefGoogle Scholar
  23. 23.
    Pattnaik S, Vaidyanathan S, Pooja DG, Deepak S, Panda B (2012) Customisation of the exome data analysis pipeline using a combinatorial approach. PLoS ONE 7:e30080CrossRefGoogle Scholar
  24. 24.
    Cibulskis K, McKenna A, Fennell T, Banks E, DePristo M, Getz G (2011) ContEst: estimating cross-contamination of human samples in next-generation sequencing data. Bioinformatics 27:2601–2602Google Scholar
  25. 25.
    Forbes SA, Beare D, Gunasekaran P, Leung K, Bindal N, Boutselakis H, Ding M, Bamford S, Cole C, Ward S et al (2015) COSMIC: exploring the world’s knowledge of somatic mutations in human cancer. Nucleic Acids Res 43:D805–D811CrossRefGoogle Scholar
  26. 26.
    Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, Beare D, Jia M, Shepherd R, Leung K, Menzies A et al (2011) COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res 39:D945–D950CrossRefGoogle Scholar
  27. 27.
    Forbes SA, Tang G, Bindal N, Bamford S, Dawson E, Cole C, Kok CY, Jia M, Ewing R, Menzies A et al (2010) COSMIC (the Catalogue of Somatic Mutations in Cancer): a resource to investigate acquired mutations in human cancer. Nucleic Acids Res 38:D652–D657CrossRefGoogle Scholar
  28. 28.
    Wang K, Li M, Hakonarson H (2010) ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 38:e164CrossRefGoogle Scholar
  29. 29.
    Yourshaw M, Taylor SP, Rao AR, Martin MG, Nelson SF (2015) Rich annotation of DNA sequencing variants by leveraging the Ensembl Variant Effect Predictor with plugins. Brief Bioinform 16:255–264CrossRefGoogle Scholar
  30. 30.
    Douville C, Carter H, Kim R, Niknafs N, Diekhans M, Stenson PD, Cooper DN, Ryan M, Karchin R (2013) CRAVAT: cancer-related analysis of variants toolkit. Bioinformatics 29:647–648CrossRefGoogle Scholar
  31. 31.
    Gundem G, Perez-Llamas C, Jene-Sanz A, Kedzierska A, Islam A, Deu-Pons J, Furney SJ, Lopez-Bigas N (2010) IntOGen: integration and data mining of multidimensional oncogenomic data. Nat Methods 7:92–93CrossRefGoogle Scholar
  32. 32.
    Lawrence MS, Stojanov P, Polak P, Kryukov GV, Cibulskis K, Sivachenko A, Carter SL, Stewart C, Mermel CH, Roberts SA et al (2013) Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499:214–218CrossRefGoogle Scholar
  33. 33.
    Dees ND: MuSiC2. 2015Google Scholar
  34. 34.
    Sales G, Calura E, Martini P, Romualdi C (2013) Graphite Web: Web tool for gene set analysis exploiting pathway topology. Nucleic Acids Res 41:W89–W97CrossRefGoogle Scholar
  35. 35.
    Lopes CT, Franz M, Kazi F, Donaldson SL, Morris Q, Bader GD (2010) Cytoscape Web: an interactive web-based network browser. Bioinformatics 26:2347–2348CrossRefGoogle Scholar
  36. 36.
    Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, Christmas R, Avila-Campilo I, Creech M, Gross B et al (2007) Integration of biological networks and gene expression data using Cytoscape. Nat Protoc 2:2366–2382CrossRefGoogle Scholar
  37. 37.
    Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13:2498–2504CrossRefGoogle Scholar
  38. 38.
    Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA (2009) Circos: an information aesthetic for comparative genomics. Genome Res 19:1639–1645CrossRefGoogle Scholar
  39. 39.
    Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, Jacobsen A, Byrne CJ, Heuer ML, Larsson E et al (2012) The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov 2:401–404CrossRefGoogle Scholar
  40. 40.
    Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP (2011) Integrative genomics viewer. Nat Biotechnol 29:24–26CrossRefGoogle Scholar
  41. 41.
    Hu H, Wen Y, Chua TS, Li X (2014) Toward scalable systems for big data analytics: a technology tutorial. IEEE Access 2:652–687CrossRefGoogle Scholar
  42. 42.
    Hudson TJ, Anderson W, Artez A, Barker AD, Bell C, Bernabe RR, Bhan MK, Calvo F, Eerola I, Gerhard DS et al (2010) International network of cancer genome projects. Nature 464:993–998CrossRefGoogle Scholar
  43. 43.
    Lawrence MS, Stojanov P, Mermel CH, Robinson JT, Garraway LA, Golub TR, Meyerson M, Gabriel SB, Lander ES, Getz G (2014) Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 505:495–501CrossRefGoogle Scholar
  44. 44.
    Stephens PJ, McBride DJ, Lin ML, Varela I, Pleasance ED, Simpson JT, Stebbings LA, Leroy C, Edkins S, Mudie LJ et al (2009) Complex landscapes of somatic rearrangement in human breast cancer genomes. Nature 462:1005–1010CrossRefGoogle Scholar
  45. 45.
    van Haaften G, Dalgliesh GL, Davies H, Chen L, Bignell G, Greenman C, Edkins S, Hardy C, O’Meara S, Teague J et al (2009) Somatic mutations of the histone H3K27 demethylase gene UTX in human cancer. Nat Genet 41:521–523CrossRefGoogle Scholar
  46. 46.
    Pleasance ED, Cheetham RK, Stephens PJ, McBride DJ, Humphray SJ, Greenman CD, Varela I, Lin ML, Ordonez GR, Bignell GR et al (2010) A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 463:191–196CrossRefGoogle Scholar
  47. 47.
    Pleasance ED, Stephens PJ, O’Meara S, McBride DJ, Meynert A, Jones D, Lin ML, Beare D, Lau KW, Greenman C et al (2010) A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature 463:184–190CrossRefGoogle Scholar
  48. 48.
    Papaemmanuil E, Cazzola M, Boultwood J, Malcovati L, Vyas P, Bowen D, Pellagatti A, Wainscoat JS, Hellstrom-Lindberg E, Gambacorti-Passerini C et al (2011) Somatic SF3B1 mutation in myelodysplasia with ring sideroblasts. N Engl J Med 365:1384–1395CrossRefGoogle Scholar
  49. 49.
    Puente XS, Pinyol M, Quesada V, Conde L, Ordonez GR, Villamor N, Escaramis G, Jares P, Bea S, Gonzalez-Diaz M et al (2011) Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia. Nature 475:101–105CrossRefGoogle Scholar
  50. 50.
    Stephens PJ, Greenman CD, Fu B, Yang F, Bignell GR, Mudie LJ, Pleasance ED, Lau KW, Beare D, Stebbings LA et al (2011) Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 144:27–40CrossRefGoogle Scholar
  51. 51.
    Varela I, Tarpey P, Raine K, Huang D, Ong CK, Stephens P, Davies H, Jones D, Lin ML, Teague J et al (2011) Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma. Nature 469:539–542CrossRefGoogle Scholar
  52. 52.
    Greenman CD, Pleasance ED, Newman S, Yang F, Fu B, Nik-Zainal S, Jones D, Lau KW, Carter N, Edwards PA et al (2012) Estimation of rearrangement phylogeny for cancer genomes. Genome Res 22:346–361CrossRefGoogle Scholar
  53. 53.
    Nik-Zainal S, Alexandrov LB, Wedge DC, Van Loo P, Greenman CD, Raine K, Jones D, Hinton J, Marshall J, Stebbings LA et al (2012) Mutational processes molding the genomes of 21 breast cancers. Cell 149:979–993CrossRefGoogle Scholar
  54. 54.
    Stephens PJ, Tarpey PS, Davies H, Van Loo P, Greenman C, Wedge DC, Nik-Zainal S, Martin S, Varela I, Bignell GR et al (2012) The landscape of cancer genes and mutational processes in breast cancer. Nature 486:400–404Google Scholar
  55. 55.
    Wang L, Tsutsumi S, Kawaguchi T, Nagasaki K, Tatsuno K, Yamamoto S, Sang F, Sonoda K, Sugawara M, Saiura A et al (2012) Whole-exome sequencing of human pancreatic cancers and characterization of genomic instability caused by MLH1 haploinsufficiency and complete deficiency. Genome Res 22:208–219CrossRefGoogle Scholar
  56. 56.
    Cancer Genome Atlas N (2015) Comprehensive genomic characterization of head and neck squamous cell carcinomas. Nature 517:576–582CrossRefGoogle Scholar
  57. 57.
    India Project Team of the International Cancer Genome C (2013) Mutational landscape of gingivo-buccal oral squamous cell carcinoma reveals new recurrently-mutated genes and molecular subgroups. Nat Commun 4:2873Google Scholar
  58. 58.
    Barbieri CE, Baca SC, Lawrence MS, Demichelis F, Blattner M, Theurillat JP, White TA, Stojanov P, Van Allen E, Stransky N et al (2012) Exome sequencing identifies recurrent SPOP, FOXA1 and MED12 mutations in prostate cancer. Nat Genet 44:685–689CrossRefGoogle Scholar
  59. 59.
    Van Allen EM, Wagle N, Stojanov P, Perrin DL, Cibulskis K, Marlow S, Jane-Valbuena J, Friedrich DC, Kryukov G, Carter SL et al (2014) Whole-exome sequencing and clinical interpretation of formalin-fixed, paraffin-embedded tumor samples to guide precision cancer medicine. Nat Med 20:682–688CrossRefGoogle Scholar
  60. 60.
    Wang L, Lawrence MS, Wan Y, Stojanov P, Sougnez C, Stevenson K, Werner L, Sivachenko A, DeLuca DS, Zhang L et al (2011) SF3B1 and other novel cancer genes in chronic lymphocytic leukemia. N Engl J Med 365:2497–2506CrossRefGoogle Scholar
  61. 61.
    Craig DW, O’Shaughnessy JA, Kiefer JA, Aldrich J, Sinari S, Moses TM, Wong S, Dinh J, Christoforides A, Blum JL et al (2013) Genome and transcriptome sequencing in prospective metastatic triple-negative breast cancer uncovers therapeutic vulnerabilities. Mol Cancer Ther 12:104–116CrossRefGoogle Scholar
  62. 62.
    Beltran H, Rickman DS, Park K, Chae SS, Sboner A, MacDonald TY, Wang Y, Sheikh KL, Terry S, Tagawa ST et al (2011) Molecular characterization of neuroendocrine prostate cancer and identification of new drug targets. Cancer Discov 1:487–495CrossRefGoogle Scholar
  63. 63.
    Drier Y, Lawrence MS, Carter SL, Stewart C, Gabriel SB, Lander ES, Meyerson M, Beroukhim R, Getz G (2013) Somatic rearrangements across cancer reveal classes of samples with distinct patterns of DNA breakage and rearrangement-induced hypermutability. Genome Res 23:228–235CrossRefGoogle Scholar
  64. 64.
    Eswaran J, Horvath A, Godbole S, Reddy SD, Mudvari P, Ohshiro K, Cyanam D, Nair S, Fuqua SA, Polyak K et al (2013) RNA sequencing of cancer reveals novel splicing alterations. Sci Rep 3:1689CrossRefGoogle Scholar
  65. 65.
    Kandoth C, McLellan MD, Vandin F, Ye K, Niu B, Lu C, Xie M, Zhang Q, McMichael JF, Wyczalkowski MA et al (2013) Mutational landscape and significance across 12 major cancer types. Nature 502:333–339CrossRefGoogle Scholar
  66. 66.
    Wu X, Cao W, Wang X, Zhang J, Lv Z, Qin X, Wu Y, Chen W (2013) TGM3, a candidate tumor suppressor gene, contributes to human head and neck cancer. Mol Cancer 12:151CrossRefGoogle Scholar
  67. 67.
    Merid SK, Goranskaya D, Alexeyenko A (2014) Distinguishing between driver and passenger mutations in individual cancer genomes by network enrichment analysis. BMC Bioinform 15:308CrossRefGoogle Scholar
  68. 68.
    Layer RM, Chiang C, Quinlan AR, Hall IM (2014) LUMPY: a probabilistic framework for structural variant discovery. Genome Biol 15:R84CrossRefGoogle Scholar
  69. 69.
    Dietlein F, Eschner W (2014) Inferring primary tumor sites from mutation spectra: a meta-analysis of histology-specific aberrations in cancer-derived cell lines. Hum Mol Genet 23:1527–1537CrossRefGoogle Scholar
  70. 70.
    Cole C, Krampis K, Karagiannis K, Almeida JS, Faison WJ, Motwani M, Wan Q, Golikov A, Pan Y, Simonyan V, Mazumder R (2014) Non-synonymous variations in cancer and their effects on the human proteome: workflow for NGS data biocuration and proteome-wide analysis of TCGA data. BMC Bioinform 15:28CrossRefGoogle Scholar
  71. 71.
    Wittler R (2013) Unraveling overlapping deletions by agglomerative clustering. BMC Genom 14(Suppl 1):S12CrossRefGoogle Scholar
  72. 72.
    Trifonov V, Pasqualucci L, Dalla Favera R, Rabadan R (2013) MutComFocal: an integrative approach to identifying recurrent and focal genomic alterations in tumor samples. BMC Syst Biol 7:25CrossRefGoogle Scholar
  73. 73.
    Oesper L, Mahmoody A, Raphael BJ (2013) THetA: inferring intra-tumor heterogeneity from high-throughput DNA sequencing data. Genome Biol 14:R80CrossRefGoogle Scholar
  74. 74.
    Hansen NF, Gartner JJ, Mei L, Samuels Y, Mullikin JC (2013) Shimmer: detection of genetic alterations in tumors using next-generation sequence data. Bioinformatics 29:1498–1503CrossRefGoogle Scholar
  75. 75.
    Hamilton MP, Rajapakshe K, Hartig SM, Reva B, McLellan MD, Kandoth C, Ding L, Zack TI, Gunaratne PH, Wheeler DA et al (2013) Identification of a pan-cancer oncogenic microRNA superfamily anchored by a central core seed motif. Nat Commun 4:2730CrossRefGoogle Scholar
  76. 76.
    Chen Y, Yao H, Thompson EJ, Tannir NM, Weinstein JN, Su X (2013) VirusSeq: software to identify viruses and their integration sites using next-generation sequencing of human cancer tissue. Bioinformatics 29:266–267CrossRefGoogle Scholar
  77. 77.
    Mosen-Ansorena D, Telleria N, Veganzones S, De la Orden V, Maestro ML, Aransay AM (2014) seqCNA: an R package for DNA copy number analysis in cancer using high-throughput sequencing. BMC Genom 15:178CrossRefGoogle Scholar
  78. 78.
    Li Y, Xie X (2014) Deconvolving tumor purity and ploidy by integrating copy number alterations and loss of heterozygosity. Bioinformatics 30:2121–2129CrossRefGoogle Scholar
  79. 79.
    Kendall J, Krasnitz A (2014) Computational methods for DNA copy-number analysis of tumors. Methods Mol Biol 1176:243–259CrossRefGoogle Scholar
  80. 80.
    Krishnan NM, Gaur P, Chaudhary R, Rao AA, Panda B (2012) COPS: a sensitive and accurate tool for detecting somatic Copy Number Alterations using short-read sequence data from paired samples. PLoS ONE 7:e47812CrossRefGoogle Scholar
  81. 81.
    Van Allen EM, Wagle N, Levy MA (2013) Clinical analysis and interpretation of cancer genome data. J Clin Oncol 31:1825–1833CrossRefGoogle Scholar
  82. 82.
    Lahti L, Schafer M, Klein HU, Bicciato S, Dugas M (2013) Cancer gene prioritization by integrative analysis of mRNA expression and DNA copy number data: a comparative review. Brief Bioinform 14:27–35CrossRefGoogle Scholar
  83. 83.
    Lee LA, Arvai KJ, Jones D (2015) Annotation of sequence variants in cancer samples: processes and pitfalls for routine assays in the clinical laboratory. J Mol DiagnGoogle Scholar
  84. 84.
    Weinstein JN, Collisson EA, Mills GB, Shaw KR, Ozenberger BA, Ellrott K, Shmulevich I, Sander C, Stuart JM (2013) The cancer genome Atlas Pan-cancer analysis project. Nat Genet 45:1113–1120CrossRefGoogle Scholar
  85. 85.
    Zack TI, Schumacher SE, Carter SL, Cherniack AD, Saksena G, Tabak B, Lawrence MS, Zhang CZ, Wala J, Mermel CH et al (2013) Pan-cancer patterns of somatic copy number alteration. Nat Genet 45:1134–1140CrossRefGoogle Scholar
  86. 86.
    Gross AM, Orosco RK, Shen JP, Egloff AM, Carter H, Hofree M, Choueiri M, Coffey CS, Lippman SM, Hayes DN et al (2014) Multi-tiered genomic analysis of head and neck cancer ties TP53 mutation to 3p loss. Nat Genet 46:939–943CrossRefGoogle Scholar
  87. 87.
    Pan-cancer initiative finds patterns of drivers (2013) Cancer Discov 3:1320Google Scholar
  88. 88.
    Taking pan-cancer analysis global (2013) Nat Genet 45:1263CrossRefGoogle Scholar
  89. 89.
    Russnes HG, Navin N, Hicks J, Borresen-Dale AL (2011) Insight into the heterogeneity of breast cancer through next-generation sequencing. J Clin Invest 121:3810–3818CrossRefGoogle Scholar
  90. 90.
    Gerlinger M, Rowan AJ, Horswell S, Larkin J, Endesfelder D, Gronroos E, Martinez P, Matthews N, Stewart A, Tarpey P et al (2012) Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N Engl J Med 366:883–892CrossRefGoogle Scholar
  91. 91.
    Swanton C (2012) Intratumor heterogeneity: evolution through space and time. Cancer Res 72:4875–4882CrossRefGoogle Scholar
  92. 92.
    Oesper L, Satas G, Raphael BJ (2014) Quantifying tumor heterogeneity in whole-genome and whole-exome sequencing data. Bioinformatics 30:3532–3540CrossRefGoogle Scholar
  93. 93.
    Hajirasouliha I, Mahmoody A, Raphael BJ (2014) A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data. Bioinformatics 30:i78–i86CrossRefGoogle Scholar
  94. 94.
    Jun G, Flickinger M, Hetrick KN, Romm JM, Doheny KF, Abecasis GR, Boehnke M, Kang HM (2012) Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am J Hum Genet 91:839–848CrossRefGoogle Scholar
  95. 95.
    Navin N, Hicks J (2011) Future medical applications of single-cell sequencing in cancer. Genome Med 3:31CrossRefGoogle Scholar
  96. 96.
    Ji C, Miao Z, He X (2015) A simple strategy for reducing false negatives in calling variants from single-cell sequencing data. PLoS ONE 10:e0123789CrossRefGoogle Scholar
  97. 97.
    Yu C, Yu J, Yao X, Wu WK, Lu Y, Tang S, Li X, Bao L, Li X, Hou Y et al (2014) Discovery of biclonal origin and a novel oncogene SLC12A5 in colon cancer by single-cell sequencing. Cell Res 24:701–712CrossRefGoogle Scholar
  98. 98.
    Ting DT, Wittner BS, Ligorio M, Vincent Jordan N, Shah AM, Miyamoto DT, Aceto N, Bersani F, Brannigan BW, Xega K et al (2014) Single-cell RNA sequencing identifies extracellular matrix gene expression by pancreatic circulating tumor cells. Cell Rep 8:1905–1918Google Scholar
  99. 99.
    Kim KI, Simon R (2014) Using single cell sequencing data to model the evolutionary history of a tumor. BMC Bioinform 15:27CrossRefGoogle Scholar
  100. 100.
    Xu Y, Hu H, Zheng J, Li B (2013) Feasibility of whole RNA sequencing from single-cell mRNA amplification. Genet Res Int 2013:724124Google Scholar
  101. 101.
    Voet T, Kumar P, Van Loo P, Cooke SL, Marshall J, Lin ML, Zamani Esteki M, Van der Aa N, Mateiu L, McBride DJ et al (2013) Single-cell paired-end genome sequencing reveals structural variation per cell cycle. Nucleic Acids Res 41:6119–6138CrossRefGoogle Scholar
  102. 102.
    Korfhage C, Fisch E, Fricke E, Baedker S, Loeffert D (2013) Whole-genome amplification of single-cell genomes for next-generation sequencing. Curr Protoc Mol Biol 104:Unit 7–14Google Scholar
  103. 103.
    Geurts-Giele WR, Dirkx-van der Velden AW, Bartalits NM, Verhoog LC, Hanselaar WE, Dinjens WN (2013) Molecular diagnostics of a single multifocal non-small cell lung cancer case using targeted next generation sequencing. Virchows Arch 462:249–254Google Scholar
  104. 104.
    Xu X, Hou Y, Yin X, Bao L, Tang A, Song L, Li F, Tsang S, Wu K, Wu H et al (2012) Single-cell exome sequencing reveals single-nucleotide mutation characteristics of a kidney tumor. Cell 148:886–895CrossRefGoogle Scholar
  105. 105.
    Li Y, Xu X, Song L, Hou Y, Li Z, Tsang S, Li F, Im KM, Wu K, Wu H et al (2012) Single-cell sequencing analysis characterizes common and cell-lineage-specific mutations in a muscle-invasive bladder cancer. Gigascience 1:12CrossRefGoogle Scholar
  106. 106.
    Hou Y, Song L, Zhu P, Zhang B, Tao Y, Xu X, Li F, Wu K, Liang J, Shao D et al (2012) Single-cell exome sequencing and monoclonal evolution of a JAK2-negative myeloproliferative neoplasm. Cell 148:873–885CrossRefGoogle Scholar
  107. 107.
    Novak R, Zeng Y, Shuga J, Venugopalan G, Fletcher DA, Smith MT, Mathies RA (2011) Single-cell multiplex gene detection and sequencing with microfluidically generated agarose emulsions. Angew Chem Int Ed Engl 50:390–395CrossRefGoogle Scholar
  108. 108.
    Navin N, Kendall J, Troge J, Andrews P, Rodgers L, McIndoo J, Cook K, Stepansky A, Levy D, Esposito D et al (2011) Tumour evolution inferred by single-cell sequencing. Nature 472:90–94CrossRefGoogle Scholar
  109. 109.
    Lasken RS (2013) Single-cell sequencing in its prime. Nat Biotechnol 31:211–212CrossRefGoogle Scholar
  110. 110.
    Nawy T (2014) Single-cell sequencing. Nat Methods 11:18CrossRefGoogle Scholar
  111. 111.
    Panda B (2012) Whither genomic diagnostics tests in India? Indian J Med Paediatr Oncol 33:250–252CrossRefGoogle Scholar
  112. 112.
    Xue W, Chen S, Yin H, Tammela T, Papagiannakopoulos T, Joshi NS, Cai W, Yang G, Bronson R, Crowley DG et al (2014) CRISPR-mediated direct mutation of cancer genes in the mouse liver. Nature 514:380–384CrossRefGoogle Scholar
  113. 113.
    Sanchez-Rivera FJ, Papagiannakopoulos T, Romero R, Tammela T, Bauer MR, Bhutkar A, Joshi NS, Subbaraj L, Bronson RT, Xue W, Jacks T (2014) Rapid modelling of cooperating genetic events in cancer through somatic genome editing. Nature 516:428–431CrossRefGoogle Scholar
  114. 114.
    Matano M, Date S, Shimokawa M, Takano A, Fujii M, Ohta Y, Watanabe T, Kanai T, Sato T (2015) Modeling colorectal cancer using CRISPR-Cas9-mediated engineering of human intestinal organoids. Nat Med 21:256–262Google Scholar
  115. 115.
    Chen S, Sanjana NE, Zheng K, Shalem O, Lee K, Shi X, Scott DA, Song J, Pan JQ, Weissleder R et al (2015) Genome-wide CRISPR screen in a mouse model of tumor growth and metastasis. Cell 160:1246–1260CrossRefGoogle Scholar

Copyright information

© Springer India 2016

Authors and Affiliations

  1. 1.Ganit Labs, Bio-IT CentreInstitute of Bioinformatics and Applied BiotechnologyBangaloreIndia

Personalised recommendations