Statistical Tools and R Software for Cancer Driver Probabilities

  • Giovanni Parmigiani
  • Simina Boca
  • Jie Ding
  • Lorenzo Trippa
Part of the Methods in Molecular Biology book series (MIMB, volume 1101)


This chapter describes and illustrates CancerMutationAnalysis and CancerMutationMCMC, two open-source R packages designed for the analysis of somatic mutations in cancer genome studies at both the gene and gene-set levels.

Key words

Markov Chain Monte Carlo; Gene mutation; Cancer driver gene



Copyright information

© Springer Science+Business Media, LLC 2014

Authors and Affiliations

  • Giovanni Parmigiani (1)
  • Simina Boca (2)
  • Jie Ding (1)
  • Lorenzo Trippa (1)
  1. Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, USA
  2. Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, USA
