System-Scale Network Modeling of Cancer Using EPoC

  • Tobias Abenius
  • Rebecka Jörnsten
  • Teresia Kling
  • Linnéa Schmidt
  • José Sánchez
  • Sven Nelander
Conference paper
Part of the Advances in Experimental Medicine and Biology book series (AEMB, volume 736)

Abstract

One of the central problems of cancer systems biology is to understand the complex molecular changes of cancerous cells and tissues, and to use this understanding to support the development of new targeted therapies. EPoC (Endogenous Perturbation analysis of Cancer) is a network modeling technique for tumor molecular profiles. EPoC models are constructed from combined copy number aberration (CNA) and mRNA data and aim to (1) identify genes whose copy number aberrations significantly affect target mRNA expression and (2) generate markers for long- and short-term survival of cancer patients. Models are constructed by a combination of regression and bootstrapping methods. Prognostic scores are obtained from a singular value decomposition of the networks. We have previously analyzed the performance of EPoC using glioblastoma data from The Cancer Genome Atlas (TCGA) consortium, and have shown that the resulting network models contain both known and candidate disease-relevant genes as network hubs and uncover predictors of patient survival. Here, we give a practical guide to EPoC modeling in R and present a set of alternative modeling frameworks.
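
The three ingredients named in the abstract (regression of mRNA on CNA, bootstrapping, and a singular value decomposition of the network) can be illustrated with a minimal sketch. It is written in Python with NumPy rather than R, runs on synthetic data, and is not the EPoC implementation. The function names (`lasso_cd`, `fit_network`, `edge_frequency`), the use of a lasso penalty `lam`, the 0.8 stability cutoff, and the choice to score patients by projecting CNA profiles onto the leading singular vector are all assumptions made for illustration.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=50):
    """Lasso by cyclic coordinate descent.

    Minimises (1/2n)*||y - X b||^2 + lam*||b||_1 for centred X and y.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float).copy()            # running residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * b[j]           # add back predictor j's contribution
            rho = X[:, j] @ r / n
            # soft-thresholding update for coordinate j
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]
    return b


def fit_network(cna, mrna, lam):
    """Entry (i, g) is the lasso coefficient of gene i's CNA on gene g's mRNA."""
    X = cna - cna.mean(axis=0)
    Y = mrna - mrna.mean(axis=0)
    return np.column_stack([lasso_cd(X, Y[:, g], lam) for g in range(Y.shape[1])])


def edge_frequency(cna, mrna, lam, n_boot=30, seed=0):
    """Fraction of bootstrap resamples in which each edge gets a nonzero coefficient."""
    rng = np.random.default_rng(seed)
    n = cna.shape[0]
    freq = np.zeros((cna.shape[1], mrna.shape[1]))
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        freq += fit_network(cna[idx], mrna[idx], lam) != 0
    return freq / n_boot


# Synthetic data (hypothetical): 60 patients, 5 genes, two true CNA -> mRNA effects.
rng = np.random.default_rng(1)
n, p = 60, 5
G_true = np.zeros((p, p))
G_true[0, 1] = 2.0
G_true[2, 3] = -1.5
cna = rng.normal(size=(n, p))
mrna = cna @ G_true + 0.5 * rng.normal(size=(n, p))

# Keep only edges selected in at least 80% of the bootstrap fits.
freq = edge_frequency(cna, mrna, lam=0.3)
stable = freq >= 0.8
G = np.where(stable, fit_network(cna, mrna, lam=0.3), 0.0)

# One illustrative per-patient score: projection onto the leading left singular vector.
U, s, Vt = np.linalg.svd(G)
scores = (cna - cna.mean(axis=0)) @ U[:, 0]
```

The sign of a singular vector is arbitrary, so scores from this sketch are only meaningful up to a global sign flip. In practice the penalty and the stability cutoff would need to be tuned to the data.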

Keywords

Leukemia, Stratification, Lasso

Notes

Acknowledgments

The authors thank the editors and reviewer for their constructive comments. This project receives funding from Cancerfonden, Barncancerfonden (NB-CNS consortium), Vetenskapsrådet (SN, RJ), and BioCare (SN).

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Tobias Abenius (1)
  • Rebecka Jörnsten (1)
  • Teresia Kling (2)
  • Linnéa Schmidt (2)
  • José Sánchez (1)
  • Sven Nelander (2)

  1. Mathematical Sciences, University of Gothenburg and Chalmers University of Technology, Gothenburg, Sweden
  2. Cancer Center Sahlgrenska, Institute of Medicine, Gothenburg, Sweden
