Discovering Altered Regulation and Signaling Through Network-based Integration of Transcriptomic, Epigenomic, and Proteomic Tumor Data

  • Amanda J. Kedaigle
  • Ernest Fraenkel
Part of the Methods in Molecular Biology book series (MIMB, volume 1711)


With the extraordinary rise in available biological data, biologists and clinicians need unbiased tools for data integration in order to reach accurate, succinct conclusions. Network biology provides one such method for high-throughput data integration, but it comes with its own algorithmic challenges and requires specialized expertise. We provide a step-by-step guide for using Omics Integrator, a software package designed for the integration of transcriptomic, epigenomic, and proteomic data. Omics Integrator can be found at

Key words

Data integration; Network biology; Computational biology; High-throughput data



This work was supported by grants from the National Institutes of Health (R01-NS089076, T32-GM008334, and U01-CA184898). We thank Tobias Ehrenberger and Renan Escalante-Chong for helpful comments on the manuscript.



Copyright information

© Springer Science+Business Media LLC 2018

Authors and Affiliations

  1. Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, USA
  2. Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, USA
