Merging Multiple Omics Datasets In Silico: Statistical Analyses and Data Interpretation

Systems Metabolic Engineering

Part of the book series: Methods in Molecular Biology (MIMB, volume 985)

Abstract

Combining high-throughput analytical technologies in transcriptomics, proteomics, and metabolomics now makes it possible to obtain comprehensive, quantitative snapshots of intracellular processes. Systematic observation of such multi-omics data can elucidate dynamic intracellular activities and their regulation. Integrating these data, however, requires careful statistical analysis, because each omics layer and each analytical methodology carries a different level of noise and variation. Moreover, interpreting such a multitude of data requires an intuitive pathway context. Here we describe statistical methods for the integration and comparison of multi-omics data, as well as computational methods for pathway reconstruction, ID conversion, mapping, and visualization, which play key roles in the efficient study of multi-omics information.

Acknowledgements

This work was supported by funds from the Yamagata Prefectural Government and Tsuruoka City.

Author information

Corresponding author

Correspondence to Kazuharu Arakawa.

Copyright information

© 2013 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Arakawa, K., Tomita, M. (2013). Merging Multiple Omics Datasets In Silico: Statistical Analyses and Data Interpretation. In: Alper, H. (ed) Systems Metabolic Engineering. Methods in Molecular Biology, vol 985. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-299-5_23

  • DOI: https://doi.org/10.1007/978-1-62703-299-5_23

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-298-8

  • Online ISBN: 978-1-62703-299-5

  • eBook Packages: Springer Protocols
