Integration of Omics Data for Cancer Research

An Omics Perspective on Cancer Research

Abstract

The development of high-throughput techniques for analyzing cell components has provided vast amounts of data in recent years. Advances in gene-sequencing methods were followed by advances in techniques for analyzing and managing data from transcriptomes, proteomes, and other omics sources. This so-called omics revolution has led to the development of numerous databases describing specific cell components. A recent study suggests that cell behavior cannot be modeled by analyzing its constituents separately, but instead requires an integrative approach (Barabási and Oltvai 2004). Thus, specialized techniques are being developed to integrate omics information. To open research avenues that apply this information to new therapies, for example in cancer research, methods must be designed that integrate these new databases seamlessly with classical clinical data.

The problem of database integration has been studied at length over the last 15 years, with special emphasis in the post-genomic era on publicly available online data. In the field of genomic medicine, the integration of phenotype and genotype data is of special interest for the prevention of patient intolerance to specific drugs and for defining personalized therapies. Patients’ individual characteristics will play a fundamental role in future treatment design. These characteristics include, of course, genetic profiles.

To address the issues surrounding omics data integration, this chapter is organized as follows. Section 14.1 describes the role of data integration in cancer research. Section 14.2 analyzes omics data integration problems and techniques. Section 14.3 introduces a range of international efforts in database integration. Finally, Section 14.4 presents future trends in omics data integration for cancer research.

References

  • Abbott R (2004) Emergence, entities, entropy, and binding forces. The Agent 2004 Conference on Social Dynamics: Interaction, Reflexivity and Emergence, Argonne National Labs and University of Chicago. http://abbott.calstatela.edu/PapersAndTalks/abbott_agent_2004.pdf, Accessed 7 November 2008

  • Albert R (2005) Scale-free networks in cell biology. J Cell Sci 118:4947–4957

  • Alonso-Calvo R, Maojo V, Billhardt H et al (2007) An agent- and ontology-based system for integrating public gene, protein, and disease databases. J Biomed Inform 40:17–29

  • Astakhov V, Gupta A, Santini S et al (2005) Data integration in the Biomedical Informatics Research Network (BIRN). In: Data integration in the life sciences, 1st edn. Springer, Berlin

  • Baker PG, Brass A, Bechhofer S et al (1998) TAMBIS: Transparent access to multiple bioinformatics information sources. An overview. In: Proceedings of the Sixth International Conference of Intelligent Systems for Molecular Biology (ISMB98), Montreal.

  • Barabási AL, Oltvai ZN (2004) Network biology: understanding the cell’s functional organization. Nat Rev Genet 5:101–113

  • Bhowmick SS, Singh DT, Laud A (2003) Data management in metaboloinformatics: issues and challenges. LNCS 2736:392–402

  • Billings PR, Carlson RJ, Carlson J et al (2005) Ready for genomic medicine? Perspectives of health care decision makers. Arch Intern Med 165:1917–1919

  • Biomedical Informatics Research Network. http://www.nbirn.net/index.shtm. Accessed 7 November 2008

  • Branson A, Hauer T, McClatchey R et al (2008) A data model for integrating heterogeneous medical data in the health-e-child project. Stud Health Technol Inform 138:13–23

  • Burgun A, Bodenreider O (2008) Accessing and integrating data and knowledge for biomedical research. Yearbook of medical informatics, pp 91–101

  • Cali A, De Giacomo G, Lenzerini M (2001) Models for information integration: turning local-as-view into global-as-view. In: Proceedings of International Workshop on Foundations of Models for Information Integration (10th Workshop in the series foundations of models and languages for data and objects), Viterbo.

  • The CellML web page. http://www.cellml.org/index_html. Accessed 7 November 2008

  • Cheng Y, Church GM (2000) Biclustering of expression data. Proc Int Conf Intell Syst Mol Biol 8:93–103

  • Coen M, Ruepp SU, Lindon JC et al (2004) Integrated application of transcriptomics and metabonomics yields new insight into the toxicity due to paracetamol in the mouse. J Pharm Biomed Anal 35:93–105

  • Collins FS, McKusick VA (2001) Implications of the Human Genome Project for medical science. JAMA 285:540–544

  • Corthésy-Theulaz I, den Dunnen JT, Ferré P et al (2005) Nutrigenomics: the impact of biomics technology on nutrition research. Ann Nutr Metab 49:355–365

  • Cusick ME, Klitgord N, Vidal M et al (2005) Interactome: gateway into systems biology. Hum Mol Genet 14:R171–181

  • Davidson EH, McClay DR, Hood L (2003) Regulatory gene networks and the properties of the developmental process. Proc Natl Acad Sci USA 100:1475–1480

  • de Groen PC, Dettinger R, Johnson P (2003) Mayo Clinic/IBM computational biology collaboration: a simple user interface for complex queries. In: Universal access in HCI – volume 4 of the proceedings of human–computer interaction (HCI) international, pp 1083–1087

  • Enard W, Khaitovich P, Klose J et al (2002) Intra- and interspecific variation in primate gene expression patterns. Science 296:340–343

  • Fiehn O (2002) Metabolomics – the link between genotypes and phenotypes. Plant Mol Biol 48:155–171

  • Freund J, Comaniciu D, Ioannis Y et al (2006) Health-e-Child Consortium. Health-e-child: an integrated biomedical platform for grid-based paediatric applications. Stud Health Technol Inform 120:259–270

  • Galperin MY (2008) The molecular biology database collection: 2008 update. Nucleic Acids Res 36:D2–4

  • geneConnect. https://cabig.nci.nih.gov/tools/GeneConnect. Accessed 7 November 2008.

  • The GRAM Algorithm. http://www.psrg.lcs.mit.edu/Networks/alg/GRAM.pdf. Accessed 7 November 2008

  • Gruber TR (1993) A translation approach to portable ontologies. Knowl Acquis 5:199–220

  • Heijne WH, Stierum RH, Slijper M et al (2003) Toxicogenomics of bromobenzene hepatotoxicity: a combined transcriptomics and proteomics approach. Biochem Pharmacol 65:857–875

  • Hirai MY, Yano M, Goodenowe DB et al (2004) Integration of transcriptomics and metabolomics for understanding of global responses to nutritional stresses in Arabidopsis thaliana. Proc Natl Acad Sci USA 101:10205–10210

  • Hirai MY, Klein M, Fujikawa Y et al (2005) Elucidation of gene-to-gene and metabolite-to-gene networks in arabidopsis by integration of metabolomics and transcriptomics. J Biol Chem 280:25590–25595

  • Clinical Genomics special interest group. http://www.haifa.ibm.com/projects/software/cgl7/specifications.html. Accessed 7 November 2008

  • Ihmels J, Bergmann S, Gerami-Nejad M et al (2005) Rewiring of the yeast transcriptional network through the evolution of motif usage. Science 309:938–940

  • Iozzo RV (2001) Heparan sulfate proteoglycans: intricate molecules with intriguing functions. J Clin Invest 108:165–167

  • Ippolito JE, Xu J, Jain S et al (2005) An integrated functional genomics and metabolomics approach for defining poor prognosis in human neuroendocrine cancers. Proc Natl Acad Sci USA 102:9901–9906

  • Jarke M, Jeusfeld M A, Quix C et al (1998) Architecture and quality in data warehouses. In: Pernici B, Thanos C (eds) Proceedings of the 10th international conference on advanced information systems engineering (08–12 June 1998). Lecture notes in computer science, volume 1413. Springer, London, pp 93–113

  • Joyce AR, Palsson B (2006) The model organism as a system: integrating ‘omics’ data sets. Nat Rev Mol Cell Biol 7:198–210

  • Khaitovich P, Muetzel B, She X et al (2004) Regional patterns of gene expression in human and chimpanzee brains. Genome Res 14:1462–1473

  • Khaitovich P, Hellmann I, Enard W et al (2005) Parallel patterns of evolution in the genomes and transcriptomes of humans and chimpanzees. Science 309:1850–1854

  • Kimball R (1996) The data warehouse toolkit: practical techniques for building dimensional data warehouses. Wiley, New York

  • Kraj P, McIndoe RA (2005) caBIONet: a .NET wrapper to access and process genomic data stored at the National Cancer Institute’s Center for Bioinformatics databases. Bioinformatics 21:3456–3458

  • Kristensen C, Morant M, Olsen CE et al (2005) Metabolic engineering of dhurrin in transgenic Arabidopsis plants with marginal inadvertent effects on the metabolome and transcriptome. Proc Natl Acad Sci USA 102:1779–1784

  • Langella SA, Oster S, Hastings S et al (2007) The Cancer Biomedical Informatics Grid (caBIG) Security infrastructure. In: AMIA annual symposium proceedings, pp 433–437

  • Lenzerini M (2002) Data integration: a theoretical perspective. In: Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems. PODS ’02 ACM, New York, pp 233–246

  • Levy AY, Rajaraman A, Ordille JJ (1996) Querying heterogeneous information sources using source descriptions. In: Proceedings of the twenty-second international conference on very large databases. Mumbai, India, pp 251–262

  • Levy S, Sutton G, Ng PC et al (2007) The diploid genome sequence of an individual human. PLoS Biol 5:e254

  • Lloyd CM, Halstead MD, Nielsen PF (2004) CellML: its future, present and past. Prog Biophys Mol Biol 85:433–450

  • Luscombe NM, Babu MM, Yu H et al (2004) Genomic analysis of regulatory network dynamics reveals large topological changes. Nature 431:308–312

  • Madeira SC, Oliveira AL (2004) Biclustering algorithms for biological data analysis: a survey. IEEE/ACM Trans Comput Biol Bioinform 1:24–45

  • Maojo V, Tsiknakis M (2007) Biomedical informatics and healthGRIDs: a European perspective. IEEE Eng Med Biol Mag 26:34–41

  • Martín L, Bonsma E, Anguita A et al (2007) Data access and management in ACGT: tools to solve syntactic and semantic heterogeneities between clinical and image databases. In: First International Workshop on Conceptual Modelling for Life Sciences Applications (CMLSA 2007), Auckland, New Zealand, 5–9 Nov 2007. LNCS 4802, pp 24–33

  • Mason CE, Seringhaus MR, Sattler de Sousa e Brito C (2007) Personalized Genomic Medicine with a Patchwork, Partially Owned Genome. Yale J Biol Med 80:145–151

  • What is systems biology? The Munich systems biology forum. http://www.msbf.mpg.de/ho_sys_ch.html. Accessed 7 November 2008

  • Editorial (2004) Making data dreams come true. Nature 428:239

  • Nikolsky Y, Nikolskaya T, Bugrim A (2005) Biological networks and analysis of experimental data in drug discovery. Drug Discov Today 10:653–662

  • PANTHER classification system. http://www.pantherdb.org/pathway/. Accessed 7 November 2008

  • PathArt database. http://bioinformatics.unc.edu/software/pathart/index.htm. Accessed 7 November 2008

  • Pérez-Rey D, Maojo V, García-Remesal M et al (2005) ONTOFUSION: ontology-based integration of genomic and clinical databases. Comput Biol Med 36:712–730

  • Pérez-Rey D, Anguita A, Crespo J (2006) OntoDataClean: ontology-based integration and preprocessing of distributed data. Lecture Notes Comput Sci 4345:262–272

  • Petrik V, Loosemore A, Howe FA et al (2006) OMICS and brain tumor biomarkers. Br J Neurosurg 20:275–280

  • Personal genome project. http://www.personalgenomes.org/. Accessed 7 November 2008

  • PID. http://pid.nci.nih.gov/. Accessed 7 November 2008

  • Rebbeck TR (2006) Inherited genetic markers and cancer outcomes: personalized medicine in the postgenome era. J Clin Oncol 24:1972–1974

  • Rubinstein WS, Roy HK (2005) Practicing medicine at the front lines of the genomic revolution. Arch Intern Med 165:1815–1817

  • The systems biology markup language. http://sbml.org/Main_Page. Accessed 7 November 2008

  • SBO: systems biology ontology. http://www.ebi.ac.uk/sbo/. Accessed 7 November 2008

  • Shironoshita EP, Jean-Mary YR, Bradley RM et al (2008) semCDI: a query formulation for semantic data integration in caBIG. J Am Med Inform Assoc 15:559–568

  • Shriver Z, Raguram S, Sasisekharan R (2004) Glycomics: a pathway to a class of new and improved therapeutics. Nature Rev Drug Discov 3:863–873

  • Sohal D, Yeatts A, Ye K et al (2008) Meta-analysis of microarray studies reveals a novel hematopoietic progenitor cell signature and demonstrates feasibility of inter-platform data integration. PLoS ONE 3:e2965

  • Stierum R, Heijne W, Kienhuis A et al (2005) Toxicogenomics concepts and applications to study hepatic effects of food additives and chemicals. Toxicol Appl Pharmacol 207:179–188

  • Sujansky W (2001) Heterogeneous database integration in biomedicine. J Biomed Inform 34:285–298

  • Tanay A, Sharan R, Shamir R (2002) Discovering statistically significant biclusters in gene expression data. Bioinformatics 18:S136–144

  • Tanay A, Regev A, Shamir R (2005) Conservation and evolvability in regulatory networks: the evolution of ribosomal regulation in yeast. Proc Natl Acad Sci USA 102:7203–7208

  • Tsiknakis M, Kafetzopoulos D, Potamias G et al (2006) Building a European biomedical grid on cancer: the ACGT Integrated Project. Stud Health Technol Inform 120:247–258

  • Wenk MR (2005) The emerging field of lipidomics. Nat Rev Drug Discov 4:594–610

  • Wiechert W, Schweissgut O, Takanaga H et al (2007) Fluxomics: mass spectrometry versus quantitative imaging. Curr Opin Plant Biol 10:323–330

Author information

Corresponding author

Correspondence to Luis Martín.

Copyright information

© 2010 Springer Science+Business Media B.V.

About this chapter

Cite this chapter

Martín, L., Anguita, A., Maojo, V., Crespo, J. (2010). Integration of Omics Data for Cancer Research. In: Cho, W. (eds) An Omics Perspective on Cancer Research. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-2675-0_14
