Abstract
With the advent of microarrays and next-generation biotechnologies, the use of gene expression data has become ubiquitous in biological research. One potential drawback of these data is that they are very rich in features (genes), whereas cost considerations allow for only relatively small sample sizes. A useful way to reach biologically meaningful interpretations of an environmental or toxicological condition of interest is to make inferences at the level of a priori defined biochemical pathways, or networks of interacting genes or proteins that are known to perform certain biological functions. This chapter describes approaches taken in the literature to make such inferences at the biochemical pathway level. In addition, it describes approaches for generating hypotheses about genes that play important roles in the response to a treatment, using organism-level gene coexpression or protein–protein interaction networks. Approaches to reverse engineer gene networks, or methods that seek to identify novel interactions between genes, are also described. Given the relatively small sample numbers typically available, these reverse engineering approaches are generally useful for inferring interactions among only a relatively small number of genes, on the order of 10. Finally, given the vast amounts of publicly available gene expression data from different sources, this chapter summarizes the important sources of these data and the characteristics of these sources or databases. In line with the overall aim of this book of providing practical knowledge to a researcher interested in analyzing gene expression data from a network perspective, the chapter provides convenient, publicly accessible tools for performing the analyses described and, in addition, describes three motivating examples taken from the published literature that illustrate some of the relevant analyses.
Disclaimer
The findings and conclusions in this report are those of the authors and do not necessarily represent the views and positions of the Centers for Disease Control and Prevention or the Agency for Toxic Substances and Disease Registry.
Copyright information
© 2013 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Thomas, R., Portier, C.J. (2013). Gene Expression Networks. In: Reisfeld, B., Mayeno, A. (eds) Computational Toxicology. Methods in Molecular Biology, vol 930. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-059-5_7
DOI: https://doi.org/10.1007/978-1-62703-059-5_7
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-058-8
Online ISBN: 978-1-62703-059-5
eBook Packages: Springer Protocols