Abstract
Inferred gene co-expression networks are a valuable resource for generating novel hypotheses in experimental research. Routine high-throughput microarray transcript profiling experiments and the rapid development of next-generation sequencing (NGS) technologies generate large amounts of publicly available data, enabling in silico reconstruction of regulatory networks. Analyses of the transcriptome under various experimental conditions have shown that genes with similar overall expression patterns often have similar functions. Consistent with this, genes involved in the same metabolic pathway are found in co-expressed modules. In this chapter, we describe a detailed workflow for analyzing gene co-expression networks using large-scale gene expression data and explain the critical steps, from design and data analysis to the prediction of functionally related modules. The protocol is platform independent and can be used with data generated by ATH1 arrays, tiling arrays, or RNA sequencing for any organism. Its most important feature is that it can infer statistically significant gene co-expression networks for any number of genes and transcriptome data sets, and it has no special hardware requirements.
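As a rough illustration of the general idea behind co-expression network inference, the sketch below links two genes when their expression profiles across samples are strongly correlated. It is not the chapter's protocol: the use of Pearson correlation, the hard cutoff of 0.9, the function names `pearson` and `coexpression_edges`, and the toy expression values are all assumptions made for this example, and no significance testing is performed.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A gene with constant expression carries no correlation signal.
        return 0.0
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold.

    `expression` maps a gene name to its expression values, one value per
    sample, with samples in the same order for every gene.
    The threshold is an illustrative assumption, not a recommended value.
    """
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical toy data: three genes measured in four samples.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 1.0, 3.5, 2.0],
}

edges = coexpression_edges(toy_expression, threshold=0.9)
for g1, g2, r in edges:
    print(f"{g1} -- {g2} (r = {r:.3f})")
```

In this toy data set only geneA and geneB pass the cutoff, so the resulting network has a single edge. A real analysis would start from normalized expression data and would choose the similarity measure and cutoff on statistical grounds.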
Acknowledgments
We thank Dr. Eva Vranová and Prof. Peter Bühlmann for helpful discussions and Philipp Ihmor for critically reading the manuscript. This work was supported by the Seventh Framework Program of the European Commission through the TiMet collaborative project (grant 245143) to W.G.
Copyright information
© 2014 Springer Science+Business Media New York
About this protocol
Cite this protocol
Coman, D., Rütimann, P., Gruissem, W. (2014). A Flexible Protocol for Targeted Gene Co-expression Network Analysis. In: Rodríguez-Concepción, M. (eds) Plant Isoprenoids. Methods in Molecular Biology, vol 1153. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0606-2_21
DOI: https://doi.org/10.1007/978-1-4939-0606-2_21
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-0605-5
Online ISBN: 978-1-4939-0606-2
eBook Packages: Springer Protocols