Correlation and Gene Co-Expression Networks

  • Steve Horvath


A correlation network is a network whose adjacency matrix is constructed from the pairwise correlations between numeric vectors. The numeric vectors may represent observed quantitative measurements of variables; for example, the expression levels (transcript abundances) of a gene across different conditions can be represented by a numeric vector. In general, the relationship between a pair of numeric vectors can be measured in many ways, in particular with a correlation coefficient (e.g., the Pearson, Spearman, or biweight midcorrelation) or with the concordance index. Mouse gene expression data are used to illustrate how network concepts can describe the pairwise relationships among gene expression profiles. While cluster trees and heat maps can be used to visualize relationships between variables, correlation network concepts can be used to quantify them. Brain cancer gene expression data are used to illustrate the topological effects of hard- and soft-thresholding. We provide an overview of weighted gene co-expression network analysis and of different gene network (re-)construction methods.
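As a rough illustration of the construction described above, the following Python sketch builds an unweighted (hard-thresholded) and a weighted (soft-thresholded) adjacency matrix from pairwise Pearson correlations. It is a minimal sketch under stated assumptions, not the chapter's implementation: the function name `correlation_adjacency`, the threshold `tau = 0.5`, the power `beta = 6`, the use of the absolute correlation, the zeroed diagonal, and the simulated data are all illustrative choices that do not come from the abstract.

```python
import numpy as np


def correlation_adjacency(expr, tau=0.5, beta=6):
    """Build hard- and soft-thresholded adjacency matrices.

    expr : array of shape (n_samples, n_genes); each column is one
           numeric vector (e.g., a gene expression profile).
    tau  : hard threshold on |correlation| (illustrative default).
    beta : soft-thresholding power (illustrative default).
    """
    # Pairwise Pearson correlations between columns (genes x genes).
    cor = np.corrcoef(expr, rowvar=False)
    abs_cor = np.abs(cor)

    # Hard thresholding: an edge is present (1) or absent (0).
    hard = (abs_cor >= tau).astype(float)

    # Soft thresholding: raising |cor| to a power keeps a continuous
    # edge weight in [0, 1] and shrinks weak correlations towards 0.
    soft = abs_cor ** beta

    # Assumption of this sketch: no self-connections.
    np.fill_diagonal(hard, 0.0)
    np.fill_diagonal(soft, 0.0)
    return hard, soft


# Simulated data: 20 samples, 5 "genes"; the first three genes share
# a common signal, the last two are independent noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=(20, 1))
expr = np.hstack([
    shared + 0.1 * rng.normal(size=(20, 3)),
    rng.normal(size=(20, 2)),
])

hard, soft = correlation_adjacency(expr)

# Connectivity (a basic network concept): row sums of the adjacency.
connectivity = soft.sum(axis=1)
print(np.round(soft, 2))
print(np.round(connectivity, 2))
```

In this toy setting, the three genes that share a signal receive weights close to 1 among themselves under soft-thresholding, whereas hard-thresholding reduces every pair to a yes/no edge and discards how strong each relationship is.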


Keywords: Adjacency matrix; Correlation network; Network concept; Numeric vector; Brown module



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. University of California, Los Angeles, Los Angeles, USA
