Network Analysis of Gene Expression

  • Roby Joehanes
Part of the Methods in Molecular Biology book series (MIMB, volume 1783)


Studies have shown that gene expression is highly regulated, which results in a cascade of distinct coexpression patterns that form a network. Identifying and understanding such patterns is crucial to deciphering the molecular mechanisms that underlie the pathophysiology of diseases. With advances in high-throughput assays of messenger RNA (mRNA) and in high-performance computing, reconstructing such networks from molecular data such as gene expression is now possible. This chapter gives an overview of methods for constructing such networks, along with practical considerations and an example.

Key words

Genes, Messenger RNA, Coexpression



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Hebrew SeniorLife, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, USA
