Inferring Genetic Regulatory Networks from Microarray Experiments with Bayesian Networks

Chapter in Probabilistic Modeling in Bioinformatics and Medical Informatics

Part of the book series: Advanced Information and Knowledge Processing (AI&KP)

Summary

Molecular pathways consisting of interacting proteins underlie the major functions of living cells, and a central goal of molecular biology is to understand the regulatory mechanisms of gene transcription and protein synthesis. Several approaches to the reverse engineering of genetic regulatory networks from gene expression data have been explored. At the most refined level of detail is a mathematical description of the biophysical processes in terms of a system of coupled differential equations that describe, for instance, the processes of transcription factor binding, protein and RNA degradation, and diffusion. Besides facing inherent identifiability problems, this approach is usually restricted to very small systems. At the other extreme is the coarse-scale approach of clustering, which provides a computationally cheap way to extract useful information from large-scale expression data sets. However, while clustering indicates which genes are co-regulated and may therefore be involved in related biological processes, it does not lead to a fine resolution of the interaction processes that would indicate, for instance, whether an interaction between two genes is direct or mediated by other genes, or whether a gene is a regulator or regulatee.

A promising compromise between these two extremes is the approach of Bayesian networks, which are interpretable and flexible models for representing conditional dependence relations between multiple interacting quantities, and whose probabilistic nature is capable of handling noise inherent in both the biological processes and the microarray experiments. This chapter will first briefly recapitulate the Bayesian network paradigm and the work of Friedman et al. [8], [23], who spearheaded the application of Bayesian networks to gene expression data. Next, the chapter will discuss the shortcomings of static Bayesian networks and show how these shortcomings can be overcome with dynamic Bayesian networks.

Finally, the chapter will address the important question of the reliability of the inference procedure. This inference problem is particularly hard in that interactions between hundreds of genes have to be learned from very sparse data sets, typically containing only a few dozen time points during a cell cycle. The results of a simulation study to test the viability of the Bayesian network paradigm are reported. In this study, gene expression data are simulated from a realistic molecular biological network involving DNAs, mRNAs and proteins, and then regulatory networks are inferred from these data in a reverse engineering approach, using dynamic Bayesian networks and Bayesian learning with Markov chain Monte Carlo. The simulation results are presented as receiver operating characteristic (ROC) curves. These curves allow an estimation of the proportion of spurious gene interactions incurred for a specified target proportion of recovered true interactions. The findings demonstrate how the network inference performance varies with the training set size, the degree of inadequacy of prior assumptions, the experimental sampling strategy, and the inclusion of further, sequence-based information.
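
To make the ROC evaluation described in the summary concrete, the following is a minimal sketch of how such a curve could be computed for an inferred network. It is not the chapter's own code. It assumes that inference yields one score per candidate edge (for instance a posterior edge probability estimated from MCMC samples) and that the true network is known, as in a simulation study. The function names `roc_points` and `fpr_at_sensitivity`, the three-gene network, and all scores are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (regulator, target); hypothetical representation


def roc_points(scores: Dict[Edge, float], true_edges: Set[Edge]) -> List[Tuple[float, float]]:
    """Sweep a threshold over the edge scores, highest first, and return
    (false positive rate, true positive rate) pairs."""
    n_pos = sum(1 for edge in scores if edge in true_edges)
    n_neg = len(scores) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one true and one false candidate edge")

    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Edges with tied scores cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][1] == ranked[i][1]:
            if ranked[j][0] in true_edges:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def fpr_at_sensitivity(points: List[Tuple[float, float]], target_tpr: float) -> float:
    """Smallest false positive rate at which at least `target_tpr` of the
    true edges are recovered."""
    return min(fpr for fpr, tpr in points if tpr >= target_tpr)


# Made-up scores for all directed edges among three genes A, B, C.
example_scores = {
    ("A", "B"): 0.9, ("A", "C"): 0.7, ("B", "C"): 0.6,
    ("B", "A"): 0.3, ("C", "B"): 0.2, ("C", "A"): 0.1,
}
example_truth = {("A", "B"), ("B", "C")}  # assumed true regulatory edges

example_curve = roc_points(example_scores, example_truth)
print(example_curve)
print(fpr_at_sensitivity(example_curve, 1.0))
```

In this toy case, recovering both true edges requires a threshold low enough to also admit the spurious edge A→C, so one of the four false candidate edges is accepted. Reading a curve this way corresponds to the trade-off the summary describes: the proportion of spurious interactions incurred for a target proportion of recovered true interactions.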

References

  1. S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. Basic local alignment search tool. Journal of Molecular Biology, 215:403–410, 1990.

  2. P. W. Atkins. Physical Chemistry. Oxford University Press, Oxford, 3rd edition, 1986.

  3. N. Barkai and S. Leibler. Circadian clocks limited by noise. Nature, 403:267–268, 2000.

  4. T. Chen, H. L. He, and G. M. Church. Modeling gene expression with differential equations. Pacific Symposium on Biocomputing, 4:29–40, 1999.

  5. H. De Jong. Modeling and simulation of genetic regulatory systems: A literature review. Journal of Computational Biology, 9(1):67–103, 2002.

  6. P. D’haeseleer, S. Liang, and R. Somogyi. Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics, 16(8):707–726, 2000.

  7. M. B. Eisen, P. T. Spellman, P. O. Brown, and D. Botstein. Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences of the United States of America, 95:14863–14868, 1998.

  8. N. Friedman, M. Linial, I. Nachman, and D. Pe’er. Using Bayesian networks to analyze expression data. Journal of Computational Biology, 7:601–620, 2000.

  9. N. Friedman, K. Murphy, and S. Russell. Learning the structure of dynamic probabilistic networks. In G. F. Cooper and S. Moral, editors, Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 139–147, San Francisco, CA, 1998. Morgan Kaufmann Publishers.

  10. T. S. Gardner, C. R. Cantor, and J. J. Collins. Construction of a genetic toggle switch in Escherichia coli. Nature, 403:339–342, 2000.

  11. P. Green. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82:711–732, 1995.

  12. A. J. Hartemink, D. K. Gifford, T. S. Jaakkola, and R. A. Young. Using graphical models and genomic expression data to statistically validate models of genetic regulatory networks. Pacific Symposium on Biocomputing, 6:422–433, 2001.

  13. A. J. Hartemink, D. K. Gifford, T. S. Jaakkola, and R. A. Young. Combining location and expression data for principled discovery of genetic network models. Pacific Symposium on Biocomputing, 7:437–449, 2002.

  14. D. Heckerman. A tutorial on learning with Bayesian networks. In M. I. Jordan, editor, Learning in Graphical Models, Adaptive Computation and Machine Learning, pages 301–354, Cambridge, MA, 1999. MIT Press.

  15. J. Hertz, A. Krogh, and R. G. Palmer. Introduction to the Theory of Neural Computation. Addison Wesley, Redwood City, CA, 1991.

  16. D. Husmeier. Reverse engineering of genetic networks with Bayesian networks. Biochemical Society Transactions, 31(6):1516–1518, 2003.

  17. D. Husmeier. Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics, 19:2271–2282, 2003.

  18. S. A. Kauffman. Metabolic stability and epigenesis in randomly connected nets. Journal of Theoretical Biology, 22:437–467, 1969.

  19. S. A. Kauffman. The Origins of Order, Self-Organization and Selection in Evolution. Oxford University Press, 1993.

  20. P. J. Krause. Learning probabilistic networks. Knowledge Engineering Review, 13:321–351, 1998.

  21. K. P. Murphy and S. Mian. Modelling gene expression data using dynamic Bayesian networks. Technical report, MIT Artificial Intelligence Laboratory, 1999. http://www.ai.mit.edu/~murphyk/Papers/ismb99.ps.gz.

  22. I. Ong, J. Glasner, and D. Page. Modelling regulatory pathways in E. coli from time series expression profiles. Bioinformatics, 18(Suppl.1):S241–S248, 2002.

  23. D. Pe’er, A. Regev, G. Elidan, and N. Friedman. Inferring subnetworks from perturbed expression profiles. Bioinformatics, 17:S215–S224, 2001.

  24. B. Ren, F. Robert, J. J. Wyrick, O. Aparicio, E. G. Jennings, I. Simon, J. Zeitlinger, J. Schreiber, N. Hannett, E. Kanin, T. L. Volkert, C. J. Wilson, S. P. Bell, and R. A. Young. Genome-wide location and function of DNA binding proteins. Science, 290:2306–2309, 2000.

  25. E. Segal, Y. Barash, I. Simon, N. Friedman, and D. Koller. From promoter sequence to expression: a probabilistic framework. Research in Computational Molecular Biology (RECOMB), 6:263–272, 2002.

  26. V. A. Smith, E. D. Jarvis, and A. J. Hartemink. Evaluating functional network inference using simulations of complex biological systems. Bioinformatics, 18:S216–S224, 2002. (ISMB02 special issue).

  27. P. Spellman, G. Sherlock, M. Zhang, V. Iyer, K. Anders, M. Eisen, P. Brown, D. Botstein, and B. Futcher. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell, 9:3273–3297, 1998.

  28. J. Vilo, A. Brazma, I. Jonassen, A. Robinson, and E. Ukkonen. Mining for putative regulatory elements in the yeast genome using gene expression data. In P. E. Bourne, M. Gribskov, R. B. Altman, N. Jensen, D. Hope, T. Lengauer, J. C. Mitchell, E. Scheeff, C. Smith, S. Strande, and H. Weissig, editors, Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology (ISMB), pages 384–394. AAAI, 2000.

  29. D. E. Zak, F. J. Doyle, G. E. Gonye, and J. S. Schwaber. Simulation studies for the identification of genetic networks from cDNA array and regulatory activity data. Proceedings of the Second International Conference on Systems Biology, pages 231–238, 2001.

  30. D. E. Zak, F. J. Doyle, and J. S. Schwaber. Local identifiability: when can genetic networks be identified from microarray data? Proceedings of the Third International Conference on Systems Biology, pages 236–237, 2002.

Copyright information

© 2005 Springer-Verlag London Limited

About this chapter

Cite this chapter

Husmeier, D. (2005). Inferring Genetic Regulatory Networks from Microarray Experiments with Bayesian Networks. In: Husmeier, D., Dybowski, R., Roberts, S. (eds) Probabilistic Modeling in Bioinformatics and Medical Informatics. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/1-84628-119-9_8

  • DOI: https://doi.org/10.1007/1-84628-119-9_8

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-85233-778-0

  • Online ISBN: 978-1-84628-119-8

  • eBook Packages: Computer Science, Computer Science (R0)
