Inferring Genetic Regulatory Networks from Microarray Experiments with Bayesian Networks

Husmeier, Dirk

doi:10.1007/1-84628-119-9_8

Part of the book series: Advanced Information and Knowledge Processing (AI&KP)

Summary

Molecular pathways consisting of interacting proteins underlie the major functions of living cells, and a central goal of molecular biology is to understand the regulatory mechanisms of gene transcription and protein synthesis. Several approaches to the reverse engineering of genetic regulatory networks from gene expression data have been explored. At the most refined level of detail is a mathematical description of the biophysical processes in terms of a system of coupled differential equations that describe, for instance, the processes of transcription factor binding, protein and RNA degradation, and diffusion. Besides facing inherent identifiability problems, this approach is usually restricted to very small systems. At the other extreme is the coarse-scale approach of clustering, which provides a computationally cheap way to extract useful information from large-scale expression data sets. However, while clustering indicates which genes are co-regulated and may therefore be involved in related biological processes, it does not lead to a fine resolution of the interaction processes that would indicate, for instance, whether an interaction between two genes is direct or mediated by other genes, or whether a gene is a regulator or regulatee. A promising compromise between these two extremes is the approach of Bayesian networks, which are interpretable and flexible models for representing conditional dependence relations between multiple interacting quantities, and whose probabilistic nature is capable of handling noise inherent in both the biological processes and the microarray experiments. This chapter will first briefly recapitulate the Bayesian network paradigm and the work of Friedman et al. [8], [23], who spearheaded the application of Bayesian networks to gene expression data. Next, the chapter will discuss the shortcomings of static Bayesian networks and show how these shortcomings can be overcome with dynamic Bayesian networks. 
Finally, the chapter will address the important question of the reliability of the inference procedure. This inference problem is particularly hard in that interactions between hundreds of genes have to be learned from very sparse data sets, typically containing only a few dozen time points during a cell cycle. The results of a simulation study to test the viability of the Bayesian network paradigm are reported. In this study, gene expression data are simulated from a realistic molecular biological network involving DNAs, mRNAs and proteins, and then regulatory networks are inferred from these data in a reverse engineering approach, using dynamic Bayesian networks and Bayesian learning with Markov chain Monte Carlo. The simulation results are presented as receiver operating characteristic (ROC) curves, which allow the proportion of spurious gene interactions to be estimated for a specified target proportion of recovered true interactions. The findings demonstrate how the network inference performance varies with the training set size, the degree of inadequacy of prior assumptions, the experimental sampling strategy, and the inclusion of further, sequence-based information.
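
The ROC evaluation described in the summary can be illustrated with a minimal sketch. It assumes that each candidate gene interaction has been given a score (for instance, a posterior edge probability estimated from the MCMC sample) and that the true network is known, as it is in a simulation study. The function names and the example numbers are hypothetical and are not taken from the chapter.

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    scores: one score per candidate edge (higher means more strongly supported).
    labels: 1 if the edge exists in the true network, 0 otherwise.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one true edge and one absent edge")

    # Rank candidate edges from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Edges tied at the same score are accepted together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def fpr_at_sensitivity(points: List[Tuple[float, float]], target_tpr: float) -> float:
    """Smallest false positive rate among thresholds that reach target_tpr."""
    return min(fpr for fpr, tpr in points if tpr >= target_tpr)


# Hypothetical scores for six candidate edges, three of which are true.
example_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
example_labels = [1, 0, 1, 1, 0, 0]
example_points = roc_points(example_scores, example_labels)
print(fpr_at_sensitivity(example_points, 0.5))
```

Reading the curve at a chosen sensitivity, as `fpr_at_sensitivity` does, gives the proportion of absent interactions that are wrongly predicted when that proportion of true interactions is recovered.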

References

[1] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. Basic local alignment search tool. Journal of Molecular Biology, 215:403–410, 1990.
[2] P. W. Atkins. Physical Chemistry. Oxford University Press, Oxford, 3rd edition, 1986.
[3] N. Barkai and S. Leibler. Circadian clocks limited by noise. Nature, 403:267–268, 2000.
[4] T. Chen, H. L. He, and G. M. Church. Modeling gene expression with differential equations. Pacific Symposium on Biocomputing, 4:29–40, 1999.
[5] H. De Jong. Modeling and simulation of genetic regulatory systems: A literature review. Journal of Computational Biology, 9(1):67–103, 2002.
[6] P. D’haeseleer, S. Liang, and R. Somogyi. Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics, 16(8):707–726, 2000.
[7] M. B. Eisen, P. T. Spellman, P. O. Brown, and D. Botstein. Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences of the United States of America, 95:14863–14868, 1998.
[8] N. Friedman, M. Linial, I. Nachman, and D. Pe’er. Using Bayesian networks to analyze expression data. Journal of Computational Biology, 7:601–620, 2000.
[9] N. Friedman, K. Murphy, and S. Russell. Learning the structure of dynamic probabilistic networks. In G. F. Cooper and S. Moral, editors, Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 139–147, San Francisco, CA, 1998. Morgan Kaufmann Publishers.
[10] T. S. Gardner, C. R. Cantor, and J. J. Collins. Construction of a genetic toggle switch in Escherichia coli. Nature, 403:339–342, 2000.
[11] P. Green. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82:711–732, 1995.
[12] A. J. Hartemink, D. K. Gifford, T. S. Jaakkola, and R. A. Young. Using graphical models and genomic expression data to statistically validate models of genetic regulatory networks. Pacific Symposium on Biocomputing, 6:422–433, 2001.
[13] A. J. Hartemink, D. K. Gifford, T. S. Jaakkola, and R. A. Young. Combining location and expression data for principled discovery of genetic network models. Pacific Symposium on Biocomputing, 7:437–449, 2002.
[14] D. Heckerman. A tutorial on learning with Bayesian networks. In M. I. Jordan, editor, Learning in Graphical Models, Adaptive Computation and Machine Learning, pages 301–354, Cambridge, MA, 1999. MIT Press.
[15] J. Hertz, A. Krogh, and R. G. Palmer. Introduction to the Theory of Neural Computation. Addison Wesley, Redwood City, CA, 1991.
[16] D. Husmeier. Reverse engineering of genetic networks with Bayesian networks. Biochemical Society Transactions, 31(6):1516–1518, 2003.
[17] D. Husmeier. Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics, 19:2271–2282, 2003.
[18] S. A. Kauffman. Metabolic stability and epigenesis in randomly connected nets. Journal of Theoretical Biology, 22:437–467, 1969.
[19] S. A. Kauffman. The Origins of Order, Self-Organization and Selection in Evolution. Oxford University Press, 1993.
[20] P. J. Krause. Learning probabilistic networks. Knowledge Engineering Review, 13:321–351, 1998.
[21] K. P. Murphy and S. Milan. Modelling gene expression data using dynamic Bayesian networks. Technical report, MIT Artificial Intelligence Laboratory, 1999. http://www.ai.mit.edu/~murphyk/Papers/ismb99.ps.gz.
[22] I. Ong, J. Glasner, and D. Page. Modelling regulatory pathways in E. coli from time series expression profiles. Bioinformatics, 18(Suppl.1):S241–S248, 2002.
[23] D. Pe’er, A. Regev, G. Elidan, and N. Friedman. Inferring subnetworks from perturbed expression profiles. Bioinformatics, 17:S215–S224, 2001.
[24] B. Ren, F. Robert, J. J. Wyrick, O. Aparicio, E. G. Jennings, I. Simon, J. Zeitlinger, J. Schreiber, N. Hannett, E. Kanin, T. L. Volkert, C. J. Wilson, S. P. Bell, and R. A. Young. Genome-wide location and function of DNA binding proteins. Science, 290:2306–2309, 2000.
[25] E. Segal, Y. Barash, I. Simon, N. Friedman, and D. Koller. From promoter sequence to expression: a probabilistic framework. Research in Computational Molecular Biology (RECOMB), 6:263–272, 2002.
[26] V. A. Smith, E. D. Jarvis, and A. J. Hartemink. Evaluating functional network inference using simulations of complex biological systems. Bioinformatics, 18:S216–S224, 2002. (ISMB02 special issue).
[27] P. Spellman, G. Sherlock, M. Zhang, V. Iyer, K. Anders, M. Eisen, P. Brown, D. Botstein, and B. Futcher. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell, 9:3273–3297, 1998.
[28] J. Vilo, A. Brazma, I. Jonassen, A. Robinson, and E. Ukkonen. Mining for putative regulatory elements in the yeast genome using gene expression data. In P. E. Bourne, M. Gribskov, R. B. Altman, N. Jensen, D. Hope, T. Lengauer, J. C. Mitchell, E. Scheeff, C. Smith, S. Strande, and H. Weissig, editors, Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology (ISMB), pages 384–394. AAAI, 2000.
[29] D. E. Zak, F. J. Doyle, G. E. Gonye, and J. S. Schwaber. Simulation studies for the identification of genetic networks from cDNA array and regulatory activity data. Proceedings of the Second International Conference on Systems Biology, pages 231–238, 2001.
[30] D. E. Zak, F. J. Doyle, and J. S. Schwaber. Local identifiability: when can genetic networks be identified from microarray data? Proceedings of the Third International Conference on Systems Biology, pages 236–237, 2002.

Author information

Authors and Affiliations

JCMB, Biomathematics and Statistics Scotland (BioSS), The King’s Buildings, Edinburgh, EH9 3JZ, UK
Dirk Husmeier

Editor information

Editors and Affiliations

Biomathematics and Statistics-BioSS, UK
Dirk Husmeier DiplPhys, MSc, PhD
InferSpace, UK
Richard Dybowski BSc, MSc, PhD
Oxford University, UK
Stephen Roberts MA, DPhil, MIEEE, MIoP, CPhys

About this chapter

Cite this chapter

Husmeier, D. (2005). Inferring Genetic Regulatory Networks from Microarray Experiments with Bayesian Networks. In: Husmeier, D., Dybowski, R., Roberts, S. (eds) Probabilistic Modeling in Bioinformatics and Medical Informatics. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/1-84628-119-9_8

DOI: https://doi.org/10.1007/1-84628-119-9_8
Publisher Name: Springer, London
Print ISBN: 978-1-85233-778-0
Online ISBN: 978-1-84628-119-8
eBook Packages: Computer Science, Computer Science (R0)
