Summary
Molecular pathways consisting of interacting proteins underlie the major functions of living cells, and a central goal of molecular biology is to understand the regulatory mechanisms of gene transcription and protein synthesis. Several approaches to the reverse engineering of genetic regulatory networks from gene expression data have been explored. At the most refined level of detail is a mathematical description of the biophysical processes in terms of a system of coupled differential equations that describe, for instance, the processes of transcription factor binding, protein and RNA degradation, and diffusion. Besides facing inherent identifiability problems, this approach is usually restricted to very small systems. At the other extreme is the coarse-scale approach of clustering, which provides a computationally cheap way to extract useful information from large-scale expression data sets. However, while clustering indicates which genes are co-regulated and may therefore be involved in related biological processes, it does not lead to a fine resolution of the interaction processes that would indicate, for instance, whether an interaction between two genes is direct or mediated by other genes, or whether a gene is a regulator or regulatee. A promising compromise between these two extremes is the approach of Bayesian networks, which are interpretable and flexible models for representing conditional dependence relations between multiple interacting quantities, and whose probabilistic nature is capable of handling noise inherent in both the biological processes and the microarray experiments. This chapter will first briefly recapitulate the Bayesian network paradigm and the work of Friedman et al. [8], [23], who spearheaded the application of Bayesian networks to gene expression data. Next, the chapter will discuss the shortcomings of static Bayesian networks and show how these shortcomings can be overcome with dynamic Bayesian networks. 
Finally, the chapter will address the important question of the reliability of the inference procedure. This inference problem is particularly hard in that interactions between hundreds of genes have to be learned from very sparse data sets, typically containing only a few dozen time points during a cell cycle. The results of a simulation study to test the viability of the Bayesian network paradigm are reported. In this study, gene expression data are simulated from a realistic molecular biological network involving DNAs, mRNAs and proteins, and then regulatory networks are inferred from these data in a reverse engineering approach, using dynamic Bayesian networks and Bayesian learning with Markov chain Monte Carlo. The simulation results are presented as receiver operating characteristic (ROC) curves, which allow an estimation of the proportion of spurious gene interactions incurred for a specified target proportion of recovered true interactions. The findings demonstrate how the network inference performance varies with the training set size, the degree of inadequacy of prior assumptions, the experimental sampling strategy, and the inclusion of further, sequence-based information.
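As an illustration of how an ROC curve relates recovered true interactions to spurious ones, the following is a minimal sketch in Python. It is not the chapter's implementation. It assumes that the inference step yields one score per candidate edge (for example, a posterior edge probability estimated from MCMC samples) and that the true edges are known, as they are in a simulation study. The function names, the toy gene labels and the scores are hypothetical, and tied scores are ranked arbitrarily rather than treated specially.

```python
def roc_points(edge_scores, true_edges):
    """Return (false_positive_rate, true_positive_rate) pairs.

    edge_scores: dict mapping a candidate edge (regulator, target) to a score;
                 a higher score means the edge is more strongly supported.
    true_edges:  set of edges present in the network that generated the data.
    Assumes at least one true and one false candidate edge.
    """
    n_pos = sum(1 for edge in edge_scores if edge in true_edges)
    n_neg = len(edge_scores) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one true and one false candidate edge")

    # Lower the acceptance threshold one edge at a time, from best score down.
    ranked = sorted(edge_scores.items(), key=lambda item: item[1], reverse=True)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for edge, _score in ranked:
        if edge in true_edges:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / n_neg, true_pos / n_pos))
    return points


def false_positive_rate_at(points, target_tpr):
    """Smallest false positive rate at which the target true positive rate is reached."""
    for fpr, tpr in points:
        if tpr >= target_tpr:
            return fpr
    return 1.0


# Hypothetical toy example: three genes, six candidate directed edges.
scores = {
    ("A", "B"): 0.9, ("B", "C"): 0.8, ("A", "C"): 0.6,
    ("C", "A"): 0.4, ("B", "A"): 0.3, ("C", "B"): 0.1,
}
truth = {("A", "B"), ("B", "C"), ("C", "A")}
curve = roc_points(scores, truth)
```

In this toy example, `false_positive_rate_at(curve, 1.0)` gives the fraction of non-existent edges that must be accepted in order to recover every true edge, which is the kind of trade-off the ROC curves summarise.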
References
S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. Basic local alignment search tool. Journal of Molecular Biology, 215:403–410, 1990.
P. W. Atkins. Physical Chemistry. Oxford University Press, Oxford, 3rd edition, 1986.
N. Barkai and S. Leibler. Circadian clocks limited by noise. Nature, 403:267–268, 2000.
T. Chen, H. L. He, and G. M. Church. Modeling gene expression with differential equations. Pacific Symposium on Biocomputing, 4:29–40, 1999.
H. De Jong. Modeling and simulation of genetic regulatory systems: A literature review. Journal of Computational Biology, 9(1):67–103, 2002.
P. D’haeseleer, S. Liang, and R. Somogyi. Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics, 16(8):707–726, 2000.
M. B. Eisen, P. T. Spellman, P. O. Brown, and D. Botstein. Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences of the United States of America, 95:14863–14868, 1998.
N. Friedman, M. Linial, I. Nachman, and D. Pe’er. Using Bayesian networks to analyze expression data. Journal of Computational Biology, 7:601–620, 2000.
N. Friedman, K. Murphy, and S. Russell. Learning the structure of dynamic probabilistic networks. In G. F. Cooper and S. Moral, editors, Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 139–147, San Francisco, CA, 1998. Morgan Kaufmann Publishers.
T. S. Gardner, C. R. Cantor, and J. J. Collins. Construction of a genetic toggle switch in Escherichia coli. Nature, 403:339–342, 2000.
P. Green. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82:711–732, 1995.
A. J. Hartemink, D. K. Gifford, T. S. Jaakkola, and R. A. Young. Using graphical models and genomic expression data to statistically validate models of genetic regulatory networks. Pacific Symposium on Biocomputing, 6:422–433, 2001.
A. J. Hartemink, D. K. Gifford, T. S. Jaakkola, and R. A. Young. Combining location and expression data for principled discovery of genetic network models. Pacific Symposium on Biocomputing, 7:437–449, 2002.
D. Heckerman. A tutorial on learning with Bayesian networks. In M. I. Jordan, editor, Learning in Graphical Models, Adaptive Computation and Machine Learning, pages 301–354, Cambridge, MA, 1999. MIT Press.
J. Hertz, A. Krogh, and R. G. Palmer. Introduction to the Theory of Neural Computation. Addison Wesley, Redwood City, CA, 1991.
D. Husmeier. Reverse engineering of genetic networks with Bayesian networks. Biochemical Society Transactions, 31(6):1516–1518, 2003.
D. Husmeier. Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics, 19:2271–2282, 2003.
S. A. Kauffman. Metabolic stability and epigenesis in randomly connected nets. Journal of Theoretical Biology, 22:437–467, 1969.
S. A. Kauffman. The Origins of Order, Self-Organization and Selection in Evolution. Oxford University Press, 1993.
P. J. Krause. Learning probabilistic networks. Knowledge Engineering Review, 13:321–351, 1998.
K. P. Murphy and S. Mian. Modelling gene expression data using dynamic Bayesian networks. Technical report, MIT Artificial Intelligence Laboratory, 1999. http://www.ai.mit.edu/~murphyk/Papers/ismb99.ps.gz.
I. Ong, J. Glasner, and D. Page. Modelling regulatory pathways in E. coli from time series expression profiles. Bioinformatics, 18(Suppl.1):S241–S248, 2002.
D. Pe’er, A. Regev, G. Elidan, and N. Friedman. Inferring subnetworks from perturbed expression profiles. Bioinformatics, 17:S215–S224, 2001.
B. Ren, F. Robert, J. J. Wyrick, O. Aparicio, E. G. Jennings, I. Simon, J. Zeitlinger, J. Schreiber, N. Hannett, E. Kanin, T. L. Volkert, C. J. Wilson, S. P. Bell, and R. A. Young. Genome-wide location and function of DNA binding proteins. Science, 290:2306–2309, 2000.
E. Segal, Y. Barash, I. Simon, N. Friedman, and D. Koller. From promoter sequence to expression: a probabilistic framework. Research in Computational Molecular Biology (RECOMB), 6:263–272, 2002.
V. A. Smith, E. D. Jarvis, and A. J. Hartemink. Evaluating functional network inference using simulations of complex biological systems. Bioinformatics, 18:S216–S224, 2002. (ISMB02 special issue).
P. Spellman, G. Sherlock, M. Zhang, V. Iyer, K. Anders, M. Eisen, P. Brown, D. Botstein, and B. Futcher. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell, 9:3273–3297, 1998.
J. Vilo, A. Brazma, I. Jonassen, A. Robinson, and E. Ukkonen. Mining for putative regulatory elements in the yeast genome using gene expression data. In P. E. Bourne, M. Gribskov, R. B. Altman, N. Jensen, D. Hope, T. Lengauer, J. C. Mitchell, E. Scheeff, C. Smith, S. Strande, and H. Weissig, editors, Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology (ISMB), pages 384–394. AAAI, 2000.
D. E. Zak, F. J. Doyle, G. E. Gonye, and J. S. Schwaber. Simulation studies for the identification of genetic networks from cDNA array and regulatory activity data. Proceedings of the Second International Conference on Systems Biology, pages 231–238, 2001.
D. E. Zak, F. J. Doyle, and J. S. Schwaber. Local identifiability: when can genetic networks be identified from microarray data? Proceedings of the Third International Conference on Systems Biology, pages 236–237, 2002.
Copyright information
© 2005 Springer-Verlag London Limited
About this chapter
Cite this chapter
Husmeier, D. (2005). Inferring Genetic Regulatory Networks from Microarray Experiments with Bayesian Networks. In: Husmeier, D., Dybowski, R., Roberts, S. (eds) Probabilistic Modeling in Bioinformatics and Medical Informatics. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/1-84628-119-9_8
DOI: https://doi.org/10.1007/1-84628-119-9_8
Publisher Name: Springer, London
Print ISBN: 978-1-85233-778-0
Online ISBN: 978-1-84628-119-8
eBook Packages: Computer Science (R0)