Abstract
Applications of Bayesian networks in systems biology are computationally demanding due to the large number of model parameters. Conventional MCMC schemes based on proposal moves in structure space tend to be too slow in mixing and convergence, and have recently been superseded by proposal moves in the space of node orders. A disadvantage of the latter approach is the intrinsic inability to specify the prior probability on network structures explicitly. The relative paucity of different experimental conditions in contemporary systems biology implies a strong influence of the prior probability on the posterior probability and, hence, the outcome of inference. Consequently, the paradigm of performing MCMC proposal moves in order rather than structure space is not entirely satisfactory. In the present article, we propose a new and more extensive edge reversal move in the original structure space, and we show that this significantly improves the convergence of the classical structure MCMC scheme.
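For orientation, the following is a minimal Python sketch of the classical structure MCMC scheme that the abstract describes as the conventional baseline: Metropolis-Hastings over directed acyclic graphs, with proposals drawn uniformly from single-edge additions, deletions, and reversals. It does not implement the new edge reversal move proposed in the article. The function names and the placeholder `toy_log_score` are assumptions made for illustration; a real application would substitute a log marginal likelihood plus log structure prior.

```python
import itertools
import math
import random


def is_acyclic(edges, n):
    """Return True if the directed graph on nodes 0..n-1 has no cycle."""
    indegree = [0] * n
    children = [[] for _ in range(n)]
    for u, v in edges:
        indegree[v] += 1
        children[u].append(v)
    # Kahn's algorithm: repeatedly strip nodes that have no incoming edges.
    stack = [i for i in range(n) if indegree[i] == 0]
    visited = 0
    while stack:
        u = stack.pop()
        visited += 1
        for v in children[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                stack.append(v)
    return visited == n


def neighbours(edges, n):
    """All DAGs reachable by one single-edge addition, deletion, or reversal."""
    result = []
    for u, v in itertools.permutations(range(n), 2):
        if (u, v) in edges:
            removed = edges - {(u, v)}
            result.append(removed)  # deleting an edge cannot create a cycle
            flipped = removed | {(v, u)}
            if is_acyclic(flipped, n):
                result.append(flipped)
        elif (v, u) not in edges:
            added = edges | {(u, v)}
            if is_acyclic(added, n):
                result.append(added)
    return result


def structure_mcmc(log_score, n, steps, seed=0):
    """Sample DAGs with Metropolis-Hastings, starting from the empty graph."""
    rng = random.Random(seed)
    graph = frozenset()
    graph_nbrs = neighbours(graph, n)
    samples = []
    for _ in range(steps):
        proposal = rng.choice(graph_nbrs)
        proposal_nbrs = neighbours(proposal, n)
        # Hastings correction: the ratio of neighbourhood sizes accounts for
        # the uniform proposal being asymmetric between the two graphs.
        log_ratio = (log_score(proposal) - log_score(graph)
                     + math.log(len(graph_nbrs)) - math.log(len(proposal_nbrs)))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            graph, graph_nbrs = proposal, proposal_nbrs
        samples.append(graph)
    return samples


def toy_log_score(graph):
    """Hypothetical placeholder score that favours graphs with the edge 0 -> 1."""
    return 3.0 if (0, 1) in graph else 0.0
```

Calling `structure_mcmc(toy_log_score, n=3, steps=2000)` returns a list of edge sets, from which posterior edge probabilities can be estimated by counting how often each edge appears. Recomputing the full neighbourhood at every step keeps the sketch short but is only practical for small networks.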
Additional information
Editor: Kevin P. Murphy.
Cite this article
Grzegorczyk, M., Husmeier, D. Improving the structure MCMC sampler for Bayesian networks by introducing a new edge reversal move. Mach Learn 71, 265–305 (2008). https://doi.org/10.1007/s10994-008-5057-7
DOI: https://doi.org/10.1007/s10994-008-5057-7