Machine Learning, Volume 90, Issue 2, pp 191–230

Non-homogeneous dynamic Bayesian networks with Bayesian regularization for inferring gene regulatory networks with gradually time-varying structure

Abstract

The proper functioning of any living cell relies on complex networks of gene regulation. These regulatory interactions are not static but respond to changes in the environment and evolve during the life cycle of an organism. A challenging objective in computational systems biology is to infer these time-varying gene regulatory networks from typically short time series of transcriptional profiles. While homogeneous models, like conventional dynamic Bayesian networks, lack the flexibility to succeed in this task, fully flexible models suffer from inflated inference uncertainty due to the limited amount of available data. In the present paper we explore a semi-flexible model based on a piecewise homogeneous dynamic Bayesian network regularized by gene-specific inter-segment information sharing. We explore different choices of prior distribution and information coupling and evaluate their performance on synthetic data. We apply our method to gene expression time series obtained during the life cycle of Drosophila melanogaster, and compare the predicted segmentation with other state-of-the-art techniques. We conclude our evaluation with an application to synthetic biology, where the objective is to predict an in vivo regulatory network of five genes in Saccharomyces cerevisiae subjected to a changing environment.

Keywords

Dynamic Bayesian networks; Hierarchical Bayesian models; Multiple changepoint processes; Reversible jump Markov chain Monte Carlo; Gene expression time series; Systems and synthetic biology

Copyright information

© The Author(s) 2012

Authors and Affiliations

  • Frank Dondelinger (1, 2)
  • Sophie Lèbre (3)
  • Dirk Husmeier (1, 4)

  1. Biomathematics and Statistics Scotland, JCMB, Edinburgh, UK
  2. Institute for Adaptive and Neural Computation, The University of Edinburgh, Edinburgh, UK
  3. LSIIT, UMR 7005, Université de Strasbourg, Illkirch, France
  4. School of Mathematics and Statistics, University of Glasgow, Glasgow, UK
