Causal Queries from Observational Data in Biological Systems via Bayesian Networks: An Empirical Study in Small Networks

  • Protocol
Gene Regulatory Networks

Part of the book series: Methods in Molecular Biology (MIMB, volume 1883)

Abstract

Biological networks are a convenient modeling and visualization tool for discovering knowledge from modern high-throughput genomics and post-genomics data sets. Indeed, biological entities are not isolated but are components of complex multilevel systems. We go one step further and advocate for causal representations of the interactions in living systems. We present the causal formalism and apply it in the context of biological networks when the data are observational. We also discuss its ability to decipher the causal information flow as observed in gene expression, and we illustrate our exploration with experiments on small simulated networks as well as on a real biological data set.

References

  1. Scutari M (2010) Learning Bayesian Networks with the bnlearn R Package. J Stat Softw 35(3):1–22

  2. Scutari M, Denis JB (2014) Bayesian networks: with examples in R. Texts in statistical science. CRC Press: Taylor & Francis Group, Boca Raton

  3. Edwards JS, Palsson BO (1999) Systems properties of the Haemophilus influenzae Rd metabolic genotype. J Biol Chem 274(25):17410–17416

  4. Kitano H (2002) Systems biology: a brief overview. Science 295(5560):1662–1664

  5. Noble D (2006) The music of life: biology beyond genes. Oxford University Press, Oxford

  6. Barabási AL, Oltvai ZN (2004) Network biology: understanding the cell’s functional organization. Nat Rev Genet 5(2):101–113

  7. Gericke NM, Hagberg M (2007) Definition of historical models of gene function and their relation to students’ understanding of genetics. Sci Educ 16(7):849–881

  8. Pennisi E (2007) DNA study forces rethink of what it means to be a gene. Science 316(5831):1556–1557

  9. McElreath R (2015) Statistical rethinking: a Bayesian course with examples in R and Stan. Chapman and Hall/CRC, Boca Raton

  10. Scutari M, Howell P, Balding DJ, Mackay I (2014) Multiple quantitative trait analysis using Bayesian networks. Genetics 198(1):129–137

  11. Zhang B, Horvath S (2005) A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol 4:17

  12. Tenenhaus A, Guillemot V, Gidrol X, Frouin V (2010) Gene association networks from microarray data using a regularized estimation of partial correlation based on PLS regression. IEEE/ACM Trans Comput Biol Bioinform 7(2):251–262

  13. Rau A, Maugis-Rabusseau C, Martin-Magniette ML, Celeux G (2015) Co-expression analysis of high-throughput transcriptome sequencing data with Poisson mixture models. Bioinformatics 31(9):1420–1427

  14. Jacob F, Monod J (1961) Genetic regulatory mechanisms in the synthesis of proteins. J Mol Biol 3(3):318–356

  15. Hentschel U, Steinert M, Hacker J (2000) Common molecular mechanisms of symbiosis and pathogenesis. Trends Microbiol 8(5):226–231

  16. Dupont PY, Eaton CJ, Wargent JJ, Fechtner S, Solomon P, Schmid J, Day RC, Scott B, Cox MP (2015) Fungal endophyte infection of ryegrass reprograms host metabolism and alters development. New Phytol 208(4):1227–1240

  17. Pearl J (2009) Causality: models, reasoning and inference, 2nd edn. Cambridge University Press, New York

  18. Mangan S, Alon U (2003) Structure and function of the feed-forward loop network motif. Proc Natl Acad Sci 100(21):11980–11985

  19. Zabet NR (2011) Negative feedback and physical limits of genes. J Theor Biol 284(1):82–91

  20. Shojaie A, Jauhiainen A, Kallitsis M, Michailidis G (2014) Inferring regulatory networks by combining perturbation screens and steady state gene expression profiles. PLoS ONE 9:1–16

  21. Ghahramani Z (1998) Learning dynamic Bayesian networks. In: Adaptive processing of sequences and data structures. Lecture notes in computer sciences. Springer, New York, pp 168–197

  22. Friedman N, Murphy K, Russell S (1998) Learning the structure of dynamic probabilistic networks. In: Proceedings of the fourteenth conference on uncertainty in artificial intelligence, UAI’98. Morgan Kaufmann, San Francisco, pp 139–147

  23. Husmeier D (2003) Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics 19(17):2271–2282

  24. Tulupyev AL, Nikolenko SI (2005) Directed cycles in Bayesian belief networks: probabilistic semantics and consistency checking complexity. In: Gelbukh A, de Albornoz Á, Terashima-Marín H (eds) MICAI 2005: Advances in artificial intelligence. Springer, Berlin, pp 214–223

  25. Harary F, Norman R, Cartwright D (1965) Structural models: an introduction to the theory of directed graphs. Wiley, New York

  26. Lacerda G, Spirtes P, Ramsey J, Hoyer P (2008) Discovering cyclic causal models by independent components analysis. In: Proceedings of the twenty-fourth annual conference on uncertainty in artificial intelligence (UAI-08). AUAI Press, Corvallis, pp 366–374

  27. Quackenbush J (2007) Extracting biology from high-dimensional biological data. J Exp Biol 210(9):1507–1517

  28. Bühlmann P, van de Geer S (2011) Statistics for high-dimensional data – methods, theory and applications. Springer, Berlin

  29. Verzelen N (2012) Minimax risks for sparse regressions: ultra-high dimensional phenomenons. Electron J Stat 6:38–90

  30. Giraud C (2014) Introduction to high-dimensional statistics. Chapman & Hall/CRC, New York

  31. Oates CJ, Dondelinger F, Bayani N, Korkola J, Gray JW, Mukherjee S (2014) Causal network inference using biochemical kinetics. Bioinformatics 30(17):i468–i474

  32. Shojaie A, Michailidis G (2010) Discovering graphical Granger causality using the truncating lasso penalty. Bioinformatics 26(18):i517–i523

  33. Rau A, Jaffrézic F, Foulley JL, Doerge RW (2010) An empirical Bayesian method for estimating biological networks from temporal microarray data. Stat Appl Genet Mol Biol 9:1

  34. Marchand G, Huynh-Thu VA, Kane NC, Arribat S, Varès D, Rengel D, Balzergue S, Rieseberg LH, Vincourt P, Geurts P, Vignes M, Langlade NB (2014) Bridging physiological and evolutionary time-scales in a gene regulatory network. New Phytol 203(2):685–696

  35. Chandrasekaran V, Parrilo PA, Willsky AS (2012) Latent variable graphical model selection via convex optimization. Ann Stat 40(4):1935–1967

  36. Blanchet J, Vignes M (2009) A model-based approach to gene clustering with missing observation reconstruction in a Markov random field framework. J Comput Biol 16(3):475–486

  37. Colombo D, Maathuis MH, Kalisch M, Richardson TS (2012) Learning high-dimensional directed acyclic graphs with latent and selection variables. Ann Stat 40(1):294–321

  38. Fusi N, Stegle O, Lawrence ND (2012) Joint modelling of confounding factors and prominent genetic regulators provides increased accuracy in genetical genomics studies. PLoS Comput Biol 8(1):1–9

  39. Sadeh MJ, Moffa G, Spang R (2013) Considering unknown unknowns: reconstruction of nonconfoundable causal relations in biological networks. J Comput Biol 20(11):920–932

  40. Mooij JM, Janzing D, Heskes T, Schölkopf B (2011) On causal discovery with cyclic additive noise models. In: Shawe-Taylor J, Zemel RS, Bartlett PL, Pereira F, Weinberger KQ (eds) Advances in neural information processing systems, vol 24. Curran Associates Inc., Red Hook, pp 639–647

  41. de Jong H (2002) Modeling and simulation of genetic regulatory systems: a literature review. J Comput Biol 9(1):67–103

  42. Markowetz F, Spang R (2007) Inferring cellular networks – a review. BMC Bioinf 8(6):S5

  43. Lee WP, Tzou WS (2009) Computational methods for discovering gene networks from expression data. Brief Bioinform 10(4):408–423

  44. Emmert-Streib F, Glazko G, Gökmen A, De Matos Simoes R (2012) Statistical inference and reverse engineering of gene regulatory networks from observational expression data. Front Genet 3:8

  45. Maathuis MH, Kalisch M, Bühlmann P (2009) Estimating high-dimensional intervention effects from observational data. Ann Stat 37(6A):3133–3164

  46. Oates CJ, Mukherjee S (2012) Network inference and biological dynamics. Ann Appl Stat 6(3):1209–1235

  47. Fu F, Zhou Q (2013) Learning sparse causal Gaussian networks with experimental intervention: regularization and coordinate descent. J Am Stat Assoc 108(501):288–300

  48. Werhli AV, Grzegorczyk M, Husmeier D (2006) Comparative evaluation of reverse engineering gene regulatory networks with relevance networks, graphical Gaussian models and Bayesian networks. Bioinformatics 22(20):2523–2531

  49. Altay G, Emmert-Streib F (2010) Revealing differences in gene network inference algorithms on the network level by ensemble methods. Bioinformatics 26(14):1738–1744

  50. Emmert-Streib F, Altay G (2010) Local network-based measures to assess the inferability of different regulatory networks. IET Syst Biol 4:277–288

  51. Marbach D, Prill RJ, Schaffter T, Mattiussi C, Floreano D, Stolovitzky G (2010) Revealing strengths and weaknesses of methods for gene network inference. Proc Natl Acad Sci 107(14):6286–6291

  52. Marbach D, Costello JC, Küffner R, Vega NM, Prill RJ, Camacho DM, Allison KR, Consortium TD, Kellis M, Collins JJ, Stolovitzky G (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9(8):796–804

  53. Meyer P, Cokelaer T, Chandran D, Kim KH, Loh PR, Tucker G, Lipson M, Berger B, Kreutz C, Raue A, Steiert B, Timmer J, Bilal E, Sauro HM, Stolovitzky G, Saez-Rodriguez J (2014) Network topology and parameter estimation: from experimental design methods to gene regulatory network kinetics using a community based approach. BMC Syst Biol 8(1):13

  54. Hill SM, Heiser LM, Cokelaer T, Unger M, Nesser NK, Carlin DE, Zhang Y, Sokolov A, Paull EO, Wong CK, Graim K, Bivol A, Wang H, Zhu F, Afsari B, Danilova LV, Favorov AV, Lee WS, Taylor D, Hu CW, Long BL, Noren DP, Bisberg AJ, HPN-DREAM Consortium, Mills GB, Gray JW, Kellen M, Norman T, Friend S, Qutub AA, Fertig EJ, Guan Y, Song M, Stuart JM, Spellman PT, Koeppl H, Stolovitzky G, Saez-Rodriguez J, Mukherjee S (2016) Inferring causal molecular networks: empirical assessment through a community-based effort. Nat Methods 13(4):310–318

  55. Allouche D, Cierco-Ayrolles C, de Givry S, Guillermin G, Mangin B, Schiex T, Vandel J, Vignes M (2013) A panel of learning methods for the reconstruction of gene regulatory networks in a systems genetics context. Springer, Berlin, pp 9–31

  56. Bontempi G, Haibe-Kains B, Desmedt C, Sotiriou C, Quackenbush J (2011) Multiple-input multiple-output causal strategies for gene selection. BMC Bioinf 12(1):458

  57. Engelmann JC, Amann T, Ott-Rötzer B, Nützel M, Reinders Y, Reinders J, Thasler WE, Kristl T, Teufel A, Huber CG, Oefner PJ, Spang R, Hellerbrand C (2015) Causal modeling of cancer-stromal communication identifies PAPPA as a novel stroma-secreted factor activating NFκB signaling in hepatocellular carcinoma. PLoS Comput Biol 11(5):1–22

  58. Sachs K, Perez O, Pe’er D, Lauffenburger DA, Nolan GP (2005) Causal protein-signaling networks derived from multiparameter single-cell data. Science 308(5721):523–529

  59. Ness RO, Sachs K, Vitek O (2016) From correlation to causality: Statistical approaches to learning regulatory relationships in large-scale biomolecular investigations. J Proteome Res 15(3):683–690

  60. Gagneur J, Stegle O, Zhu C, Jakob P, Tekkedil MM, Aiyar RS, Schuon AK, Pe’er D, Steinmetz LM (2013) Genotype-environment interactions reveal causal pathways that mediate genetic effects on phenotype. PLoS Genet 9(9):e1003803

  61. Maathuis MH, Colombo D, Kalisch M, Bühlmann P (2010) Predicting causal effects in large-scale systems from observational data. Nat Methods 7(4):247–248

  62. Taruttis F, Spang R, Engelmann JC (2015) A statistical approach to virtual cellular experiments: improved causal discovery using accumulation IDA (aida). Bioinformatics 31(23):3807–3814

  63. Michailidis G, d’Alché Buc F (2013) Autoregressive models for gene regulatory network inference: sparsity, stability and causality issues. Math Biosci 246(2):326–334

  64. Werhli AV, Husmeier D (2007) Reconstructing gene regulatory networks with Bayesian networks by combining expression data with multiple sources of prior knowledge. Stat Appl Genet Mol Biol 6:1

  65. Mordelet F, Vert JP (2008) SIRENE: supervised inference of regulatory networks. Bioinformatics 24(16):i76–i82

  66. Eberhardt F, Glymour C, Scheines R (2005) On the number of experiments sufficient and in the worst case necessary to identify all causal relations among n variables. In: Proceedings of the twenty-first conference on uncertainty in artificial intelligence, UAI’05. AUAI Press, Arlington, pp 178–184

  67. Hauser A, Bühlmann P (2014) Two optimal strategies for active learning of causal models from interventional data. Int J Approx Reason 55(4):926–939. Special issue on the sixth European Workshop on Probabilistic Graphical Models

  68. Meinshausen N, Hauser A, Mooij JM, Peters J, Versteeg P, Bühlmann P (2016) Methods for causal inference from gene perturbation experiments and validation. Proc Natl Acad Sci 113(27):7361–7368

  69. Mooij JM, Peters J, Janzing D, Zscheischler J, Schölkopf B (2016) Distinguishing cause from effect using observational data: methods and benchmarks. J Mach Learn Res 17(32):1–102

  70. Athey S, Imbens G (2016) Recursive partitioning for heterogeneous causal effects. Proc Natl Acad Sci USA 113(27):7353–7360

  71. Chen G, Larsen P, Almasri E, Dai Y (2008) Rank-based edge reconstruction for scale-free genetic regulatory networks. BMC Bioinf 9(1):75

  72. Agrawal H (2002) Extreme self-organization in networks constructed from gene expression data. Phys Rev Lett 89:268702

  73. Opgen-Rhein R, Strimmer K (2007) From correlation to causation networks: a simple approximate learning algorithm and its application to high-dimensional plant gene expression data. BMC Syst Biol 1(1):37

  74. Xiong M, Li J, Fang X (2004) Identification of genetic networks. Genetics 166(2):1037–1052

  75. Friedman N, Linial M, Nachman I, Pe’er D (2000) Using Bayesian networks to analyze expression data. J Comput Biol 7(3–4):601–620

  76. Spirtes P (2005) Graphical models, causal inference, and econometric models. J Econ Methodol 12(1):3–34

  77. Pearl J (2009) Causal inference in statistics: an overview. Stat Surv 3(1):96–146

  78. Hume D (1738–1740) A treatise of human nature. John Noon, London

  79. Wright S (1921) Correlation and causation. J Agric Res 20(7):557–585

  80. Neyman J (1990) On the application of probability theory to agricultural experiments. Essay on principles. Section 9 (translated and edited by D. M. Dabrowska and T. P. Speed from the Polish original, which appeared in Roczniki Nauk Rolniczych Tom X (1923) 1–51 (Annals of Agricultural Science)). Stat Sci 5(4):465–472

  81. Fisher RA (1925) Statistical methods for research workers. Oliver & Boyd, Edinburgh

  82. Rubin D (1974) Estimating causal effects of treatments in randomized and non-randomized studies. J Educ Psychol 66:688–701

  83. Holland PW (1986) Statistics and causal inference. J Am Stat Assoc 81(396):945–960

  84. Wainer H (2014) Visual revelations: happiness and causal inference. Chance 27(4):61–64

  85. Bottou L, Peters J, Quiñonero Candela J, Charles DX, Chickering DM, Portugaly E, Ray D, Simard P, Snelson E (2013) Counterfactual reasoning and learning systems: the example of computational advertising. J Mach Learn Res 14:3207–3260

  86. Dawid AP (2000) Causal inference without counterfactuals. J Am Stat Assoc 95(450):407–424

  87. Tan Z (2006) Regression and weighting methods for causal inference using instrumental variables. J Am Stat Assoc 101(476):1607–1618

  88. Bollen KA (1989) Structural equations with latent variables. Wiley, New York

  89. Tarka P (2017) An overview of structural equation modeling: its beginnings, historical development, usefulness and controversies in the social sciences. Qual Quant 52:313–354

  90. Robins JM (1987) A graphical approach to the identification and estimation of causal parameters in mortality studies with sustained exposure periods. J Chronic Dis 40(Suppl 2):139S–161S

  91. Lopez-Paz D, Muandet K, Schölkopf B, Tolstikhin I (2015) Towards a learning theory of cause-effect inference. In: Bach F, Blei D (eds) Proceedings of the 32nd international conference on machine learning, Lille, Proceedings of machine learning research, vol 37, pp 1452–1461

  92. Suppes P (1970) A Probabilistic theory of causality. North-Holland, Amsterdam

  93. Eells E (1991) Probabilistic causality. Cambridge University Press, Cambridge

  94. Buchsbaum D, Bridgers S, Skolnick Weisberg D, Gopnik A (2012) The power of possibility: causal learning, counterfactual reasoning, and pretend play. Philos Trans R Soc B Biol Sci 367(1599):2202–2212

  95. Greenland S, Pearl J, Robins JM (1999) Causal diagrams for epidemiologic research. Epidemiology 10(1):37–45

  96. Verkuyten M, Thijs J (2002) School satisfaction of elementary school children: the role of performance, peer relations, ethnicity and gender. Soc Indic Res 59(2):203–228

  97. Cardenas IC, Voordijk H, Dewulf G (2017) Beyond theory: towards a probabilistic causation model to support project governance in infrastructure projects. Int J Proj Manag 35(3):432–450

  98. Gupta S, Kim HW (2008) Linking structural equation modeling to Bayesian networks: decision support for customer retention in virtual communities. Eur J Oper Res 190(3):818–833

  99. Kleinberg S, Hripcsak G (2011) A review of causal inference for biomedical informatics. J Biomed Inform 44(6):1102–1112

  100. Martin W (2014) Making valid causal inferences from observational data. Prev Vet Med 113(3):281–297. Special Issue: Schwabe Symposium 2012

  101. Wu R, Casella G (2010) Statistical genetics - associating genotypic differences with measurable outcomes. In: Tanur J (ed) Statistics: a guide to the unknown, pp 243–254. Holden-Day, San Francisco

  102. Frommlet F, Bogdan M, Ramsey D (2016) Phenotypes and genotypes. Springer, Berlin

  103. Rakitsch B, Stegle O (2016) Modelling local gene networks increases power to detect trans-acting genetic effects on gene expression. Genome Biol 17(1):33

  104. Brazhnik P, de la Fuente A, Mendes P (2002) Gene networks: how to put the function in genomics. Trends Biotechnol 20(11):467–472

  105. Hu H, Li Z, Vetta AR (2014) Randomized experimental design for causal graph discovery. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ (eds) Advances in neural information processing systems, vol 27. Curran Associates, Inc., Red Hook, pp 2339–2347

  106. Guyon I, Janzing D, Schölkopf B (2010) Causality: objectives and assessment. In: Guyon I, Janzing D, Schölkopf B (eds) Proceedings of workshop on causality: objectives and assessment at NIPS 2008, Whistler. Proceedings of machine learning research, vol 6, pp 1–42

  107. Djordjevic D, Yang A, Zadoorian A, Rungrugeecharoen K, Ho JW (2014) How difficult is inference of mammalian causal gene regulatory networks? PLoS ONE 9(11):1–10

  108. Anjum S, Doucet A, Holmes CC (2009) A boosting approach to structure learning of graphs with and without prior knowledge. Bioinformatics 25(22):2929–2936

  109. Deng M, Emad A, Milenkovic O (2012) Causal compressive sensing for gene network inference. In: 2012 IEEE statistical signal processing workshop, SSP 2012, pp 696–699

  110. Krouk G, Lingeman J, Colon AM, Coruzzi G, Denis S (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14(6):123

  111. Dondelinger F, Lèbre S, Husmeier D (2013) Non-homogeneous dynamic Bayesian networks with Bayesian regularization for inferring gene regulatory networks with gradually time-varying structure. Mach Learn 90(2):191–230

  112. Cai X, Bazerque JA, Giannakis GB (2013) Inference of gene regulatory networks with sparse structural equation models exploiting genetic perturbations. PLoS Comput Biol 9(5):1–13

  113. Rau A, Jaffrézic F, Nuel G (2013) Joint estimation of causal effects from observational and intervention gene expression data. BMC Syst Biol 7(1):111

  114. Monneret G, Jaffrézic F, Rau A, Zerjal T, Nuel G (2017) Identification of marginal causal relationships in gene networks from observational and interventional expression data. PLoS ONE 12(3):1–13

  115. Liu B, de la Fuente A, Hoeschele I (2008) Gene network inference via structural equation modeling in genetical genomics experiments. Genetics 178(3):1763–1776

  116. Tasaki S, Sauerwine B, Hoff B, Toyoshiba H, Gaiteri C, Chaibub Neto E (2015) Bayesian network reconstruction using systems genetics data: comparison of MCMC methods. Genetics 199(4):973–989

  117. Kalisch M, Mächler M, Colombo D, Maathuis MH, Bühlmann P (2012) Causal inference using graphical models with the R package pcalg. J Stat Softw 47:11

  118. Koski TJ, Noble JM (2012) A review of Bayesian networks and structure learning. Math Appl 40(1):53–103

  119. Hendriksen JMT, Geersing GJ, Moons KGM, de Groot JAH (2013) Diagnostic and prognostic prediction models. J Thromb Haemost 11:129–141

  120. Dawid AP, Musio M, Fienberg SE (2016) From statistical evidence to evidence of causality. Bayesian Anal 11(3):725–752

  121. Sebastiani P, Milton J, Wang L (2011) Designing microarray experiments. Springer, Boston, pp 271–290

  122. Bühlmann P, Kalisch M, Meier L (2014) High-dimensional statistics with a view toward applications in biology. Ann Rev Stat Appl 1(1):255–278

  123. Spirtes P, Glymour CN, Scheines R (2000) Causation, prediction, and search, adaptive computation and machine learning, 2nd edn. The MIT Press, Cambridge. With additional material by David Hecke. A Bradford Book

  124. Pearl J (1988) Probabilistic reasoning in intelligent systems: networks of plausible inference. The Morgan Kaufmann series in representation and reasoning. Morgan Kaufmann, San Francisco

  125. Koller D, Pfeffer A (1997) Object-oriented Bayesian networks. In: Proceedings of the thirteenth conference on uncertainty in artificial intelligence, UAI’97. Morgan Kaufmann, San Francisco, pp 302–313

  126. Marsland S (2015) Machine learning: an algorithmic perspective, 2nd edn. Chapman & Hall/CRC machine learning & pattern recognition series. CRC Press, Boca Raton

  127. Tsamardinos I, Aliferis CF, Statnikov AR, Statnikov E (2003) Algorithms for large scale Markov blanket discovery. In: FLAIRS conference, vol 2, pp 376–380

  128. Friedman N, Nachman I, Pe’er D (1999) Learning Bayesian network structure from massive datasets: the sparse candidate algorithm. In: Proceedings of the fifteenth conference on uncertainty in artificial intelligence. Morgan Kaufmann, San Francisco, pp 206–215

  129. Brown LE, Tsamardinos I, Aliferis CF (2004) A novel algorithm for scalable and accurate Bayesian network learning. In: Proceedings of 11th World Congress in Medical Informatics (MEDINFO 04), vol 107, pp 711–715

  130. Fu LD, Tsamardinos I (2005) A comparison of Bayesian network learning algorithms from continuous data. In: AMIA annual symposium proceedings, vol 960

  131. Vignes M, Vandel J, Allouche D, Ramadan-Alban N, Cierco-Ayrolles C, Schiex T, Mangin B, de Givry S (2011) Gene regulatory network reconstruction using Bayesian networks, the Dantzig selector, the Lasso and their meta-analysis. PLoS ONE 6(12):1–15

  132. Qi X, Shi Y, Wang H, Gao Y (2016) Grouping parallel Bayesian network structure learning algorithm based on variable ordering. In: Yin H, Gao Y, Li B, Zhang D, Yang M, Li Y, Klawonn F, Tallón-Ballesteros AJ (eds) Intelligent data engineering and automated learning – IDEAL 2016. Springer International Publishing, Cham, pp 405–415

  133. Mengshoel OJ (2010) Understanding the scalability of Bayesian network inference using clique tree growth curves. Artif Intell 174(12):984–1006

  134. De Campos CP (2011) New complexity results for map in Bayesian networks. In: Proceedings of the twenty-second international joint conference on artificial intelligence, vol 3, IJCAI’11. AAAI Press, Menlo Park, pp 2100–2106

  135. Cooper GF (1990) The computational complexity of probabilistic inference using Bayesian belief networks. Artif Intell 42(2):393–405

  136. Handa H, Katai O (2003) Estimation of Bayesian network algorithm with GA searching for better network structure. In: Proceedings of the 2003 international conference on neural networks and signal processing, vol 1, pp 436–439

  137. Malone B, Yuan C, Hansen EA, Bridges S (2011) Improving the scalability of optimal Bayesian network learning with external-memory frontier breadth-first branch and bound search. In: Proceedings of the twenty-seventh conference on uncertainty in artificial intelligence, UAI’11. AUAI Press, Arlington, pp 479–488

  138. Adabor ES, Acquaah-Mensah GK, Oduro FT (2015) SAGA: a hybrid search algorithm for Bayesian network structure learning of transcriptional regulatory networks. J Biomed Inform 53:27–35

  139. Nikolova O, Aluru S (2012) Parallel Bayesian network structure learning with application to gene networks. In: 2012 International conference for high performance computing, networking, storage and analysis (SC), pp 1–9

  140. Madsen AL, Jensen F, Salmerón A, Langseth H, Nielsen TD (2017) A parallel algorithm for Bayesian network structure learning from large data sets. Knowl Based Syst 117:46–55

  141. Thibault G, Aussem A, Bonnevay S (2009) Incremental Bayesian network learning for scalable feature selection. In: Adams NM, Robardet C, Siebes A, Boulicaut JF (eds) Advances in intelligent data analysis VIII. Springer, Berlin, pp 202–212

  142. Stegle O, Janzing D, Zhang K, Mooij JM, Schölkopf B (2010) Probabilistic latent variable models for distinguishing between cause and effect. In: Lafferty JD, Williams CKI, Shawe-Taylor J, Zemel RS, Culotta A (eds) Advances in neural information processing systems, vol 23. Curran Associates, Inc., Red Hook, pp 1687–1695

  143. He Y, Jia J, Yu B (2013) Reversible MCMC on Markov equivalence classes of sparse directed acyclic graphs. Ann Stat 41(4):1742–1779

  144. Peters J, Bühlmann P (2015) Structural intervention distance for evaluating causal graphs. Neural Comput 27(3):771–779

  145. Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464

  146. Huynh-Thu VA, Irrthum A, Wehenkel L, Geurts P (2010) Inferring regulatory networks from expression data using tree-based methods. PLoS ONE 5:1–10

  147. Vandel J, Mangin B, Vignes M, Leroux D, Loudet O, Martin-Magniette ML, De Givry S (2012) Gene regulatory network inference with extended scores for Bayesian networks. Revue d’Intelligence Artificielle 26(6):679–708

  148. Chiquet J, Smith A, Grasseau G, Matias C, Ambroise C (2009) SIMoNe: Statistical Inference for MOdular NEtworks. Bioinformatics 25(3):417–418

  149. Vallat L, Kemper CA, Jung N, Maumy-Bertrand M, Bertrand F, Meyer N, Pocheville A, Fisher JW, Gribben JG, Bahram S (2013) Reverse-engineering the genetic circuitry of a cancer cell with predicted intervention in chronic lymphocytic leukemia. Proc Natl Acad Sci 110(2):459–464

Acknowledgements

Alex White was partly supported by a Summer Scholarship grant from the Institute of Fundamental Sciences at Massey University (NZ). This work was also supported by a visiting professor scholarship from Aix-Marseille University granted to Matthieu Vignes in 2017. The material in this chapter slowly took shape after many discussions with a multitude of colleagues from quantitative fields and with biologists during projects we were involved in, e.g., [34, 131, 147], but most of our inspiration stems from other works we read and discussed (e.g., [60, 113, 146, 148, 149], to cite only a few). We are very appreciative of the many discussions with Stephen Marsland (Victoria University of Wellington, NZ). Lastly, we are very grateful to the reviewers of this chapter for their insightful comments.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

White, A., Vignes, M. (2019). Causal Queries from Observational Data in Biological Systems via Bayesian Networks: An Empirical Study in Small Networks. In: Sanguinetti, G., Huynh-Thu, V. (eds) Gene Regulatory Networks. Methods in Molecular Biology, vol 1883. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8882-2_5

  • DOI: https://doi.org/10.1007/978-1-4939-8882-2_5

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-8881-5

  • Online ISBN: 978-1-4939-8882-2

  • eBook Packages: Springer Protocols
