Directed Acyclic Graphs

Foraita, Ronja; Spallek, Jacob; Zeeb, Hajo

doi:10.1007/978-0-387-09834-0_65

Abstract

A directed acyclic graph (DAG) can be thought of as a kind of flowchart that visualizes a whole causal etiological network, linking causes and effects. In epidemiology, the terms causal graph, causal diagram, and DAG are used synonymously (Greenland et al. 1999). DAGs are considered useful for embedding causal reasoning in a formal framework (Hernán and Robins 2006; Robins 2001; Hernán et al. 2004). In probability theory, DAGs are understood somewhat differently, as discussed later in the chapter. This chapter aims to demonstrate how DAGs can help to formalize the search for answers to different research questions in epidemiology.

References

Andersson SA, Madigan D, Perlman MD (1997) A characterization of Markov equivalence classes for acyclic digraphs. Ann Stat 25:505–541
Berkson J (1946) Limitations of the application of fourfold tables to hospital data. Biom Bull 2:47–53
Bishop CM (2007) Pattern recognition and machine learning. Springer, New York
Borsuk ME (2008) Bayesian networks. In: Jørgensen SE, Fath B (eds) Encyclopedia of ecology. Elsevier, Burlington, pp 307–317
Bøttcher SG, Dethlefsen C (2011) Deal: learning Bayesian networks with mixed variables. http://CRAN.R-project.org/package=deal. R package version 1.2-34
Breitling L (2010) dagR: a suite of R functions for directed acyclic graphs. Epidemiology 21:586–587
Chickering D, Meek C (2002) Finding optimal Bayesian networks. In: Darwiche A, Friedman N (eds) Proceedings of the eighteenth annual conference on uncertainty in artificial intelligence (UAI-02). Morgan Kaufmann, San Francisco, pp 94–102
Chickering DM (1996) Learning Bayesian networks is NP-complete. In: Fisher D, Lenz HJ (eds) Learning from data: artificial intelligence and statistics V. Lecture notes in statistics, vol 112. Springer, New York, pp 121–130
Chickering DM, Heckerman D, Meek C (2004) Large-sample learning of Bayesian networks is NP-hard. J Mach Learn Res 5:1287–1330
Cobb BR, Rumí R, Salmerón A (2007) Bayesian network models with discrete and continuous variables. In: Lucas P, Gámez JA, Salmerón A (eds) Advances in probabilistic graphical models. Studies in fuzziness and soft computing, vol 213. Springer, Berlin, pp 81–102
Cooper GF, Herskovits E (1992) A Bayesian method for the induction of probabilistic networks from data. Mach Learn 9:309–347
Cowell RG, Dawid AP, Lauritzen SL, Spiegelhalter DJ (1999) Probabilistic networks and expert systems. Information science and statistics. Springer, New York
Dagum P, Luby M (1993) Approximating probabilistic inference in Bayesian belief networks is NP-hard. Artif Intell 60:141–154
Daly R, Shen Q, Aitken S (2011) Learning Bayesian networks: approaches and issues. Knowl Eng Rev 26:99–157
Darwiche A (2009) Modeling and reasoning with Bayesian networks. Cambridge University Press, Cambridge
Darwiche A (2010) Bayesian networks. Commun ACM 53:80–90
Dawid AP (2010a) Beware of the DAG! JMLR workshop Conf Proc 6:59–86
Dawid AP (2010b) Seeing and doing: the Pearlian synthesis. In: Dechter R, Geffner H, Halpern JY (eds) Heuristics, probability and causality: a tribute to Judea Pearl. College Publications, London, pp 309–325
Dethlefsen C, Højsgaard S (2005) A common platform for graphical models in R: the gRbase package. J Stat Softw 14:1–12
Didelez V, Sheehan NA (2007) Mendelian randomisation: why epidemiology needs a formal language for causality. In: Russo F, Williamson J (eds) Causality and probability in the sciences. Texts in philosophy, vol 5. College Publications, London, pp 263–292
Fast A, Hay M, Jensen D (2008) Improving accuracy of constraint-based structure learning. Technical Report 08-48, Computer Science Department, University of Massachusetts Amherst
Friedman N (1997) Learning belief networks in the presence of missing values and hidden variables. In: Fisher DH (ed) Proceedings of the fourteenth international conference on machine learning (ICML ’97). Morgan Kaufmann, San Francisco, pp 125–133
Friedman N (2004) Inferring cellular networks using probabilistic graphical models. Science 303:799–805
Friedman N, Goldszmidt M, Wyner A (1999a) Data analysis with Bayesian networks: a bootstrap approach. In: Prade H, Laskey K (eds) Proceedings of the fifteenth annual conference on uncertainty in artificial intelligence (UAI-99). Morgan Kaufmann, San Francisco, pp 196–205
Friedman N, Goldszmidt M, Wyner A (1999b) On the application of the bootstrap for computing confidence measures on features of induced Bayesian networks. In: Heckerman D, Whittaker J (eds) Proceedings of the seventh international workshop on artificial intelligence and statistics. Morgan Kaufmann, San Francisco, pp 197–202
Geiger D, Heckerman D, King H, Meek C (2001) Stratified exponential families: graphical models and model selection. Ann Stat 29:505–529
Geneletti S, Mason A, Best N (2011) Adjusting for selection effects in epidemiologic studies: why sensitivity analysis is the only “solution”. Epidemiology 22:36–39
Getoor L, Rhee JT, Koller D, Small P (2004) Understanding tuberculosis epidemiology using structured statistical models. Artif Intell Med 30:233–256
Gilks WR, Richardson T, Spiegelhalter D (1996) Markov Chain Monte Carlo in practice. Chapman & Hall, Boca Raton
Glover F (1989) Tabu search – part I. ORSA J Comput 1:190–206
Glover F (1990) Tabu search – part II. ORSA J Comput 2:4–32
Glymour C, Scheines R, Spirtes P, Ramsey J (2012) TETRAD project. http://www.phil.cmu.edu/projects/tetrad/. Accessed 15 Aug 2012
Glymour MM (2006) Using causal diagrams to understand common problems in social epidemiology. In: Oakes J, Kaufmann J (eds) Methods in social epidemiology. Jossey-Bass, San Francisco, pp 393–428
Glymour MM, Greenland S (2008) Causal diagrams. In: Rothman K, Greenland S, Lash T (eds) Modern epidemiology, 3rd edn. Lippincott Williams & Wilkins, Philadelphia, pp 183–209
Greenland S, Brumback B (2002) An overview of relations among causal modelling methods. Int J Epidemiol 31:1030–1037
Greenland S, Pearl J, Robins JM (1999) Causal diagrams for epidemiologic research. Epidemiology 10:37–48
Heckerman D (1999) A tutorial on learning with Bayesian networks. In: Jordan M (ed) Learning in graphical models. MIT, Cambridge, pp 301–354
Heckerman D, Geiger D, Chickering DM (1995) Learning Bayesian networks: the combination of knowledge and statistical data. Mach Learn 20:197–243
Hernán MA, Robins JM (2006) Instruments for causal inference: an epidemiologist’s dream? Epidemiology 17:360–372
Hernán MA, Hernández-Díaz S, Werler MM, Mitchell AA (2002) Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol 155:176–184
Hernán MA, Hernández-Díaz S, Robins JM (2004) A structural approach to selection bias. Epidemiology 15:615–625
Højsgaard S (2012) Graphical independence networks with the gRain package for R. J Stat Softw 46:1–26
Højsgaard S, Edwards D, Lauritzen SL (2012) Graphical models with R. Springer, New York
Husmeier D (2003) Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics 19:2271–2282
Husmeier D (2005) Probabilistic modeling in bioinformatics and medical informatics. Springer, London
Imoto S, Goto T, Miyano S (2002) Estimation of genetic networks and functional structures between genes by using Bayesian networks and nonparametric regression. Pac Symp Biocomput 7:175–186
Imoto S, Kim S, Goto T, Miyano S, Aburatani S, Tashiro K, Kuhara S (2003) Bayesian network and nonparametric heteroscedastic regression for nonlinear modeling of genetic network. J Bioinform Comput Biol 1:231–252
Jensen FV, Nielsen TD (2007) Bayesian networks and decision graphs. Springer, New York
Kalisch M, Bühlmann P (2007) Estimating high-dimensional directed acyclic graphs with the PC-algorithm. J Mach Learn Res 8:613–636
Kalisch M, Mächler M, Colombo D, Maathuis MH, Bühlmann P (2012) Causal inference using graphical models with the R package pcalg. J Stat Softw 47:1–26
Kirkpatrick S, Gelatt CDJ, Vecchi MP (1983) Optimization by simulated annealing. Science 220:671–680
Kjærulff UB, Madsen AL (2008) Bayesian networks and influence diagrams: a guide to construction and analysis. Springer, New York
Knüppel S (2011) DAG program. http://epi.dife.de/dag/. Accessed 3 Oct 2012
Knüppel S, Stang A (2010) DAG program: identifying minimal sufficient adjustment sets. Epidemiology 21:159
Koller D, Friedman N (2009) Probabilistic graphical models: principles and techniques. MIT, Cambridge
Korb KB, Nicholson AE (2011) Bayesian artificial intelligence. 2nd edn. CRC, Boca Raton
Lauritzen SL (1990) Graphical models. Clarendon, Oxford
Lauritzen SL (1992) Propagation of probabilities, means, and variances in mixed graphical association models. J Am Stat Assoc 87:1098–1108
Lauritzen SL (1995) The EM algorithm for graphical association models with missing data. Comput Stat Data An 19:191–201
Lauritzen SL, Spiegelhalter DJ (1988) Local computations with probabilities on graphical structures and their application to expert systems. J Roy Stat Soc B 50:157–224
Lauritzen SL, Dawid AP, Larsen BN, Leimer HG (1990) Independence properties of directed Markov fields. Networks 20:491–505
Li J, Wang ZJ (2009) Controlling the false discovery rate of the association/causality structure learned with the PC algorithm. J Mach Learn Res 10:475–514
Liu Z, Malone B, Yuan C (2012) Empirical evaluation of scoring functions for Bayesian network model selection. BMC Bioinform 13:S14
Lunn D, Spiegelhalter D, Thomas A, Best N (2009) The BUGS project: evolution, critique and future directions. Stat Med 28:3049–3067
Madsen AL, Lang M, Kjærulff UB, Jensen F (2003) The Hugin tool for learning Bayesian networks. In: Nielsen TD, Zhang NL (eds) Symbolic and quantitative approaches to reasoning with uncertainty. Lecture notes in computer science, vol 2711. Springer, Berlin, pp 594–605
Markowetz F, Spang R (2007) Inferring cellular networks – a review. BMC Bioinform 8(Suppl 6):S5
Moral S, Rumí R, Salmerón A (2001) Mixtures of truncated exponentials in hybrid Bayesian networks. In: Benferhat S, Besnard P (eds) Symbolic and quantitative approaches to reasoning with uncertainty. Lecture notes in computer science, vol 2143. Springer, Berlin, pp 156–167
Murphy K (2007) Software for graphical models: a review. ISBA Bull 14:13–15
Murphy K (2012) Software packages for graphical models/Bayesian networks. http://www.cs.ubc.ca/~murphyk/Software/bnsoft.html. Accessed 15 Aug 2012
Nadathur SG, Warren JR (2011) Emergency department triaging of admitted stroke patients – a Bayesian network analysis. Health Inform J 17:294–312
Nguefack-Tsague G (2011) Using Bayesian networks to model hierarchical relationships in epidemiological studies. Epidemiol Health 33:e2011006
Pearl J (2009) Causality – models, reasoning and inference. 2nd edn. Cambridge University Press, Cambridge
R Development Core Team (2012) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. http://www.R-project.org/. Accessed 15 Aug 2012
Ramsey J (2010) Bootstrapping the PC and CPC algorithms to improve search accuracy. Tech Rep 101, Department of Philosophy, Carnegie Mellon University. http://repository.cmu.edu/philosophy/101. Accessed 15 Aug 2012
Ramsey J, Zhang J, Spirtes P (2006) Adjacency-faithfulness and conservative causal inference. In: Proceedings of the twenty-second annual conference on uncertainty in artificial intelligence (UAI-06). AUAI, Arlington, pp 401–408
Robins JM (2001) Data, design, and background knowledge in etiologic inference. Epidemiology 12:313–320
Robins JM, Blevins D, Ritter G, Wulfsohn M (1992) G-estimation of the effect of prophylaxis therapy for Pneumocystis carinii pneumonia on the survival of AIDS patients. Epidemiology 3:319–336
Robins JM, Hernán MA, Brumback B (2000) Marginal structural models and causal inference in epidemiology. Epidemiology 11:550–560
Robins JM, Scheines R, Spirtes P, Wasserman L (2003) Uniform consistency in causal inference. Biometrika 90:491–515
Robinson R (1977) Counting unlabeled acyclic digraphs. In: Little H (ed) Combinatorial mathematics V. Lecture notes in mathematics, vol 622. Springer, Berlin, pp 28–43
Rothman KJ (1976) Causes. Am J Epidemiol 104:587–592
Rothman KJ, Greenland S, Lash T (2008) Modern epidemiology. 3rd edn. Lippincott Williams & Wilkins, Philadelphia
Rubin D (1974) Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 66:688–701
Scutari M (2010) Learning Bayesian networks with the bnlearn R package. J Stat Softw 35:1–22
Shenoy PP (2011) A re-definition of mixtures of polynomials for inference in hybrid Bayesian networks. In: Liu W (ed) Symbolic and quantitative approaches to reasoning with uncertainty. Lecture notes in computer science, vol 6717. Springer, Berlin, pp 98–109
Shrier I, Platt RW (2008) Reducing bias through directed acyclic graphs. BMC Med Res Methodol 8:70
Spiegelhalter DJ, Lauritzen SL (1990) Sequential updating of conditional probabilities on directed graphical structures. Networks 20:579–605
Spirtes P, Glymour C (1990) An algorithm for fast recovery of sparse causal graphs. Report CMU-PHIL-15, Department of Philosophy, Carnegie Mellon University
Spirtes P, Meek C, Richardson T (1995) Causal inference in the presence of latent variables and selection bias. In: Besnard P, Hanks S (eds) Proceedings of the eleventh conference on uncertainty in artificial intelligence (UAI-95). Morgan Kaufmann, San Francisco, pp 499–506
Spirtes P, Glymour C, Scheines R (2001) Causation, prediction and search, 2nd edn. MIT, Cambridge
Stefanini FM, Coradini D, Biganzoli E (2009) Conditional independence relations among biological markers may improve clinical decision as in the case of triple negative breast cancers. BMC Bioinform 10(Suppl 12):S13
Textor J (2012) DAGitty v.10. http://www.dagitty.net/. Accessed 3 Oct 2012
Textor J, Hardt J, Knüppel S (2011) DAGitty: a graphical tool for analyzing causal diagrams. Epidemiology 22:745
Tsamardinos I, Brown LE, Aliferis CF (2006) The max-min hill-climbing Bayesian network structure learning algorithm. Mach Learn 65:31–78
VanderWeele TJ, Robins JM (2007a) Directed acyclic graphs, sufficient causes, and the properties of conditioning on a common effect. Am J Epidemiol 166:1096–1104
VanderWeele TJ, Robins JM (2007b) Four types of effect modification: a classification based on directed acyclic graphs. Epidemiology 18:561–568
Verma T, Pearl J (1991) Equivalence and synthesis of causal models. In: Bonissone P, Henrion M, Kanal L, Lemmer J (eds) Proceedings of the sixth conference on uncertainty in artificial intelligence (UAI-90). Elsevier, Amsterdam, pp 258–268
Verma T, Pearl J (1992) An algorithm for deciding if a set of observed independencies has a causal explanation. In: Dubois D, Wellman MP, D’Ambrosio B, Smets P (eds) Proceedings of the eighth conference on uncertainty in artificial intelligence (UAI-92). Morgan Kaufmann, San Mateo, pp 323–330
Wang M, Chen Z, Cloutier S (2007) A hybrid Bayesian network learning method for constructing gene networks. Comput Biol Chem 31:361–372
Weinberg CR (1993) Toward a clearer definition of confounding. Am J Epidemiol 137:1–8
Weinberg CR (2007) Can DAGs clarify effect modification? Epidemiology 18:569–572
Wong ML, Lee SY, Leung KS (2002) A hybrid approach to discover Bayesian networks from databases using evolutionary programming. In: Proceedings of the 2002 IEEE international conference on data mining, ICDM ’02. IEEE Computer Society, Los Alamitos, pp 498–505

Author information

Authors and Affiliations

Department of Biometry and Data Management, Leibniz Institute for Prevention Research and Epidemiology - BIPS, Achterstr. 30, 28359, Bremen, Germany
Ronja Foraita, Jacob Spallek & Hajo Zeeb

Editor information

Editors and Affiliations

Department of Epidemiological Methods and Etiologic Research, Leibniz Institute for Prevention Research and Epidemiology – BIPS, Bremen, Germany
Wolfgang Ahrens
Department of Biometry and Data Management, Leibniz Institute for Prevention Research and Epidemiology – BIPS, Bremen, Germany
Iris Pigeot

About this entry

Cite this entry

Foraita, R., Spallek, J., Zeeb, H. (2014). Directed Acyclic Graphs. In: Ahrens, W., Pigeot, I. (eds) Handbook of Epidemiology. Springer, New York, NY. https://doi.org/10.1007/978-0-387-09834-0_65

DOI: https://doi.org/10.1007/978-0-387-09834-0_65
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-09833-3
Online ISBN: 978-0-387-09834-0
eBook Packages: Medicine; Reference Module Medicine
