Abstract
A directed acyclic graph (DAG) can be thought of as a kind of flowchart that visualizes an entire causal etiological network, linking causes and effects. In epidemiology, the terms causal graph, causal diagram, and DAG are used synonymously (Greenland et al. 1999). DAGs are useful for embedding causal reasoning in a formal causal framework (Hernán and Robins 2006; Robins 2001; Hernán et al. 2004). In probability theory, DAGs are understood somewhat differently, as discussed later in the chapter. This chapter aims to demonstrate how DAGs can help to formalize the search for answers to different research questions in epidemiology.
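As a rough illustration of the "flowchart linking causes and effects" idea, the following minimal Python sketch stores a small causal diagram as a mapping from each variable to its direct causes and checks that it is acyclic. The variable names and the helper `causal_order` are hypothetical and not taken from the chapter; the sketch assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical causal diagram: each key is a variable, each value is the set
# of its direct causes (parents). The names are illustrative assumptions only.
parents = {
    "smoking": set(),
    "genetic_risk": set(),
    "tar_deposits": {"smoking"},
    "lung_cancer": {"tar_deposits", "genetic_risk"},
}

def causal_order(parent_map):
    """Return the variables ordered so that every cause precedes its effects.

    Raises ValueError if the graph contains a directed cycle (not a DAG).
    """
    try:
        # static_order() yields predecessors (causes) before successors.
        return list(TopologicalSorter(parent_map).static_order())
    except CycleError as err:
        # err.args[1] holds the list of nodes forming the detected cycle.
        raise ValueError(f"not a DAG, cycle found: {err.args[1]}") from err

order = causal_order(parents)
print(order)
```

A valid ordering exists only when no variable is, directly or indirectly, a cause of itself, which is the "acyclic" requirement in the name.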
References
Andersson SA, Madigan D, Perlman MD (1997) A characterization of Markov equivalence classes for acyclic digraphs. Ann Stat 25:505–541
Berkson J (1946) Limitations of the application of fourfold tables to hospital data. Biom Bull 2:47–53
Bishop CM (2007) Pattern recognition and machine learning. Springer, New York
Borsuk ME (2008) Bayesian networks. In: Jørgensen SE, Fath B (eds) Encyclopedia of ecology. Elsevier, Burlington, pp 307–317
Bøttcher SG, Dethlefsen C (2011) Deal: learning Bayesian networks with mixed variables. http://CRAN.R-project.org/package=deal. R package version 1.2-34
Breitling L (2010) dagR: a suite of R functions for directed acyclic graphs. Epidemiology 21:586–587
Chickering D, Meek C (2002) Finding optimal Bayesian networks. In: Darwiche A, Friedman N (eds) Proceedings of the eighteenth annual conference on uncertainty in artificial intelligence (UAI-02). Morgan Kaufmann, San Francisco, pp 94–102
Chickering DM (1996) Learning Bayesian networks is NP-complete. In: Fisher D, Lenz HJ (eds) Learning from data: artificial intelligence and statistics V. Lecture notes in statistics, vol 112. Springer, New York, pp 121–130
Chickering DM, Heckerman D, Meek C (2004) Large-sample learning of Bayesian networks is NP-hard. J Mach Learn Res 5:1287–1330
Cobb BR, Rumí R, Salmerón A (2007) Bayesian network models with discrete and continuous variables. In: Lucas P, Gámez JA, Salmerón A (eds) Advances in probabilistic graphical models. Studies in fuzziness and soft computing, vol 213. Springer, Berlin, pp 81–102
Cooper GF, Herskovits E (1992) A Bayesian method for the induction of probabilistic networks from data. Mach Learn 9:309–347
Cowell RG, Dawid AP, Lauritzen SL, Spiegelhalter DJ (1999) Probabilistic networks and expert systems. Information science and statistics. Springer, New York
Dagum P, Luby M (1993) Approximating probabilistic inference in Bayesian belief networks is NP-hard. Artif Intell 60:141–154
Daly R, Shen Q, Aitken S (2011) Learning Bayesian networks: approaches and issues. Knowl Eng Rev 26:99–157
Darwiche A (2009) Modeling and reasoning with Bayesian networks. Cambridge University Press, Cambridge
Darwiche A (2010) Bayesian networks. Commun ACM 53:80–90
Dawid AP (2010a) Beware of the DAG! JMLR Workshop Conf Proc 6:59–86
Dawid AP (2010b) Seeing and doing: the Pearlian synthesis. In: Dechter R, Geffner H, Halpern JY (eds) Heuristics, probability and causality: a tribute to Judea Pearl. College Publications, London, pp 309–325
Dethlefsen C, Højsgaard S (2005) A common platform for graphical models in R: the gRbase package. J Stat Softw 14:1–12
Didelez V, Sheehan NA (2007) Mendelian randomisation: why epidemiology needs a formal language for causality. In: Russo F, Williamson J (eds) Causality and probability in the sciences. Texts in philosophy, vol 5. College Publications, London, pp 263–292
Fast A, Hay M, Jensen D (2008) Improving accuracy of constraint-based structure learning. Technical Report 08-48, Computer Science Department, University of Massachusetts Amherst
Friedman N (1997) Learning belief networks in the presence of missing values and hidden variables. In: Fisher DH (ed) Proceedings of the fourteenth international conference on machine learning (ICML ’97). Morgan Kaufmann, San Francisco, pp 125–133
Friedman N (2004) Inferring cellular networks using probabilistic graphical models. Science 303:799–805
Friedman N, Goldszmidt M, Wyner A (1999a) Data analysis with Bayesian networks: a bootstrap approach. In: Prade H, Laskey K (eds) Proceedings of the fifteenth annual conference on uncertainty in artificial intelligence (UAI-99). Morgan Kaufmann, San Francisco, pp 196–205
Friedman N, Goldszmidt M, Wyner A (1999b) On the application of the bootstrap for computing confidence measures on features of induced Bayesian networks. In: Heckerman D, Whittaker J (eds) Proceedings of the seventh international workshop on artificial intelligence and statistics. Morgan Kaufmann, San Francisco, pp 197–202
Geiger D, Heckerman D, King H, Meek C (2001) Stratified exponential families: graphical models and model selection. Ann Stat 29:505–529
Geneletti S, Mason A, Best N (2011) Adjusting for selection effects in epidemiologic studies: why sensitivity analysis is the only “solution”. Epidemiology 22:36–39
Getoor L, Rhee JT, Koller D, Small P (2004) Understanding tuberculosis epidemiology using structured statistical models. Artif Intell Med 30:233–256
Gilks WR, Richardson T, Spiegelhalter D (1996) Markov Chain Monte Carlo in practice. Chapman & Hall, Boca Raton
Glover F (1989) Tabu search – part I. ORSA J Comput 1:190–206
Glover F (1990) Tabu search – part II. ORSA J Comput 2:4–32
Glymour C, Scheines R, Spirtes P, Ramsey J (2012) TETRAD project. http://www.phil.cmu.edu/projects/tetrad/. Accessed 15 Aug 2012
Glymour MM (2006) Using causal diagrams to understand common problems in social epidemiology. In: Oakes J, Kaufman J (eds) Methods in social epidemiology. Jossey-Bass, San Francisco, pp 393–428
Glymour MM, Greenland S (2008) Causal diagrams. In: Rothman K, Greenland S, Lash T (eds) Modern epidemiology, 3rd edn. Lippincott Williams & Wilkins, Philadelphia, pp 183–209
Greenland S, Brumback B (2002) An overview of relations among causal modelling methods. Int J Epidemiol 31:1030–1037
Greenland S, Pearl J, Robins JM (1999) Causal diagrams for epidemiologic research. Epidemiology 10:37–48
Heckerman D (1999) A tutorial on learning with Bayesian networks. In: Jordan M (ed) Learning in graphical models. MIT, Cambridge, pp 301–354
Heckerman D, Geiger D, Chickering DM (1995) Learning Bayesian networks: the combination of knowledge and statistical data. Mach Learn 20:197–243
Hernán MA, Robins JM (2006) Instruments for causal inference: an epidemiologist’s dream? Epidemiology 17:360–372
Hernán MA, Hernández-Díaz S, Werler MM, Mitchell AA (2002) Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol 155:176–184
Hernán MA, Hernández-Díaz S, Robins JM (2004) A structural approach to selection bias. Epidemiology 15:615–625
Højsgaard S (2012) Graphical independence networks with the gRain package for R. J Stat Softw 46:1–26
Højsgaard S, Edwards D, Lauritzen SL (2012) Graphical models with R. Springer, New York
Husmeier D (2003) Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics 19:2271–2282
Husmeier D (2005) Probabilistic modeling in bioinformatics and medical informatics. Springer, London
Imoto S, Goto T, Miyano S (2002) Estimation of genetic networks and functional structures between genes by using Bayesian networks and nonparametric regression. Pac Symp Biocomput 7:175–186
Imoto S, Kim S, Goto T, Miyano S, Aburatani S, Tashiro K, Kuhara S (2003) Bayesian network and nonparametric heteroscedastic regression for nonlinear modeling of genetic network. J Bioinform Comput Biol 1:231–252
Jensen FV, Nielsen TD (2007) Bayesian networks and decision graphs. Springer, New York
Kalisch M, Bühlmann P (2007) Estimating high-dimensional directed acyclic graphs with the PC-algorithm. J Mach Learn Res 8:613–636
Kalisch M, Mächler M, Colombo D, Maathuis MH, Bühlmann P (2012) Causal inference using graphical models with the R package pcalg. J Stat Softw 47:1–26
Kirkpatrick S, Gelatt CDJ, Vecchi MP (1983) Optimization by simulated annealing. Science 220:671–680
Kjærulff UB, Madsen AL (2008) Bayesian networks and influence diagrams: a guide to construction and analysis. Springer, New York
Knüppel S (2011) DAG program. http://epi.dife.de/dag/. Accessed 3 Oct 2012
Knüppel S, Stang A (2010) DAG program: identifying minimal sufficient adjustment sets. Epidemiology 21:159
Koller D, Friedman N (2009) Probabilistic graphical models: principles and techniques. MIT, Cambridge
Korb KB, Nicholson AE (2011) Bayesian artificial intelligence. 2nd edn. CRC, Boca Raton
Lauritzen SL (1990) Graphical models. Clarendon, Oxford
Lauritzen SL (1992) Propagation of probabilities, means, and variances in mixed graphical association models. J Am Stat Assoc 87:1098–1108
Lauritzen SL (1995) The EM algorithm for graphical association models with missing data. Comput Stat Data Anal 19:191–201
Lauritzen SL, Spiegelhalter DJ (1988) Local computations with probabilities on graphical structures and their application to expert systems. J Roy Stat Soc B 50:157–224
Lauritzen SL, Dawid AP, Larsen BN, Leimer HG (1990) Independence properties of directed Markov fields. Networks 20:491–505
Li J, Wang ZJ (2009) Controlling the false discovery rate of the association/causality structure learned with the PC algorithm. J Mach Learn Res 10:475–514
Liu Z, Malone B, Yuan C (2012) Empirical evaluation of scoring functions for Bayesian network model selection. BMC Bioinform 13:S14
Lunn D, Spiegelhalter D, Thomas A, Best N (2009) The BUGS project: evolution, critique and future directions. Stat Med 28:3049–3067
Madsen AL, Lang M, Kjærulff UB, Jensen F (2003) The Hugin tool for learning Bayesian networks. In: Nielsen TD, Zhang NL (eds) Symbolic and quantitative approaches to reasoning with uncertainty. Lecture notes in computer science, vol 2711. Springer, Berlin, pp 594–605
Markowetz F, Spang R (2007) Inferring cellular networks – a review. BMC Bioinform 8(Suppl 6):S5
Moral S, Rumí R, Salmerón A (2001) Mixtures of truncated exponentials in hybrid Bayesian networks. In: Benferhat S, Besnard P (eds) Symbolic and quantitative approaches to reasoning with uncertainty. Lecture notes in computer science, vol 2143. Springer, Berlin, pp 156–167
Murphy K (2007) Software for graphical models: a review. ISBA Bull 14:13–15
Murphy K (2012) Software packages for graphical models/Bayesian networks. http://www.cs.ubc.ca/~murphyk/Software/bnsoft.html. Accessed 15 Aug 2012
Nadathur SG, Warren JR (2011) Emergency department triaging of admitted stroke patients – a Bayesian network analysis. Health Inform J 17:294–312
Nguefack-Tsague G (2011) Using Bayesian networks to model hierarchical relationships in epidemiological studies. Epidemiol Health 33:e2011006
Pearl J (2009) Causality – models, reasoning and inference. 2nd edn. Cambridge University Press, Cambridge
R Development Core Team (2012) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. http://www.R-project.org/. Accessed 15 Aug 2012
Ramsey J (2010) Bootstrapping the PC and CPC algorithms to improve search accuracy. Tech Rep 101, Department of Philosophy, Carnegie Mellon University. http://repository.cmu.edu/philosophy/101. Accessed 15 Aug 2012
Ramsey J, Zhang J, Spirtes P (2006) Adjacency-faithfulness and conservative causal inference. In: Proceedings of the twenty-second annual conference on uncertainty in artificial intelligence (UAI-06). AUAI, Arlington, pp 401–408
Robins JM (2001) Data, design, and background knowledge in etiologic inference. Epidemiology 12:313–320
Robins JM, Blevins D, Ritter G, Wulfsohn M (1992) G-estimation of the effect of prophylaxis therapy for Pneumocystis carinii pneumonia on the survival of AIDS patients. Epidemiology 3:319–336
Robins JM, Hernán MA, Brumback B (2000) Marginal structural models and causal inference in epidemiology. Epidemiology 11:550–560
Robins JM, Scheines R, Spirtes P, Wasserman L (2003) Uniform consistency in causal inference. Biometrika 90:491–515
Robinson R (1977) Counting unlabeled acyclic digraphs. In: Little H (ed) Combinatorial mathematics V. Lecture notes in mathematics, vol 622. Springer, Berlin, pp 28–43
Rothman KJ (1976) Causes. Am J Epidemiol 104:587–592
Rothman KJ, Greenland S, Lash T (2008) Modern epidemiology. 3rd edn. Lippincott Williams & Wilkins, Philadelphia
Rubin D (1974) Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 66:688–701
Scutari M (2010) Learning Bayesian networks with the bnlearn R package. J Stat Softw 35:1–22
Shenoy PP (2011) A re-definition of mixtures of polynomials for inference in hybrid Bayesian networks. In: Liu W (ed) Symbolic and quantitative approaches to reasoning with uncertainty. Lecture notes in computer science, vol 6717. Springer, Berlin, pp 98–109
Shrier I, Platt RW (2008) Reducing bias through directed acyclic graphs. BMC Med Res Methodol 8:70
Spiegelhalter DJ, Lauritzen SL (1990) Sequential updating of conditional probabilities on directed graphical structures. Networks 20:579–605
Spirtes P, Glymour C (1990) An algorithm for fast recovery of sparse causal graphs. Report CMU-PHIL-15, Department of Philosophy, Carnegie Mellon University
Spirtes P, Meek C, Richardson T (1995) Causal inference in the presence of latent variables and selection bias. In: Besnard P, Hanks S (eds) Proceedings of the eleventh conference on uncertainty in artificial intelligence (UAI-95). Morgan Kaufmann, San Francisco, pp 499–506
Spirtes P, Glymour C, Scheines R (2001) Causation, prediction and search, 2nd edn. MIT, Cambridge
Stefanini FM, Coradini D, Biganzoli E (2009) Conditional independence relations among biological markers may improve clinical decision as in the case of triple negative breast cancers. BMC Bioinform 10(Suppl 12):S13
Textor J (2012) DAGitty v.10. http://www.dagitty.net/. Accessed 3 Oct 2012
Textor J, Hardt J, Knüppel S (2011) DAGitty: a graphical tool for analyzing causal diagrams. Epidemiology 22:745
Tsamardinos I, Brown LE, Aliferis CF (2006) The max-min hill-climbing Bayesian network structure learning algorithm. Mach Learn 65:31–78
VanderWeele TJ, Robins JM (2007a) Directed acyclic graphs, sufficient causes, and the properties of conditioning on a common effect. Am J Epidemiol 166:1096–1104
VanderWeele TJ, Robins JM (2007b) Four types of effect modification: a classification based on directed acyclic graphs. Epidemiology 18:561–568
Verma T, Pearl J (1991) Equivalence and synthesis of causal models. In: Bonissone P, Henrion M, Kanal L, Lemmer J (eds) Proceedings of the sixth conference on uncertainty in artificial intelligence (UAI-90). Elsevier, Amsterdam, pp 258–268
Verma T, Pearl J (1992) An algorithm for deciding if a set of observed independencies has a causal explanation. In: Dubois D, Wellman MP, D’Ambrosio B, Smets P (eds) Proceedings of the eighth conference on uncertainty in artificial intelligence (UAI-92). Morgan Kaufmann, San Mateo, pp 323–330
Wang M, Chen Z, Cloutier S (2007) A hybrid Bayesian network learning method for constructing gene networks. Comput Biol Chem 31:361–372
Weinberg CR (1993) Toward a clearer definition of confounding. Am J Epidemiol 137:1–8
Weinberg CR (2007) Can DAGs clarify effect modification? Epidemiology 18:569–572
Wong ML, Lee SY, Leung KS (2002) A hybrid approach to discover Bayesian networks from databases using evolutionary programming. In: Proceedings of the 2002 IEEE international conference on data mining, ICDM ’02. IEEE Computer Society, Los Alamitos, pp 498–505
Copyright information
© 2014 Springer Science+Business Media New York
Cite this entry
Foraita, R., Spallek, J., Zeeb, H. (2014). Directed Acyclic Graphs. In: Ahrens, W., Pigeot, I. (eds) Handbook of Epidemiology. Springer, New York, NY. https://doi.org/10.1007/978-0-387-09834-0_65
DOI: https://doi.org/10.1007/978-0-387-09834-0_65
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-09833-3
Online ISBN: 978-0-387-09834-0
eBook Packages: Medicine, Reference Module Medicine