Abstract
This chapter discusses the use of directed acyclic graphs (DAGs) for causal inference in the observational social sciences. It focuses on DAGs’ main uses, discusses central principles, and gives applied examples. DAGs are visual representations of qualitative causal assumptions: They encode researchers’ beliefs about how the world works. Straightforward rules map these causal assumptions onto the associations and independencies in observable data. The two primary uses of DAGs are (1) determining the identifiability of causal effects from observed data and (2) deriving the testable implications of a causal model. Concepts covered in this chapter include identification, d-separation, confounding, endogenous selection, and overcontrol. Illustrative applications then demonstrate that conditioning on variables at any stage in a causal process can induce as well as remove bias, that confounding is a fundamentally causal rather than an associational concept, that conventional approaches to causal mediation analysis are often biased, and that causal inference in social networks inherently faces endogenous selection bias. The chapter discusses several graphical criteria for the identification of causal effects of single, time-point treatments (including the famous backdoor criterion), as well as identification criteria for multiple, time-varying treatments.
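The backdoor logic summarized above can be illustrated with a short simulation. This is a sketch under assumed linear structural equations, not an example from the chapter; the variables X, T, Y and all coefficients are hypothetical. Adjusting for the common cause X closes the backdoor path T←X→Y, so the adjusted regression coefficient on T recovers the causal effect, while the unadjusted association is confounded.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical linear structural equations: X confounds T -> Y
# (X -> T, X -> Y, and T -> Y with causal effect 2)
x = rng.normal(size=n)
t = x + rng.normal(size=n)
y = 2 * t + 3 * x + rng.normal(size=n)

# Naive estimate: simple regression slope of Y on T.
# Biased, because the backdoor path T <- X -> Y is open.
naive = np.cov(t, y)[0, 1] / np.var(t)

# Backdoor adjustment: regress Y on T and X jointly.
# Conditioning on X blocks the backdoor path; the T coefficient
# now recovers the causal effect (approximately 2).
X_mat = np.column_stack([np.ones(n), t, x])
coef, *_ = np.linalg.lstsq(X_mat, y, rcond=None)
adjusted = coef[1]
```

With these equations the naive slope converges to 3.5 rather than 2, so the two estimates separate cleanly even in moderate samples.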
Notes
- 1.
Identification is also relative to the set of observed variables. Identification may be possible for one set of observed variables, but not for another set. Mimicking the logic of secondary data analysis, here I assume that the analyst is given a set of observed variables (and hence that all other variables are unobserved). Identification analysis can also be used to ask what sets of variables should be observed to achieve identification.
- 2.
A detailed tutorial for reading counterfactuals (including nested counterfactuals) off a DAG is presented in Section 4.4 of Pearl (2012a).
- 3.
By contrast, marginally correlated error terms must be explicitly included in the causal DAG, since they represent common causes.
- 4.
These assumptions are, first, the causal Markov assumption, which states that a variable is independent of its nondescendants given its parents, and second, stability or faithfulness, which, among other things, rules out exact cancelation of positive and negative effects. In this chapter, I mostly use weak faithfulness, which is the reason for interpreting arrows as possible rather than certain direct effects. Glymour and Greenland (2008) give an accessible summary. See Pearl (2009) and Spirtes et al. ([1993] 2001) for technical details.
- 5.
Common cause confounding by unobserved variables is sometimes represented by a dashed double-headed arrow.
- 6.
Terminology is in flux. The name “endogenous selection bias” highlights that the problem originates from conditioning on an endogenous variable. Others prefer “selection bias” (Hernán et al. 2004), “collider stratification bias” (Greenland 2003), “M-bias” (Greenland 2003), “Berkson’s [1946] bias,” “explaining away effect” (Kim and Pearl 1983), or “conditioning bias” (Morgan and Winship 2007). Simpson’s paradox (Hernán et al. 2011) and the Monty-Hall dilemma (Burns and Wieth 2004) involve specific examples of endogenous selection. The shared structure of some examples of endogenous selection bias has been known in the social sciences at least since Heckman (1976). For a comprehensive treatment, see Elwert and Winship (forthcoming).
- 7.
Endogenous selection bias is guaranteed if one assumes that positive and negative arrows do not cancel each other out exactly, i.e., if the DAG is faithful. Faithfulness is a mild assumption since exact cancelation is exceedingly unlikely in practice.
- 8.
Occasionally, a variable may be both a collider and a common cause. In that case, conditioning on the variable may eliminate confounding bias but induce endogenous selection bias, whereas not conditioning on the variable would lead to confounding bias yet eliminate endogenous selection bias (Greenland 2003). Nevertheless, the definitions of confounding and endogenous selection remain distinct.
- 9.
D-connectedness necessarily implies statistical dependence if the DAG is faithful.
- 10.
- 11.
- 12.
The requirement not to condition on a descendant of a variable on a causal path is explained in the discussion of Fig. 13.9 below.
- 13.
Pearl (1995, 2009) and others use so-called do-operator notation to write \( P\left( {y_t} \right) \) as \( P\left( {Y=y\,|\,\mathrm{do}(T=t)} \right) \). The do-operator do(T = t) emphasizes that T is set to t by intervention (“doing”). \( P(Y=y\,|\,\mathrm{do}(T=t)) \) gives the post-intervention distribution of Y if one intervened on T to set it to some specific value t, that is, the counterfactual distribution of Y.
- 14.
D_5 is a descendant of the collider T→D_2←e_2 (recall the implied existence of idiosyncratic error terms), which opens the noncausal path T---e_2→D_2→Y.
- 15.
The difference between Fig. 13.13a, b illustrates why identifying the magnitude of a causal effect is more difficult than testing the null of no effect. If one could condition on W_R in Fig. 13.13a, then the absence of an association between M and W_0 conditional on E and W_R would imply the absence of a causal effect M→W_0—the null can be tested. But if there is an effect M→W_0, as in Fig. 13.13b, then the observed association between M and W_0 given E and W_R is biased for the causal effect M→W_0—the magnitude of the effect cannot be measured.
- 16.
- 17.
Elwert and Christakis (2008) use additional knowledge of the network topology to gauge and remove the bias from residual confounding (i.e., if conditioning on H does not solve the problem).
- 18.
- 19.
DAGs for triadic networks would usually include separate variables for the characteristics of all three members of a generic triad. Obviously, the complexity of a DAG increases with the complexity of social structure. This is one reason why causal inference in social networks is a difficult problem.
- 20.
Here, we focus on causal effects of time-varying treatments that contrast predetermined treatment sequences. For two binary unit treatments, we can define six causal effects corresponding to the six pairwise contrasts between the four possible predetermined treatment sequences, here, (math, math), (math, English), (English, math), and (English, English). Note that some of these causal effects, such as (math, English) vs. (English, English), equal so-called controlled direct effects (Pearl 2001; Robins and Greenland 1992). The identification criteria discussed in this section apply to all causal effects of predetermined treatment sequences and hence to all controlled direct causal effects. See Bollen and Pearl (Chap. 15, this volume) and Wang and Sobel (Chap. 12, this volume) for mediation formulae and the identification of other types of (“natural” or “pure”) direct and indirect effects. See Robins and Richardson (2011) and Pearl (2012b) for graphical identification conditions of path-specific effects. See Robins and Hernán (2009) for yet other types of time-varying treatments, especially the distinction between static and dynamic time-varying treatment effects.
- 21.
Note that the joint causal effect of A_0 and A_1 is not the same as the total causal effect of A_0 plus the total causal effect of A_1, as is sometimes incorrectly thought.
- 22.
A minimally sufficient set is a sufficient set with the smallest number of variables. There may be multiple minimally sufficient sets.
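The endogenous selection bias described in Notes 6 and 7 can be demonstrated in a few lines. This is an illustrative sketch with hypothetical variables, not an example from the chapter: T and Y are independent causes of a collider C (T→C←Y), and selecting a subsample on C induces a spurious association between them.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# T and Y are marginally independent causes of the collider C (T -> C <- Y)
t = rng.normal(size=n)
y = rng.normal(size=n)
c = t + y + rng.normal(size=n)

# Marginally, T and Y are (nearly) uncorrelated, as the DAG implies
r_marginal = np.corrcoef(t, y)[0, 1]

# Conditioning on the collider -- here, keeping only the subsample with
# C > 0 -- induces a clearly negative T-Y association: endogenous selection bias
keep = c > 0
r_selected = np.corrcoef(t[keep], y[keep])[0, 1]
```

Selection on C > 0 mimics the common situation where only units with a large value of an outcome-affected variable enter the sample; in this setup the induced correlation is roughly −0.27 despite T and Y being causally unrelated.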
References
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.
Berkson, J. (1946). Limitations of the application of fourfold table analysis to hospital data. Biometrics Bulletin, 2(3), 47–53.
Blalock, H. M. (1964). Causal inferences in nonexperimental research. Chapel Hill: University of North Carolina Press.
Brito, C., & Pearl, J. (2002). Generalized instrumental variables. In A. Darwiche & N. Friedman (Eds.), Uncertainty in artificial intelligence, proceedings of the eighteenth conference (pp. 85–93). San Francisco: Morgan Kaufmann.
Brumback, B. A., Hernán, M. A., Haneuse, S. J. P. A., & Robins, J. M. (2004). Sensitivity analyses for unmeasured confounding assuming a marginal structural model for repeated measures. Statistics in Medicine, 23, 749–767.
Burns, B. D., & Wieth, M. (2004). The collider principle in causal reasoning: Why the Monty Hall Dilemma is so hard. Journal of Experimental Psychology: General, 133(3), 434–449.
Chan, H., & Kuroki, M. (2010). Using descendants as instrumental variables for the identification of direct causal effects in linear SEMs. In Proceedings of the thirteenth international conference on Artificial Intelligence and Statistics (AISTATS-10) (pp. 73–80), Sardinia, Italy.
Cole, S. R., & Hernán, M. A. (2002). Fallibility in estimating direct effects (with discussion). International Journal of Epidemiology, 31, 163–165.
Duncan, O. D. (1975). Introduction to structural equation models. New York: Academic.
Elwert, F., & Christakis, N. A. (2006). Widowhood and race. American Sociological Review, 71(1), 16–41.
Elwert, F., & Christakis, N. A. (2008). Wives and ex-wives: A new test for homogamy bias in the widowhood effect. Demography, 45(4), 851–873.
Elwert, F., & Winship, C. (2010). Effect heterogeneity and bias in main-effects-only regression models. In R. Dechter, H. Geffner, & J. Y. Halpern (Eds.), Heuristics, probability and causality: A tribute to Judea Pearl (pp. 327–336). London: College Publications.
Elwert, F., & Winship, C. (forthcoming). Endogenous selection bias: The dangers of conditioning on collider variables. Annual Review of Sociology.
Farr, W. (1858). Influence of marriage on the mortality of the French people. In G. W. Hastings (Ed.), Transactions of the national association for the promotion of social science (pp. 504–513). London: John W. Park & Son.
Finn, J. D., & Achilles, C. M. (1990). Answers and questions about class size. American Educational Research Journal, 27(3), 557–577.
Finn, J. D., Gerber, S. B., & Boyd-Zaharias, J. (2005). Small classes in the early grades, academic achievement, and graduating from high school. Journal of Educational Psychology, 97(2), 214–223.
Fowler, J. H., & Christakis, N. A. (2010). Cooperative behavior cascades in human social networks. PNAS: Proceedings of the National Academy of Sciences, 107(12), 5334–5338.
Galles, D., & Pearl, J. (1998). An axiomatic characterization of causal counterfactuals. Foundations of Science, 3(1), 151–182.
Glymour, M. M., & Greenland, S. (2008). Causal diagrams. In K. J. Rothman, S. Greenland, & T. Lash (Eds.), Modern epidemiology (3rd ed., pp. 183–209). Philadelphia: Lippincott.
Greenland, S. (2003). Quantifying biases in causal models: Classical confounding versus collider-stratification bias. Epidemiology, 14, 300–306.
Greenland, S. (2010). Overthrowing the tyranny of null hypotheses hidden in causal diagrams. In R. Dechter, H. Geffner, & J. Y. Halpern (Eds.), Heuristics, probability and causality: A tribute to Judea Pearl (pp. 365–382). London: College Publications.
Greenland, S., & Robins, J. M. (1986). Identifiability, exchangeability and epidemiological confounding. International Journal of Epidemiology, 15, 413–419.
Greenland, S., Pearl, J., & Robins, J. M. (1999a). Causal diagrams for epidemiologic research. Epidemiology, 10, 37–48.
Greenland, S., Robins, J. M., & Pearl, J. (1999b). Confounding and collapsibility in causal inference. Statistical Science, 14, 29–46.
Gronau, R. (1974). Wage comparisons-a selectivity bias. Journal of Political Economy, 82, 1119–1144.
Heckman, J. J. (1974). Shadow prices, market wages and labor supply. Econometrica, 42(4), 679–694.
Heckman, J. J. (1976). The common structure of statistical models of truncation, sample selection, and limited dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement, 5, 475–492.
Hernán, M. A., Hernández-Diaz, S., Werler, M. M., Robins, J. M., & Mitchell, A. A. (2002). Causal knowledge as a prerequisite of confounding evaluation: An application to birth defects epidemiology. American Journal of Epidemiology, 155(2), 176–184.
Hernán, M. A., Hernández-Diaz, S., & Robins, J. M. (2004). A structural approach to selection bias. Epidemiology, 15(5), 615–625.
Hernán, M. A., Clayton, D., & Keiding, N. (2011). The Simpson’s paradox unraveled. International Journal of Epidemiology, 40, 780–785.
Holland, P. W. (1986). Statistics and causal inference (with discussion). Journal of the American Statistical Association, 81, 945–970.
Holland, P. W. (1988). Causal inference, path analysis, and recursive structural equation models. Sociological Methodology, 18, 449–484.
Kim, J.H., & Pearl, J. (1983). A computational model for combined causal and diagnostic reasoning in inference systems. In Proceedings of the 8th International Joint Conference on Artificial Intelligence (pp. 190–193). Karlsruhe.
Kyono, T. (2010). Commentator: A front-end user-interface module for graphical and structural equation modeling (Tech. Rep. (R-364)). UCLA Cognitive Systems Laboratory.
Morgan, S. L., & Winship, C. (2007). Counterfactuals and causal inference: Methods and principles for social research. Cambridge: Cambridge University Press.
Morgan, S. L., & Winship, C. (2012). Bringing context and variability back in to causal analysis. In H. Kincaid (Ed.), Oxford handbook of the philosophy of the social sciences. New York: Oxford University Press.
Neyman, J. ([1923] 1990). On the application of probability theory to agricultural experiments. Essay on principles, section 9, translated (with discussion). Statistical Science, 5(4), 465–480.
O’Malley, A. J., Elwert, F., Rosenquist, J. N., Zaslavsky, A. M., & Christakis, N. A. (2012). Estimating peer effects in longitudinal dyadic data using instrumental variables (Working Paper). Department of Health Care Policy, Harvard Medical School.
Pearl, J. (1985). Bayesian networks: A model of self-activated memory for evidential reasoning. In Proceedings, Cognitive Science Society (pp. 329–334). Irvine: University of California.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems. San Mateo: Morgan Kaufman.
Pearl, J. (1993). Comment: Graphical models, causality, and interventions. Statistical Science, 8(3), 266–269.
Pearl, J. (1995). Causal diagrams for empirical research. Biometrika, 82(4), 669–710.
Pearl, J. (1998). Graphs, causality, and structural equation models. Sociological Methods and Research, 27(2), 226–284.
Pearl, J. (2001). Direct and indirect effects. In Proceedings of the seventeenth conference on Uncertainty in Artificial Intelligence (pp. 411–420). San Francisco: Morgan Kaufmann.
Pearl, J. ([2000] 2009). Causality: Models, reasoning, and inference (2nd ed.). Cambridge: Cambridge University Press.
Pearl, J. (2010). The foundations of causal inference. Sociological Methodology, 40, 75–149.
Pearl, J. (2012a). The causal foundations of structural equation modeling. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 68–91). New York: Guilford Press.
Pearl, J. (2012b). Interpretable conditions for identifying direct and indirect effects (Tech. Rep. (R-389)). UCLA Cognitive Systems Laboratory.
Pearl, J., & Robins, J. M. (1995). Probabilistic evaluation of sequential plans from causal models with hidden variables. In P. Besnard & S. Hanks (Eds.), Uncertainty in artificial intelligence 11 (pp. 444–453). San Francisco: Morgan Kaufmann.
Robins, J. M. (1986). A new approach to causal inference in mortality studies with a sustained exposure period: Application to the health worker survivor effect. Mathematical Modeling, 7, 1393–1512.
Robins, J. M. (1989). The control of confounding by intermediate variables. Statistics in Medicine, 8, 679–701.
Robins, J. M. (1997). Causal inference from complex longitudinal data. In M. Berkane (Ed.), Latent variable modeling and applications to causality (Lecture notes in statistics 120, pp. 69–117). New York: Springer.
Robins, J. M. (1999). Association, causation, and marginal structural models. Synthese, 121, 151–179.
Robins, J. M. (2001). Data, design, and background knowledge in etiologic inference. Epidemiology, 12(3), 313–320.
Robins, J. M., & Greenland, S. (1992). Identifiability and exchangeability for direct and indirect effects. Epidemiology, 3, 143–155.
Robins, J. M., & Hernán, M. A. (2009). Estimation of the causal effects of time-varying exposures. In G. Fitzmaurice et al. (Eds.), Handbooks of modern statistical methods: Longitudinal data analysis (pp. 553–599). Boca Raton: CRC Press.
Robins, J. M., & Richardson, T. (2011). Alternative graphical causal models and the identification of direct effects. In P. Shrout, K. Keyes, & K. Ornstein (Eds.), Causality and psychopathology: Finding the determinants of disorders and their cures (pp. 103–158). New York: Oxford University Press.
Robins, J. M., & Wasserman, L. (1999). On the impossibility of inferring causation from association without background knowledge. In C. N. Glymour & G. G. Cooper (Eds.), Computation, causation, and discovery (pp. 305–321). Cambridge: AAAI/MIT Press.
Rosenbaum, P. R. (1984). The consequences of adjustment for a concomitant variable that has been affected by the treatment. Journal of the Royal Statistical Society, Series A, 147(5), 656–666.
Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55.
Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and non-randomized studies. Journal of Educational Psychology, 66, 688–701.
Rubin, D. B. (1980). Comment on ‘randomization analysis of experimental data in the fisher randomization test’ by Basu. Journal of the American Statistical Association, 75, 591–593.
Shalizi, C. R., & Thomas, A. C. (2011). Homophily and contagion are generically confounded in observational social network studies. Sociological Methods and Research, 40, 211–239.
Sharkey, P., & Elwert, F. (2011). The legacy of disadvantage: Multigenerational neighborhood effects on cognitive ability. The American Journal of Sociology, 116(6), 1934–1981.
Shpitser, I., & Pearl, J. (2006). Identification of conditional interventional distributions. In R. Dechter & T. S. Richardson (Eds.), Proceedings of the twenty-first national conference on Artificial Intelligence (pp. 437–444). Menlo Park: AAAI Press.
Shpitser, I., & Pearl, J. (2007). What counterfactuals can be tested. In Proceedings of the twenty-third conference on Uncertainty in Artificial Intelligence (UAI-07) (pp. 352–359). Corvallis: AUAI Press.
Shpitser, I., VanderWeele, T. J., & Robins, J. M. (2010). On the validity of covariate adjustment for estimating causal effects. In Proceedings of the 26th conference on Uncertainty and Artificial Intelligence (pp. 527–536). Corvallis: AUAI Press.
Shrier, I. (2009). Letter to the editor. Statistics in Medicine, 27, 2740–2741.
Sjölander, A. (2009). Letter to the editor: Propensity scores and M-structures. Statistics in Medicine, 28, 1416–1423.
Smith, H. L. (1990). Specification problems in experimental and nonexperimental social research. Sociological Methodology, 20, 59–91.
Sobel, M. E. (2008). Identification of causal parameters in randomized studies with mediating variables. Journal of Educational and Behavioral Statistics, 33(2), 230–251.
Spirtes, P., Glymour, C. N., & Scheines, R. ([1993] 2001). Causation, prediction, and search (2nd ed.). Cambridge, MA: MIT Press.
Textor, J., Hardt, J., & Knüppel, S. (2011). Letter to the editor: DAGitty: A graphical tool for analyzing causal diagrams. Epidemiology, 22(5), 745.
VanderWeele, T. J. (2009). On the distinction between interaction and effect modification. Epidemiology, 20, 863–871.
VanderWeele, T. J. (2011). Sensitivity analysis for contagion effects in social networks. Sociological Methods and Research, 40, 240–255.
VanderWeele, T. J., & Robins, J. M. (2007). Four types of effect modification: A classification based on directed acyclic graphs. Epidemiology, 18(5), 561–568.
VanderWeele, T. J., & Robins, J. M. (2009). Minimal sufficient causation and directed acyclic graphs. The Annals of Statistics, 37, 1437–1465.
VanderWeele, T. J., & Shpitser, I. (2011). A new criterion for confounder selection. Biometrics, 67, 1406–1413.
Verma, T., & Pearl, J. (1988). Causal networks: Semantics and expressiveness. In Proceedings of the fourth workshop on Uncertainty in Artificial Intelligence (pp. 352–359). Minneapolis/Mountain View: AUAI Press.
Winship, C., & Harding, D. J. (2008). A mechanism-based approach to the identification of age-period-cohort models. Sociological Methods and Research, 36(3), 362–401.
Wodtke, G. T., Harding, D. J., & Elwert, F. (2011). Neighborhood effects in temporal perspective: The impact of long-term exposure to concentrated disadvantage on high school graduation. American Sociological Review, 76, 713–736.
Wooldridge, J. (2005). Violating ignorability of treatment by controlling for too many factors. Econometric Theory, 21, 1026–1028.
Wooldridge, J. (2006). Acknowledgement of related prior work. Econometric Theory, 22, 1177–1178.
Acknowledgments
I thank Stephen Morgan, Judea Pearl, Tyler VanderWeele, Xiaolu Wang, Christopher Winship, and my students in Soc 952 at the University of Wisconsin for discussions and advice. Janet Clear and Laurie Silverberg provided editorial assistance. All errors are mine.
Copyright information
© 2013 Springer Science+Business Media Dordrecht
Cite this chapter
Elwert, F. (2013). Graphical Causal Models. In: Morgan, S. (eds) Handbook of Causal Analysis for Social Research. Handbooks of Sociology and Social Research. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6094-3_13
Print ISBN: 978-94-007-6093-6
Online ISBN: 978-94-007-6094-3