Abstract
Causal inference plays a central role in behavioral science. Historically, behavioral science methodologies have typically sought to infer a single causal relation, and each of the major approaches to causal inference in the behavioral sciences follows this pattern. Nonetheless, such approaches sometimes differ in the causal relation that they infer. Incremental causal inference offers an alternative conceptualization that divides the inference into a series of incremental steps, with different steps inferring different causal relations. Incremental causal inference is consistent with both causal pluralism and anti-pluralism, although anti-pluralism places greater constraints on the possible topology of sequential inferences. Arguments against incremental causal inference include questioning its consistency with causation as an explanatory principle, charging undue complexity, and questioning the need for it. Arguments in its favor include better explanation of the diverse causal inferences found in behavioral science, tailored causal inference, and more detailed and explicit description of causal inference. Incremental causal inference thus offers a viable and potentially fruitful alternative to approaches limited to a single causal relation.
Notes
Menzies (2009) has challenged the view that causation is a natural relation, but still allows that causation is a relation.
This seemingly requires that a conditional with the manipulation as the antecedent can be non-degenerately true even when the antecedent is impossible, perhaps because the manipulation is impossible in the actual world but possible in some counterfactual world accessible to the actual world. That is, manipulation is possibly possible but not possible in the actual world. Thus Cook and Campbell seem to assume modality understood in a form weaker than S5 modal logic.
I have slightly modified Rubin’s notation for consistency with notation used elsewhere in this article, but not in a way that affects its formal structure or expressive power.
This reading is supported by the balance of Pearl’s exposition despite the literal tone of the passage cited earlier. Note that in that passage Pearl referred to inferring the structure, not the mechanisms. Pearl has confirmed this reading in conversation.
A more flexible explanatory account can be constructed by allowing branches that re-converge on a single endpoint. This involves weakening the consistency requirement to allow causal relations distinguished by the absence or presence of the same property and to require only that the properties used to define the causal relations are themselves compatible with one another.
References
Angrist, J. D., Imbens, G. W., & Rubin, D. B. (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91, 444–455.
Beebee, H. (2000). The non-governing conception of laws of nature. Philosophy and Phenomenological Research, 61, 571–594.
Cartwright, N. (1989). Nature’s capacities and their measurement. Oxford, UK: Oxford University Press.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design & analysis issues for field settings. Boston: Houghton Mifflin.
Elby, A. (1992). Should we explain the EPR correlations causally? Philosophy of Science, 59, 16–25.
Foster, J. (1982-1983). Induction, explanation, and natural necessity. Proceedings of the Aristotelian Society, 83, 87–101.
Glennan, S. (2009). Mechanisms. In H. Beebee, C. Hitchcock, & P. Menzies (Eds.), The Oxford handbook of causation (pp. 315–325). Oxford, UK: Oxford University Press.
Glymour, C. (1997). Representations and misrepresentations. In V. R. McKim (Ed.), Causality in crisis? Statistical methods and the search for causal knowledge in the social sciences (pp. 317–322). Notre Dame, IN: University of Notre Dame Press.
Glymour, C., Scheines, R., Spirtes, P., & Kelly, K. (1987). Discovering causal structure: Artificial intelligence, philosophy of science, and statistical modeling. Orlando, FL: Academic Press.
Granger, C. W. J. (2001). Essays in econometrics: Collected papers of Clive W. J. Granger (E. Ghysels, N. R. Swanson, & M. W. Watson, Eds., Vol. 2). Cambridge, UK: Cambridge University Press.
Hayduk, L. A. (1996). LISREL: Issues, debates, and strategies. Baltimore, MD: Johns Hopkins University Press.
Healey, R. A. (1992). Causation, robustness, and EPR. Philosophy of Science, 59, 282–292.
Hitchcock, C. (2007). What’s wrong with neuron diagrams? In J. K. Campbell, M. O’Rourke, & H. Silverstein (Eds.), Causation and explanation (pp. 69–92). Cambridge, MA: MIT Press.
Hitchcock, C. (2009). Causal modeling. In H. Beebee, C. Hitchcock, & P. Menzies (Eds.), The Oxford handbook of causation (pp. 299–314). Oxford, UK: Oxford University Press.
Holland, P. W. (1986a). Statistics and causal inference. Journal of the American Statistical Association, 81, 945–960.
Holland, P. W. (1986b). Statistics and causal inference: Rejoinder. Journal of the American Statistical Association, 81, 968–970.
Holland, P. W. (1988a). Causal inference, path analysis, and recursive structural equations models. Sociological Methodology, 18, 449–484.
Lewis, D. (1986a). Philosophical papers (Vol. 2). New York: Oxford University Press.
Lewis, D. (1986b). On the plurality of worlds. Malden, MA: Blackwell.
Lewis, D. (2004). Causation as influence. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and counterfactuals (pp. 75–106). Cambridge, MA: MIT Press.
Mackie, J. L. (1980). The cement of the universe: A study of causation. Oxford, UK: Clarendon.
Markus, K. A. (2002). Statistical equivalence, semantic equivalence, eliminative induction, and the Raykov-Marcoulides proof of infinite equivalence. Structural Equation Modeling, 9, 503–522.
Markus, K. A. (2004). Varieties of causal modeling: How optimal research design varies by explanatory strategy. In K. van Montfort, J. Oud, & A. Satorra (Eds.), Recent developments on structural equation models: Theory and applications (pp. 175–196). Dordrecht: Kluwer.
Markus, K. A. (2010). Structural equations and causal explanations: Some challenges for causal SEM. Structural Equation Modeling, 17, 654–676. doi:10.1080/10705511.2010.510068.
Markus, K. A. (2011). Real causes and ideal manipulations: Pearl’s theory of causal inference from the point of view of psychological research methods. In P. McKay Illari, F. Russo, & J. Williamson (Eds.), Causality in the sciences (pp. 240–269). Oxford, UK: Oxford University Press.
Markus, K. A., & Borsboom, D. (2013). Frontiers of test validity theory: Measurement, causation and meaning. New York: Routledge.
Maxwell, S. E. (2010). Introduction to the special section on Campbell’s and Rubin’s conceptualizations of causality. Psychological Methods, 15, 1–2. doi:10.1037/a0018825.
Menzies, P. (2009). Platitudes and counterexamples. In H. Beebee, C. Hitchcock, & P. Menzies (Eds.), The Oxford handbook of causation (pp. 341–367). Oxford, UK: Oxford University Press.
Pearl, J. (2009). Causality: Models, reasoning and inference (2nd ed.). Cambridge, UK: Cambridge University Press.
Phye, G. D., & Sanders, C. E. (1994). Advice and feedback: Elements of practice for problem solving. Contemporary Educational Psychology, 19, 286–301.
Railton, P. (1978). A deductive-nomological model of probabilistic explanation. Philosophy of Science, 45, 206–226.
Redhead, M. (1989). The nature of reality. The British Journal for the Philosophy of Science, 40, 429–441.
Reichardt, C. (2006). The principle of parallelism in the design of studies to estimate treatment effects. Psychological Methods, 11, 1–18. doi:10.1037/1082-989x.11.1.1.
Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41–55.
Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66, 688–701. doi:10.1037/h0037350.
Rubin, D. B. (2010). Reflections stimulated by the comments of Shadish (2010) and West and Thoemmes (2010). Psychological Methods, 15, 38–46.
Shadish, W. R. (2010). Campbell and Rubin: A primer and comparison of their approaches to causal inference in field settings. Psychological Methods, 15, 3–17. doi:10.1037/a0015916.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.
Shadish, W. R., & Sullivan, K. J. (2012). Theories of causation in psychological science. In H. Cooper, P. M. Camic, D. L. Long, A. T. Panter, D. Rindskopf, & K. J. Sher (Eds.), APA handbook of research methods in psychology (vol 1): Foundations, planning, measures, and psychometrics (pp. 23–52). Washington, DC: American Psychological Association. doi:10.1037/13619-003.
Shute, V. J. (2007). Focus on formative feedback. Educational Testing Service Research Report 07–11. Princeton, NJ: ETS.
Skyrms, B. (1984). EPR: Lessons for metaphysics. In P. A. French, T. E. Uehling Jr, & H. K. Wettstein (Eds.), Midwest studies in philosophy IX: Causation and causal theories (pp. 245–255). Minneapolis: University of Minnesota Press.
Spirtes, P., Glymour, C., & Scheines, R. (2000). Causation, prediction, and search (2nd ed.). Cambridge, MA: MIT Press.
West, S. G., & Thoemmes, F. (2010). Campbell’s and Rubin’s perspectives on causal inference. Psychological Methods, 15, 18–37. doi:10.1037/a0015917.
Woodward, J. (2003). Making things happen: A theory of causal explanation. Oxford, UK: Oxford University Press.
Acknowledgments
Joshua Clegg provided helpful feedback on a previous draft of this article.
Appendix
1.1 Campbellian notation
In its most recent form (Shadish et al. 2002), \(R\) indicates random group assignment whereas NR indicates non-random group assignment. Non-random groups are also separated by dashed lines. \(X\) indicates a treatment and \(X\!\!\!\!\!-\) indicates cessation of an ongoing treatment, whereas \(O\) indicates an observation. Numeric subscripts identify the time points at which these take place whereas letter subscripts indicate the measures used. Braces group observations that take place at the same time point.
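As a hypothetical illustration (not drawn from the article), a nonequivalent-groups pretest-posttest design in this notation would be diagrammed with one row per group, a dashed line marking non-random assignment, and subscripts 1 and 2 marking the pretest and posttest occasions:

NR   O1   X   O2
- - - - - - - - - -
NR   O1        O2

Reading across the first row: a non-randomly formed group is observed, receives the treatment, and is observed again; the second group is observed at the same two time points without treatment.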
1.2 Potential outcomes approach
As formalized by Holland (1986b), \(R=\langle U,\,K,\,Y,\,S\rangle\) where \(R\) is Rubin’s model, \(U\) is the set of experimental units commonly referred to as cases or observations, \(K\) is the set of treatments to which members of \(U\) are assigned, \(Y\) is a function from paired values of \(U\) and \(K\) to values of the observed response variable, and \(S\) is a function from members of \(U\) to members of \(K\) representing the treatment to which the given member was in fact assigned. The causal effect of \(K\) on \(Y\) is then defined as \(Y(u,\,t)-Y(u,\,c)\) for \(K=\{t,\,c\}\), which readily generalizes to any two treatments. The Stable Unit Treatment Value Assumption (SUTVA) states that \(Y(u,\,k)\) is stable, which entails (a) that there are no unrepresented treatment variables and (b) that for all \(u \ne u'\), the treatment assignment \(S(u')\) does not affect \(Y(u,\,k)\) (Rubin 2010).
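A minimal computational sketch of this formalization may help fix the notation. The potential outcomes and assignments below are hypothetical, not from the article; the point is that the unit-level causal effect is a difference of two potential outcomes, only one of which is ever observed for a given unit.

```python
# Hypothetical potential outcomes Y(u, k) for three units and K = {t, c}.
Y = {
    ("u1", "t"): 7.0, ("u1", "c"): 5.0,
    ("u2", "t"): 6.0, ("u2", "c"): 6.0,
    ("u3", "t"): 9.0, ("u3", "c"): 4.0,
}
# S maps each unit to the treatment it was in fact assigned.
S = {"u1": "t", "u2": "c", "u3": "t"}

def causal_effect(u):
    """Unit-level causal effect Y(u, t) - Y(u, c), knowable only in principle."""
    return Y[(u, "t")] - Y[(u, "c")]

def observed(u):
    """In practice only Y(u, S(u)) is observed for each unit."""
    return Y[(u, S[u])]

print([causal_effect(u) for u in S])   # [2.0, 0.0, 5.0]
print({u: observed(u) for u in S})     # {'u1': 7.0, 'u2': 6.0, 'u3': 9.0}
```

For unit u2 the observed value under control conceals a zero causal effect; for u1 and u3 the counterfactual half of the difference is missing, which is what motivates inference over groups rather than units.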
1.3 Bayes net approach
The three central conditions formalized by Spirtes et al. (2000) take the following form. Let \(P\) denote the probability distribution over the set of vertices, \(V\), represented in graph \(G\). Let \(W\) denote any subset of \(V\). Let \(A(v)\) denote the ancestors of \(v\), the causes of \(v\) represented in the graph, and let \(D(v)\) denote the descendants of \(v\), the effects of \(v\) represented in the graph. The Causal Markov Condition states that for every \(W\) in \(V\), conditional on \(A(W)\), \(W\) is statistically independent of all remaining variables in \(V\) excluding \(A(W)\) and \(D(W)\). The Causal Minimality Condition states that no proper subgraph of \(G\) satisfies the Causal Markov Condition with respect to \(P\). The Faithfulness Condition states that every conditional independence relation in \(P\) is implied by the application of the Causal Markov Condition to \(G\).
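A small simulation (a hypothetical linear-Gaussian model, not from Spirtes et al.) illustrates the first and third conditions for the chain graph X → Y → Z: conditioning on the intermediate cause screens off X from Z (Causal Markov), while the marginal X-Z dependence implied by the graph is in fact present in the distribution (Faithfulness).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
# Hypothetical linear-Gaussian model faithful to the chain X -> Y -> Z.
X = rng.normal(size=n)
Y = 0.8 * X + rng.normal(size=n)
Z = 0.8 * Y + rng.normal(size=n)

def partial_corr(a, b, given):
    """Correlation of a and b after regressing out `given` by least squares."""
    G = np.column_stack([given, np.ones(len(given))])
    ra = a - G @ np.linalg.lstsq(G, a, rcond=None)[0]
    rb = b - G @ np.linalg.lstsq(G, b, rcond=None)[0]
    return np.corrcoef(ra, rb)[0, 1]

# Causal Markov: given its ancestor Y, Z is independent of X (corr near 0).
print(round(partial_corr(X, Z, Y), 2))
# Faithfulness: the marginal X-Z dependence the graph implies is present.
print(round(float(np.corrcoef(X, Z)[0, 1]), 2))
```

A distribution violating Faithfulness would require the two paths' parameters to cancel exactly; generic parameter values, as here, do not cancel.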
Pearl’s (2009) five properties characterizing causation take the following form. Let \(W,\,X\), and \(Y\) denote sets of variables. Let \(W_{x}(u)\) denote the value of \(W\) that results from manipulating \(X\) (in the model) to take the value \(x\). Composition states that if \(W_{x}(u)=w\) then \(Y_{xw}(u)=Y_{x}(u)\). Effectiveness states that \(X_{xw}(u)=x\). Reversibility states that if (\(Y_{xw}(u)=y\) and \(W_{xy}(u)=w\)) then \(Y_{x}(u)=y\). Existence states that there exists an \(x \in X\) such that \(X_{y}(u)=x\). Uniqueness states that for univariate \(X\), if (\(X_{y}(u)=x\) and \(X_{y}(u)=x'\)) then \(x=x'\). For detailed discussion, see Pearl (2009, chapter 7) and Markus (2011).
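Composition, the least intuitive of the five, can be checked numerically in a toy structural model. The two-equation model below is hypothetical (not Pearl's); interventions are represented by overriding an equation with a fixed value, so that \(Y_{xw}(u)\) is the solution after fixing both \(X\) and \(W\).

```python
# A hypothetical structural model u -> X -> W -> Y used to
# illustrate Pearl's composition property.

def solve(u, do=None):
    """Solve the structural equations; `do` overrides variables (interventions)."""
    do = do or {}
    X = do.get("X", u + 1)
    W = do.get("W", 2 * X)
    Y = do.get("Y", W + 3)
    return {"X": X, "W": W, "Y": Y}

u, x = 5, 2
w = solve(u, do={"X": x})["W"]             # W_x(u): value W takes under do(X = x)
y_x = solve(u, do={"X": x})["Y"]           # Y_x(u)
y_xw = solve(u, do={"X": x, "W": w})["Y"]  # Y_xw(u): also fix W at that value
print(y_x == y_xw)  # True: composition holds
```

Fixing \(W\) at the value it would have taken anyway under do(X = x) leaves \(Y\) unchanged, which is exactly what composition asserts.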
1.4 Granger causation
Granger (2001, chapter 2) gives the following definition of Granger causation. Let \(t\) index discrete non-overlapping times. Let \(\Omega _{t}\) denote the values of all variables up to and including time \(t\). Axiom A states that the past can cause the future but not vice versa. Axiom B states that \(\Omega _{t}\) contains no redundant information, as would be the case if one variable were a determinate function of one or more others. Let \(A\) denote some subset of values of a putative effect variable \(Y_{t+1}\). \(X_{t}\) causes \(Y_{t+1}\) if \(P(Y_{t+1}\in A \mid \Omega _{t})\ne P(Y_{t+1}\in A \mid \Omega _{t}-X_{t})\), where \(P(\cdot)\) denotes probability and \(\Omega _{t}-X_{t}\) denotes the complement of \(X_{t}\) within \(\Omega _{t}\). Granger’s later definition of non-causation suggests that one can safely interpret the above definition as a biconditional rather than merely a conditional, as stated.
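A minimal numerical sketch (not Granger's own formulation) illustrates the definition with hypothetical series, using linear least-squares prediction with one lag as a stand-in for the full information set \(\Omega _{t}\): if dropping the past of \(X\) from the predictor set worsens the prediction of \(Y_{t+1}\), then \(X\) Granger-causes \(Y\).

```python
import numpy as np

rng = np.random.default_rng(1)
T = 5000
# Hypothetical series in which past X helps predict Y.
X = rng.normal(size=T)
Y = np.zeros(T)
for t in range(1, T):
    Y[t] = 0.5 * Y[t - 1] + 0.8 * X[t - 1] + rng.normal()

def resid_var(target, predictors):
    """Variance of least-squares residuals of target on the given predictors."""
    G = np.column_stack(predictors + [np.ones(len(target))])
    beta = np.linalg.lstsq(G, target, rcond=None)[0]
    return np.var(target - G @ beta)

y_next, y_past, x_past = Y[1:], Y[:-1], X[:-1]
v_without = resid_var(y_next, [y_past])          # Omega_t with X_t removed
v_with = resid_var(y_next, [y_past, x_past])     # full predictor set
print(v_with < v_without)  # True: removing X's past degrades prediction
```

Running the comparison in the other direction (predicting \(X_{t+1}\) with and without \(Y\)'s past) would show no improvement, consistent with Axiom A's temporal asymmetry.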
Markus, K.A. An incremental approach to causal inference in the behavioral sciences. Synthese 191, 2089–2113 (2014). https://doi.org/10.1007/s11229-013-0386-x