Abstract
Pearl opened the door to formally defining actual causation using causal models. His approach rests on two strategies: first, capturing the widespread intuition that X = x causes Y = y iff X = x is a Necessary Element of a Sufficient Set for Y = y, and second, showing that his definition gives intuitive answers on a wide set of problem cases. This inspired dozens of variations of his definition of actual causation, the most prominent of which are due to Halpern & Pearl. Yet all of them ignore Pearl’s first strategy, and the second strategy taken by itself is unable to deliver a consensus. This paper offers a way out by going back to the first strategy: it offers six formal definitions of causal sufficiency and two interpretations of necessity. Combining the two gives twelve new definitions of actual causation. Several interesting results about these definitions and their relation to the various Halpern & Pearl definitions are presented. Afterwards the second strategy is evaluated as well. In order to maximize neutrality, the paper relies mostly on the examples and intuitions of Halpern & Pearl. One definition comes out as being superior to all others, and is therefore suggested as a new definition of actual causation.
Change history
28 August 2021
A Correction to this paper has been published: https://doi.org/10.1007/s10992-021-09632-6
Notes
R is defined in Observation 1.
References
Beckers, S. (2021). The counterfactual NESS definition of causation. In Proceedings of the 35th AAAI conference on artificial intelligence.
Beckers, S., & Vennekens, J. (2017). The transitivity and asymmetry of actual causation. Ergo, 4(1), 1–27.
Beckers, S., & Vennekens, J. (2018). A principled approach to defining actual causation. Synthese, 195(2), 835–862.
Glymour, C., Danks, D., Glymour, B., Eberhardt, F., Ramsey, J., Scheines, R., Spirtes, P., Teng, C.M., & Zhang, J. (2010). Actual causation: a stone soup essay. Synthese, 2, 169–192.
Hall, N. (2004). Two concepts of causation. In Collins, J., Hall, N., & Paul, L.A. (Eds.) Causation and counterfactuals (pp. 225–276): The MIT Press.
Hall, N. (2007). Structural equations and causation. Philosophical Studies, 132(1), 109–136.
Halpern, J.Y. (2015). A modification of the Halpern-Pearl definition of causality. In Proceedings of the 24th IJCAI (pp. 3022–3033): AAAI Press.
Halpern, J.Y. (2016). Actual causality. Cambridge: MIT Press.
Halpern, J.Y., & Pearl, J. (2001). Causes and explanations: a structural-model approach. Part I: causes. In Proc. 17th Conference on Uncertainty in Artificial Intelligence (UAI 2001) (pp. 194–202).
Halpern, J.Y., & Pearl, J. (2005). Causes and explanations: a structural-model approach. Part I: causes. The British Journal for the Philosophy of Science, 56(4), 843–887.
Hitchcock, C. (2001). The intransitivity of causation revealed in equations and graphs. Journal of Philosophy, 98, 273–299.
Hitchcock, C. (2007). Prevention, preemption, and the principle of sufficient reason. The Philosophical Review, 116(4), 495–532.
Mackie, J. (1965). Causes and conditions. American Philosophical Quarterly, 2(4), 261–264.
McDermott, M. (1995). Redundant causation. The British Journal for the Philosophy of Science, 46(4), 523–544.
Pearl, J. (1998). On the definition of actual cause. Tech. rep., Department of Computer Science, University of California, Los Angeles, R-259.
Pearl, J. (2000). Causality: models, reasoning, and inference. Cambridge: Cambridge University Press.
Pearl, J. (2009). Causality: models, reasoning, and inference, 2nd edn. Cambridge: Cambridge University Press.
Rosenberg, I., & Glymour, C. (2018). Review of Joseph Halpern, actual causality. BJPS Review of Books.
Schaffer, J. (2000). Trumping preemption. Journal of Philosophy, 97(4), 165–181.
Weslake, B. (2015). A partial theory of actual causation. The British Journal for the Philosophy of Science forthcoming.
Woodward, J. (2003). Making things happen: a theory of causal explanation. Oxford University Press.
Wright, R.W. (1988). Causation, responsibility, risk, probability, naked statistics, and proof: pruning the bramble bush by clarifying the concepts. Iowa Law Review, 73, 1001–1077.
Wright, R.W. (2011). The NESS account of natural causation: a response to criticisms. In Goldberg, R. (Ed.) Perspectives on causation: Hart Publishing.
Acknowledgements
Many thanks to Joe Halpern, Naftali Weinberger, and an anonymous reviewer for helpful comments on earlier versions of this paper. This research was made possible by funding from the Alexander von Humboldt Foundation.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Ethics declarations
Conflict of Interests
The author declares that he has no conflict of interest.
Appendices
Appendix A: Causal Sufficiency
Proposition 1
X = x is strongly sufficient for Y = y in M along a network N iff X = x is strongly sufficient for Y = y in M.
Proof
First assume X = x is strongly sufficient for Y = y in M and N can be used to show this. Then the result follows immediately from the observation that X = x is directly sufficient for N = n and either N = n is directly sufficient for Y = y or N = Y and n = y.
Second assume X = x is strongly sufficient for Y = y in M along a network N. Define \(\mathbf {A} = \mathcal{V} - (\mathbf {X} \cup \mathbf {N})\). We need to show that for all \(\mathbf {a} \in \mathcal{R} (\mathbf {A})\) and all \(\mathbf {u} \in \mathcal{R} (\mathcal {U})\) we have that (M,u)⊧[X ←x,A ←a]N = n.
We know that X = x is directly sufficient for N1 = n1. Define \(\mathbf {C_{1}} = \mathcal{V} - (\mathbf {X} \cup \mathbf {N_{1}})\) and D1 = N −N1. Note that C1 = A ∪D1. We have that for all \(\mathbf {c_{1}} \in \mathcal{R} (\mathbf {C_{1}})\) and all \(\mathbf {u} \in \mathcal{R} (\mathcal {U})\), (M,u)⊧[X ←x,C1 ←c1]N1 = n1. In particular, we have that for all \(\mathbf {a} \in \mathcal{R} (\mathbf {A})\) and all \(\mathbf {u} \in \mathcal{R} (\mathcal {U})\), (M,u)⊧[X ←x,A ←a]N1 = n1.
Define \(\mathbf {C_{2}} = \mathcal{V} - (\mathbf {N_{1}} \cup \mathbf {N_{2}})\) and D2 = N − (N1 ∪N2). Note that C2 = A ∪D2 ∪X. We have that for all \(\mathbf {c_{2}} \in \mathcal{R} (\mathbf {C_{2}})\) and all \(\mathbf {u} \in \mathcal{R} (\mathcal {U})\), (M,u)⊧[N1 ←n1,C2 ←c2]N2 = n2. In particular, we have that for all \(\mathbf {a} \in \mathcal{R} (\mathbf {A})\) and all \(\mathbf {u} \in \mathcal{R} (\mathcal {U})\), (M,u)⊧[X ←x,N1 ←n1,A ←a]N2 = n2. Combined with the conclusion from the previous paragraph, it follows that for all \(\mathbf {a} \in \mathcal{R} (\mathbf {A})\) and all \(\mathbf {u} \in \mathcal{R} (\mathcal {U})\), (M,u)⊧[X ←x,A ←a]N1 = n1 ∧N2 = n2.
Defining \(\mathbf {N_{k+1}} = Y\), we can generalize this reasoning for all consecutive i ∈{3,…,k + 1} to get the desired outcome. □
Appendix B: Defining Causation using Sufficiency
Theorem 1
The following are all equivalences among the twelve definitions and the three HP definitions:
-
Modified HP iff Def 1
-
Def 2 iff Def 5
-
Def 8 iff Def 11
-
Def 3 iff Def 6 iff Def 9 iff Def 12
Proof
First we consider the equivalences that do hold.
We start with the first equivalence: Modified HP iff Def 1. This is simply a matter of explicitly writing out the definitions, starting with actual weak sufficiency: X = x is actually weakly sufficient for Y = y in (M,u) iff (M,u)⊧[X ←x]Y = y. Next we note that the following condition is trivially satisfied for any \(\mathbf {W} \subseteq \mathcal{V} \): (M,u)⊧[X ←x,W ←w∗]Y = y.
Combining both claims, we can rewrite Modified HP as follows, which gives the desired result:
- AC2(a): There is a set \(\mathbf {W} \subseteq (\mathcal{V} - (\mathbf {X} \cup \{Y\}))\) and a setting \(\mathbf {x}^{\prime }\) of the variables in X such that \((\mathbf {X}=\mathbf {x}^{\prime }, \mathbf {W}=\mathbf {w}^{*})\) is not actually weakly sufficient for Y = y in (M,u).
- AC2(b): (X = x,W = w∗) is actually weakly sufficient for Y = y in (M,u).
Next we consider all of the following equivalences: Def 2 iff Def 5, Def 8 iff Def 11, Def 3 iff Def 6, Def 9 iff Def 12. We can group these together because all of them can be proven by invoking the following observation and the two subsequent lemmas. □
Observation 1
Recall our restriction on causal models that exogenous variables only appear in equations of the form V = U. Say \(\mathbf {R} \subseteq \mathcal{V} \) are all variables which have such an equation, and call these the root variables. It is clear that if we intervene on all of the root variables, they take over the role of the exogenous variables. Concretely, given strong recursivity, for any setting \(\mathbf {r} \in \mathcal{R} (\mathbf {R})\) there exists a unique setting \(\mathbf {v} \in \mathcal{R} (\mathcal{V})\) so that for all contexts \(\mathbf {u} \in \mathcal{R} (\mathcal {U})\) we have that \((M,\mathbf {u}) \models [\mathbf {R} \gets \mathbf {r}]\mathcal{V} =\mathbf {v}\).
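As a rough illustration of Observation 1, the following Python sketch borrows the equations of Example 1 (D = A, Y = (X ∧ A) ∨ D) and treats A and X as binary root variables. It checks that intervening on all root variables yields a single solution, whatever the context. The names `solve` and `unique_solution`, the restriction to binary values, and the encoding of a context as a pair (u_a, u_x) are assumptions made for this example only.

```python
from itertools import product

# Hypothetical binary model: roots A = U_A and X = U_X, D = A, Y = (X and A) or D.
CONTEXTS = list(product((0, 1), repeat=2))  # assumed encoding: u = (u_a, u_x)

def solve(u, do):
    """Solve the recursive model in context u under the intervention `do`."""
    u_a, u_x = u
    v = {}
    v["A"] = do.get("A", u_a)          # root variable copies its exogenous parent
    v["X"] = do.get("X", u_x)          # root variable copies its exogenous parent
    v["D"] = do.get("D", v["A"])       # D = A
    v["Y"] = do.get("Y", int((v["X"] and v["A"]) or v["D"]))
    return v

def unique_solution(r):
    """Return the one setting reached in every context under [R <- r], else None."""
    solutions = {tuple(sorted(solve(u, r).items())) for u in CONTEXTS}
    return dict(next(iter(solutions))) if len(solutions) == 1 else None

# Intervening on both roots removes all dependence on the context.
print(unique_solution({"A": 1, "X": 0}))
# Intervening on only one root does not: X still varies with the context.
print(unique_solution({"A": 1}))
```

In this toy model the second call returns `None`, which matches the requirement in the observation that all root variables be intervened upon.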
Lemma 1
Given a setting X = x, a setting N = n that includes Y = y and such that N ∩R = ∅, and a context u, the following holds (recall that R is defined in Observation 1):
-
X = x is actually directly sufficient for Y = y in (M,u) iff X = x is directly sufficient for Y = y in M;
-
X = x is actually strongly sufficient for Y = y in (M,u) along N = n iff X = x is strongly sufficient for Y = y in M along N = n.
Proof
Filling in the definitions of direct and actually direct sufficiency, the first equivalence reduces to the following: for all \(\mathbf {c} \in \mathcal{R} (\mathcal{V} - (\mathbf {X} \cup \{Y\}))\), it holds that (M,u)⊧[X ←x,C ←c]Y = y iff for all \(\mathbf {u}^{\prime \prime } \in \mathcal{R} (\mathcal {U})\), \((M,\mathbf {u}^{\prime \prime }) \models [\mathbf {X} \gets \mathbf {x}, \mathbf {C} \gets \mathbf {c}]Y=y\).
Because of Observation 1, we have that for any setting \(\mathbf {v} \in \mathcal{R} (\mathcal{V})\) and any setting \(\mathbf {r} \in \mathcal{R} (\mathbf {R})\), it holds that \((M,\mathbf {u}) \models [\mathbf {R} \gets \mathbf {r}]\mathcal{V} =\mathbf {v}\) iff for all contexts \(\mathbf {u}^{\prime \prime } \in \mathcal{R} (\mathcal {U})\), \((M,\mathbf {u}^{\prime \prime }) \models [\mathbf {R} \gets \mathbf {r}]\mathcal{V} =\mathbf {v}\). Combining this with the fact that \(\mathbf {R} \subseteq (\mathbf {C} \cup \mathbf {X})\) gives the desired result.
The second equivalence can be reformulated as follows: X = x is actually directly sufficient for N = n in (M,u) iff X = x is directly sufficient for N = n in M. In turn, this reduces to: for all \(\mathbf {c} \in \mathcal{R} (\mathcal{V} - (\mathbf {X} \cup \mathbf {N}))\), it holds that (M,u)⊧[X ←x,C ←c]N = n iff for all \(\mathbf {u}^{\prime \prime } \in \mathcal{R} (\mathcal {U})\), \((M,\mathbf {u}^{\prime \prime }) \models [\mathbf {X} \gets \mathbf {x}, \mathbf {C} \gets \mathbf {c}]\mathbf {N}=\mathbf {n}\).
Given that N ∩R = ∅, we still have that \(\mathbf {R} \subseteq (\mathbf {C} \cup \mathbf {X})\), and therefore we can apply the same reasoning as before. □
Lemma 2
For all twelve instances of the General Definition of Causation we can restrict ourselves to sets N so that (N −{Y }) ∩R = ∅.
Proof
Let A denote (N −{Y }) ∩R. For all definitions using variants of either direct or weak sufficiency the result follows immediately from the fact that N −{Y } = ∅.
First consider the case where we use non-actual strong sufficiency (Def 5 or Def 11). In that case, AC2(b) can never be satisfied unless A = ∅. To see why, note that in all contexts \(\mathbf {u}^{\prime \prime } \in \mathcal{R} (\mathcal {U})\), it has to hold that \((M,\mathbf {u}^{\prime \prime })\models [\mathbf {X} \gets \mathbf {x}, \mathbf {W} \gets \mathbf {w}^{*}] \mathbf {A}=\mathbf {a}\). Since A ∩ (X ∪W) = ∅ and the equation for each element Ai ∈A is of the form Ai = U for some exogenous variable U, this is impossible. (Strictly speaking it is possible, namely if the range of U consists only of the single value \(a_{i}^{*}\). Although I did not make this explicit in Section ??, it is standard to assume that all variables have a range that contains at least two elements.)
Second consider the case where we use actual strong sufficiency and contrastive necessity (Def 2). (The case of Def 8 is entirely analogous.) Say we are considering a candidate cause X = x, a candidate witness W = w∗, contrast values \(\mathbf {x}^{\prime }\), and a setting N = n that includes Y = y. Given AC1, we can safely assume that n = n∗.
I claim that the following holds, from which the result follows: X = x satisfies AC2 using contrast values \(\mathbf {x}^{\prime }\), witness W = w∗, and network N iff X = x satisfies AC2 using contrast values \(\mathbf {x}^{\prime }\), witness (W = w∗,A = a∗), and network N −A.
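Because \(\mathbf {A} \subseteq \mathbf {R}\), we have that for any set \(\mathbf {B} \subseteq (\mathcal{V} - \mathbf {A})\), and any setting \(\mathbf {b} \in \mathcal{R} (\mathbf {B})\), (M,u)⊧[B ←b]A = a∗. Moreover, since (M,u)⊧A = a∗, for each setting \(\mathbf {v} \in \mathcal{R} (\mathcal{V} - \mathbf {A})\) we also have that \((M,\mathbf {u})\models [\mathbf {B} \gets \mathbf {b}](\mathcal{V} - \mathbf {A})=\mathbf {v}\) iff \((M,\mathbf {u})\models [\mathbf {B} \gets \mathbf {b}, \mathbf {A} \gets \mathbf {a}^{*}](\mathcal{V} - \mathbf {A})=\mathbf {v}\).
Using these observations and the fact that \(\mathbf {A} \subseteq \mathbf {N}\), we get that the following two conditions are equivalent, from which the result follows as far as AC2(b) is concerned:
- AC2(b): For all \(\mathbf {c} \in \mathcal{R} (\mathcal{V} - (\mathbf {X} \cup \mathbf {W} \cup \mathbf {N}))\) we have that (M,u)⊧[X ←x,W ←w∗,C ←c]N = n∗.
- AC2(b): For all \(\mathbf {c} \in \mathcal{R} (\mathcal{V} - (\mathbf {X} \cup \mathbf {W} \cup \mathbf {N}))\) we have that \((M,\mathbf {u})\models [\mathbf {X} \gets \mathbf {x}, \mathbf {W} \gets \mathbf {w}^{*}, \mathbf {A} \gets \mathbf {a}^{*}, \mathbf {C} \gets \mathbf {c}] (\mathbf {N} - \mathbf {A})=\mathbf {n_{2}}^{*}\) (where \(\mathbf {n_{2}}^{*}\) is the restriction of n∗ to (N −A)).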
Now we focus on AC2(ac).
Let us first assume AC2(ac) holds for X = x, contrast values \(\mathbf {x}^{\prime }\), witness (W = w∗,A = a∗), and network N −A. We need to show that it holds for X = x, contrast values \(\mathbf {x}^{\prime }\), witness (W = w∗), and network N.
Consider some \(\mathbf {S} \subseteq \mathbf {N}\) with Y ∈S. We need to find a \(\mathbf {t} \in \mathcal{R} (\mathcal{V} - (\mathbf {X} \cup \mathbf {W} \cup \mathbf {S}))\) so that \((M,\mathbf {u})\models [\mathbf {X} \gets \mathbf {x}^{\prime },\mathbf {W} \gets \mathbf {w}^{*},\mathbf {T} \gets \mathbf {t}] \mathbf {S} \neq \mathbf {s}^{*}\). Define S1 = S −A, S2 = S ∩A, and A1 = A −S.
Since \(\mathbf {S_{1}} \subseteq (\mathbf {N} - \mathbf {A})\) with Y ∈S1, we know that there exists some \(\mathbf {t_{1}} \in \mathcal{R} (\mathcal{V} - (\mathbf {X} \cup \mathbf {W} \cup \mathbf {A} \cup \mathbf {S_{1}}))\) so that \((M,\mathbf {u})\models [\mathbf {X} \gets \mathbf {x}^{\prime },\mathbf {W} \gets \mathbf {w}^{*},\mathbf {A} \gets \mathbf {a}^{*}, \mathbf {T} \gets \mathbf {t_{1}}] \mathbf {S_{1}} \neq \mathbf {s_{1}}^{*}\). Since \(\mathbf {S_{1}} \subseteq \mathbf {S}\), it also holds that \((M,\mathbf {u})\models [\mathbf {X} \gets \mathbf {x}^{\prime },\mathbf {W} \gets \mathbf {w}^{*},\mathbf {A} \gets \mathbf {a}^{*}, \mathbf {T} \gets \mathbf {t_{1}}] \mathbf {S} \neq \mathbf {s}^{*}\). Given our observations about A, it follows in turn that \((M,\mathbf {u})\models [\mathbf {X} \gets \mathbf {x}^{\prime },\mathbf {W} \gets \mathbf {w}^{*}, \mathbf {A_{1}} \gets \mathbf {a_{1}}, \mathbf {T} \gets \mathbf {t_{1}}] \mathbf {S} \neq \mathbf {s}^{*}\). Lastly, note that \([\mathcal{V} - (\mathbf {X} \cup \mathbf {W} \cup \mathbf {A} \cup \mathbf {S_{1}})] \cup \mathbf {A_{1}} = \mathcal{V} - (\mathbf {X} \cup \mathbf {W} \cup \mathbf {S})\). Therefore we can choose t = (a1,t1).
Next we consider the other direction: assume AC2(ac) holds for X = x, contrast values \(\mathbf {x}^{\prime }\), witness W = w∗, and network N. We need to show that it holds for X = x, contrast values \(\mathbf {x}^{\prime }\), witness (W = w∗,A = a∗), and network N −A.
Consider some \(\mathbf {S} \subseteq (\mathbf {N} - \mathbf {A})\) with Y ∈S. We need to find a \(\mathbf {t} \in \mathcal{R} (\mathcal{V} - (\mathbf {X} \cup \mathbf {W} \cup \mathbf {A} \cup \mathbf {S}))\) so that \((M,\mathbf {u})\models [\mathbf {X} \gets \mathbf {x}^{\prime },\mathbf {W} \gets \mathbf {w}^{*},\mathbf {A} \gets \mathbf {a}^{*},\mathbf {T} \gets \mathbf {t}] \mathbf {S} \neq \mathbf {s}^{*}\).
Note that \((\mathbf {S} \cup \mathbf {A}) \subseteq \mathbf {N}\), and also Y ∈ (S ∪A). Therefore there exists some \(\mathbf {t_{2}} \in \mathcal{R} (\mathcal{V} - (\mathbf {X} \cup \mathbf {W} \cup \mathbf {A} \cup \mathbf {S}))\) so that \((M,\mathbf {u})\models [\mathbf {X} \gets \mathbf {x}^{\prime },\mathbf {W} \gets \mathbf {w}^{*},\mathbf {A} \gets \mathbf {a}^{*}, \mathbf {T} \gets \mathbf {t_{2}}] (\mathbf {S} \neq \mathbf {s}^{*} \lor \mathbf {A} \neq \mathbf {a}^{*})\). Since A is held at a∗ by the intervention, it follows that \((M,\mathbf {u})\models [\mathbf {X} \gets \mathbf {x}^{\prime },\mathbf {W} \gets \mathbf {w}^{*},\mathbf {A} \gets \mathbf {a}^{*}, \mathbf {T} \gets \mathbf {t_{2}}] \mathbf {S} \neq \mathbf {s}^{*}\). Choosing t = t2 gives the desired result. □
Because of the above lemmas, all that remains is to show that the above equivalences hold also when Y ∈R. This is accomplished by showing that settings of such variables do not have any cause, regardless of the definition one uses.
AC2(a) requires us to look at all subsets of N = n that include Y = y, and verify that the candidate cause and witness \((\mathbf {X}=\mathbf {x}^{\prime },\mathbf {W}=\mathbf {w}^{*})\) (or candidate witness W = w∗ in case we use AC2(am)) is not sufficient for that subset. One such subset is the one containing just Y = y. By AC1, we have that (M,u)⊧Y = y. Since Y ∈R, there is no intervention on the other endogenous variables so that Y ≠y under that intervention in u. Therefore any definition of causation using a version of actual sufficiency (i.e., Def 2, Def 3, Def 8, and Def 9) considers all sets that do not include Y to be sufficient for Y = y in (M,u). In particular, they consider \((\mathbf {X}=\mathbf {x}^{\prime },\mathbf {W}=\mathbf {w}^{*})\) to be sufficient for Y = y in (M,u), and thus fail to meet condition AC2(a).
For the definitions using non-actual variants of sufficiency (Def 5, Def 6, Def 11, and Def 12), it is condition AC2(b) that can never be satisfied. Analogous to what we saw in the proof of Lemma 2, this follows from the fact that whatever version of sufficiency we use, Y = y has to hold in all contexts, which is impossible given that Y ∉(X ∪W). From this the result follows.
Now we prove the only remaining equivalence: Def 6 iff Def 12. (Given the previous equivalences, other choices are possible too.) We need to show that the following two statements are equivalent:
-
W = w∗ is not directly sufficient for Y = y.
-
There exist values \(\mathbf {x}^{\prime }\) of X such that \((\mathbf {X}=\mathbf {x}^{\prime },\mathbf {W}=\mathbf {w}^{*})\) is not directly sufficient for Y = y.
Filling in Definition 7, the result follows immediately:
-
There exists a \(\mathbf {c} \in \mathcal{R} (\mathcal{V} - (\mathbf {W} \cup \mathbf {X} \cup \{Y\}))\), an \(\mathbf {x}^{\prime } \in \mathcal{R} (\mathbf {X})\), and a \(\mathbf {u}^{\prime } \in \mathcal{R} (\mathcal {U})\) so that \((M,\mathbf {u}^{\prime })\models [\mathbf {W} \gets \mathbf {w}^{*}, \mathbf {X} \gets \mathbf {x}^{\prime },\mathbf {C} \gets \mathbf {c}] Y \neq y\).
-
There exist values \(\mathbf {x}^{\prime }\) of X, a \(\mathbf {c} \in \mathcal{R} (\mathcal{V} - (\mathbf {W} \cup \mathbf {X} \cup \{Y\}))\) and a \(\mathbf {u}^{\prime } \in \mathcal{R} (\mathcal {U})\) so that \((M,\mathbf {u}^{\prime })\models [\mathbf {W} \gets \mathbf {w}^{*}, \mathbf {X} \gets \mathbf {x}^{\prime },\mathbf {C} \gets \mathbf {c}] Y \neq y\).
Second, we go over some examples to show that none of the other equivalences hold. (Obviously, from now on we may ignore Def 1, Def 5, Def 6, Def 7, Def 9, Def 11, and Def 12.)
Example 1
Equations: Y = (X ∧ A) ∨ D, D = A. Context: A = 1. Then X = 1 is a cause of Y = 1 according to:
-
Modified HP: We can always consider choosing W = ∅, in which case we simply get counterfactual dependence: (M,u)⊧X = x ∧ Y = 1 and \((M,\mathbf {u}) \models [\mathbf {X} \gets \mathbf {x}^{\prime }]Y \neq y\). Doing so in this example, we see that Y = 1 counterfactually depends on (X = 1,D = 1). There is clearly also no witness W = w∗ to show that X = 1 or D = 1 are causes by themselves, so X = 1 is part of a cause.
-
Updated HP and Original HP: taking (A = 1,D = 0) as a witness meets the conditions.
-
Def 3: again take (A = 1,D = 0) as a witness.
-
Def 2: follows from the previous item and Theorem 4.
-
Def 8: follows from the previous item and Theorem 4.
X = 1 is not a cause of Y = 1 according to:
-
Def 10: X = 1 by itself does not weakly suffice for Y = 1 (just look at a context in which A = 0), so we need to add A or D to the witness. But A = 1 and D = 1 each weakly suffice for Y = 1 by themselves.
-
Def 4: (X = 0,A = 1) and (X = 0,D = 1) also weakly suffice for Y = 1.
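The weak sufficiency claims used for Def 10 and Def 4 in Example 1 can be checked mechanically. The sketch below is only an illustration: it assumes binary variables, treats A and X as root variables fed by hypothetical exogenous inputs u_a and u_x, and reads weak sufficiency as "Y = y holds under the intervention in every context". The function names are made up for the example.

```python
from itertools import product

def solve(u_a, u_x, do):
    """Example 1: D = A, Y = (X and A) or D, with roots A and X (assumed)."""
    v = {}
    v["A"] = do.get("A", u_a)
    v["X"] = do.get("X", u_x)
    v["D"] = do.get("D", v["A"])
    v["Y"] = int((v["X"] and v["A"]) or v["D"])
    return v

def weakly_sufficient(do, y=1):
    """True iff Y = y holds under `do` in every context (u_a, u_x)."""
    return all(solve(u_a, u_x, do)["Y"] == y
               for u_a, u_x in product((0, 1), repeat=2))

print(weakly_sufficient({"X": 1}))            # False: fails when A = 0
print(weakly_sufficient({"A": 1}))            # True
print(weakly_sufficient({"D": 1}))            # True
print(weakly_sufficient({"X": 0, "A": 1}))    # True
print(weakly_sufficient({"X": 0, "D": 1}))    # True
```

The last four lines correspond to the reasons given above for why X = 1 fails both Def 10 and Def 4 in this example.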
So we know that Def 4 and Def 10 are not equivalent to any of the other definitions. We give an example to show that Def 4 and Def 10 are not equivalent to each other either.
Example 2
Equations: Y = X ∧ A, X = A. Context: A = 1. Since X = 1 is not weakly sufficient for Y = 1, we need to include A = 1 in the witness. Indeed, (X = 1,A = 1) is weakly sufficient for Y = 1. However, so is A = 1, and therefore X = 1 does not cause Y = 1 according to Def 10. Yet (X = 0,A = 1) is not weakly sufficient for Y = 1, and therefore X = 1 causes Y = 1 according to Def 4.
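The four sufficiency facts that separate Def 4 from Def 10 in Example 2 can be verified with a few lines of Python. This is a sketch under the assumptions that all variables are binary and that A is the only root variable, fed by a hypothetical exogenous input u_a; the helper names are illustrative.

```python
def solve(u_a, do):
    """Example 2: X = A, Y = X and A, with A the only root variable (assumed)."""
    v = {}
    v["A"] = do.get("A", u_a)
    v["X"] = do.get("X", v["A"])
    v["Y"] = int(v["X"] and v["A"])
    return v

def weakly_sufficient(do, y=1):
    """True iff Y = y holds under `do` in both contexts u_a = 0 and u_a = 1."""
    return all(solve(u_a, do)["Y"] == y for u_a in (0, 1))

print(weakly_sufficient({"X": 1}))            # False
print(weakly_sufficient({"X": 1, "A": 1}))    # True
print(weakly_sufficient({"A": 1}))            # True: blocks Def 10
print(weakly_sufficient({"X": 0, "A": 1}))    # False: the contrast needed for Def 4
```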
This leaves us with the HP definitions, Def 2, Def 3, and Def 8. The next example shows that the former are not equivalent to the latter.
Example 3
Equations: \(Y=(X \land \lnot A) \lor D\), D = A. Context: A = 1. Then X = 1 is a cause of Y = 1 according to:
-
Modified HP: Y = 1 counterfactually depends on (X = 1,A = 1), and not on either X = 1 or A = 1. So X = 1 is part of a cause.
-
Updated HP and Original: take A = 0 as a witness.
X = 1 is not a cause of Y = 1 according to:
-
Def 3: X = 1 by itself does not directly suffice for Y = 1 (just look at [A ← 1,D ← 0]), so we need to add A or D to the witness. Since the actual value of A is 1, it is of no use, which leaves us with D. But D = 1 directly suffices for Y = 1 by itself, and thus so does (X = 0,D = 1).
-
Def 2: follows from the previous item and Proposition 12.
-
Def 8: follows from the previous item and Proposition 12.
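The claims made about Example 3 can be replayed in code. The following sketch assumes binary variables, roots A and X with hypothetical exogenous inputs, and an actual context in which X = 1 as well as A = 1. Direct sufficiency is read as in the proof of Lemma 1: Y = y must hold for every setting of the remaining endogenous variables other than Y, in every context.

```python
from itertools import product

CONTEXTS = list(product((0, 1), repeat=2))   # assumed encoding: u = (u_a, u_x)

def solve(u, do):
    """Example 3: D = A, Y = (X and not A) or D."""
    u_a, u_x = u
    v = {"A": do.get("A", u_a), "X": do.get("X", u_x)}
    v["D"] = do.get("D", v["A"])
    v["Y"] = int((v["X"] and not v["A"]) or v["D"])
    return v

def directly_sufficient(fixed, y=1):
    """Check Y = y for all settings of the other non-Y variables and all contexts."""
    free = [n for n in ("A", "X", "D") if n not in fixed]
    return all(
        solve(u, {**fixed, **dict(zip(free, vals))})["Y"] == y
        for u in CONTEXTS
        for vals in product((0, 1), repeat=len(free))
    )

actual = (1, 1)
# Counterfactual dependence on (X = 1, A = 1) jointly, but on neither alone.
print(solve(actual, {"X": 0})["Y"])             # 1
print(solve(actual, {"A": 0})["Y"])             # 1
print(solve(actual, {"X": 0, "A": 0})["Y"])     # 0
# Direct sufficiency facts used for Def 3.
print(directly_sufficient({"X": 1}))            # False: see [A <- 1, D <- 0]
print(directly_sufficient({"D": 1}))            # True
print(directly_sufficient({"X": 0, "D": 1}))    # True
```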
That none of the HP definitions are equivalent is of course a well-established fact, and also follows from the examples we consider in Section ??. Therefore we are left with showing that Def 2, Def 3, and Def 8 are not equivalent. That Def 3 differs from the other two is a direct consequence of some of our later results, but a simple example illustrates this as well.
Example 4
Equations: Y = A, A = X. Context: X = 1 (and thus A = 1). Then it is easy to see that X = 1 causes Y = 1 according to all definitions here considered, except for Def 3.
Lastly, I refer the reader to Example 2 in Section ?? for an example that shows Def 2 and Def 8 are not equivalent.
Proposition 4
If X = x causes Y = y in (M,u) according to a definition that uses minimal necessity, then X is a singleton.
Proof
Since we know that Def 7 is unsatisfiable and we have Theorem 3, we only need to consider Def 3, Def 8, and Def 10. The following applies to both weak and direct sufficiency (i.e., Def 3 and Def 10.)
Assume \((\mathbf {X_{1}}=\mathbf {x_{1}},\mathbf {X_{2}}=\mathbf {x_{2}},\mathbf {W}=\mathbf {w}^{*})\) is sufficient for Y = y, and W = w∗ is not sufficient for Y = y. If either \((\mathbf {X_{2}}=\mathbf {x_{2}},\mathbf {W}=\mathbf {w}^{*})\) or \((\mathbf {X_{1}}=\mathbf {x_{1}},\mathbf {W}=\mathbf {w}^{*})\) is also sufficient for Y = y, then (X1 = x1,X2 = x2) is not minimal.
So let us assume that neither \((\mathbf {X_{2}}=\mathbf {x_{2}},\mathbf {W}=\mathbf {w}^{*})\) nor \((\mathbf {X_{1}}=\mathbf {x_{1}},\mathbf {W}=\mathbf {w}^{*})\) is sufficient for Y = y. This means we can move X2 to the witness to show that X1 = x1 satisfies AC2 by itself, and likewise for X2 and X1 reversed. From this the result follows.
Now we prove that it also holds for strong sufficiency, i.e., for Def 8. Assume \((\mathbf {X_{1}}=\mathbf {x_{1}},\mathbf {X_{2}}=\mathbf {x_{2}},\mathbf {W}=\mathbf {w}^{*})\) is sufficient for Y = y along N, and W = w∗ is not sufficient for Y = y along any network \(\mathbf {S} \subseteq \mathbf {N}\). If either \((\mathbf {X_{2}}=\mathbf {x_{2}},\mathbf {W}=\mathbf {w}^{*})\) or \((\mathbf {X_{1}}=\mathbf {x_{1}},\mathbf {W}=\mathbf {w}^{*})\) is also sufficient for Y = y along N, then (X1 = x1,X2 = x2) is not minimal.
So let us assume that neither \((\mathbf {X_{2}}=\mathbf {x_{2}},\mathbf {W}=\mathbf {w}^{*})\) nor \((\mathbf {X_{1}}=\mathbf {x_{1}},\mathbf {W}=\mathbf {w}^{*})\) is sufficient for Y = y along N. If the same is true for all subnetworks \(\mathbf {S} \subseteq \mathbf {N}\), then as before, we can move either one of X1 and X2 to the witness to show that the other satisfies AC2 by itself.
So let us assume that there is some subnetwork \(\mathbf {S}^{\prime } \subseteq \mathbf {N}\) such that \((\mathbf {X_{1}}=\mathbf {x_{1}},\mathbf {W}=\mathbf {w}^{*})\) is sufficient for Y = y along \(\mathbf {S}^{\prime }\). (Obviously the same reasoning applies to X2.) Since all subnetworks \(\mathbf {S}^{\prime \prime }\) of \(\mathbf {S}^{\prime }\) are also subnetworks of N, it follows from the above that (X1 = x1) satisfies AC2 by itself when taking W as witness and \(\mathbf {S}^{\prime }\) as network. From this the result follows. □
Theorem 2
The only implications – involving either causes or parts of causes – between the remaining five definitions (Def 2, Def 3, Def 4, Def 8, and Def 10) and the three HP definitions are the following ones (and their immediate consequences, of course):
-
If part of Modified HP then Updated HP;
-
If part of Updated HP then Original HP;
-
If Def 3 then Def 2;
-
If part of Def 2 then Def 8;
-
If Def 3 then Original HP;
-
If Def 10 then Def 4.
Proof
The first two implications are proven in Halpern [8].
First we prove the third implication. Assume X = x causes Y = y with witness W according to Def 3. It follows from Proposition 10 that X is a single conjunct X. Note that this immediately implies minimality of X.
In other words, (X = x,W = w∗) is directly sufficient for Y = y, and there exists some \(x^{\prime }\) such that \((X=x^{\prime },\mathbf {W}=\mathbf {w}^{*})\) is not directly sufficient for Y = y. From the former it follows that (X = x,W = w∗) is strongly sufficient for Y = y along ∅. From the latter it follows that \((X=x^{\prime },\mathbf {W}=\mathbf {w}^{*})\) is not strongly sufficient for Y = y along ∅, from which the result follows.
Second we prove the fourth implication. Assume \((X=x,\mathbf {X_{2}}=\mathbf {x_{2}},\mathbf {W}=\mathbf {w}^{*})\) is sufficient for Y = y along N, and \((X=x^{\prime },\mathbf {X_{2}}=\mathbf {x_{2}}^{\prime },\mathbf {W}=\mathbf {w}^{*})\) is not sufficient for Y = y along any network \(\mathbf {S} \subseteq \mathbf {N}\), for some N, \(x^{\prime }\) and \(\mathbf {x_{2}}^{\prime }\). We show that X = x causes Y = y according to Def 8.
Taking \((\mathbf {X_{2}}=\mathbf {x_{2}},\mathbf {W}=\mathbf {w}^{*})\) as our witness and using N, AC2(b) remains unchanged. If \((\mathbf {X_{2}}=\mathbf {x_{2}},\mathbf {W}=\mathbf {w}^{*})\) is not sufficient for Y = y along any network \(\mathbf {S} \subseteq \mathbf {N}\), then the result follows. We proceed by a reductio.
Let us assume that \((\mathbf {X_{2}}=\mathbf {x_{2}},\mathbf {W}=\mathbf {w}^{*})\) is sufficient for Y = y along some \(\mathbf {S} \subseteq \mathbf {N}\). If \((\mathbf {X_{2}}=\mathbf {x_{2}}^{\prime },\mathbf {W}=\mathbf {w}^{*})\) is not sufficient for Y = y along any \(\mathbf {S}^{\prime \prime } \subseteq \mathbf {S}\), we have a violation of minimality (since X is redundant). Therefore we know that \((\mathbf {X_{2}}=\mathbf {x_{2}}^{\prime },\mathbf {W}=\mathbf {w}^{*})\) is sufficient for Y = y along some network \(\mathbf {S}^{\prime \prime } \subseteq \mathbf {S}\).
This means that there exist values \(\mathbf {s}^{\prime \prime } \in \mathcal{R} (\mathbf {S}^{\prime \prime })\) so that for all settings \(\mathbf {c} \in \mathcal{R} (\mathcal{V} - (\mathbf {S}^{\prime \prime } \cup \mathbf {X_{2}} \cup \{X,Y\}))\), and for all \(x^{\prime \prime } \in \mathcal{R} (X)\), it holds that \((M,\mathbf {u})\models [\mathbf {X_{2}} \gets \mathbf {x_{2}}^{\prime }, \mathbf {W} \gets \mathbf {w}^{*},\mathbf {C} \gets \mathbf {c}, X \gets x^{\prime \prime }]\mathbf {S}^{\prime \prime }=\mathbf {s}^{\prime \prime }\) and \((M,\mathbf {u})\models [\mathbf {X_{2}} \gets \mathbf {x_{2}}^{\prime }, \mathbf {W} \gets \mathbf {w}^{*},\mathbf {C} \gets \mathbf {c}, X \gets x^{\prime \prime },\mathbf {S}^{\prime \prime } \gets \mathbf {s}^{\prime \prime }]Y=y\). In particular, this holds if we choose \(x^{\prime \prime }=x^{\prime }\). But that means that \((X=x^{\prime },\mathbf {X_{2}}=\mathbf {x_{2}}^{\prime },\mathbf {W}=\mathbf {w}^{*})\) is also sufficient for Y = y along \(\mathbf {S}^{\prime \prime }\), which contradicts our starting assumption.
Third we prove the fifth implication. As with the third implication, assume that (X = x,W = w∗) is directly sufficient for Y = y, and there exists some \(x^{\prime }\) such that \((X=x^{\prime },\mathbf {W}=\mathbf {w}^{*})\) is not directly sufficient for Y = y. From the latter it follows that there exists a setting d of \(\mathbf {D} = \mathcal{V} - (\{X\} \cup \mathbf {W} \cup \{Y\})\) such that \((M,\mathbf {u}) \models [X \gets x^{\prime },\mathbf {W} \gets \mathbf {w}^{*}, \mathbf {D} \gets \mathbf {d}] Y \neq y\). This means that if we take (W = w∗,D = d) as witness, AC2(a) is satisfied for Original HP. Since (X = x,W = w∗) is directly sufficient for Y = y, we know that (M,u)⊧[X ← x,W ←w∗,D ←d]Y = y. Also, we have that Z = X, and thus the former means that AC2(b) is also satisfied for Original HP.
Fourth we prove the last implication. Assume X = x causes Y = y with witness W according to Def 10. (We know because of Proposition 10 that X is a singleton.) In other words, (X = x,W = w∗) is weakly sufficient for Y = y, and W = w∗ is not weakly sufficient for Y = y. It remains to be shown that there exists a value \(x^{\prime }\) so that \((X=x^{\prime },\mathbf {W}=\mathbf {w}^{*})\) is not weakly sufficient for Y = y.
Say \(\mathbf {u}^{\prime }\) is a context such that \((M,\mathbf {u}^{\prime }) \models [\mathbf {W} \gets \mathbf {w}^{*}] Y \neq y\), and say \(x^{\prime }\) is the unique value such that \((M,\mathbf {u}^{\prime }) \models [\mathbf {W} \gets \mathbf {w}^{*}] X = x^{\prime }\). Then also \((M,\mathbf {u}^{\prime }) \models [X \gets x^{\prime },\mathbf {W} \gets \mathbf {w}^{*}] Y \neq y\), which is what remained to be shown.
Fifth, we show that none of the remaining implications hold. (Again, we do not consider the relations amongst the HP definitions explicitly and refer the reader to the examples in Section ??. We also do not explicitly consider the remaining implications for parts of causes, but the reader can verify that the following examples suffice to falsify all those implications as well. For the left-hand side of all implications this follows immediately from the fact that the causes in all the following examples are singletons. For the right-hand side of implications, Propositions 4, 5, and 6 come in handy.)
Example 15 shows that Def 4 does not imply Def 10.
Example 14 shows that none of the other definitions imply either Def 4 or Def 10. So there are no remaining implications with either Def 4 or Def 10 on the right-hand side.
Example 17 shows that Def 3 is not implied by any definition.
Example 16 shows that none of the HP definitions imply Def 2 or Def 8. Note that Def 4 and Def 10 also consider X = 1 a cause of Y = 1 in that example (since X = 1 is weakly sufficient for Y = 1, whereas neither X = 0 nor the empty set is). Further, Example 2 shows that Def 8 does not imply Def 2. Therefore there are no remaining implications with Def 2 or Def 8 on the right-hand side.
That leaves us to consider implications with one of the HP definitions on the right-hand side. Given the first two implications of Theorem 4, it suffices to show that none of Def 4, Def 2, Def 8, or Def 10, imply Original HP, and that Def 3 does not imply Updated HP.
I refer the reader to Example 7 in Section ?? for an example where Def 2 – and thus also Def 8 – hold and Original HP does not. □
The following example shows that neither Def 4 nor Def 10 implies Original HP.
Example 5
Equations: Y = Z1 ∨ Z2 ∨ A, Z1 = X ∧ A, \(Z_{2}=X \land \lnot A\). Context: A = 1 and X = 1. Then X = 1 is a cause of Y = 1 according to:
-
Def 10: X = 1 is weakly sufficient for Y = 1 and ∅ is not.
-
Def 4: follows from the previous one.
Yet X = 1 is not a cause of Y = 1 according to Original HP. To see why, note that we need to include A = 0 into the witness in order to get AC2(a), and we must exclude Z1. Also, we clearly cannot add Z2 = 1. Therefore the witness has to be A = 0. The actual value of Z2 is 0. Since we have (M,u)⊧[X ← 1,A ← 0,Z2 ← 0]Y = 0, AC2(b) is not satisfied.
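For readers who want to replay Example 5, here is a small Python sketch. It assumes binary variables and roots A and X with hypothetical exogenous inputs; `solve` and `weakly_sufficient` are illustrative names. It confirms the two weak sufficiency facts behind Def 10 and the intervention that falsifies AC2(b) for Original HP.

```python
from itertools import product

CONTEXTS = list(product((0, 1), repeat=2))   # assumed encoding: u = (u_a, u_x)

def solve(u, do):
    """Example 5: Z1 = X and A, Z2 = X and not A, Y = Z1 or Z2 or A."""
    u_a, u_x = u
    v = {"A": do.get("A", u_a), "X": do.get("X", u_x)}
    v["Z1"] = do.get("Z1", int(v["X"] and v["A"]))
    v["Z2"] = do.get("Z2", int(v["X"] and not v["A"]))
    v["Y"] = int(v["Z1"] or v["Z2"] or v["A"])
    return v

def weakly_sufficient(do, y=1):
    """True iff Y = y holds under `do` in every context."""
    return all(solve(u, do)["Y"] == y for u in CONTEXTS)

actual = (1, 1)
print(weakly_sufficient({"X": 1}))                        # True
print(weakly_sufficient({}))                              # False: the empty set
print(solve(actual, {"X": 0, "A": 0})["Y"])               # 0: AC2(a) with witness A = 0
print(solve(actual, {"X": 1, "A": 0, "Z2": 0})["Y"])      # 0: AC2(b) fails
```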
Lastly, an example to show that Def 3 does not imply Updated HP.
Example 6
Equations: Y = (X ∧ D) ∨ A, D = A. Context: A = 1 and X = 1. Then X = 1 is a cause of Y = 1 according to Def 3: (X = 1,D = 1) is directly sufficient for Y = 1, and (X = 0,D = 1) is not. But X = 1 is not a cause of Y = 1 according to Updated HP. To see why, note that we need to include A = 0 into the witness in order to get AC2(a). But (M,u)⊧[X ← 1,A ← 0]Y = 0, thus falsifying AC2(b) for Updated HP.
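Example 6 can be checked in the same spirit. The sketch below again assumes binary variables and roots A and X with hypothetical exogenous inputs, and reads direct sufficiency as quantifying over every setting of the remaining endogenous variables other than Y and over every context. All function names are illustrative.

```python
from itertools import product

CONTEXTS = list(product((0, 1), repeat=2))   # assumed encoding: u = (u_a, u_x)

def solve(u, do):
    """Example 6: D = A, Y = (X and D) or A."""
    u_a, u_x = u
    v = {"A": do.get("A", u_a), "X": do.get("X", u_x)}
    v["D"] = do.get("D", v["A"])
    v["Y"] = int((v["X"] and v["D"]) or v["A"])
    return v

def directly_sufficient(fixed, y=1):
    """Check Y = y for all settings of the other non-Y variables and all contexts."""
    free = [n for n in ("A", "X", "D") if n not in fixed]
    return all(
        solve(u, {**fixed, **dict(zip(free, vals))})["Y"] == y
        for u in CONTEXTS
        for vals in product((0, 1), repeat=len(free))
    )

print(directly_sufficient({"X": 1, "D": 1}))        # True
print(directly_sufficient({"X": 0, "D": 1}))        # False: take A = 0
print(solve((1, 1), {"X": 1, "A": 0})["Y"])         # 0: AC2(b) fails for Updated HP
```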
Appendix C: Excluding Def 3 and Def 10
Proposition 5
If X = x causes Y = y in (M,u) according to Def 3, then X is a singleton, and X is a parent of Y.
Proof
That X is always a singleton is a direct consequence of the combination of Proposition 10 and Theorem 3.
Recall that X is a parent of Y iff there exists a context \(\mathbf {u}^{\prime \prime }\), a setting \(\mathbf {z} \in \mathcal{R}(\mathcal{V} - \{X,Y\})\), and values \(x,x^{\prime \prime }\) of X so that \(F_{Y}(\mathbf {u}^{\prime \prime },\mathbf {z},x) \neq F_{Y}(\mathbf {u}^{\prime \prime },\mathbf {z},x^{\prime \prime })\). This means precisely that for some \(y \in \mathcal{R}(Y)\), \((M,\mathbf {u}^{\prime \prime }) \models [\mathbf {Z} \gets \mathbf {z}, X \gets x]Y=y\) and \((M,\mathbf {u}^{\prime \prime }) \models [\mathbf {Z} \gets \mathbf {z}, X \gets x^{\prime \prime }]Y \neq y\). If X = x causes Y = y according to Def 3, the existence of values such that the previous holds follows immediately. □
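The parent condition recalled in this proof can be tested by brute force for small models. The following Python sketch is an illustration under simplifying assumptions: all variables are binary, and the context is folded into the arguments of the structural function. It uses the equations of Example 6 as test cases; `is_parent`, `f_y`, and `f_d` are hypothetical names.

```python
from itertools import product

def is_parent(f, arity, index):
    """True iff changing argument `index` changes f for some setting of the other arguments."""
    for args in product((0, 1), repeat=arity):
        flipped = list(args)
        flipped[index] = 1 - args[index]   # same setting, different value for one variable
        if f(*args) != f(*flipped):
            return True
    return False

def f_y(x, d, a):
    """Structural function of Y in Example 6: Y = (X AND D) OR A."""
    return int((x and d) or a)

def f_d(x, a):
    """Structural function of D in Example 6, with X added as a dummy argument: D = A."""
    return int(a)

print(is_parent(f_y, 3, 0))   # is X a parent of Y?
print(is_parent(f_d, 2, 0))   # is X a parent of D?
```

Under these assumptions X counts as a parent of Y, since setting D = 1 and A = 0 makes Y depend on X, whereas X is not a parent of D.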
Proposition 6
If X is only a parent of Y, then Def 3, Def 2, and Def 8 are all equivalent for causes X = x.
Proof
Given Theorem 4, we only need to prove the implication from Def 8 to Def 3.
Assume X is only a parent of Y, and X = x causes Y = y according to Def 8. Thus, there is a witness W and some network N such that (X = x,W = w∗) is strongly sufficient for Y = y along N, and (W = w∗) is not strongly sufficient for Y = y along any subnetwork of N.
First consider the case where N = ∅. This means that (X = x,W = w∗) is directly sufficient for Y = y, and (W = w∗) is not directly sufficient for Y = y. That means precisely that X = x causes Y = y according to Def 12. The result now follows from Theorem 3.
Second consider the case where there exists some variable \(N \in \mathbf {N}\). If \(N\) is not an ancestor of Y, it can be removed from the network \(\mathbf {N}\) without consequence. If \(N\) is an ancestor of Y, then it cannot be a descendant of X. But in that case it does not depend on X, and thus we can remove it from the network and add it to the witness W without consequence. Therefore there always exists a choice of witness so that the network is empty, and thus the result follows from the first case. □
Proposition 7
Out of all definitions we have considered, Def 10 and Def 3 are the only ones which do not satisfy Dependence.
Proof
For the HP definitions this is proven in Halpern [8, p. 26].
Example 17 shows the result for Def 3.
Example 15 shows the result for Def 10.
Therefore it remains to be shown that Dependence implies Def 2, Def 4, and Def 8. This is a direct consequence of the fact that Dependence implies Modified HP, combined with Proposition 14. □
Appendix D: Def 2, Def 4, and Def 8, vs the HP Definitions
Proposition 8
If X = x causes Y = y according to Modified HP with X a singleton, then X = x also causes Y = y according to Def 2, Def 4, and Def 8.
Proof
Recall the root variables R from Observation 1. Note that for any setting \(\mathbf {r} \in \mathcal{R}(\mathbf {R})\) and any set \(\mathbf {Y} \subseteq (\mathcal{V} - \mathbf {R})\), there exists some y so that R = r is weakly, actually weakly, and strongly sufficient for Y = y.
Assume X = x causes Y = y according to Modified HP with witness W. This means there exists an \(x^{\prime }\) so that \((M,\mathbf {u})\models [X \gets x^{\prime }, \mathbf {W} \gets \mathbf {w}^{*}] Y \neq y\). Let S = R − (W ∪ {X}).
First we focus on Def 4. Note that (X = x,S = s∗,W = w∗) is weakly sufficient for Y = y. Furthermore, changing X from x to \(x^{\prime }\) obviously has no effect on any of the values in R. Therefore \((M,\mathbf {u})\models [X \gets x^{\prime }, \mathbf {W} \gets \mathbf {w}^{*}] \mathbf {S}=\mathbf {s}^{*}\), and thus we get that \((M,\mathbf {u})\models [X \gets x^{\prime }, \mathbf {W} \gets \mathbf {w}^{*},\mathbf {S} \gets \mathbf {s}^{*}] Y \neq y\). (Also, we may assume that W ∩R = ∅.) From this it follows that \((X=x^{\prime },\mathbf {S}=\mathbf {s}^{*},\mathbf {W}=\mathbf {w}^{*})\) is not weakly sufficient for Y = y. So taking (S = s∗,W = w∗) as witness gives the desired result.
Second we focus on Def 2 (from which Def 8 follows due to Theorem 4). Combining the previous statement about \((X=x^{\prime },\mathbf {S}=\mathbf {s}^{*},\mathbf {W}=\mathbf {w}^{*})\) with Proposition 2 it follows immediately that there does not exist any network N so that \((X=x^{\prime },\mathbf {S}=\mathbf {s}^{*},\mathbf {W}=\mathbf {w}^{*})\) is strongly sufficient for Y = y along N.
Clearly there exists some N so that R = r∗ is strongly sufficient for Y = y along N. (We can start by picking parents A of Y = y such that A = a∗ is directly sufficient for Y = y. Then we can take parents of all elements in A, to get a set B so that B = b∗ is directly sufficient for A = a∗, etc.) But then also (X = x,S = s∗,W = w∗) is strongly sufficient for Y = y along N, from which the result follows. □
About this article
Cite this article
Beckers, S. Causal Sufficiency and Actual Causation. J Philos Logic 50, 1341–1374 (2021). https://doi.org/10.1007/s10992-021-09601-z
Keywords
- Actual causation
- Causal sufficiency
- NESS
- Counterfactuals