A New Condition for Transitivity of Probabilistic Support

As is well known, implication is transitive but probabilistic support is not. Eells and Sober, followed by Shogenji, showed that screening off is a sufficient constraint for the transitivity of probabilistic support. Moreover, this screening off condition can be weakened without sacrificing transitivity, as was demonstrated by Suppes and later by Roche. In this paper we introduce an even weaker sufficient condition for the transitivity of probabilistic support, in fact one that can be made as weak as one wishes. We explain that this condition has an interesting property: it shows that transitivity is retained even though the Simpson paradox reigns. We further show that by adding a certain restriction the condition can be turned into one that is both sufficient and necessary for transitivity.


Introduction
We say that proposition p probabilistically supports proposition q, and that q probabilistically supports r, if and only if
$$P(q \mid p) - P(q) > 0 \quad\text{and}\quad P(r \mid q) - P(r) > 0. \tag{1}$$
Unlike implication or entailment, probabilistic support is not transitive; in general it does not follow from (1) that p also supports r, i.e. that
$$P(r \mid p) - P(r) > 0. \tag{2}$$
In 2003 Tomoji Shogenji proved that screening off is sufficient for making probabilistic support transitive: if p supports q and q supports r and there is screening off, then p supports r. In 2012 William Roche described a weak version of screening off, and he demonstrated that it still suffices for transitivity. In this paper we weaken Roche's condition further, and we demonstrate that the transitivity of probabilistic support still holds.
Our argument is set up as follows. In Sect. 2 we recall the two versions of screening off: the normal version, which is a particular Markov condition, and the weak variant of Roche. An identity that Tomoji Shogenji developed in 2017 enables us to demonstrate in a particularly succinct manner that both versions suffice for transitivity. In Sect. 3 we show that there exists an even weaker condition, one that still guarantees the transitivity of probabilistic support. As we explain, our new condition covers a continuum of conditions, each of which is weaker than Roche's, and each of which preserves transitivity. Section 4 highlights an interesting property of the new condition: it guarantees transitivity of probabilistic support even if the Simpson paradox obtains. The Simpson effect, as we prefer to call this paradox, implies that p disconfirms r conditionally on q and also disconfirms r conditionally on ¬q , while p still confirms r unconditionally. Rather surprisingly, as we show, this effect can coexist with transitivity: p confirms q, q confirms r, and p confirms r. In Sect. 5 we illustrate this coexistence with a well-known medical example. We conclude in Sect. 6 by constraining our new sufficient condition in such a way that it is also necessary for the transitivity of probabilistic support.

Normal and Weak Screening Off
Our first task is to find a condition, C, that is sufficient for transitivity in the sense: "If C, then if (1) then (2)", or equivalently:
$$(C \wedge (1)) \rightarrow (2). \tag{3}$$
In 2003 Tomoji Shogenji showed that (3) is valid if for C we fill in the condition of screening off, which here means that q screens off p from r, that is,
$$P(r \mid q \wedge p) = P(r \mid q) \quad\text{and}\quad P(r \mid \neg q \wedge p) = P(r \mid \neg q). \tag{4}$$
Moreover, in 2012 William Roche demonstrated that (3) remains valid if we weaken the condition (4) to
$$P(r \mid q \wedge p) \geq P(r \mid q) \quad\text{and}\quad P(r \mid \neg q \wedge p) \geq P(r \mid \neg q), \tag{5}$$
where the equals signs have been replaced by inequalities. 3
A good way to see that both (4) and (5) are sufficient conditions for the transitivity of probabilistic support is by means of an illuminating analysis that Shogenji gave of what is in our notation P(r|p) − P(r). 4 We will not reproduce here all the actual steps of Shogenji's careful argument. For our purpose it is enough to say that he derives a relation which in our reworking is the following:
$$P(r \mid p) - P(r) = \alpha(p, r; q) + \beta(p, r; q) + \beta(p, r; \neg q), \tag{6}$$
where
$$\alpha(p, r; q) = \frac{[P(q \mid p) - P(q)]\,[P(r \mid q) - P(r)]}{1 - P(q)},$$
$$\beta(p, r; q) = P(q \mid p)\,[P(r \mid q \wedge p) - P(r \mid q)],$$
$$\beta(p, r; \neg q) = P(\neg q \mid p)\,[P(r \mid \neg q \wedge p) - P(r \mid \neg q)].$$
Equation (6) is an identity: it is valid for any propositions p, q and r, irrespective of whether there is screening off, or even probabilistic support. 5 The expressions for α and β contain four Carnap measures of confirmation. For example, P(r|q) − P(r) is Carnap's difference measure. The square brackets in the definitions of β are the Carnap measures of confirmation, conditional on q and ¬q respectively. The additional factors P(q|p), P(¬q|p), and the division by 1 − P(q), are inert, in the sense that none of them is zero (see footnote 3).
Under normal screening off (4) the square brackets in β(p, r; q) and β(p, r; ¬q) vanish, so that
$$\beta(p, r; q) = 0 \quad\text{and}\quad \beta(p, r; \neg q) = 0, \tag{7}$$
and the identity (6) reduces to
$$P(r \mid p) - P(r) = \alpha(p, r; q). \tag{8}$$
3 It is assumed that all the conditional probabilities are well defined, which implies that P(q), P(¬q), P(q ∧ p), and P(¬q ∧ p) are all non-zero.
4 Shogenji (2017).
5 The actual identity that Shogenji proves in Appendix A of his 2017 paper has four terms on the right-hand side. The third term corresponds to our β(p, r; q) and the fourth term to β(p, r; ¬q), with the identification of his x, y, z with our p, q, r, respectively. After an elementary transformation the sum of the first two terms becomes our α(p, r; q).
If p supports q and q supports r, then the right-hand side of (8) is positive. Therefore the left-hand side is positive: p supports r. So with the help of (6) we see that normal screening off suffices for transitivity. But (6) also allows us to see that there is transitivity under (5) as well, for the inequalities (5) imply that both β(p, r; q) and β(p, r; ¬q) are non-negative:
$$\beta(p, r; q) \geq 0 \quad\text{and}\quad \beta(p, r; \neg q) \geq 0. \tag{9}$$
If p supports q and q supports r, then α(p, r; q) is strictly positive, as we have just seen, and so with (9) the right-hand side of (6) is still positive. In fact, we do not need β(p, r; q) and β(p, r; ¬q) to be separately non-negative; it is clear from (6) that it is enough if their sum is non-negative:
$$\beta(p, r; q) + \beta(p, r; \neg q) \geq 0. \tag{10}$$
Condition (10) is a little weaker than (9) because it allows for the possibility that one of β(p, r; q) and β(p, r; ¬q) may be negative, on condition that the other is positive and sufficiently large to guarantee that the sum is equal to or greater than zero. We will call condition (10) weak screening off or w-screening off for short. If (1), then with w-screening off it follows that p supports r: P(r|p) > P(r).
Conclusion: C in (3) may be identified with either (7), the normal screening off condition, or with (10), the condition of w-screening off. In both cases, transitivity has been ensured.
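The identity (6) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the definitions of α and β given with (6), uses a hypothetical joint distribution over p, q and r built from arbitrary weights, and relies on exact rational arithmetic so that the two sides can be compared without rounding. The helper names `make_dist`, `prob`, `cond` and `decompose` are illustrative.

```python
from fractions import Fraction as F
from itertools import product

def make_dist(weights):
    """Normalise 8 positive weights into a joint distribution over (p, q, r)."""
    cells = list(product([True, False], repeat=3))  # (p, q, r) truth values
    total = sum(weights)
    return {cell: F(w) / total for cell, w in zip(cells, weights)}

def prob(dist, event):
    """Probability of the set of cells for which event(p, q, r) is true."""
    return sum(pr for cell, pr in dist.items() if event(*cell))

def cond(dist, event, given):
    """Conditional probability P(event | given)."""
    joint = prob(dist, lambda p, q, r: event(p, q, r) and given(p, q, r))
    return joint / prob(dist, given)

def decompose(dist):
    """Return (P(r|p) - P(r), alpha, beta_q, beta_notq) as in identity (6)."""
    is_p = lambda p, q, r: p
    is_q = lambda p, q, r: q
    is_nq = lambda p, q, r: not q
    is_r = lambda p, q, r: r
    P_q, P_r = prob(dist, is_q), prob(dist, is_r)
    support = cond(dist, is_r, is_p) - P_r
    alpha = (cond(dist, is_q, is_p) - P_q) * (cond(dist, is_r, is_q) - P_r) / (1 - P_q)
    beta_q = cond(dist, is_q, is_p) * (
        cond(dist, is_r, lambda p, q, r: p and q) - cond(dist, is_r, is_q))
    beta_nq = cond(dist, is_nq, is_p) * (
        cond(dist, is_r, lambda p, q, r: p and not q) - cond(dist, is_r, is_nq))
    return support, alpha, beta_q, beta_nq

# Hypothetical weights, in the order produced by itertools.product:
# (p,q,r), (p,q,~r), (p,~q,r), (p,~q,~r), (~p,q,r), (~p,q,~r), (~p,~q,r), (~p,~q,~r)
dist = make_dist([3, 1, 2, 4, 5, 2, 1, 6])
support, alpha, beta_q, beta_nq = decompose(dist)
identity_holds = (support == alpha + beta_q + beta_nq)
print(identity_holds)  # True
```

In this particular distribution α is negative (p does not support q), yet the identity still holds, which illustrates that (6) does not depend on screening off or on probabilistic support.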

Very Weak Screening Off
In this section we will weaken w-screening off still further while retaining the transitivity. Our new condition, which we may call very weak screening off or vw-screening off for short, has to satisfy two requirements: (i) it is weaker than w-screening off; (ii) it makes the left-hand side of (6) positive, so that p supports r, and transitivity has been achieved.
Clearly requirement (i) is fulfilled if β(p, r; q) + β(p, r; ¬q) < 0, for this is a situation explicitly ruled out by (10), the condition of w-screening off. As to requirement (ii), this can still be fulfilled even if the sum β(p, r; q) + β(p, r; ¬q) is negative. What is needed for this possibility is that the sum be not too small: the negativity of β(p, r; q) + β(p, r; ¬q) may so to speak not overpower the positivity of α(p, r; q). So requirement (ii) is fulfilled if β(p, r; q) + β(p, r; ¬q) > −α(p, r; q).
To show that we are talking about a real possibility see Fig. 1, which displays a probability distribution for which 0 > β(p, r; q) + β(p, r; ¬q) > −α(p, r; q).
From this probability distribution we calculate
$$\beta(p, r; q) = -\tfrac{2}{15} \quad\text{and}\quad \beta(p, r; \neg q) = -\tfrac{1}{50},$$
but nevertheless the differences P(q|p) − P(q), P(r|q) − P(r) and P(r|p) − P(r) are all positive. How to define the condition of vw-screening off, which encompasses probability distributions like that of Fig. 1? It may seem that we have already found a suitable definition when we observed that β(p, r; q) + β(p, r; ¬q) may be negative as long as this negativity does not swamp the positivity of α(p, r; q). This line of thought yields as a candidate for the definition of vw-screening off:
$$\beta(p, r; q) + \beta(p, r; \neg q) > -\alpha(p, r; q). \tag{11}$$
This inequality satisfies (i) and (ii): it is weaker than w-screening off, since it allows possibilities that are excluded by the latter, and it implies that p supports r. So it seems that (11) can take the rôle of C in (3):
$$((11) \wedge (1)) \rightarrow (2),$$
where (1) means that p supports q and q supports r, and (2) that p supports r.
However (11) is not acceptable, for it satisfies (3) only trivially: (11) by itself entails (2), there is no need for (1). To see this, add α(p, r; q) to both sides of (11). Then we obtain
$$\alpha(p, r; q) + \beta(p, r; q) + \beta(p, r; \neg q) > 0.$$
From the identity (6) it then follows that P(r|p) − P(r) > 0 and thus that P(r|p) > P(r). All that (11) does is to assert the tautology:
$$((2) \wedge (1)) \rightarrow (2).$$
The fact that (11) does not need (1) in order to entail (2) goes against the very idea of transitivity, which after all is that p supports r through the mediator q. Clearly we require a condition for which (1) is needed. 8 Note that both normal screening off and w-screening off satisfy this requirement: both need (1) in order to entail (2), since the conclusion that p supports r does not follow from (7) or (10) alone.
Here is a way to formulate a nontrivial condition of vw-screening off. Consider
$$\beta(p, r; q) + \beta(p, r; \neg q) \geq -(1 - \varepsilon)\,\alpha(p, r; q). \tag{12}$$
If ε = 0, then either (12) reduces to the trivial (11) or p does not support r at all. However, if 0 < ε < 1, then (12) does the job. It then satisfies requirement (i), since β(p, r; q) + β(p, r; ¬q) may be negative; it is thus weaker than w-screening off, which would correspond to (12) with ε ≥ 1. It also satisfies (ii), for the Shogenji identity (6) shows that (12) is equivalent to P(r|p) − P(r) ≥ ε α(p, r; q); and because ε and α(p, r; q) are both positive, p supports r. Finally, (12) with 0 < ε < 1 is not a trivial condition. Since α(p, r; q) could in general be negative, it needs (1) to ensure the positivity of α(p, r; q):
$$((12) \wedge (1)) \rightarrow (2).$$
This is our condition of vw-screening off. Note that, because (12) contains ε explicitly, our condition for vw-screening off covers a continuum of conditions, one for each value of ε in the open interval (0, 1). Each of these conditions serves as a sufficient constraint for the transitivity of probabilistic support which is weaker than w-screening off. By making ε smaller and smaller, we can make the constraint as weak as we like.
In order to render our argument more intuitive and less abstract, we will in Sect. 5 give a real medical example of vw-screening off as defined by (12). But first, in Sect. 4, we explain that vw-screening off has a remarkable property: it preserves transitivity even in the presence of the Simpson effect.
8 The same triviality lurks in a condition WC that William Roche introduces on p. 456 of Roche (2018). Transposed to our notation, WC is the statement "It is not the case that β(p, r; q) + β(p, r; ¬q) < 0 and |β(p, r; q) + β(p, r; ¬q)| ≥ α(p, r; q)". This is equivalent to the following disjunction: β(p, r; q) + β(p, r; ¬q) ≥ 0 or β(p, r; q) + β(p, r; ¬q) > −α(p, r; q). The first disjunct is the inequality (10), that is our condition of w-screening off; but the second disjunct is our trivial condition (11).
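The difference between w-screening off and vw-screening off can be made concrete with a small check. This is only a sketch under stated assumptions: condition (12) is taken in the form β(p, r; q) + β(p, r; ¬q) ≥ −(1 − ε) α(p, r; q), the function names are hypothetical, and the three input numbers are illustrative values chosen by hand, not the values of Fig. 1 and not derived from a specific distribution.

```python
from fractions import Fraction as F

def w_screening_off(beta_q, beta_nq):
    """Condition (10): the two beta terms sum to a non-negative number."""
    return beta_q + beta_nq >= 0

def vw_screening_off(alpha, beta_q, beta_nq, eps):
    """Condition (12): beta_q + beta_nq >= -(1 - eps) * alpha, with 0 < eps < 1."""
    if not 0 < eps < 1:
        raise ValueError("eps must lie strictly between 0 and 1")
    return beta_q + beta_nq >= -(1 - eps) * alpha

# Illustrative values only; alpha is positive, as it must be whenever (1) holds.
alpha, beta_q, beta_nq = F(1, 5), F(-1, 10), F(1, 50)

print(w_screening_off(beta_q, beta_nq))                    # False: the sum is -2/25
print(vw_screening_off(alpha, beta_q, beta_nq, F(1, 2)))   # True:  -2/25 >= -1/10
print(vw_screening_off(alpha, beta_q, beta_nq, F(7, 10)))  # False: -2/25 <  -3/50
print(alpha + beta_q + beta_nq)                            # 3/25, i.e. P(r|p) - P(r) by (6)
```

With these numbers w-screening off fails, yet (12) holds for ε = 1/2, and the positive value 3/25 of P(r|p) − P(r) shows that p still supports r. Raising ε to 7/10 makes the condition too strong for the same numbers.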

Transitivity and the Simpson Effect
The example of vw-screening given in Fig. 1 is also an instance of the Simpson effect. Recall the probability distribution in Fig. 1. Here there is not only transitivity (p supports q, q supports r and p supports r), but there is also a Simpson effect: p disconfirms r conditionally on q and also conditionally on ¬q, but nevertheless p confirms r unconditionally. The fact that, via vw-screening off, transitivity can coexist with the Simpson effect is somewhat surprising; it was at least to us. For the two results seem to pull in different directions. To put it somewhat impressionistically, while transitivity has a ring of uninterruptedness to it, suggesting the continuous flow of probabilistic support through a chain, the Simpson effect gives the idea of an unexpected rupture, which is precisely why it is experienced as a paradox.
In general, the relation between the Simpson effect on the one hand and screening off (normal, weak, or very weak) on the other might appear somewhat complicated. Figure 2 can help us to achieve a better understanding of how precisely the two are connected. On the left, the smallest circle represents normal screening off (n-so) where β(p, r; q) = 0 and β(p, r; ¬q) = 0, the next circle represents weak screening off (w-so) where β(p, r; q) + β(p, r; ¬q) ≥ 0, and the large circle represents very weak screening off (vw-so) where β(p, r; q) + β(p, r; ¬q) > −α(p, r; q). In all of these three regions p supports q and q supports r, so α(p, r; q) is positive and there is transitivity of probabilistic support.
The circle on the right represents the Simpson effect (se), in which also p supports r, but β(p, r; q) < 0 and β(p, r; ¬q) < 0. In the overlap region between vw-so and se, α(p, r; q) is positive but β(p, r; q) and β(p, r; ¬q) are both negative, and there is transitivity of support. Outside se and w-so but inside vw-so, one of β(p, r; q) and β(p, r; ¬q) is positive and the other is negative, so there is no Simpson effect, but their sum is negative. But inside se and outside vw-so, β(p, r; q) and β(p, r; ¬q) are both negative and, although α(p, r; q) is positive, p disconfirms q and q disconfirms r.
These properties have been implicitly obtained already in the literature. For example, the theorem in Appendix 1 of Lindley and Novick, translated into our notation, states that, 'If Simpson's paradox holds, with p and r positively correlated, and p and q positively correlated, then q and r are positively correlated'. This is one half of our finding. Mittal supplied what is equivalent to the other half of our result. Theorem 4.1 in Mittal's paper, again translated into our notation, states that, 'If Simpson's paradox holds, with p and r positively correlated, then either (a) q is positively correlated to p and to r, or (b) q is positively correlated to ¬p and to ¬r'. Mittal's alternative (b) amounts to P(q|¬p) > P(q) and P(q|¬r) > P(q), which is equivalent to P(q|p) < P(q) and P(r|q) < P(r). Thus Mittal's case (b) corresponds to the region of Fig. 2 inside se but outside vw-so. However, our approach enables us to see that there is more to be said. As we have seen, the Simpson effect does not imply that p supports q, and q supports r. Nevertheless it could be argued that the Simpson effect manifests a generalized sense of the transitivity of probabilistic support. For the Shogenji identity (6) can be transposed as follows:
$$\alpha(p, r; q) = \{P(r \mid p) - P(r)\} + \{-\beta(p, r; q)\} + \{-\beta(p, r; \neg q)\}. \tag{13}$$
Under Simpson reversal all the quantities between the braces {...} are positive, so α(p, r; q) must be positive too. Thus either P(q|p) > P(q) and P(r|q) > P(r) or P(q|p) < P(q) and P(r|q) < P(r). However in the latter case P(¬q|p) > P(¬q) and P(r|¬q) > P(r). So p supports ¬q, and ¬q supports r. So whenever Simpson reversal occurs, p supports r, either through the mediation of q or through the mediation of ¬q.
Various estimates of probabilistic support, under the guise of Bayesian measures of the confirmation of hypotheses, have been listed by Fitelson (1999) and others. Although they suffer from the disadvantage that they are not ordinally equivalent to one another, they do all agree that, if P(p ∧ r) > P(p)P(r), then c(r, p) > 0, and if P(p ∧ r) < P(p)P(r), then c(r, p) < 0, where c(r, p) is any of the aforementioned Bayesian measures of confirmation. Accordingly, whenever the Simpson effect occurs, then one or other of the following alternatives is true:
(a) c(r, p) > 0 and c(q, p) > 0 and c(r, q) > 0
(b) c(r, p) > 0 and c(q, p) < 0 and c(r, q) < 0.
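The two alternatives can be illustrated with a short sketch. It assumes the reading of the Simpson effect used above (p confirms r unconditionally, but disconfirms r conditionally on q and on ¬q) and reports whether support flows through q or through ¬q. The frequency table is hypothetical and was constructed only for illustration; `make_P` and `simpson_mediator` are illustrative names.

```python
from fractions import Fraction as F

def make_P(counts):
    """Build P(event, given) from a dict mapping (p, q, r) truth values to counts."""
    def P(event, given=lambda p, q, r: True):
        num = sum(c for cell, c in counts.items() if event(*cell) and given(*cell))
        den = sum(c for cell, c in counts.items() if given(*cell))
        return F(num, den)
    return P

def simpson_mediator(counts):
    """Return 'q' or 'not-q' under a Simpson effect, and None if there is none."""
    P = make_P(counts)
    is_p = lambda p, q, r: p
    is_q = lambda p, q, r: q
    is_nq = lambda p, q, r: not q
    is_r = lambda p, q, r: r
    reversal = (
        P(is_r, is_p) > P(is_r)                                   # p confirms r overall
        and P(is_r, lambda p, q, r: p and q) < P(is_r, is_q)      # but not given q
        and P(is_r, lambda p, q, r: p and not q) < P(is_r, is_nq) # nor given not-q
    )
    if not reversal:
        return None
    # Alternative (a): p supports q and q supports r; otherwise alternative (b).
    if P(is_q, is_p) > P(is_q) and P(is_r, is_q) > P(is_r):
        return "q"
    return "not-q"

# Hypothetical frequencies for the eight cells (p, q, r).
counts = {
    (True, True, True): 80,   (True, True, False): 20,
    (True, False, True): 4,   (True, False, False): 16,
    (False, True, True): 18,  (False, True, False): 2,
    (False, False, True): 30, (False, False, False): 70,
}
# Relabelling q as not-q keeps the Simpson effect but switches the mediator.
swapped = {(p, not q, r): c for (p, q, r), c in counts.items()}

print(simpson_mediator(counts))   # q
print(simpson_mediator(swapped))  # not-q
```

The relabelled table shows alternative (b): the same reversal occurs, but now p disconfirms q and q disconfirms r, so the support runs through ¬q.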

Example: Kidney Stones
A real life instance of the Simpson effect concerns the removal of kidney stones (renal calculi). Julious and Mullee drew attention to a study that had been made by Charig and coworkers of the success rates of two kinds of operations to remove the stones: open surgery or percutaneous nephrolithotomy (the penetration of the skin and kidney by a tube, through which the stone is removed). Julious and Mullee concentrated on 700 operations that were performed on patients with kidney stones, one half by open surgery (between 1972 and 1980), and the other half by percutaneous nephrolithotomy (between 1980 and 1985). An operation was deemed successful if no stones greater than 2 mm in diameter were present in the operated kidney three months after the operation; and success rates were compared for stones that were smaller or larger than 2 cm in diameter.
Consider one operation among these 700, and define the following propositions:
r: the operation was successful
p: percutaneous nephrolithotomy was performed
¬p: open surgery was performed
q: the stone that was removed was less than 2 cm in diameter
¬q: the stone that was removed was at least 2 cm in diameter
Since the number of percutaneous nephrolithotomies was equal to the number of open surgeries (namely 350), P(p) = 0.5.
The numbers given by Charig et al. correspond to conditional probabilities (relative frequencies) from which the complete probability distribution can be extracted; it has been reproduced in Fig. 3. From this distribution we calculate
$$P(r \mid q \wedge p) - P(r \mid q) < 0 \quad\text{and}\quad P(r \mid \neg q \wedge p) - P(r \mid \neg q) < 0. \tag{14}$$
So percutaneous nephrolithotomy decreases the chance of success for stones of less than 2 cm diameter, and also for stones at least as large as 2 cm. This is of course an example of the Simpson effect, which was the burden of the paper of Julious and Mullee.
From Fig. 3 we can also calculate that P(q|p) − P(q) > 0 and P(r|q) − P(r) > 0; thus p supports q, and q supports r, in the sense that the correlations in question are positive. In other words, the kidney stone example displays the Simpson effect and also transitivity of probabilistic support; it is in fact an instance of very weak screening off. Eqs. (14) imply that the two functions β(p, r; q) and β(p, r; ¬q) are negative, while the function α(p, r; q) is positive. The sum α(p, r; q) + β(p, r; q) + β(p, r; ¬q) is equal to P(r|p) − P(r), as should be the case. We see from (6) and (12) […]
[Fig. 3: probability distribution for the kidney stone example.]
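The kidney stone case can be reworked numerically. The sketch below rests on an assumption: it uses the raw success counts as they are commonly reported for the Charig et al. study (234 of 270 and 55 of 80 for percutaneous nephrolithotomy, 81 of 87 and 192 of 263 for open surgery), not the entries of Fig. 3, which may have been rounded differently. The variable names are illustrative, and condition (12) is taken in the form P(r|p) − P(r) ≥ ε α(p, r; q).

```python
from fractions import Fraction as F

# (successes, operations) per treatment and stone size; assumed raw counts.
pn_small, pn_large = (234, 270), (55, 80)        # p and q,     p and not-q
open_small, open_large = (81, 87), (192, 263)    # not-p and q, not-p and not-q

n = 700  # total number of operations
P_q = F(pn_small[1] + open_small[1], n)
P_r = F(pn_small[0] + pn_large[0] + open_small[0] + open_large[0], n)
P_q_given_p = F(pn_small[1], pn_small[1] + pn_large[1])
P_r_given_p = F(pn_small[0] + pn_large[0], pn_small[1] + pn_large[1])
P_r_given_q = F(pn_small[0] + open_small[0], pn_small[1] + open_small[1])
P_r_given_nq = F(pn_large[0] + open_large[0], pn_large[1] + open_large[1])
P_r_given_pq = F(*pn_small)
P_r_given_pnq = F(*pn_large)

# The three terms of identity (6).
alpha = (P_q_given_p - P_q) * (P_r_given_q - P_r) / (1 - P_q)
beta_q = P_q_given_p * (P_r_given_pq - P_r_given_q)
beta_nq = (1 - P_q_given_p) * (P_r_given_pnq - P_r_given_nq)
support = P_r_given_p - P_r

# Largest eps for which (12) holds: support >= eps * alpha.
eps_max = support / alpha

print(float(alpha) > 0, float(beta_q) < 0, float(beta_nq) < 0)  # True True True
print(support == alpha + beta_q + beta_nq)                      # True
print(round(float(eps_max), 3))                                 # 0.539
```

Under these assumed counts both β terms are negative (the Simpson effect), α is positive (p supports q and q supports r), and (12) is satisfied for every ε up to roughly 0.54, so the example falls under very weak screening off but not under weak screening off.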

A Necessary Condition
We started our inquiry by recalling the well-known fact that probabilistic support is in general not transitive. Tomoji Shogenji however showed that it is transitive under (normal) screening off, and William Roche proved that normal screening off can be weakened, while retaining the transitivity; earlier proofs can be found in Eells and Sober (1983), and Suppes (1986). In this paper we offered an alternative proof of their results, making use of a powerful identity in Shogenji (2017). We then weakened weak screening off further to what we called very weak screening off, defined by inequality (12) where ε satisfies 0 < ε < 1. Very weak screening off covers a continuum of conditions, each of which is weaker than weak screening off and is sufficient for transitivity. We pointed out that a special case of very weak screening off includes a special case of the Simpson effect, which we illustrated by means of an example about kidney stones taken from the seminal paper by Julious and Mullee.
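Like normal and weak screening off, very weak screening off is a nontrivial sufficient condition for (2), given (1). If 0 < ε < 1, then
$$((12) \wedge (1)) \rightarrow (2).$$
However, by placing an extra constraint on ε we can turn (12) into a nontrivial necessary condition as well. With any ε satisfying the inequality
$$0 < \varepsilon \leq 1 + \frac{\beta(p, r; q) + \beta(p, r; \neg q)}{\alpha(p, r; q)}, \tag{15}$$
we will show that:
$$((1) \wedge (2)) \rightarrow (12).$$
According to the Shogenji identity, the right-hand side of (15) is equal to
$$1 + \frac{\beta(p, r; q) + \beta(p, r; \neg q)}{\alpha(p, r; q)} = \frac{P(r \mid p) - P(r)}{\alpha(p, r; q)}. \tag{16}$$
From (1), α(p, r; q) is positive, and from (2), P(r|p) − P(r) is positive, so it follows that the right-hand side of (16) is positive, so the left-hand side is positive, too. Note that (2) is required to make the right-hand side of (15) positive: without (2) it could be negative, which would make (15) inconsistent. On multiplying (15) throughout by α(p, r; q), we obtain
$$\varepsilon\,\alpha(p, r; q) \leq \alpha(p, r; q) + \beta(p, r; q) + \beta(p, r; \neg q),$$
and this is none other than (12).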
1. With ε also being positive, (12) suffices for transitivity: ((12) ∧ (1)) → (2).
2. The domain ε > 0 can be divided into two subdomains: (a) ε ≥ 1, when weak screening off applies (or normal screening off as a special case); (b) the more stringent ε < 1, when very weak screening off holds sway. Very weak screening off encompasses weak and normal screening off, but also includes probability distributions that fall outside the domain of weak screening off. In either case (a) or (b) transitivity transpires, which is why we were able simply to specify ε > 0 as the constraint for sufficiency.
3. The even stronger restriction (15) is a nontrivial necessary and sufficient condition for transitivity, that is for the following to hold,
$$(1) \rightarrow \bigl[(12) \leftrightarrow (2)\bigr].$$
Note that, since the inequality (12) depends on ε, there is in effect a separate condition of necessity and sufficiency for each value of ε in the permitted range. In examining various conditions for transitivity, we have been relying on the ∀ quantifier: we talked about particular values for all ε in a particular domain. As an alternative, one could employ the quantifier ∃. For example, one could express the necessary and sufficient condition for probabilistic support as:
$$(1) \rightarrow \bigl[(2) \leftrightarrow \exists\,\varepsilon > 0 : \beta(p, r; q) + \beta(p, r; \neg q) \geq -(1 - \varepsilon)\,\alpha(p, r; q)\bigr].$$
This expression has the advantage of being economical, indeed terse. The disadvantage might be that it hides much under the logical carpet.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.