In scientific discourse and in natural language communication, when thinking about history and in everyday situations, we are confronted with conditionals such as “If it is even, then it is a six” (when commenting on a die roll) or “If Reagan worked for the KGB, I’ll never find out” (Lewis, 1986, p. 155) or “If I had drunk my morning coffee, I would not have a headache”. Estimating or computing the probabilities of such claims is a notorious problem. In particular, a much-discussed issue is whether the probability of a conditional sentence equals the corresponding conditional probability (in the appropriate probability space). This is Adams’ definition: the probability of a conditional P(A → C) is P(C|A) (Adams, 1965, 1970, 1975, 1998).Footnote 1 We shall use the standard acronym PCCP for this thesis.Footnote 2

This solution is simple, attractive and coherent with our intuitions in many cases. The intuitive probability of the conditional If it is even, then it is a six is 1/3 (for a fair die), which is exactly the conditional probability P(It is a six|It is even).Footnote 3 However, it is far from obvious that the rule according to which the probability of the conditional is conditional probability should be accepted in full generality, and there is intense discussion going on. Lewis’ triviality results (Lewis, 1976) (which purport to show that this equality holds in very special cases only) are very important for this discussion as they indicate that the problem has a fundamental character. In a nutshell, they suggest that it is not possible to give a sound formal argument for PCCP (apart from trivial cases), which—in Lewis’ words—leads to absurd results.Footnote 4

In the present article we propose an original solution to this problem. Our starting point is the construction of a diachronic Dutch Book for the conditional A → C. We analyze the betting behavior of a rational agent whose beliefs are expressed in a propositional language that—apart from the Boolean connectives—also contains conditionals. This agent ascribes credence to their beliefs, and a minimal condition for the agent being rational is the coherence of their beliefs (in particular their credence assignments).Footnote 5 In particular, this means that a Dutch Book cannot be constructed against a rational agent. We will use the term “DB-resistant” for such a system of beliefs.

The role of Dutch Book analysis for the rationality of agents was stressed in Lewis (1999) and we accept his point of view (see also footnote 18).Footnote 6 This is a fairly standard approach when considering beliefs expressed as ordinary Boolean sentences, i.e. without the conditional. In this paper we propose an extension of this approach to language that also contains conditionals. This will allow the following questions to be answered:

  1. What is the coherent, DB-resistant extension of these beliefs to the conditional sentence A → C, assuming that the agent has a coherent, DB-resistant system of beliefs concerning the Boolean part of their language? Namely, what credence should be assigned to A → C, given P(A), P(C) and P(AC)?

  2. What is the appropriate (and, if possible, the simplest) probability space S* = (Ω*, Σ*, P*) which allows the DB-resistant probability of the conditional A → C to be computed in a formally sound way and which provides a mathematical underpinning for our DB results?

  3. How should we interpret Lewis’ triviality proof in the light of our findings? Is it true that (apart from trivial cases) there is no sound mathematical argument in favor of PCCP?Footnote 7

Providing answers to these questions will give a strong argument in favor of PCCP, in particular by identifying some weaknesses in Lewis’ original reasoning (Lewis, 1976).Footnote 8 To make the presentation of our arguments lucid and illustrative, we use a simple urn example (a counterpart of the example from Edgington (1995) or Kaufmann (2004)). The numerical calculations are therefore very simple, but without loss of generality they exhibit the crucial phenomena and illustrate the problematic aspects of the intuitive, informal argumentation. The general formal construction is given in the “Appendix”.

The structure of the paper is as follows:

In Sect. 1, An introductory example, we show how to define a diachronic Dutch Book against an agent who has inconsistent beliefs about the credence of the conditional. Accordingly, we show that the DB-resistant credence of the conditional A → C is given as conditional probability P(C|A). We show this without (yet) invoking any formally defined probability space.

In Sect. 2, The integration of partial information, we examine the paradoxical consequences of a straightforward application of the Law of Total Probability to conditionals (a real-life example is also presented). The Law of Total Probability is an elementary result in probability theory, but it is a very subtle question how it works for conditional sentences.

In Sect. 3, Lewis’ reasoning, we give an analysis of Lewis’ triviality proof from Lewis (1976) and identify the problematic assumptions in this proof.

In Sect. 4, The space S*, we define in a mathematically sound way the probability space S* = (Ω*, Σ*, P*), which allows an independent mathematical underpinning to be given for the obtained results. In particular, it allows the probability of the conditional P*(A → C) to be computed in a mathematically sound way—so that it coincides with the DB-resistant value for bets that is obtained with the aid of the analysis of optimal betting behavior. This explains the source of the misunderstandings and identifies the proper version of the Law of Total Probability for conditionals.

In Sect. 5, Lewis’ dilemma in S*, we analyze two possible interpretations of conditionalizing the conditional A → C on B (i.e. of P(A → C|B)). We show that neither of these interpretations allows two of Lewis’ essential assumptions to hold simultaneously in the properly defined probability space. The constructed space S* = (Ω*, Σ*, P*) makes it possible to explain where Lewis’ assumptions are problematic.

We conclude with a short Summary.

In the “Appendix” we present the general construction of a diachronic Dutch Book for the conditional A → C.

1 An introductory example

Consider an urn containing 100 White balls, 80 Green balls and 20 Red balls, all of which are equally likely to be drawn (in short: 100W, 80G, 20R).Footnote 9 This is modeled in a natural way by the sample space S = (Ω, Σ, P), in which:

  • Ω = {W, R, G};

  • P(W) = 100/200 = 0.5;

  • P(G) = 80/200 = 0.4;

  • P(R) = 20/200 = 0.1.

Σ is the σ-field, which consists of all subsets of Ω, i.e. Σ = 2^Ω. We will make use of this sample space S = (Ω, Σ, P) throughout this paper.Footnote 10 The corresponding language is formed from three atomic sentences: W, G, R. In this case, we restrict ourselves to one particular conditional: ¬W → G. This means that in our language we have the Boolean combinations of W, G, R and ¬W → G.Footnote 11


We are interested in the probability of the conditional:

  • (*) If the ball is non-White, it is Green. (i.e. ¬W → G).

None of the 8 events in S = (Ω, Σ, P) is a counterpart of ¬W → G.Footnote 12 So, if we want to define the probability of ¬W → G in the mathematically standard way, we need to construct an appropriate probability space. Before we do this, we will find the DB-resistant credence of ¬W → G by means of a Dutch Book analysis.
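The space S is small enough to be enumerated explicitly. The following is a minimal sketch in Python (a language the paper itself does not use; the names `URN`, `prob`, `cond_prob` and `SIGMA` are illustrative assumptions). It lists the events of Σ and computes the conditional probability P(G|¬W) that the Dutch Book analysis below will single out.

```python
from fractions import Fraction
from itertools import combinations

# Ball counts from the urn example: 100 White, 80 Green, 20 Red.
URN = {"W": 100, "G": 80, "R": 20}
TOTAL = sum(URN.values())

def prob(event):
    """P(event) for an event given as a set of colours (a subset of Omega)."""
    return Fraction(sum(URN[c] for c in event), TOTAL)

def cond_prob(event, given):
    """P(event | given), defined only when P(given) > 0."""
    return prob(set(event) & set(given)) / prob(given)

# Sigma consists of all subsets of Omega = {W, G, R}.
OMEGA = sorted(URN)
SIGMA = [frozenset(s) for r in range(len(OMEGA) + 1)
         for s in combinations(OMEGA, r)]

NOT_W = {"G", "R"}
print(len(SIGMA))               # 8
print(cond_prob({"G"}, NOT_W))  # 4/5
```

Exact fractions are used so that the values can be compared with the text without rounding.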

1.1 Some general assumptions

In order to discuss the rational probability assignment made by the agent, we first formulate some minimal assumptions concerning the agent’s decisions, their understanding of the notion of credence, and their interpretation of the conditionals.


1. The bets accepted by the rational agent


The standard way of identifying the subjective probability of a sentence is by analyzing the bets the agent considers to be fair.Footnote 13 We use the symbol PDB for the credence function of the agent. This function leads from the set of sentences in the language of the agent into the interval [0, 1].Footnote 14 At this stage we do not yet assume that PDB is a probability function (on some probability space), but we will show later that it does indeed have a formal counterpart.

The betting behavior of the rational agent has the following properties:


1.1. If the agent thinks that PDB(A) = x, then they consider both selling and buying bets on A for $x to be fair:

(a) The Bookmaker sells the bet on A for $x, i.e. they get $x from the agent. If A happens to be true, the Bookmaker pays $1 to the agent (so in this case the agent’s win is $(1 − x)). If A is false, the Bookmaker keeps the $x.

(b) The Bookmaker buys the bet on A from the agent for $x, i.e. they pay $x to the agent. If A happens to be true, the agent pays $1 to the Bookmaker, i.e. the agent’s loss is $(1 − x). If A is false, the agent keeps the $x.


1.2 If the agent considers a bet to be fair, they are willing to repeat it an arbitrary number of times and to make n bets simultaneously (i.e. to buy/sell for $nx a bet where the win is $n).


2. The credence function PDB


The agent’s beliefs concerning atomic sentences are modeled in the sample probability space S = (Ω, Σ, P). We know that if the function PDB violated the rules of probability, it would be possible to construct a Dutch Book against the agent, i.e. a series of bets, each of which the agent considers to be fair but which inevitably lead to the agent’s loss.Footnote 15

So, for Boolean sentences A and C (i.e. formed from atomic sentences with the use of only ¬, ∧, ∨ but not the conditional →), the rational agent’s credence function PDB is exactly the probability function P, i.e.:

  • 2.1 PDB(A) = P(A).

Of course, because it is a probability function that is defined on the Boolean part of the language, PDB has the following properties:

  • 2.2 PDB(A) = 1 − PDB(¬A).

  • 2.3 PDBA(C) = \(\frac{\text{P}^{\text{DB}}(A \wedge C)}{\text{P}^{\text{DB}}(A)}\).

Here PDBA(.) is the credence function which results from a belief revision by conditionalizing on a sentence A (i.e. the agent assumes that A is true and modifies the belief system appropriately).Footnote 16


3. Interpretation of the conditional A → C


The agent has the conditional → in their language, so we have to give an outline of how the conditional is interpreted. Of course, A → C is not identified with the material implication (i.e. with the Boolean sentence ¬A ∨ C).


3.1 In order to make and settle bets on conditionals, the agent has to assume that there are circumstances in which the conditional is considered to be true, and there are also circumstances in which the conditional is considered to be false. This means that the notion of truth conditions for A → C is accepted.

3.2 The agent believes that if both A and C are true, then the conditional A → C is true.

3.3 The agent believes that if A is true but C is false, then the conditional A → C is false.

3.4 The agent believes that if A is false, then the conditional A → C is neither true nor false.Footnote 17

3.5 If PDB(A → C) = x, then the agent considers the bet on A → C for $x to be fair (cf. 1.1), but assumes that it will be cancelled if A turns out to be false (in this case the $x is refunded to the buyer of the bet).
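Assumptions 3.2–3.5 can be read as a settlement rule for a single bet on A → C. The sketch below is only an illustration of that rule in Python; the function name `settle_conditional_bet` and its parameters are hypothetical and do not appear in the paper.

```python
def settle_conditional_bet(price, a_true, c_true, stake=1.0):
    """Net gain of the buyer of one bet on A -> C bought at `price`.

    Mirrors assumptions 3.2-3.5: the bet is won if A and C are both true,
    lost if A is true and C is false, and cancelled (price refunded)
    if A is false.
    """
    if not a_true:
        return 0.0            # bet cancelled, price refunded (3.4, 3.5)
    if c_true:
        return stake - price  # conditional true (3.2): buyer receives the stake
    return -price             # conditional false (3.3): buyer loses the price

# The agent buys one bet on "non-White -> Green" for $0.875.
print(settle_conditional_bet(0.875, a_true=False, c_true=False))  # 0.0 (White ball)
print(settle_conditional_bet(0.875, a_true=True, c_true=True))    # 0.125 (Green ball)
print(settle_conditional_bet(0.875, a_true=True, c_true=False))   # -0.875 (Red ball)
```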


These three groups of assumptions each have a different character:

Assumptions (1.1–1.2) describe the decisions that the agent makes after ascribing credence to sentences from the language. In particular, there is no risk aversion (and no propensity to risk) and the agent is always ready to engage in actions (bets) which have a non-negative expectancy.

Assumptions (2.1–2.3) have a practical character: the agent knows that if their credence function PDB violates these assumptions, it will be possible to construct a (synchronic or diachronic) Dutch Book against them. By the practical character of these assumptions we mean not the real financial lossFootnote 18 but the fact that the agent can identify the desired properties of the function PDB simply by analyzing the safe values of bets which give a coherent, DB-resistant system of beliefs. This argument is independent of the formal construction of any probability space.

Assumptions (3.1–3.4) clarify the understanding of the conditional (in particular, of the truth conditions of the conditionals). They might be viewed as general postulates which are independent of any previous probability considerations.

Assumption 3.5 is the consequence of 3.1–3.4 and the assumptions regarding fair bets.

1.2 A diachronic Dutch Book for the conditional ¬W → G

What credence should our agent assign to the conditional ¬W → G? Namely, what is the DB-resistant value of PDB(¬W → G)? We strongly believe that the proper answer (for the urn containing 100W, 80G, 20R) is 0.8, as 80% of the non-White balls are Green. However, the argumentation should be based only on the assumptions 1.1–3.5; it proceeds by defining a Dutch Book against an agent who accepts a different value.

So, assume that the agent believes the opposite:

  • PDB(¬W → G) ≠ 0.8.

We will show that this leads to disaster, i.e. to a diachronic Dutch Book against the agent. In this case, assume that

  • PDB(¬W → G) = 0.875,

i.e. that the agent considers $0.875 to be a fair price for a standard bet on (¬W → G). The number 0.875 is chosen to make the calculations convenient, but the reasoning is similar for any value of 0.8 < p < 1.Footnote 19

We assume that in the course of events, the agent and the bookmaker agree that it is possible to update their knowledge (assumption 2.3) and to place new bets after this update; of course, they agree on the present status of the bet.

We play the role of the Bookmaker; of course, we will only buy/sell bets which are considered fair by the agent. The game consists of drawing a ball from the urn. Before we draw, we make the following bets:

  • Bet(W). We buy from the agent the bet that the ball is White for $0.5.Footnote 20

  • Bet(¬W → G). We sell 10 bets on ¬W → G for $0.875 each.Footnote 21

The agent accepts the bets as they consider them to be fair and coherent with their credence assignments (assumptions 1.1 and 1.2).

We start the game, i.e. a ball is drawn. The first step is to identify whether the ball is White or not: this will settle Bet(W). Importantly, at this moment we only check whether it is White or not! If it is White, we finish the game and take our money, as we have won Bet(W). But if it is not White, we make another bet before we check whether it is Red or Green!Footnote 22

So, these are our actions:


If the ball is White:

  • The agent loses Bet(W) and has to pay $1 (i.e. the agent’s loss is $0.5);

  • Bet(¬W → G) is cancelled and $8.75 (i.e. 10 × $0.875) is given back to the agent (Assumption 3.5).

If the ball is non-White:

  • The agent wins Bet(W) and keeps (i.e. wins) $0.5.

  • We do not yet have knowledge about the outcome of Bet(¬W → G) as the color of the non-White ball is not yet known.

At exactly this moment, we propose Bet(R), i.e. the bet that the ball is Red. Both we and the agent have updated our beliefs, conditionalizing on ¬W (according to Assumption 2.3). So now we use the credence function PDB¬W. The agent knows that the fair bet value on R (given it is ¬W) is 0.2, as

  • PDB¬W(R) = \(\frac{\text{P}^{\text{DB}}(\neg W \wedge R)}{\text{P}^{\text{DB}}(\neg W)}\) = \(\frac{\text{P}(\neg W \wedge R)}{\text{P}(\neg W)}\) = \(\frac{0.1}{0.5}\) = 0.2 (according to Assumptions 2.1 and 2.3).

A bet is now made:


Bet(R). We sell 10 bets that the ball in question is Red for $0.2 each.Footnote 23


After the bet we check the color of the non-White ball. Of course, being non-White, the ball is either Green or Red.


If the ball is Green:

  • The agent wins Bet(¬W → G), so wins 10 × ($1 − $0.875) = $1.25.

  • The agent loses Bet(R), so loses 10 × $0.2 = $2.

If the ball is Red:

  • The agent loses Bet(¬W → G), so loses 10 × $0.875 = $8.75.

  • The agent wins Bet(R), so wins 10 × $0.8 = $8.

The agent’s wins and losses in these three cases are summarized in the table:

 

| Ball drawn | Bet(W) | Bet(¬W → G) | Bet(R) | The outcome |
| --- | --- | --- | --- | --- |
| W | − $0.5 | 0 (the bet was cancelled) | 0 (the bet has not been placed) | − $0.5 |
| G | + $0.5 | + $1.25 | − $2 | − $0.25 |
| R | + $0.5 | − $8.75 | + $8 | − $0.25 |

Regardless of the result (i.e. the color of the drawn ball), the agent loses. This is the consequence of assuming that PDB(¬W → G) ≠ 0.8, i.e. that PDB(¬W → G) ≠ P(G|¬W). This reasoning shows that it is rational to assume the proper “betting credence” of the conditional ¬W → G to be PDB(¬W → G) = 0.8. Of course, this is a general rule: it is not a numerical artifact related to some special properties of the particular values 0.8, 0.875 etc.
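The three bets of this section can be tallied mechanically. The sketch below, in Python, reproduces the agent's net result for each colour; the function `agent_outcomes` and its parameter names are illustrative assumptions, and only the case treated in the text (an agent whose credence exceeds 0.8) is covered.

```python
from fractions import Fraction

def agent_outcomes(p_cond, n=10, p_white=Fraction(1, 2),
                   p_red_given_not_white=Fraction(1, 5)):
    """Agent's net result per ball colour in the three-bet scheme above.

    p_cond is the agent's credence in (not-W -> G); n is the number of
    conditional bets, and of later bets on Red, sold to the agent.
    """
    # Bet(W): the agent sells one bet on White for p_white.
    bet_w = {"W": p_white - 1, "G": p_white, "R": p_white}
    # Bet(not-W -> G): the agent buys n bets; cancelled if the ball is White.
    bet_cond = {"W": 0, "G": n * (1 - p_cond), "R": -n * p_cond}
    # Bet(R): placed only after learning not-W; the agent buys n bets on Red.
    bet_r = {"W": 0,
             "G": -n * p_red_given_not_white,
             "R": n * (1 - p_red_given_not_white)}
    return {c: bet_w[c] + bet_cond[c] + bet_r[c] for c in "WGR"}

for colour, result in agent_outcomes(Fraction(7, 8)).items():
    print(colour, result)
# W -1/2
# G -1/4
# R -1/4
```

With `p_cond = Fraction(4, 5)` the Green and Red outcomes both become +1/2, so this particular scheme no longer guarantees a loss for the agent.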

Importantly, we assumed that the agent’s beliefs concerning the Boolean fragment of the language are coherent, which means that no “Boolean Dutch Book” can be constructed. Nevertheless, a diachronic DB can be constructed against a system of beliefs to which the conditional sentence in question is added and in which the agent assumes that PDB(¬W → G) ≠ P(G|¬W). This is the source of the incoherence and the agent’s misfortune.

It is also important to observe that we have constructed the Dutch Book in question based on the internal relationships between the agent’s beliefs: we need not examine whether they are adequate in any sense to the empirical situation. This is a general feature of the Dutch Book construction. Even if the agent believes that regardless of the content of the urn the probabilities are always PDB(W) = 1/3; PDB(R) = 1/3; PDB(G) = 1/3; PDB¬W(G) = ½; PDB¬W(R) = ½; PDB(¬W → G) = ½, it will not be possible to construct a Dutch Book (even if it is highly probable that the Bookmaker wins the bets in the long run by making use of the law of large numbers). It is the incoherence in the agent’s beliefs (i.e. the fact that the agent believes that PDB(¬W → G) ≠ P(G|¬W)) which leads to the diachronic Dutch Book and an inevitable loss.

At this stage we have not yet used any formal model (semantics) for probabilities of conditionals. The Dutch Book reasoning has a very intuitive character: informally speaking, it identifies the rational betting behavior, but does not explain the mathematical (theoretical) reasons. Nevertheless it: (i) gives an elementary argument in favor of PCCP; (ii) helps to identify the problematic places in Lewis’ reasoning; and (iii) provides an important boundary condition: the to-be-constructed probabilistic model must not lead to a Dutch Book disaster!

2 The integration of partial information

The findings from the previous section can be summarized as follows:


the only way to expand the agent’s belief system in a DB-resistant way to include the conditional ¬W → G is by assigning:

  • PDB(¬W → G) = P(G|¬W).

This formula can also be written in the form:

  • PDB(¬W → G) = PDB¬W(G).

To justify this claim, we made use only of some non-controversial assumptions concerning the agent’s “betting behavior”. We will use our findings in the discussion concerning Lewis’ triviality results, which, according to Lewis, prove that the PCCP principle, i.e.

  • P(A → C) = PA(C) = P(C|A),

can be formally justified only in very special cases, namely in trivialFootnote 24 probability spaces in which at most 4 values of probabilities of events are assumed.Footnote 25 Our sample space S = (Ω, Σ, P) obviously is not such a special case, but the DB-resistant values of PDB for conditionals obey the PCCP principle. Lewis’ theorem suggests that this result cannot be given a proper mathematical formalization and justification, so a feeling of incoherence arises which needs to be explained.

To clearly exhibit the problematic assumptions in Lewis’ reasoning, we will modify our example. Consider an additional feature of the balls in our sample space (apart from their color): some of them are Heavy (H) and some are Light (L).Footnote 26 In this case, assume that

  • (a) there are 20 White, 60 Green and 20 Red balls in the “Heavy subspace”;

  • (b) there are 80 White and 20 Green balls in the “Light subspace”.

Assume that our agent has the following partial information (concerning the Heavy and Light subspaces) at their disposal:

  • (i) They know the values PDB(H) and PDB(L), i.e. the chances of drawing a Heavy/Light ball.

  • (ii) They know the values PDBH(Y) and PDBL(Y) of every Boolean sentence Y.


Is this knowledge sufficient to compute PDB(Y)? The Law of Total Probability (an elementary theorem in probability theory) states that:

  • (LTP) P(Y) = P(Y|H) P(H) + P(Y|L) P(L) = PH(Y) P(H) + PL(Y) P(L),

for any event Y and complementary events H, L in the probability space.


So, the version for PDB is straightforward:

  • (DB-LTP) PDB(Y) = PDBH(Y) PDB(H) + PDBL(Y) PDB(L).Footnote 27

The agent can use this formula to integrate their partial knowledge concerning the proper values of PDB in the subspaces H and L in order to obtain the proper value of PDB in the whole space. Another way of putting it is that someone who is a DB-resistant Y-player in the Heavy and Light subspaces will become a DB-resistant Y-player in the whole urn just by using LTP. Undoubtedly this method works for all Boolean sentences. A natural question is whether the same can be done with the conditional ¬W → G.

Assume now that the agent has the following information:

  • (i) they know the proper (DB-resistant) values PDB(H), PDB(L);

  • (ii) they know the proper (DB-resistant) values PDBH(¬W → G) and PDBL(¬W → G).

The analogue of (DB-LTP) is obtained by substituting ¬W → G for Y and has the form:

  • (DB-LTP-C) PDB(¬W → G) = PDBH(¬W → G) PDB(H) + PDBL(¬W → G) PDB(L).

(The “C” in (DB-LTP-C) is for “conditional”). In our example, the proper (i.e. DB-resistant) values in the subspaces are: PDBH(¬W → G) = 0.75 and PDBL(¬W → G) = 1.Footnote 28 We also have PDB(H) = PDB(L) = 0.5. Applying the formula (DB-LTP-C) gives:

  • PDB(¬W → G) = 0.75 × 0.5 + 1 × 0.5 = 0.875

which is wrong! We have already shown in Sect. 1 that taking this value leads to a Dutch Book against the agent, and that the proper DB-resistant value is PDB(¬W → G) = 0.8.

Importantly, we started with the proper (DB-resistant) values for PDBH(¬W → G), PDB(H), PDBL(¬W → G) and PDB(L). Then we applied (DB-LTP-C) and obtained absurd results! This means that the Law of Total Probability holds for Boolean sentences (which is not surprising, as this is elementary probability theory), but its counterpart (DB-LTP-C) does not hold for the conditional ¬W → G. Using the game metaphor, we can say that LTP is a golden rule for people who know good “Heavy subspace / Light subspace” strategies that can be used to plan the strategy for the whole urn; however, this works for Boolean sentences only! Importantly, in Sect. 5 we shall see that LTP-C in this misleading form is involved in Lewis’ reasoning.
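The contrast between Boolean sentences and the conditional can be checked numerically. The following Python sketch uses the ball counts of the Heavy/Light example; the names `COUNTS`, `p_part`, `p_green_in` and `cond_green_in` are assumptions introduced for illustration only.

```python
from fractions import Fraction

# Ball counts per subspace, taken from the Heavy/Light example.
COUNTS = {"H": {"W": 20, "G": 60, "R": 20},
          "L": {"W": 80, "G": 20, "R": 0}}
TOTAL = sum(sum(c.values()) for c in COUNTS.values())

def p_part(s):
    """P(s): probability of drawing a ball from subspace s."""
    return Fraction(sum(COUNTS[s].values()), TOTAL)

def p_green_in(s):
    """P_s(G): probability of Green within subspace s (a Boolean sentence)."""
    return Fraction(COUNTS[s]["G"], sum(COUNTS[s].values()))

def cond_green_in(s):
    """P_s(G | not-W): the DB-resistant credence of (not-W -> G) within s."""
    return Fraction(COUNTS[s]["G"], COUNTS[s]["G"] + COUNTS[s]["R"])

# (LTP) applied to the Boolean sentence G reproduces P(G) = 80/200.
ltp_green = sum(p_green_in(s) * p_part(s) for s in COUNTS)
# (DB-LTP-C) applied to the conditional does not reproduce P(G | not-W) = 80/100.
ltp_conditional = sum(cond_green_in(s) * p_part(s) for s in COUNTS)

print(ltp_green)        # 2/5
print(ltp_conditional)  # 7/8
```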


Remark: We obtain the proper value of PDB(¬W → G) using a different “information integration formula” for subspaces H and L, in which each subspace is weighted by its probability given ¬W:


(DB-LTP-C*) PDB(¬W → G) = PDBH(¬W → G) PDB¬W(H) + PDBL(¬W → G) PDB¬W(L).


Indeed:

  • PDB¬W(H) = P(H|¬W) = 80/100 = 0.8 (80 of the 100 non-White balls are Heavy);

  • PDB¬W(L) = P(L|¬W) = 20/100 = 0.2 (20 of the 100 non-White balls are Light).

After substituting these values into the formula (DB-LTP-C*), we obtain:

  • PDB(¬W → G) = 0.75 × 0.8 + 1 × 0.2 = 0.8

This is not a coincidence but a general phenomenon. We present a proof of (DB-LTP-C*). To begin with, it is a simple exercise to check that:

  • P(G∣¬W) = PH(G∣¬W) P(H∣¬W) + PL(G∣¬W) P(L∣¬W)

where P is a standard probability function, P(.∣.) is conditional probability, PH and PL are conditionalized probability measures.Footnote 29 We use this formula to justify (DB-LTP-C*).

We have already justified the claim that the proper, DB-resistant value PDB(¬W → G) is P(G∣¬W). The DB-argument is general, and it applies to any probability measure, in particular to PDBH(¬W → G) and PDBL(¬W → G). We also know that for Boolean sentences X, PDB(X) = P(X) (this applies to PH and PL as well, and, by assumption 2.3, to the conditionalized measure P¬W). So, if we substitute:

  • PDB(¬W → G) for P(G∣¬W);

  • PDBH(¬W → G) for PH(G∣¬W);

  • PDB¬W(H) for P(H∣¬W);

  • PDBL(¬W → G) for PL(G∣¬W);

  • PDB¬W(L) for P(L∣¬W);

in the formula above, we obtain the required:

  • (DB-LTP-C*) PDB(¬W → G) = PDBH(¬W → G) PDB¬W(H) + PDBL(¬W → G) PDB¬W(L).Footnote 30
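The underlying identity P(G∣¬W) = PH(G∣¬W) P(H∣¬W) + PL(G∣¬W) P(L∣¬W) can also be checked numerically. The sketch below, in Python, does so for the urn of this section and for randomly generated two-part urns; the function `check_identity` and the random test set-up are illustrative assumptions and are not a substitute for the proof.

```python
from fractions import Fraction
import random

def check_identity(counts):
    """Return P(G|not-W) and the sum over subspaces s of P_s(G|not-W) * P(s|not-W).

    counts maps a subspace name to (white, green, red) ball counts; every
    subspace is assumed to contain at least one non-White ball.
    """
    not_white_total = sum(g + r for _, g, r in counts.values())
    lhs = Fraction(sum(g for _, g, _ in counts.values()), not_white_total)
    rhs = sum(Fraction(g, g + r) * Fraction(g + r, not_white_total)
              for _, g, r in counts.values())
    return lhs, rhs

# The Heavy/Light urn of this section: both sides equal 4/5.
print(check_identity({"H": (20, 60, 20), "L": (80, 20, 0)}))

# Randomly generated urns (at least one Green ball per subspace).
random.seed(0)
for _ in range(1000):
    urn = {s: (random.randint(0, 50), random.randint(1, 50), random.randint(0, 50))
           for s in "HL"}
    lhs, rhs = check_identity(urn)
    assert lhs == rhs
```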


Is vaccination effective? To illustrate the misleading intuitions which suggest the wrong formula (DB-LTP-C), consider a population in which people are:

  • (i) type A or B (this might, for instance, be some genetic feature);

  • (ii) either Sick or not (S / ¬S);

  • (iii) either vaccinated or not (V / ¬V).

We are interested in the effectiveness of vaccination, i.e. in the probability of a vaccinated person not getting sick. This means that we are interested in the probability of the conditional If someone has been vaccinated, they won’t get sick:

  • (VAC) V → ¬S.

Our task is to evaluate P(V → ¬S) on the basis of empirical data collected within two subpopulations A and B. Experts working in subpopulations A and B know that DB is a good way of producing coherent systems of beliefs, so they use it to estimate PA(V → ¬S) and PB(V → ¬S). Their findings are as follows:


  • In subpopulation A: PA(V → ¬S) = 1 (as 100% of the vaccinated A-people did not get sick).

  • In subpopulation B: PB(V → ¬S) = 0.5 (as 50% of the vaccinated B-people did not get sick).

In this case, assume that the number of A-people and B-people is equal, i.e. P(A) = P(B) = 0.5. If we use the (DB-LTP-C) formula to estimate the probability of V → ¬S, i.e. P(V → ¬S) in the whole population, the result is:

  • P(V → ¬S) = PA(V → ¬S) × P(A) + PB(V → ¬S) × P(B) = 1 × 0.5 + 0.5 × 0.5 = 0.75.

This is not a reasonable result as it gives the (usually false!) value 0.75 in radically different situations. For instance:


Situation 1: There are 100 A-people and 100 B-people, but there are only 2 vaccinated A-people, and neither of them is sick (so indeed PA(V → ¬S) = 1); there are 98 vaccinated B-people, and 49 of them are not sick (so indeed PB(V → ¬S) = 0.5). In this case there are 100 vaccinated people, and 51 of them are not sick. So, P(V → ¬S) does not equal 0.75 but is much lower (51/100 = 0.51).


Situation 2: There are 98 vaccinated A-people (none of them are sick, so indeed PA(V → ¬S) = 1) and 2 vaccinated B-people (1 of them is not sick, so indeed PB(V → ¬S) = 0.5). In this case P(V → ¬S) does not equal 0.75 but is much higher (99/100 = 0.99).


P(V → ¬S) is computed by means of the Law of Total Probability as a weighted sum. But (DB-LTP-C) uses the wrong weights, i.e. the relative sizes of the subpopulations, P(A) and P(B).Footnote 31 The proper weights are the shares of the vaccinated people who belong to each subpopulation, i.e. P(A|V) and P(B|V), exactly as in the true formula (DB-LTP-C*).
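The two situations can be recomputed with both sets of weights. The following Python sketch is only an illustration: the function `effectiveness` and the data layout are assumptions, and for Situation 2 the subpopulation sizes (100 A-people and 100 B-people) are assumed to be the same as in Situation 1, since the text does not restate them.

```python
from fractions import Fraction

def effectiveness(groups):
    """Return (naive, proper) estimates of P(V -> not-S).

    groups maps a subpopulation to (size, vaccinated, vaccinated_and_healthy).
    naive weights each subpopulation by its size, as in (DB-LTP-C);
    proper weights it by its share of the vaccinated, as in (DB-LTP-C*).
    """
    population = sum(size for size, _, _ in groups.values())
    vaccinated = sum(v for _, v, _ in groups.values())
    within = {g: Fraction(h, v) for g, (_, v, h) in groups.items()}
    naive = sum(within[g] * Fraction(size, population)
                for g, (size, _, _) in groups.items())
    proper = sum(within[g] * Fraction(v, vaccinated)
                 for g, (_, v, _) in groups.items())
    return naive, proper

situation_1 = {"A": (100, 2, 2), "B": (100, 98, 49)}
situation_2 = {"A": (100, 98, 98), "B": (100, 2, 1)}  # sizes assumed, see above

print(effectiveness(situation_1))  # naive 3/4, proper 51/100
print(effectiveness(situation_2))  # naive 3/4, proper 99/100
```

The naive estimate is 0.75 in both situations, whereas the properly weighted estimate follows the actual frequencies among the vaccinated.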

3 Lewis’ reasoning

Let us recall an important fragment of Lewis’ argumentation from Lewis (1976):

  • “(7) P(A → C∣B) = P(C∣AB), if P(AB) is positive.

  • […]

  • we have:

  • (8) P(A → C) = P(C∣A)

  • By (7), taking B as C or as \({\overline{\text{C}}}\) and simplifying the right-hand side, we have:

  • (9) P(A → C∣C) = P(C∣A C) = 1

  • (10) P(A → C∣\({\overline{\text{C}}}\)) = P(C∣A \({\overline{\text{C}}}\)) = 0

  • For any sentence D, we have the familiar expansion by case:

  • (11) P(D) = P(D∣C) x P(C) + P(D∣\({\overline{\text{C}}}\)) x P(\({\overline{\text{C}}}\)).

In particular take D as (A → C). Then we may substitute (8), (9), and (10) into (11) to obtain:

  • (12) P(C∣A) = 1 × P(C) + 0 × P(\({\overline{\text{C}}}\)) = P(C)” (Lewis, 1976, p. 300)

Lewis describes this result as “absurdity, but not quite a contradiction” (Lewis, 1976, p. 300).

In our terminology, Lewis divides the set of possible events into two (virtual) subspaces C and ¬C. Then, in steps (9) and (10) of his reasoning, he claims that:

  • (i) in subspace C: PC(A → C) = 1;

  • (ii) in subspace ¬C: P¬C(A → C) = 0.

In step (12), in order to compute the total probability of the conditional A → C, Lewis uses formula (11), which is an exact counterpart of the formula (DB-LTP-C). But we have already seen that this formula is wrong: depending on the division of the sample space into subspaces, it leads to inconsistent results.

To give an illustration of the weak point in Lewis’ reasoning, we show its exact counterparts in our vaccination and urn examples. The first seems to be more intuitive, while the second allows exact numerical calculations to be performed.


The vaccination example. Let us divide the population in the vaccination example into the subpopulations of healthy people (i.e. ¬S) and sick people (i.e. S). Undoubtedly

  • P¬S(V → ¬S) = 1;

  • PS(V → ¬S) = 0.

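Integrating these results with the use of Lewis’ formula (i.e. DB-LTP-C) yields P(V → ¬S) = 1 × P(¬S) + 0 × P(S) = P(¬S), i.e. the overall share of healthy people, whatever the vaccination data are; in general this is the wrong value (see Sect. 2).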


The urn example. We divide the urn into subspaces of Green and non-Green balls:

  • Subspace G: 80G;

  • Subspace ¬G: 100W, 20R.

The probability of choosing subspace G is PDB(G) = 80/200 = 0.4; for subspace ¬G this is PDB(¬G) = 120/200 = 0.6.


The diachronic Dutch Book reasoning within the subspaces G and ¬G shows that:

  • PDBG(¬W → G) = 1;

  • PDB¬G(¬W → G) = 0.Footnote 32

After applying the formula (DB-LTP-C):

  • PDB(¬W → G) = PDBG(¬W → G) PDB(G) + PDB¬G(¬W → G) PDB(¬G)

we obtain the wrong value:

  • PDB(¬W → G) = PDB(G) = 0.4 (instead of 0.8).

The value obtained for PDB(¬W → G) therefore depends on the division into subspaces: (i) for our H/L division from Sect. 2, we obtain 0.875; (ii) for Lewis’ division G/¬G, we obtain 0.4; (iii) for the division R/¬R, we obtain 0.9.


Observe also that if we apply the formula (DB-LTP-C*) we get:

  • PDB(¬W → G) = PDBG(¬W → G) P¬W(G) + PDB¬G(¬W → G) P¬W(¬G) = 1 × 0.8 + 0 × 0.2 = 0.8,

i.e. the proper result.
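The dependence of (DB-LTP-C) on the chosen division, and the stability of (DB-LTP-C*), can be reproduced for all three divisions mentioned above. The sketch below is in Python; the helper names (`BALLS`, `count`, `cond_green`, `naive`, `proper`, `DIVISIONS`) are assumptions made for illustration, and every cell of a division is assumed to contain at least one non-White ball.

```python
from fractions import Fraction

# Ball counts by (weight, colour), as in the Heavy/Light example of Sect. 2.
BALLS = {("H", "W"): 20, ("H", "G"): 60, ("H", "R"): 20,
         ("L", "W"): 80, ("L", "G"): 20}

def count(pred):
    """Number of balls whose (weight, colour) key satisfies pred."""
    return sum(n for key, n in BALLS.items() if pred(key))

def not_white(key):
    return key[1] != "W"

def cond_green(cell):
    """P(G | not-W) within a cell of the division."""
    return Fraction(count(lambda k: cell(k) and k[1] == "G"),
                    count(lambda k: cell(k) and not_white(k)))

def naive(division):
    """(DB-LTP-C): each cell weighted by its unconditional probability."""
    total = count(lambda k: True)
    return sum(cond_green(c) * Fraction(count(c), total) for c in division)

def proper(division):
    """(DB-LTP-C*): each cell weighted by its probability given not-W."""
    nw = count(not_white)
    return sum(cond_green(c) * Fraction(count(lambda k: c(k) and not_white(k)), nw)
               for c in division)

DIVISIONS = {
    "H/L": [lambda k: k[0] == "H", lambda k: k[0] == "L"],
    "G/not-G": [lambda k: k[1] == "G", lambda k: k[1] != "G"],
    "R/not-R": [lambda k: k[1] == "R", lambda k: k[1] != "R"],
}

for name, division in DIVISIONS.items():
    print(name, naive(division), proper(division))
# H/L 7/8 4/5
# G/not-G 2/5 4/5
# R/not-R 9/10 4/5
```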

So far, we have arrived at the following conclusions.


Observation 1. The possibility of producing a Dutch Book against the agent arises when their systems of beliefs are incoherent. If the agent’s credence function PDB for the Boolean part of the language is modeled within the probability space S = (Ω,Σ,P), the only coherent (i.e. DB-resistant) way of expanding it to the conditional A → C is by setting PDB(A → C) = PDBA(C) = P(C|A).


Observation 2. The integration of the partial information concerning the proper (sic!), i.e. DB-resistant, values of PDBH(¬W → G) and PDBL(¬W → G) by means of the formula (DB-LTP-C) gives the wrong value of PDB(¬W → G). We have shown that in the general case:

  • PDB(¬W → G) ≠ PDBH(¬W → G) PDB(H) + PDBL(¬W → G) PDB(L).

As a consequence, this opens the way to constructing a diachronic DB against the agent. This means that the straightforward, “mechanical” generalization of the Law of Total Probability does not work for conditionals.


Observation 3. The reasoning we analyzed is based on the misleading formula (DB-LTP-C) and leads to a contradiction: we obtain different values for PDB(¬W → G) depending on the division of the sample space. But this reasoning is similar to Lewis’ argumentation, which suggests that the latter is based on implausible assumptions. We think that the sources of the problems are as follows:

  1. Lewis did not specify any probability space on which the conditional A → C is interpreted as an event.

  2. Lewis does not explain what the term P(A → C|B) really means, but the assumption that P(A → C|¬C) = 0 is crucial in his reasoning.

This is why our aim in the next section is to construct a probability space S* = (Ω*,Σ*,P*) in which the conditional ¬W → G is interpreted as an event, so that P*(¬W → G) is a mathematically well-defined term. This will allow us to understand (i) why the Law of Total Probability in the form (DB-LTP-C) fails and how it should be applied properly; (ii) whether it is justified to assume that P*(¬W → G|¬G) = 0.

4 The space S*

The Dutch Book-resistant value for PDB(A → C) is P(C|A). This formula works, but why? Can we consider PDB(A → C) to be a genuine probability, or is it just some real number that only accidentally coincides with P(C|A) and has nothing to do with any genuine probabilistic structure? What are the reasons for the failure of the generalization of LTP for PDB?

We want to understand the logical structure of the problem, i.e. we want to describe the situation in a mathematically sound model. The model should give a mathematical underpinning for the formula PDB(A → C) = P(C|A) so that we can apply standard mathematical theorems (in particular LTP) to PDB. This means that we need to construct a probability space S* = (Ω*, Σ*, P*) in which the conditional A → C will have an interpretation as an event so that P*(A → C) is a well-behaved probability function. In the following presentation, we keep matters as simple as possible.

Undoubtedly, the to-be-constructed probability space S* = (Ω*, Σ*, P*) has to fulfill some requirements:

  • S* = (Ω*, Σ*, P*) is associated with the sample space S = (Ω, Σ, P), which models the Boolean fragment of the agent’s beliefs. S* = (Ω*, Σ*, P*) should allow the Boolean sentences to be interpreted, and the probabilities from the sample space S = (Ω, Σ, P) should be preserved in S* = (Ω*, Σ*, P*).

  • The conditional ¬W → G should have a representation as an event, i.e. a subset [¬W → G] ⊆ Ω*. To avoid confusion, we will use square brackets to distinguish sentences from their counterparts (events) in the probability space S*. But this means that any elementary event ω* ∈ Ω* is an element either of [¬W → G] or of its complement, i.e. it confirms or disconfirms the conditional ¬W → G. The construction is therefore based on the assumption that there are circumstances in which ¬W → G is true (false), i.e. we accept the notion of truth conditions for conditionals (assumptions 3.1–3.4).

  • As it is convenient to think of probabilities of conditionals in terms of bets, it is natural to describe their truth conditions in terms of game scenarios (which lead to either a win or a loss). The games have a dynamic character, i.e. there is always some unfolding scenario, and bets are placed at appropriate moments of the game. S* should therefore model this fact and naturally correspond to the gambling situation.

Imagine that we start the “conditional game” for ¬W → G: the rules are fixed (and conform to assumptions 3.2–3.5), the appropriate bets are made, and we draw the first ball. Obviously, if the first ball is Green or Red, the conditional ¬W → G is decided. If the first ball is White, ¬W → G is undecided, and we have to draw a ball again. This process is repeated until either R or G appears, at which point the conditional is shown to be true or false.

So, the scenarios confirming the conditional ¬W → G are:

  • G, WG, WWG, WWWG, …

And the scenarios disconfirming the conditional ¬W → G are:

  • R, WR, WWR, WWWR, …

We will use WnG (resp. WnR) to denote the sequence consisting of n White balls followed by a Green ball (resp. a Red ball). These scenarios are natural candidates for elementary events in S*.
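The game just described can be sketched as a small simulation. This is only an illustration: the function names, the seed, and the numerical values p = 0.3, q = 0.2, r = 0.5 for the chances of Green, Red and White are hypothetical and do not come from the text, and the draws are assumed to be independent (drawing with replacement).

```python
import random

def play_conditional_game(p, q, r, rng):
    """Draw balls until a non-White one appears; return the scenario, e.g. 'WWG'."""
    scenario = ""
    while True:
        ball = rng.choices(["G", "R", "W"], weights=[p, q, r])[0]
        scenario += ball
        if ball != "W":  # G confirms the conditional, R disconfirms it
            return scenario

def estimate_win_frequency(p, q, r, games=50_000, seed=1):
    """Fraction of simulated games that end in a confirming scenario W^n G."""
    rng = random.Random(seed)
    wins = sum(play_conditional_game(p, q, r, rng).endswith("G") for _ in range(games))
    return wins / games

# Hypothetical urn: P(G) = 0.3, P(R) = 0.2, P(W) = 0.5 (assumed values).
frequency = estimate_win_frequency(0.3, 0.2, 0.5)
print(frequency, 0.3 / (0.3 + 0.2))  # empirical frequency vs. P(G | not-W)
```

Every simulated scenario has the form WnG or WnR, and the frequency of the confirming ones should come out close to P(G|¬W).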

4.1 Formal definition of S* = (Ω*, Σ*, P*)Footnote 33

We start with our toy-example sample space S = (Ω, Σ, P), in which Ω = {W,G,R} with P(G) = p, P(R) = q, P(W) = r. The conditional in question is ¬W → G. The associated probability space S* = (Ω*,Σ*,P*) is defined in the following way:

  • Ω* = {WnG, WnR: n ∈ \(\mathbb{N}\)};

  • Σ* = \(2^{\Omega^{*}}\) (i.e. the power set of Ω*);

  • P*(WnG) = \(r^{n}p\) (for n ∈ \(\mathbb{N}\));

  • P*(WnR) = \(r^{n}q\) (for n ∈ \(\mathbb{N}\)).

In S* = (Ω*,Σ*,P*), elementary events are not balls but sequences of balls (intuitively speaking, to “draw” an event from Ω* is to draw a “game scenario”). The stipulations P*(WnG) = \(r^{n}p\) and P*(WnR) = \(r^{n}q\) are straightforward: each is the product of the probabilities of n White draws and one final Green (resp. Red) draw.

Ω* is a countable set, so for any set A ⊆ Ω* we set:

  • P*(A) = \(\sum\nolimits_{\omega \in A} {P^{*} (\omega )}\).

So, P* is indeed a probability function properly defined on the σ-field Σ*.

We can interpret the sentences from our language (i.e. the Boolean sentences and the conditional ¬W → G) in the probability space S* = (Ω*,Σ*,P*) as events, i.e. subsets of Ω*. In particular, ¬W → G has its counterpart [¬W → G] ⊆ Ω*, which consists of scenarios confirming the conditional:


[¬W → G] = {WnG: n ∈ \(\mathbb{N}\)}.

Of course

  • P*([¬W → G]) = \(\sum\nolimits_{n = 0}^{\infty } {pr^{n} }\) = \(\frac{p}{1 - r}\) = \(\frac{p}{p + q},\)

i.e.

  • P*([¬W → G]) = P(G|¬W) = PDB(¬W → G)

where P is the probability from the sample space S = (Ω, Σ, P).Footnote 34 This means that the probability P* models the DB-resistant value PDB.Footnote 35
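The geometric series above can be checked with exact rational arithmetic. The following is a minimal sketch; the helper name and the values p = 3/10, q = 1/5, r = 1/2 are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def partial_sum(p, r, n_max):
    """Sum of P*(W^n G) = r**n * p over n = 0..n_max (a truncation of the series)."""
    return sum(r**n * p for n in range(n_max + 1))

# Hypothetical urn: P(G) = p, P(R) = q, P(W) = r (assumed values).
p, q, r = Fraction(3, 10), Fraction(1, 5), Fraction(1, 2)

closed_form = p / (p + q)        # P(G | not-W), the DB-resistant value
truncated = partial_sum(p, r, 40)
gap = closed_form - truncated    # what the truncation leaves out

print(closed_form, float(gap))
```

The gap equals closed_form · r^41, so it shrinks geometrically as more scenarios WnG are included, in agreement with the closed form p/(1 − r) = p/(p + q).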

4.2 The probability of Boolean sentences in S*

The space S* = (Ω*, Σ*, P*) is designed to offer interpretations for Boolean sentences and the conditional ¬W → G from our language.Footnote 36 To give an interpretation of a given sentence α is to define the set [α] ⊆ Ω*. This means that for any event ω* ∈ Ω* we have to decide whether ω* ∈ [α], i.e. whether it makes the given sentence α true or false.

Consider α = The ball is Green. Which of the elementary events ω* ∈ Ω*, i.e. sequences WnG and WnR, make α true? We start the game, and obviously we claim that α is true (“We win the game”) when we see a Green ball as the first ball. We lose when we see a Red or White ball as the first ball. And the fact that later (perhaps after drawing 50 White balls) a Green ball might appear does not matter: we are not talking about the conditional If it is non-White, it is Green but about the simple sentence The ball is Green. So:

  • [The ball is Green] = {G};

  • and P*([G]) = \(r^{0}p\) = p.

So, the event [The ball is Green] contains only one sequence of length 1 consisting of one Green ball (as elementary events in Ω* are sequences)! Similarly:

  • [The ball is Red] = {R}

  • and P*([R]) = \(r^{0}q\) = q.

Now consider The ball is White. Obviously, we win when we see a White ball and lose when we see a Green or Red ball. So, the truthmakers for The ball is White are sequences beginning with W:

  • [The ball is White] = {WnG, WnR: n > 0}

  • and P*([W]) = r.

This means that

  • P*([G]) = p = P(G);

  • P*([R]) = q = P(R);

  • P*([W]) = r = P(W).

The probabilities of the simple sentences W, G, R are preserved in S* = (Ω*, Σ*, P*), as required.
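The interpretation of the simple sentences can be made concrete on a truncated version of Ω*. This is a sketch under assumptions: scenarios are encoded as strings such as "WWG", Ω* is cut off at n ≤ N, and the values of p, q, r are hypothetical.

```python
from fractions import Fraction

def build_omega_star(p, q, r, n_max):
    """Truncated Omega*: scenarios W^n G and W^n R for n = 0..n_max, with P* values."""
    omega = {}
    for n in range(n_max + 1):
        omega["W" * n + "G"] = r**n * p
        omega["W" * n + "R"] = r**n * q
    return omega

def p_star(omega, event):
    """P* of an event given as a predicate on scenarios."""
    return sum(prob for scenario, prob in omega.items() if event(scenario))

# Hypothetical urn (assumed values).
p, q, r = Fraction(3, 10), Fraction(1, 5), Fraction(1, 2)
N = 30
omega = build_omega_star(p, q, r, N)

prob_green = p_star(omega, lambda s: s == "G")            # [The ball is Green] = {G}
prob_red = p_star(omega, lambda s: s == "R")              # [The ball is Red] = {R}
prob_white = p_star(omega, lambda s: s.startswith("W"))   # scenarios beginning with W

print(prob_green, prob_red, float(prob_white))
```

On the truncated space, P*([G]) and P*([R]) are exactly p and q, while P*([W]) equals r(1 − r^N), which tends to r as the cut-off N grows.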

4.3 Heavy and light balls

In order to give an analysis of Lewis’ reasoning, we divided the urn in our example into Heavy and Light balls. So, assume the division of the space S = (Ω, Σ, P) into subsets H and L (which, in the general case, we assume to be independent of the division W, G, R). It is convenient to think of our sample space as consisting of six events, HG, HR, HW, LG, LR, LW, corresponding to six types of balls: Heavy Green, Heavy Red etc. The sample space SHL = (ΩHL, ΣHL, PHL) is specified as follows:

  • ΩHL = {HG, HR, HW, LG, LR, LW} (i.e. there are 6 elementary events);

  • ΣHL = \(2^{{{\Omega }_{{{\text{HL}}}} }}\).

For notational convenience, we shall use the following symbols:

  • pH = PHL(HG); qH = PHL(HR); rH = PHL(HW); pL = PHL(LG); qL = PHL(LR); rL = PHL(LW).Footnote 37

We shall also use the obvious symbols W, G, R, H, L for White, Green, and Red (regardless of weight) and Heavy and Light (regardless of color), and we set p = pH + pL; q = qH + qL; r = rH + rL.Footnote 38

With this sample space SHL = (ΩHL, ΣHL, PHL) we shall associate a probability space S* = (Ω*, Σ*, P*)Footnote 39 which should also allow the situation to be modeled when the additional property (the weight of the balls) is taken into account.

S* = (Ω*, Σ*, P*) is specified in the following way:

  • Ω* = {Wn(HG), Wn(LG), Wn(HR), Wn(LR): n ∈ \(\mathbb{N}\)};

  • Σ* = \(2^{\Omega^{*}}\).

Wn(HG) is the sequence consisting of n White balls (regardless of weight) and one Heavy Green ball, similarly for Wn(HR) etc. The probabilities of the elementary events are given as follows:

  • P*(Wn(HG)) = \(r^{n}p_{H}\) (for n ∈ \(\mathbb{N}\));

  • P*(Wn(LG)) = \(r^{n}p_{L}\) (for n ∈ \(\mathbb{N}\));

  • P*(Wn(HR)) = \(r^{n}q_{H}\) (for n ∈ \(\mathbb{N}\));

  • P*(Wn(LR)) = \(r^{n}q_{L}\) (for n ∈ \(\mathbb{N}\)).

Obviously, [¬W → G] = {WnG: n ∈ \(\mathbb{N}\)}, where the White balls and the Green ball might be Heavy or Light, and

  • P*([¬W → G]) = \(\sum\nolimits_{n = 0}^{\infty } {pr^{n} }\) = \(\frac{p}{1 - r}\) = \(\frac{p}{p + q}\).

As before, we can interpret the sentences W, R and G in S* and ascribe probabilities to them. The probabilities of the corresponding events are:

  • P*([W]) = PHL(W) = r;

  • P*([G]) = PHL(G) = p;

  • P*([R]) = PHL(R) = q.

Similarly, we win the bet on The ball is Heavy if we draw a Heavy ball (regardless of what happens later). So, The ball is Heavy is interpreted as the set of sequences starting with a heavy ball (of any color):


[H] = [The ball is Heavy] = {HR, HG, (HW)Wn(HG), (HW)Wn(LG), (HW)Wn(HR), (HW)Wn(LR):n ∈ \({\mathbb{N}}\)}.Footnote 40


After a short computation we have

  • P*([H]) = pH + qH + rH = PHL(H).

The same applies to [L] = [The ball is Light]; its probability is P*([L]) = pL + qL + rL = PHL(L).

This means that the probabilities of the Boolean part of the language have been preserved in the new space S* (associated with SHL) as required.Footnote 41
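The short computation behind P*([H]) can be sketched as follows. The urn composition is hypothetical, and the code assumes that a scenario beginning with a Heavy White ball, (HW)Wn x with x non-White, has probability rH · r^n · P(x); this refinement is suggested by the definition of [H] above but is an assumption of the sketch.

```python
from fractions import Fraction

# Hypothetical urn composition (six ball types; the values are assumed).
pH, pL = Fraction(1, 5), Fraction(1, 10)    # Heavy / Light Green
qH, qL = Fraction(1, 10), Fraction(1, 10)   # Heavy / Light Red
rH, rL = Fraction(1, 5), Fraction(3, 10)    # Heavy / Light White
p, q, r = pH + pL, qH + qL, rH + rL

def p_star_heavy(n_max):
    """Truncated P*([H]): scenarios HG, HR and (HW) W^n x with x non-White, n <= n_max."""
    # Assumption: P*((HW) W^n x) = rH * r**n * P(x) for x in {HG, LG, HR, LR}.
    after_heavy_white = sum(r**n * (p + q) for n in range(n_max + 1))
    return pH + qH + rH * after_heavy_white

limit_value = pH + qH + rH   # P_HL(H)
print(limit_value, float(limit_value - p_star_heavy(20)))
```

The truncated value is pH + qH + rH(1 − r^(n_max+1)), so it converges to pH + qH + rH = PHL(H), as stated above.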

4.4 The law of total probability in S*

According to LTP, for any event Y:

  • P(Y) = PH(Y) P(H) + PL(Y) P(L)

where PH(.) and PL(.) are the probability measures obtained by standard conditionalization on H and L, i.e. P(.|H) and P(.|L).

In Sect. 2 we showed that in the general case it is not true that

  • (DB-LTP-C) PDB(¬W → G) = PDBH(¬W → G) PDB(H) + PDBL(¬W → G) PDB(L)

even if the inputs are DB-resistant probabilities for ¬W → G in the subspaces H and L. However, it is always true that

  • (LTP)* P*([¬W → G]) = P*[H]([¬W → G]) P*([H]) + P*[L]([¬W → G]) P*([L])

as it is a simple application of a mathematical theorem (LTP) in the space S*.


As S* models the values for PDB in the proper way, observe that the following equalities hold:

  • PDB(¬W → G) = P*([¬W → G]);

  • PDB(H) = P*([H]);

  • PDB(L) = P*([L]).

So, it cannot be the case that (in addition to these three equalities) the following two equalities also hold in the general case:

  • PDBH(¬W → G) = P*[H]([¬W → G]);

  • PDBL(¬W → G) = P*[L]([¬W → G]).

Indeed, if both were true, then the formula (DB-LTP-C) would also hold, which is not the case in general. In the next section we will see how this affects Lewis’ reasoning.
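The contrast between (LTP)* and (DB-LTP-C) can be illustrated numerically. The sketch below rests on two assumptions: the urn composition is hypothetical, and the DB-resistant value within a subspace X is taken to be P(G|¬W ∧ X) = pX/(pX + qX), in line with PCCP applied inside X. The conditional probability in S* uses the closed form for P*([¬W → G]|[X]), with [X] the scenarios whose first ball is in X.

```python
from fractions import Fraction

# Hypothetical urn composition (assumed values, summing to 1).
pH, pL = Fraction(1, 5), Fraction(1, 10)    # Heavy / Light Green
qH, qL = Fraction(1, 10), Fraction(1, 10)   # Heavy / Light Red
rH, rL = Fraction(1, 5), Fraction(3, 10)    # Heavy / Light White
p, q, r = pH + pL, qH + qL, rH + rL

def p_star_given_first(pX, qX, rX):
    """P*([not-W -> G] | [X]), where [X] = scenarios whose first ball is in X."""
    return (pX + rX * p / (p + q)) / (pX + qX + rX)

def db_value_in_subspace(pX, qX):
    """Assumed DB-resistant value within X alone: P(G | not-W and X)."""
    return pX / (pX + qX)

P_H, P_L = pH + qH + rH, pL + qL + rL
target = p / (p + q)  # P*([not-W -> G])

ltp_star = p_star_given_first(pH, qH, rH) * P_H + p_star_given_first(pL, qL, rL) * P_L
db_ltp_c = db_value_in_subspace(pH, qH) * P_H + db_value_in_subspace(pL, qL) * P_L

print(target, ltp_star, db_ltp_c)
```

With these numbers (LTP)* reproduces the target value exactly, whereas the right-hand side of (DB-LTP-C) gives a different number. For other compositions the two may happen to coincide, which is why the failure is stated only for the general case.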

5 Lewis’ dilemma in S*

Let us examine once again the crucial steps in Lewis’ reasoning:

  • (6) P(A → C) = P(C∣A)Footnote 42

  • (7) P(A → C∣B) = P(C∣AB)

  • (9) P(A → C∣C) = 1

  • (10) P(A → C∣\(\overline{C}\)) = 0

  • (11) P(D) = P(D∣C) · P(C) + P(D∣\(\overline{C}\)) · P(\(\overline{C}\)).

Under these assumptions, Lewis’ corollary is that P(C∣A) = P(C), which he considers to be absurd.

A straightforward application of (7) to the conditional A → C and the condition ¬C gives P¬C(A → C) = P¬C(C∣A), which is obviously 0. If we apply it to our conditional ¬W → G, we obtain the result P¬G(¬W → G) = 0. But we have shown in Sect. 2 that if we assume

  • PDB¬G(¬W → G) = 0;

  • PDBG(¬W → G) = 1

then “integrating” this information in the form suggested by formula (11) leads to wrong results: in our example we get PDB(¬W → G) = 0.4, which is not DB-resistant (and which shows that our beliefs are inconsistent).

There seems to be tension between our findings and the assumptions on which Lewis’ proof is based. After having introduced the space S*, we have tools to offer a better description of the situation.

We think that the key to understanding the problematic steps of Lewis’ proof is identifying what P(A → C∣B) really means: we do not know in which probability space the function P is defined and which events correspond to sentences A → C and B. In our example, we are interested in the probability of the conditional ¬W → G under the condition H (and its special case when we substitute ¬G for H). The probability space S* is suited to formalizing the notion of probability of ¬W → G in a coherent (DB-resistant) way. So, we will analyze the possible interpretations of conditionalizing the conditional ¬W → G on H within S*. Two of them are most natural:


First interpretation. Conditionalizing ¬W → G on H within S* means considering only games which are played entirely within the Heavy subspace, i.e., formally, sequences consisting entirely of Heavy balls. This means that we think only of sequences of the form (HW)n(HG) and (HW)n(HR). Similarly, conditionalizing on L means considering only sequences consisting entirely of Light balls.

However, in S* = (Ω*, Σ*, P*) there are not only “pure Heavy” and “pure Light” sequences but also mixed sequences (i.e. in which both Heavy and Light balls occur). Consequently, the universe Ω* is in the general case divided into three subsets, which we shall denote by [Heavy], [Light], [Mixed].Footnote 43 In this case, the Law of Total Probability in S* has the form:


P*([¬W → G]) = P*[Heavy]([¬W → G]) P*([Heavy]) + P*[Light]([¬W → G]) P*([Light]) + P*[Mixed]([¬W → G]) P*([Mixed]).


To make our reasoning analogous to Lewis’ reasoning, we substitute ¬G for H (i.e. [Heavy] = [non-Green]; [Light] = [Green]). In this case, when “translated” to our example, steps (10) and (11) have the following form:

  • (10-1) P*([¬W → G]|[non-Green]) = 0;

  • (11-1) P*([¬W → G]) = P*[non-Green]([¬W → G]) P*([non-Green]) + P*[Green]([¬W → G]) P*([Green]).

(10-1) is true. Indeed, [non-Green] is the set of sequences consisting of non-Green balls only, i.e. [non-Green] = {WnR: n ∈ \({\mathbb{N}}\)}. Obviously, within this subspace the probability of the conditional ¬W → G is 0.

But Ω* is not the union of the two subsets [Green] and [non-Green]. Indeed,

  • [Green] = {G};

  • [non-Green] = {WnR: n ∈ \(\mathbb{N}\)};

which means that we leave out all sequences of the form:

  • {WnG: n > 0}.

For this reason, (11-1) is not a proper application of the Law of Total Probability. The term P*[Mixed]([¬W → G]) P*([Mixed]) is missing!

The only way to save both the assumptions (10-1) and (11-1) in Lewis’ proof is to assume that the division of Ω* into [Green] and [non-Green] is a genuine division. This means that there are no mixed sequences in the universe Ω*, i.e. no sequences of the form WnG. But this is only possible when there are no White balls, which makes the conditional ¬W → G uninteresting.Footnote 44
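The effect of the missing term can be checked with a small computation. This is a sketch with hypothetical values of p, q, r; the three masses are the closed forms of the corresponding geometric series in S*.

```python
from fractions import Fraction

# Hypothetical urn: P(G) = p, P(R) = q, P(W) = r (assumed values).
p, q, r = Fraction(3, 10), Fraction(1, 5), Fraction(1, 2)

green = p                   # [Green] = {G}
non_green = q / (1 - r)     # [non-Green] = {W^n R : n >= 0}
mixed = r * p / (1 - r)     # left-out scenarios {W^n G : n > 0}

# The conditional is true on all of [Green] and [Mixed], false on all of [non-Green].
two_term = 1 * green + 0 * non_green     # what (11-1) computes
three_term = two_term + 1 * mixed        # with the Mixed term restored

print(green + non_green + mixed, two_term, three_term, p / (p + q))
```

The three masses add up to 1, the two-term formula returns only p, and restoring the Mixed term recovers p/(p + q).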


Second interpretation. The second possibility is that conditionalizing on H within S* means considering only sequences beginning with a Heavy ball. Similarly, conditionalizing on L within S* means considering only sequences beginning with a Light ball. Indeed, this is the interpretation we used in Sect. 4.3.

In this case, we have a genuine division of Ω* into two disjoint subsets [H] and [L], and the Law of Total Probability holds:

  • (LTP)* P*([¬W → G]) = P*[H]([¬W → G]) P*([H]) + P*[L]([¬W → G]) P*([L]).

Under this interpretation, after substituting ¬G for H in our example, steps (10) and (11) will take the form:

  • (10-2) P*([¬W → G]|[¬G]) = 0;

  • (11-2) P*([¬W → G]) = P*[¬G]([¬W → G]) P*([¬G]) + P*[G]([¬W → G]) P*([G]).

(11-2) is a proper application of (LTP)* (now the division of Ω* into [G] and [¬G] is a genuine division), but in this case (10-2) is not true. Indeed, for arbitrary H with positive probability:

  • P*([¬W → G]|[H]) = \(\frac{{{\text{P*}}\left( {\left[ {\neg W \to G} \right] \cap { }\left[ H \right]} \right)}}{{{\text{P*}}\left( {\left[ H \right]} \right)}}\) = \(\frac{{p_{H} }}{{(p_{H} + q_{H} + r_{H} )}}\) + \(\frac{{r_{H} }}{{(p_{H} + q_{H} + r_{H} )}}\frac{p}{{\left( {p + q} \right)}}.\)

As H was arbitrary, we can substitute H = ¬G. Obviously, in this case, pH = p¬G = P(G∧¬G) = 0: the chance of finding a ball that is both Green and non-Green is obviously 0, so the first term \(\frac{{p_{H} }}{{(p_{H} + q_{H} + r_{H} )}} = 0\).Footnote 45

However, the number rH = r¬G is the probability of finding a ball which is both White and non-Green, but this simply amounts to being a White ball and definitely need not be 0! So, the part \(\frac{{r_{H} }}{{(p_{H} + q_{H} + r_{H} )}}\frac{p}{{\left( {p + q} \right)}}\) need not be 0 at all. This means that:

  • P*([¬W → G]|[¬G]) = \(\frac{{r_{H} }}{{(p_{H} + q_{H} + r_{H} )}}\frac{p}{{\left( {p + q} \right)}} \) ≠ 0.Footnote 46

We can only save the equality (10-2) by assuming that rH = 0, i.e. that there are no White balls, i.e. that Ω* = {G,R}. And, as before, this makes the conditional ¬W → G uninteresting.
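The value of P*([¬W → G]|[¬G]) can be computed in two ways and compared. The sketch uses hypothetical values of p, q, r; the function name is illustrative. With H = ¬G we have pH = 0, qH = q and rH = r.

```python
from fractions import Fraction

# Hypothetical urn: P(G) = p, P(R) = q, P(W) = r (assumed values).
p, q, r = Fraction(3, 10), Fraction(1, 5), Fraction(1, 2)

def cond_on_first_ball(pX, qX, rX, p, q):
    """General formula for P*([not-W -> G] | [X]), [X] = scenarios starting in X."""
    total = pX + qX + rX
    return pX / total + (rX / total) * (p / (p + q))

# Substituting H = not-G into the general formula.
general = cond_on_first_ball(Fraction(0), q, r, p, q)

# Direct computation: [not-W -> G] ∩ [not-G] = {W^n G : n >= 1}.
joint = r * p / (1 - r)
direct = joint / (q + r)

# (11-2) with P*_[G]([not-W -> G]) = 1, since [G] = {G} confirms the conditional.
check_11_2 = general * (q + r) + 1 * p

print(general, direct, check_11_2, p / (p + q))
```

Both routes give the same nonzero value, and inserting it into (11-2) returns p/(p + q). Setting r = 0 is the only way to make the value vanish, which matches the remark about the absence of White balls.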

We have discussed two possible interpretations of conditionalizing the conditional ¬W → G on ¬G within the space S*.Footnote 47 We have also shown that in each of these interpretations, at least one of Lewis’ assumptions is not satisfied: either the formula (11) fails (as the division in question is not a genuine division of Ω*), or assumption (10) fails (as the probability in question is not equal to 0). The only way to save the proof is by assuming that there are no White balls in the urn, which makes the conditional uninteresting. Either way, the triviality proof falls apart.

Finally, observe that in our argumentation there is no need to use nested conditionals, as all expressions of the form P*([A → B]|[C]) (for instance P*([¬W → G]|[¬G])) have a mathematically proper interpretation in the probability space. This means, in particular, that conditionalization of the sentence A → B on C (with A, B, C Boolean) can be mathematically captured in a direct way, without the need to examine the right-nested conditional C → (A → B). Therefore, as no nested conditionals are used, the problem of the status of the (probabilistic) Import–Export Principle (IE) does not arise in the direct form at all.

However, it is natural to think of right-nested conditionals in this context, as there are interesting connections between PCCP and IE in the context of triviality results. Fitelson (2015) gives an overview of a general form of the triviality results and points out that triviality results can be justified within a rather uncontroversial (background) theory with a form of Import–Export principle added.

The probabilistic Import–Export Principle in the general case need not hold in the Bernoulli-Stalnaker model. However, it holds in the causal random variable model in Kaufmann (2009), and in McGee’s model: indeed, McGee assumes the probabilistic version of the Import–Export Principle as axiom (C7) (McGee, 1989, p. 504).Footnote 48 An important paper discussing the status of IE is McGee (1985) (the Reagan-Anderson examples). A lively discussion followed; see for instance Arlo-Costa (2001), the recent Mandelkern (2020a), which discusses the Import–Export principle in the context of the interpretation of conjunction, or Mandelkern (2020b), which discusses the role of the Import–Export principle in the context of the law of identity and identifies, in particular, diverging intuitions concerning IE for indicative and subjunctive conditionals. These papers focus on the Import–Export Principle as a logical rule, not on the probabilistic version.

Nevertheless, our reasoning is conducted in a way which does not involve nested conditionals, which makes it independent from the problem of the status of IE.

6 Summary

1. The diachronic Dutch Book makes it possible to show that if the agents want to extend their system of beliefs (originally modeled within the sample space S = (Ω, Σ, P)) so as to include also the conditional A → C, then the only DB-resistant extension is by setting PDB(A → C) = P(C|A), i.e. in accordance with PCCP. (At this stage we do not have a model or semantics for the conditional—but have identified the proper value of the confidence level.)


2. We are confronted with Lewis’ triviality results, which purport to show that mathematical arguments against PCCP can be formulated. Or, put differently: that formulating a mathematical model for probabilities of conditionals leads to absurd results.


3. The straightforward generalization of the Law of Total Probability (LTP) to conditionals has the form (DB-LTP-C) and is an exact counterpart of Lewis’ formula (11) from his proof. But this formula does not work: after we divide the sample space into two disjoint events, H and L, and use the proper values of PDBH(A → C), PDBL(A → C), P(H) and P(L), we obtain different wrong results, depending on the division H/L. This is important as Lewis’ proof uses conditionalization on C and ¬C, which leads to the wrong value of PDB(A → C). The proper formula has the form (DB-LTP-C*). We have justified this conclusion using only a Dutch Book analysis without even mentioning any formally defined probability space.


4. For a given probability sample space S = (Ω, Σ, P) and the given conditional A → C, we have constructed the probability space S* = (Ω*, Σ*, P*), which is naturally associated with S = (Ω, Σ, P), such that:

  a. The conditional A → C is interpreted as an event [A → C] ⊆ Ω*;

  b. Ω* naturally corresponds to possible scenarios that set the truth value of the conditional A → C;

  c. The probabilities of the Boolean sentences from the sample space S = (Ω, Σ, P) are preserved in S* = (Ω*, Σ*, P*).

We have shown that P*([A → C]) = PDB(A → C). This means that we can treat the credence assignments as a mathematically well-defined object. We can in particular analyze the assumptions of Lewis’ Triviality proof within S* = (Ω*, Σ*, P*).Footnote 49


5. Lewis’ argument against PCCP rests on two crucial assumptions: (i) P¬C(A → C) = 0; (ii) a version of the Law of Total Probability. It turns out that—apart from very special situations—at most one of them can be true.


6. We have shown that under one of the possible interpretations of the term P(A → C∣B), the Law of Total Probability does not have Lewis’ form

  • P(A → C) = P(A → C∣C) P(C) + P(A → C∣¬C) P(¬C)

so assumption (11) in Lewis’ proof does not hold.


7. We have shown that under the second interpretation (which is more natural in the space S* = (Ω*,Σ*,P*)), the Law of Total Probability has the form

  • (LTP)* P*([¬W → G]) = P*[G]([¬W → G]) P*([G]) + P*[¬G]([¬W → G]) P*([¬G])

which means that Lewis’ assumption (11) holds. However, in this case it is not true that P*[¬G]([¬W → G]) = 0. This means that Lewis’ crucial assumption (10), i.e. P(A → C∣¬C) = 0, does not hold.


8. Under both discussed interpretations of conditionalizing the conditional A → C on ¬C, P(A → C∣¬C) has both properties which are crucial for Lewis’ proof only if we assume that the probability space Ω is trivial, i.e. when Ω = {A,C}. This means that Lewis’ proof does not prove this fact but rather presupposes it; it is therefore circular.


The presented analysis of Lewis’ Triviality proofs invoking the probability space S* does not directly use Dutch Book reasoning. However, Dutch Book reasoning has two important roles to play in our argumentation: (1) it gives a simple, pedestrian argument for PCCP; (2) it shows that using Lewis’ version of LTP (i.e. (DB-LTP-C), which corresponds directly to assumption (11) in Lewis’ reasoning) will lead to wrong results (and to losing against the Bookmaker). This shows that something is wrong with this formula, and the mathematical explanation is provided by the probabilistic model. In this sense DB shows what boundary conditions have to be met in the model for probabilities of conditionals.