A Stochastic Graphs Semantics for Conditionals

Wójtowicz, Krzysztof; Wójtowicz, Anna

doi:10.1007/s10670-019-00144-z

A Stochastic Graphs Semantics for Conditionals

Original Research
Open access
Published: 22 November 2019

Volume 86, pages 1071–1105, (2021)
Cite this article

Download PDF

You have full access to this open access article

Erkenntnis Aims and scope Submit manuscript

A Stochastic Graphs Semantics for Conditionals

Download PDF

2058 Accesses
5 Citations
1 Altmetric
Explore all metrics

Abstract

We define a semantics for conditionals in terms of stochastic graphs which gives a straightforward and simple method of evaluating the probabilities of conditionals. It seems to be a good and useful method in the cases already discussed in the literature, and it can easily be extended to cover more complex situations. In particular, it allows us to describe several possible interpretations of the conditional (the global and the local interpretation, and generalizations of them) and to formalize some intuitively valid but formally incorrect considerations concerning the probabilities of conditionals under these two interpretations. It also yields a powerful method of handling more complex issues (such as nested conditionals). The stochastic graph semantics provides a satisfactory answer to Lewis’s arguments against the PC = CP principle, and defends important intuitions which connect the notion of probability of a conditional with the (standard) notion of conditional probability. It also illustrates the general problem of finding formal explications of philosophically important notions and applying mathematical methods in analyzing philosophical issues.

On the Probabilistic Notion of Causality: Models and Metalanguages

A Minimal Probability Space for Conditionals

Article Open access 14 September 2023

Generalized logical operations among conditional events

Article 27 July 2018

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

The problem of estimating the probabilities of conditional sentences is interesting for at least two reasons:

First, it seems that in many cases it is not clear which estimation is “The True Estimation” (as different language users give different values).

Second, providing a coherent and robust method of evaluating these probabilities would give us a better understanding of many other problems connected with conditionals—e.g. the problem of counterfactuals.

In this article, we define a semantics for conditionals in terms of stochastic graphs, which gives a straightforward and simple method of evaluating the probabilities of conditionals.^{Footnote 1} It seems to be a good and useful method in the cases already discussed in the literature (cf. e.g. Kaufmann (2004)), and it can easily be extended to cover more complex situations. In particular, it allows us to describe several possible interpretations of the conditional (the global and the local interpretation, and generalizations of them), and yields a powerful method of handling more complex issues (such as nested conditionals^{Footnote 2}). It provides a satisfactory answer to Lewis’s arguments against the PC = CP principle, and defends important intuitions which connect the notion of probability of a conditional with the (standard) notion of conditional probability.

The article has the following structure:

In Part 1 we analyze a simple example, which shows how to define the stochastic graph semantics for conditionals.

In Part 2 we demonstrate how the notions introduced in Part 1 can be used to analyze more complex problems (we examine the case of two urns, which makes the situation non-trivial).

Parts 3 and 4 are devoted to the global and local interpretations of conditionals. They provide a formal framework suitable for their analysis (and we formalize the intuitive computations given by Kaufmann for the global and local interpretations).

In Part 5 we discuss Lewis’s arguments (concerning the PC = CP principle) and show, how to resolve within our framework the problems indicated by Lewis.

A short summary (part 6) follows, in which we also discuss possible generalizations and indicate some results that we have obtained already (but which lie outside the scope of this article).

The Appendix presents some technical tools which are needed in the text.

A “byproduct” of our results is a contribution to the vivid discussion concerning the explanatory (explicatory) role of mathematics. Notions from probability theory (and the theory of Markov processes) have been used in order to give a precise explication of some philosophically important notions. So apart from their interest to the problem of conditionals, our results exemplify the general strategy of finding formal counterparts (explications) of philosophical notions. The notion of probability is a classic example of an explication (the locus classicus is Carnap (1950), cf. e.g. Brun (2016) for a contemporary discussion). In a sense, we widen the scope of an explication and show how to give formal, precise explications of the probability of conditionals, taking different interpretations and variants into account.

1 A Simple One-Urn Example

Very often in discussions concerning conditionals (and in particular, when their probabilities are under discussion), problems from natural language are analyzed. For example, suppose we try to estimate the probability that If Reagan worked for the KGB, I’ll never find out (Lewis 1986, 155). It is typically the case that our intuitions drive the analysis, but it is difficult (indeed, impossible) to give a concrete value of this probability and it is often even impossible to determine whether different language users apply the same methods of estimating it. For these reasons, in this article (like in many other works on conditionals, e.g. Kaufmann (2004, 2005, 2009, 2015), Khoo (2016), van Fraassen (1976)) we will investigate examples where the probabilities of conditional sentences in question can be given concrete values. This precise semantics can later be applied to everyday-life examples.

Let us start with a simple one-urn example. A ball is drawn from the urn containing 10 White, 8 Green and 2 Red balls.^{Footnote 3} What probability is a rational subject going to ascribe to the sentence A Green ball has been drawn? It is of course 8/20, which is exactly the probability computed within the classical probability model. It is also equal to the expected value in the game, where the reward is 1 for winning the game (i.e. drawing the Green ball), and 0 for losing it. This is in accord with the intuition that the subjective probability of a sentence is the odds at which you would be willing to place a bet on the outcome or the price you are willing to pay for the game. So, for the same reasons, the sentence A non-White and Green ball has been drawn has probability 8/20; the sentence A White ball has been drawn has probability 10/20 etc. These are the prices to pay for the respective (fair) games.

A rational subject estimates the probabilities (under the threat of a Dutch Book) so as to conform to the Kolmogorov probability axioms.^{Footnote 4} In other words, the subject works within the probability space Ξ = (Ω, Μ, P), where Ω = {W, G, R} are elementary events (i.e. drawing a ball of one of the colors), with probabilities respectively: P(W) = r = 10/20; P(G) = p = 8/20; P(R) = q = 2/20. Μ is the σ-field of all events, and P is the probability function defined on M.^{Footnote 5}

Let us estimate the probability of the sentence If we draw a non-White ball, we draw a Green ball (or, simpler: If the ball is not White, it is Green). We symbolize it as ¬W → G.

As we have observed before, estimating this probability amounts to estimating the price for playing a certain (fair) game, which we might call “The Conditional Game”. The problem is therefore: how much are we willing to bet on the truth of the conditional ¬W → G? Before we bet, we have to define the game conditions in a precise manner, i.e. we have to stipulate in a clear way how the game should be played and in what circumstances we will consider it as settled. In particular, we have to determine the result for all possible draws (i.e. White, Green, Red—W, G, R).

There is no doubt that we win if we draw a Green ball, and we lose if we draw a Red ball. But the problem arises with what should we do when we draw a White ball.

There are a priori four possibilities. After drawing a White ball:

(1)
We win the game.
(2)
We lose the game.
(3)
The game is undecided and stops.
(4)
The game is undecided, we put the ball into the urn and draw the ball again.

Choosing option (1) is equivalent to the claim, that we do not really play The Conditional Game ¬W → G, but rather The Material Implication Game: we win always when the ball is White or Green.^{Footnote 6} The probability of this event is clearly 18/20. If we chose (2), it means, that we lose after drawing a White or a Red ball and the probability of this event is 12/20. But it is widely agreed that this is not the appropriate way of interpreting conditionals (plenty of discussion can be found in the literature, so there is no need to consider the matter further here).

Choosing (3) (the game stops as undecided) has a major drawback: it “annihilates” the problem of counterfactuals. So this is not an attractive solution.^{Footnote 7} The conclusion therefore is that the only reasonable definition of The Conditional Game ¬W → G is (4): drawing a White ball does not end the game but extends it: we put the ball back into the urn, and draw again. This choice is a natural one: our bet concerns the case when the ball is not White, and in order to resolve the situation, we have to check what happens if the ball is non-White! If we stipulate the rules of our game in this way, then the outcomes are not single draws, but rather sequences of draws, consisting of a (possibly empty) sequence of White balls (which simply have the effect that the game restarts), followed by a Green or Red ball (either of which settles the game).^{Footnote 8}

This approach to the problem of computing the probability of a conditional means, that we have to define formally a new probability space (different from the simple initial space Ξ = (Ω, Μ, P), but of course intimately connected with it), where computing the probability of the outcome of the game will be possible. Observe, that in our initial space Ξ = (Ω, Μ, P), there is no event corresponding to the conditional ¬W → G.^{Footnote 9}

1.1 A Stochastic Graph Representation

We can represent the flow of the game in a very natural and convenient way by means of a stochastic (Markov) graph. We will present the general idea of a Markov graph using a simple example (for the reader’s convenience, technical details and references can be found in the Appendix).

Gambler’s ruin. Consider a coin-flipping game. At the beginning of the game, two players each have a number of pennies, say n₁ and n₂. After each flip of the coin, the loser transfers one penny to the winner and the game restarts on the same terms. The game ends when the first player (“The Gambler”) has either lost all his pennies or has collected all the pennies. Assume, that the probability of tossing heads is p, and of tossing tails is q (p + q=1). The Gambler gets one penny if heads show up.

The simplest non-trivial example is when n₁ = 1 and n₂ = 2 (when n₁ = n₂ = 1, the game stops after the first flip). At each moment, the state of the game is the number of The Gambler’s pennies, so the possible states are 0,1,2,3. The game starts in state 1 and the transitions between states occur as a consequence of one of two actions: heads or tails (H,T). The game finishes when the gambler enters either state 0 or 3—both 0 (loss) and 3 (victory) are absorbing states.

The dynamics of the game is represented in Graph 1 (Fig. 1, next page).

We can naturally track the possible scenarios of the game (i.e. ending either with victory or with loss) as possible paths in the graph. For instance, the path HH (tossing heads twice) leads to victory and the corresponding transitions between the states are 1→2→3.

There are infinitely many paths (scenarios) that lead to victory: HTHH (1→2→1→2→3); HTHTHH (1→2→1→2→1→2→3); HT…HTHH (1→2→…→2→1→2→3), etc.

The possible paths for losing the game are: T (1→0); HTT (1→2→1→0); HT…HTT (1→2→…→2→1→0), etc.

So, the space of all possible paths that settle the game consists of sequences {(HT)ⁿHH, (HT)ⁿT: ${\text{n}}\!\in\!{\mathbb{N}}$} ((HT)ⁿ means the sequence HT repeated n times, i.e. HT….HT). They will form the set of elementary events Ω* in a new probability space that describes all the scenarios. It is straightforward to compute their probabilities. If P(H) = p; P(T) = q, then

P*(HH) = pp = p²
P*(HTHH) = pqpp = pqp²
P*((HT)ⁿ HH) = (pq)ⁿpp = (pq)ⁿp²
…
P*(T) = q
P*(HTT) = (pq)q
P*(HTHTT) = (pq)²q
…
P*((HT)ⁿ T) = (pq)ⁿq

The probability of winning the game is

$$ \mathop \sum \limits_{n = 0}^{\infty } \left( {pq} \right)^{n} p^{2} = \frac{{p^{2} }}{1 - pq} $$

For p = 1/2 (a fair coin) it is 1/3.

The probability can be computed in a more direct way by examining the graph and writing down a simple system of equations. Let P(n) denote the probability of winning the game, which starts in state n. Obviously, P(0) = 0 (if we are in state 0, we have already lost), and P(3) = 1 (if we are in state 3, we have already won). What are the probabilities P(1), P(2)? We reason in an intuitive way: being in state n, we can either:

toss heads (with probability p)—then we are transferred to the state n + 1, and our chance of winning is P(n + 1);
toss tails (with probability q)—then we are transferred to the state n − 1, and our chance of winning is P(n − 1).

So, P(n) = pP(n + 1) + qP(n − 1). In our example this means that

P(0) = 0
P(1) = pP(2) + qP(0)
P(2) = pP(3) + qP(1)
P(3) = 1

The solution is P(1) = $ \frac{{p^{2} }}{1 - pq} $, which is not a surprise. As our example is very simple, the difference in the complexity of computations (i.e. within the probability space versus directly solving the equations obtained from the graph) is not really big; however, for even slightly more complex systems, the gain in the complexity of computations is enormous!^{Footnote 10}

A crucial feature of the process is its memorylessness (Markov property): the coin does not remember the history of tosses. Moreover, the probabilities of actions (H, T) are fixed throughout the whole history (homogeneity). Generally speaking, the future depends only on the present state of the process, not on the past. This applies to our ball-drawing game: anytime we restart the game, the probabilities remain the same as history does not matter. If our Conditional Game concerns the conditional α (e.g. ¬W → G), then G_α denotes the corresponding graph. We might say that the appropriate class of graphs defines the semantics for the conditionals. In general, there are many possibilities of representing a certain stochastic process by Markov graphs; usually we are looking for the simplest representation.

The dynamics of the game is represented in Graph 2 (Fig. 2, next page).

It consists of three states: START, WIN, LOSE. According to our definition, the game proceeds as follows:

We start (obviously) in START. Three moves are possible:

If we draw a White ball, we get back to START (and the game restarts);
If we draw a Green ball, we go to WIN;
If we draw a Red ball, we go to LOSE.

WIN and LOSE are absorbing states: if we get into them, we remain there and the game stops.

The transition probabilities of getting from one state to another are given by (we use the standard notation here)^{Footnote 11}:

p_{START, START} = P(W) = r = 10/20
p_{START, WIN} = P(G) = p = 8/20
p_{START, LOSS} = P(R) = q = 2/20
p_{LOSS, LOSS} = 1
p_{WIN, WIN} = 1.
(All the other transition probabilities are equal to 0).^{Footnote 12}

This is the simplest (and standard) representation of this random process as a Markov graph. The labels for the states indicate the current state of the game (i.e. whether we have won, lost, or perhaps the game (re)starts), and the edges are labeled by the drawn balls (they correspond to the actions: a White/Green/Red ball was drawn from the urn).

A big advantage of such a representation is that it shows the possible scenarios in the game in a straightforward way: we start in START, and we simply travel through the graph until we get into one of the absorbing (deciding) states WIN, LOSS. We can also compute the probability of winning (losing) the game in a very simple way. The graph representation allows us to understand the dynamics of the game—and this will be particularly important in more complex situations, when we distinguish between the local and global interpretation of conditionals (cf. Kaufman 2004).

Let P(START) denote the (as yet unknown) probability of winning the game (i.e. the probability of reaching WIN starting from START). We write down the respective equation for this simple graph^{Footnote 13}:

$$ {\text{P}}\left( {\text{START}} \right) = p + r{\text{P}}\left( {\text{START}} \right) $$

We solve it, obtaining:

$$ \left( {1 - r} \right){\text{P}}\left( {\text{START}} \right) = p $$

So finally:

Fact 1.1

$$ {\text{P}}\left( {\text{START}} \right) = \frac{p}{1 - r} = \frac{p}{p + q} $$

So the probability P(¬W → G) is $ \frac{p}{p + q} $. In our particular case, it is 8/10 (which is also the intuitive answer).

1.2 The Probability Space

There is a unique probabilistic space corresponding to the graph G_α (where α = ¬W → G) that describes the possible “travel histories” (or paths) in this graph.^{Footnote 14} The possible histories are of two kinds: WⁿG (we win the game), and WⁿR (we lose the game), where ${\text{n}}\!\in\!{\mathbb{N}}$. These paths will serve as elementary events in the constructed space.

The probability space we construct will therefore satisfy the following conditions:

i.
It ascribes a certain probability to every possible course of The Conditional Game ¬W → G which settles the game. In particular, it will be possible to compute the probability of our victory (and loss).
ii.
It takes into account the initial probabilities of drawing a White, Green or Red ball from the urn, i.e. its definition reflects the probabilities from the space Ξ = (Ω, Μ, P).
iii.
As the aim is to compute the probability of a particular conditional α (e.g. α = ¬W → G), it is important to make our space as convenient as possible for this particular task (in particular, it is a minimal space).^{Footnote 15}

Let Ξ = (Ω, Μ, P) be a probability space, where Ω is the set of elementary events: Ω = {W, G, R} with probabilities respectively P(W) = r, P(G) = p, P(R) = q; Μ is the σ-field of all events, and P is the probability function defined on M.

Definition 1.2 The probability space Ξ_α corresponding to the conditional α = ¬W → G is the triple Ξ_α = (Ω_α, Μ_α, P_α), where:

1.
Ω_α is the set of elementary events corresponding to sequences of events from the space Ξ, which decide The Conditional Game α;
2.
Μ_α is the σ-field of all events;
3.
P_α is the probability function defined on M_α.

Formally:

Ω_α = {WⁿG, WⁿR: ${\text{n}} \in {\mathbb{N}}$}.^{Footnote 16}
Μ_α = 2^Ωα;
P_α(WⁿG) = rⁿp (for ${\text{n}} \in {\mathbb{N}}$);
P_α(WⁿR) = rⁿq (for ${\text{n}} \in {\mathbb{N}}$).

Observe that our probability space was defined with respect to a particular conditional α: ¬W → G, which is not expressible in the language corresponding to the initial probability space Ξ. The definition is formally correct; in particular, the condition P_α(Ω_α) = 1 is fulfilled (which can be checked by an easy computation^{Footnote 17}). Ω_α is a countable set, so (due to σ-additivity), the probability measure P_α extends to Μ_α, i.e. P_α is defined for any set A ⊆ Ω_α:

$$ {\text{P}}_{\upalpha} \left( {\text{A}} \right) = \mathop \sum \limits_{\omega \in A} {\text{P}}_{\upalpha} (\upomega) $$

It should be noted that, from a formal point of view, speaking of the probability of the sentence ¬W → G is a kind of abuse of language, because de facto we compute the probability P_α of the event [¬W → G]_α, which is the interpretation of the sentence ¬W → G in the probability space Ξ_α. But we will use this expression freely, as it does not lead to misunderstanding (and in the literature it is standard to speak of the probability of sentences). But we stress here that there is always an interpretation of the sentence defining the game in an appropriate probability space: this interpretation makes clear the sense in which we talk of the probability and how we defined the game (i.e. how we defined the conditions for our victory). Being meticulous in identifying the set of events in the concrete probability space, which is the interpretation of the sentence ¬W → G, will be of particular importance when the context (i.e. the rules of the game) changes (we will discuss it later). In such situations, the interpretation of the sentence ¬W → G will differ in different probability spaces, corresponding e.g. to different understandings of the conditional (local and global).

We want to compute the probability of the sentence ¬W → G. It has its counterpart WIN_α in the probability space Ξ_α = (Ω_α, Μ_α, P_α), consisting of the possible scenarios leading to victory:

$$ {\text{WIN}}_{\upalpha} = [\neg W \to G]_{\upalpha} = \left\{ {{\text{W}}^{\rm n} {\text{G}}\!\!:{\text{n}} \in {\mathbb{N}}} \right\}, $$
by LOSS_α we denote the event leading to our defeat in the game:
$$ {\text{LOSS}}_{\upalpha} = \, [\neg W \to R]_{\upalpha} = \{ {\text{W}}^{\rm n} {\text{R}}\!\!:{\text{n}} \in {\mathbb{N}}\} .$$

Fact 1.3

$ {\text{P}}_{\upalpha} ({\text{WIN}}_{\upalpha} ) \, = \mathop \sum \limits_{n = 0}^{\infty } pr^{n} = \frac{p}{1 - r} = \frac{p}{p + q} $

Fact 1.4

$ {\text{P}}_{\upalpha} ({\text{LOSS}}_{\upalpha} ) \, = \mathop \sum \limits_{n = 0}^{\infty } qr^{n} = \frac{q}{1 - r} = \frac{q}{p + q} $

Fact 1.5

$ {\text{P}}_{\upalpha} ({\text{WIN}}_{\upalpha} ) + {\text{P}}_{\upalpha} ({\text{LOSS}}_{\upalpha} ) \, = \frac{p}{p + q} + \frac{q}{p + q} = 1 $

Obviously, the following two facts are true:

Fact 1.6

$ {\text{P}}_{\upalpha} ( {\text{WIN}}_{\upalpha} ) = {\text{P(G|}}\neg {\text{W}}) $

Fact 1.7

$ {\text{P}}_{\upalpha} ( {\text{LOSS}}_{\upalpha} ) = {\text{P(R|}}\neg {\text{W}}) $

where P is the probability measure in the initial probability space Ξ = (Ω, Μ, P), which was the basis for our construction of the space (of possible game scenarios) Ξ_α = (Ω_α, Μ_α, P_α).

To sum up: we have given a semantics for conditional sentences α (consisting of atomic sentences in the initial probability space), which allows us to compute their probabilities in a straightforward way. This semantics is given by an appropriate graph G_α and the corresponding (unique) probability space Ξ_α. The correlation between the values of the probability P_α of the conditional and the conditional probability P in the space Ξ is given by the Facts 1.6 and 1.7.^{Footnote 18}

1.3 A Comparison with Kaufmann’s (Stalnaker Bernoulli) Space

A similar approach to the problem of computing the probability of conditionals is presented in Kaufmann (2004, 2005, 2009, 2015). It is based on the Stalnaker Bernoulli model (cf. also van Fraassen (1976); this terminology is used there and in Kaufmann’s papers). Here we indicate the most important similarities and differences between this model and our stochastic graphs semantics.

For the sake of this comparison, consider Kaufmann’s construction of the probability space for the conditional ¬W → G. Using a notation similar to ours, the appropriate space would have the form:

Ξ* = (Ω*, Μ*, P*), where
Ω* = the set of infinite sequences consisting of the events W, G, R (Kaufmann speaks of worlds being elements of these sequences);
Μ* is the σ-field defined over Ω*;
P* is a probability measure defined on M* in an appropriate way.

An important feature of the definition of P* is the fact, that any particular (infinite) sequence of balls has null probability (we consider it to be a sequence of events coming from the space Ξ), but the appropriate “bunches” of such sequences have non-zero probabilities. Considering the sentence ¬W → G, such bunches are constructed from sequences starting with an initial finite (perhaps empty) sequence of White balls, followed by a Green or Red ball, after which an arbitrary infinite sequence of balls follows.

In this way, Kaufmann is able to give a method of computing the probability of victory in The Conditional Game, i.e. P*([¬W → G]*). It corresponds exactly to our intuitions: this computation amounts to estimating the fraction of sequences starting with WⁿG (i.e. the winning sequences) within all the sequences settling the game (i.e. starting with WⁿG or WⁿR).^{Footnote 19} And of course, this value is exactly the same as the value computed here.

Specific features of Kaufmann’s construction become even more clearly visible when we try to give it a graph-semantical presentation. First, Kaufmann’s method leads to all possible graphs at once: every such graph corresponds to a possible conditional in the language in question. In our graph (and the corresponding probability space) we can only play the game¬W → G (and—after an obvious relabeling WIN/LOSS—also the “dual game” ¬W → R). Secondly, in our approach the game terminates after entering WIN or LOSS (so the sequences, being elementary events in the probabilistic space Ξ_α are finite); but under Kaufmann’s approach, the game (formally speaking) lasts infinitely long. Both these approaches lead to the same results. The main difference lies in a certain redundancy appearing in Kaufmann’s model, which might be important when more complicated cases are discussed.

To sum up, the advantage of stochastic-graph semantics is its intuitiveness (it is connected with the particular conditional in a natural way) and it gives us an easy method of computing the probability of the conditional. It is formally correct (there is a corresponding formally defined probability space), but this semantics reveals its full power in more complicated cases in which the use of Kaufmann’s semantics might face technical complications of a combinatorial character (the probability space might turn out to be very intricate). We suppose that in some cases these technical complications might even make the treatment impossible.^{Footnote 20}

2 Two-Urn Examples: Global and Local Conditional Probabilities

Consider the following example (which is similar to the example analyzed in Kaufman (2004) and Khoo (2016)):

We have two urns I, II. In urn I there are 2 White, 9 Green and 1 Red ball. In the urn II there are 50 White, 1 Green and 9 Red balls. Assume that urn I can be chosen with probability 1/4 and urn II with probability 3/4.^{Footnote 21}

We have chosen these particular values to simplify the comparison of our results with the results given in Kaufman (2004). But as our considerations have a general character, we use the following symbols: λ₁,λ₂ are the probabilities of choosing urn I and II; r₁, p₁, q₁ and r₂, p₂, q₂—are the probabilities of a White/Green/Red ball within the respective urns I and II.

Again, we want to compute the probability of the sentence If not-White, then Green (i.e. ¬W → G). To do this, we consider The Conditional Game and try to estimate the chance of winning it. This requires a precise description of the rules of the game. An urn must first be chosen and then a ball from this urn has to be drawn. If we draw a Green ball (from any of the urns I, II), we win; if we draw a Red ball (for any urn), we lose. But what happens when we draw a White ball? A priori, the following decisions are possible:

(1)
We win the game.
(2)
We lose the game.
(3)
The game is undecided and stops.
(4)
The game is undecided. We put the ball back (to the urn it was drawn from), and restart the game: we draw the urn and the ball from it.
(5)
The game is undecided. We put the ball back (to the urn it was drawn from), but remain within the chosen urn (i.e. we keep drawing the ball from the same urn until the game is decided).

Just like we did in the one-urn model, we can agree that the options (1), (2) and (3) are not attractive. This means that we have (4) and (5) at our disposal: after drawing a White ball we can restart the game from the beginning or remain within the drawn urn and continue the game within it. A priori none of these variants is superior and so we have to make a decision. This means that we have two possible variants of The Conditional Game—and in particular two possible interpretations of the sentence in question. Depending on the chosen variant, we have to adjust the formal constructions of it. Both these variants of the game correspond to the global and local interpretations of the conditional discussed in the literature (Kaufmann 2004; Khoo 2016). We will use this terminology, in particular denoting global as the conditional defined (interpreted) by (4), and local as the conditional defined (interpreted) by (5).^{Footnote 22}

When discussing these two possible variants we will see the advantage of the graph semantics—it shows clearly the differences and allows us to compute the corresponding probabilities in a simple (and mathematically correct) way.

3 The Global Interpretation

Consider the game with a global interpretation of the conditional (i.e. given by the condition (4)). In this case, we can give several graph representations of the game where the one we give is the simples one

3.1 A Stochastic Graph Representation

Graph 3 (Fig. 3, next page) illustrates the mechanism in the Global Conditional Game.

The transition probabilities in this graph are given in the following way:

p_{START, I} = λ₁ = 3/4
p_{START, II} = λ₂ = 1/4
p_{I, START} = r₁ = 2/12
p_{I, WIN} = p₁ = 9/12
p_{II, LOSS} = q₁ = 1/12
p_{II, START} = r₂ = 50/60
p_{II, WIN} = p₂ = 1/60
p_{II, LOSS} = q₂ = 9/60
p_{WIN, WIN} = 1
p_{LOSS, LOSS} = 1.^{Footnote 23}

Let us (as before) denote by P(START) the probability of getting (finally) to the state WIN from the state START. In addition, let P(I), P(II) denote the probabilities of getting (finally) to the state WIN from the states I, II respectively (perhaps after several moves). Just as in the case of the one-urn model, we compute P(START) using the appropriate system of equations for the Markov graph:

P(START) = λ₁P(I) + λ₂P(II)
P(I) = p₁ + r₁P(START)
P(II) = p₂ + r₂P(START)

After solving this system, we obtain the formula:

Fact 3.1

$$ {\text{P}}\left( {\text{START}} \right) = \frac{{\lambda_{1} p_{1} + \lambda_{2} p_{2} }}{{\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} }} = 0.6 $$

Here we have made use of the fact that 1 − r₁ = (p₁ + q₁); 1 − r₂ = (p₂ + q₂).

3.2 The Probability Space

We will construct a probability space Ξ ^glob_α = (Ω ^glob_α , Μ ^glob_α , P ^glob_α ) for the game defined in this way.

Let W₁, W₂, G₁, G₂, R₁, R₂ denote the drawing of a White, Green or Red ball from urns I or II respectively. In the case of the global interpretation, the paths in the graph show us which sequences of results (draws) are possible. The set of sequences, representing the possible histories of The Global Conditional Game is: {G₁, G₂, R₁, R₂, W₁G₁, W₁G₂, W₂G₁, W₂G₂, W₁R₁, W₁R₂, W₂R₁, W₂R₂, W₁W₁G₁, W₁W₁G₂, W₁W₂G₁, W₁W₂G₂ ,….}. Our victories are represented by sequences ending with G₁or G₂ (our losses—respectively—R₁ or R₂).

The important feature of this case is that after drawing (and replacing) the White ball from urn I, we can in the next step draw any ball from any urn—e.g. Green from urn II. This is because returning the White ball to any urn restarts the whole procedure—again we select an urn and draw a ball from it. This is indicated by the arrows in the graph leading from I and II to START (and again to any urn and any ball). In particular, this means that in the probability space we will construct, there will be “mixed” sequences: first, there are White balls (from any of the urns), then the decisive ball (from any urn).

Consider the probability space Ξ_U2 = (Ω_U2, Μ_U2, P_U2),where Ω_U2 is the set of elementary events: Ω_U2 = {W₁, G₁, R₁, W₂, G₂, R₂}, with probabilities λ₁r₁, λ₁p₁, λ₁q₁, λ₂r₂, λ₂p₂, λ₂q₂ respectively; Μ_U2 is the σ-field of all events; P_U2 is the probability measure on M_U2. We assume that λ₁ + λ₂ = 1; p₁ + q₁ + r₁ = 1; p₂ + q₂ + r₂ = 1.^{Footnote 24}

Definition 3.2 The space Ξ ^glob_α for the globally understood conditional α = ¬W → G is the triple Ξ ^glob_α = (Ω ^glob_α , Μ ^glob_α , P ^glob_α ) where:

1.
Ω ^glob_α is the set of elementary events—these are the sequences of events from the space Ξ_U2, which decide The Global Conditional Game α;
2.
Μ ^glob_α is the σ-field of all events;
3.
P ^glob_α is a probability measure on M ^glob_α .

Formally:

Ω ^glob_α = {(W₁/W₂)ⁿ(G₁/G₂), (W₁/W₂)ⁿ(R₁/R₂): n∈ℕ}, where by (W₁/W₂)ⁿ we denote any sequence of White balls of length n, coming from any urn; similarly (G₁/G₂), (R₁/R₂) denotes any Green (Red) ball.

So these sequences consist of a (possibly empty) series of White balls coming from any urn, followed by a Green or Red ball (from any urn).

Μ ^glob_α = P(Ω ^glob_α ) (here P denotes the powerset of Ω ^glob_α )

The probability P ^glob_α is defined on elementary events in the following way:

$ {\text{P}}_{\upalpha}^{\rm glob} (({\text{W}}_{1}/{\text{W}}_{2})^{\text{n}} {\text{G}}_{1}) \, = \, (\lambda_{1} r_{1} )^{\rm k} (\lambda_{2} r_{2} )^{{{\text{n}} - {\text{k}}}} (\lambda_{1} p_{1} ) $—when k balls in the sequence $ ({\text{W}}_{1}/{\text{W}}_{\text{2}})^{\text{n}} {\text{G}}_{1} $ are of the kind W₁, and n − k balls are of the kind W₂, for ${\text{n,k}}\!\in\!{\mathbb{N}}$; 0 ≤ k ≤ n

$ {\text{P}}_{\upalpha}^{\rm glob} (({\text{W}}_{1} / {\text{W}}_{\text{2}})^{\text{n}} {\text{G}}_{2} ) \, = \, (\lambda_{1} r_{1} )^{\rm k} (\lambda_{2} r_{2} )^{{{\text{n}} - {\text{k}}}} (\lambda_{2} p_{2} ) $—when k balls in the sequence $ ({\text{W}}_{1} / {\text{W}}_{\text{2}})^{\text{n}} {\text{G}}_{1} $ are of the kind W₁, and n–k balls are of the kind W₂, for ${\text{n,k}}\!\in\!{\mathbb{N}}$; 0 ≤ k ≤ n

$ {\text{P}}_{\upalpha}^{\text{glob}}({\text{W}}_{1}/ {\text{W}}_{\text{2}})^{\text{n}} {\text{R}}_{1}\, = \, (\lambda_{1} r_{1} )^{\text{k}} (\lambda_{2} r_{2} )^{{{\text{n}} - {\text{k}}}} (\lambda_{1} q_{1} ) $—when k balls in the sequence $({\text{W}}_{1}/ {\text{W}}_{\text{2}})^{\text{n}} {\text{R}}_{1}$ are of the kind W₁, and n–k balls are of the kind W₂, for ${\text{n,k}}\!\in\!{\mathbb{N}}$; 0 ≤ k ≤ n

$ {\text{P}}_{\upalpha}^{\text{glob}} (({\text{W}}_{1}/ {\text{W}}_{\text{2}})^{\text{n}} {\text{R}}_{2} ) \, = \, (\lambda_{1} r_{1} )^{\text{k}} (\lambda_{2} r_{2} )^{{{\text{n}} - {\text{k}}}} (\lambda_{2} q_{2} ) $—when k balls in the sequence $ ({\text{W}}_{1}/ {\text{W}}_{\text{2}})^{\text{n}} {\text{R}}_{2} $ are of the kind W₁, and n–k balls are of the kind W₂, for ${\text{n,k}}\!\in\!{\mathbb{N}}$; 0 ≤ k ≤ n^{Footnote 25}

The chance that our game ends at the (n + 1)-st move, and the last ball is G₁ is:
$$ (\lambda_{1} r_{1} + \lambda_{2} r_{2} )^{\text{n}} (\lambda_{1} p_{1} ) \, ({\text{for n}}\!\in\!{\mathbb{N}}) $$
So:
The chance, that the game lasts (n + 1) moves and the last ball is G₁ is (λ₁r₁ +λ₂r₂)ⁿ(λ₁p₁).
Similarly:
The chance, that the game lasts (n + 1) moves and the last ball is G₂ is (λ₁r₁ +λ₂r₂)ⁿ(λ₂p₂).
So, the chance that the game ends at the (n + 1)-th move and we win, is:
(λ₁r₁ +λ₂r₂)ⁿ(λ₁p₁ + λ₂p₂).

We want to compute the probability of the conditional ¬W → G. It has its counterpart WIN ^glob_α in the probability space Ξ ^glob_α = (Ω ^glob_α , Μ ^glob_α , P ^glob_α ). This set consists of all game scenarios leading to the victory:

$$ {\text{WIN}}_{\upalpha}^{\text{glob}} = \, [\neg W \to G]_{\upalpha}^{\text{glob}} = \left\{ {\left( {{\text{W}}_{1} / {\text{W}}_{2} } \right)^{\text{n}} \left( {{\text{G}}_{ 1} / {\text{G}}_{ 2} } \right)\!\!:{\text{ n}} \in {\mathbb{N}}} \right\}$$
The event We lose in this game is the set:
$$ {\text{LOSS}}_{\upalpha}^{\text{glob}} = \, [\neg W \to R]_{\upalpha}^{\text{glob}} = \left\{ {\left( {{\text{W}}_{1} / {\text{W}}_{2} } \right)^{\text{n}} \left( {{\text{R}}_{ 1} / {\text{R}}_{ 2} } \right)\!\!:{\text{ n}} \in {\mathbb{N}}} \right\}$$

We compute P ^glob_α (WIN ^glob_α ) as a sum of geometric series and obtain the (already known) result:

Fact 3.3

$$ \begin{aligned} {\text{P}}_{\upalpha}^{\text{glob}} ({\text{WIN}}_{\upalpha}^{\text{glob}} ) \, = & \mathop \sum \limits_{n = 0}^{\infty } (\lambda_{1} r_{1} + \lambda_{2} r_{2} )^{\text{n}} (\lambda_{1} p_{1} + \lambda_{2} p_{2} ) \, =\, \frac{{\lambda_{1} p_{1} + \lambda_{2} p_{2} }}{{\lambda_{1} (1 - r_{1} ) + \lambda_{2} (1 - r_{2} )}} \\ = \,& \frac{{\lambda_{1} p_{1} + \lambda_{2} p_{2} }}{{\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} }} \\ \end{aligned} $$

Similarly:

Fact 3.4

$$ \begin{aligned} {\text{P}}_{\upalpha}^{\text{glob}} ({\text{LOSS}}_{\upalpha}^{\text{glob}} ) \, = & \mathop \sum \limits_{n = 0}^{\infty } (\lambda_{1} r_{1} + \lambda_{2} r_{2} )^{\text{n}} (\lambda_{1} q_{1} + \lambda_{2} q_{2} ) \, = \frac{{\lambda_{1} q_{1} + \lambda_{2} q_{2} }}{{\lambda_{1} (1 - r_{1} ) + \lambda_{2} (1 - r_{2} )}} \\ = \,& \frac{{\lambda_{1} q_{1} + \lambda_{2} q_{2} }}{{\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} }} \\ \end{aligned} $$

Fact 3.5

$$ {\text{P}}_{\upalpha}^{\text{glob}} ({\text{WIN}}_{\upalpha}^{\text{glob}} ) + {\text{P}}_{\upalpha}^{\text{glob}} ( {\text{LOSS}}_{\upalpha}^{\text{glob}} ) = 1 $$

3.3 Kaufmann’s Equation (The Global Version)

In our model, some intuitive considerations concerning the chances to win can be formalized. Kaufmann (2004, 586) examines an equation which is intuitively clear but formally not quite precise:

$$ P(\neg W \to_{g} G) \, = P(\neg W \to G | {\text{I}})P({\text{I}}|\neg {\text{W}}) \, + P(\neg W \to G | {\text{II}})P({\text{II}}{\mid }\neg {\text{W}}) = \, {0.6} $$

Imagine someone who has already observed that the drawn ball is not White (so the game is already decided). What is the chance that it is Green? The knowledge, that the ball is not White gives us some indirect information about the urn it (probably) comes from. This information is contained in the expressions P(I∣¬W) and P(II∣¬W). Now, if the ball is from urn I, the chance of the conditional ¬W → G is P(¬W → G∣I), if it is from urn II, the chance is P(¬W → G∣II). This leads to the equation above.

The weakness of this equation consists in the fact that it is not clear where (i.e. in what probability space) the probability function P is defined. In particular, there should be an event corresponding to the conditional ¬W → G, but Kaufmann does not define an appropriate probability space.^{Footnote 26} We can give a precise counterpart of this equation in the following way (using the observation, that drawing urn I/II can be written as the event $\left({\text{W}}_{1} \vee {\text{G}}_{1} \vee {\text{R}}_{1}\right)$, resp.: $\left({\text{W}}_{2} \vee {\text{G}}_{2} \vee {\text{R}}_{2}\right)$):

Instead of $P\left({\text{I}}| \neg {\text{W}}\right)$ we take: ${\text{P}}_{\text{U2}} (({\text{W}}_{1} \vee {\text{G}}_{1}\vee {\text{R}}_{1})|\,(\neg {\text{W}}_{1} \wedge \neg {\text{W}}_{2}))$.
Instead of $P\left({\text{II}}| \neg {\text{W}}\right)$ we take: ${\text{P}}_{\text{U2}} (({\text{W}}_{2} \vee {\text{G}}_{2}\vee {\text{R}}_{2})|\,(\neg {\text{W}}_{1} \wedge \neg {\text{W}}_{2}))$.
Instead of $P\left(\neg{\text{W}} \to {\text{G}}|{\text{I}}\right)$ we take: ${\text{P}}_{\text{U2}} ({\text{G}}_{1} | ({\text{G}}_{1}\vee {\text{R}}_{1}))$.
Instead of $P\left(\neg{\text{W}} \to {\text{G}}|{\text{II}}\right)$ we take: ${\text{P}}_{\text{U2}} ({\text{G}}_{2} | ({\text{G}}_{2}\vee {\text{R}}_{2}))$.^{Footnote 27}

The conditional probabilities in the (simple) space Ξ_U2 are given by:

$$ {\text{P}}_{\text{U2}} (({\text{W}}_{1} \vee {\text{G}}_{1} \vee {\text{R}}_{1} )|(\neg {\text{W}}_{1} \wedge \neg {\text{W}}_{2} )) \, = \frac{{\lambda_{1} \left( {p_{1} + q_{1} } \right)}}{{\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} }} = 5/8 $$
$$ {\text{P}}_{\text{U2}} (({\text{W}}_{2} \vee {\text{G}}_{2} \vee {\text{R}}_{2} )|(\neg {\text{W}}_{1} \wedge \neg {\text{W}}_{2} )) \, = \frac{{\lambda_{2} \left( {p_{2} + q_{2} } \right)}}{{\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} }} = 3 /8 $$
$$ {\text{P}}_{\text{U2}} ({\text{G}}_{1} |({\text{G}}_{1} \vee {\text{R}}_{1} )) \, = \frac{{p_{1} }}{{(p_{1} + q_{1} )}} = \, 9 /10 $$
$$ {\text{P}}_{\text{U2}} ({\text{G}}_{2} |({\text{G}}_{2} \vee {\text{R}}_{2} )) \, = \frac{{p_{2} }}{{(p_{2} + q_{2} )}} = 1 /10 $$

So, finally:

$$ P(\neg {\text{W}} \to_{\text{g}} {\text{G}}) \, = \frac{{\lambda_{1} q_{1} + \lambda_{2} q_{2} }}{{(\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} )}} = 0.6 $$

which is formally correct, identical with the results known from the literature (Kaufmann 2004; Khoo 2016) and already established by Fact 3.3. We have here just made the intuitive considerations mathematically precise.

Two facts are of particular importance for the ongoing analyses:

Fact 3.6

\( \begin{aligned} {\text{P}}_{\upalpha}^{\text{glob}} (\neg W \to G) \, =\, & {\text{P}}_{\text{II}} ({\text{G}}_{1} |({\text{G}}_{1} \vee {\text{R}}_{1} )){\text{P}}_{\text{II}} (({\text{W}}_{1} \vee {\text{G}}_{1} \vee {\text{R}}_{1} )|(\neg {\text{W}}_{ 1} \wedge \neg {\text{W}}_{ 2} )) \\ & + {\text{P}}_{\text{II}} ({\text{G}}_{ 2} |({\text{G}}_{ 2} \vee {\text{R}}_{ 2} )){\text{P}}_{\text{II}} (({\text{W}}_{ 2} \vee {\text{G}}_{ 2} \vee {\text{R}}_{2} )|(\neg {\text{W}}_{1} \wedge \neg {\text{W}}_{2} )) \\ =\, & \frac{{p_{1} }}{{(p_{1} + q_{1} )}}\frac{{\lambda_{1} \left( {p_{1} + q_{1} } \right)}}{{(\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} )}} + \frac{{p_{2} }}{{(p_{2} + q_{2} )}}\frac{{\lambda_{2} \left( {p_{2} + q_{2} } \right)}}{{(\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} )}} \\ =\, & \frac{{\lambda_{1} q_{1} + \lambda_{2} q_{2} }}{{(\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} )}} \\ \end{aligned} \)

Fact 3.7

$$ \begin{aligned} {\text{P}}_{\upalpha}^{\text{glob}} (\neg W \to R) \, =\, & \frac{{q_{1} }}{{(p_{1} + q_{1} )}}\frac{{\lambda_{1} \left( {p_{1} + q_{1} } \right)}}{{(\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} )}} \\ & + \frac{{q_{2} }}{{(p_{2} + q_{2} )}}\frac{{\lambda_{2} \left( {p_{2} + q_{2} } \right)}}{{(\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} )}} =\, \frac{{\lambda_{1} q_{1} + \lambda_{2} q_{2} }}{{(\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} )}} \\ \end{aligned} $$

The structure of these formulas is very similar to the structure of the formulas for the probabilities of WIN_α and LOSS_α in The (One-Urn) Conditional Game ¬W → G (cf. Facts 1.3 and 1.4). They correspond precisely to the situation where we have just one urn (call it U_new), where the probabilities of drawing a White, Green, or Red ball are respectively:

P_new(W) = λ₁r₁ + λ₂r₂
P_new(G) = λ₁p₁ + λ₂p₂
P_new(R) = λ₁q₁ + λ₂q₂.

In this urn:

$$ {\text{P}}_{\text{new}} ({\text{G}}{\mid }\neg {\text{W}}) \, = \frac{{\lambda_{1} p_{1} + \lambda_{2} p_{2} }}{{(\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} ) }} $$

Consider the probability space Ξ_new = (Ω_new, Μ_new, P_new), where Ω_new is the set of elementary events: W, G, R with probabilities respectively λ₁r₁ + λ₂r₂, λ₁p₁ + λ₂p₂, λ₁q₁ + λ₂q₂; Μ_new is the σ-field of all events and P_new is the probability measure defined on M_new.

The following fact takes place:

Fact 3.8

$$ {\text{P}}_{\upalpha}^{\text{glob}} (\neg W \to G) \, = {\text{ P}}_{\text{new}} ({\text{G}}{\mid }\neg {\text{W}}) $$

It is important to observe that the probability space Ξ_new has a universal character which means that it allows us to compute the probabilities of other conditionals as conditional probabilities. For example, the probability of β = ¬G → W can be computed within Ξ_new simply as P_new(W∣¬G). Of course, the appropriate probability space Ξ ^glob_β = (Ω ^glob_β , Μ ^glob_β , P ^glob_β ) will be different than Ξ ^glob_α = (Ω ^glob_α , Μ ^glob_α , P ^glob_α ), as it will consist of sequences of the form GⁿW, GⁿR. So P ^glob_β (¬G → W) = P_new(W∣¬G). We summarize these observations as:

Fact 3.9

For any probability space Ξ = (Ω, Μ, P), where Ω = {W,G,R}, there is another probability space Ξ_new = (Ω_new, Μ_new, P_new), where Ω_new = {W, G, R} is the set of elementary events with probabilities λ₁r₁ + λ₂r₂, λ₁p₁ + λ₂p₂, λ₁q₁ + λ₂q₂ respectively; Μ_new is the σ-field of all events, and P_new is the probability measure defined on M_new, such that:

For any globally interpreted conditional ¬X → Y, where X, Y ∈ {W, G, R} and X ≠ Y, and the space Ξ ^glob_¬X→Y = (Ω ^glob_{¬X→ Y} , Μ ^glob_{¬X→ Y} , P ^glob_{¬X→ Y} ):

$$ {\text{P}}_{\neg X \to Y}^{\text{glob}} ([\neg X\, \to \,Y]_{\neg X \to Y} )\, = \,{\text{P}}_{\text{new}} ({\text{Y}}{\mid }\neg {\text{X}}) $$

Let us once again take a look at the formula in Fact 3.6. The terms $ \frac{{p_{1} }}{{(p_{1} + q_{1} )}} $ and $ \frac{{p_{2} }}{{(p_{2} + q_{2} )}} $ are just the probabilities of the conditional ¬W → G computed separately in the two urns I and II (or: the probabilities of winning The Conditional Game within these single urns). The more complicated terms $ \frac{{\lambda_{1} \left( {p_{1} + q_{1} } \right)}}{{(\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} )}} $ and $ \frac{{\lambda_{2} \left( {p_{2} + q_{2} } \right)}}{{(\lambda_{1} p_{1} + \lambda_{2} p_{2} + \lambda_{1} q_{1} + \lambda_{2} q_{2} )}} $ are conditional probabilities of being in a certain urn, provided we know that a non-White ball has been drawn. So we have a mathematically precise expression which corresponds exactly to Kaufmann’s intuitive formulas. Remember that The Global Conditional Game in two urns can be simulated with one urn, where the probabilities of White, Green and Red balls have been fixed in an appropriate way. This urn is universal (or: stable) in the sense that it can also be used for computing the probabilities of all globally interpreted conditionals formulated in the language with atomic sentences W,G,R.^{Footnote 28}

4 Local Interpretation

4.1 The Stochastic Graph Representation

If we chose the local interpretation, i.e. option (5), we remain in the urn drawn in the first move until the game is decided. This means that we have different scenarios leading to the decision compared to the global case. We can give the simples graph representation of the game (just like in the global case).

Graph 4 (Fig. 4, next page) corresponds to graph 3. It has the same states (START, I,II,WIN,LOSS), but its structure (i.e. edges) is different.

The difference with respect to graph 3 (Fig. 3—illustrating the global case) is that it is no longer possible to get back to START from the states I, II. This is because in the local interpretation, we remain within an urn after we get into it. In particular, if we are already in one of the urns i (i = I, II), we have 3 possible actions (Fig. 4):

After drawing a White ball we are again in the same state i (with probability r_i)
After drawing a Green ball we go to WIN (with probability p_i)
After drawing a Red ball we go to LOSS (with probability q_i)

WIN and LOSE are absorbing states: if we get in any of them, we remain there and the game stops.

The transition probabilities in the graph are given in the following way:

p_{START, I} = λ₁ = 1/4
p_{START, II} = λ₂ = 3/4
p_{I, I} = r₁ = 2/12
p_{I, WIN} = p₁ = 9/12
p_{I, LOSS} = q₁ = 1/12
p_{II, II} = r₂ = 50/60
p_{II, WIN} = p₂ = 1/60
p_{II, LOSS} = q₂ = 9/60
p_{WIN, WIN} = 1
p_{LOSS, LOSS} = 1^{Footnote 29}

This Markov graph exhibits the dynamics of the game and allows for a very simple computation of the probability of victory. The respective system of equations is:

P(START) = λ₁P(I) + λ₂P(II)
P(I) = r₁P(I) + p₁
P(II) = r₂P(II) + p₂

After solving it, we obtain:

Fact 4.1

$$ {\text{P}}\left( {\text{START}} \right) = \lambda_{1} \frac{{p_{1} }}{{(p_{1} + q_{1} )}} + \lambda_{2} \frac{{p_{2} }}{{(p_{2} + q_{2} )}} $$

The paths in the graph show us, which sequences of draws are possible. Of course, in the local situation, the sequences are homogeneous: only one index appears in them as a green/red ball from urn I (II) can be preceded only by white balls from the same urn I (II). So the sequences form the following set:

$$ \left\{ {{\text{G}}_{ 1} ,\;{\text{G}}_{ 2} ,\;{\text{R}}_{ 1} ,\;{\text{R}}_{2} ,\;{\text{W}}_{ 1} {\text{G}}_{1} ,\;{\text{W}}_{ 1} {\text{R}}_{ 1} ,\;{\text{W}}_{ 2} {\text{G}}_{ 2} ,\;{\text{W}}_{ 2} {\text{R}}_{2} ,\;{\text{W}}_{ 1} {\text{W}}_{ 1} {\text{G}}_{ 1} ,\;{\text{W}}_{ 1} {\text{W}}_{ 1} {\text{R}}_{1} ,\;{\text{W}}_{ 2} {\text{W}}_{ 2} {\text{G}}_{2} ,\ldots} \right\} $$

4.2 The Probability Space

We construct the probability space Ξ ^loc_α = (Ω ^loc_α , Μ ^loc_α , P ^loc_α ) for the Local Conditional Game.

Consider again the probability space Ξ_U2 = (Ω_U2, Μ_U2, P_U2), where Ω_U2 is the set of elementary events Ω_U2 = {W₁, W₂, G₁, G₂, R₁, R₂} with probabilities λ₁r₁, λ₂r₂, λ₁p₁, λ₂p₂, λ₁q₁, λ₂q₂. Μ_U2 is the σ-field of all events, and P_U2 is the probability measure on M_U2 (cf. Definition 3.2.)

Definition 4.2 The space Ξ ^loc_α for the locally understood conditional α = ¬W → G is the triple Ξ ^loc_α = (Ω ^loc_α , Μ ^loc_α , P ^loc_α ), where:

1.
Ω ^loc_α is the set of elementary events—these are the sequences of events from the space Ω_U2 which decide The Local Conditional Game α;
2.
Μ ^loc_α is the σ-field of all events;
3.
P ^loc_α is a probability measure on M ^loc_α .

Formally:

$$ \Omega_{\upalpha}^{\rm {loc}} = \{ {\rm {W}}_{ 1}^{\rm {n}} {\rm {G}}_{1} ,\;{\rm {W}}_{ 1}^{\rm {n}} {\rm {R}}_{1} ,\;{\rm {W}}_{ 2}^{\rm {n}} {\rm {G}}_{ 2} ,\;{\rm {W}}_{ 2}^{\rm {n}} {\rm {R}}_{2} \!:{\rm {n}} \in {\mathbb{N}}\} $$

$$ {\rm M}_{\upalpha}^{\text{loc}} = {\sf{P}} \left(\Omega_{\upalpha}^{{\text{loc}}}\right)\, \left({\text{here}}\,{\sf{P}}\, {\text{denotes the powerset of }}\Omega_{\upalpha}^{{\text{loc}}} \right)$$

The probability measure P ^loc_α is defined on the elementary events in the following way:

$$ {\text{P}}_{\upalpha}^{\rm loc} \left( {{\text{W}}_{ 1}^{\rm n} {\text{G}}_{1} } \right) \, = \lambda_{1} \left( {r_{1} } \right)^{\rm n} p_{1} ,({\text{for n}}\!\in\!{\mathbb{N}}); $$
$$ {\text{P}}_{\upalpha}^{\text{loc}} \left( {{\text{W}}_{ 1}^{{\rm n}} {\text{R}}_{1} } \right) \, = \lambda_{1} \left( {r_{1} } \right)^{\text{n}} q_{1} ,({\text{for n}}\!\in\!{\mathbb{N}}); $$
$$ {\text{P}}_{\upalpha}^{\text{loc}} \left( {{\text{W}}_{ 2}^{{\rm n}} {\text{G}}_{2} } \right) \, = \lambda_{2} \left( {r_{2} } \right)^{\text{n}} p_{2} ,({\text{for n}}\!\in\!{\mathbb{N}}); $$
$$ {\text{P}}_{\upalpha}^{\text{loc}} \left( {{\text{W}}_{ 2}^{{\rm n}} {\text{R}}_{2} } \right) \, = \lambda_{2} \left( {r_{2} } \right)^{\text{n}} q_{2} ,({\text{for n}}\!\in\!{\mathbb{N}}). $$

So there are four types of elementary events in this space, corresponding to possible scenarios of The Local Conditional Game:

Urn I has been drawn, and then n balls W₁, finally G₁;
Urn I has been drawn, and then n balls W₁, finally R₁;
Urn II has been drawn, and then n balls W₂, finally G₂;
Urn II has been drawn, and then n balls W₂, finally R₂;

The counterpart of ¬W → G in this probability space (i.e. the set of scenarios leading to victory) is:

$$ {\text{WIN}}_{\upalpha}^{\text{loc}} = \, [\neg W \to G]_{\upalpha}^{\text{loc}} = \, \{ {\text{W}}_{ 1}^{{\rm n}} {\text{G}}_{1} ,{\text{ W}}_{ 2}^{{\rm n}} {\text{G}}_{2}\!\!:{\text{n}} \in {\mathbb{N}}\} $$

The event we lose is:

$$ {\text{LOSS}}_{\upalpha}^{\text{loc}} = \{ {\text{W}}_{1}^{{\rm n}} {\text{R}}_{1} ,{\text{ W}}_{ 2}^{{\rm n}} {\text{R}}_{2}\!\!:{\text{n}} \in {\mathbb{N}}\} $$

The probabilities of victory (and defeat) have been computed in a simple way with the use of the stochastic graph. We can also (as in the case of the global interpretation) compute it in the straightforward way: after summing up appropriate geometric series, we obtain the familiar results:

Fact 4.3

$$ {\text{P}}_{\upalpha}^{\text{loc}} ( {\text{WIN}}_{\upalpha}^{\text{loc}} ) = \lambda_{1} \frac{{p_{1} }}{{(p_{1} + q_{1} )}} + \lambda_{2} \frac{{p_{2} }}{{(p_{2} + q_{2} )}} $$

Fact 4.4

$$ {\text{P}}_{\upalpha}^{\text{loc}} ( {\text{LOSS}}_{\upalpha}^{\text{loc}} ) \, = \lambda_{1} \frac{{q_{1} }}{{(p_{1} + q_{1} )}} + \lambda_{2} \frac{{q_{2} }}{{(p_{2} + q_{2} )}} $$

4.3 Kaufmann’s Equation (The Local Version)

We can again compare these formalisms and computations with the intuitive (but informal) computations from Kaufman (2004, 586):

$$ P(\neg W \to_{l} G) \, = P((\neg W \to G)|{\text{I}})P\left( {\text{I}} \right) \, + P((\neg W \to G)|{\text{II}}))P\left( {\text{II}} \right). $$

In the local interpretation, we have P(I), P(II)—not the conditional probabilities P(I∣¬W), P(II∣¬W)! We can formalize Kaufmann’s formulas in the following way:

$$\begin{aligned} &P({\text{I}}) = {\text{P}}_{{\text{U2}}}\, ({\text{G}}_{1} \vee {\text{R}}_{1} \vee {\text{W}}_{1}) = \lambda_{1}\\ &P({\text{II}}) = {\text{P}}_{{\text{U2}}}\, ({\text{G}}_{2} \vee {\text{R}}_{2} \vee {\text{W}}_{2}) = \lambda_{2}\\ &P((\neg W \to G)|{\text{I}})) = {\text{P}}_{\text{U2}}({\text{G}}_{1}|({\text{G}}_{1} \vee {\text{R}}_{1})) = \frac{p_{1}}{(p_{1}+q_{1})}\\ &P((\neg W \to G)|{\text{II}})) = {\text{P}}_{\text{U2}}({\text{G}}_{2}|({\text{G}}_{2} \vee {\text{R}}_{2})) = \frac{p_{2}}{(p_{2}+q_{2})}\end{aligned}$$

Finally, we have two facts, which are exact counterparts of Kaufmann’s intuitive formulas:

Fact 4.5

$$ \begin{aligned} {\text{P}}_{\upalpha}^{\text{loc}} (\neg {\text{W}} \to {\text{G}}) \, =\, & {\text{P}}_{\text{U2}} ({\text{G}}_{1} |({\text{G}}_{1} \vee {\text{R}}_{1} )){\text{ P}}_{\text{U2}} ({\text{G}}_{1} \vee {\text{R}}_{1} \vee {\text{W}}_{1} ) + {\text{P}}_{\text{U2}} ( {\text{G}}_{ 2} | ( {\text{G}}_{ 2} \vee {\text{R}}_{ 2} ) ) {\text{ P}}_{\text{U2}} ({\text{G}}_{2} \vee {\text{R}}_{2} \vee {\text{W}}_{ 2} ) \\ =\, &\uplambda_{1} \frac{{p_{1} }}{{(p_{1} + q_{1} )}} +\uplambda_{2} \frac{{p_{2} }}{{(p_{2} + q_{2} )}} \\ \end{aligned} $$

Fact 4.6

$$ \begin{aligned} {\text{P}}_{\upalpha}^{\text{loc}} (\neg W \to R) \, =\, & {\text{P}}_{\text{U2}} ( {\text{R}}_{1} |({\text{G}}_{1} \vee {\text{R}}_{1} )){\text{ P}}_{\text{U2}} ( {\text{G}}_{1} \vee {\text{R}}_{ 1} \vee {\text{W}}_{1} ) \, + {\text{ P}}_{\text{U2}} ( {\text{R}}_{ 2} |({\text{G}}_{2} \vee {\text{R}}_{2} )){\text{ P}}_{\text{U2}} ({\text{G}}_{2} \vee {\text{R}}_{2} \vee {\text{W}}_{2} ) \, \\ { \,=\, } &\uplambda_{1} \frac{{q_{1} }}{{(p_{1} + q_{1} )}} +\uplambda_{2} \frac{{q_{2} }}{{(p_{2} + q_{2} )}} \\ \end{aligned} $$

To summarize: the computation of the probability of the locally‑interpreted conditional is based on the probabilities of the sentence ¬W → G computed separately in urn I and urn II (which are $ \frac{{p_{1} }}{{(p_{1} + q_{1} )}} $ and $ \frac{{p_{2} }}{{(p_{2} + q_{2} )}} $). They are then summed up with weights equal to the probabilities of drawing the urn I or II (i.e. λ₁ and λ₂ respectively).

4.4 The Global Versus the Local Interpretation

We have shown that the conditional can be formally interpreted in at least two ways. Both these interpretations rely on different assumptions concerning the rules of The Conditional Game (Fig. 5).

To stress the difference, consider a generalization of the two-urn example, namely assume that we have n different urns. The main question to decide is: what should we do when the antecedent of the conditional is not fulfilled—i.e. when a White ball has been drawn from urn i?^{Footnote 30} According to the rules, we repeat the draw, but we have at our disposal only those urns which are somehow connected with the urn i via a certain relation R.^{Footnote 31}

If R is the identity relation, it means that every urn i is connected only with itself. So, after drawing White from urn i, we put the ball back and repeat the draw within the urn i. So in this case we have the local interpretation of the conditional.

If R is the full relation, it means that all urns are connected with each other. Whereas previously (in the local case), after drawing a white we returned the ball to the urn i and restarted the game within that particular urn i—here we select an urn again and draw the ball from this urn. In this case, we have the global interpretation of the conditional.

These two approaches are in a sense two extremes (and they are based on the “granularity” of the division of the set of urns), and many intermediate approaches are possible.

What are the important consequences of distinguishing between the local and global interpretation of the conditional? We will indicate only a few of them:

Under the local interpretation, the number of White balls in the urns does not matter. It means that the probability of the conditional is not dependent on the fact of how probable (or improbable) the antecedent is (i.e. what the probability is of drawing a non-White ball). It follows from the fact that the probability of the conditional in any of the urns depends not on the number of White balls but only on the proportion of Green balls to non-White balls (which is expressed by the formula $ \frac{{p_{i} }}{{(p_{i} + q_{i} )}} $). Imagine that we put 10¹⁰ additional White balls into the urn: the probability of the conditional will not change (of course, on average, we will have to wait longer for the result). In other words: the degree of counterfactuality of a sentence has no influence on its logical (mathematical) probability but it influences the magnitude which we might call its “practical probability”. This issue is discussed elsewhere.^{Footnote 32}

Under the global interpretation, the probability of the antecedent is important for estimating the probability of the conditional (again, we mean the probability of the event non-White was drawn in the initial space Ξ_U2 = (Ω_U2, Μ_U2, P_U2)).The following example will illustrate this: urn I contains 100 Green balls, 1 Red ball, and 10¹⁰⁰⁰⁰⁰ White balls. The probability of drawing this urn is 99/100. Urn II contains 1 Green ball, 100 Red balls and 1 White ball, and the probability of drawing urn II is 1/100. The probability of the (global) conditional ¬W → G is approximately 1/100, because we have virtually no chance to decide the game within urn I. The game will therefore (with probability almost 1) be decided in urn II. This is the consequence of the fact that drawing a non-White ball from urn I is highly improbable.

In particular, it follows from the considerations above that The Global Conditional Game can be reduced to The Conditional Game in one urn U_new. This new urn is universal in the sense characterized by the Fact 3.9. This is not possible for The Local Conditional Game. It follows from the fact that the probability of the locally interpreted conditional is independent from the number of white balls in the urns.

5 Lewis’s Triviality Result: PC = CP?

In the discussion concerning the probabilities of conditionals we have to take into account the PC = CP thesis, and the results of Lewis (1976) (and the ensuing discussion).^{Footnote 33}

Observe first, that this thesis was considered to be the expression of certain intuitions which are usually presented by the following, oft-quoted claims of Ramsey and van Fraassen:

If two people are arguing ‘If p will q?’ and both are in doubt as to p, they are adding p hypothetically to their stock of knowledge and arguing on that basis about q… We can say that they are fixing their degrees of belief in q given p (Ramsey 1929, 247).

What is the probability that I throw a six if I throw an even number, if not the probability that: if I throw an even number, it will be a six? (van Fraassen, 1976, 273).

Generally speaking, they express the view that the probability of the conditional is somehow dependent on the conditional probability. Applying it to the example of drawing balls from one urn (paragraph 1): if we want to know the probability of the conditional If the ball is not White, then it is Green, we have to compute the (usual) conditional probability Green was drawn under the condition non-White was drawn.

According to the widely-accepted interpretation of the PC = CP thesis, this statement should be written as:

$$ {\text{P}}(\neg W \to G) \, = {\text{ P}}({\text{G}}|\neg {\text{W}}) $$

And of course, in this form it is immediately subject to Lewis’s criticism: accepting standard assumptions concerning the properties of the probability measure and of the counterfactual leads us to the (obviously) false claim, that:

$ {\text{P}}(\neg W \to G) \, =\, {\text{ P}}({\text{G}})$ ^{Footnote 34}

But in the light of the semantics for conditionals proposed here it becomes obvious that even the very formulation of the thesis is not correct. The probability of the conditional ¬W → G can only be defined in the probability space Ξ_α, as there is no event, corresponding to this sentence within the initial probability space Ξ. Hayek’s numerical arguments concerning a similar example show it clearly enough (Hajek 2011, 4). So, if we are going to formulate the thesis PC = CP concerning the conditional ¬W → G in a formally proper way, it surely cannot be:

(PC = CP): P(¬W → G) = P(G|¬W), where P is the probability measure in the initial probability space Ξ.

But notice that in fact, the following equality holds (Fact 1.6):

$ ({\mathbf{P}}_{{\neg {\mathbf{W}} \to {\mathbf{G}}}} {\mathbf{C}} \, = \, {\mathbf{CP}})\!\!:{\text{P}}_{{\neg {\text{W}} \to {\text{G}}}} (\neg W \to G) \, = {\text{P}}({\text{G}}|\neg {\text{W}}), $ where on the left side we have the probability function P_α from the probability space Ξ_α (α = ¬W → G) and on the right side we have the probability function P defined in the initial probability space Ξ.

Two remarks follow:

(1)
This formulation corresponds to the intuitions of Ramsey and van Fraassen—it shows how to reduce the difficult problem of estimating the probability of conditionals to the simpler problem of computing the conditional probabilities in the initial probability space Ξ;
(2)
This thesis can be formally proven—it follows trivially from Fact 1.6.

What about the more complicated two-urn example? In this case, the problem of the dependence of the probability of the conditional ¬W → G, and the conditional probabilities can be given a weaker and a stronger interpretation.

Under the weaker interpretation, we expect only that calculating the probability of the counterfactual ¬W → G is possible if we know the conditional probabilities P(G∣¬W), calculated separately in urns I and II. This means that we expect that there is a formula giving the probability of the conditional, taking the conditional probabilities as arguments.

Under the stronger interpretation, we expect that the probability of the counterfactual ¬W → G is equal to a certain conditional probability P_x(G∣¬W) in a suitably-constructed probability space X.

Facts 3.6 and 4.3 allow us to formulate the proper versions of the PC = CP principle under the weaker interpretation: on the left side of the equality (equalities) there is the value of the probability of the conditional α in appropriate probability spaces Ξ ^glob_α and Ξ ^loc_α , and on the right side there is a certain function taking the appropriate conditional probabilities in the space Ξ_U2 = (Ω_U2, Μ_U2, P_U2) as arguments. The formulas for the conditional ¬W → G will have the following forms:

$$ \begin{aligned} & ({\mathbf{P}}_{{\neg {\mathbf{W}} \to {\mathbf{G}}}}^{{{\mathbf{glob}}}} {\mathbf{C}} \, = \, {\mathbf{CP}}_{{{\mathbf{II}}}} )\!\!: \,{\text{P}}_{{\neg {\text{W}} \to {\text{G}}}}^{\text{glob}} (\neg {\text{W}} \to {\text{G}}) = \\ & {\text{P}}_{\text{U2}} ({\text{I}}|(\neg {\text{W}}_{1} \wedge \neg {\text{W}}_{2} )){\text{P}}_{\text{U2}} ({\text{G}}_{1} |({\text{G}}_{1} \vee {\text{R}}_{1} )) + {\text{P}}_{\text{U2}} ({\text{II}}|(\neg {\text{W}}_{1} \wedge \neg {\text{W}}_{2} )){\text{P}}_{\text{U2}} ({\text{G}}_{2} |({\text{G}}_{2} \vee {\text{R}}_{2} )) \\ \end{aligned} $$

$$ \begin{aligned} & ({\mathbf{P}}_{{\neg {\mathbf{W}} \to {\mathbf{G}}}}^{{{\mathbf{loc}}}} {\mathbf{C}} \, = \, {\mathbf{CP}}_{{{\mathbf{II}}}} )\!\!: \,{\text{P}}_{{\neg {\text{W}} \to {\text{G}}}}^{\text{loc}} (\neg {\text{W}} \to {\text{G}}) = \\ & {\text{P}}_{\text{U2}} \left( {\text{I}} \right){\text{ P}}_{\text{U2}} ({\text{G}}_{1} |({\text{G}}_{1} \vee {\text{R}}_{1} )) + {\text{P}}_{\text{U2}} \left( {\text{II}} \right){\text{ P}}_{\text{U2}} ( {\text{G}}_{2} |({\text{G}}_{ 2} \vee {\text{R}}_{2} )) \\ \end{aligned} $$

where I and II are abbreviations for $ ({\text{W}}_{1} \vee {\text{G}}_{1} \vee {\text{R}}_{1})\, {\text{and}}\, ({\text{W}}_{2} \vee {\text{G}}_{2} \vee {\text{R}}_{2})$ respectively.^{Footnote 35}

Notwithstanding the fact that the right sides of these equalities are quite extended, we must agree that they conform to Ramsey’s intuitions: they reduce the probability of the conditional to the conditional probabilities ${\text{P}}_{\text{U2}} ({\text{G}}_{1} | ({\text{G}}_{1}\vee {\text{R}}_{1}))\, {\text{and}}\, {\text{P}}_{\text{U2}} ({\text{G}}_{2} | ({\text{G}}_{2}\vee {\text{R}}_{2}))$ summed up with certain weights (which depend on the interpretation of the conditional as local or global). This is obvious for the local interpretation; for the global interpretation, it is evident from Fact 3.6. And it is clear that the probability of the conditional ¬W → G (interpreted globally) is equal to the conditional probability in the new urn U_new (Fact 3.8).

For the global interpretation of the conditional, an even stronger version can be formulated in the form: $ {\text{P}}_{{\neg {\text{W}} \to {\text{G}}}}^{\text{glob}} $C = CP_new, where P_new is the probability in the suitable probability space Ξ_new = (Ω_new, Μ_new, P_new) (cf. Fact 3.8). It should be stressed that this space (according to the remarks at the end of paragraph 3) has a stable character: it can be used to compute the probabilities of other conditionals built from the atomic sentences W,G,R.^{Footnote 36}

Observe, that a natural counterpart of the space Ξ_new = (Ω_new, Μ_new, P_new) for the local interpretation does not exist. Of course, we can trivially define an artificial probability space Ξ^# = (Ω^#, Μ^#, P^#) with events dubbed W,G,R in such a way that the probability of the locally interpreted conditional ¬W → G equals P^#(G∣¬W). But the stability condition will not be fulfilled: this space will not work for other conditionals, e.g. for ¬G → W. This is because it is only the globally interpreted conditional which has the Ramseyian character.

To summarize: when analyzing the relationships between the probability of the conditional and conditional probability, we have to be aware of the fact that the notion of probability can have different meanings (in the sense that it can be defined differently in different probability spaces). Lewis’s arguments are directed only against a particular version of the PC = CP thesis, where there is the same probability measure P on both sides of the equality. But it need not have this form; in particular, this form is not enforced by the task it was meant to fulfill, i.e. the task of reducing the probabilities of the conditionals to (functions of) the conditional probabilities. But exactly this task is fulfilled (without any fear of Lewis’s arguments) by the principles we have given here:

(P_¬W→GC = CP): concerning The Conditional Game in one urn, where we have the strong interpretation.

($ {\text{P}}_{{\neg {\text{W}} \to {\text{G}}}}^{\text{glob}} $C = CP_U2): concerning The Two Urn Global Conditional Game—where the probability of the conditional ¬W → G is computed using conditional probabilities in the space Ξ_U2 = (Ω_U2, Μ_U2, P_U2) with appropriate weights—this is the weak interpretation.

($ {\text{P}}_{{\neg {\text{W}} \to {\text{G}}}}^{\text{glob}} $C = CP_new): concerning The Two Urn Global Conditional Game—where the probability of the conditional ¬W → G is computed using conditional probabilities in the space Ξ_new = (Ω_new, Μ_new, P_new)–i.e. we have the strong interpretation here.

($ {\text{P}}_{{\neg {\text{W}} \to {\text{G}}}}^{\text{loc}} $C = CP_U2): concerning The Two Urn Local Conditional Game—where the probability of the conditional ¬W → G is computed using conditional probabilities in the space Ξ_U2 = (Ω_U2, Μ_U2, P_U2) with appropriate weights—of course, they are different than in the global case—so this is the weak interpretation.

6 Summary

The semantics for conditionals given in terms of stochastic graphs (and the corresponding probability spaces) has several advantages:

It allows us to exhibit the structure of the problem in a very clear way.
It reveals the differences between the global and local interpretations of the conditional.
It is mathematically precise and allows for a straightforward computation of the needed probabilities.

But this model reveals its full power when we turn to more complex issues. In our opinion, promising areas of research (where we already have obtained some results) are:

Generalizing the notions of the local and global interpretation (we consider a more general case of a certain accessibility relation between the worlds).
The description of nested conditionals.^{Footnote 37}
Making the models more realistic by considering not only mathematical probability, but also “practical probability, where the time factor is taken into account (we will not bet on a game which we will almost surely win—but only after a billion years!).

Additionally, there is also a clear metaphilosophical gain: our models show how certain vaguely understood philosophical notions can be given formal, fully precise counterparts. They can be called explications (in Carnap’s sense). When turning to more complex issues, more powerful mathematical methods will be used and they will provide additional explanatory power for these problems. Our results can be also viewed as a contribution to the idea of mathematical philosophy (Leitgeb 2013). So—in a sense—this paper contributes also to the more general discussion concerning the role of mathematical methods in analyzing philosophical problems.

Notes

We define the appropriate probabilities, but the model also uses the notion of truth-conditions (expressed in terms of the rules of a game). However, we do not discuss the topic here, as the aim of the article is to present the formal model and to discuss the interpretation of the conditional (and its impact on the discussion about PC = CP).
The problem of nested conditionals is discussed in Wójtowicz and Wójtowicz (2019a).
In order to maintain consistency with the upcoming formalisation, we shall capitalise the three colour adjectives henceforth.
This is a much-discussed issue, but we will not go into details here.
M is formally very simple: M = 2^Ω (i.e. it is the power set of Ω). Of course, we could also use as the model a (more complicated) 20-element space (with 10 White, 8 Green, 2 Red balls, all equally probable), but we prefer the simpler—and more general—interpretation.
This reflects the simple tautology of the propositional calculus: $ (\neg {\text{p}} \to {\text{q}}) \leftrightarrow ({\text{p}} \vee {\text{q}})$. But it is widely agreed that the interpretation of a conditional as the material implication is not appropriate.
We admit, that this choice might be natural if we do not admit „restarting actuality”, and in particular if we do not want to think in terms of possible worlds. If we think of evaluating the probability or truth value of the conditional in terms of the actual world only, without allowing possible worlds (or alternative scenarios) etc. our approach can be considered problematic. However, it is quite standard for other approaches, e.g. of van Fraassen: Imagine the possible worlds are balls in an urn, and someone selects a ball, replaces it, selects again and so forth. We say he is carrying out a series of Bernoulli trials (van Fraassen 1976, 279).
Examples of sequences which settle the game are: WWWG (we win with the fourth move); R (we lose with the first move); WWWWWR (we lose with the sixth move) etc.
In the space Ω = {W, R, G} there are 8 events (as M = 2^Ω), which correspond to Boolean combinations of the elementary events. But none of these events corresponds to the conditional ¬W → G.
In the general case, the gambler’s ruin process is defined for arbitrary N, which makes the process much more complex (think of describing the possible paths, for instance when N = 10 and n₁ = 1!). The gambler’s ruin process is a particular case of a more general random walk process; for a survey of applications see for instance chapter 4.5.1. of Ross (2007).
In the general case, when there are n possible states s_i, for i = 1,…n;
- p_i,j denotes the transition probability from the state s_i to the state s_j;
- P(i) denotes the probability of finally (i.e. possibly after a sequence of steps) reaching the winning state starting from the state s_i.
The states in this article have their names, so we use symbols like p_START,WIN or P(START).
This is the transition probability matrix for the graph 2 (Fig. 2):

START
WIN
LOSS
START
r
p
q
WIN
0
1
0
LOSS
0
0
1
The general theory is presented in the Appendix, but we can also justify the equation in an intuitive way. Assume that we are in the state START. What are our chances P(START) of winning? There are two ways of winning: (1) we can win in just one move, with probability p; (2) we will be in the starting position again, with probability r, but then the game is restarted, so the probability of winning is P(START). So, the “contributions” of these variants to P(START) are (1) p and (2) rP(START), which gives the equation P(START) = p + rP(START).
We can think of them as paths within the graph which end in one of the states WIN or LOSE. Each of these paths has a definite probability which is computed in a very intuitive way: as the draws are independent of each other, we simply multiply the probabilities; for instance, the probability of WWWG is rrrp.
Observe, that our approach is different in this respect from Kaufmann’s approach, which is based on Stalnaker Bernoulli models. Kaufmann presents the construction of a universal probability space, where all possible conditionals (based on the initial space Ω) can be represented. However, formal complications are the price to pay for the generality of this model.
The probability of drawing an infinite sequence of White balls (treated as a sequence of events from the initial probability space Ξ) is 0 and so we do not need to include it in our definition of Ξ_α.
P_α(Ω_α) = P_α({WⁿG, WⁿR: n∈ℕ}) = P_α({WⁿG: n∈ℕ}) + P_α({WⁿR: n∈ℕ}) = $ \mathop \sum \limits_{n = 0}^{\infty } pr^{n} + \mathop \sum \limits_{n = 0}^{\infty } qr^{n} = \frac{p}{1 - r} + \frac{q}{1 - r} = \frac{p}{p + q} + \frac{q}{p + q} = 1 $.
By. the way, observe, that this semantics allows us also to compute the probabilities of sentences like I will win The Conditional Game within an even number of moves, which is a nice feature of the model.
Usually, when we speak of fractions, we have finite sets in mind—but the sets here are infinite (and even uncountable). So perhaps the word “proportion” would be more appropriate.
Kaufmann’s model using Stalnaker Bernoulli spaces covers right-nested conditionals B → (C → D); left-nested conditionals (B → C) → D; conjoined conditionals (A → B)∧(C → D) and conditional conditionals (A → B) → (C → D). The model makes it possible to give explicit formulas for these probabilities: P*(B → (C → D)) = P(CD∣B) + P(C^c∣B)P(D∣C), where C^c is the complement of C, and CD is the conjunction (intersection) of events C and D). P* is the probability in the Bernoulli Stalnaker space. The formula for the conjoined conditional (Kaufmann 2009, 12, see also McGee 1989, 500) is:

$$ {\text{P}}^{*}(({\text{A}}\, \to \,{\text{B}}) \wedge ({\text{C}}\, \to \,{\text{D}}))\, = \frac{{ \left[ {{\text{P}}\left( {\text{ABCD}} \right) + {\text{P}}\left( {\neg {\text{ACD}}} \right){\text{P}}\left( {{\text{A}} \to {\text{B}}} \right) + {\text{P}}\left( {{\text{AB}}\neg {\text{C}}} \right){\text{P}}\left( {{\text{C}} \to {\text{D}}} \right)} \right]}}{{{\text{P}}\left( {{\text{A}} \vee {\text{C}}} \right) }} $$
These are very important achievements and a perfect motivation and justification for introducing and investigating the Bernoulli Stalnaker model. The graph model covers all these cases as well but simplifies matters and allows for generalizations (see also footnote 37).
In Kaufmann’s model we have a red ball with a spot instead of green and a red ball without a spot for red. Choosing a red ball (with or without a spot) corresponds to choosing a non-White ball. Kaufmann considers the sentence: If I pick a red ball, it will have a black spot, which corresponds exactly to ours If the ball is non-White, then it is Green. As the examples are entirely isomorphic, it is easy to compare the results.
We adopt the terminology from Kaufmann (2004, 2005, 2009, 2015), leaving aside the discussion concerning the detailed relationships between this classification, and other possible ones, like the standard subjunctive/indicative classification of conditionals (as exemplified by the famous Oswald–Kennedy sentences (Adams 1970)) or epistemic/metaphysical distinction from Khoo (2015).
The transition probability matrix for this graph (Fig. 3) in the general case is:
GLOBAL
START
I
II
WIN
LOSS
START
0
λ₁
λ₂
0
0
I
r₁
0
0
p₁
q₁
II
r₂
0
0
p₂
q₂
WIN
0
0
0
1
0
LOSS
0
0
0
0
1
U2 in the index indicates the fact that we have two urns to draw the balls from.
E.g. the probability of W₁W₁W₂W₁W₂G₁ equals (λ₁r₁)³(λ₂r₂)²(λ₁p₁).
Khoo observes: “Kaufmann's theory marks a significant advance in our thinking about conditionals. However, I think it faces some challenges. First, it is incomplete: since Kaufmann never specifies a semantics for conditionals, it remains unclear why conditionals should have local and global interpretations – for instance, is it due to ambiguity or perhaps context-dependence?” Khoo (2016, 8). We remove this weakness.
P(¬W → G∣I) is replaced by ${\text{P}}_{\text{U2}} ({\text{G}}_{1} | ({\text{G}}_{1}\vee {\text{R}}_{1}))$. P(¬W → G∣I) expresses our intuitive degree of belief that we will win the Conditional Game ¬W → G, under the assumption, that we are in urn I. But to win the game ¬W → G amounts to drawing a Green ball, provided it is Green or Red, i.e. not White (and everything is going on within urn I). And this is exactly ${\text{P}}_{\text{U2}} ({\text{G}}_{1} | ({\text{G}}_{1}\vee {\text{R}}_{1}))$.
It does not matter that we have three atomic sentences – the generalization to n atomic sentences X₁, … X_n is straightforward. It will also work for a countably infinite language (set of elementary events).
The transition probability matrix for this graph (Fig. 4) in the general case is:
LOCAL
START
I
II
WIN
LOSS
START
0
λ₁
λ₂
0
0
I
0
r₁
0
p₁
q₁
II
0
r₂
0
p₂
q₂
WIN
0
0
0
1
0
LOSS
0
0
0
0
1
Observe that in the case, when there are only non-White balls in the urns (i.e. when r₁ = 0 and r₂ = 0) this distinction would make no sense: the probability of the conditional ¬W → G could be computed in one way only and it would be equal to λ₁p₁ + λ₂p₂ (the formulas from the facts 3.3, 4.1 both reduce to this simple form when r₁ = r₂ = 0). This is obvious, because in this case the game would be settled after drawing the first ball from the chosen urn.
In the literature concerning counterfactuals, such a relation R is often interpreted as a relationship occurring between the nearest or neighbouring worlds (this can be traced back at least to Stalnaker (1968)).
Here we observe only that it is easy to construct an example where the probability of the conditional ¬W → G is approximately 1 but at the same time we can claim that the game will last almost forever. It is enough to consider one urn containing 1 Red ball, 10¹⁰ Green balls, and 10^1000000 White balls. The chance of winning is almost 1, but we will have to wait very long for our victory (on average approx. 10¹⁰⁰⁰⁰⁰ moves). The details are given in Wójtowicz and Wójtowicz (2019b).
We do not discuss the historical details and the discussion. The thesis has been also known as “Stalnaker’s Thesis” or “Adams’ Thesis” (Stalnaker 1970; Adams 1965, 1966, 1975). Hajek and Hall (1994) and Hall (1994) use the term “Conditional Construal of Conditional Probability”; Hajek and Hall (1994) discuss several versions of it. Using their terminology we can say, that what we defend is closest to the universal version and the universal tailoring version Hajek and Hall (1994, 76).
Lewis’s result can be formulated as: there is no connective * such that P(X*Y) = P(Y|X) (for every P).
${\text{P}}_{\text{U2}} ({\text{G}}_{1} | ({\text{G}}_{1} \vee {\text{R}}_{1}))\, {\text{and}}\, {\text{P}}_{\text{U2}} ({\text{G}}_{2} | ({\text{G}}_{2}\vee {\text{R}}_{2}))$ are arguments of these functions!
It can also be generalized in a straightforward way to the case of finitely (or countably infinitely) many atomic sentences.
We can model right-nested conditionals B → (C → D) with graphs in a straightforward way. Think of a toy example of drawing colorful balls which have a heavy or light mass. The conditional If the ball is Heavy, then if it is not White, it is Green has two interpretations (deep and shallow), and the difference can by represented by the corresponding graphs, which exhibit the dynamics of the game. The Stalnaker Bernoulli spaces account imposes the shallow interpretation, which is unnatural in some cases: for example, the model gives very counter-intuitive results for sentences like If the match is wet, then if you strike it, it will light. The causal random variable model in Kaufmann (2009) covers the deep interpretation (this interpretation agrees with McGee’s account from McGee (1989); without going into details, the deep interpretation imposes the Import–Export principle). But they are two different models and an advantage of our model is that it can account in a uniform way for both the deep and shallow interpretation of right-nested conditionals. The method can also be extended to longer right-nested conditionals A₁ → (…(A_n-1 → A_n)), and also for cases when conditional operators that occur in the long conditional need different interpretations (i.e. some of them are deep and some are shallow). The rules for constructing appropriate graphs are straightforward; this is impossible to achieve within the Stalnaker Bernoulli model (which by definition works only for the shallow interpretation), and we suppose that mixed conditionals are not easy to cover within the causal model (which is designed to cover the deep interpretation). It is also possible to explain the interplay between the local/global and deep/shallow interpretations of the compound conditional. Due to limitations of space it is not possible to give the technical details here; they can be found in Wójtowicz and Wójtowicz (2019a).
The simplest presentation of the needed notions from Markov chains theory (we know of) is to be found in chapter 7 of Bertsekas and Tsitsiklis (2008). Our spider-fly example is a slight modification of the original one.
Formally, a Markov chain is a sequence of random variables X₀,…,X_k…, with values in the set of states S, which have the Markov property, i.e.
P(X_n+1 = j| X₀ = i₀, X₁ = i₁,…, X_n = i) = p_ij
for all times n, all states i, j∈S, and all possible sequences i₀, i₁,…,i_n−1 of earlier states.
Indeed, in our case it is natural to think of some very concrete actions leading from state to state. But in general, it is enough only to say, that there occurs a transition with a certain probability.
So we identify the set of paths generated by the graph by—in a sense—reversing the problem: consider any string of actions, and ask, whether it leads from START to an accepting state. Assume the fly starts in the state 1, and consider four sequences of actions: (1) RWR; (2) RLRLRLWWL; (3) WWWR; (4) RRRR. We see, that (1) RWR leads to the accepting state 3, so this path is in Ω*; (2) RLRLRLWWL leads to the accepting state 0, so this path is in Ω* as well; (3) WWWR leads to the state 2 (i.e. the game has not finished yet), so this path is not in Ω*; (4) is not in Ω*: we land in the accepting state 3 already after two R’s – but there are still two R’s left, so the string is too long. The paths generated by the graph are exactly these sequences of actions, which lead to an absorbing state in their final step. For formal definitions of a deterministic automaton and the language accepted by the automaton see for instance (Anderson 2006) or the classic (Hopcroft et al. 2001).
More general results stated in the language of matrix equations can be found for instance in chapter 11 of Grinstead and Snell (1997). The classic reference, with an extensive discussion of the motivations is Feller (1968).

References

Adams, E. W. (1965). On the logic of conditionals. Inquiry, 8, 166–197.
Article Google Scholar
Adams, E. W. (1966). Probability and the logic of conditionals. In J. Hintikka & P. Suppes (Eds.), Aspects of inductive logic (pp. 265–316). North Holland: Amsterdam.
Chapter Google Scholar
Adams, E. W. (1970). Subjunctive and indicative conditionals. Foundations of Language, 6, 89–94.
Google Scholar
Adams, E. W. (1975). The logic of conditionals. Dordrecht: D. Reidel.
Book Google Scholar
Anderson, J. A. (2006). Automata theory with modern applications. Cambridge: Cambridge University Press.
Book Google Scholar
Bertsekas, D. P., & Tsitsiklis, J. N. (2008). Introduction to probability (2nd ed.). Nashua: Athena Scientific.
Google Scholar
Brun, G. (2016). Explication as a method of conceptual re-engineering. Erkenntnis, 81(6), 1211–1241.
Article Google Scholar
Carnap, R. (1950). Logical foundations of probability (2nd ed.). Chicago/London: University of Chicago Press/Routledge and Kegan Paul.
Google Scholar
Feller, W. (1968). An introduction to probability theory and its applications (3rd ed., Vol. 1). New York: Wiley.
Google Scholar
Grinstead, C. M., & Snell, J. L. (1997). Introduction to probability. Providence, Rhode Island: AMS.
Google Scholar
Hajek, A. (2011). Triviality pursuit. Topoi, 30(1), 3–15.
Article Google Scholar
Hájek, A., & Hall, N. (1994). The hypothesis of the conditional construal of conditional probability. In E. Eells & B. Skyrms (Eds.), Probabilities and conditionals: Belief revision and rational decision (pp. 75–110). Cambridge: Cambridge University Press.
Google Scholar
Hall, N. (1994). Back in the CCCP. In E. Eells & B. Skyrms (Eds.), Probabilities and conditionals: Belief revision and rational decision (pp. 141–160). Cambridge: Cambridge University Press.
Google Scholar
Hopcroft, J. E., Motwani, R., & Ullman, J. D. (2001). Introduction to automata theory, languages and computation (2nd ed.). Reading: Addison-Wesley.
Google Scholar
Kaufmann, S. (2004). Conditioning against the grain: abduction and indicative conditionals. Journal of Philosophical Logic, 33(6), 583–606.
Article Google Scholar
Kaufmann, S. (2005). Conditional predictions: A probabilistic account. Linguistics and Philosophy, 28(2), 181–231.
Article Google Scholar
Kaufmann, S. (2009). Conditionals right and left: probabilities for the whole family. Journal of Philosophical Logic, 38, 1–53.
Article Google Scholar
Kaufmann, S. (2015). Conditionals, conditional probability, and conditionalization. In H. Zeevat & H. C. Schmitz (Eds.), Bayesian natural language semantics and pragmatics (pp. 71–94). New York: Springer.
Chapter Google Scholar
Khoo, J. (2015). On indicative and subjunctive conditionals. Philosopher’s Imprint, 15(32), 1–40.
Google Scholar
Khoo, J. (2016). Probabilities of conditionals in context. Linguistics and Philosophy, 39, 1–43.
Article Google Scholar
Leitgeb, H. (2013). Scientific philosophy, mathematical philosophy, and all that. Metaphilosophy, 44(3), 267–275.
Article Google Scholar
Lewis, D. (1976). Probabilities of conditionals and conditional probabilities. Philosophical Reviev, 85, 297–315.
Article Google Scholar
Lewis, D. (1986). Philosophical papers (Vol. 2). Oxford: Oxford University Press.
Google Scholar
McGee, V. (1989). Conditional probabilities and compounds of conditionals. Philosophical Review, XCVIII, 4, 485–541.
Article Google Scholar
Ramsey, F. P. (1929). General propositions and causality. In Mellor, D. H. (Ed.), (Reprinted in: Philosophical papers, pp. 145–163, Cambridge: Cambridge University Press 1990).
Ross, S. M. (2007). Introduction to probability models (9th ed.). New York: Academic Press.
Google Scholar
Stalnaker, R. (1968). A theory of conditionals. Studies in Logical Theory, American Philosophical Quarterly, Monograph, 2, 98–112.
Google Scholar
Stalnaker, R. (1970). Probability and conditionals. Philosophy of Science, 37(1), 64–80.
Article Google Scholar
Van Fraassen, B. C. (1976). Probabilities of conditionals. In W. L. Harper, R. Stalnaker, & G. Pearce (Eds.), Foundations of probability theory, statistical inference, and statistical theories of science (Vol. 1, pp. 261–308)., The University of Western Ontario Series in Philosophy of Science Dordrecht: D. Reidel.
Chapter Google Scholar
Wójtowicz, A., & Wójtowicz, K. (2019a). A graph model for probabilities of nested conditionals (submitted).
Wójtowicz, A., & Wójtowicz, K. (2019b). Mathematical versus practical probabilities of conditionals: A stochastic graph approach (in preparation).

Download references

Acknowledgements

The preparation of this paper was supported by National Science Centre (NCN) grant 2016/21/B/HS1/01955. For helpful discussions and critical comments, we would like to thank the audience in MCMP in Munich. We also thank the anonymous referee for insightful comments which allowed to remove some weaknesses of the paper.

Author information

Authors and Affiliations

Institute of Philosophy, University of Warsaw, Krakowskie Przedmieście 3, 00-927, Warsaw, Poland
Krzysztof Wójtowicz & Anna Wójtowicz

Authors

Krzysztof Wójtowicz
View author publications
You can also search for this author in PubMed Google Scholar
Anna Wójtowicz
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Krzysztof Wójtowicz.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 The Definition of a Markov Chain

Markov chains are random processes with discrete time (formally: sequences of random variables, indexed by natural numbers),^{Footnote 38} which satisfy the property of memorylessness, i.e. the Markov property. It means, that predictions on the future of the process depend only on the present state, not on the full history of the process—i.e. on the particular path, by which the system got to its present state from the initial state.

A Markov chain is specified by identifying:

1.
A set of states S = {s₁,…,s_N};
2.
A set of transition probabilities p_ij ≥ 0, for i, j = 1,…,N—i.e. the probabilities of changing from state s_i to state s_j. The probabilities obey the equations: $ \mathop \sum \limits_{k = 1}^{N} p_{ik} $ = 1, for i = 1,…,N (which means, that the probabilities of going from state s_i to any of the possible states sum up to one).^{Footnote 39}

Any Markov chain can be described by a N-by-N matrix, with the probabilities p_ij being the respective entries. States with p_ii = 1 (which means, that the system will never get out of the state s_i) are called absorbing.

A very convenient feature of Markov chains is the possibility to give an intuitive picture of the process in form of a transition graph—which exhibits the behavior of the system. This is a simple illustration: imagine a fly is changing its position according to some transition probabilities. There are four states: 0,1,2,3; a spider is lurking at the state 0, while something nice (for the fly) is waiting at state 3. So both 0 and 3 are absorbing states: when the fly gets there, the process terminates. The following matrix specifies the process:

STATE	0	1	2	3
0	1.0	0	0	0
1	0.3	0.4	0.3	0
2	0	0.3	0.4	0.3
3	0	0	0	1.0

And the respective graph is Graph 6 (we use the more general version, with p,q,r instead of 0.4, 0.3, 0.3) (Fig. 6, next page).

In general, it is convenient to think of travelling within the graph, from one vertex to another, until the process stops. The vertices represent the states of the system, and the edges represent transitions: we can think of them as representing actions (like the gambler tossing a coin, the fly choosing what to do next, or—as in this paper—drawing a ball from the urn).^{Footnote 40}

In the fly example, at each state, the fly can either: L—go left; W—wait; R—go right. When starting in 1, the sequence RWLWRR takes the fly to the state 3 (and the game terminates). Observe, that the “go left”, “wait”, “go right” actions for state 1 and 2 have the same probability (this assumption suits the purpose of the paper, as we assume that the respective actions within the games have constant probabilities). For notational convenience, take P(R) = p; P(L) = q; P(W) = r.

1.2 The Probability Space Ξ* = (Ω, Μ, P*) Associated with the Markov Graph

We assume here, that there are two absorbing states in the graph (WIN, LOSS), and the process begins in the initial state START. So, in the spider-fly example: 1 = START; 0 = LOSS; 3 = WIN.

The Ω* of the constructed probability space consist of all the paths, which can be travelled within the graph, starting from START and ending in an absorbing state. The probability of a particular path is computed by multiplying the respective transition probabilities. For instance, if RWLWRR is one of the paths for the spider-fly example, its probability is given by P*(RWLWRR) = prqrpp.

Formally, the set of paths generated by the graph can be defined as the language accepted by the graph treated as a deterministic finite state automaton. The edges of the automaton are labeled by the actions, which lead from one vertex to another. In the gambler’s ruin example these actions are H and T; in the fly example they are R, W, L—and throughout the paper the actions are drawing the balls: W(hite), R(ed), G(reen). As we only want to identify the paths generated by the graph, we can ignore the transition probabilities at this stage.^{Footnote 41}

The set of elementary events Ω* in the constructed space is defined as the set of generated paths—i.e. the language accepted by the automaton. Computing their probabilities is straightforward: for any sequence of actions/labels $ A_{{i_{1} j_{1} }} $ … $ A_{{i_{n} j_{n} }} $ (which form the accepted word) we multiply the corresponding transition probabilities. If the action $ A_{ij} $ leads from the state s_i to the state s_j, then the transition probability is p_ij. So, the probability P* of the sequence $ A_{{i_{1} j_{1} }} $ … $ A_{{i_{n} j_{n} }} $ (being an elementary event in the new space) is:

$$ {\text{P*}}\,(A_{{i_{1} j_{1} }} \ldots A_{{i_{n} j_{n} }} ) = p_{{i_{1} j_{1} }} \ldots p_{{i_{n} j_{n} }} $$

We have the Ω* and the probabilities of elementary events. As Ω* is countable (or finite in trivial cases), we take the respective σ-field M* to be the power set, i.e. M* = 2^Ω*. P* is defined for elementary events, and it extends in a unique way to all sets A ⊆ Ω* by the standard formula P*(A) = $ \mathop \sum \nolimits_{\omega \in A} {\text{P*}} \left( \omega \right) $. Finally, Ξ* = (Ω*, Μ*, P*) is the probability space corresponding to the graph.

Observe, that the probability space Ξ* = (Ω*, Μ*, P*) was constructed at the basis of a simple, initial probability space Ξ = (Ω, Μ, P), including the “atomic actions”. The space for the spider-fly example consisted of three elementary events R,L,W, with probabilities P(R) = p; P(L) = q; P(W) = r.

1.3 Absorbtion Probabilities of the Markov Chain

Consider the general case with N states s₁,…, s_N. We assume that there are two absorbing states, say s₁, and s_N. For the purpose of the presentation, it is convenient to identify s_N with victory, and s₁ with loss.

The process is described by the transition probabilities p_ij, for i,j = 1,2,…,N. As s₁ and s_N are absorbing states, it means that:

p₁₁ = 1; p_1,i = 0 for i ≠ 1.
P_NN = 1; p_Ni = 0 for i ≠ N.

Let P(i) be the probability of eventually reaching state s_N, starting from state s_i. By definition, P(1) = 0 and P(N) = 1 (the game is already over).

Fact The probabilities P(i) are the unique solution of the system of equations:

P(1) = $ 0 $
P(N) = 1
P(i) = $ \mathop \sum \limits_{k = 1}^{N} p_{ik} \;{\text{P}}\left( {\text{k}} \right) $, for i = 2,…,N−1

So, we have in general a linear system of N equations with N variables, and the solution is straightforward. It is also a basic result, that in such a graph, the probability, that the process will be eventually absorbed is 1. This is enough for our purposes.^{Footnote 42}

We end with the observation, that for absorbing Markov chains (like the ones considered here), the probability of absorption is 1. So if only it is possible for the process to be absorbed, it eventually will be absorbed. This means, that we do not have to worry about infinite paths—their probabilities are 0, and they need not be included in our Ω*.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Reprints and permissions

About this article

Cite this article

Wójtowicz, K., Wójtowicz, A. A Stochastic Graphs Semantics for Conditionals. Erkenn 86, 1071–1105 (2021). https://doi.org/10.1007/s10670-019-00144-z

Download citation

Received: 18 August 2018
Accepted: 28 June 2019
Published: 22 November 2019
Issue Date: October 2021
DOI: https://doi.org/10.1007/s10670-019-00144-z

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

A Stochastic Graphs Semantics for Conditionals

Abstract

Similar content being viewed by others

On the Probabilistic Notion of Causality: Models and Metalanguages

A Minimal Probability Space for Conditionals

Generalized logical operations among conditional events

1 A Simple One-Urn Example

1.1 A Stochastic Graph Representation

1.2 The Probability Space

1.3 A Comparison with Kaufmann’s (Stalnaker Bernoulli) Space

2 Two-Urn Examples: Global and Local Conditional Probabilities

3 The Global Interpretation

3.1 A Stochastic Graph Representation

3.2 The Probability Space

3.3 Kaufmann’s Equation (The Global Version)

4 Local Interpretation

4.1 The Stochastic Graph Representation

4.2 The Probability Space

4.3 Kaufmann’s Equation (The Local Version)

4.4 The Global Versus the Local Interpretation

5 Lewis’s Triviality Result: PC = CP?

6 Summary

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix

1.1 The Definition of a Markov Chain

1.2 The Probability Space Ξ* = (Ω, Μ, P*) Associated with the Markov Graph

1.3 Absorbtion Probabilities of the Markov Chain

Rights and permissions

About this article

Cite this article

Navigation

A Stochastic Graphs Semantics for Conditionals

Abstract

Similar content being viewed by others

On the Probabilistic Notion of Causality: Models and Metalanguages

A Minimal Probability Space for Conditionals

Generalized logical operations among conditional events

1 A Simple One-Urn Example

1.1 A Stochastic Graph Representation

1.2 The Probability Space

1.3 A Comparison with Kaufmann’s (Stalnaker Bernoulli) Space

2 Two-Urn Examples: Global and Local Conditional Probabilities

3 The Global Interpretation

3.1 A Stochastic Graph Representation

3.2 The Probability Space

3.3 Kaufmann’s Equation (The Global Version)

4 Local Interpretation

4.1 The Stochastic Graph Representation

4.2 The Probability Space

4.3 Kaufmann’s Equation (The Local Version)

4.4 The Global Versus the Local Interpretation

5 Lewis’s Triviality Result: PC = CP?

6 Summary

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix

Appendix

1.1 The Definition of a Markov Chain

1.2 The Probability Space Ξ* = (Ω*, Μ*, P*) Associated with the Markov Graph

1.3 Absorbtion Probabilities of the Markov Chain

Rights and permissions

About this article

Cite this article

Share this article

Search

Navigation

1.2 The Probability Space Ξ* = (Ω, Μ, P*) Associated with the Markov Graph