Conditionals, Conditional Probabilities, and Conditionalization

Bayesian Natural Language Semantics and Pragmatics

Part of the book series: Language, Cognition, and Mind (LCAM, volume 2)

Abstract

Philosophers investigating the interpretation and use of conditional sentences have long been intrigued by the intuitive correspondence between the probability of a conditional ‘if A, then C’ and the conditional probability of C, given A. Attempts to account for this intuition within a general probabilistic theory of belief, meaning and use have been plagued by a danger of trivialization, which has proven to be remarkably recalcitrant and absorbed much of the creative effort in the area. But there is a strategy for avoiding triviality that has been known for almost as long as the triviality results themselves. What is lacking is a straightforward integration of this approach in a larger framework of belief representation and dynamics. This paper discusses some of the issues involved and proposes an account of belief update by conditionalization.

Thanks to Hans-Christian Schmitz and Henk Zeevat for organizing the ESSLLI 2014 workshop on Bayesian Natural-Language Semantics and Pragmatics, where I presented an early version of this paper. Other venues at which I presented related work include the research group “What if” at the University of Konstanz and the Logic Group at the University of Connecticut. I am grateful to the audiences at all these events for stimulating discussion and feedback. Thanks also to Hans-Christian Schmitz and Henk Zeevat for their patience during the preparation of this manuscript. All errors and misrepresentations are my own.

Notes

  1.

    A \(\sigma \)-algebra on \(\varOmega \) is a non-empty set of subsets of \(\varOmega \) that is closed under complements and countable unions. A probability measure on \(\mathcal {F}\) is a countably additive function from \(\mathcal {F}\) to the real interval [0, 1] such that \(\mathrm {Pr} (\varOmega ) = 1\).

  2.

    I write ‘\(\mathrm {Pr} (\theta =x)\)’ to refer to the probability of the event that \(\theta \) has value x. This is an abbreviation of the more cumbersome ‘\(\mathrm {Pr} (\{\omega \in \varOmega \,|\, \theta (\omega ) = x\})\).’ I also assume, here and throughout this paper, that the range of the random variable is finite. This is guaranteed for \(\mathcal {L}_{A}^{0}\) under \(V\) in Definition 2, but becomes a non-trivial restriction in general. Nothing hinges on it, however: Giving it up would merely require that the summations in the definitions be replaced with integrals.

  3.

    This is not the place to rehearse the arguments for and against the material conditional as an adequate rendering of our intuitions about the meaning of the ‘if-then’ construction. The material analysis has its adherents in philosophy (Jackson 1979; Lewis 1986, among many others) and linguistics (see Abbott 2004, for recent arguments); but it is fair to say that, especially in the philosophical tradition, such proposals tend to be driven by frustration with technical obstacles (more on this in the next subsection), rather than pre-theoretical judgments. Empirically, the probabilistic interpretation of (RT) has strong and growing support (Evans and Over 2004; Oaksford and Chater 1994, 2003, 2007).

  4.

    Lewis argued for the plausibility of (6) by invoking the Import-Export Principle, which in its probabilistic version requires that \(P(\psi \rightarrow \varphi | \chi )\) be equivalent to \(P(\varphi |\psi \chi )\). I avoid this move here because this principle is not universally accepted (see, for instance, Adams 1975; Kaufmann 2009).

  5.

    See Lewis (1973), Stalnaker (1981) for some relevant arguments.

  6.

    In van Fraassen’s original version, a conditional is true, rather than undefined, at a sequence not containing any tails at which the antecedent is true. The difference is of no consequence for the cases I discuss here. In general, I find the undefinedness of the conditional probability in such cases intuitively plausible and preferable, as it squares well with the widely shared intuition (in the linguistic literature, at least) that indicative conditionals with impossible antecedents give rise to presupposition failure. Moreover, it follows from the results below that the (un)definedness is fairly well-behaved, in the sense that the value of the conditional is defined with probability zero or one, according as the probability of the antecedent is zero or non-zero.

  7.

    In probability theory, an event happens “almost surely” if its probability is 1. This notion should not be confused with logical necessity.

  8.

    As a historical side note, it is worth pointing out that some of the functionality delivered here by the Stalnaker Bernoulli model can also be achieved in a simpler model. This was shown by Jeffrey (1991), who developed the random-variable approach with intermediate truth values without relying on van Fraassen’s construction. But that approach has its limits, for instance when it comes to conditionals with conditional antecedents, and can be seen as superseded by the Stalnaker Bernoulli approach.

  9.

    In a related sense, one may also think of a given set of sequences as representing all paths following an introspective (i.e., transitive and euclidean) doxastic accessibility relation. I leave the further exploration of this connection for future work.

  10.

    An alternative way of achieving the same result would be to model belief update in terms of “truncation” of world sequences along the lines of the interpretation of conditionals, chopping off initial sub-sequences until the remaining tail verifies \(\varphi \). I will not go into the details of this operation here; it corresponds to the operation of shallow conditioning discussed in this subsection, for the same reason that the probabilities of conditionals equal the corresponding conditional probabilities.

  11.

    As a simple example, consider the task of choosing a point \((x,y)\) at random from a plane. Fix some point \((x^*,y^*)\) and consider the conditional probability that \(y>y^*\), given \(x=x^*\) (intuitively, the conditional probability that the randomly chosen point will lie above \((x^*,y^*)\), given that it lies on the vertical line through \((x^*,y^*)\)). We have clear intuitions as to what this conditional probability is and how it depends on the location of the cutoff point \(x^*\); but the probability that the randomly chosen point lies on the line is 0.

  12.

    Notice, incidentally, that the view on conditional probability just endorsed is not at odds with the remarks on the undefinedness of the values of conditionals at world sequences throughout which the antecedent is false (see footnote 6). For one thing, technically the undefinedness discussed there does not enter the picture because some conditional probability is undefined. But that aside, I emphasize that I do not mean to claim that conditional probabilities given zero-probability events are always defined, only that they can be.

  13.

    However, \({\mathrm {Pr}}^*(\mathbf {X} |\mathbf {Y} )\) is itself not always defined: It is undefined if \(\mathrm {Pr} (Y_i) = 0\) for any i, or if the function does not converge as n approaches infinity.

References

  • Abbott, B. (2004). Some remarks on indicative conditionals. In R. B. Young (Ed.), Proceedings of SALT (Vol. 14, pp. 1–19). Ithaca: Cornell University.

  • Adams, E. (1965). The logic of conditionals. Inquiry, 8, 166–197.

  • Adams, E. (1975). The logic of conditionals. Dordrecht: D. Reidel.

  • Bennett, J. (2003). A philosophical guide to conditionals. Oxford: Oxford University Press.

  • Edgington, D. (1995). On conditionals. Mind, 104(414), 235–329.

  • Eells, E., & Skyrms, B. (Eds.). (1994). Probabilities and conditionals: Belief revision and rational decision. Cambridge: Cambridge University Press.

  • Evans, J. S. B. T., & Over, D. E. (2004). If. Oxford: Oxford University Press.

  • Fetzer, J. H. (Ed.). (1988). Probability and causality. Studies in Epistemology, Logic, Methodology, and Philosophy of Science (Vol. 192). Dordrecht: D. Reidel.

  • van Fraassen, B. C. (1976). Probabilities of conditionals. In W. L. Harper, R. Stalnaker, & G. Pearce (Eds.), Foundations of probability theory, statistical inference, and statistical theories of science. The University of Western Ontario Series in Philosophy of Science (Vol. 1, pp. 261–308). Dordrecht: D. Reidel.

  • van Fraassen, B. C. (1980). Review of Brian Ellis, rational belief systems. Canadian Journal of Philosophy, 10, 457–511.

  • Gibbard, A. (1981). Two recent theories of conditionals. In W. L. Harper, R. Stalnaker, & G. Pearce (Eds.), (pp. 211–247).

  • Hájek, A. (1994). Triviality on the cheap? In E. Eells & B. Skyrms (Eds.), (pp. 113–140).

  • Hájek, A. (2003). What conditional probability could not be. Synthese, 137, 273–323.

  • Hájek, A. (2011). Conditional probability. In P. S. Bandyopadhyay & M. R. Forster (Eds.), Philosophy of statistics. Handbook of the philosophy of science (Vol. 7). Elsevier B.V. (Series editors: D. M. Gabbay, P. Thagard & J. Woods).

  • Hájek, A. (2012). The fall of Adams’ thesis? Journal of Logic, Language and Information, 21(2), 145–161.

  • Hájek, A., & Hall, N. (1994). The hypothesis of the conditional construal of conditional probability. In E. Eells & B. Skyrms (Eds.), (pp. 75–110).

  • Harper, W. L., Stalnaker, R., & Pearce, G. (Eds.). (1981). Ifs: Conditionals, belief, decision, chance, and time. Dordrecht: D. Reidel.

  • Jackson, F. (1979). On assertion and indicative conditionals. Philosophical Review, 88, 565–589.

  • Jeffrey, R. C. (1964). If. Journal of Philosophy, 61, 702–703.

  • Jeffrey, R. C. (1991). Matter-of-fact conditionals. The Symposia Read at the Joint Session of the Aristotelian Society and the Mind Association at the University of Durham (pp. 161–183). The Aristotelian Society, July 1991. Supplementary Volume 65.

  • Jeffrey, R. C. (2004). Subjective probability: The real thing. Cambridge: Cambridge University Press.

  • Kaufmann, S. (2005). Conditional predictions: A probabilistic account. Linguistics and Philosophy, 28(2), 181–231.

  • Kaufmann, S. (2009). Conditionals right and left: Probabilities for the whole family. Journal of Philosophical Logic, 38, 1–53.

  • Lewis, D. (1973). Counterfactuals. Cambridge: Harvard University Press.

  • Lewis, D. (1976). Probabilities of conditionals and conditional probabilities. Philosophical Review, 85, 297–315.

  • Lewis, D. (1986). Postscript to Probabilities of conditionals and conditional probabilities. Philosophical papers (Vol. 2, pp. 152–156). Oxford: Oxford University Press.

  • Mellor, D. H. (Ed.). (1990). Frank Ramsey: Philosophical papers. Cambridge: Cambridge University Press.

  • Oaksford, M., & Chater, N. (1994). A rational analysis of the selection task as optimal data selection. Psychological Review, 101, 608–631.

  • Oaksford, M., & Chater, N. (2003). Conditional probability and the cognitive science of conditional reasoning. Mind and Language, 18(4), 359–379.

  • Oaksford, M., & Chater, N. (2007). Bayesian rationality: The probabilistic approach to human reasoning. Oxford: Oxford University Press.

  • Ramsey, F. P. (1929). General propositions and causality. Reprinted in Mellor (1990), (pp. 145–163).

  • Stalnaker, R. (1968). A theory of conditionals. Studies in logical theory. American Philosophical Quarterly, Monograph (Vol. 2, pp. 98–112). Blackwell.

  • Stalnaker, R. (1970). Probability and conditionals. Philosophy of Science, 37, 64–80.

  • Stalnaker, R. (1981). A defense of conditional excluded middle. In W. Harper, et al. (Eds.), (pp. 87–104).

  • Stalnaker, R., & Jeffrey, R. (1994). Conditionals as random variables. In E. Eells & B. Skyrms (Eds.), (pp. 31–46).

Author information

Correspondence to Stefan Kaufmann.

Appendix: Proofs

Proposition 1

For \(X\in \mathcal {F} \), if \(\mathrm {Pr} (X)>0\), then \({\mathrm {Pr}}^*\left( \bigcup _{n\in \mathbb {N}} \left( \overline{X} ^n \times X \times \varOmega ^* \right) \right) = 1\).

Proof

Notice that \(\bigcup _{n\in \mathbb {N}} \left( \overline{X} ^n \times X \times \varOmega ^* \right) \) is the set of all sequences containing at least one X-world, thus its complement is \(\overline{X} ^*\). Now

\({\mathrm {Pr}}^*\left( \bigcup _{n\in \mathbb {N}} \left( \overline{X} ^n \times X \times \varOmega ^* \right) \right) = 1 - {\mathrm {Pr}}^*\left( \overline{X} ^*\right) \)

\(= 1 - \lim _{n\rightarrow \infty }{\mathrm {Pr}}^*\left( \overline{X} ^n \times \varOmega ^* \right) = 1 - \lim _{n\rightarrow \infty }\mathrm {Pr} \left( \overline{X} \right) ^n = 1 \text { since } \mathrm {Pr} (\overline{X}) < 1\).\(\square \)

Lemma 1

If \(\mathrm {Pr} (X)>0\), then \(\sum _{n\in \mathbb {N}} \mathrm {Pr} \left( \overline{X} \right) ^n = 1/\mathrm {Pr} (X)\).

Proof

\(\sum _{n\in \mathbb {N}} \mathrm {Pr} \left( \overline{X} \right) ^n \times \mathrm {Pr} \left( X\right) = \sum _{n\in \mathbb {N}} \left( \mathrm {Pr} \left( \overline{X} \right) ^n \times \mathrm {Pr} \left( X\right) \right) \)

\(= \sum _{n\in \mathbb {N}} {\mathrm {Pr}}^*\left( \overline{X} ^n \times X \times \varOmega ^* \right) = {\mathrm {Pr}}^*\left( \bigcup _{n\in \mathbb {N}} \left( \overline{X} ^n \times X \times \varOmega ^* \right) \right) = 1\) by Proposition 1.\(\square \)
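Proposition 1 and Lemma 1 can be checked numerically on finite prefixes. The following Python sketch uses exact fractions; the function names and the sample value \(\mathrm {Pr} (X) = 1/4\) are illustrative assumptions, not part of the chapter. It relies only on the product-measure fact used in the proofs above, namely that a block of n \(\overline{X}\)-worlds has probability \(\mathrm {Pr} (\overline{X})^n\).

```python
from fractions import Fraction


def prob_x_within(p_x, n):
    """Probability that at least one of the first n worlds is an X-world.

    This is 1 - Pr(not-X)^n, the finite-n version of Proposition 1.
    """
    return 1 - (1 - p_x) ** n


def geometric_partial_sum(p_x, n):
    """Sum of Pr(not-X)^k for k = 0, ..., n-1 (finite-n version of Lemma 1)."""
    return sum((1 - p_x) ** k for k in range(n))


if __name__ == "__main__":
    p = Fraction(1, 4)  # illustrative value for Pr(X)
    for n in (1, 5, 20, 80):
        # The gap to the limit 1/Pr(X) shrinks geometrically with n.
        gap = 1 / p - geometric_partial_sum(p, n)
        print(n, float(prob_x_within(p, n)), float(gap))
```

For every n, the partial sum multiplied by \(\mathrm {Pr} (X)\) equals the probability of meeting an X-world within the first n positions, so both quantities reach their limits together: the first tends to 1 and the partial sum tends to \(1/\mathrm {Pr} (X)\).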

Theorem 1

For \(A,C \in \mathcal {L}_{A}^{0} \), if \(\mathrm {P} (A)>0\), then \(\mathrm {P} ^* \left( A \rightarrow C \right) = \mathrm {P} (C|A)\).

Proof

By Definition 6, the set of sequences \(\omega ^*\) such that \(V ^* (A\rightarrow C)(\omega ^*)=1\) is the union, over all \(n\in \mathbb {N}\), of the sets of sequences that begin with n \(\overline{A}\)-worlds followed by an AC-world. Since the sets for different values of n are mutually disjoint, the probability of the union is the sum of the probabilities for all n. Now, for all \(X,Y\in \mathcal {F} \),

\({\mathrm {Pr}}^*\left( \bigcup _{n\in \mathbb {N}} \left( \overline{Y} ^n \times (X \cap Y) \times \varOmega ^* \right) \right) = \sum _{n\in \mathbb {N}} {\mathrm {Pr}}^*\left( \overline{Y} ^n \times (X \cap Y) \times \varOmega ^* \right) \)

\(= \sum _{n\in \mathbb {N}} \left( \mathrm {Pr} \left( \overline{Y} \right) ^n \times \mathrm {Pr} (X \cap Y) \right) = \sum _{n\in \mathbb {N}} \mathrm {Pr} \left( \overline{Y} \right) ^n \times \mathrm {Pr} (X \cap Y)\)

\(= \mathrm {Pr} (X \cap Y) / \mathrm {Pr} (Y)\) by Lemma 1.

In particular, let X and Y be the sets of worlds in \(\varOmega \) at which C and A are true, respectively, so that \(\mathrm {Pr} (X \cap Y) / \mathrm {Pr} (Y) = \mathrm {P} (C|A)\).\(\square \)
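Theorem 1 can be illustrated on a small finite model. The sketch below is a minimal Python example under stated assumptions: the four-world distribution, the function names, and the sampling procedure are hypothetical choices made for illustration. It computes \(\mathrm {P} (C|A)\) directly, compares it with the truncated sum from the proof, and estimates \(\mathrm {P} ^*(A\rightarrow C)\) by sampling world sequences up to their first A-world.

```python
from fractions import Fraction
import random

# Hypothetical four-world model: keys are (A true?, C true?).
WORLDS = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}


def conditional_probability(worlds):
    """P(C|A), computed directly in the underlying model."""
    p_a = sum(p for (a, _), p in worlds.items() if a)
    p_ac = sum(p for (a, c), p in worlds.items() if a and c)
    return p_ac / p_a


def truncated_conditional(worlds, n_terms):
    """Sum over n < n_terms of Pr(not-A)^n * Pr(A and C), as in the proof."""
    p_not_a = sum(p for (a, _), p in worlds.items() if not a)
    p_ac = sum(p for (a, c), p in worlds.items() if a and c)
    return sum(p_not_a ** n * p_ac for n in range(n_terms))


def simulate(worlds, n_samples, seed=0):
    """Estimate P*(A -> C) by drawing worlds until the first A-world."""
    rng = random.Random(seed)
    outcomes = list(worlds)
    weights = [float(worlds[w]) for w in outcomes]
    hits = 0
    for _ in range(n_samples):
        while True:
            a, c = rng.choices(outcomes, weights=weights)[0]
            if a:  # the first A-world decides the value of the conditional
                hits += c
                break
    return hits / n_samples


if __name__ == "__main__":
    print("P(C|A)       =", conditional_probability(WORLDS))
    print("truncated 30 =", float(truncated_conditional(WORLDS, 30)))
    print("simulated    =", simulate(WORLDS, 20000))
```

In this model the truncated sum with N terms equals \(\mathrm {P} (C|A)\times (1 - \mathrm {Pr} (\overline{A})^N)\), so it approaches \(\mathrm {P} (C|A)\) from below at a geometric rate.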

Proposition 2

If \(\mathrm {Pr} (Z) > 0\), then

Proof

\(\square \)

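Proposition 3

If \({\mathrm {Pr}}^*(\mathbf {X} |\mathbf {Y} )\) is defined, then \({\mathrm {Pr}}^*(\mathbf {X} |\mathbf {Y} ) \times {\mathrm {Pr}}^*(\mathbf {Y} ) = {\mathrm {Pr}}^*(\mathbf {X} \cap \mathbf {Y} )\).

Proof

Since \({\mathrm {Pr}}^*(\mathbf {X} |\mathbf {Y} )\) is defined, \(\mathrm {Pr} (Y_n) > 0\) for all n. Thus

\({\mathrm {Pr}}^*(\mathbf {X} |\mathbf {Y} ) \times {\mathrm {Pr}}^*(\mathbf {Y} )\)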

\(= \lim _{n\rightarrow \infty } \dfrac{{\mathrm {Pr}}^*\left( X_1 \cap Y_1 \times \cdots \times X_n \cap Y_n \times \varOmega ^* \right) }{{\mathrm {Pr}}^*\left( Y_1 \times \cdots \times Y_n \times \varOmega ^* \right) } \times \lim _{n\rightarrow \infty } {\mathrm {Pr}}^*\left( Y_1 \times \cdots \times Y_n \times \varOmega ^* \right) \)

\(= \lim _{n\rightarrow \infty } \left( \dfrac{{\mathrm {Pr}}^*\left( X_1 \cap Y_1 \times \cdots \times X_n \cap Y_n \times \varOmega ^* \right) }{{\mathrm {Pr}}^*\left( Y_1 \times \cdots \times Y_n \times \varOmega ^* \right) } \times {\mathrm {Pr}}^*\left( Y_1 \times \cdots \times Y_n \times \varOmega ^* \right) \right) \)

\(= \lim _{n\rightarrow \infty } {\mathrm {Pr}}^*\left( X_1 \cap Y_1 \times \cdots \times X_n \cap Y_n \times \varOmega ^* \right) = {\mathrm {Pr}}^*\left( \mathbf {X} \cap \mathbf {Y} \right) \) \(\square \)

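Proposition 4

If \({\mathrm {Pr}}^*(\mathbf {X} |\mathbf {Y} )\) is defined and \(\mathbf {Y} \subseteq \mathbf {X} \), then \({\mathrm {Pr}}^*(\mathbf {X} |\mathbf {Y} ) = 1\).

Proof

For all \(i\ge 1\), \(X_i \cap Y_i = Y_i\) since \(\mathbf {Y} \subseteq \mathbf {X} \), and \(\mathrm {Pr} (Y_i) > 0\) since \({\mathrm {Pr}}^*(\mathbf {X} |\mathbf {Y} )\) is defined. Thus \({\mathrm {Pr}}^*(\mathbf {X} |\mathbf {Y} ) = \lim _{n\rightarrow \infty } \dfrac{{\mathrm {Pr}}^*\left( Y_1 \times \cdots \times Y_n \times \varOmega ^* \right) }{{\mathrm {Pr}}^*\left( Y_1 \times \cdots \times Y_n \times \varOmega ^* \right) } = 1\).\(\square \)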
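The finite-prefix ratio that the two preceding proofs take to the limit can be tabulated directly. The following Python sketch is illustrative only: the function name and the input lists are assumptions, and it presupposes that the probability of a finite block of sets followed by \(\varOmega ^*\) is the product of the individual probabilities, as in the proofs of Proposition 1 and Theorem 1.

```python
from fractions import Fraction


def prefix_ratio(p_xy, p_y):
    """Ratio of the prefix probabilities for X∩Y and for Y.

    p_xy[i] stands for Pr(X_i ∩ Y_i) and p_y[i] for Pr(Y_i).
    Returns None when some Pr(Y_i) is zero, where the ratio is undefined.
    """
    numerator, denominator = Fraction(1), Fraction(1)
    for pxy, py in zip(p_xy, p_y):
        if py == 0:
            return None
        numerator *= pxy
        denominator *= py
    return numerator / denominator


if __name__ == "__main__":
    half, quarter = Fraction(1, 2), Fraction(1, 4)
    print(prefix_ratio([quarter, half], [half, half]))  # a proper fraction
    print(prefix_ratio([half, half], [half, half]))     # Y contained in X
    print(prefix_ratio([half], [Fraction(0)]))          # undefined case
```

When each \(X_i \cap Y_i\) coincides with \(Y_i\), numerator and denominator agree at every prefix length, so the ratio is constantly 1.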

The following auxiliary result will be useful in the subsequent proofs.

Proposition 9

If \(\mathrm {Pr} (Z) > 0\), then for \(X_i \in \mathcal {F}\) and \(n \in \mathbb {N}\),

Proof

Immediate because \(Z \subseteq \varOmega \).\(\square \)

The significance of Proposition 9 derives from the fact that the sets of sequences at which a given sentence in \(\mathcal {L}_{A}^{}\) is true can be constructed (using set operations under which \(\mathcal {F} ^*\) is closed) out of sequence sets ending in \(\varOmega ^* \). This is obvious for the non-conditional sentences in \(\mathcal {L}_{A}^{0}\). For conditionals \(\varphi \rightarrow \psi \) the relevant set is the union of sets of sequences consisting of n \(\overline{\varphi }\)-worlds followed by a \(\varphi \psi \)-world. For each n, the corresponding set ends in \(\varOmega ^* \) and therefore can be conditioned upon \(Z^*\) as shown in Proposition 9. Since these sets for different numbers n are mutually disjoint, the probability of their union is just the sum of their individual probabilities.

Proposition 5

If \(\mathrm {Pr} (X \cap Z) > 0\), then

Proof

Since \(\mathrm {Pr} (X \cap Z)>0\), \(\mathrm {Pr} (\overline{X} \cap Z) < \mathrm {Pr} (Z)\). Thus

\(\square \)

Lemma 2

If \(\mathrm {Pr} (X \cap Z) > 0\), then \(\sum _{n\in \mathbb {N}} \mathrm {Pr} \left( \overline{X} |Z\right) ^n = 1 / \mathrm {Pr} (X|Z)\).

Proof

Since \(\mathrm {Pr} (X \cap Z) = \mathrm {Pr} (Z) \times \mathrm {Pr} (X|Z)\), both \(\mathrm {Pr} (Z) > 0\) and \(\mathrm {Pr} (X|Z) > 0\).

\(\sum _{n\in \mathbb {N}} \mathrm {Pr} \left( \overline{X} |Z\right) ^n \times \mathrm {Pr} \left( X|Z\right) = \sum _{n\in \mathbb {N}} \left( \mathrm {Pr} \left( \overline{X} |Z\right) ^n \times \mathrm {Pr} \left( X|Z\right) \right) \)

\(= \sum _{n\in \mathbb {N}} \dfrac{\mathrm {Pr} \left( \overline{X} \cap Z\right) ^n \times \mathrm {Pr} \left( X \cap Z\right) }{\mathrm {Pr} \left( Z\right) ^{n+1}} = \sum _{n\in \mathbb {N}} \dfrac{{\mathrm {Pr}}^*\left( (\overline{X} \cap Z)^n \times (X \cap Z) \times \varOmega ^* \right) }{{\mathrm {Pr}}^*\left( Z^{n+1} \times \varOmega ^* \right) }\)

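\(= \sum _{n\in \mathbb {N}} {\mathrm {Pr}}^*\left( \overline{X} ^n \times X \times \varOmega ^* | Z^* \right) \) by Proposition 9

\(= {\mathrm {Pr}}^*\left( \bigcup _{n\in \mathbb {N}} \left( \overline{X} ^n \times X \times \varOmega ^* \right) | Z^* \right) = 1\) by Proposition 5.\(\square \)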

Theorem 2

If \(\mathrm {P} \left( BC\right) > 0\), then \(\mathrm {P} ^* \left( C \rightarrow D|B^* \right) = \mathrm {P} \left( D|BC \right) \).

Proof

\(\mathrm {P} ^* \left( C \rightarrow D|B^* \right) = \lim _{n\rightarrow \infty } \dfrac{\mathrm {P} ^* (((C\rightarrow D) \wedge B)^n \times \varOmega ^*)}{\mathrm {P} ^* \left( B^n \times \varOmega ^* \right) }\)

\(= \lim _{n\rightarrow \infty }\dfrac{\sum _{i=0}^{n-1}\mathrm {P} (\overline{C} B)^i \times \mathrm {P} (CDB) \times \mathrm {P} (B)^{n-i-1}}{\mathrm {P} (B)^n}\)

\(= \lim _{n\rightarrow \infty }\sum _{i=0}^{n-1}\mathrm {P} (\overline{C} |B)^i \times \mathrm {P} (CD|B)\)

\(= \mathrm {P} (CD|B) / \mathrm {P} (C|B)\) by Lemma 2

\(= \mathrm {P} (D|BC)\) \(\square \)
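The limit in this proof can be traced with exact arithmetic. The Python sketch below is an illustration under assumptions: the eight-world distribution and all names are hypothetical. The numerator counts sequences whose first n worlds are all B-worlds and in which the first C-world among them is a D-world, so each term consists of i \(\overline{C} B\)-worlds, one CDB-world, and \(n-i-1\) further B-worlds.

```python
from fractions import Fraction

# Hypothetical model over three atoms; keys are (B, C, D) truth values.
WORLDS = {
    (True, True, True): Fraction(2, 16),
    (True, True, False): Fraction(1, 16),
    (True, False, True): Fraction(3, 16),
    (True, False, False): Fraction(2, 16),
    (False, True, True): Fraction(1, 16),
    (False, True, False): Fraction(3, 16),
    (False, False, True): Fraction(2, 16),
    (False, False, False): Fraction(2, 16),
}


def prob(pred):
    """Probability of the set of worlds satisfying pred(b, c, d)."""
    return sum(p for w, p in WORLDS.items() if pred(*w))


P_B = prob(lambda b, c, d: b)
P_B_NOTC = prob(lambda b, c, d: b and not c)
P_B_C = prob(lambda b, c, d: b and c)
P_B_C_D = prob(lambda b, c, d: b and c and d)


def ratio(n):
    """Finite-n ratio, written with unconditional probabilities."""
    numerator = sum(
        P_B_NOTC ** i * P_B_C_D * P_B ** (n - i - 1) for i in range(n)
    )
    return numerator / P_B ** n


def ratio_conditional(n):
    """The same quantity, written with probabilities conditional on B."""
    return sum((P_B_NOTC / P_B) ** i * (P_B_C_D / P_B) for i in range(n))


TARGET = P_B_C_D / P_B_C  # P(D|BC)

if __name__ == "__main__":
    for n in (1, 2, 5, 20, 60):
        print(n, float(ratio(n)), float(TARGET))
```

In this model the difference between \(\mathrm {P} (D|BC)\) and the finite-n ratio is \(\mathrm {P} (D|BC) \times \mathrm {P} (\overline{C} |B)^n\), which vanishes as n grows.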

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Kaufmann, S. (2015). Conditionals, Conditional Probabilities, and Conditionalization. In: Zeevat, H., Schmitz, HC. (eds) Bayesian Natural Language Semantics and Pragmatics. Language, Cognition, and Mind, vol 2. Springer, Cham. https://doi.org/10.1007/978-3-319-17064-0_4
