Abstract
Philosophers investigating the interpretation and use of conditional sentences have long been intrigued by the intuitive correspondence between the probability of a conditional ‘if A, then C’ and the conditional probability of C, given A. Attempts to account for this intuition within a general probabilistic theory of belief, meaning and use have been plagued by a danger of trivialization, which has proven to be remarkably recalcitrant and absorbed much of the creative effort in the area. But there is a strategy for avoiding triviality that has been known for almost as long as the triviality results themselves. What is lacking is a straightforward integration of this approach in a larger framework of belief representation and dynamics. This paper discusses some of the issues involved and proposes an account of belief update by conditionalization.
Thanks to Hans-Christian Schmitz and Henk Zeevat for organizing the ESSLLI 2014 workshop on Bayesian Natural-Language Semantics and Pragmatics, where I presented an early version of this paper. Other venues at which I presented related work include the research group “What if” at the University of Konstanz and the Logic Group at the University of Connecticut. I am grateful to the audiences at all these events for stimulating discussion and feedback. Thanks also to Hans-Christian Schmitz and Henk Zeevat for their patience during the preparation of this manuscript. All errors and misrepresentations are my own.
Notes
- 1.
A \(\sigma \)-algebra on \(\varOmega \) is a non-empty set of subsets of \(\varOmega \) that is closed under complements and countable unions. A probability measure on \(\mathcal {F}\) is a countably additive function from \(\mathcal {F}\) to the real interval [0, 1] such that \(\mathrm {Pr} (\varOmega ) = 1\).
- 2.
I write ‘\(\mathrm {Pr} (\theta =x)\)’ to refer to the probability of the event that \(\theta \) has value x. This is an abbreviation of the more cumbersome ‘\(\mathrm {Pr} (\{\omega \in \varOmega : \theta (\omega ) = x\})\)’. I also assume, here and throughout this paper, that the range of the random variable is finite. This is guaranteed for \(\mathcal {L}_{A}^{0}\) under \(V\) in Definition 2, but becomes a non-trivial restriction in general. Nothing hinges on it, however: Giving it up would merely require that the summations in the definitions be replaced with integrals.
- 3.
This is not the place to rehearse the arguments for and against the material conditional as an adequate rendering of our intuitions about the meaning of the ‘if-then’ construction. The material analysis has its adherents in philosophy (Jackson 1979; Lewis 1986, among many others) and linguistics (see Abbott 2004, for recent arguments); but it is fair to say that, especially in the philosophical tradition, such proposals tend to be driven by frustration with technical obstacles (more on this in the next subsection), rather than pre-theoretical judgments. Empirically, the probabilistic interpretation of (RT) has strong and growing support (Evans and Over 2004; Oaksford and Chater 1994, 2003, 2007).
- 4.
Lewis argued for the plausibility of (6) by invoking the Import-Export Principle, which in its probabilistic version requires that \(P(\psi \rightarrow \varphi | \chi )\) be equivalent to \(P(\varphi |\psi \chi )\). I avoid this move here because this principle is not universally accepted (see, for instance, Adams 1975; Kaufmann 2009).
- 5.
- 6.
In van Fraassen’s original version, a conditional is true, rather than undefined, at a sequence not containing any tails at which the antecedent is true. The difference is of no consequence for the cases I discuss here. In general, I find the undefinedness of the conditional probability in such cases intuitively plausible and preferable, as it squares well with the widely shared intuition (in the linguistic literature, at least) that indicative conditionals with impossible antecedents give rise to presupposition failure. Moreover, it follows from the results below that the (un)definedness is fairly well-behaved, in the sense that the value of the conditional is defined with probability zero or one, according as the probability of the antecedent is zero or non-zero.
- 7.
In probability theory, an event happens “almost surely” if its probability is 1. This notion should not be confused with logical necessity.
- 8.
As a historical side note, it is worth pointing out that some of the functionality delivered here by the Stalnaker Bernoulli model can also be achieved in a simpler model. This was shown by Jeffrey (1991), who developed the random-variable approach with intermediate truth values without relying on van Fraassen’s construction. But that approach has its limits, for instance when it comes to conditionals with conditional antecedents, and can be seen as superseded by the Stalnaker Bernoulli approach.
- 9.
In a related sense, one may also think of a given set of sequences as representing all paths following an introspective (i.e., transitive and euclidean) doxastic accessibility relation. I leave the further exploration of this connection for future work.
- 10.
An alternative way of achieving the same result would be to model belief update in terms of “truncation” of world sequences along the lines of the interpretation of conditionals, chopping off initial sub-sequences until the remaining tail verifies \(\varphi \). I will not go into the details of this operation here; it corresponds to the operation of shallow conditioning discussed in this subsection, for the same reason that the probabilities of conditionals equal the corresponding conditional probabilities.
- 11.
As a simple example, consider the task of choosing a point (x, y) at random from a plane. Fix some point \((x^*,y^*)\) and consider the conditional probability that \(y>y^*\), given \(x=x^*\) (intuitively, the conditional probability that the randomly chosen point will lie above \((x^*,y^*)\), given that it lies on the vertical line through \((x^*,y^*)\)). We have clear intuitions as to what this conditional probability is and how it depends on the location of the cutoff point \(x^*\); but the probability that the randomly chosen point lies on the line is 0.
- 12.
Notice, incidentally, that the view on conditional probability just endorsed is not at odds with the remarks on the undefinedness of the values of conditionals at world sequences throughout which the antecedent is false (see footnote 6). For one thing, technically the undefinedness discussed there does not enter the picture because some conditional probability is undefined. But that aside, I emphasize that I do not mean to claim that conditional probabilities given zero-probability events are always defined, only that they can be.
- 13.
However, \({\mathrm {Pr}}^*\left( \mathbf {X} \mid \mathbf {Y} \right) \) is itself not always defined: It is undefined if either \(\mathrm {Pr} (Y_i) = 0\) for some i, or if the function does not converge as n approaches infinity.
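The geometric intuition in footnote 11 can be checked numerically. The following sketch (with a hypothetical uniform distribution on the unit square) approximates the conditional probability by conditioning on a thin vertical slab around \(x^*\); the slab’s probability shrinks toward 0, yet the conditional probability stabilizes at the intuitive value \(1-y^*\).

```python
import random

random.seed(0)

def cond_prob_above(x_star, y_star, eps, n=200_000):
    """Estimate Pr(y > y* | x in [x*-eps, x*+eps]) for a uniform point on [0,1]^2.

    The conditioning event has probability ~2*eps, which vanishes as eps -> 0,
    yet the conditional probability converges to the intuitive value 1 - y*.
    """
    hits = above = 0
    for _ in range(n):
        x, y = random.random(), random.random()
        if abs(x - x_star) <= eps:
            hits += 1
            above += y > y_star
    return above / hits if hits else float("nan")

# Shrinking the slab leaves the estimate near 1 - y* = 0.75,
# even though the probability of the conditioning event itself goes to 0.
for eps in (0.1, 0.01):
    print(round(cond_prob_above(0.5, 0.25, eps), 2))
```

This is only an illustration of the limit construal of conditioning on a measure-zero event, not part of the formal apparatus of the paper.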
References
Abbott, B. (2004). Some remarks on indicative conditionals. In R. B. Young (Ed.), Proceedings of SALT (Vol. 14, pp. 1–19). Ithaca: Cornell University.
Adams, E. (1965). The logic of conditionals. Inquiry, 8, 166–197.
Adams, E. (1975). The logic of conditionals. Dordrecht: D. Reidel.
Bennett, J. (2003). A philosophical guide to conditionals. Oxford: Oxford University Press.
Edgington, D. (1995). On conditionals. Mind, 104(414), 235–329.
Eells, E., & Skyrms, B. (Eds.). (1994). Probabilities and conditionals: Belief revision and rational decision. Cambridge: Cambridge University Press.
Evans, J. S. B. T., & Over, D. E. (2004). If. Oxford: Oxford University Press.
Fetzer, J. H. (Ed.). (1988). Probability and causality (Studies in Epistemology, Logic, Methodology, and Philosophy of Science, Vol. 192). Dordrecht: D. Reidel.
van Fraassen, B. C. (1976). Probabilities of conditionals. In W. L. Harper, R. Stalnaker, & G. Pearce (Eds.), Foundations of probability theory, statistical inference, and statistical theories of science (The University of Western Ontario Series in Philosophy of Science, Vol. 1, pp. 261–308). Dordrecht: D. Reidel.
van Fraassen, B. C. (1980). Review of Brian Ellis, Rational belief systems. Canadian Journal of Philosophy, 10, 457–511.
Gibbard, A. (1981). Two recent theories of conditionals. In W. L. Harper, R. Stalnaker, & G. Pearce (Eds.), (pp. 211–247).
Hájek, A. (1994). Triviality on the cheap? In E. Eells & B. Skyrms (Eds.), (pp. 113–140).
Hájek, A. (2003). What conditional probability could not be. Synthese, 137, 273–323.
Hájek, A. (2011). Conditional probability. In P. S. Bandyopadhyay & M. R. Forster (Eds.), Philosophy of statistics (Handbook of the Philosophy of Science, Vol. 7). Amsterdam: Elsevier. (Series editors: D. M. Gabbay, P. Thagard & J. Woods.)
Hájek, A. (2012). The fall of Adams’ thesis? Journal of Logic, Language and Information, 21(2), 145–161.
Hájek, A., & Hall, N. (1994). The hypothesis of the conditional construal of conditional probability. In E. Eells & B. Skyrms (Eds.), (pp. 75–110).
Harper, W. L., Stalnaker, R., & Pearce, G. (Eds.). (1981). Ifs: Conditionals, belief, decision, chance, and time. Dordrecht: D. Reidel.
Jackson, F. (1979). On assertion and indicative conditionals. Philosophical Review, 88, 565–589.
Jeffrey, R. C. (1964). If. Journal of Philosophy, 61, 702–703.
Jeffrey, R. C. (1991). Matter-of-fact conditionals. Proceedings of the Aristotelian Society, Supplementary Volume 65, 161–183.
Jeffrey, R. C. (2004). Subjective probability: The real thing. Cambridge: Cambridge University Press.
Kaufmann, S. (2005). Conditional predictions: A probabilistic account. Linguistics and Philosophy, 28(2), 181–231.
Kaufmann, S. (2009). Conditionals right and left: Probabilities for the whole family. Journal of Philosophical Logic, 38, 1–53.
Lewis, D. (1973). Counterfactuals. Cambridge: Harvard University Press.
Lewis, D. (1976). Probabilities of conditionals and conditional probabilities. Philosophical Review, 85, 297–315.
Lewis, D. (1986). Postscript to Probabilities of conditionals and conditional probabilities. Philosophical papers (Vol. 2, pp. 152–156). Oxford: Oxford University Press.
Mellor, D. H. (Ed.). (1990). Frank Ramsey: Philosophical papers. Cambridge: Cambridge University Press.
Oaksford, M., & Chater, N. (1994). A rational analysis of the selection task as optimal data selection. Psychological Review, 101, 608–631.
Oaksford, M., & Chater, N. (2003). Conditional probability and the cognitive science of conditional reasoning. Mind and Language, 18(4), 359–379.
Oaksford, M., & Chater, N. (2007). Bayesian rationality: The probabilistic approach to human reasoning. Oxford: Oxford University Press.
Ramsey, F. P. (1929). General propositions and causality. Reprinted in Mellor (1990), (pp. 145–163).
Stalnaker, R. (1968). A theory of conditionals. In N. Rescher (Ed.), Studies in logical theory (American Philosophical Quarterly Monograph Series, Vol. 2, pp. 98–112). Oxford: Blackwell.
Stalnaker, R. (1970). Probability and conditionals. Philosophy of Science, 37, 64–80.
Stalnaker, R. (1981). A defense of conditional excluded middle. In W. Harper, et al. (Eds.), (pp. 87–104).
Stalnaker, R., & Jeffrey, R. (1994). Conditionals as random variables. In E. Eells & B. Skyrms (Eds.), (pp. 31–46).
Appendix: Proofs
Proposition
1 For \(X\in \mathcal {F} \), if \(\mathrm {Pr} (X)>0\), then \({\mathrm {Pr}}^*\left( \bigcup _{n\in \mathbb {N}} \left( \overline{X} ^n \times X \times \varOmega ^* \right) \right) = 1\).
Proof
Notice that \(\bigcup _{n\in \mathbb {N}} \left( \overline{X} ^n \times X \times \varOmega ^* \right) \) is the set of all sequences containing at least one X-world, thus its complement is \(\overline{X} ^*\). Now
\({\mathrm {Pr}}^*\left( \bigcup _{n\in \mathbb {N}} \left( \overline{X} ^n \times X \times \varOmega ^* \right) \right) = 1 - {\mathrm {Pr}}^*\left( \overline{X} ^*\right) \)
\(= 1 - \lim _{n\rightarrow \infty }{\mathrm {Pr}}^*\left( \overline{X} ^n \times \varOmega ^* \right) = 1 - \lim _{n\rightarrow \infty }\mathrm {Pr} \left( \overline{X} \right) ^n = 1 \text { since } \mathrm {Pr} (\overline{X}) < 1\).\(\square \)
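The almost-sure occurrence of an X-world asserted in Proposition 1 can be illustrated by simulation. The sketch below assumes a hypothetical value for \(\mathrm {Pr} (X)\); every sampled sequence reaches an X-world in finite time, and the mean waiting time matches the geometric expectation.

```python
import random

random.seed(0)

def waiting_time(p_x):
    """Sample worlds i.i.d. and return the index of the first X-world.

    The loop terminates with probability 1 whenever p_x > 0 (Proposition 1)."""
    n = 0
    while random.random() >= p_x:  # the current world is an X-bar world
        n += 1
    return n

# Every sampled sequence contains an X-world; the mean waiting time is
# close to the geometric expectation (1 - p) / p = 3 for p = 0.25.
times = [waiting_time(0.25) for _ in range(100_000)]
print(sum(times) / len(times))
```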
Lemma
1 If \(\mathrm {Pr} (X)>0\), then \(\sum _{n\in \mathbb {N}} \mathrm {Pr} \left( \overline{X} \right) ^n = 1/\mathrm {Pr} (X)\).
Proof
\(\sum _{n\in \mathbb {N}} \mathrm {Pr} \left( \overline{X} \right) ^n \times \mathrm {Pr} \left( X\right) = \sum _{n\in \mathbb {N}} \left( \mathrm {Pr} \left( \overline{X} \right) ^n \times \mathrm {Pr} \left( X\right) \right) \)
\(= \sum _{n\in \mathbb {N}} {\mathrm {Pr}}^*\left( \overline{X} ^n \times X \times \varOmega ^* \right) = {\mathrm {Pr}}^*\left( \bigcup _{n\in \mathbb {N}} \left( \overline{X} ^n \times X \times \varOmega ^* \right) \right) = 1\) by Proposition 1.\(\square \)
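Lemma 1 is a geometric series identity and admits a quick numerical spot-check; the value chosen for \(\mathrm {Pr} (X)\) below is arbitrary.

```python
# Hypothetical value for Pr(X); Lemma 1 predicts the series sums to 1/Pr(X).
p_x = 0.25

# Partial sums of Pr(X-bar)^n over n = 0, 1, 2, ... converge to 1 / p_x = 4.
partial = sum((1 - p_x) ** n for n in range(1000))
print(partial)
```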
Theorem
1 For \(A,C \in \mathcal {L}_{A}^{0} \), if \(\mathrm {P} (A)>0\), then \(\mathrm {P} ^* \left( A \rightarrow C \right) = \mathrm {P} (C|A)\).
Proof
By Definition 6, the set of sequences \(\omega ^*\) such that \(V ^* (A\rightarrow C)(\omega ^*)=1\) is the union, over all \(n\in \mathbb {N}\), of the sets of sequences consisting of n worlds at which A is false, followed by a world at which both A and C are true, followed by an arbitrary tail. Since the sets for different values of n are mutually disjoint, the probability of the union is the sum of the probabilities for all n. Now, for all \(X,Y\in \mathcal {F} \),
\({\mathrm {Pr}}^*\left( \bigcup _{n\in \mathbb {N}} \left( \overline{Y} ^n \times (X \cap Y) \times \varOmega ^* \right) \right) = \sum _{n\in \mathbb {N}} {\mathrm {Pr}}^*\left( \overline{Y} ^n \times (X \cap Y) \times \varOmega ^* \right) \)
\(= \sum _{n\in \mathbb {N}} \left( \mathrm {Pr} \left( \overline{Y} \right) ^n \times \mathrm {Pr} (X \cap Y) \right) = \sum _{n\in \mathbb {N}} \mathrm {Pr} \left( \overline{Y} \right) ^n \times \mathrm {Pr} (X \cap Y)\)
\(= \mathrm {Pr} (X \cap Y) / \mathrm {Pr} (Y)\) by Lemma 1.
In particular, let X and Y be the sets of worlds in \(\varOmega \) at which \(V (C)\) and \(V (A)\) are true, respectively. Then the above yields \(\mathrm {P} ^* \left( A \rightarrow C \right) = \mathrm {Pr} (X \cap Y) / \mathrm {Pr} (Y) = \mathrm {P} (C|A)\).\(\square \)
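Theorem 1 lends itself to a direct Monte Carlo sketch of the Stalnaker Bernoulli construction: sample world sequences coordinate-wise i.i.d., read off the value of \(A\rightarrow C\) at the first A-world, and compare the resulting frequency with the conditional probability. The worlds and weights below are hypothetical.

```python
import random

random.seed(0)

# Hypothetical worlds: each world settles A and C; the weights are arbitrary.
worlds = [((True, True), 0.2),    # A and C
          ((True, False), 0.2),   # A, not C
          ((False, True), 0.3),   # C, not A
          ((False, False), 0.3)]  # neither
vals, wts = zip(*worlds)

def eval_conditional():
    """Value of A -> C at a random world sequence: the value of C at the
    first A-world (defined with probability 1, since Pr(A) > 0)."""
    while True:
        a, c = random.choices(vals, weights=wts)[0]
        if a:
            return c

n = 200_000
freq = sum(eval_conditional() for _ in range(n)) / n
p_c_given_a = 0.2 / (0.2 + 0.2)  # Pr(C|A) computed from the weights above
print(freq, p_c_given_a)  # Theorem 1: the two agree (up to sampling error)
```

The simulation never represents an infinite sequence explicitly; it only generates as many coordinates as the evaluation of the conditional requires, which is finite with probability 1 by Proposition 1.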
Proposition
2 If \(\mathrm {Pr} (Z) > 0\), then
Proof
\(\square \)
Proposition
3 If \({\mathrm {Pr}}^*\left( \mathbf {X} \mid \mathbf {Y} \right) \) is defined, then \({\mathrm {Pr}}^*\left( \mathbf {X} \mid \mathbf {Y} \right) \times {\mathrm {Pr}}^*\left( \mathbf {Y} \right) = {\mathrm {Pr}}^*\left( \mathbf {X} \cap \mathbf {Y} \right) \).
Proof
Since \({\mathrm {Pr}}^*\left( \mathbf {X} \mid \mathbf {Y} \right) \) is defined, \(\mathrm {Pr} (Y_n) > 0\) for all n. Thus \({\mathrm {Pr}}^*\left( \mathbf {X} \mid \mathbf {Y} \right) \times {\mathrm {Pr}}^*\left( \mathbf {Y} \right) \)
\(= \lim _{n\rightarrow \infty } \dfrac{{\mathrm {Pr}}^*\left( X_1 \cap Y_1 \times \cdots \times X_n \cap Y_n \times \varOmega ^* \right) }{{\mathrm {Pr}}^*\left( Y_1 \times \cdots \times Y_n \times \varOmega ^* \right) } \times \lim _{n\rightarrow \infty } {\mathrm {Pr}}^*\left( Y_1 \times \cdots \times Y_n \times \varOmega ^* \right) \)
\(= \lim _{n\rightarrow \infty } \left( \dfrac{{\mathrm {Pr}}^*\left( X_1 \cap Y_1 \times \cdots \times X_n \cap Y_n \times \varOmega ^* \right) }{{\mathrm {Pr}}^*\left( Y_1 \times \cdots \times Y_n \times \varOmega ^* \right) } \times {\mathrm {Pr}}^*\left( Y_1 \times \cdots \times Y_n \times \varOmega ^* \right) \right) \)
\(= \lim _{n\rightarrow \infty } {\mathrm {Pr}}^*\left( X_1 \cap Y_1 \times \cdots \times X_n \cap Y_n \times \varOmega ^* \right) = {\mathrm {Pr}}^*\left( \mathbf {X} \cap \mathbf {Y} \right) \) \(\square \)
Proposition
4 If \({\mathrm {Pr}}^*\left( \mathbf {X} \mid \mathbf {Y} \right) \) is defined and \(\mathbf {Y} \subseteq \mathbf {X} \), then \({\mathrm {Pr}}^*\left( \mathbf {X} \mid \mathbf {Y} \right) = 1\).
Proof
For all \(i\ge 1\), \(X_i \cap Y_i = Y_i\) since \(\mathbf {Y} \subseteq \mathbf {X} \), and \(\mathrm {Pr} (Y_i) > 0\) since \({\mathrm {Pr}}^*\left( \mathbf {X} \mid \mathbf {Y} \right) \) is defined. Thus \({\mathrm {Pr}}^*\left( \mathbf {X} \mid \mathbf {Y} \right) = \lim _{n\rightarrow \infty } \dfrac{{\mathrm {Pr}}^*\left( Y_1 \times \cdots \times Y_n \times \varOmega ^* \right) }{{\mathrm {Pr}}^*\left( Y_1 \times \cdots \times Y_n \times \varOmega ^* \right) } = 1\).\(\square \)
The following auxiliary result will be useful in the subsequent proofs.
Proposition 9
If \(\mathrm {Pr} (Z) > 0\), then for \(X_i \in \mathcal {F}\) and \(n \in \mathbb {N}\), \({\mathrm {Pr}}^*\left( X_1 \times \cdots \times X_n \times \varOmega ^* \mid Z^* \right) = \dfrac{{\mathrm {Pr}}^*\left( (X_1 \cap Z) \times \cdots \times (X_n \cap Z) \times \varOmega ^* \right) }{\mathrm {Pr} (Z)^n}\).
Proof
Immediate because \(Z \subseteq \varOmega \).\(\square \)
The significance of Proposition 9 derives from the fact that the sets of sequences at which a given sentence in \(\mathcal {L}_{A}^{}\) is true can be constructed (using set operations under which \(\mathcal {F} ^*\) is closed) out of sequence sets ending in \(\varOmega ^* \). This is obvious for the non-conditional sentences in \(\mathcal {L}_{A}^{0}\). For conditionals \(\varphi \rightarrow \psi \) the relevant set is the union of sets of sequences consisting of n \(\overline{\varphi }\)-worlds followed by a \(\varphi \psi \)-world. For each n, the corresponding set ends in \(\varOmega ^* \) and therefore can be conditioned upon \(Z^*\) as shown in Proposition 9. Since these sets for different numbers n are mutually disjoint, the probability of their union is just the sum of their individual probabilities.
Proposition
5 If \(\mathrm {Pr} (X \cap Z) > 0\), then \({\mathrm {Pr}}^*\left( \bigcup _{n\in \mathbb {N}} \left( (\overline{X} \cap Z)^n \times (X \cap Z) \times \varOmega ^* \right) \mid Z^* \right) = 1\).
Proof
Since \(\mathrm {Pr} (X \cap Z)>0\), \(\mathrm {Pr} (\overline{X} \cap Z) < \mathrm {Pr} (Z)\), hence \(\mathrm {Pr} (\overline{X} |Z) < 1\). Thus
\({\mathrm {Pr}}^*\left( \bigcup _{n\in \mathbb {N}} \left( (\overline{X} \cap Z)^n \times (X \cap Z) \times \varOmega ^* \right) \mid Z^* \right) = 1 - \lim _{n\rightarrow \infty } \mathrm {Pr} \left( \overline{X} |Z\right) ^n = 1\).\(\square \)
Lemma
2 If \(\mathrm {Pr} (X \cap Z) > 0\), then \(\sum _{n\in \mathbb {N}} \mathrm {Pr} \left( \overline{X} |Z\right) ^n = 1 / \mathrm {Pr} (X|Z)\).
Proof
Since \(\mathrm {Pr} (X \cap Z) = \mathrm {Pr} (Z) \times \mathrm {Pr} (X|Z)\), both \(\mathrm {Pr} (Z) > 0\) and \(\mathrm {Pr} (X|Z) > 0\).
\(\sum _{n\in \mathbb {N}} \mathrm {Pr} \left( \overline{X} |Z\right) ^n \times \mathrm {Pr} \left( X|Z\right) = \sum _{n\in \mathbb {N}} \left( \mathrm {Pr} \left( \overline{X} |Z\right) ^n \times \mathrm {Pr} \left( X|Z\right) \right) \)
\(= \sum _{n\in \mathbb {N}} \dfrac{\mathrm {Pr} \left( \overline{X} \cap Z\right) ^n \times \mathrm {Pr} \left( X \cap Z\right) }{\mathrm {Pr} \left( Z\right) ^{n+1}} = \sum _{n\in \mathbb {N}} \dfrac{{\mathrm {Pr}}^*\left( (\overline{X} \cap Z)^n \times (X \cap Z) \times \varOmega ^* \right) }{{\mathrm {Pr}}^*\left( Z^{n+1} \times \varOmega ^* \right) }\)
\(= \sum _{n\in \mathbb {N}} {\mathrm {Pr}}^*\left( (\overline{X} \cap Z)^n \times (X \cap Z) \times \varOmega ^* \mid Z^* \right) \) by Proposition 9
\(= {\mathrm {Pr}}^*\left( \bigcup _{n\in \mathbb {N}} \left( (\overline{X} \cap Z)^n \times (X \cap Z) \times \varOmega ^* \right) \mid Z^* \right) = 1\) by Proposition 5.\(\square \)
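Lemma 2 admits the same kind of numerical spot-check as Lemma 1, with arbitrary hypothetical values for \(\mathrm {Pr} (Z)\) and \(\mathrm {Pr} (X \cap Z)\).

```python
# Hypothetical values; note Pr(X and Z) <= Pr(Z) must hold.
p_z = 0.5
p_xz = 0.2
p_x_given_z = p_xz / p_z  # = 0.4

# Partial sums of Pr(X-bar|Z)^n converge to 1 / Pr(X|Z) = 2.5 (Lemma 2).
partial = sum((1 - p_x_given_z) ** n for n in range(1000))
print(partial)
```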
Theorem
2 If \(\mathrm {P} \left( BC\right) > 0\), then \(\mathrm {P} ^* \left( C \rightarrow D|B^* \right) = \mathrm {P} \left( D|BC \right) \).
Proof
\(\mathrm {P} ^* \left( C \rightarrow D|B^* \right) = \lim _{n\rightarrow \infty } \dfrac{\mathrm {P} ^* (((C\rightarrow D) \wedge B)^n \times \varOmega ^*)}{\mathrm {P} ^* \left( B^n \times \varOmega ^* \right) }\)
\(= \lim _{n\rightarrow \infty }\dfrac{\sum _{i=0}^{n-1}\mathrm {P} (\overline{C} B)^i \times \mathrm {P} (CDB) \times \mathrm {P} (B)^{n-i-1}}{\mathrm {P} (B)^n}\)
\(= \lim _{n\rightarrow \infty }\sum _{i=0}^{n-1}\mathrm {P} (\overline{C} |B)^i \times \mathrm {P} (CD|B)\)
\(= \mathrm {P} (CD|B) / \mathrm {P} (C|B)\) by Lemma 2
\(= \mathrm {P} (D|BC)\) \(\square \)
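Theorem 2 can be spot-checked in the same style, under the assumption (in the spirit of the shallow conditioning discussed in the text) that conditioning on \(B^*\) amounts to sampling each coordinate from \(\mathrm {Pr} (\cdot |B)\). The worlds and weights below are again hypothetical.

```python
import random

random.seed(0)

# Hypothetical worlds settling B, C, D (in that order), with arbitrary weights.
worlds = {(True, True, True): 0.15, (True, True, False): 0.10,
          (True, False, True): 0.20, (True, False, False): 0.15,
          (False, True, True): 0.10, (False, False, True): 0.30}

# Shallow conditioning on B*: restrict to the B-worlds and renormalize
# (random.choices renormalizes the weights automatically).
b_worlds = {w: p for w, p in worlds.items() if w[0]}
vals, wts = zip(*b_worlds.items())

def eval_c_arrow_d():
    """Value of C -> D at a sequence sampled from Pr(.|B): the value of D
    at the first C-world."""
    while True:
        _, c, d = random.choices(vals, weights=wts)[0]
        if c:
            return d

n = 200_000
freq = sum(eval_c_arrow_d() for _ in range(n)) / n
p_d_given_bc = 0.15 / (0.15 + 0.10)  # Pr(D|BC) from the weights above
print(freq, p_d_given_bc)  # Theorem 2 predicts these agree
```

The agreement of the two printed values reflects the chain \(\mathrm {P} ^* (C\rightarrow D|B^*) = \mathrm {P} (CD|B)/\mathrm {P} (C|B) = \mathrm {P} (D|BC)\) established in the proof.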
© 2015 Springer International Publishing Switzerland
Kaufmann, S. (2015). Conditionals, Conditional Probabilities, and Conditionalization. In H. Zeevat & H.-C. Schmitz (Eds.), Bayesian natural language semantics and pragmatics (Language, Cognition, and Mind, Vol. 2). Cham: Springer. https://doi.org/10.1007/978-3-319-17064-0_4
Print ISBN: 978-3-319-17063-3. Online ISBN: 978-3-319-17064-0.