## Abstract

Improbable knowing is knowing something even though it is almost certain on one’s evidence at the time that one does not know that thing. Once probabilities on the agent’s evidence are introduced into epistemic logic in a very natural way, it is easy to construct models of improbable knowing, some of which have realistic interpretations, for instance concerning agents like us with limited powers of perceptual discrimination. Improbable knowing is an extreme case of failure of the KK principle, that is, of a case of knowing something even though one does not know at the time that one knows that thing. A generalization of the argument yields cases of improbable rationality, in which it is rational for one to do something even though it is almost certain on one’s evidence at the time that it is not rational for one to do that thing. When the models are elaborated to represent appearances and beliefs as well as knowledge, they turn out to contain Gettier cases. Neglect of the possibility of improbable knowing may cause some sceptical claims and claims of the non-closure of knowledge under competent deduction to look more plausible than they deserve to. A formal appendix explores the closely related question of the conditions under which a reflection principle is violated. The principle says that the evidential probability of a proposition conditional on the evidential probability of that proposition’s being *c* is itself *c*.

## Notes

- 1.
For an argument that epistemically possible propositions can have probability 0, even when infinitesimal probabilities are allowed, see Williamson (2007b).

- 2.
The talk of discrimination is just shorthand for talk of how much the subject can know about what time it is (for more discussion see Williamson 1990). In objecting to an earlier version of the argument, Conee and Feldman (2011) assume that indiscriminable positions of the hand appear visually to the subject in the same way, but this assumption is unwarranted; they might appear in different ways between which the subject cannot discriminate.

- 3.
Of effectively the same example, Conee and Feldman (2011) claim both that ‘Even the known proposition is not stated’ and that ‘The proposition that the pointer points somewhere in the relevant arc is the proposition that S [the subject] knows’. Their confusion may result from the presentation of the known proposition as *R*(*w*), rather than by a sentence you, the subject, use to express it. They then suppose that the relevant sentence is ‘It is pointing somewhere in there’, where S means ‘there’ to identify ‘S’s exact discriminatory limits as S knows them to be’. As explained in the text, this is not the pertinent way to take the example.

- 4.
Unlike the alternative examples of improbable knowing proposed by Conee and Feldman (2011), the example does not depend on a belief condition on knowledge, but rather on its more specifically epistemic character (the model also shows how a proposition can have evidential probability 1 even though the evidential probability that it has evidential probability 1 is virtually 0). This is important as a pointer to explaining why we may be reluctant to ascribe knowledge of *p* to someone who on their own evidence certainly believes *p* but is unlikely to know *p*.

- 5.
One should not get the impression that the case against the KK principle itself depends on the use of standard formal models of epistemic logic. The anti-KK argument at Williamson (2000: 114–18) makes no such appeal. Their use here is to enable the calculation of evidential probabilities.

- 6.
On an epistemic account of vagueness, such variable margins for error yield distinctive forms of higher-order vagueness. Williamson (1999: 136–8) argues that if the ‘clearly’ operator for vagueness obeys the analogue of the B (for ‘Brouwersche’) axiom *p* → *K*¬*K*¬*p* (which corresponds to the condition of symmetry on *R*) then any formula with second-order vagueness has *n*th-order vagueness for every *n* > 2, but does not endorse the B axiom for ‘clearly’. In response, Mahtani (2008) uses variable margins for error to argue against the B axiom for ‘clearly’ and suggests that they allow vagueness to cut out at any order. Dorr (2008) provides a formal model in which he proves Mahtani’s suggestion to hold. These arguments all have analogues for the overtly epistemic case.

- 7.
Stalnaker (2006) has the axiom schema \( Bp \to BKp \) (which entails \( Bp \to BK^{n}p \) for arbitrary iterations \( K^{n} \) of *K*).

- 8.
- 9.
For a critique of internalist misinterpretations of excuses as justification see Williamson (2007a).

- 10.
Such a principle is called ‘intuitive closure’ at Williamson (2000: 117–18).

- 11.
- 12.
A similar problem arises for single-premise closure principles when one competently carries out a long chain of single-premise deductive steps, each with a small epistemic probability of inferential error (in the multi-premise case, for simplicity, one’s deductive competence is treated as beyond doubt); see Lasonen-Aarnio (2008) for discussion. Here is a parallel account of that case. One knows the premise without knowing that one knows it. For each deductive step, one carries it out competently without knowing that one does so. By single-premise closure, one knows the conclusion, without knowing that one knows it. For each deductive step, it is very probable on one’s evidence that one carries it out competently. However, it is very improbable on one’s evidence that one carries out every deductive step competently. Since it is granted that one knows the conclusion only if one carries out every deductive step competently, it is very improbable on one’s evidence that one knows the conclusion.

- 13.
The usual form of epistemic modelling is not appropriate for treating possible errors in deductive reasoning, since logical omniscience suppresses the dependence of inferential knowledge on correct inferential processes.

- 14.
The technical details are taken from Williamson (2009).

- 15.
*Proof* Suppose that \( w_{i} > k \). If *wRx* then \( |w_{i} - x_{i}| \le k \), so \( x_{i} > 0 \), so \( x \in p_{i} \). Thus \( w \in Kp_{i} \). Conversely, suppose that \( w_{i} \le k \). Then \( wRw[i|0] \), for \( |w_{i} - w[i|0]_{i}| = |w_{i} - 0| = w_{i} \le k \) and if \( i \ne j \) then \( |w_{j} - w[i|0]_{j}| = 0 \); but \( w[i|0] \notin p_{i} \) because \( w[i|0]_{i} = 0 \), so \( w \notin Kp_{i} \).

- 16.
*Proof* Suppose that *q* is *i*-based and \( x_{i} = y_{i} \). Suppose also that \( x \notin Kq \). Then for some *z*, *xRz* and \( z \notin q \). But then \( yRy[i|z_{i}] \), for \( |y_{i} - y[i|z_{i}]_{i}| = |y_{i} - z_{i}| = |x_{i} - z_{i}| \) (because \( x_{i} = y_{i} \)) \( \le k \) (because *xRz*), and if \( i \ne j \) then \( |y_{j} - y[i|z_{i}]_{j}| = 0 \). Moreover, \( y[i|z_{i}] \notin q \) because \( z \notin q \), *q* is *i*-based and \( y[i|z_{i}]_{i} = z_{i} \). Hence \( y \notin Kq \). Thus if \( y \in Kq \) then \( x \in Kq \). By parity of reasoning the converse holds too.

- 17.
*Proof* Set \( \#(i, q, w) = \{ j : 0 \le j \le 2k,\ w[i|j] \in q \text{ and } |w_{i} - j| \le k \} \) for any \( w \in W \), \( q \subseteq W \), \( 1 \le i \le n \). For each *i*, let \( q_{i} \) be *i*-based. Let \( \cap q_{i} = q_{1} \cap \ldots \cap q_{n} \). For \( w \in W \), \( R(w) \cap \cap q_{i} = \{ x : \forall i,\ x_{i} \in \#(i, q_{i}, w) \} \), since for each *i* and \( x \in W \), \( x \in q_{i} \) iff \( w[i|x_{i}] \in q_{i} \), because \( q_{i} \) is *i*-based. Since \( \text{Prob}_{\text{prior}} \) is uniform, \( \text{Prob}_{w}(\cap q_{i}) = |R(w) \cap \cap q_{i}| / |R(w)| \) for \( w \in W \). But \( |R(w) \cap \cap q_{i}| = |\{ x : \forall i,\ x_{i} \in \#(i, q_{i}, w) \}| = |\#(1, q_{1}, w)| \ldots |\#(n, q_{n}, w)| \). By the special case of this equation in which each \( q_{i} \) is replaced by *W* (which is trivially *i*-based for any *i*), \( |R(w)| = |\#(1, W, w)| \ldots |\#(n, W, w)| \). Consequently:

\( \text{Prob}_{w}(\cap q_{i}) = \left( |\#(1, q_{1}, w)| \ldots |\#(n, q_{n}, w)| \right) / \left( |\#(1, W, w)| \ldots |\#(n, W, w)| \right). \)

For any given *i*, consider another special case in which \( q_{j} \) is replaced by *W* whenever \( i \ne j \). Since *n* − 1 of the ratios cancel out, \( \text{Prob}_{w}(q_{i}) = |\#(i, q_{i}, w)| / |\#(i, W, w)| \). Therefore \( \text{Prob}_{w}(\cap q_{i}) = \text{Prob}_{w}(q_{1}) \ldots \text{Prob}_{w}(q_{n}) \), as required.

- 18.
*Proof* We have already established that \( x \in Kp_{i} \) iff \( x_{i} > k \). Thus, in the notation of the previous footnote, \( \#(i, Kp_{i}, \langle 2k, \ldots, 2k \rangle) = \{ j : k < j \le 2k \} \), so \( |\#(i, Kp_{i}, \langle 2k, \ldots, 2k \rangle)| = k \), while \( \#(i, W, \langle 2k, \ldots, 2k \rangle) = \{ j : k \le j \le 2k \} \), so \( |\#(i, W, \langle 2k, \ldots, 2k \rangle)| = k + 1 \). By the formula for \( \text{Prob}_{w}(q_{i}) \) in the previous footnote (with \( Kp_{i} \) in place of \( q_{i} \)), \( \text{Prob}_{\langle 2k, \ldots, 2k \rangle}(Kp_{i}) = k/(k + 1) \).

- 19.
A similar generalization to higher iterations of knowledge is possible for the case of multiple risks of inferential error in a single-premise deduction. One has at least *n* iterations of knowledge of the premise. For each deductive step, one has *n* − 1 but not *n* iterations of knowledge that one carried it out competently. By single-premise closure and plausible background assumptions, one has *n* but not *n* + 1 iterations of knowledge of the conclusion. For each deductive step, it is very probable on one’s evidence that one has at least *n* − 1 iterations of knowledge that one carried it out competently. However, it is very improbable on one’s evidence that one has at least *n* − 1 iterations of knowledge that one carried out every deductive step competently. Since it is granted that one has at least *n* iterations of knowledge of the conclusion only if one has at least *n* − 1 iterations of knowledge that one carried out every deductive step competently, it is very improbable on one’s evidence that one has at least *n* iterations of knowledge of the conclusion.

- 20.
See Williamson (2008) for more discussion of the structure and semantics of higher-order evidential probabilities. The phenomenon discussed in the text involves the apparent loss of only one iteration of knowledge between premises and conclusion. However, the apparent absence of a given number of iterations of knowledge can cause doubts about all lower numbers of iterations, by a domino effect, since lack of knowledge that one has *n* + 1 iterations implies lack of warrant to assert that one has *n* iterations (Williamson 2005: 233–4).
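The counting claims in notes 15 to 18 can be checked by brute force on a small instance. The sketch below is only illustrative: it assumes, as the proofs suggest, that worlds are *n*-tuples with components in {0, …, 2*k*}, that *wRx* iff \( |w_{i} - x_{i}| \le k \) for every *i*, that \( p_{i} = \{x : x_{i} > 0\} \), and that the prior is uniform. The values *k* = 2, *n* = 2 and all function names are hypothetical choices, not part of the original text.

```python
from fractions import Fraction
from itertools import product

k, n = 2, 2                                   # illustrative values only
W = list(product(range(2 * k + 1), repeat=n)) # assumed space of worlds

def R(w, x):
    """Assumed accessibility: every component differs by at most k."""
    return all(abs(wi - xi) <= k for wi, xi in zip(w, x))

def K(q):
    """Kq = {w : every world accessible from w is in q}."""
    return {w for w in W if all(x in q for x in W if R(w, x))}

def prob(w, q):
    """Evidential probability of q at w under a uniform prior."""
    accessible = [x for x in W if R(w, x)]
    return Fraction(sum(x in q for x in accessible), len(accessible))

p = [{x for x in W if x[i] > 0} for i in range(n)]
Kp = [K(pi) for pi in p]
top = (2 * k,) * n                            # the world <2k, ..., 2k>

# Note 15: Kp_i holds exactly where the i-th component exceeds k.
print(Kp[0] == {w for w in W if w[0] > k})    # True
# Note 18: each Kp_i has probability k/(k+1) at <2k, ..., 2k>.
print(prob(top, Kp[0]))                       # 2/3
# Note 17: the Kp_i are independent, so the conjunction has (k/(k+1))^n.
print(prob(top, Kp[0] & Kp[1]))               # 4/9
```

With a larger *n* the last value shrinks geometrically, which is the sense in which knowing every conjunct can be very improbable on one's evidence even though each is individually probable.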

## References

Conee, E., & Feldman, R. (2011). Response to Williamson. In Dougherty 2011.

Dorr, C. (2008). How vagueness could cut out at any order. Unpublished MS.

Dougherty, T. (Ed.). (2011). *Evidentialism and its discontents*. Oxford: Oxford University Press.

Gettier, E. (1963). Is justified true belief knowledge? *Analysis, 23*, 121–123.

Gibbons, J. (2001). Knowledge in action. *Philosophy and Phenomenological Research, 62*, 579–600.

Goldman, A. (1976). Discrimination and perceptual knowledge. *The Journal of Philosophy, 73*, 771–791.

Greenough, P., & Pritchard, D. (Eds.). (2009). *Williamson on knowledge*. Oxford: Oxford University Press.

Hawthorne, J. (2004). *Knowledge and lotteries*. Oxford: Clarendon Press.

Hawthorne, J., & Lasonen-Aarnio, M. (2009). Knowledge and objective chance. In Greenough and Pritchard 2009.

Hawthorne, J., & Stanley, J. (2008). Knowledge and action. *The Journal of Philosophy, 105*, 571–590.

Hintikka, J. (1962). *Knowledge and belief*. Ithaca, NY: Cornell University Press.

Hyman, J. (1999). How knowledge works. *The Philosophical Quarterly, 49*, 433–451.

Lasonen-Aarnio, M. (2008). Single premise deduction and risk. *Philosophical Studies, 141*, 157–173.

Lemmon, E. J. (1967). If I know, do I know that I know? In A. Stroll (Ed.), *Epistemology*. New York: Harper & Row.

Mahtani, A. (2008). Can vagueness cut out at any order? *Australasian Journal of Philosophy, 86*, 499–508.

Radford, C. (1966). Knowledge—By examples. *Analysis, 27*, 1–11.

Stalnaker, R. (1999). *Context and content: Essays on intentionality in speech and thought*. Oxford: Oxford University Press.

Stalnaker, R. (2006). On logics of knowledge and belief. *Philosophical Studies, 128*, 169–199.

Williamson, T. (1990). *Identity and discrimination*. Oxford: Blackwell.

Williamson, T. (1999). On the structure of higher-order vagueness. *Mind, 108*, 127–143.

Williamson, T. (2000). *Knowledge and its limits*. Oxford: Oxford University Press.

Williamson, T. (2005). Contextualism, subject-sensitive invariantism and knowledge of knowledge. *The Philosophical Quarterly, 55*, 213–235.

Williamson, T. (2007a). On being justified in one’s head. In M. Timmons, J. Greco, & A. R. Mele (Eds.), *Rationality and the Good: Critical essays on the ethics and epistemology of Robert Audi*. Oxford: Oxford University Press.

Williamson, T. (2007b). How probable is an infinite sequence of heads? *Analysis, 67*, 173–180.

Williamson, T. (2008). Why epistemology can’t be operationalized. In Q. Smith (Ed.), *Epistemology: New philosophical essays*. Oxford: Oxford University Press.

Williamson, T. (2009). Reply to John Hawthorne and Maria Lasonen-Aarnio. In Greenough and Pritchard 2009.

Williamson, T. (2011). Improbable knowing. In Dougherty 2011.

## Appendix: The Reflection Principle for Evidential Probability

For ease of working, we use the following notation. A *probability distribution* over a frame <*W*, *R*> is a function *Pr* from subsets of *W* to nonnegative real numbers such that *Pr*(*W*) = 1 and whenever \( X \cap Y = \{ \} \), \( Pr(X \cup Y) = Pr(X) + Pr(Y) \) (*Pr* must be total but need not satisfy countable additivity). *Pr* is *regular* iff whenever *Pr*(*X*) = 0, *X* = {}. Given *Pr*, the evidential probability of \( X \subseteq W \) at \( w \in W \) is the conditional probability \( Pr(X \mid R(w)) = Pr(X \cap R(w))/Pr(R(w)) \), where *R*(*w*) = {*x*: *wRx*}. Similarly, the evidential probability of *X* conditional on *Y* at *w* is \( Pr(X \mid Y \cap R(w)) = Pr(X \cap Y \cap R(w))/Pr(Y \cap R(w)) \). In both cases, the probabilities are treated as defined only when the denominator is positive.
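For a finite frame these definitions can be computed directly. The following is a minimal sketch, not part of the original appendix: the five-world frame with *wRx* iff |*w* − *x*| ≤ 1, the uniform prior, and the names `R_of`, `pr` and `evidential_prob` are all illustrative assumptions. It also shows a small instance of the phenomenon mentioned in note 4, where a proposition has evidential probability 1 although the evidential probability that it has evidential probability 1 is low.

```python
from fractions import Fraction

def R_of(R, w):
    """R(w) = {x : wRx}, with R given as a set of ordered pairs."""
    return frozenset(x for (v, x) in R if v == w)

def pr(prior, X):
    """Pr(X) for a prior given by its values on singletons (finite W)."""
    return sum((prior[x] for x in X), Fraction(0))

def evidential_prob(prior, R, X, w, Y=None):
    """Pr(X | R(w)), or Pr(X | Y ∩ R(w)) if Y is given; None when undefined."""
    E = R_of(R, w) if Y is None else R_of(R, w) & frozenset(Y)
    if pr(prior, E) == 0:
        return None  # denominator not positive: treated as undefined
    return pr(prior, frozenset(X) & E) / pr(prior, E)

# Hypothetical toy frame: worlds 0..4 on a line, wRx iff |w - x| <= 1.
W = range(5)
R = {(w, x) for w in W for x in W if abs(w - x) <= 1}
prior = {w: Fraction(1, 5) for w in W}        # uniform, hence regular

X = R_of(R, 2)                                # the proposition R(2) = {1, 2, 3}
at_2 = evidential_prob(prior, R, X, 2)
at_1 = evidential_prob(prior, R, X, 1)
# Worlds at which X has evidential probability 1, and how probable that is at 2.
certain = {u for u in W if evidential_prob(prior, R, X, u) == 1}
prob_certain = evidential_prob(prior, R, certain, 2)
print(at_2, at_1, sorted(certain), prob_certain)  # 1 2/3 [2] 1/3
```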

The *reflection principle* holds for a probability distribution *Pr* over a frame <*W*, *R*> iff for every \( w \in W \), \( X \subseteq W \) and real number *c*, the evidential probability of *X* at *w* conditional on the evidential probability of *X* being *c* is itself *c*; more precisely:

\( Pr\left( X \mid \{ u : Pr(X \mid R(u)) = c \} \cap R(w) \right) = c. \)

We must be careful about whether the relevant probabilities are defined. If *Pr*(*R*(*w*)) = 0 then *Pr*(*X* | *R*(*w*)) is undefined, so it is unclear what the set term {*u*: *Pr*(*X*|*R*(*u*)) = *c*} means. To avoid this problem, *Pr*(*R*(*w*)) must always be positive, so that all (unconditional) evidential probabilities are defined. In particular, therefore, *R*(*w*) must always be nonempty; in other words, <*W*, *R*> must be *serial* in the sense that for every \( w \in W \) there is an *x* such that *wRx*. For a regular probability distribution on a serial frame, all evidential probabilities are defined. Of course, it does not follow that the outer probability in the reflection principle is always defined. Indeed, \( \left\{ {u:\Pr \left( {X\left| {R(u)} \right.} \right) = c} \right\} \cap R(w) \) will often be empty, for example when *X* = *R*(*u*) and *c* < 1. In a setting in which all evidential probabilities are defined, we treat the reflection principle as holding iff the above equation is satisfied whenever the outer probability is defined.

Other terminology: A frame <*W*, *R*> is *quasi-reflexive* iff whenever *wRx*, *xRx*; <*W*, *R*> is *quasi-symmetric* iff whenever *wRx* and *xRy*, *yRx*. Other frame conditions are as usual. In terms of a justified belief operator *J* with the usual accessibility semantics, quasi-reflexivity corresponds to the axiom *J*(*Jp* → *p*), quasi-symmetry to the axiom *J*(*p* → *J*¬*J*¬*p*), and seriality to the axiom *Jp* → ¬*J*¬*p*.
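The frame conditions just defined are straightforward to test on a finite frame. The sketch below is illustrative only; the function names and the two example frames are assumptions introduced here, with *R* represented as a set of ordered pairs.

```python
def is_serial(W, R):
    """Every world sees at least one world."""
    return all(any((w, x) in R for x in W) for w in W)

def is_quasi_reflexive(R):
    """Whenever wRx, xRx."""
    return all((x, x) in R for (_, x) in R)

def is_quasi_symmetric(R):
    """Whenever wRx and xRy, yRx."""
    return all((y, x) in R for (_, x) in R for (x2, y) in R if x2 == x)

def is_transitive(R):
    """Whenever wRx and xRy, wRy."""
    return all((w, y) in R for (w, x) in R for (x2, y) in R if x2 == x)

# Hypothetical frame A: world 0 sees only the cell {1, 2}; not reflexive.
W_a = {0, 1, 2}
R_a = {(0, 1), (0, 2), (1, 1), (1, 2), (2, 1), (2, 2)}

# Hypothetical frame B: margin-for-error style, wRx iff |w - x| <= 1.
W_b = {0, 1, 2}
R_b = {(w, x) for w in W_b for x in W_b if abs(w - x) <= 1}

for W, R in ((W_a, R_a), (W_b, R_b)):
    print(is_serial(W, R), is_quasi_reflexive(R),
          is_quasi_symmetric(R), is_transitive(R))
# frame A: True True True True
# frame B: True True True False
```

Frame A satisfies all the conditions of Proposition 1 without being reflexive or symmetric, while frame B is reflexive and symmetric but fails transitivity.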

**Proposition 1**

*The reflection principle holds for a regular prior probability distribution over a serial frame only if the frame is quasi-reflexive, quasi-symmetric and transitive.*

*Proof*

Suppose that the reflection principle holds for a regular probability distribution *Pr* over a serial frame <*W*, *R*>, and *wRx*. Since <*W*, *R*> is serial and *Pr* regular, all evidential probabilities are defined.

(1) For quasi-reflexiveness, we show that *xRx*. Let *Pr*({*x*}|*R*(*x*)) = *c*. Thus \( x \in \{ u : Pr(\{x\} \mid R(u)) = c \} \cap R(w) \) since *wRx*, so \( Pr(\{x\} \mid \{ u : Pr(\{x\} \mid R(u)) = c \} \cap R(w)) \) is defined by regularity. Hence by reflection:

\( Pr\left( \{x\} \mid \{ u : Pr(\{x\} \mid R(u)) = c \} \cap R(w) \right) = c. \)

Therefore, since \( x \in \{ u : Pr(\{x\} \mid R(u)) = c \} \cap R(w) \), \( c > 0 \) by regularity. Hence \( Pr(\{x\} \mid R(x)) > 0 \), so \( \{x\} \cap R(x) \ne \{ \} \), so *xRx*.

(2) For transitivity, we suppose that *xRy* and show that *wRy*. By regularity, *Pr*({*y*} | *R*(*x*)) = *b* > 0. Hence \( x \in \{ u : Pr(\{y\} \mid R(u)) = b \} \cap R(w) \), so by regularity \( Pr(\{y\} \mid \{ u : Pr(\{y\} \mid R(u)) = b \} \cap R(w)) \) is defined, so by reflection

\( Pr\left( \{y\} \mid \{ u : Pr(\{y\} \mid R(u)) = b \} \cap R(w) \right) = b > 0. \)

Therefore {*y*} ∩ *R*(*w*) ≠ {}, so *wRy*.

(3) For quasi-symmetry, we suppose that *xRy* and show that *yRx*. By quasi-reflexiveness, \( y \in R(x) \cap R(y) \), so by regularity *Pr*(*R*(*y*)|*R*(*x*)) = *a* > 0. But by quasi-reflexiveness again \( x \in \{ u : Pr(R(y) \mid R(u)) = a \} \cap R(x) \), so \( Pr(R(y) \mid \{ u : Pr(R(y) \mid R(u)) = a \} \cap R(x)) \) is defined by regularity, so by reflection

\( Pr\left( R(y) \mid \{ u : Pr(R(y) \mid R(u)) = a \} \cap R(x) \right) = a. \)

Suppose that *a* < 1. Whenever \( u \in R(y) \), \( R(u) \subseteq R(y) \) by transitivity, so *Pr*(*R*(*y*) | *R*(*u*)) = 1; thus \( R(y) \cap \{ u : Pr(R(y) \mid R(u)) = a \} = \{ \} \), so \( Pr(R(y) \mid \{ u : Pr(R(y) \mid R(u)) = a \} \cap R(x)) = 0 \), so *a* = 0, which is a contradiction. Hence *a* = 1. Thus *Pr*(*R*(*y*) | *R*(*x*)) = 1, so \( R(x) \subseteq R(y) \) by regularity. But \( x \in R(x) \), so \( x \in R(y) \), so *yRx*.
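Proposition 1 implies that, for a regular distribution on a serial frame, reflection must fail somewhere if the frame is not transitive. The following brute-force sketch searches for such failures on a small example. The three-world frame with *wRx* iff |*w* − *x*| ≤ 1, the uniform prior, and the helper names are hypothetical choices made for illustration; the check treats reflection as violated only when the outer probability is defined, as in the text.

```python
from fractions import Fraction
from itertools import chain, combinations

def R_of(R, w):
    """R(w) = {x : wRx}."""
    return frozenset(x for (v, x) in R if v == w)

def pr(prior, X):
    return sum((prior[x] for x in X), Fraction(0))

def cond(prior, X, E):
    """Pr(X | E), or None when Pr(E) is not positive."""
    return None if pr(prior, E) == 0 else pr(prior, X & E) / pr(prior, E)

def reflection_violations(W, R, prior):
    """All (w, X, c, outer) with outer = Pr(X | {u: Pr(X|R(u)) = c} ∩ R(w)) != c."""
    W = list(W)
    subsets = chain.from_iterable(combinations(W, m) for m in range(len(W) + 1))
    found = []
    for X in map(frozenset, subsets):
        ev = {u: cond(prior, X, R_of(R, u)) for u in W}
        for w in W:
            for c in {v for v in ev.values() if v is not None}:
                E = frozenset(u for u in W if ev[u] == c) & R_of(R, w)
                outer = cond(prior, X, E)
                if outer is not None and outer != c:
                    found.append((w, X, c, outer))
    return found

# Hypothetical non-transitive frame: 0R1 and 1R2 but not 0R2.
W = [0, 1, 2]
R = {(w, x) for w in W for x in W if abs(w - x) <= 1}
prior = {w: Fraction(1, 3) for w in W}

violations = reflection_violations(W, R, prior)
print(bool(violations))  # True
```

One violation found this way is at *w* = 0 with *X* = {1} and *c* = 1/2: the only world in *R*(0) at which {1} has evidential probability 1/2 is 0 itself, so the outer probability is 0 rather than 1/2.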

**Corollary 2**

*The reflection principle holds for a regular prior probability distribution over a reflexive frame only if the frame is partitional.*

*Proof*

By Proposition 1; any reflexive quasi-symmetric relation is serial and symmetric.

**Proposition 3**

*The reflection principle holds for any probability distribution over any finite quasi-reflexive quasi-symmetric transitive frame for which all evidential probabilities are defined.*

*Proof*

Let *Pr* be a probability distribution over a finite quasi-reflexive quasi-symmetric transitive frame <*W*, *R*>. Pick \( w \in W \), \( X \subseteq W \) and a real number *c*. Suppose that \( Pr(\{ u : Pr(X \mid R(u)) = c \} \cap R(w)) > 0 \), so \( \{ u : Pr(X \mid R(u)) = c \} \cap R(w) \ne \{ \} \). Since *R* is quasi-reflexive, quasi-symmetric and transitive it partitions *R*(*w*). Suppose that \( x \in R(w) \). By transitivity, \( R(x) \subseteq R(w) \). Moreover, if \( y \in R(x) \) then by quasi-symmetry and transitivity *R*(*y*) = *R*(*x*), so *Pr*(*X*|*R*(*y*)) = *Pr*(*X*|*R*(*x*)), so if *Pr*(*X*|*R*(*x*)) = *c* then *Pr*(*X*|*R*(*y*)) = *c*. Hence if \( x \in \{ u : Pr(X \mid R(u)) = c \} \cap R(w) \) then \( R(x) \subseteq \{ u : Pr(X \mid R(u)) = c \} \cap R(w) \). Thus for some finite nonempty \( Y \subseteq \{ u : Pr(X \mid R(u)) = c \} \cap R(w) \):

\( \{ u : Pr(X \mid R(u)) = c \} \cap R(w) = \bigcup_{y \in Y} R(y), \)

where \( R(y) \cap R(z) = \{ \} \) whenever *y* and *z* are distinct members of *Y*. Trivially, if \( y \in Y \) then *Pr*(*X*|*R*(*y*)) = *c*, that is, \( Pr(X \cap R(y)) = c\,Pr(R(y)) \). By hypothesis, *Pr*(*R*(*y*)) is always positive. Consequently:

\( Pr\left( X \mid \{ u : Pr(X \mid R(u)) = c \} \cap R(w) \right) = \dfrac{\sum_{y \in Y} Pr(X \cap R(y))}{\sum_{y \in Y} Pr(R(y))} = \dfrac{\sum_{y \in Y} c\,Pr(R(y))}{\sum_{y \in Y} Pr(R(y))} = c. \)

**Proposition 4**

*The reflection principle holds for any countably additive probability distribution over any quasi-reflexive quasi-symmetric transitive frame for which all evidential probabilities are defined.*

*Proof*

The proof is like that for Proposition 3. In particular, whatever the cardinality of *W*, \( \left\{ {R(y)} \right\}_{y \in Y} \) is a family of disjoint sets to each of which *Pr* gives positive probability, so *Y* must be at most countably infinite by a familiar property of real-valued distributions. Thus the summation in the proof is over a countable set.
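The decomposition used in the proofs of Propositions 3 and 4 can be traced numerically on a small finite frame. The sketch below is an illustration under assumptions introduced here: a five-world frame in which world 0 sees two cells {1, 2} and {3, 4}, a non-uniform prior chosen so that *X* = {1, 3} has evidential probability 1/4 in both cells, and hypothetical helper names. It checks that the conditioning set is a disjoint union of sets *R*(*y*) and that the resulting ratio of sums equals *c*.

```python
from fractions import Fraction as F

W = [0, 1, 2, 3, 4]
cells = [frozenset({1, 2}), frozenset({3, 4})]
# World 0 sees both cells; each world in a cell sees exactly its own cell.
R = {(0, x) for x in (1, 2, 3, 4)} | {(a, b) for cell in cells for a in cell for b in cell}
prior = {0: F(1, 10), 1: F(1, 10), 2: F(3, 10), 3: F(1, 8), 4: F(3, 8)}

def R_of(w):
    return frozenset(x for (v, x) in R if v == w)

def pr(X):
    return sum((prior[x] for x in X), F(0))

def ev(X, u):
    """Evidential probability Pr(X | R(u)); defined here since Pr(R(u)) > 0."""
    return pr(X & R_of(u)) / pr(R_of(u))

X, w, c = frozenset({1, 3}), 0, F(1, 4)
E = frozenset(u for u in W if ev(X, u) == c) & R_of(w)   # conditioning set
blocks = {R_of(y) for y in E}                            # the distinct R(y)

# E is the union of the blocks, and distinct blocks are disjoint.
print(frozenset().union(*blocks) == E)                   # True
print(all(a == b or not (a & b) for a in blocks for b in blocks))  # True

# Ratio of sums over the blocks, as in the proof, and the direct value.
num = sum((pr(X & B) for B in blocks), F(0))
den = sum((pr(B) for B in blocks), F(0))
print(num / den, pr(X & E) / pr(E))                      # 1/4 1/4
```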

**Corollary 5**

*For a regular probability distribution over a finite serial frame, the reflection principle holds iff the frame is quasi-reflexive, quasi-symmetric and transitive.*

*Proof*

From Propositions 1 and 3, since all evidential probabilities are defined for a regular probability distribution over a serial frame.

**Corollary 6**

*For a regular probability distribution over a finite reflexive frame, the reflection principle holds iff the frame is partitional.*

*Proof*

From Corollary 2 and Proposition 3.

**Corollary 7**

*For a regular countably additive probability distribution over a serial frame, the reflection principle holds iff the frame is quasi-reflexive, quasi-symmetric and transitive.*

*Proof*

From Propositions 1 and 4.

**Corollary 8**

*For a regular countably additive probability distribution over a reflexive frame, the reflection principle holds iff the frame is partitional.*

*Proof*

From Corollary 2 and Proposition 4.

## About this article

### Cite this article

Williamson, T. Very Improbable Knowing. *Erkenn* **79**, 971–999 (2014). https://doi.org/10.1007/s10670-013-9590-9

### Keywords

- Epistemic Status
- Rational Belief
- Epistemic Logic
- Epistemic Position
- Epistemic Possibility