Knowledge from multiple experiences

Abstract

This paper models knowledge in cases where an agent has multiple experiences over time. Using this model, we introduce a series of observations that undermine the pretheoretic idea that the evidential significance of experience depends on the extent to which that experience matches the world. On the basis of these observations, we model knowledge in terms of what is likely given the agent’s experience. An agent knows p when p is implied by her epistemic possibilities. A world is epistemically possible when its probability given the agent’s experiences is not significantly lower than the probability of the actual world given those experiences.

Notes

  1. Here, as is standard in this literature, we assume that experiences represent the object as having a precise value. See Williamson (2013a) for discussion.

  2. As in economics, the hope is that the models will provide insight into the phenomena they model, not that they will capture exceptionless generalizations true of the target phenomenon. As Williamson (2017) notes: ‘The traditional philosopher’s instinct is to provide counterexamples to refute the simplifications and idealizations built into a model, which rather misses the point of the exercise...what defeats a model is not a counterexample but a better model, one that retains its predecessor’s successes while adding some more of its own’ (Williamson, 2017, p. 9).

  3. Here, we don’t assume that this setup correctly describes the actual facts about human ability to perceive water temperature. We are simply imagining a possible creature who perceives water temperature in this way.

  4. Divergence and non-convexity can trivially obtain in situations where the agent possesses unusual background information. Imagine for example that the agent discovers she is wearing vision-distorting goggles, or that the agent learns that a particular real value inside of a range cannot be actual. Crucially, though, we are modelling cases where the relevant effects arise even without unusual background knowledge. Moreover, in our examples and in the theory we develop, these features only emerge when the agent receives multiple pieces of evidence.

  5. Williamson (2013a) is primarily concerned with providing an internalist model of Gettier cases. But the models there have a variety of applications outside of this internalist setting. For more discussion of these models and their varied significance, see the citations above.

    As a simplifying assumption, the models in Williamson (2013a) also assume that the agent knows everything that they are in a position to know. Obviously the real world isn’t like this; often, for example, we don’t even believe that which we are in a position to know. Readers who dislike this aspect of the idealization can treat the model as a model of what we are in a position to know rather than as a model of knowledge.

  6. See Goodman and Salow (2018) (although note that their notation for the normality ordering is reversed).

  7. Even here we are simplifying. Of course some strange features of the environment might be relevant to the normality of forming a belief about a certain quantity on the basis of its appearance. For example, it would be abnormal for the light from an object to take a detour through Beijing on its way to our eyes. Like Williamson (2013a), we abstract from this issue in our model.

  8. Arguably a task-relative conception of normality also provides some motivation for Appearance Luminosity. The task that is relevant to our model is forming a belief about a certain quantity on the basis of its appearance. For the purposes of evaluating this task, we only consider worlds where the quantity has that perceptual appearance.

    On the other hand, we may for certain purposes want a more fine-grained conception of methods. Even holding fixed the appearance, there might be various methods used to form beliefs based on the appearance, and one might want one’s conception of knowledge to be somehow sensitive to which method is in play. The literature that we are contributing to abstracts away from questions of fine-grained methods, and this is one of the ways in which our model, like theirs, makes simplifying assumptions.

  9. See Goodman (2013).

  10. The line of thought that follows was suggested independently in conversation by both Jeremy Goodman and Yoaav Isaacs.

  11. There’s a generalization of No Future Dependence that is obviously hopeless. According to the generalization, what we know now is independent of our future evidence. One way to see that this is obviously wrong is to consider propositions about our future evidence. I may now know that I am not about to have an experience as of a polka-dotted elephant dancing the polka. But whether I know this depends on whether I will in fact have the experience in the future.

    On the other hand, we admit that No Future Dependence is not completely sacrosanct even when restricted to knowledge of current real values. As a test case, suppose one has an experience of a barn. One might think that whether one knows now that it is a barn depends on whether in the near future fake barns will be constructed that produce misleading experiences. On the other hand, it’s not clear how convincing this way of pushing back really is. Suppose someone looks at a working watch and says they know it is 7 o’clock. Can I falsify their assertion two minutes later by showing them a stopped clock that says it is 7:10? Can one really retroactively prevent knowledge through such performances?

  12. On the other hand, Intersectivism does validate Margin for Error. Suppose the real value is 50. Then no individual appearance will allow the agent to learn that the real value is within m of 50. It follows that the sequence of appearances together does not allow the agent to learn this.

  13. A more promising version of Centrism would perhaps replace our crude measure of distance above with squared distance (cf. the superiority of the Brier score over cruder measures of accuracy in the case of credences). On this proposal, the error at a world is the sum of the squared distances between the real value and each appearance. At each world, the agent knows that the error is not significantly less than actual. Throughout, we restrict attention to the simpler definition of error in the main text because our two main challenges to this theory (regarding divergence and non-convexity) also affect the proposal that relies on squared distance.
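
    In symbols (our reconstruction; the notation \(r(w)\) for the real value at world w and \(a_{1}, \ldots, a_{n}\) for the appearances is ours, not the paper’s), the squared-distance proposal measures error as

    \[ \mathrm{error}(w) \;=\; \sum_{i=1}^{n} \bigl( r(w) - a_{i} \bigr)^{2}, \]

    in place of the cruder unsquared sum \(\sum_{i} |r(w) - a_{i}|\).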

  14. Suppose there is a smaller relation R satisfying these principles. If R is smaller than E, then there is some pair \((r, A)\) where \(R(r,A) \subset E(r,A)\). Since R satisfies Mean Appearance Centering, we know that \(R(r,A)\) is centered on the center of A, just like \(E(r,A)\). But since \(R(r,A) \subset E(r,A)\), we know that Shrinking Margin for Error fails. Suppose for simplicity that r is above or at the center of A. Then the possible real values extend above r by \(\frac{m}{t}\). Since \(R(r,A) \subset E(r,A)\), we know that the R-accessible real values extend above r by less than \(\frac{m}{t}\). So R violates Shrinking Margin for Error.

  15. We can distinguish Match from another principle, Strong Match, which says that an agent knows strictly more in cases of perfect match than in any other case. Centrism validates Match, but invalidates Strong Match. Below, we offer a counterexample to Match and hence also to Strong Match.

  16. There are other potential counterexamples to Match that we abstract from in this paper. Imagine that it is unclear whether the real values are reliably generating appearances. Imagine that there is instead a significant chance that the agent just experiences water as warm, no matter its temperature. In that case, having a series of warm experiences can provide less evidence than having a mixed series of experiences, even if the water is warm. The former series of appearances is consistent with the hypothesis that the water will feel warm no matter its temperature. The latter series is not. This case is similarly a counterexample to the principle that having many experiences that perfectly match reality always provides more knowledge than just one. Having many experiences that are exactly the same could actually provide evidence that one’s appearances are unreliable because insensitive.

  17. The reader may worry that our examples require bounded quantities (there is no value below 1). But our observations throughout also apply, for example, to cases with a circular structure, such as an agent’s knowledge of the time on the basis of looking at a clock. Note that even in the case of a single appearance, Appearance Centering would need some qualification with bounded domains, but can be imposed unrestrictedly when the values form a circle, as with the position of a clock hand.

  18. In our own theory, developed later, this assumption will imply that prior to any observations, the agent knows nothing about the real value.

  19. This independence assumption is not essential to the description of our cases, or to the positive theory we develop later. We use it here as a simplifying assumption that is appropriate for some range of cases. For some discussion of cases where the assumption is inappropriate, see Garber (1980). Often, in cases where independence fails, an agent learns less from repeated appearances with the same content. For example, imagine an agent who, once she has a certain perception of the water’s temperature, will continue to have a similar perception of the temperature upon repeated experiences unless she receives a radically different sensory input. Such an agent will learn less from repeated experiences that feel the same. Similarly, using one thermometer twice instead of using two different thermometers would make a difference to the plausibility of independence.
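
    Stated formally (our gloss), the simplifying assumption is that appearances are independent conditional on the real value:

    \[ P(A_{1} = a_{1}, \ldots, A_{n} = a_{n} \mid r) \;=\; \prod_{i=1}^{n} P(A_{i} = a_{i} \mid r). \]

    When independence fails in the ways just described, later likelihoods must instead be conditioned on the earlier appearances, and repeated matching appearances carry correspondingly less evidential weight.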

  20. An anonymous referee observes that one surprising feature of this particular example is that when there is a pair of appearances of 3, the agent considers 1 likelier than 3. One might worry that this is incompatible with the agent’s evidence playing the functional role of appearances. Notice, however, that this example is derived from a prior distribution on which any single appearance makes it most likely that the real value matches that appearance. Divergence between reality and appearance can only be probabilified by multiple pieces of evidence, not by any one alone. Finally, divergence does not force a scenario where certain pairs of matching appearances probabilify values other than themselves. For example, imagine a version of our original hot/warm/cold case, in which the probability of the water being hot conditional on appearing hot is roughly .8, the probability of the water being warm conditional on appearing hot is roughly .2, and the probability of the water being cold conditional on appearing hot is negligible. Suppose though that the probability of the water being warm conditional on appearing warm is roughly .6, while the probability of the water being hot conditional on appearing warm is roughly .2. In this case, receiving experiences of cold and hot will provide stronger evidence of warmth than receiving experiences of warm and warm; but each of warm/warm, hot/hot, and cold/cold will favor warm, hot, and cold respectively.
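
    The arithmetic behind this example can be checked directly. The sketch below is ours, not the paper’s: it assumes a uniform prior over the three real values, conditional independence of the two appearances given the real value, and mirror-image probabilities for the cold appearance, none of which the footnote states explicitly. With a uniform prior, the stated conditional probabilities are proportional to the likelihoods, so the posterior from two appearances is proportional to their product.

```python
# A numerical check of the hot/warm/cold example (our sketch, not the
# paper's code). Assumed: uniform prior, conditional independence, and
# mirror-image probabilities for the "cold" appearance.

# single[a][r]: probability of real value r conditional on appearance a.
# With a uniform prior, these are proportional to the likelihoods P(a | r).
single = {
    'hot':  {'hot': 0.8, 'warm': 0.2, 'cold': 0.0},  # cold negligible
    'warm': {'hot': 0.2, 'warm': 0.6, 'cold': 0.2},
    'cold': {'hot': 0.0, 'warm': 0.2, 'cold': 0.8},  # assumed mirror of 'hot'
}

def posterior(a1, a2):
    """Posterior over real values after two independent appearances."""
    weights = {r: single[a1][r] * single[a2][r] for r in ('hot', 'warm', 'cold')}
    total = sum(weights.values())
    return {r: round(w / total, 3) for r, w in weights.items()}

print(posterior('cold', 'hot'))   # warm: 1.0   -- all weight on warm
print(posterior('warm', 'warm'))  # warm: 0.818 -- weaker evidence of warmth
print(posterior('hot', 'hot'))    # hot: 0.941  -- matching pairs favor themselves
```

    On these assumptions, cold-then-hot puts all the posterior weight on warm, while warm-then-warm leaves warm at roughly .82, matching the footnote’s verdicts.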

  21. On the other hand, we invalidate the extension of this principle from worlds to propositions. This principle says that if p is epistemically possible and q is epistemically impossible, then the probability of p on the appearances is higher than the probability of q on the appearances. To see the problem, imagine a model with 6 worlds, \(w_{1}\) through \(w_{6}\). \(w_{1}\) has a probability of .4, \(w_{2}\) has a probability of .2, and the remaining worlds each have a probability of .1. Later we develop a model where if \(w_{1}\) is actual, the agent knows she is in \(w_{1}\) or \(w_{2}\); in particular, the agent knows she is not in \(w_{3}\) through \(w_{6}\). So the proposition that she is in \(w_{2}\) is epistemically possible, and the proposition that she is in \(w_{3}\) through \(w_{6}\) is epistemically impossible, and yet the former proposition has a lower probability than the latter.
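
    A quick computation makes the example concrete. The cutoff implementing ‘not significantly lower’ below is our hypothetical choice; the footnote does not fix one, and any ratio in the interval (.25, .5] delivers the verdicts described.

```python
# The six-world example made concrete (our sketch, not the paper's model).
probs = {'w1': 0.4, 'w2': 0.2, 'w3': 0.1, 'w4': 0.1, 'w5': 0.1, 'w6': 0.1}
RATIO = 0.3  # w is possible iff P(w) / P(actual world) >= RATIO (hypothetical cutoff)

actual = 'w1'
possible = {w for w, p in probs.items() if p / probs[actual] >= RATIO}
print(sorted(possible))  # ['w1', 'w2']

p = probs['w2']                                      # epistemically possible
q = sum(probs[w] for w in ('w3', 'w4', 'w5', 'w6'))  # epistemically impossible
print(p, round(q, 3))  # 0.2 0.4 -- the impossible proposition is more probable
```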

  22. To clarify, there are sometimes cases where the appearances are on opposite sides of the real value, and the agent can come to know that the real value lies near the midpoint of these appearances. Everything depends on the underlying probabilities. We can’t in general say which worlds are possible merely on the basis of facts about the distance between reality and appearance. We need more than just these distances; we need to know how likely each real value was to produce various appearances.

  23. Respect does not imply Guidance, since (17) validates Respect but not Guidance. Similarly, Guidance does not imply Respect. Consider a model with five worlds, \(w_{1}\) through \(w_{5}\), where worlds with higher indices are more normal than worlds with lower indices. Consider a simple model where the epistemic possibilities at a world are those worlds that are not significantly less normal, and this set is found by admitting more and more abnormal worlds until the resulting set of worlds has a sufficiently high probability. Imagine the probabilities of \(w_{1}\) through \(w_{5}\) are .3, .25, .2, .15, and .1 respectively. If the actual world is \(w_{5}\), the agent’s epistemic possibilities are \(w_{2}\) through \(w_{5}\). In that case, every known proposition has a high probability. But \(w_{1}\) is epistemically impossible while \(w_{5}\) is epistemically possible, even though \(w_{1}\) has a higher probability than \(w_{5}\).
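
    The simple model described here is easy to make concrete. In the sketch below (ours, not the paper’s), the probability threshold of .7 is a hypothetical choice that reproduces the footnote’s verdict.

```python
# The five-world normality model made concrete (our sketch). Worlds run
# from most normal (w5) to least normal (w1); the threshold is hypothetical.
by_normality = [('w5', 0.10), ('w4', 0.15), ('w3', 0.20), ('w2', 0.25), ('w1', 0.30)]
THRESHOLD = 0.7

def possibilities():
    """Admit ever less normal worlds until total probability is high enough."""
    chosen, total = [], 0.0
    for w, p in by_normality:
        chosen.append(w)
        total += p
        if total >= THRESHOLD:
            break
    return chosen

print(possibilities())  # ['w5', 'w4', 'w3', 'w2']
# Guidance holds: anything known is true in all four worlds, so P >= .7.
# Respect fails: w1 (P = .3) is excluded while w5 (P = .1) is included.
```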

  24. Goodman and Salow (2018) develop a quite similar model that does not validate Guidance. They let the epistemically accessible worlds at w be the union of (i) the worlds not significantly less normal than the most normal worlds consistent with the evidence and (ii) the worlds at least as normal as w. In some applications, they understand normality in terms of probability. In Fig. 8, for example, they might say that the accessible worlds at \(w_{i}\) when \(w_{1}\) is consistent with the evidence are the union of \(\{w_{1}, w_{2}, w_{3}\}\) with any \(w_{j}\) where \(j \le i\). This theory differs from ours in the case where an agent’s evidence eliminates worlds besides the most normal world. For example, if the appearances rule out \(w_{2}\), our theory can allow \(w_{4}\) to be accessible at \(w_{3}\). Once \(w_{2}\) is eliminated, even \(w_{1}\) may access \(w_{4}\). By contrast, in Goodman and Salow (2018) the elimination of the second most normal world does not affect which worlds are significantly less normal than the most normal world. Guidance can then fail at the most normal world in a case where the evidence eliminates every world less normal: only the most normal world will be accessible, which may not itself be very probable.

  25. Of course, Positive Introspection only holds here in the presence of our simplifying assumptions, including Appearance Luminosity. It also only holds if the ordering over worlds (in our setting, the probabilities) is not sensitive to which world one is in.

  26. Stalnaker (2009, p. 406) defends cliff-edge knowledge. For a response, see Hawthorne and Magidor (2010, p. 1092). See also Weatherson (2013, p. 67).

  27. On the other hand, our model does permit cliff-edge knowledge in cases where the next real value would be significantly less likely than the actual one. We think this is the right prediction: in order for the probabilities to be this way, the agent would have to have experienced a large number of reliable appearances. In that case, they could plausibly discover the real value, at least in finite cases.

  28. We’ve been working with models where there are a finite number of possible values for the relevant quantity. It’s natural to wonder how our theory might look if we were to generalize it to the infinite case. In particular, what of a setting where precise values are represented and the possible values form a continuum? (As before, we do not consider cases where the apparent value is a range.) Obviously something needs tweaking here, for the simple reason that it’s natural to think that every precise value has probability 0, and so there won’t be any interesting differentiation of worlds into spheres, given that they are all equiprobable. But there’s a natural way to extend our theory to the infinite case, at least when the probability distribution is continuous. Each value will have the same probability, namely 0, but it is false that each value has the same probability density. By way of picture thinking, imagine one’s probability distribution to be represented by a curve, where the probability for each value range is proportional to the area under the curve. Then the probability density at a point will correspond to the height of the curve at the point, which corresponds to the limit of the values found by taking increasingly smaller segments of the curve centered on that point, and dividing that segment’s area by its width. Once we have probability densities for each point, we can proceed as before, building a sphere system on probability densities rather than probabilities. Then our theory can proceed as before.
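
    In symbols (our gloss on the picture thinking), the density at a value x is the limit of interval probabilities divided by interval widths:

    \[ f(x) \;=\; \lim_{\varepsilon \rightarrow 0} \frac{P([x - \varepsilon,\, x + \varepsilon])}{2\varepsilon}. \]

    The sphere system is then built by comparing the densities of worlds, just as finite probabilities were compared before.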

  29. Before concluding, it is worth comparing our model with that in Sect. 2, where an agent receives a single appearance. Our model generates that model as a special case, provided that we impose a few constraints on the prior. First, suppose that \(\frac{P(r - m \mid A)}{P(r \mid A)} = s\). Second, suppose that \(P(\{(r',A) : r' \in [r - m, r + m] \} \mid A) = g\). Third, suppose that the probability of a real value conditional on an appearance decreases monotonically with distance from the apparent value. In that case, our model produces the same predictions as the model in Sect. 2. When the real value is 50 and the apparent value is 50, the agent knows that the real value is within m of 50. When the real value is 45 and the apparent value is 50, the agent knows only that the real value is somewhere between \(45 - m\) and \(55 + m\). In our view, then, the plausibility of the distance-based theory is explained by the way that the distance-based theory encodes natural assumptions about the priors.

  30. For a helpful introduction to this literature, see Doya et al. (2011) and Trommershauser et al. (2011).

  31. See Goldstein and Hawthorne (2021) for defense of the role of normality in the theory of knowledge.

  32. Both our original model and the revised normality model allow that agents can know that some possibilities do not obtain before having any experiences, assuming that the relevant probability distribution is sufficiently biased against these possibilities prior to evidence (and, in the case of the normality model, assuming that these possibilities are sufficiently abnormal). It is not hard to motivate such prior knowledge. Consider the hypothesis that our life occurs in a hundred year bubble of relative normality embedded in a sea of chaos. If this hypothesis is epistemically possible in advance of experience, it is hard to see how later evidence could rule it out. See Bacon (2020) for further discussion of whether it is appropriate to posit this kind of prior knowledge. It is beyond the scope of the present paper to explore this interesting issue further.

  33. Thanks to Irene Bosco, Sam Carter, Stephanie Collins, Cian Dorr, and Jeremy Goodman.

References

  • Bacon, A. (2020). Inductive knowledge. Noûs, 54(2), 354–388.

  • Beddor, B., & Pavese, C. (2019). Modal virtue epistemology. Philosophy and Phenomenological Research, 45, 1–19.

  • Carter, S. (2019). Higher order ignorance inside the margins. Philosophical Studies, 176, 1–18.

  • Carter, S., & Goldstein, S. (2021). The normality of error. Philosophical Studies.

  • Cohen, S., & Comesaña, J. (2013). Williamson on Gettier cases and epistemic logic. Inquiry, 56, 15–29.

  • Dorr, C., Goodman, J., & Hawthorne, J. (2014). Knowing against the odds. Philosophical Studies, 170(2), 277–287.

  • Doya, K., Ishii, S., Pouget, A., & Rao, R. P. N. (Eds.). (2011). Bayesian brain: Probabilistic approaches to neural coding. MIT Press.

  • Dutant, J., & Rosenkranz, S. (2020). Inexact knowledge 2.0. Inquiry, 63(8), 812–830.

  • Garber, D. (1980). Field and Jeffrey conditionalization. Philosophy of Science, 47(1), 142–145.

  • Goldstein, S. (2021). Fragile knowledge. Mind.

  • Goldstein, S., & Hawthorne, J. (2021). Counterfactual contamination. Australasian Journal of Philosophy.

  • Goodman, J. (2013). Inexact knowledge without improbable knowing. Inquiry, 56(1), 30–53.

  • Goodman, J., & Salow, B. (2018). Taking a chance on KK. Philosophical Studies, 175(1), 183–196.

  • Goodman, J., & Salow, B. (2021). Knowledge from probability. In Proceedings of TARK XVIII: Eighteenth conference on theoretical aspects of rationality and knowledge, Beijing.

  • Hawthorne, J. (2021). The epistemic use of ought. In L. Walters & J. Hawthorne (Eds.), Conditionals, paradoxes, and probability: Themes from the philosophy of Dorothy Edgington. Oxford University Press.

  • Hawthorne, J., & Magidor, O. (2010). Assertion and epistemic opacity. Mind, 119(476), 1087–1105.

  • Hong, F. (in preparation). Uncertain knowledge. PhD thesis, University of Southern California.

  • Stalnaker, R. (2009). On Hawthorne and Magidor on assertion, context, and epistemic accessibility. Mind, 118(470), 399–409.

  • Trommershauser, J., Kording, K., & Landy, M. S. (Eds.). (2011). Sensory cue integration. Oxford University Press.

  • Weatherson, B. (2013). Margins and errors. Inquiry, 56(1), 63–76.

  • Williamson, T. (2000). Knowledge and its limits. Oxford University Press.

  • Williamson, T. (2013a). Gettier cases in epistemic logic. Inquiry, 56(1), 1–14.

  • Williamson, T. (2013b). Response to Cohen, Comesaña, Goodman, Nagel, and Weatherson on Gettier cases in epistemic logic. Inquiry, 56(1), 77–96.

  • Williamson, T. (2014). Very improbable knowing. Erkenntnis, 79, 971–999.

  • Williamson, T. (2017). Model-building in philosophy. In R. Blackford & D. Broderick (Eds.), Philosophy’s future. Oxford University Press.

Cite this article

Goldstein, S., Hawthorne, J. Knowledge from multiple experiences. Philos Stud 179, 1341–1372 (2022). https://doi.org/10.1007/s11098-021-01710-4
