Abstract
This paper proposes an analysis of surprise formulated in terms of proximity to the truth, to replace the probabilistic account of surprise. It is common to link surprise to the low (prior) probability of the outcome. The idea seems sensible because an outcome with a low probability is unexpected, and an unexpected outcome often surprises us. However, the link between surprise and low probability is known to break down in some cases. There have been some attempts to modify the probabilistic account to deal with these cases, but as we shall see, they are still faced with problems. The new analysis of surprise I propose turns to accuracy (proximity to the truth) and identifies an unexpected degree of inaccuracy as the reason for surprise. The shift from probability to proximity allows us to solve puzzles that strain the probabilistic account of surprise.
Notes
See, for example, recent quantitative formulations (McGrew 2003; Schupbach and Sprenger 2011; Crupi and Tentori 2012) of Peirce's idea that explanation eliminates surprise (Peirce 1931–35, 5: p. 189). These authors all assume that the degree of surprise is an inverse function of the (prior) probability.
See Harker (2012, p. 253) for a proposal along this line.
The raffle tickets may have numerals on them, but they are assigned only for the purpose of identification. We can replace them by letters, colors, shapes, etc. that represent no quantities.
If x is the greatest lower bound, no other quantity is a greatest lower bound; but if x is only a lower bound, any quantity smaller than x is also a lower bound.
Note also that adding a known (new) truth to a theory does not necessarily increase its overall verisimilitude if the theory to which it is added is false (has false logical consequences). Adding a truth to a false theory may increase its falsity-content more than its truth-content. As a result, we cannot even make a comparative judgment that the addition of a truth increases the verisimilitude of the theory.
I will not discuss a categorical statement (with no assignment of a probability) separately because a categorical statement can be considered a special case of a probability distribution where the entire probability is assigned to one member of the partition.
When the members xi in the ranked partition are the values of a continuous variable, we need the Continuous Ranked Probability Score instead of the Ranked Probability Score, but I restrict discussions to discrete cases here because an extension to the continuous cases does not change the substance of my analysis.
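The discrete Ranked Probability Score mentioned in this note has a standard definition: the sum, across the ranked partition, of squared differences between the cumulative forecast distribution and the cumulative outcome indicator. The following is a minimal sketch under that standard definition (the function name is mine, and the paper's own score may be normalized differently):

```python
# Illustrative sketch of the discrete Ranked Probability Score (RPS),
# assuming its standard definition: the sum of squared differences between
# the cumulative forecast distribution and the cumulative outcome indicator
# over a ranked partition. Names are illustrative, not the paper's.

def ranked_probability_score(probs, outcome_index):
    """RPS of the distribution `probs` over a ranked partition,
    given that the member at `outcome_index` is the actual outcome."""
    cum_p = 0.0   # cumulative forecast probability
    cum_o = 0.0   # cumulative outcome indicator (0 below the outcome, 1 at and above)
    score = 0.0
    for i, p in enumerate(probs):
        cum_p += p
        if i == outcome_index:
            cum_o = 1.0
        score += (cum_p - cum_o) ** 2
    return score
```

Unlike a score that looks only at the probability assigned to the actual outcome, the RPS is distance-sensitive: a forecast that concentrates probability near the actual member of the ranked partition counts as less inaccurate than one that concentrates it far away.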
See Roche and Shogenji (2018) for similar reasoning for Strict Propriety. Calling it “an epistemic reason” (p. 594), they argue that the prior probability distribution p(xi) is the most accurate in retrospect if and only if it is identical to the posterior (updated) probability distribution p(xi | y) given the new evidence y.
The point holds regardless of the choice of the scoring rule. Since p is equiprobable over XRAF = {x1, …, x1000}, the degree of inaccuracy is the same no matter which ticket wins, so the actual inaccuracy is exactly the expected inaccuracy.
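The claim can be checked numerically with any standard scoring rule; here a quadratic (Brier-style) score stands in for the paper's own rules, and the names are mine:

```python
# Numerical check (with the quadratic/Brier score standing in for the paper's
# scoring rules): under an equiprobable distribution, the actual inaccuracy
# is the same whichever ticket wins, so it equals the expected inaccuracy.

def brier_score(probs, outcome_index):
    """Quadratic inaccuracy of `probs` given the actual outcome."""
    return sum((p - (1.0 if i == outcome_index else 0.0)) ** 2
               for i, p in enumerate(probs))

n = 1000
p = [1.0 / n] * n  # equiprobable over the 1000 raffle tickets

# One score per possible winner; all are equal, so each equals the
# probability-weighted average (the expected inaccuracy).
scores = [brier_score(p, winner) for winner in range(n)]
expected = sum(p[i] * scores[i] for i in range(n))
```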
I am using here the ratio of the actual degree of inaccuracy to the expected degree of inaccuracy (instead of the difference between them) to measure the unexpectedness of the actual degree of inaccuracy. This is because the degree of inaccuracy, as measured by SRB or SRRPS, is on a ratio scale with a unique and non-arbitrary zero value (reached when the probability distribution is completely accurate) and with no finite upper bound.
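The ratio measure described in this note can be sketched as follows, again with the quadratic (Brier) score standing in for the paper's SRB; the function names and the example distribution are mine:

```python
# Sketch of the ratio measure of unexpected inaccuracy, with the quadratic
# (Brier) score standing in for the paper's scoring rules; names are
# illustrative only.

def brier_score(probs, outcome_index):
    """Quadratic inaccuracy of `probs` given the actual outcome."""
    return sum((p - (1.0 if i == outcome_index else 0.0)) ** 2
               for i, p in enumerate(probs))

def expected_inaccuracy(probs):
    """Inaccuracy the forecaster expects: each possible outcome's score
    weighted by that outcome's own probability."""
    return sum(probs[i] * brier_score(probs, i) for i in range(len(probs)))

def inaccuracy_ratio(probs, outcome_index):
    """Actual inaccuracy divided by expected inaccuracy; values well
    above 1 mark an unexpectedly inaccurate (surprising) outcome."""
    return brier_score(probs, outcome_index) / expected_inaccuracy(probs)
```

For a distribution such as [0.9, 0.05, 0.05], the highly probable outcome yields a ratio below 1 (less inaccurate than expected), while an improbable outcome yields a ratio well above 1.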
I am using the case of five out of five here, instead of four out of four used earlier, because the degree of inaccuracy for five out of five is closer (than four out of four) to the degree of inaccuracy for 53 out of 100. Suitable examples are different for comparing probabilities and comparing inaccuracies because the degree of inaccuracy (as measured by the Ranked Probability Score) is not a simple function of the probability assigned to the outcome.
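The comparison in this note can be reproduced numerically, assuming the fair-coin hypothesis assigns a Binomial(n, 1/2) distribution over the ranked partition of possible head counts {0, …, n} and inaccuracy is measured by the standard discrete Ranked Probability Score:

```python
# Numerical check of the footnote's comparison, assuming the fair-coin
# hypothesis assigns Binomial(n, 1/2) over the head counts {0, ..., n}
# and inaccuracy is the standard (discrete) Ranked Probability Score.
from math import comb

def rps_fair_coin(n, heads):
    """RPS of the Binomial(n, 1/2) forecast given `heads` observed heads."""
    score, cum_p = 0.0, 0.0
    for j in range(n + 1):
        cum_p += comb(n, j) / 2 ** n   # binomial CDF built up term by term
        cum_o = 1.0 if j >= heads else 0.0
        score += (cum_p - cum_o) ** 2
    return score

# Gap in inaccuracy between all-heads runs and 53 heads out of 100:
gap_five = abs(rps_fair_coin(5, 5) - rps_fair_coin(100, 53))
gap_four = abs(rps_fair_coin(4, 4) - rps_fair_coin(100, 53))
```

Running this confirms the note's point: the inaccuracy for five heads out of five lies closer to the inaccuracy for 53 heads out of 100 than the inaccuracy for four out of four does.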
The exact H-ratio of one half (more generally, the exact match between the probability and the observed frequency ratio) looks particularly suspect for two reasons. First, it makes the degree of accuracy not just better than expected but the best possible. Second, the exact match invites a possible alternative explanation: manipulation. As Horwich points out (see Sect. 2 above), the recognition of a possible alternative explanation shakes our confidence in the default assumption. When the H-ratio is close to one half but not exactly one half, a possible alternative explanation of the departure from expectation is less obvious.
This point is consistent with the fact that the most probable outcome is unsurprising in many cases. It is true that the most probable member of a non-ranked partition (to set aside the additional complication for a ranked partition) makes the probability distribution the least inaccurate, and thus makes the degree of inaccuracy less than expected. Note, however, that the probability distribution plays two roles. It determines the degree of inaccuracy SR(p, i) for the outcome xi, but it also determines the weights for calculating the expected inaccuracy. Since the most probable member receives the greatest weight, the expected inaccuracy is pulled closer to its degree of inaccuracy. In order for the most probable outcome to be surprising, the partition must consist of many members and the outcome in question must not be so probable as to pull the expected inaccuracy close to its degree of inaccuracy.
These are all cases of the ex post evaluation of the hypothesis based on the actual outcome, but we can extend the idea to ex ante evaluation: We may seek to minimize the estimated inaccuracy of the hypothesis in the ex ante selection of a probabilistic hypothesis. This is a familiar theme in statistical learning theory (see Burnham and Anderson 2002 for an overview).
There is the proposal of the “accuracy first” approach (Leitgeb and Pettigrew 2010a, b; Pettigrew 2016) to use accuracy as the ground for the basic principles of the probability calculus, but what I have in mind here is the use of proximity to the truth in solving some puzzles in epistemology and philosophy of science. See Shogenji (2018) for an instance of such attempts.
References
Baras, D. (2019). Why do certain states of affairs call out for explanation? A critique of two Horwichian accounts. Philosophia, 47(5), 1405–1419.
Burnham, K. P., & Anderson, D. R. (2002). Model selection and multimodel inference: A practical information-theoretic approach. New York: Springer.
Crupi, V., & Tentori, K. (2012). A second look at the logic of explanatory power (with two novel representation theorems). Philosophy of Science, 79(3), 365–385.
Gneiting, T., & Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477), 359–378.
Good, I. J. (1984). A Bayesian approach in the philosophy of inference. British Journal for the Philosophy of Science, 35(2), 161–173.
Harker, D. (2012). A surprise for Horwich (and some advocates of the fine-tuning argument (which does not include Horwich (as far as I know))). Philosophical Studies, 161(2), 247–261.
Horwich, P. (1982). Probability and evidence. Cambridge: Cambridge University Press.
Landy, D., Silbert, N., & Goldin, A. (2013). Estimating large numbers. Cognitive Science, 37(5), 775–799.
Leitgeb, H., & Pettigrew, R. (2010a). An objective justification of Bayesianism I: Measuring inaccuracy. Philosophy of Science, 77(2), 201–235.
Leitgeb, H., & Pettigrew, R. (2010b). An objective justification of Bayesianism II: The consequences of minimizing inaccuracy. Philosophy of Science, 77(2), 236–272.
Levinstein, B. A. (2012). Leitgeb and Pettigrew on accuracy and updating. Philosophy of Science, 79(3), 413–424.
Manson, N. A., & Thrush, M. J. (2003). Fine-tuning, multiple universes, and the “This Universe” objection. Pacific Philosophical Quarterly, 84(1), 67–83.
McGrew, T. (2003). Confirmation, heuristics, and explanatory reasoning. The British Journal for the Philosophy of Science, 54(4), 553–567.
Miller, D. (1974). Popper’s qualitative theory of verisimilitude. British Journal for the Philosophy of Science, 25(2), 166–177.
Oddie, G. (2014). Truthlikeness. Retrieved July 1, 2020, from Stanford Encyclopedia of Philosophy: https://plato.stanford.edu/entries/truthlikeness/
Olsson, E. (2002). Corroborating testimony, probability and surprise. British Journal for the Philosophy of Science, 53(2), 273–288.
Peirce, C. S. (1931–35). The collected papers of Charles Sanders Peirce (C. Hartshorne & P. Weiss, Eds., Vols. 1–6). Cambridge, MA: Harvard University Press.
Pettigrew, R. (2016). Accuracy and the laws of credence. Oxford: Oxford University Press.
Popper, K. (1963). Conjectures and refutations. London: Routledge.
Roche, W., & Shogenji, T. (2018). Information and inaccuracy. British Journal for the Philosophy of Science, 69(2), 577–604.
Schlesinger, G. (1991). The sweep of probability. Notre Dame: University of Notre Dame Press.
Schupbach, J. N., & Sprenger, J. (2011). The logic of explanatory power. Philosophy of Science, 78(1), 105–127.
Shogenji, T. (2018). Formal epistemology and Cartesian skepticism: In defense of belief in the natural world. Abingdon: Routledge.
Tichý, P. (1974). On Popper's definitions of verisimilitude. The British Journal for the Philosophy of Science, 25(2), 155–160.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124–1131.
White, R. (2000). Fine-tuning and multiple universes. Noûs, 34(2), 260–276.
Winkler, R. L., & Jose, V. R. (2010). Scoring rules. In J. J. Cochran (Ed.), Wiley encyclopedia of operations research and management science. Hoboken, NJ: Wiley.
Acknowledgements
Precursors of this paper were presented at Chinese Academy of Social Sciences, Rhode Island College, the University of Turin, and the University of Groningen. I would like to thank the audiences at these institutions for valuable comments. I would also like to thank Matt Duncan and William Roche for carefully reading an earlier version and making many suggestions for improvement.
Cite this article
Shogenji, T. Probability and proximity in surprise. Synthese 198, 10939–10957 (2021). https://doi.org/10.1007/s11229-020-02761-6