Abstract
This paper proposes an analysis of surprise formulated in terms of proximity to the truth, to replace the probabilistic account of surprise. It is common to link surprise to the low (prior) probability of the outcome. The idea seems sensible because an outcome with a low probability is unexpected, and an unexpected outcome often surprises us. However, the link between surprise and low probability is known to break down in some cases. There have been some attempts to modify the probabilistic account to deal with these cases, but as we shall see, they are still faced with problems. The new analysis of surprise I propose turns to accuracy (proximity to the truth) and identifies an unexpected degree of inaccuracy as the reason for surprise. The shift from probability to proximity allows us to solve puzzles that strain the probabilistic account of surprise.
Notes
See, for example, recent quantitative formulations (McGrew 2003; Schupbach and Sprenger 2011; Crupi and Tentori 2012) of Peirce's idea that explanation eliminates surprise (Peirce 1931–35, 5: p. 189). These authors all assume that the degree of surprise is an inverse function of the (prior) probability.
See Harker (2012, p. 253) for a proposal along this line.
The raffle tickets may have numerals on them, but they are assigned only for the purpose of identification. We can replace them by letters, colors, shapes, etc. that represent no quantities.
If x is the greatest lower bound, no other quantity is a greatest lower bound; but if x is only a lower bound, any quantity smaller than x is also a lower bound.
Note also that adding a known (new) truth to a theory does not necessarily increase its overall verisimilitude if the theory to which it is added is false (has false logical consequences). Adding a truth to a false theory may increase its falsity-content more than its truth-content. As a result, we cannot even make a comparative judgment that the addition of a truth increases the verisimilitude of the theory.
I will not discuss a categorical statement (with no assignment of a probability) separately because a categorical statement can be considered a special case of a probability distribution where the entire probability is assigned to one member of the partition.
When the members xi in the ranked partition are the values of a continuous variable, we need the Continuous Ranked Probability Score instead of the Ranked Probability Score, but I restrict discussions to discrete cases here because an extension to the continuous cases does not change the substance of my analysis.
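The discrete Ranked Probability Score mentioned in this note has a standard definition: the sum, across the ranked partition, of squared differences between the cumulative forecast distribution and the cumulative outcome indicator. The following is a minimal sketch under that standard definition (the function name is mine, and the paper's own score may be normalized differently):

```python
# Illustrative sketch of the discrete Ranked Probability Score (RPS),
# assuming its standard definition: the sum of squared differences between
# the cumulative forecast distribution and the cumulative outcome indicator
# over a ranked partition. Names are illustrative, not the paper's.

def ranked_probability_score(probs, outcome_index):
    """RPS of the distribution `probs` over a ranked partition,
    given that the member at `outcome_index` is the actual outcome."""
    cum_p = 0.0   # cumulative forecast probability
    cum_o = 0.0   # cumulative outcome indicator (0 below the outcome, 1 at and above)
    score = 0.0
    for i, p in enumerate(probs):
        cum_p += p
        if i == outcome_index:
            cum_o = 1.0
        score += (cum_p - cum_o) ** 2
    return score
```

Unlike a score that looks only at the probability assigned to the actual outcome, the RPS is distance-sensitive: a forecast that concentrates probability near the actual member of the ranked partition counts as less inaccurate than one that concentrates it far away.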
See Roche and Shogenji (2018) for similar reasoning for Strict Propriety. Calling it “an epistemic reason” (p. 594), they argue that the prior probability distribution p(xi) is the most accurate in retrospect if and only if it is identical to the posterior (updated) probability distribution p(xi | y) given the new evidence y.
The point holds regardless of the choice of the scoring rule. Since p is equiprobable over XRAF = {x1, …, x1000}, the degree of inaccuracy is the same no matter which ticket wins, so the actual inaccuracy is exactly the expected inaccuracy.
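The claim can be checked numerically with any standard scoring rule; here a quadratic (Brier-style) score stands in for the paper's own rules, and the names are mine:

```python
# Numerical check (with the quadratic/Brier score standing in for the paper's
# scoring rules): under an equiprobable distribution, the actual inaccuracy
# is the same whichever ticket wins, so it equals the expected inaccuracy.

def brier_score(probs, outcome_index):
    """Quadratic inaccuracy of `probs` given the actual outcome."""
    return sum((p - (1.0 if i == outcome_index else 0.0)) ** 2
               for i, p in enumerate(probs))

n = 1000
p = [1.0 / n] * n  # equiprobable over the 1000 raffle tickets

# One score per possible winner; all are equal, so each equals the
# probability-weighted average (the expected inaccuracy).
scores = [brier_score(p, winner) for winner in range(n)]
expected = sum(p[i] * scores[i] for i in range(n))
```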
I am using here the ratio of the actual degree of inaccuracy to the expected degree of inaccuracy (instead of the difference between them) to measure the unexpectedness of the actual degree of inaccuracy. This is because the degree of inaccuracy, as measured by SRB or SRRPS, is on a ratio scale with a unique and non-arbitrary zero value (reached when the probability distribution is completely accurate) and with no finite upper bound.
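The ratio measure described in this note can be sketched as follows, again with the quadratic (Brier) score standing in for the paper's SRB; the function names and the example distribution are mine:

```python
# Sketch of the ratio measure of unexpected inaccuracy, with the quadratic
# (Brier) score standing in for the paper's scoring rules; names are
# illustrative only.

def brier_score(probs, outcome_index):
    """Quadratic inaccuracy of `probs` given the actual outcome."""
    return sum((p - (1.0 if i == outcome_index else 0.0)) ** 2
               for i, p in enumerate(probs))

def expected_inaccuracy(probs):
    """Inaccuracy the forecaster expects: each possible outcome's score
    weighted by that outcome's own probability."""
    return sum(probs[i] * brier_score(probs, i) for i in range(len(probs)))

def inaccuracy_ratio(probs, outcome_index):
    """Actual inaccuracy divided by expected inaccuracy; values well
    above 1 mark an unexpectedly inaccurate (surprising) outcome."""
    return brier_score(probs, outcome_index) / expected_inaccuracy(probs)
```

For a distribution such as [0.9, 0.05, 0.05], the highly probable outcome yields a ratio below 1 (less inaccurate than expected), while an improbable outcome yields a ratio well above 1.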
I am using the case of five out of five here, instead of four out of four used earlier, because the degree of inaccuracy for five out of five is closer (than four out of four) to the degree of inaccuracy for 53 out of 100. Suitable examples are different for comparing probabilities and comparing inaccuracies because the degree of inaccuracy (as measured by the Ranked Probability Score) is not a simple function of the probability assigned to the outcome.
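The comparison in this note can be reproduced numerically, assuming the fair-coin hypothesis assigns a Binomial(n, 1/2) distribution over the ranked partition of possible head counts {0, …, n} and inaccuracy is measured by the standard discrete Ranked Probability Score:

```python
# Numerical check of the footnote's comparison, assuming the fair-coin
# hypothesis assigns Binomial(n, 1/2) over the head counts {0, ..., n}
# and inaccuracy is the standard (discrete) Ranked Probability Score.
from math import comb

def rps_fair_coin(n, heads):
    """RPS of the Binomial(n, 1/2) forecast given `heads` observed heads."""
    score, cum_p = 0.0, 0.0
    for j in range(n + 1):
        cum_p += comb(n, j) / 2 ** n   # binomial CDF built up term by term
        cum_o = 1.0 if j >= heads else 0.0
        score += (cum_p - cum_o) ** 2
    return score

# Gap in inaccuracy between all-heads runs and 53 heads out of 100:
gap_five = abs(rps_fair_coin(5, 5) - rps_fair_coin(100, 53))
gap_four = abs(rps_fair_coin(4, 4) - rps_fair_coin(100, 53))
```

Running this confirms the note's point: the inaccuracy for five heads out of five lies closer to the inaccuracy for 53 heads out of 100 than the inaccuracy for four out of four does.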
The exact H-ratio of one half (more generally, the exact match between the probability and the observed frequency ratio) looks particularly suspect for two reasons. First, it makes the degree of accuracy not just better than expected but the best possible. Second, the exact match invites a possible alternative explanation: manipulation. As Horwich points out (see Sect. 2 above), the recognition of a possible alternative explanation shakes our confidence in the default assumption. When the H-ratio is close to one half but not exactly one half, a possible alternative explanation of the departure from expectation is less obvious.
This point is consistent with the fact that the most probable outcome is unsurprising in many cases. It is true that the most probable member of a non-ranked partition (to set aside the additional complication for a ranked partition) makes the probability distribution the least inaccurate, and thus makes the degree of inaccuracy less than expected. Note, however, that the probability distribution plays two roles. It determines the degree of inaccuracy SR(p, i) for the outcome xi, but it also determines the weights for calculating the expected inaccuracy. Since the most probable member receives the greatest weight, the expected inaccuracy is pulled closer to its degree of inaccuracy. In order for the most probable outcome to be surprising, the partition must consist of many members and the outcome in question must not be so probable as to pull the expected inaccuracy close to its degree of inaccuracy.
These are all cases of the ex post evaluation of the hypothesis based on the actual outcome, but we can extend the idea to ex ante evaluation: We may seek to minimize the estimated inaccuracy of the hypothesis in the ex ante selection of a probabilistic hypothesis. This is a familiar theme in statistical learning theory (see Burnham and Anderson 2002 for an overview).
There is the proposal of the “accuracy first” approach (Leitgeb and Pettigrew 2010a, b; Pettigrew 2016) to use accuracy as the ground for the basic principles of the probability calculus, but what I have in mind here is the use of proximity to the truth in solving some puzzles in epistemology and philosophy of science. See Shogenji (2018) for an instance of such attempts.
References
Baras, D. (2019). Why do certain states of affairs call out for explanation? A critique of two Horwichian accounts. Philosophia, 47(5), 1405–1419.
Burnham, K. P., & Anderson, D. R. (2002). Model selection and multimodel inference: A practical information-theoretic approach. New York: Springer.
Crupi, V., & Tentori, K. (2012). A second look at the logic of explanatory power (with two novel representation theorems). Philosophy of Science, 79(3), 365–385.
Gneiting, T., & Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477), 359–378.
Good, I. J. (1984). A Bayesian approach in the philosophy of inference. British Journal for the Philosophy of Science, 35(2), 161–173.
Harker, D. (2012). A surprise for Horwich (and some advocates of the fine-tuning argument (which does not include Horwich (as far as I know))). Philosophical Studies, 161(2), 247–261.
Horwich, P. (1982). Probability and evidence. Cambridge: Cambridge University Press.
Landy, D., Silbert, N., & Goldin, A. (2013). Estimating large numbers. Cognitive Science, 37(5), 775–799.
Leitgeb, H., & Pettigrew, R. (2010a). An objective justification of Bayesianism I: Measuring inaccuracy. Philosophy of Science, 77(2), 201–235.
Leitgeb, H., & Pettigrew, R. (2010b). An objective justification of Bayesianism II: The consequences of minimizing inaccuracy. Philosophy of Science, 77(2), 236–272.
Levinstein, B. A. (2012). Leitgeb and Pettigrew on accuracy and updating. Philosophy of Science, 79(3), 413–424.
Manson, N. A., & Thrush, M. J. (2003). Fine-tuning, multiple universes, and the “This Universe” objection. Pacific Philosophical Quarterly, 84(1), 67–83.
McGrew, T. (2003). Confirmation, heuristics, and explanatory reasoning. The British Journal for the Philosophy of Science, 54(4), 553–567.
Miller, D. (1974). Popper’s qualitative theory of verisimilitude. British Journal for the Philosophy of Science, 25(2), 166–177.
Oddie, G. (2014). Truthlikeness. Retrieved July 1, 2020, from Stanford Encyclopedia of Philosophy: https://plato.stanford.edu/entries/truthlikeness/
Olsson, E. (2002). Corroborating testimony, probability and surprise. British Journal for the Philosophy of Science, 53(2), 273–288.
Peirce, C. S. (1931–35). The collected papers of Charles Sanders Peirce (C. Hartshorne & P. Weiss, Eds., Vols. 1–6). Cambridge, MA: Harvard University Press.
Pettigrew, R. (2016). Accuracy and the laws of credence. Oxford: Oxford University Press.
Popper, K. (1963). Conjectures and refutations. London: Routledge.
Roche, W., & Shogenji, T. (2018). Information and inaccuracy. British Journal for the Philosophy of Science, 69(2), 577–604.
Schlesinger, G. (1991). The sweep of probability. Notre Dame: University of Notre Dame Press.
Schupbach, J. N., & Sprenger, J. (2011). The logic of explanatory power. Philosophy of Science, 78(1), 105–127.
Shogenji, T. (2018). Formal epistemology and Cartesian skepticism: In defense of belief in the natural world. Abingdon: Routledge.
Tichý, P. (1974). On Popper's definitions of verisimilitude. The British Journal for the Philosophy of Science, 25(2), 155–160.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124–1131.
White, R. (2000). Fine-tuning and multiple universes. Noûs, 34(2), 260–276.
Winkler, R. L., & Jose, V. R. (2010). Scoring rules. In J. J. Cochran (Ed.), Wiley encyclopedia of operations research and management science. Hoboken, NJ: Wiley.
Acknowledgements
Precursors of this paper were presented at Chinese Academy of Social Sciences, Rhode Island College, the University of Turin, and the University of Groningen. I would like to thank the audiences at these institutions for valuable comments. I would also like to thank Matt Duncan and William Roche for carefully reading an earlier version and making many suggestions for improvement.
Cite this article
Shogenji, T. Probability and proximity in surprise. Synthese 198, 10939–10957 (2021). https://doi.org/10.1007/s11229-020-02761-6