Abstract
In this paper I argue that de Finetti provided compelling reasons for rejecting countable additivity. It is ironical therefore that the main argument advanced by Bayesians against following his recommendation is based on the consistency criterion, coherence, he himself developed. I will show that this argument is mistaken. Nevertheless, there remain some counter-intuitive consequences of rejecting countable additivity, and one in particular has all the appearances of a full-blown paradox. I will end by arguing that in fact it is no paradox, and that what it shows is that conditionalisation, often claimed to be integral to the Bayesian canon, has to be rejected as a general rule in a finitely additive environment.
Notes
An algebra of subsets of a set S contains S and is closed under the finite Boolean operations. A \(\upsigma \)-algebra is closed under denumerable union (and hence intersection). The subsets can be regarded as events or propositions; in the latter case, extensionally as classes of possibilities however these might be defined formally. Viewed propositionally, S is the necessary truth, also written T, and \(\emptyset \) the necessary falsehood \(\bot \). Since this paper mostly concerns Bayesian probabilities I will tend to use an explicitly propositional terminology.
The partial sums \(\sum \nolimits _{\mathrm{i}=1}^{\mathrm{n}}{\mathrm{P}}({\mathrm{B}}_{\mathrm{i}})\) form a monotone sequence bounded by one, so the limit exists.
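The step behind this note can be spelled out as follows. This is only a sketch, and it assumes that the \({\mathrm{B}}_{\mathrm{i}}\) are pairwise disjoint, as in the usual statement of countable additivity; the symbol \(s_n\) for the n-th partial sum is introduced here purely for illustration. Only finite additivity and non-negativity are used.

```latex
\[
s_n \;=\; \sum_{i=1}^{n} P(B_i) \;=\; P\Bigl(\bigcup_{i=1}^{n} B_i\Bigr) \;\le\; P(S) \;=\; 1,
\qquad
s_{n+1} - s_n \;=\; P(B_{n+1}) \;\ge\; 0,
\]
\[
\text{so } (s_n) \text{ is nondecreasing and bounded above, and }
\lim_{n\to\infty} s_n \;=\; \sup_n s_n \;\le\; 1 .
\]
```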
Though it may be suggested by independent considerations. Oxtoby cites the fact that for \({\mathrm{n}} >3\) Lebesgue measure is the only finitely additive measure on the bounded measurable subsets of \({\mathbb{R }}^{\mathrm{n}}\) that normalizes the unit cube and is isometry-invariant (1984, p. 221).
1950, p. 37.
Well, up to a point. ‘The’ class of hyperreals (comprising infinitesimal and infinite – reciprocally infinitesimal – numbers, together with representatives of the reals, and having all the algebraic properties of the reals themselves) is rather strongly non-unique. In the ultrapower construction, for example, it depends on the choice of a non-principal ultrafilter. Without the continuum hypothesis the classes of hyperreals need not even be order-isomorphic. By contrast, the real and natural numbers are determined up to isomorphism by second-order axiomatisations, and hence in every model of set theory.
Wenmackers and Horsten (2013).
Anyone who wishes to read more should consult Easwaran’s comprehensive discussion (2013).
Kelly (1996), p. 323.
Chen (1977). It is well known that finitary versions of the ‘Strong’ theorems can be proved under FA without additional constraints. Thus the Strong Law of Large Numbers becomes this: for every \(\upvarepsilon , \updelta >0\) there is an \({\mathrm{n}}_{0}\) such that for every \({\mathrm{n}}>{\mathrm{n}}_{0}\) and \({\mathrm{k}}>0\), \({\mathrm{P}}(\bigcap \nolimits _{\mathrm{j}=1}^{\mathrm{k}} \{ \vert {\mathrm{S}}_{\mathrm{n+j}}-{\mathrm{S}}_{\mathrm{n}}\vert <\upvarepsilon \} ) \ge 1-\updelta \), where \({\mathrm{I}}_{\mathrm{A}}\) is the indicator function of A and \({\mathrm{S}}_{\mathrm{n}}={\mathrm{n}}^{-1}\sum \nolimits _{\mathrm{i=1}}^{\mathrm{n}} {\mathrm{I}}_{\mathrm{A}}\) (for a corresponding version of the Law of the Iterated Logarithm see Epifani and Lijoi 1997, Theorem 3). A detailed account of how measure-theoretic theorems can be approximated in an FA environment is contained in the Bhaskara Raos’ book (1983). Oxtoby’s review (1984) provides more details and an interesting commentary.
1988.
I take this convenient terminology from Wenmackers and Horsten (2013).
1975, Theorem 1.
Kolmogorov (1950), pp. 47–52.
The point is noted in Milne (1990), p. 117.
Kadane et al. (1986), p. 70, Example 6.1.
Billingsley (1995), p. 458, 33.28.
1950, pp. 50, 51.
De Finetti claimed that the Borel paradox can be seen as an example of nonconglomerability with respect to an uncountable partition. Noting that the \((1/2){\mathrm{cos}}\uplambda \) conditional density can’t be consistently applied to great circles intersecting a meridian circle, he argued that the ‘natural’ (his term) conditional density of \(\uplambda \), for any given value of \(\upvarphi \) picking out the corresponding meridian circle, is the uniform distribution \(\uppi ^{-1}\) (de Finetti 1972, p. 204). That granted, the unconditional probability of the event \({\vert }\uplambda {\vert }<\uppi /4\) is \(1/ \sqrt{2}\), strictly greater than its probability (1/2) given each \(\upvarphi ,\; 0\le \upvarphi <2\uppi \). But this strategy is not consistent under CA, where, given the uniform density distribution over the surface of the sphere, the probabilities are conglomerable: \({\mathrm{P}}(\uplambda ) = {\mathrm{P}}(\uplambda {\vert }\upvarphi ) = (1/2){\mathrm{cos}}\uplambda ,\; 0\le \upvarphi <2\uppi \).
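The two figures quoted in this note can be checked by a short calculation. The sketch below assumes that the latitude \(\uplambda \) ranges over \([-\uppi /2, \uppi /2]\) and writes \(a\) (a symbol introduced here) for the bound on \({\vert }\uplambda {\vert }\); the values \(1/\sqrt{2}\) and \(1/2\) are the ones obtained for \(a = \uppi /4\).

```latex
\[
P(|\lambda| < a) \;=\; \int_{-a}^{a} \tfrac{1}{2}\cos\lambda \, d\lambda \;=\; \sin a ,
\qquad
P(|\lambda| < a \mid \varphi) \;=\; \int_{-a}^{a} \frac{1}{\pi}\, d\lambda \;=\; \frac{2a}{\pi} ,
\qquad 0 < a \le \frac{\pi}{2}.
\]
\[
a = \frac{\pi}{4}: \qquad
P(|\lambda| < \pi/4) \;=\; \sin\frac{\pi}{4} \;=\; \frac{1}{\sqrt{2}} \;\approx\; 0.707
\;>\; \frac{1}{2} \;=\; P(|\lambda| < \pi/4 \mid \varphi)
\quad \text{for every } \varphi .
\]
```

Since the conditional value is the same for every \(\upvarphi \) and differs from the unconditional one, conglomerability fails in the partition generated by \(\upvarphi \) once the uniform conditional density is adopted.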
1950, p. 51.
1950, p. 4.
This recalls the so-called ‘Cournot’s Rule’ after the nineteenth-century French philosopher, mathematician and general savant A.A. Cournot who declared that small enough probabilities can be regarded as practically, or morally, impossible. A strict interpretation of such a rule would of course make it impossible to flip a fair coin too many times! There is, however, nothing necessarily wrong with pragmatically accepting that an event of very small probability is practically certain not to occur, so long as you do not also close off under finite conjunctions: otherwise you get the Kyburg paradox (see below, p.).
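The danger of closing acceptance under finite conjunctions can be illustrated with a small finite lottery. The following is a minimal sketch, assuming a fair lottery with exactly one winning ticket; the function names and the acceptance threshold are hypothetical and chosen only for illustration. Each proposition "ticket i loses" clears the threshold, yet their conjunction has probability zero.

```python
from fractions import Fraction


def probability(event, n_tickets):
    """Probability of `event` when exactly one of n_tickets wins, all equally likely.

    `event` is a predicate on the index of the winning ticket.
    """
    favourable = sum(1 for winner in range(n_tickets) if event(winner))
    return Fraction(favourable, n_tickets)


def kyburg_check(n_tickets, threshold):
    """Return (every 'ticket i loses' is accepted, probability that all tickets lose)."""
    # Probability of each proposition "ticket i loses".
    single = [
        probability(lambda winner, i=i: winner != i, n_tickets)
        for i in range(n_tickets)
    ]
    # Probability of the conjunction "every ticket loses".
    conjunction = probability(
        lambda winner: all(winner != i for i in range(n_tickets)), n_tickets
    )
    accepted_individually = all(p >= threshold for p in single)
    return accepted_individually, conjunction


if __name__ == "__main__":
    accepted, conjunction = kyburg_check(200, Fraction(99, 100))
    print("each 'ticket i loses' accepted:", accepted)
    print("probability that all tickets lose:", conjunction)
```

Exact fractions are used so that the comparison with the threshold involves no rounding.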
1972, pp. 89–90. De Finetti himself was far from advocating frequentism, however; on the contrary, he was notorious for denying that objective probabilities of any stripe have any place in empirical science.
1763.
Halmos (1950), §49, Theorem B.
This granted, Earman’s portrayal of the ‘almost everywhere’ convergence theorems as the best Bayesian answer to the claims of formal learning theory (1992, Chap. 7) seems somewhat misconceived.
Kelly (1996), p. 328.
He is echoed by Kelly: ‘Such an axiom should be subject to the highest degree of philosophical scrutiny. Mere technical convenience cannot justify it.’ (1996, p. 323)
1972, p. 92.
The Axiom of Choice is essential to this result: a celebrated theorem of Solovay (1970) shows that, without it, it is consistent (assuming the existence of an inaccessible cardinal) that every subset of [0,1] is Lebesgue measurable.
Jech (1997), pp. 297–303.
1972, p. 79.
The reason for two ‘or so it seems’ qualifications in successive sentences will become clearer later.
1972, p. 91.
For example Maher: ‘de Finetti cannot consistently reject countable additivity’, 1993, p. 200.
de Finetti (1972), p. 84; emphasis in the original. There is a subtlety involved in the ‘uniformly’ which it isn’t necessary to go into here.
1974, p. 85.
The stipulation is presented in de Finetti (1972) in the form of a definition of a bet ‘fair with respect to a probability function’ (p. 77).
In de Finetti’s fully operationalist account the uniqueness of p might seem unproblematic because you are compelled to choose a single number (1974, pp. 87, 88); but that only serves to conceal the problem, because there is a well-known theorem that in choosing a value of p that does not represent your true degree of belief in A you increase your expected penalty.
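The theorem alluded to here is, for the quadratic rule, the familiar fact that the expected penalty is smallest when the announced value equals the agent's actual degree of belief. The sketch below is only an illustration under that assumption: `credence` stands for the true degree of belief in A and `announced` for the value of p chosen, both names being hypothetical. Algebraically the expected penalty equals credence × (1 − credence) + (announced − credence)², which the grid search confirms numerically.

```python
def expected_quadratic_penalty(announced, credence):
    """Expected value of (I_A - announced)**2 when the agent's degree of belief in A is `credence`."""
    return credence * (1 - announced) ** 2 + (1 - credence) * announced ** 2


def best_announcement(credence, steps=1001):
    """Grid search over [0, 1] for the announced value with the smallest expected penalty."""
    grid = [i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda p: expected_quadratic_penalty(p, credence))


if __name__ == "__main__":
    q = 0.3  # hypothetical true degree of belief in A
    for p in (0.1, 0.3, 0.5):
        print(p, expected_quadratic_penalty(p, q))
    print("minimiser on the grid:", best_announcement(q))
```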
1974, p. 81.
1972, p. 77.
1972, p. 91.
As is explained clearly in Weintraub (2001).
The analogy between a sufficiently large and an infinite lottery is noted and developed in a different way, using non-standard analysis, by Sylvia Wenmackers (2011, pp. 93–94).
‘[de Finetti’s criticisms of CA] led him to the notion of coherence’ (Berti et al., 2007, p. 315).
E.g., the penalty imposed by a quadratic scoring rule on any coherent set of previsions cannot be uniformly reduced (1974, pp. 88–89).
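A two-event illustration may help here. The sketch below assumes previsions x for A and y for not-A, scored by the quadratic rule in each of the two possible cases; all function names are hypothetical. It shows the complementary direction as well: incoherent previsions (x + y ≠ 1) are uniformly improved by projecting onto the coherent line x + y = 1, while a grid search finds no uniform improvement on a coherent pair. The grid search is a numerical check, not a proof.

```python
def penalties(x, y):
    """Quadratic penalties for previsions x (for A) and y (for not-A), in each possible case."""
    if_a = (1 - x) ** 2 + (0 - y) ** 2       # A turns out true
    if_not_a = (0 - x) ** 2 + (1 - y) ** 2   # A turns out false
    return if_a, if_not_a


def project_to_coherent(x, y):
    """Orthogonal projection of (x, y) onto the line x + y = 1."""
    shift = (1 - x - y) / 2
    return x + shift, y + shift


def uniformly_better(new, old):
    """True if `new` has a strictly smaller penalty than `old` in every case."""
    return all(n < o for n, o in zip(penalties(*new), penalties(*old)))


def can_be_uniformly_reduced(x, y, steps=101):
    """Search a grid on [0, 1]^2 for previsions that beat (x, y) in every case."""
    grid = [i / (steps - 1) for i in range(steps)]
    return any(uniformly_better((a, b), (x, y)) for a in grid for b in grid)


if __name__ == "__main__":
    incoherent = (0.6, 0.6)
    print(penalties(*incoherent), penalties(*project_to_coherent(*incoherent)))
    print("coherent (0.3, 0.7) reducible on the grid:", can_be_uniformly_reduced(0.3, 0.7))
```

The projection works because both possible outcomes, (1, 0) and (0, 1), lie on the line x + y = 1, so moving a point onto that line brings it closer to each of them.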
op. cit. p. 11.
Ibid., p. 16.
de Finetti (1974), vol. 1, p. 215. Similar remarks are scattered throughout his writings.
de Finetti (1936).
Coletti and Scozzafava (2002), p. 76.
Under the rather imposing title of Bayesian epistemology.
1972, p. 205. De Finetti tells us that Dubins presented the example in a letter to L.J. Savage.
Kadane et al. (1996).
Using an example formally identical to Dubins’s, Ross concludes that ‘in virtue of nonconglomerable credences, [Sleeping Beauty] will be vulnerable to a legitimate Dutch Book strategy’ (2010, p. 435). He also tells us that there is a Dutch Book argument for CA (p. 439). If what one might call the First Bayesian Era (up to the nineteen twenties) was characterised by the cavalier use of the principle of indifference, so the Second (post 1960 or so) is characterised – at any rate among philosophers – by an equally sweeping use of Dutch Book arguments.
There is a fuller argument in Howson and Urbach (2006), pp. 276–288.
Unknown to Good or anyone else at the time, the theorem had been proved by Frank Ramsey; Ramsey’s manuscript proof was only discovered later.
1996, p. 1231 (my emphasis).
1996, p. 1235.
It is true that there is a Dutch Book argument for conditionalisation. I will say shortly why it is to no avail here.
Ibid.
Ibid. Also, in his axiomatisation of conditional probability, he points out that his axiom 3, that \({\mathrm{P}}(\,\cdot \,{\vert }{\mathrm{A}})\) is an unconditional probability function even when \({\mathrm{P}}({\mathrm{A}}) = 0\), presupposes that probabilities are updated by conditionalisation (1974, vol. 2, p. 339).
One might also be tempted to see the intuition endorsed by invoking nonstandard numbers: by the transfer principle, assigning an infinitesimal value to \({\mathrm{P}}({\mathrm{X}}={\mathrm{n}}{\vert }{\mathrm{B}})\) makes the likelihood-ratio \({\mathrm{P}}({\mathrm{X}}={\mathrm{n}}{\vert }{\mathrm{B}})/{\mathrm{P}}({\mathrm{X}}={\mathrm{n}}{\vert }{\mathrm{A}})\) strictly increase with n. It is not clear how much weight should be attached to this, however, since on taking standard parts the posterior probabilities of A and B remain obstinately at 1 and 0.
This possibility is noted by Kadane et al. (1996, p. 1232), but they make no mention of de Finetti’s discussion.
1950, p. 15, italics in the original.
1972, pp. 201–202.
References
Bayes, T. (1763). An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society of London, 53, 370–418.
Bhaskara Rao, K. P. S., & Bhaskara Rao, M. (1983). Theory of charges. London: Academic Press.
Berti, P., Regazzini, E., & Rigo, P. (2007). Modes of convergence in the coherence framework. Sankhyā, 69, 314–329.
Billingsley, P. (1995). Probability and measure (2nd ed.). New York: Wiley.
Chen, R. (1977). On almost sure convergence in a finitely additive setting. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 37, 341–356.
Coletti, G., & Scozzafava, R. (2002). Probabilistic logic in a coherent setting. Dordrecht: Kluwer.
Cox, R. T. (1961). The algebra of probable inference. Baltimore: Johns Hopkins Press.
Cramér, H. (1937). Random variables and probability distributions. Cambridge: Cambridge University Press.
de Finetti, B. (1936). La logique de la probabilité. In Actes du Congrès International de Philosophie Scientifique (Vol. IV). Paris: Hermann.
de Finetti, B. (1937). Foresight: Its logical laws, its subjective sources; translated from the French and reprinted in Kyburg and Smokler 1980, pp. 53–119.
de Finetti, B. (1972). Probability, induction and statistics. London: Wiley.
de Finetti, B. (1974). Theory of probability (Vols. 1, 2). New York: Wiley.
Dubins, L. E. (1975). Finitely additive conditional probabilities. Conglomerability and disintegrations. Annals of Probability, 3, 89–99.
Earman, J. (1992). Bayes or bust? A critical examination of Bayesian confirmation theory. Cambridge: MIT Press.
Easwaran, K. (2013). Regularity and hyperreal credences. Philosophical Review (forthcoming).
Epifani, I., & Lijoi, A. (1997). A finitely additive version of the law of the iterated logarithm. Theory of Probability and Its Applications, 44, 633–649.
Good, I. J. (1967). On the principle of total evidence. British Journal for the Philosophy of Science, 17, 319–321.
Greaves, H., & Wallace, D. (2006). Justifying conditionalization: Conditionalization maximises expected epistemic utility. Mind, 115, 607–632.
Halmos, P. (1950). Measure theory. New York: Van Nostrand-Reinhold.
Howson, C., & Urbach, P. M. (2006). Scientific reasoning: The Bayesian approach (3rd ed.). Chicago: Open Court.
Jech, T. (1997). Set theory (2nd ed.). Berlin: Springer.
Kadane, J. B., Schervish, M. J., & Seidenfeld, T. (1986). Statistical implications of finitely additive probability. In P. Goel & A. Zellner (Eds.), Bayesian inference & decision techniques with applications (pp. 59–76). Amsterdam: North-Holland.
Kadane, J. B., Schervish, M. J., & Seidenfeld, T. (1996). Reasoning to a foregone conclusion. Journal of the American Statistical Association, 91, 1228–1235.
Kelly, K. (1996). The logic of reliable inquiry. Oxford: Oxford University Press.
Kolmogorov, A. N. (1950). Foundations of the theory of probability. New York: Chelsea.
Kyburg, H., & Smokler, H. (Eds.). (1980). Studies in subjective probability (2nd ed.). New York: Wiley.
Lewis, D. (1986). Philosophical papers (Vol. II). Oxford: Oxford University Press.
Maher, P. (1993). Betting on theories. Cambridge: Cambridge University Press.
Milne, P. (1990). Scotching the Dutch book argument. Erkenntnis, 32, 105–126.
Oxtoby, J. (1984). Review of Bhaskara Rao and Bhaskara Rao 1983. Bulletin of the American Mathematical Society, 11, 221–223.
Popper, K. R. (1959). The logic of scientific discovery. New York: Harper and Row.
Rényi, A. (1955). On a new axiomatic theory of probability. Acta Mathematica Academiae Scientiarum Hungaricae, VI, 285–335.
Ross, J. (2010). Sleeping beauty, countable additivity and rational dilemmas. Philosophical Review, 119, 411–447.
Ramakrishnan, S., & Sudderth, W. D. (1988). A sequence of coin toss variables for which the strong law fails. The American Mathematical Monthly, 95, 939–941.
Skyrms, B. (1980). Causal necessity. New Haven: Yale University Press.
Smullyan, R. M. (1968). First order logic. New York: Dover.
Solovay, R. M. (1970). A model of set-theory in which every set of reals is Lebesgue measurable. Annals of Mathematics, 92, 1–56.
van Fraassen, B. C. (1984). Belief and the will. Journal of Philosophy, 81, 235–256.
Weintraub, R. (2001). The lottery: A paradox regained and resolved. Synthese, 129, 439–449.
Wenmackers, S. (2011). Philosophy of probability: Foundations, epistemology, and computation. http://dissertations.ub.rug.nl/faculties/fil/2011/s.wenmackers/.
Wenmackers, S., & Horsten, L. (2013). Fair infinite lotteries. Synthese, 190, 37–61.
Williams, P. (1980). Bayesian conditionalisation and the principle of minimum information. British Journal for the Philosophy of Science, 31, 131–144.
Zabell, S. (1989). The rule of succession. Erkenntnis, 31, 283–321.
Zabell, S. (2011). Carnap and the logic of inductive inference. In D. M. Gabbay, S. Hartmann, & J. Woods (Eds.), Handbook of the history of logic (pp. 265–309). Dordrecht: Elsevier.
Acknowledgments
Thanks to Sorin Bangu and an anonymous referee for very helpful comments.
Cite this article
Howson, C. Finite additivity, another lottery paradox and conditionalisation. Synthese 191, 989–1012 (2014). https://doi.org/10.1007/s11229-013-0303-3
DOI: https://doi.org/10.1007/s11229-013-0303-3