1 Introduction

Chances should guide credences; to put the point more guardedly, rational agents’ credences about chances should guide their credences: this is the essence of the intuition behind the Principal Principle (Lewis, 1986). The idea can be easily illustrated in simple cases: assuming we’re discussing the credences of some rational agent, her credence in A given (just) that the chance of A is 0.3 should equal 0.3. However, moving from this to a general formulation of a sensible candidate for a norm of rationality has been a daunting task; the multitude of texts published in the last four decades on this subject attests to its difficulty.

The field has certainly seen progress; a few candidates for chance-credence norms have emerged, each with its own set of problems. Some of the issues have been given a formal treatment, with mathematical theorems enlightening previous metaphysical research. In this paper we would like to bring into focus a few issues on that frontFootnote 1:

  • a somewhat surprising relationship between two familiar norms, the Principal Principle (PP) and the Generalized Principal Principle (GPP): it turns out that, despite what some writers seem to have thought, the GPP is not the “more general” of the two, at least in the sense that it does not even include the PP as a special case—the PP postulates stronger requirements on agents who are not certain what the objective chance function is than the GPP does;

  • this leads to a suggestion that there are two fundamentally different readings of the PP, the static and the dynamic one, which seems to illuminate the issue;

  • the PP seems to have at least one hitherto undiscussed unintentional consequence: if an agent has an opinion about a nontrivial chance function, then at least two possible worlds have to share that function;

  • related, but different issues arise for the New Principle (NP), originally proposed to address the problem of “undermining futures” (Lewis, 1994);

  • a “General New Principle” (GNP), whose relation to the NP is prima facie the same as the one holding between the GPP and the PP, turns out to be equivalent to the NP;

  • the GPP is not preserved by Bayesian Conditionalisation (while the PP and the NP are);

  • the GPP is not as good at handling conditional chances as the PP and the NP—it is not a “dot principle”, as defined below in Section 3.3.

It would seem, then, that neither the PP nor the GPP is a good candidate for a norm of rationality. What is left on the table? We will argue that the New Principle (NP), which avoids most of the problems discussed in this paper, also leads to a possibly unfortunate, somewhat metaphysical constraint on the chance functions about which the agent whose credence function satisfies the principle has an opinion: if such a chance function is not “fundamentally self-deceiving”, in the precise sense defined below, it has to be the chance function for at least two possible worlds (more precisely, it has to be their ur-chance function, as described below). Whether this constraint is unreasonable is another matter, of course. This will depend on one’s metaphysical account of chance, a matter on which we would like to be neutral in this paper.

The paper is structured as follows. In the next section we present the formal framework in which the proposed norms will be discussed. Section 3 contains the discussion of the various desiderata we would like the norms to satisfy. In Sect. 4 we sum up our findings.

2 The setup and initial observations

For the purposes of this paper, we have decided to adopt the formalism used in Pettigrew (2015). There have been numerous attempts at providing some formal language suitable for phrasing chance-credence norms, and we would not like to add another one to the mix. These attempts can be roughly divided into two groups depending on whether the PP is supposed to involve conditioning on a proposition about which we assume (only) that it specifies the chance of the proposition under discussion, or whether it does its job in the context of conditioning on a proposition which specifies the whole chance function. We encounter the first of these approaches whenever we see someone phrasing the PP as entailing that \(P(A | X)=x\), where X is assumed to “say that the chance of A is x”, or something to a similar effect.Footnote 2 We encounter the second one for example when we see someone phrasing the PP in a way which involves referring to some “complete theory of chance” (whatever that would be) instead of the proposition X. Pettigrew’s method brings clarity to the latter idea: instead of “complete theories of chance”, we refer to “chance propositions” true at those worlds which share the given chance function.

Let us then recapitulate the main ingredients of the formal setup from Pettigrew (2015). An agent’s epistemic state at a time t is given by a credence function \(b_t\) which is a classical probability function. It is defined on an algebra of propositions; these are the propositions about (the truth of) which the agent in question has an opinion. For each time t there is a proposition \(E_t\) which is the agent’s total evidence at t; we assume that \(b_t(E_t)=1\), that is, the agent does not dispute her evidence.

Given a possible world w we consider the “ur-chance function” at w. The idea is that, in w, chances at time t are obtained from a single chance function upon which we conditionalise with the history of w up to t: it is that function which is dubbed the ur-chance function. We may think of it as the “initial” chance function, if time has an initial instant; it specifies, in any case, the chances at a world “before anything happens in it”, i.e., when the history of it is a tautologous proposition. It is assumed that chance functions are classical probabilities and that they are defined on the same algebra of propositions as the credence functions. Given a chance function ch, \(C_{ch}\) is the proposition “the ur-chance is given by ch”, true at exactly those worlds where the ur-chance is ch.

Note that Pettigrew does not assume throughout that propositions are sets of possible worlds: he does so only in one of his arguments supporting the NP (Proposition 12 of Pettigrew, 2015). We also want to stay neutral on this in general. However, some arguments in what follows will require the actual construction of credence functions satisfying the GPP; to do so, we need to specify the algebra of propositions, and this is perhaps most conveniently described if propositions are sets. However, what will turn out to be important is a different issue: whether chance propositions are atoms of the algebra of propositions on which the credence function is defined (if it has atoms at all). If an atom of such an algebra is identified with a set of possible worlds, it is natural to treat it as a singleton: otherwise there will be two or more worlds which assign the same truth values to all propositions about which the agent has an opinion, i.e. which are indistinguishable from her perspective.

If every world has a single ur-chance function, it follows that the propositions \(C_{ch}\) correspond to a partition of the set of all possible worlds considered by the agent. An assumption Pettigrew makes “for the sake of mathematical simplicity” (p. 178) is that the domain of the credence function contains only finitely many propositions of the form \(C_{ch}\). The points made in this paper will require from us the construction of specific credence functions satisfying GPPs, but in each case finite structures (i.e. structures containing not only finitely many \(C_{ch}\)’s, as in Pettigrew’s case, but finite, full stop) will suffice. Therefore, we deem this assumption to be relatively innocent.

Let us move, then, to the formulation of the chance-credence norms under discussion here (lifted with minimal changes from Pettigrew, 2015).

(PP) At time t, an agent ought to have a credence function \(b_t\) such that, for all ur-chance functions ch and propositions A,

$$\begin{aligned}b_t(A|C_{ch})=ch(A|E_t),\end{aligned}$$

unless \(ch(E_t)=0\); in which case it ought to be that \(b_t(C_{ch})=0\).

The PP as just given is not exactly what Lewis originally had in mind; he wanted the Principle to concern reasonable initial credence functions:

(PP)\(_0\) At the beginning of her epistemic life, an agent ought to have a credence function \(b_0\) such that, for all ur-chance functions ch and propositions A,

$$\begin{aligned}b_0(A|C_{ch})=ch(A).\end{aligned}$$

Observe that PP\(_0\) entails that at the beginning of her epistemic life the agent has to have a nonzero credence in each chance proposition she has an opinion about. This strikes us as a strong assumption. However, we can perhaps mitigate it by thinking that, instead of considering a \(C_{ch}\) such that \(b_0(C_{ch})=0\), we shall just remove the possible worlds at which such a \(C_{ch}\) is true from the set of the worlds under discussion. (This is perhaps additionally justified if we assume that the agent updates via Bayesian Conditionalisation, which throughout her life would never raise her credence in such a \(C_{ch}\).)

The PP implies the PP\(_0\) if we assume that there is an instant 0 at the beginning of an agent’s epistemic life where her evidence is tautologous (we keep this assumption throughout the paper). Pettigrew notes (in his Proposition 1) that PP is entailed by PP\(_0\) in conjunction with Bayesian Conditionalisation (BC):

(BC) If \(t < t'\), it ought to be the case that, for any A,

$$\begin{aligned}b_{t'}(A) =b_t(A|E_{t'}),\end{aligned}$$

provided that \(b_t(E_{t'})>0\).

To quote Lewis, the PP is supposed to capture the intuition that “certainty about chances—or conditionality on propositions about chances—makes for resilient degrees of belief about outcomes” (Lewis, 1986, p. 86). However, an agent can satisfy PP (or PP\(_0\)) without bestowing credence 1 on any chance proposition.Footnote 3 In fact, it only makes real sense to write the PP in this way if one’s intention is to talk about agents who are not certain what the actual chance function is. Otherwise, if the actual chance function was ch, then the agent’s credence \(b_t(C_{ch})\) would equal 1, and writing the formula at the heart of the PP starting with “\(b_t(A|C_{ch})\)” would be puzzling indeed. We also should not read the PP as specifying what a rational agent’s credence should be once they learn what the actual chance is: first, the PP does not by itself specify a belief update rule, but is a synchronic normFootnote 4; second, the PP might be seen to be reasonable even if learning what actual chances are is fundamentally impossible. It just requires the rational agent’s personal odds to be set in a specific way: if \(ch(A |E_t)=.75\), then for the agent \(AC_{ch}\) should be thrice as likely as \(\lnot A C_{ch}\).Footnote 5
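
The odds reading in the last sentence can be unpacked in one line. This is only a sketch, assuming \(b_t(C_{ch})>0\) and \(ch(E_t)>0\), so that the equation in the PP applies:

$$\begin{aligned}\frac{b_t(AC_{ch})}{b_t(\lnot A C_{ch})}=\frac{b_t(A|C_{ch})}{b_t(\lnot A|C_{ch})}=\frac{ch(A|E_t)}{1-ch(A|E_t)}=\frac{0.75}{0.25}=3.\end{aligned}$$

Nothing here requires \(b_t(C_{ch})=1\): the constraint concerns the ratio of two credences and is silent on how much credence \(C_{ch}\) itself receives.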

All this is not a peculiarity of the formalism we have adopted for the current paper. Lewis himself, immediately after the above-quoted fragment about how PP is to capture the intuition about “certainty about chances—or conditionality on propositions about chances”, defines the PP as involving the expression “\(C(A|XE)=x\)”,Footnote 6 where the “C” stands for the credence function, and the “X” for the proposition that the chance of A’s holding equals x. This, again, has nontrivial consequences for many agents who are not certain of what the chance function is, that is, for those with a credence function C such that \(C(X) \ne 1\). If the principle was supposed to concern just those agents who possess certainty about the objective chance, that is, those for which \(C(X)=1\), then writing the PP as Lewis did would indeed be baffling. Of the two options from the quote, then, we should point to “conditionality on propositions about chance” as the topic of the PP.

The PP might seem to be too specific: it literally refers only to cases involving a conditional credence in a proposition given a single proposition about chance, while in reality agents typically hold numerous hypotheses about possible chance functions. And it seems that indeed, Lewis intended the ultimate form of the Principal Principle to be somewhat different. In the context of an agent entertaining various options as to the chance of a certain coin falling heads, he writes “more generally, whether or not you are sure about the chance of heads, your unconditional degree of belief that the coin falls heads is given by summing over alternative hypotheses about chance” (ibid., p. 87). The suggestion seems clear that this is supposed to be the generalized version of the PP, covering more cases. In the parlance we have adopted here let us write it as follows, “GPP” meaning “Generalized Principal Principle”Footnote 7:

(GPP) At time t, an agent ought to have a credence function \(b_t\) such that, for all ur-chance functions ch and propositions A,

$$\begin{aligned} b_t(A)=\sum _{ch: ch(E_t)>0}b_t(C_{ch})ch(A|E_t). \end{aligned}$$

The intuition seems clear: “more generally”, the credences of an agent who satisfies the GPP are weighted means of conditional credences, the weights being the credences in chance propositions, and the conditional credences being conditional on those chance propositions. This we get just by the total probability theorem. Then we switch from the conditional credences to chances by appealing to (PP), and indeed (GPP) is what we end up with (see Appendix A.1).
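
For a finite family of chance propositions the two steps just described can be written out as follows. This is a sketch under the assumption that the \(C_{ch}\) partition the space; the proof in Appendix A.1 may be organised differently.

$$\begin{aligned} b_t(A)&=\sum _{ch: b_t(C_{ch})>0}b_t(C_{ch})\,b_t(A|C_{ch})&&\text {(total probability)}\\ &=\sum _{ch: b_t(C_{ch})>0}b_t(C_{ch})\,ch(A|E_t)&&\text {(PP)}\\ &=\sum _{ch: ch(E_t)>0}b_t(C_{ch})\,ch(A|E_t). \end{aligned}$$

The last equality uses the exception clause of the PP: if \(ch(E_t)=0\) then \(b_t(C_{ch})=0\), so every index in the second sum also appears in the third, and the additional indices in the third sum (those with \(ch(E_t)>0\) but \(b_t(C_{ch})=0\)) contribute zero.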

If the GPP was indeed “more general”, then in the special case where the agent was certain of the truth of a single chance proposition, and just one weight (equal to 1) remained, we could be excused for thinking that we should end up with the “specific” PP. However, note that it is decidedly not the case: while it is easy to deduce the GPP from the PP just by the probability calculus, the GPP does not imply the PP, in the literal sense of there being a credence \(b_t\) which satisfies the GPP but which does not satisfy the PP (see Table 1 below, as well as Appendix A.2). So, quite literally, the GPP is not “more general” than the PP, since it doesn’t cover it as a special case. The situation is a little more subtle. If for some ch \(b_t(C_{ch})=1\), then both the PP and the GPP require that \(b_t(A)=ch(A|E_t)\), that is, that credences conform to known chances. This is the job the Principal Principle is usually informally expected to perform.Footnote 8 But the PP, as already noted, postulates requirements concerning a specific coordination of credences also on the part of agents who are not certain about what the objective chance is, requirements the GPP does not posit. We also see that the GPP governs both the situations of certainty and of uncertainty about objective chances. When investigating the relationship between the PP and the GPP, and the alleged higher generality of the GPP, then, the topic of certainty is something of a red herring. The two principles demand the same credence function from agents certain about what the objective chance is; of other agents, actually, it is the PP that requires more.

After inspecting the literatureFootnote 9 we believe it is worth reiterating the following points:

  • the GPP, known also as “Ismael’s Principle”, is already present in Lewis (1986);

  • the PP implies the GPP (Appendix A.1);

  • the GPP does not imply the PP (Table 1 and Appendix A.2).

The direction of the logical relationship is one thing; justification is another.Footnote 10 Hoefer (2007) points out that “The justification of this GPP depends on the prior justification of PP in a fairly obvious way” (fn. 27).Footnote 11 Briggs (2009a) goes further and claims that “without PP, there is no reason to accept GPP” (p. 441). We disagree. In our opinion there are situations where the GPP is motivated just by the intuition that ideally credences should conform to chances, so in situations of uncertainty about chances one should set their credences as expectations of chances; sometimes it might be the case that the ratios of certain probabilities turn out to be such that the PP fails, but, to reiterate, credences are still expectations of chances, as they should be.

Consider the situation in Table 1. Some agent’s credence \(b_t\) is defined on the space of four possible worlds, governed by two chance functions; \(C_{ch_1} = \{w_1, w_2\}\), \(C_{ch_2} = \{w_3, w_4\}\). What is the agent’s credence in, say, \(\{w_1\}\)? The agent does not know which of the two chance functions is the actual one. Still, there are two options, so they set their credence to be the appropriate weighted mixture of the two, that is, to be their expectation of the proposition’s chance: \(b_t(\{w_1\}) = b_t(C_{ch_1})ch_1(\{w_1\}) + b_t(C_{ch_2})ch_2(\{w_1\}) = 0.5 \times 0.4 + 0.5 \times 0.25 = 0.325\). This holds for all propositions, and so the agent satisfies the GPP (please take our word for it for now; for details see Appendix A.2). However, the PP fails: \(b_t(\{w_1\} | C_{ch_1}) = 0.65 \ne 0.4 = ch_1(\{w_1\})\). Is this a mark of synchronic irrationality? The agent’s odds of \(\{w_1\}C_{ch_1}\) vs. \(\{w_2, w_3, w_4\}C_{ch_1}\) are \(\nicefrac {13}{7}\), while according to \(ch_1\) the odds of \(\{w_1\}\) vs. \(\{w_2, w_3, w_4\}\) are \(\nicefrac {2}{3}\). However, the \(b_t\) is a classical probability function, so there is no risk of a Dutch Book. The credences are expectations of chances, as already noted. After the agent updates the credence function in light of some evidence, there might be trouble; we should expect, e.g., that after conditionalisation on some new evidence the GPP will cease to hold. However, this might not be thought of as a problem for GPP itself as a synchronic norm of rationality; and if one wishes, other belief update methods than conditionalisation could be considered, such as imaging (Lewis, 1976) or generalized imaging (Gardenfors, 1982). Perhaps sustaining the GPP under some reasonable belief update method is not out of the question.Footnote 12

Table 1 A credence function \(b_t\) that satisfies the GPP but does not satisfy the PP (\(C_{ch_1}=\{w_1,w_2\}\), \(C_{ch_2}=\{w_3,w_4\}\), \(E_t\) tautologous)
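
A situation of this kind can be checked mechanically. The following Python sketch builds a four-world model and tests both norms by brute force over all sixteen propositions. Only the values \(ch_1(\{w_1\})=0.4\), \(ch_2(\{w_1\})=0.25\) and \(b_t(C_{ch_1})=b_t(C_{ch_2})=0.5\) are taken from the text; every other chance value is a hypothetical choice made for illustration and need not coincide with the entries of Table 1. The helper names (`prob`, `cond`, `satisfies_gpp`, `satisfies_pp`) are likewise ours.

```python
from fractions import Fraction as F
from itertools import combinations

WORLDS = ("w1", "w2", "w3", "w4")
EVERYTHING = frozenset(WORLDS)  # tautologous evidence E_t

# Chance propositions as in the text: C_ch1 = {w1, w2}, C_ch2 = {w3, w4}.
CHANCE_PROPS = {
    "ch1": frozenset({"w1", "w2"}),
    "ch2": frozenset({"w3", "w4"}),
}

# Ur-chance functions. ch1(w1) = 2/5 and ch2(w1) = 1/4 come from the text;
# all remaining values are HYPOTHETICAL, chosen so that each function sums to 1.
CHANCES = {
    "ch1": {"w1": F(2, 5), "w2": F(1, 10), "w3": F(1, 5), "w4": F(3, 10)},
    "ch2": {"w1": F(1, 4), "w2": F(1, 4), "w3": F(1, 4), "w4": F(1, 4)},
}


def prob(p, event):
    """Probability of a set of worlds under the world-indexed measure p."""
    return sum((p[w] for w in event), F(0))


def cond(p, event, given):
    """Ratio-formula conditional probability p(event | given)."""
    return prob(p, event & given) / prob(p, given)


def events():
    """All subsets of WORLDS, i.e. every proposition in the algebra."""
    for size in range(len(WORLDS) + 1):
        for combo in combinations(WORLDS, size):
            yield frozenset(combo)


def satisfies_gpp(b, evidence):
    """b(A) equals the b-weighted sum of ch(A | evidence), for every A."""
    live = [n for n, ch in CHANCES.items() if prob(ch, evidence) > 0]
    return all(
        prob(b, a)
        == sum((prob(b, CHANCE_PROPS[n]) * cond(CHANCES[n], a, evidence)
                for n in live), F(0))
        for a in events()
    )


def satisfies_pp(b, evidence):
    """b(A | C_ch) equals ch(A | evidence), for every ch and every A."""
    for name, ch in CHANCES.items():
        c = CHANCE_PROPS[name]
        if prob(ch, evidence) == 0:
            if prob(b, c) != 0:
                return False
            continue
        if prob(b, c) == 0:
            continue  # b(. | C_ch) is undefined, so there is nothing to check
        if any(cond(b, a, c) != cond(ch, a, evidence) for a in events()):
            return False
    return True


# The agent's credence: the equal-weight mixture of the two chance functions.
b_t = {w: F(1, 2) * CHANCES["ch1"][w] + F(1, 2) * CHANCES["ch2"][w]
       for w in WORLDS}

gpp_holds = satisfies_gpp(b_t, EVERYTHING)
pp_holds = satisfies_pp(b_t, EVERYTHING)
print(gpp_holds, pp_holds)                                   # True False
print(cond(b_t, frozenset({"w1"}), CHANCE_PROPS["ch1"]))     # 13/20
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equality tests are not disturbed by floating-point rounding. With these hypothetical values the mixture assigns credence 1/2 to each chance proposition, which is what makes the weights in the GPP sum agree with the credences in \(C_{ch_1}\) and \(C_{ch_2}\).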

To reiterate, the supposedly “generalized” PP does not imply the “specific” one. In fact, it’s the other way round: once we assume the PP, the GPP follows. It’s not the case that the GPP covers both the situations of uncertainty, where the agent holds various hypotheses about chance, and of certainty, where the agent bestows one chance proposition credence 1, while the PP deals exclusively with the latter. The situation, as already noted, is markedly different, and perhaps we can see one aspect of it more clearly if we assume that rational agents adhere to BC. As formulated, both the PP and the GPP are static, synchronic norms: they specify how an agent’s credences at a certain time should internally cohere. However, once we assume the agent will update their credence by conditionalization, the PP tells us how such an update should proceed in situations where they learn the truth of one of the chance propositions, bringing a dynamic aspect to the picture, which is not entailed by the static one.Footnote 13 The PP, then, covers the situations of becoming certain what the objective chance is, not sharing this aspect with the GPP. Perhaps this might be the reason for the associations of the PP with certainty and the GPP with uncertainty.

The third candidate for a chance-credence norm to be discussed here is the so-called New Principle (NP) that was first put forth in response to the problem of “undermining futures”.Footnote 14 At the core of the NP lies the idea that rather than letting unconditional chances guide credence as is the case for PP, we should let the chances, conditional on theirs being the true chance, guide credence. In other words, chance is to be treated as an oracle who is aware of her ‘oracle-status’: not only do we bring her up to speed with what our evidence is in line with what the PP already suggested, but we also assume that she knows that she is the oracle and then conform our subjective credence to the probability that she would assign to the proposition in question, conditional on the evidence and on hers being the true chance function (Hall, 1994). More precisely:Footnote 15

(NP) At time t, an agent ought to have a credence function \(b_t\) such that, for all ur-chance functions ch and propositions A,

$$\begin{aligned}b_t(A|C_{ch})=ch(A|E_t C_{ch}),\end{aligned}$$

unless \(ch(E_t C_{ch})=0\); in which case it ought to be that \(b_t(C_{ch})=0\).

It might seem natural at this point to investigate a “general” counterpart to the NP, one explicitly involving an agent contemplating multiple chance hypotheses—an analogue of the GPP, but involving chance functions which are “brought up to speed” in the above sense:

(GNP) At time t, an agent ought to have a credence function \(b_t\) such that, for all ur-chance functions ch and propositions A,

$$\begin{aligned} b_t(A)=\sum _{ch: ch(E_t)>0}b_t(C_{ch})ch(A|E_t C_{ch}). \end{aligned}$$

Somewhat surprisingly, it turns out that this does not bring anything new to the table: the GNP and the NP are equivalent (Appendix A.3).
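
One direction of the equivalence, from the NP to the GNP, is the total-probability argument familiar from the PP. The converse can be sketched as follows; this is our reconstruction under simplifying assumptions (all conditional chances appearing below are defined, and \(b_t(C_{ch'})>0\)), and the proof in Appendix A.3 may proceed differently. Fix a chance function \(ch'\) and apply the GNP to the proposition \(AC_{ch'}\):

$$\begin{aligned} b_t(AC_{ch'})&=\sum _{ch}b_t(C_{ch})\,ch(AC_{ch'}|E_t C_{ch})\\ &=b_t(C_{ch'})\,ch'(AC_{ch'}|E_t C_{ch'})\\ &=b_t(C_{ch'})\,ch'(A|E_t C_{ch'}). \end{aligned}$$

The second line holds because \(C_{ch'}\) and \(C_{ch}\) are incompatible whenever \(ch\ne ch'\), so all other summands vanish. Dividing by \(b_t(C_{ch'})\) yields the equation of the NP. The analogous step is not available for the GPP, since \(ch(AC_{ch'}|E_t)\) need not be zero for \(ch\ne ch'\).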

3 The Desiderata

We will now proceed to list and discuss a few conditions on prospective candidates for chance-credence norms of rationality.

3.1 Being preserved by BC

The first condition is simple: a good candidate for a chance-credence norm should stipulate a condition on credence functions which is preserved by BC, a widely accepted belief update operation. (This desideratum will not make sense, of course, for someone who believes that the right chance-credence norm is PP\(_0\).)

We do not wish to claim that this is a sine qua non: we would be open, say, to a slight divergence from Bayesianism, according to which an update via BC may require an immediate “chance-credence recalibration”; perhaps mere conditionalisation may loosen the proper fit between credences and credences about chances, which would need to be re-established. Let us check the facts as to whether the conditions imposed by the discussed norms are preserved by BC. Almost trivially, the PP is so preserved (Appendix A.4). However, as shown by Pettigrew and Titelbaum (2014), GPP is not (Pettigrew and Titelbaum attribute the principle to Ismael; see also Appendix A.5): one can construct credences \(b_t\) and \(b_{t'}\) (for \(t < t'\)), such that:

  • \(b_t\) satisfies the GPP;

  • \(b_{t'}\) arises from \(b_t\) in accordance with BC;

  • but \(b_{t'}\) does not satisfy the GPP.

(In such situations, in view of the PP implying the GPP, \(b_t\) will of course not satisfy the PP.) This is, then, another argument against the GPP.

The NP is also preserved by BC (Pettigrew, 2015, Proposition 8).
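
The failure of the GPP to survive conditionalisation can be illustrated with a small computation. The Python sketch below is not the construction of Pettigrew and Titelbaum (2014) or of Appendix A.5; it uses a hypothetical four-world model with two chance functions whose values are our own choice, and the helper names (`prob`, `conditionalise`, `gpp_rhs`) are assumptions introduced for the example. The credence \(b_t\) is an equal-weight mixture of the two chance functions and satisfies the GPP equation for the test proposition under tautologous evidence; after updating by BC on \(E_{t'}=\{w_1,w_3\}\), the two sides of the equation come apart.

```python
from fractions import Fraction as F

WORLDS = ("w1", "w2", "w3", "w4")
EVERYTHING = frozenset(WORLDS)

# Chance propositions: worlds w1, w2 share ch1; worlds w3, w4 share ch2.
CHANCE_PROPS = {
    "ch1": frozenset({"w1", "w2"}),
    "ch2": frozenset({"w3", "w4"}),
}

# HYPOTHETICAL ur-chance functions, chosen only for illustration.
CHANCES = {
    "ch1": {"w1": F(2, 5), "w2": F(1, 10), "w3": F(1, 5), "w4": F(3, 10)},
    "ch2": {"w1": F(1, 4), "w2": F(1, 4), "w3": F(1, 4), "w4": F(1, 4)},
}


def prob(p, event):
    """Probability of a set of worlds under the world-indexed measure p."""
    return sum((p[w] for w in event), F(0))


def conditionalise(p, evidence):
    """Bayesian Conditionalisation: renormalise p on the evidence."""
    total = prob(p, evidence)
    return {w: (p[w] / total if w in evidence else F(0)) for w in WORLDS}


def gpp_rhs(b, a, evidence):
    """Right-hand side of the GPP: sum of b(C_ch) * ch(a | evidence)."""
    return sum(
        (prob(b, CHANCE_PROPS[n]) * prob(ch, a & evidence) / prob(ch, evidence)
         for n, ch in CHANCES.items() if prob(ch, evidence) > 0),
        F(0),
    )


# Credence at t: equal-weight mixture of the chance functions.
b_t = {w: F(1, 2) * CHANCES["ch1"][w] + F(1, 2) * CHANCES["ch2"][w]
       for w in WORLDS}

a = frozenset({"w1"})
new_evidence = frozenset({"w1", "w3"})
b_t_prime = conditionalise(b_t, new_evidence)

lhs_before, rhs_before = prob(b_t, a), gpp_rhs(b_t, a, EVERYTHING)
lhs_after, rhs_after = prob(b_t_prime, a), gpp_rhs(b_t_prime, a, new_evidence)

print(lhs_before == rhs_before)   # GPP equation holds for {w1} at t
print(lhs_after, rhs_after)       # the two sides differ at t'
```

In this model the credence in \(\{w_1\}\) after the update is 13/22, while the chance-weighted sum evaluates to 79/132, so the updated credence function violates the GPP even though it was obtained by BC from one that satisfied the equation.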

3.2 Avoiding unintentional metaphysical consequences

As is well-known, the PP faces tough problems related to the issue of “undermining futures”. These were initially formulated in the philosophical context of Humean Supervenience and led to several reformulations of the principle, of which we will discuss the NP (see Lewis, 1994 and Thau, 1994 for the initial statement of the problem and first attempts at a solution; among the papers important for subsequent discussion are e.g. Hall (1994) and Meacham (2010)). The issue can, however, be seen to deal a strong blow against the PP even if no deep philosophical commitment is made: only a few mathematical assumptions are needed. Specifically: take a chance function ch and a moment t such that \(ch(C_{ch} | E_t) < 1\) (Pettigrew calls such functions “modest in the presence of \(E_t\)”). Then from the fact that \(b_t\) satisfies the PP it follows that \(b_t(C_{ch}) = 0\) (Pettigrew, 2015, Proposition 4, but see also Proposition 3). Then, if all chance functions about which the agent has an opinion are modest in the presence of \(E_t\), a \(b_t\) satisfying the PP cannot be a probability function.
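
The step from modesty to zero credence can be reconstructed in one line. This is a sketch of the idea only, assuming \(ch(E_t)>0\); for the official statement and proof see Pettigrew (2015, Proposition 4). Suppose, for contradiction, that \(b_t(C_{ch})>0\), so that \(b_t(\cdot |C_{ch})\) is defined, and apply the PP with \(A=C_{ch}\):

$$\begin{aligned}1=b_t(C_{ch}|C_{ch})=ch(C_{ch}|E_t)<1.\end{aligned}$$

Hence \(b_t(C_{ch})=0\). If every chance function about which the agent has an opinion is modest in the presence of \(E_t\), each chance proposition receives credence 0, although the chance propositions partition the space and should therefore receive total credence 1.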

Perhaps even the weaker conclusion that an agent satisfying the PP cannot hold a nonzero credence that the actual world is governed by a modest chance function is unpalatable. Assume an agent’s space is finite. If at the start of their epistemic life they consider more than one chance function which is nowhere zero, then in order to satisfy the PP they need to choose one of those functions and go all in, setting their credence that this is the objective chance to 1, and setting all other chance propositions to be empty. Such behaviour does not strike us as rational, in view of the complete lack of evidence.

This needs to be contrasted with the utmost ease with which one can, given the set of chance functions, produce credence functions satisfying the NP in situations where the agent has opinions regarding arbitrary chance propositions. As shown in Proposition 6 in Pettigrew (2015), for any finite set of chance functions, for any evidence \(E_t\), you can have an agent with arbitrary credences in the corresponding chance propositions, and still satisfy the NP; the issue of modesty is not relevant here. However, we will argue that the principle has other problems, which, again, will be most easily discussed if we start with how they trouble the PP.
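
The ease of construction can be made concrete. The Python sketch below is in the spirit of that result but is not Pettigrew's proof: it takes a hypothetical four-world model with two chance functions (all numerical values are our own illustrative choices), lets the agent pick arbitrary weights for the chance propositions, and spreads each weight over the worlds of \(C_{ch}\) in proportion to \(ch(\cdot |E_t C_{ch})\). The function names (`build_np_credence`, `satisfies_np`) are assumptions introduced for the example, and the construction presupposes \(ch(E_t C_{ch})>0\) for every ch that receives positive weight.

```python
from fractions import Fraction as F
from itertools import combinations

WORLDS = ("w1", "w2", "w3", "w4")
EVERYTHING = frozenset(WORLDS)  # tautologous evidence E_t

CHANCE_PROPS = {
    "ch1": frozenset({"w1", "w2"}),
    "ch2": frozenset({"w3", "w4"}),
}

# HYPOTHETICAL ur-chance functions; both are modest, since ch(C_ch) = 1/2 < 1.
CHANCES = {
    "ch1": {"w1": F(2, 5), "w2": F(1, 10), "w3": F(1, 5), "w4": F(3, 10)},
    "ch2": {"w1": F(1, 4), "w2": F(1, 4), "w3": F(1, 4), "w4": F(1, 4)},
}


def prob(p, event):
    """Probability of a set of worlds under the world-indexed measure p."""
    return sum((p[w] for w in event), F(0))


def cond(p, event, given):
    """Ratio-formula conditional probability p(event | given)."""
    return prob(p, event & given) / prob(p, given)


def events():
    """All subsets of WORLDS, i.e. every proposition in the algebra."""
    for size in range(len(WORLDS) + 1):
        for combo in combinations(WORLDS, size):
            yield frozenset(combo)


def build_np_credence(weights, evidence):
    """Spread weights[ch] over C_ch in proportion to ch(. | evidence & C_ch)."""
    b = {w: F(0) for w in WORLDS}
    for name, weight in weights.items():
        if weight == 0:
            continue
        cell = CHANCE_PROPS[name] & evidence
        norm = prob(CHANCES[name], cell)
        if norm == 0:
            raise ValueError(f"{name} gives chance 0 to evidence & C_{name}")
        for w in cell:
            b[w] = weight * CHANCES[name][w] / norm
    return b


def satisfies_np(b, evidence):
    """b(A | C_ch) equals ch(A | evidence & C_ch), for every ch and every A."""
    for name, ch in CHANCES.items():
        c = CHANCE_PROPS[name]
        if prob(ch, evidence & c) == 0:
            if prob(b, c) != 0:
                return False
            continue
        if prob(b, c) == 0:
            continue  # b(. | C_ch) is undefined, so there is nothing to check
        if any(cond(b, a, c) != cond(ch, a, evidence & c) for a in events()):
            return False
    return True


# Arbitrary credences in the chance propositions (they only need to sum to 1).
weights = {"ch1": F(9, 10), "ch2": F(1, 10)}
b_t = build_np_credence(weights, EVERYTHING)
np_holds = satisfies_np(b_t, EVERYTHING)
print(np_holds, prob(b_t, CHANCE_PROPS["ch1"]), b_t["w1"])
```

Both chance functions in this model are modest, yet the resulting credence function satisfies the NP and retains the chosen weights as its credences in the chance propositions. The model also shows where the PP and the NP come apart: here \(ch_1(\{w_1\})=2/5\) whereas \(ch_1(\{w_1\}|C_{ch_1})=4/5\).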

Before we do this, let us note that, regarding modesty, the GPP falls somewhat in between the PP and the NP: it does not force the agent to have credence 0 in all chance propositions where the chance is modest in the presence of \(E_t\), but restricts this conclusion to these cases of modest ch’s for which there exists an immodest \(ch'\) such that \(ch(C_{ch'} | E_t)>0\), a noticeably weaker constraint.Footnote 16 Of the three principles, then, the NP seems to fare the best with regard to the modesty issue, with the GPP in the second place.

Let us now consider what it means if a \(C_{ch}\) is an atom of the algebra of propositions on which the credence function is defined.Footnote 17 If propositions are sets, it may be the case that \(C_{ch}\) is a singleton: among the possible worlds in the sphere of interest for the agent, there’s just a single one whose ur-chance function is ch. In a different case, \(C_{ch}\) may contain more than one world: but still, all the propositions about which the agent has an opinion have the same truth values at those worlds, and so from the perspective of the agent they (the worlds) are identical. This holds also in the general case where propositions are not sets of worlds: if \(C_{ch}\) is an atom, then the agent does not differentiate between the worlds at which ch is the ur-chance function.

Suppose, then, that \(C_{ch}\) is an atom. If the agent satisfies the PP, then, at any instant t, \(ch(A|E_t)\) is a trivial function (i.e., it takes for any A only the values 1 and 0). And if we consider the case when \(E_t\) is tautologous, that is, we investigate the beginning of the agent’s epistemic life, we infer that the ur-chance function ch is trivial. This is because, if \(C_{ch}\) is an atom, then for any A, \(A \wedge C_{ch}\) can only be equal to \(C_{ch}\) or the bottom element of the algebra (i.e. \(\emptyset \) if propositions are sets), to which any probability function has to assign 0. Therefore \(b_t(A | C_{ch})\), and so \(ch(A|E_t)\), can only equal 0 or 1, which in the case where \(E_t\) is tautologous leads to the conclusion that ch is trivial.Footnote 18 In short, just because of how classical conditional probability works, the PP has the following consequence:

(Triv) If a chance function is non-trivial, there have to be at least two worlds for which it is the ur-chance function.

Whether (Triv) can be called a “metaphysical” consequence might depend on one’s view on the nature of possible worlds considered by agents in the context of chance-credence norms. Lewisians might just take the word literally. Those for whom these worlds are “personal possibilities” (Hacking, 1967, Sect. 6) may differ in their opinion; ersatzists likewise. We do not wish to put too fine a point on using the term.

Note also that (Triv) is an entirely reasonable principle on many, if not most, accounts of chance. What is surprising is that it is a consequence of a principle which is ostensibly about coordinating two types of credences: those about propositions and those about chances of propositions. (PP) requires your credence \(b_t\) in the propositions “A and the chance is ch” and “the chance is ch” to be related with each other by means of a very specific ratio, given by \(ch(A|E_t)\). And if \(ch(A|E_t)\) is different from 0 or 1, why should it follow that an agent has to consider at least two worlds for which ch is the ur-chance? This logical relation, not the conclusion, seems surprising to us. It is decidedly not odd to have two worlds with the same chance function; it is odd for a chance-credence norm to entail this.Footnote 19

The GPP does not entail (Triv); the Appendix contains an example of a credence function satisfying the GPP and involving atomic chance propositions for nontrivial chance functions (see the end of Appendix A.2). However, a related issue may be seen to trouble the NP. Call a chance function ch fundamentally self-deceiving in the presence of \(E_t\) if \(ch(\cdot | E_t)\) is nontrivial, but \(ch(\cdot | E_t C_{ch})\) takes only the values 0 and 1. That is, ch considers itselfFootnote 20 to be deterministic (in the presence of \(E_t\)), but it really isn’t: there is a proposition A such that \(ch(A| E_t)\) is not equal to 0 or 1. Notice that if \(C_{ch}\) is an atom, then for any t, if \(ch(\cdot | E_t)\) is nontrivial, then ch is fundamentally self-deceiving in the presence of \(E_t\). In other words, the NP leads to the following:

(Deceiv) Take a chance function ch. If there is a t such that ch is not fundamentally self-deceiving in the presence of \(E_t\) and \(ch(\cdot | E_t)\) is nontrivial, then there have to be at least two worlds for which ch is the ur-chance function.

Perhaps the biggest trouble here is the following: if \(b_t\) satisfies the NP, then for a nontrivial chance function ch, if ch is not fundamentally self-deceiving in the presence of tautologous evidence, that is, if there is even a single proposition A such that \(ch(A| C_{ch})\) is different from 0 or 1, then there have to be at least two worlds for which ch is the ur-chance function.

It might very well be, then, that (Deceiv) is an unfortunate consequence of the NP. However, we have to note that the NP can only be an improvement over the PP if there are cases where the principles differ in the values of credence they mandate. These, in turn, are exactly the cases where \(ch(A|E_t) \ne ch(A |E_t C_{ch})\), cases, in other words, where ch is self-deceiving in the presence of \(E_t\): it would appear to be saying, for example, “if I’m right, the chance of A in the presence of \(E_t\) is 0.3”, while in reality it may be any other number. (We have to note that while we label these cases as those of deception, Hall (1994), who treats chance functions as performing the role of an expert, uses more moderate language and suggests that in such a situation something is “news to the expert” (p. 511).) It is interesting to see that, if we want to avoid the modesty issue by switching from the PP to the NP, we necessarily have to invite some self-deception of this sort. Note that, as mentioned above, it is easy to construct examples of \(b_t\)’s satisfying NP where the atoms of the domain of \(b_t\) are chance propositions for nontrivial ch; in such cases all chances about which the agent has an opinion are fundamentally self-deceiving. To avoid this for some chance ch, there have to be at least two worlds sharing the ch as their ur-chance; it is, again, surprising to us that this “metaphysical” conclusion should follow just from a principle about coordinating two types of credences.

To sum up, with regard to the issue stemming from \(C_{ch}\)’s being atoms, it is the GPP which comes out on top, with NP second.

3.3 Conditional chances should guide conditional credences

A good chance-credence norm should capture the idea that chances guide rational credences, and chances can fulfill this function because they are taken to track patterns of events in the world. But this is something that conditional chances do as well, under the assumption that what we conditionalise upon occurs. A good chance-credence norm, then, should accommodate this last insight as well: conditional chances should guide conditional credences. Roughly, if I learn (just) that the conditional chance of some die coming up 3 (A) given that it comes up odd (B) is .3, I should set my credence in A given B to .3. And if I entertain more than one hypothesis about this conditional chance, my conditional credence should be an appropriate mixture. It seems that these cases should be covered by chance-credence principles; perhaps they already are?

To start exploring this issue, let us note that in the cases discussed so far the formulations of the chance-credence norms include the phrase “for all propositions A”, and apart from some chance proposition(s) and the evidence proposition, only the proposition A is subsequently used in the statement of the norm. It is a priori possible that in some cases, that is, for some norms, we could generalize this so that we quantify over pairs of propositions A and B, and proceed to use both of these when formulating some condition involving the conditional credence in A given B (this would be a generalization since B can be chosen to be tautologous).

Let us set this up slowly. It is probably safe to say that most of us, when encountering the notion of conditional probability, inquired about the possibility of iterating it: can we rigorously speak of things like “the probability of A given B, given C”? We were then told that there were ways of doing this, albeit somewhat convoluted. Probably the most natural one is based on the following Fact:

Fact 1

Suppose \(B\in \mathcal {F}\) and \(P(B)>0\). Then \(P_B : \mathcal {F} \rightarrow [0,1]\) defined as, for any \(A \in \mathcal {F}\),

$$\begin{aligned} P_B(A)=P(A|B), \end{aligned}$$

is a probability function with the domain \(\mathcal {F}\).

This operation can be iterated, so that we can speak e.g. of the measure \(P_{C_B}\), for some suitably chosen B and C. Note the following Lemma:

Lemma 1

Assume \(P(C)>0\) and \(P(B)>0\). Then \(P_{C_B}(A)=P(A|BC)\).

Proof

$$\begin{aligned} P_{C_B}(A)=P_C(A | B)=\frac{P_C(AB)}{P_C(B)}=\frac{P(AB|C)}{P(B|C)}=P(A|BC). \end{aligned}$$

\(\square \)

This implies, for instance, that whenever \(P(C)>0\) and \(P(B)>0\), \(P_{C_B}=P_{B_C}\).
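As a quick sanity check on Lemma 1, the following Python sketch verifies the identity on a toy space of our own choosing: a fair six-sided die, with the events A, B and C picked arbitrarily for illustration. The helper names `prob` and `condition` are hypothetical, and exact rational arithmetic is used to avoid rounding issues. It is only an illustration under these assumptions, not a substitute for the proof.

```python
from fractions import Fraction

def prob(P, event):
    """Probability of an event (a set of outcomes) under the measure P."""
    return sum((p for w, p in P.items() if w in event), Fraction(0))

def condition(P, B):
    """Return the measure P_B, i.e. P conditionalised on B (requires P(B) > 0)."""
    pB = prob(P, B)
    if pB == 0:
        raise ValueError("cannot condition on a probability-zero event")
    return {w: (p / pB if w in B else Fraction(0)) for w, p in P.items()}

# Illustrative assumption: a fair die; A, B, C are arbitrary choices.
P = {w: Fraction(1, 6) for w in range(1, 7)}
A = {3}              # the die comes up 3
B = {1, 3, 5}        # the die comes up odd
C = {3, 4, 5, 6}     # the die comes up at least 3

P_C_B = condition(condition(P, C), B)   # first condition on C, then on B
P_B_C = condition(condition(P, B), C)   # first condition on B, then on C
P_BC = condition(P, B & C)              # condition once on the conjunction BC

lhs = prob(P_C_B, A)
rhs = prob(P_BC, A)
print(lhs, rhs)  # both equal 1/2
```

In this toy case the order of conditioning makes no difference, and both iterated measures coincide with the measure obtained by conditioning once on the conjunction BC.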

Suppose, then, we extend the language of probability and define \(P((A|B)|C) {:}{=}P_{C_B}(A)\). Then, the above Lemma implies that \(P((A|B)|C)=P((A|C)|B)=P(A|BC)\). Let us call this way of understanding expressions like P((A|B)|C), in the context of any probability function P, be it chance or credence, the conjunctive convention.Footnote 21

We can now roughlyFootnote 22 define what we’d like to call a “dot principle”, for reasons which should become clear soon. The “chance-credence language” to which the definition refers is the language we are using to formulate the conditions on credences and chances featuring in the specifications of the chance-credence norms. It features function names ch and \(b_t\), for various real values of t, possibly with other subscripts, propositional letters A, B, \(\ldots \), as well as chance-proposition symbols \(C_{ch}\), with subscripts, if necessary. We do not believe more rigor is needed at this point.

Definition 1

(Dot principle) A dot principle is a principle involving an equation or inequality in the chance-credence language which features one or more occurrences of the ‘\(\cdot \)’ symbol, with the understanding that the expression in question is true on both of the following interpretations:

  • when ‘\(\cdot \)’ is a variable ranging over all propositions;

  • and when for any two propositions A and B, ‘\((A|B)\)’ can be substituted uniformly for ‘\(\cdot \)’ so that the result is a formula which is true under the conjunctive convention (provided that all conditional probabilities featuring in the resulting formula are defined).

As an initial example, which will also illustrate the point that “dot principles” may be a sensible topic of discussion also if they do not mention chance at all, consider the “dot” version of BC:

(BC\(\cdot \)) If \(t < t'\), it ought to be the case that

$$\begin{aligned}b_{t'}(\cdot ) =b_t(\cdot |E_{t'}).\end{aligned}$$

(BC\(\cdot \)) implies (BC), of course. However, it is easy to see that (BC) also implies (BC\(\cdot \)):

$$\begin{aligned} b_{t'}(A|B)=\frac{b_{t'}(AB)}{b_{t'}(B)}{\mathop {=}\limits ^{(BC)}}\frac{b_t(AB|E_{t'})}{b_t(B|E_{t'})} =b_t(A|BE_{t'}){\mathop {=}\limits ^{conj.~conv.}}b_{t}((A|B)|E_{t'}). \end{aligned}$$

And so, the regular and the “dot” versions of BC are equivalent. In other words, we do not need to add anything to BC to make it cover conditional probabilities: the original BC suffices.

What about chance-credence norms? The “dot” versions of the principles under discussion here look as follows:

(PP\(\cdot \)) At time t, an agent ought to have a credence function \(b_t\) such that, for all ur-chance functions ch and propositions,

$$\begin{aligned}b_t(\cdot |C_{ch})=ch(\cdot |E_t),\end{aligned}$$

unless \(ch(E_t)=0\); in which case it ought to be that \(b_t(C_{ch})=0\).

(GPP\(\cdot \)) At time t, an agent ought to have a credence function \(b_t\) such that, for all ur-chance functions ch and propositions,

$$\begin{aligned}b_t(\cdot )=\sum _{ch: ch(E_t)>0}b_t(C_{ch})ch(\cdot |E_t).\end{aligned}$$

(NP\(\cdot \)) At time t, an agent ought to have a credence function \(b_t\) such that, for all ur-chance functions ch and propositions,

$$\begin{aligned}b_t(\cdot |C_{ch})=ch(\cdot |E_t C_{ch}),\end{aligned}$$

unless \(ch(E_t)=0\); in which case it ought to be that \(b_t(C_{ch})=0\).

It turns out that while both the PP and the NP are equivalent to their “dot” versions (Appendix A.6), the GPP is not equivalent to the GPP\(\cdot \) (Appendix A.7). Therefore, of these three, the GPP stands out negatively: it is the worst at handling conditional credences and chances.Footnote 23
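To make the failure of the “dot” version of the GPP more tangible, here is a small Python sketch. It uses a toy model of our own and is not the construction from Appendix A.7. The assumptions are: four worlds, two hypothetical ur-chance functions `ch1` and `ch2`, chance propositions modelled as the two cells of a partition, tautologous evidence \(E_t\), and credence 1/2 in each chance proposition. All names in the code are illustrative.

```python
from fractions import Fraction as F
from itertools import chain, combinations

WORLDS = (1, 2, 3, 4)

def prob(P, event):
    """Probability of an event (an iterable of worlds) under P."""
    return sum((P[w] for w in event), F(0))

def cond(P, A, B):
    """Conditional probability P(A|B); assumes P(B) > 0."""
    return prob(P, set(A) & set(B)) / prob(P, B)

def events():
    """All subsets of WORLDS, as tuples."""
    return chain.from_iterable(
        combinations(WORLDS, k) for k in range(len(WORLDS) + 1))

# Assumed toy chance functions and their chance propositions C_ch.
ch = {
    "ch1": {1: F(1, 2), 2: F(1, 2), 3: F(0), 4: F(0)},
    "ch2": {1: F(0), 2: F(0), 3: F(1, 4), 4: F(3, 4)},
}
C = {"ch1": {1, 2}, "ch2": {3, 4}}

# Credence b: an equal-weight mixture of the two chance functions.
weights = {"ch1": F(1, 2), "ch2": F(1, 2)}
b = {w: sum(weights[k] * ch[k][w] for k in ch) for w in WORLDS}

# GPP with tautologous evidence: b(A) = sum over ch of b(C_ch) * ch(A), for all A.
gpp_holds = all(
    prob(b, A) == sum(prob(b, C[k]) * prob(ch[k], A) for k in ch)
    for A in events())

# Dot version with (A|B) substituted: b(A|B) vs sum of b(C_ch) * ch(A|B).
A, B = {1}, {1, 3}
lhs = cond(b, A, B)                                        # 2/3
rhs = sum(prob(b, C[k]) * cond(ch[k], A, B) for k in ch)   # 1/2
print(gpp_holds, lhs, rhs)
```

In this model the unconditional GPP equation holds for every proposition, yet for the chosen A and B the agent's conditional credence is 2/3 while the corresponding mixture of conditional chances is 1/2. The reason is that the mixture weights stay fixed at \(b_t(C_{ch})\) while conditioning on B shifts the agent's credences over the chance propositions.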

4 Conclusions

We began the paper by exploring the relationship between the GPP and the PP, noting the diversity of views regarding this issue. We argued that the GPP was already present in Lewis, and that the recent discussions regarding the principle proposed by Ismael pertain also to the original Lewisian notion. We discussed the logical relationship between the two principles (the PP does imply the GPP, and not vice versa) and their scope (it is identical: despite some remarks cited above, both principles apply both to situations in which the agent is certain about what the objective chance function is, and to those in which the agent displays uncertainty about that matter). We also suggested that in situations of the latter kind, where the GPP is satisfied—so the agent’s credences are their expectations of chances—but the PP is not, it is not easy to find a mark of synchronic irrationality on the part of the agent; therefore, despite the logical relationship between the two principles, we should not think that the GPP gets all its justification from the PP. Ideally, we’d set our credences to equal the objective chances; if we don’t know what these chances are, we should set our credences to be expectations of chances, and this is exactly what the GPP does. Notice that the starting point for this argument is only the informal intuition about ideally setting credences as equal to chances, and not the PP as formally stated, which says a great deal more.

We also recalled that, for those who subscribe to BC, the PP actually delineates something other than a special case of the GPP: it gives us potential future credences of the agent after updating on chance propositions. The GPP seems to lack this dynamic aspect.

We then introduced the NP and its “generalized” counterpart, GNP, noting that this generalization was only superficial. Then we offered a few desiderata on chance-credence norms and checked whether the three candidates under discussion here meet them. It seems that the NP is clearly the leader here: it is preserved by BC, in at least one aspect it handles conditional chances correctly (i.e., it is a “dot principle”), and it is not troubled by the phenomenon of modesty.Footnote 24 However, one outstanding issue remains: if a nontrivial chance function is not fundamentally self-deceiving, there have to be at least two possible worlds for which it is the ur-chance function. Perhaps it is a small price to pay for the benefits the principle brings; and perhaps, for many chance metaphysicians, it is no price at all.