1 Introduction

Principles of expert deference play a prominent role in Bayesian epistemology.Footnote 1 For an example of a principle of expert deference: Lewis (1980)’s Principal Principle tells you that, in the absence of an extraordinary form of evidence, you should defer to the future objective chances. That is, given that the objective chance of an arbitrary proposition, A, is n, your own subjective probability, or credence, in A should be n, too. That is, if C is your credence function, and \(\langle {\mathcal {C}}h_t(A) = n \rangle\) is the proposition that the time t chance of A is n, then you should satisfy the equality:Footnote 2

$$\begin{aligned} C(A \mid \langle {\mathcal {C}}h_t(A) = n \rangle ) = n \end{aligned}$$

That’s one principle of expert deference. For another: Rational Reflection says that you should defer to the ideally rational credences for someone with your evidence to have. (For discussion, see Christensen, 2010, Elga, 2013, and Lasonen-Aarnio, 2015.) That is, given that the rational credence function for you to have is R, your credence in any proposition A should be R(A).

$$\begin{aligned} C(A \mid \langle {\mathcal {R}} = R \rangle ) = R(A) \end{aligned}$$

(Here, ‘\(\langle {\mathcal {R}} = R \rangle\)’ says that the rational credence function for someone with your evidence is R.)

These principles both tell you to defer to some expert probability function, but they take different forms. The first tells you to defer to the expert conditional on their views about any proposition; whereas the second tells you to defer to the expert conditional on their views about every proposition (that is: conditional on their entire probability function). We can call the first a norm of local expert deference, and the second a norm of global expert deference.


Local deference You locally defer to an expert, \({\mathcal {E}}\), iff, for any proposition, A, and any number n, your credence in A, given that \({\mathcal {E}}\)’s probability for A is n, is n.

$$\begin{aligned} C(A \mid \langle {\mathcal {E}}(A) = n \rangle ) = n \end{aligned}$$

Global deference You globally defer to an expert, \({\mathcal {E}}\), iff, for any proposition A, and any probability function E, your credence in A, given that \({\mathcal {E}}\)’s entire probability function is E, is whatever probability E gives to A.

$$\begin{aligned} C(A \mid \langle {\mathcal {E}}= E \rangle ) = E(A) \end{aligned}$$

It’s not obvious what the relationship is between these two different ways of showing deference to an expert. It’s natural to think that they’re equivalent, in the sense that you will globally defer to an expert function \({\mathcal {E}}\) if and only if you locally defer to \({\mathcal {E}}\). However, as we’ll see in §2 below, this isn’t quite right. While globally deferring to \({\mathcal {E}}\) entails locally deferring to \({\mathcal {E}}\), an example from Gaifman (1988) teaches us that the converse is not true. In some cases, you can locally defer to \({\mathcal {E}}\) without globally deferring to \({\mathcal {E}}\).

Stalnaker (2019, pp. 111–12) speculates that Gaifman’s example is “a loophole—a contrived case where [a principle of local deference] is satisfied without its usual motivation”. Here, I will substantiate Stalnaker’s suspicions. I’ll argue that the differences between local and global deference are so incredibly slight as to be philosophically negligible: there is no good reason to accept the weaker local deference norm without accepting the stronger global deference norm. To that end, I will precisely characterise the situations in which global and local deference principles come apart. This characterisation will show us that Gaifman’s original example of an expert who may be deferred to locally but not globally is, in a good sense, the only expert like this. So the kinds of situations in which it is possible to defer locally without deferring globally are incredibly singular and fragile. And there is no reason to think that these kinds of cases are epistemologically special. The upshot is that Bayesians should have no qualms about moving freely back and forth between global and local formulations of principles of expert deference. While they are not strictly speaking equivalent, they are equivalent for all philosophical purposes.

2 How local and global deference norms differ

I’m going to take for granted here that your credence function, C, is a countably additive probability function, defined over subsets of a space of possible worlds, \({\mathcal {W}}\). For the sake of simplicity, I’m going to assume that \({\mathcal {W}}\) is at most countably infinite. I’ll call any \(A \subseteq {\mathcal {W}}\) a ‘proposition’, and since \({\mathcal {W}}\) is at most countably infinite, we can suppose that C gives a probability to every proposition.

I’ll suppose that you are certain that the expert’s probability function is defined over exactly the same algebra of propositions as your own, namely the powerset of \({\mathcal {W}}\), \({\mathbb {P}}({\mathcal {W}})\). And I’ll suppose that we have a function from worlds in \({\mathcal {W}}\) to probability distributions over \({\mathbb {P}}({\mathcal {W}})\), which I’ll write ‘\({\mathcal {E}}\)’. The value of this function, given the argument w—which I’ll write ‘\({\mathcal {E}}_w\)’—will be interpreted as the probability function the expert has at the world w. With this function, we can form the proposition that the expert’s probability function is E (for some probability distribution E), by gathering together all the worlds \(w \in {\mathcal {W}}\) such that \({\mathcal {E}}_w = E\).

$$\begin{aligned} \langle {\mathcal {E}}= E \rangle =_{df} \{ w \in {\mathcal {W}}\mid {\mathcal {E}}_w = E \} \end{aligned}$$

We may likewise form the proposition that \({\mathcal {E}}\)’s probability for A is n by gathering together all of the worlds \(w \in {\mathcal {W}}\) such that \({\mathcal {E}}_w(A) = n\).

$$\begin{aligned} \langle {\mathcal {E}}(A) = n \rangle =_{df} \{ w \in {\mathcal {W}}\mid {\mathcal {E}}_w(A) = n \} \end{aligned}$$

Given this setup, if you defer to \({\mathcal {E}}\) globally, then you will defer to \({\mathcal {E}}\) locally as well. To appreciate this, just notice that \(\langle {\mathcal {E}}(A) = n \rangle\) is partitioned by the set of all propositions of the form \(\langle {\mathcal {E}}= E \rangle\), for some E that gives a probability of n to A. It then follows from conglomerability that, if \(C(A \mid \langle {\mathcal {E}}= E \rangle ) = n\) for each E such that \(E(A) = n\), then \(C(A \mid \langle {\mathcal {E}}(A) = n \rangle )\) must also be n.Footnote 3
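Spelled out, this step is an instance of the law of total probability. The following is a sketch under the assumptions of the setup above, where the sums range over the (at most countably many) functions E such that \(E(A) = n\) and \(C(\langle {\mathcal {E}}= E \rangle ) > 0\), and where \(\langle {\mathcal {E}}(A) = n \rangle\) is assumed to have positive credence:

$$\begin{aligned} C(A \mid \langle {\mathcal {E}}(A) = n \rangle )&= \sum _{E} C(A \mid \langle {\mathcal {E}}= E \rangle ) \cdot C(\langle {\mathcal {E}}= E \rangle \mid \langle {\mathcal {E}}(A) = n \rangle ) \\&= n \cdot \sum _{E} C(\langle {\mathcal {E}}= E \rangle \mid \langle {\mathcal {E}}(A) = n \rangle ) = n \end{aligned}$$

The first equality holds because the propositions \(\langle {\mathcal {E}}= E \rangle\) partition \(\langle {\mathcal {E}}(A) = n \rangle\); the second uses global deference, which sets each \(C(A \mid \langle {\mathcal {E}}= E \rangle )\) to \(E(A) = n\); and the final sum is 1 because the cells of the partition exhaust \(\langle {\mathcal {E}}(A) = n \rangle\).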

So global deference implies local deference. But the converse is false.

Example 1

(Gaifman, 1988) There are three worlds in \({\mathcal {W}}\), which we will call ‘1’, ‘2’, and ‘3’. At world 1, the expert gives 50% probability to 1 and 50% probability to 2. At world 2, the expert gives 50% probability to 2 and 50% probability to 3. At world 3, the expert gives 50% probability to 3 and 50% probability to 1.

We can represent the expert from Example 1 with a square matrix, where the entry in the rth row and the cth column gives us the probability which the expert gives to world c at the world r, \({\mathcal {E}}_r(c)\). (Throughout, I’m going to adopt the convention of using expressions like ‘\({\mathcal {E}}_1(3)\)’ and ‘\({\mathcal {E}}_2(1 \vee 3)\)’ for \({\mathcal {E}}_1(\{ 3 \})\) and \({\mathcal {E}}_2(\{ 1, 3 \})\), respectively.)

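$$\begin{aligned} {\mathcal {E}}= \begin{bmatrix} 1/2 &amp; 1/2 &amp; 0 \\ 0 &amp; 1/2 &amp; 1/2 \\ 1/2 &amp; 0 &amp; 1/2 \end{bmatrix} \end{aligned}$$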

Gaifman’s example is interesting because, if you spread your credences uniformly, so that \(C(1) = C(2) = C(3) = 1/3\), then you will defer to \({\mathcal {E}}\) locally, but not globally. For instance, your credence in \(1 \vee 2\), given \(\langle {\mathcal {E}}(1 \vee 2) = 1/2 \rangle\), is just \(C(1 \vee 2 \mid 2 \vee 3)\) (since \({\mathcal {E}}\)’s credence in \(1 \vee 2\) is 1/2 at worlds 2 and 3), and if your credences are uniform, then \(C( 1 \vee 2 \mid 2 \vee 3)\) is 1/2. Moreover, as you can check for yourself, this works for every \(A \subseteq \{ 1, 2, 3 \}\) and every n: \(C(A \mid \langle {\mathcal {E}}(A) = n \rangle ) = n\) whenever \(\langle {\mathcal {E}}(A) = n \rangle\) is given a credence greater than 0. So, with the uniform credence distribution, you defer to \({\mathcal {E}}\) locally. But you do not defer globally, since \(C(2 \mid \langle {\mathcal {E}}= {\mathcal {E}}_2 \rangle ) = C(2 \mid 2) = 1\), even though \({\mathcal {E}}_2\)’s credence in 2 is only 1/2.
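The check that this "works for every A and every n" can also be done mechanically. What follows is only a minimal sketch, not part of Gaifman's presentation: the helper names (`prob`, `propositions`, `defers_locally`, `defers_globally`) are hypothetical, worlds 1, 2, 3 are represented by the indices 0, 1, 2, and conditional credences on zero-credence propositions are treated as imposing no constraint.

```python
from fractions import Fraction
from itertools import combinations

def prob(p, A):
    """Probability that the distribution p (a list indexed by world) gives to the set A."""
    return sum((p[w] for w in A), Fraction(0))

def propositions(n):
    """Every subset of the worlds 0..n-1."""
    return [frozenset(s) for k in range(n + 1) for s in combinations(range(n), k)]

def defers_locally(C, E):
    """True iff C(A | <E(A) = v>) == v for every A and every v with C(<E(A) = v>) > 0."""
    n = len(C)
    for A in propositions(n):
        for v in {prob(E[w], A) for w in range(n)}:
            cond = frozenset(w for w in range(n) if prob(E[w], A) == v)
            if prob(C, cond) > 0 and prob(C, A & cond) / prob(C, cond) != v:
                return False
    return True

def defers_globally(C, E):
    """True iff C(A | <E = E_w>) == E_w(A) for every A and every w with C(<E = E_w>) > 0."""
    n = len(C)
    for w in range(n):
        cond = frozenset(x for x in range(n) if E[x] == E[w])
        if prob(C, cond) == 0:
            continue
        for A in propositions(n):
            if prob(C, A & cond) / prob(C, cond) != prob(E[w], A):
                return False
    return True

half = Fraction(1, 2)
# Row r is the expert's distribution at world r, as in Example 1.
E = [[half, half, 0],
     [0, half, half],
     [half, 0, half]]
C = [Fraction(1, 3)] * 3  # the uniform credence function

print(defers_locally(C, E))   # True
print(defers_globally(C, E))  # False
```

Exact rational arithmetic (`Fraction`) is used so that the equalities are tested exactly rather than up to floating-point error.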

We can pull the same trick with more worlds. For instance, if \({\mathcal {W}}= \{ 1, 2, 3, 4, 5 \}\), and the expert function is given by this matrix,

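$$\begin{aligned} {\mathcal {E}}= \begin{bmatrix} 1/2 &amp; 1/2 &amp; 0 &amp; 0 &amp; 0 \\ 0 &amp; 1/2 &amp; 1/2 &amp; 0 &amp; 0 \\ 0 &amp; 0 &amp; 1/2 &amp; 1/2 &amp; 0 \\ 0 &amp; 0 &amp; 0 &amp; 1/2 &amp; 1/2 \\ 1/2 &amp; 0 &amp; 0 &amp; 0 &amp; 1/2 \end{bmatrix} \end{aligned}$$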

then the uniform credence distribution (the one which gives credence 1/5 to every world) will defer locally, but not globally, to this expert.
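To see the trick generalise, here is a rough sketch that builds a cyclic expert of any size and tests the uniform credence function against it. It assumes, by analogy with Example 1, that the expert at each world gives 1/2 to that world and 1/2 to the next world in the cycle; `cyclic_expert` and the other helper names are hypothetical, and worlds are numbered from 0.

```python
from fractions import Fraction
from itertools import combinations

def prob(p, A):
    """Probability that the distribution p gives to the set of worlds A."""
    return sum((p[w] for w in A), Fraction(0))

def propositions(n):
    """Every subset of the worlds 0..n-1."""
    return [frozenset(s) for k in range(n + 1) for s in combinations(range(n), k)]

def defers_locally(C, E):
    """True iff C(A | <E(A) = v>) == v whenever the condition has positive credence."""
    n = len(C)
    for A in propositions(n):
        for v in {prob(E[w], A) for w in range(n)}:
            cond = frozenset(w for w in range(n) if prob(E[w], A) == v)
            if prob(C, cond) > 0 and prob(C, A & cond) / prob(C, cond) != v:
                return False
    return True

def defers_globally(C, E):
    """True iff C(A | <E = E_w>) == E_w(A) whenever the condition has positive credence."""
    n = len(C)
    for w in range(n):
        cond = frozenset(x for x in range(n) if E[x] == E[w])
        if prob(C, cond) == 0:
            continue
        for A in propositions(n):
            if prob(C, A & cond) / prob(C, cond) != prob(E[w], A):
                return False
    return True

def cyclic_expert(n):
    """Assumed form: at world w, give 1/2 to w and 1/2 to the next world (mod n)."""
    E = [[Fraction(0)] * n for _ in range(n)]
    for w in range(n):
        E[w][w] = Fraction(1, 2)
        E[w][(w + 1) % n] = Fraction(1, 2)
    return E

for n in range(3, 8):
    uniform = [Fraction(1, n)] * n
    E = cyclic_expert(n)
    # Each line prints the size, then True (local deference), then False (global deference).
    print(n, defers_locally(uniform, E), defers_globally(uniform, E))
```

The local check succeeds for every size because, going around a cycle, the number of worlds at which a proposition A is "exited" always equals the number at which it is "entered", so exactly half of the worlds where the expert gives A probability 1/2 are A-worlds.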

Another helpful way of looking at an expert function, \({\mathcal {E}}\), is with a Kripke frame \(({\mathcal {W}}, R)\), where we stipulate that world w ‘sees’ a world, x, wRx, iff the expert at w gives positive probability to x, \({\mathcal {E}}_w(x) >0\). For illustration, the expert from the 5 world model above gives rise to the frame in Fig. 1.

Fig. 1: The expert frame generated by the 5 world cyclic expert

Call any collection of worlds like this a cycle: a collection \({\mathscr {C}}\), containing at least 3 worlds, such that each world in \({\mathscr {C}}\) bears R to itself and exactly one other world, and every \(w \in {\mathscr {C}}\) bears \(R^+\) (the transitive closure of R) to every other world in \({\mathscr {C}}\). If \({\mathscr {C}}\) is a cycle and, moreover, for every \(w \in {\mathscr {C}}\), \({\mathcal {E}}_w\) gives exactly half of its probability to w, then I’ll say that \({\mathscr {C}}\) is a ‘half-cycle’. Finally, if the frame that \({\mathcal {E}}\) gives rise to contains some half-cycle, then I’ll say that \({\mathcal {E}}\) is a half-cyclic expert.


Half-cyclicity An expert \({\mathcal {E}}\) is half-cyclic if and only if the frame it generates contains a cycle \({\mathscr {C}}\) such that, for every \(w \in {\mathscr {C}}\), \({\mathcal {E}}_w(w) = 1/2\).
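For finite models, half-cyclicity can be tested directly. The sketch below is one way of reading the definition, not a canonical algorithm: it assumes that "bears R to itself and exactly one other world" means that the expert at w gives positive probability to w and to exactly one other world, and the names `half_cycles` and `is_half_cyclic` are hypothetical.

```python
from fractions import Fraction
from typing import FrozenSet, List

def half_cycles(E: List[list]) -> List[FrozenSet[int]]:
    """Return the half-cycles in the frame generated by the expert matrix E."""
    n = len(E)
    half = Fraction(1, 2)
    # succ[w] is the one other world w sees, for worlds that see exactly
    # themselves (with probability 1/2) and one other world.
    succ = {}
    for w in range(n):
        seen = [x for x in range(n) if E[w][x] > 0]
        if E[w][w] == half and len(seen) == 2:
            succ[w] = next(x for x in seen if x != w)
    cycles, visited = [], set()
    for start in succ:
        if start in visited:
            continue
        path, w = [start], succ[start]
        # Follow successors until we leave the candidate worlds or revisit one.
        while w in succ and w not in path:
            path.append(w)
            w = succ[w]
        # A cycle must return to its starting world and contain at least 3 worlds.
        if w == start and len(path) >= 3:
            cycles.append(frozenset(path))
            visited.update(path)
    return cycles

def is_half_cyclic(E: List[list]) -> bool:
    return len(half_cycles(E)) > 0

half = Fraction(1, 2)
gaifman = [[half, half, 0], [0, half, half], [half, 0, half]]
omniscient = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # certain of the actual world

print(half_cycles(gaifman))        # [frozenset({0, 1, 2})]
print(is_half_cyclic(omniscient))  # False
```

Because each candidate world has a single successor, following successors from a world either returns to it (a cycle) or never does, so no general graph search is needed.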

Whenever an expert is half-cyclic, it will be possible to defer to them locally but not globally. In the appendix, I prove the following theorems:

Theorem 1

If \({\mathcal {E}}\) is half-cyclic, then C will defer to \({\mathcal {E}}\) locally but not globally if it spreads its credence uniformly over each half-cycle and gives a probability of 0 to any world not in a half-cycle.

Theorem 2

If \({\mathcal {E}}\) is half-cyclic, then C defers to \({\mathcal {E}}\) locally only if C is uniform over every half-cycle.
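Theorems 1 and 2 can be illustrated (though of course not proved) by brute force on Example 1. The sketch below searches a finite grid of credence functions, those whose values are multiples of 1/12, and reports which of them locally defer to Gaifman's expert. The grid resolution and the helper names are arbitrary choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations

def prob(p, A):
    """Probability that the distribution p gives to the set of worlds A."""
    return sum((p[w] for w in A), Fraction(0))

def propositions(n):
    """Every subset of the worlds 0..n-1."""
    return [frozenset(s) for k in range(n + 1) for s in combinations(range(n), k)]

def defers_locally(C, E):
    """True iff C(A | <E(A) = v>) == v whenever the condition has positive credence."""
    n = len(C)
    for A in propositions(n):
        for v in {prob(E[w], A) for w in range(n)}:
            cond = frozenset(w for w in range(n) if prob(E[w], A) == v)
            if prob(C, cond) > 0 and prob(C, A & cond) / prob(C, cond) != v:
                return False
    return True

half = Fraction(1, 2)
E = [[half, half, 0], [0, half, half], [half, 0, half]]

# Every credence function over three worlds whose values are multiples of 1/12.
grid = [[Fraction(a, 12), Fraction(b, 12), Fraction(12 - a - b, 12)]
        for a in range(13) for b in range(13 - a)]

passing = [C for C in grid if defers_locally(C, E)]
print(passing)  # [[Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]]
```

Only the uniform credence function survives: it passes, as Theorem 1 says it will, and every other grid point fails, as Theorem 2 requires.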

When else is it possible to defer locally but not globally? Never. The half-cyclic experts are the only ones to whom you can defer locally without deferring to them globally. In the appendix, I prove

Theorem 3

If \({\mathcal {E}}\) is not half-cyclic, then C defers to \({\mathcal {E}}\) locally iff C defers to \({\mathcal {E}}\) globally.

This tells us that Gaifman’s example is incredibly singular. We can vary the size of the half-cycles, but that’s it. In no other kind of case do the local and global norms pull apart.

3 Why the difference is philosophically negligible

In my view, this theorem teaches us something helpful. It teaches us that we don’t have to concern ourselves with the differences between local and global norms of deference. For it teaches us that there is no philosophically plausible reason anyone could have to endorse a norm of local deference while denying the corresponding norm of global deference. I’ll give two independent reasons to think that such a position is implausible in §3.1 and §3.2 below.

3.1 Drawing new distinctions

Suppose that we begin with the model from Example 1, and we simply introduce a new distinction. Perhaps, for each world w, we introduce two new worlds, \(w_H\) and \(w_T\), where \(w_H\) is the possibility previously represented by w, plus the additional information that a flipped coin landed heads, and \(w_T\) is the possibility previously represented by w, plus the additional information that the coin landed tails. And suppose that each possible expert gives a probability of 1/2 to the coin landing heads and a probability of 1/2 to the coin landing tails, and takes the outcome of the coin flip to be independent of whether 1, 2, or 3. Then, including this additional distinction gives us the following expert:

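$$\begin{aligned} {\mathcal {E}}= \begin{bmatrix} 1/4 &amp; 1/4 &amp; 1/4 &amp; 1/4 &amp; 0 &amp; 0 \\ 1/4 &amp; 1/4 &amp; 1/4 &amp; 1/4 &amp; 0 &amp; 0 \\ 0 &amp; 0 &amp; 1/4 &amp; 1/4 &amp; 1/4 &amp; 1/4 \\ 0 &amp; 0 &amp; 1/4 &amp; 1/4 &amp; 1/4 &amp; 1/4 \\ 1/4 &amp; 1/4 &amp; 0 &amp; 0 &amp; 1/4 &amp; 1/4 \\ 1/4 &amp; 1/4 &amp; 0 &amp; 0 &amp; 1/4 &amp; 1/4 \end{bmatrix} \end{aligned}$$

(Rows and columns are ordered \(1_H, 1_T, 2_H, 2_T, 3_H, 3_T\).)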

And Theorem 3 assures us that, while the half-cyclic expert from Example 1 could be deferred to locally, this non-half-cyclic expert cannot.Footnote 4 Attending to an additional distinction like whether a coin landed heads or tails should only make a difference to whether the expert \({\mathcal {E}}\) is deserving of epistemic deference if there is something irrational about the probabilities \({\mathcal {E}}\) assigns to the coin landing heads or tails. But in this case, there is nothing irrational about \({\mathcal {E}}\)’s probabilities. The coin is fair and independent of whether 1, 2,  or 3. So conditional on 1, conditional on 2, and conditional on 3, the expert should divide their probability evenly between heads and tails. So, if a half-cyclic expert is deserving of epistemic deference, then, after we introduce a new, independent distinction—dividing each former possibility into an equally likely ‘heads’ and ‘tails’ possibility—the new expert should also be deserving of epistemic deference.

However, if you endorsed a local norm of deference while rejecting the corresponding global norm of deference, you would be forced to disagree. For then, you would think that introducing this new distinction does make a difference to whether \({\mathcal {E}}\) is deserving of epistemic deference. I take that to be rather implausible; so I take it to be rather implausible that a local norm of deference holds without the corresponding global norm holding.
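The effect of the new distinction can be checked on the uniform credence function. The sketch below assumes the refined worlds are ordered \(1_H, 1_T, 2_H, 2_T, 3_H, 3_T\) and that the expert at \(w_H\) and at \(w_T\) has the same probability function; `refine_with_coin` and the other names are hypothetical. It tests only the uniform credence function, whereas Theorem 3 covers every credence function.

```python
from fractions import Fraction
from itertools import combinations

def prob(p, A):
    """Probability that the distribution p gives to the set of worlds A."""
    return sum((p[w] for w in A), Fraction(0))

def propositions(n):
    """Every subset of the worlds 0..n-1."""
    return [frozenset(s) for k in range(n + 1) for s in combinations(range(n), k)]

def defers_locally(C, E):
    """True iff C(A | <E(A) = v>) == v whenever the condition has positive credence."""
    n = len(C)
    for A in propositions(n):
        for v in {prob(E[w], A) for w in range(n)}:
            cond = frozenset(w for w in range(n) if prob(E[w], A) == v)
            if prob(C, cond) > 0 and prob(C, A & cond) / prob(C, cond) != v:
                return False
    return True

def refine_with_coin(E):
    """Split each world into a heads and a tails world; the expert at either
    gives each refined world half of the probability the old world received."""
    n = len(E)
    half = Fraction(1, 2)
    return [[E[i][j] * half for j in range(n) for _ in range(2)]
            for i in range(n) for _ in range(2)]

half = Fraction(1, 2)
E3 = [[half, half, 0], [0, half, half], [half, 0, half]]
E6 = refine_with_coin(E3)

print(defers_locally([Fraction(1, 3)] * 3, E3))  # True
print(defers_locally([Fraction(1, 6)] * 6, E6))  # False
```

One proposition on which the uniform credence function fails is \(1_H \vee 1_T \vee 2_H\): the refined expert gives it probability 3/4 exactly at \(1_H\) and \(1_T\), but conditional on being at one of those two worlds your credence in it is 1.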

3.2 Learning the expert’s evidence

In the introduction, I said that Lewis’s Principal Principle tells you to locally defer to the future objective chances. That’s true, but it’s slightly misleading, because it also tells you to globally defer to the future objective chances. Lewis’s principle has the form of what we can call a conditional local deference principle. It says that your initial or ur-prior credence function, \(C_0\), should locally defer to the future objective chances conditional on any admissible evidence. That is: for any proposition A, any future time t, any number n, and any admissible evidence proposition F, you should satisfy the equality

$$\begin{aligned} C_0(A \mid \langle {\mathcal {C}}h_t(A) = n \rangle \cap F) = n \end{aligned}$$

If your total evidence is admissible, then conditionalisation says that \(C(-)\) should be \(C_0(- \mid F)\), so this norm implies the one from the introduction.

Lewis thought (back in 1980, at least) that propositions about the time t chances were themselves admissible. And he thought that admissibility was closed under conjunction. So we can take any probability function ch such that \(ch(A) = n\), and any admissible evidence F, and the Principal Principle will require that

$$\begin{aligned} C_0(A \mid \langle {\mathcal {C}}h_t(A) = n \rangle \cap \langle {\mathcal {C}}h_t = ch \rangle \cap F) = n \end{aligned}$$

Now, notice that \(\langle {\mathcal {C}}h_t(A) = n \rangle \cap \langle {\mathcal {C}}h_t = ch \rangle\) is just \(\langle {\mathcal {C}}h_t = ch \rangle\), and n is just ch(A), so this is equivalent to a conditional global norm which requires that

$$\begin{aligned} C_0(A \mid \langle {\mathcal {C}}h_t = ch \rangle \cap F) = ch(A) \end{aligned}$$

which is why, in his original 1980 article, Lewis was able to freely move back and forth between a local and a global version of the Principal Principle.

There’s a general lesson here. For we often don’t just want to suggest that you should defer to an expert now, given the evidence you currently have. We generally want to say that you should continue to defer to them, even after you’ve received certain kinds of evidence. Taking Lewis’s lead, call this kind of evidence ‘admissible’. Then, consider the following two ways of showing deference:

Conditional local deference You conditionally locally defer to an expert, \({\mathcal {E}}\), iff, for any proposition A, any number n, and any admissible evidence F, your credence in A, given that \({\mathcal {E}}\)’s probability for A is n, and given F, is n.

$$\begin{aligned} C(A \mid \langle {\mathcal {E}}(A) = n \rangle \cap F) = n \end{aligned}$$

Conditional global deference You conditionally globally defer to an expert, \({\mathcal {E}}\), iff, for any proposition A, any probability function E, and any admissible evidence F, your credence in A, given that \({\mathcal {E}}\)’s entire probability function is E, and given F, is whatever probability E gives to A.

$$\begin{aligned} C(A \mid \langle {\mathcal {E}}= E \rangle \cap F) = E(A) \end{aligned}$$

Intuitively, evidence F is admissible iff you should continue deferring to \({\mathcal {E}}\) even after you have F as your total evidence. Here’s a general principle about admissible evidence that we should want to accept in a wide variety of cases: if F might be the expert’s total evidence, then F is admissible.


Admissibility of expert evidence For any possible world w such that \(C(w)>0\), \({\mathcal {E}}\)’s total evidence at w is admissible.


In other words, if you should show epistemic deference to \({\mathcal {E}}\), then for any possible world w with positive credence, after learning \({\mathcal {E}}\)’s total evidence at w, you should continue to show epistemic deference to \({\mathcal {E}}\).

If we accept the admissibility of expert evidence, then there will be no difference between a norm of conditional local deference and a norm of conditional global deference. To appreciate this, notice that a norm of conditional local deference says not only that \(C(-)\) should locally defer to \({\mathcal {E}}\), but also that, for any admissible F, \(C(- \mid F)\) should locally defer to \({\mathcal {E}}\), too. Now, Theorem 3 teaches us that the only way it could be possible for \(C(- \mid F)\) to defer to \({\mathcal {E}}\) locally but not globally is if \({\mathcal {E}}\) is a half-cyclic expert. In that case, \({\mathcal {E}}\)’s evidence at every world w in a half-cycle is \(w \vee wR\), where ‘wR’ is w’s successor in the cycle. If expert evidence is admissible, then for any world w with positive credence, \(w \vee wR\) is admissible, and a norm of conditional local deference will require that \(C(- \mid w \vee wR)\) locally defer to \({\mathcal {E}}\). Since \(C(- \mid w \vee wR)\) only gives positive probability to two worlds within a cycle, and every cycle contains at least three worlds, it is not uniform over the half-cycle containing w. So Theorem 2 assures us that it does not locally defer to \({\mathcal {E}}\). Hence, if expert evidence is admissible, then it is impossible to conditionally locally defer to a half-cyclic expert. It follows that, if expert evidence is admissible, then it is possible to conditionally locally defer to all and only the experts it is possible to conditionally globally defer to. And whenever you conditionally locally defer, you will also conditionally globally defer.
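The key step of this argument can be illustrated with Example 1. The sketch below conditionalises the uniform credence function on the expert's evidence at each world (taken, as in the text, to be the world itself or its successor) and checks whether the result still locally defers. The helper names are hypothetical and worlds are numbered from 0.

```python
from fractions import Fraction
from itertools import combinations

def prob(p, A):
    """Probability that the distribution p gives to the set of worlds A."""
    return sum((p[w] for w in A), Fraction(0))

def propositions(n):
    """Every subset of the worlds 0..n-1."""
    return [frozenset(s) for k in range(n + 1) for s in combinations(range(n), k)]

def defers_locally(C, E):
    """True iff C(A | <E(A) = v>) == v whenever the condition has positive credence."""
    n = len(C)
    for A in propositions(n):
        for v in {prob(E[w], A) for w in range(n)}:
            cond = frozenset(w for w in range(n) if prob(E[w], A) == v)
            if prob(C, cond) > 0 and prob(C, A & cond) / prob(C, cond) != v:
                return False
    return True

def conditionalise(C, F):
    """C(- | F), assuming C gives F positive credence."""
    total = prob(C, F)
    return [C[w] / total if w in F else Fraction(0) for w in range(len(C))]

half = Fraction(1, 2)
E = [[half, half, 0], [0, half, half], [half, 0, half]]
C = [Fraction(1, 3)] * 3  # locally defers to E before any evidence comes in

for w in range(3):
    F = {w, (w + 1) % 3}  # the expert's evidence at w: w or its successor
    C_F = conditionalise(C, F)
    print(sorted(F), defers_locally(C_F, E))  # False for each w
```

After learning what the expert's evidence is, the credence function is concentrated on two of the three worlds, so it is no longer uniform over the half-cycle and local deference is lost.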

That is: if expert evidence is admissible, then there is no difference between a norm of conditional local deference and the corresponding norm of conditional global deference. It is not plausible to think that some expert \({\mathcal {E}}\) is deserving of epistemic deference, but that \({\mathcal {E}}\) might not be deserving of deference, were you to learn what \({\mathcal {E}}\)’s evidence is. So it is not plausible to endorse a norm of local deference without endorsing the corresponding global norm.