Admissibility Troubles for Bayesian Direct Inference Principles

Wallmann, Christian; Hawthorne, James

doi:10.1007/s10670-018-0070-0

Original Research
Open access
Published: 24 November 2018

Volume 85, pages 957–993, (2020)

Erkenntnis


Abstract

Direct inferences identify certain probabilistic credences or confirmation-function-likelihoods with values of objective chances or relative frequencies. The best known version of a direct inference principle is David Lewis’s Principal Principle. Certain kinds of statements undermine direct inferences. Lewis calls such statements inadmissible. We show that on any Bayesian account of direct inference several kinds of intuitively innocent statements turn out to be inadmissible. This may pose a significant challenge to Bayesian accounts of direct inference. We suggest some ways in which these challenges may be addressed.

1 Introduction

Direct inferences identify values of some probabilistic credences with values of objective chances or relative frequencies. The main idea has been around for a long time. It goes by various names and has been articulated in a variety of ways.^{Footnote 1} Peirce calls it “probable deduction.” Contemporary logicians sometimes call it “statistical syllogism.” David Lewis’s Principal Principle is perhaps the most widely known version of an explicit direct inference principle (Lewis 1980).

Accounts of direct inference usually draw on two distinct notions of probability: an object-language notion, either relative frequency or some notion of objective chance, and a higher level metalinguistic notion that applies to object-language expressions, usually characterized as some kind of logical probability or as a probabilistic measure of rational credence. Carnap (1962), for instance, calls the object language notion $probability_{2}$, and takes it to represent relative frequencies of attributes among members of populations. He calls the metalanguage notion $probability_{1}$, and takes it to be a kind of degree of logical entailment, which he calls “degree of confirmation.”

For notational convenience we write ‘P’ for the $probability_1$ notion and ‘ch’ for the $probability_2$ notion. Although we will often take the ch function to represent some kind of objective chance, in most contexts the reader may interpret it to be either a chance function or a relative frequency function. In either case, expressions involving the function ch will take the form: ‘$ch(Ax,Rx)=r$’. On a reading of ‘ch’ as relative frequency, this expression says that the frequency of objects (or systems, or events) possessing attribute A among those in reference class R is r. On the reading of ‘ch’ as chance, this expression says that the chance that a system in initial state R will acquire attribute A is r.

Letting P represent the $probability_{1}$ notion and taking ch to represent the $probability_{2}$ notion, here is a generic version of a direct inference principle. Later we’ll extend it to more complex chance hypotheses.^{Footnote 2}

Generic Direct Inference Principle—G-DIP:^{Footnote 3}

Let P be an “appropriate” probability function on a language that contains chance (or frequency) claims. Let ‘$ch(Ax,Rx)=r$’ be an object-language statement that says that the chance that a system in state R acquires attribute A is r (alternatively, that the frequency of possessing attribute A among objects in reference class R is r), where r is a standard term for a real number between 0 and 1 (inclusive). Let ‘Rc’ say that system c is in state (or reference class) R, and let ‘Ac’ say that system c acquires (or possesses) attribute A. Then,
$$\begin{aligned} P[Ac \,|\, ch(Ax,Rx)=r \cdot Rc\cdot E] = r, \end{aligned}$$
provided that E is both consistent with $(ch(Ax,Rx)=r \cdot Rc)$ and admissible with respect to $(ch(Ax,Rx)=r \cdot Rc)$ (where tautologies are always considered admissible).^{Footnote 4}

We won’t attempt to spell out an account of admissibility; doing so is a complex and controversial undertaking, and no specific account need be supposed here. Thau’s proposal works well enough for our purposes: “A proposition is inadmissible if it provides direct information about what the outcome of some chance event is.” (Thau 1994, p. 500, emphasis added)

Since tautologies are always admissible, the admissibility of any other statement E requires that E be probabilistically independent of Ac, given $(ch(Ax,Rx)=r \cdot Rc)$ (for P). However, admissibility does not simply reduce to probabilistic independence; rather, it is designed to motivate probabilistic independence in appropriate cases. For instance, Lewis’s substantive account (in Lewis 1980) declares a statement admissible for a direct inference provided that it contains only information about particular matters of fact that occur before the time at which the associated chance outcome occurs. On this account, all statements about future particular matters of fact are inadmissible, even those that may happen to be probabilistically independent of Ac given chance claim $(ch(Ax,Rx)=r\cdot Rc)$.^{Footnote 5}

When a statement D fails to be probabilistically independent of Ac, given $(ch(Ax,Rx)=r \cdot Rc\cdot E)$ for admissible E (for probability function P), then we say that D defeats the corresponding direct inference. That is, defeat of a direct inference by D just means that $P[Ac \,|\, D \cdot ch(Ax,Rx)=r \cdot Rc \cdot E] \ne P[Ac \,|\, ch(Ax,Rx)=r \cdot Rc \cdot E] = P[Ac \,|\, ch(Ax,Rx)=r \cdot Rc] = r$ for admissible E.

Notice that if D is a defeater, then on any adequate account of admissibility, $(D \cdot E)$ must be inadmissible for the direct inference, since failure of probabilistic independence is a sure-fire way for admissibility to fail. But it’s also possible for admissibility to fail in cases where probabilistic independence remains intact. In such a case, although D (or $(D \cdot E)$) is inadmissible, D does not count as a direct inference defeater, at least not as we use that term in this paper. Thus, as we use the term, a direct inference defeater is a particularly strong kind of inadmissible statement.^{Footnote 6}

We will investigate several kinds of cases where, on purely logical grounds, when P satisfies the classical axioms of probability, direct inference outcomes must fail to be probabilistically independent of a statement D. Thus, any account of direct inference based on G-DIP will rule the defeating statement D to be inadmissible, regardless of the particular account of admissibility employed. These are the kinds of troubles we consider. These troubles pose significant challenges if an agent wants to use these probability functions in a certain epistemic situation she finds herself in. One such use is to determine one’s current credence via the total evidence requirement.

For Bayesians, the logic of credence functions (or confirmation functions) is captured by the way in which the axioms of probability theory constrain the numerical values of $P[A \,|\, B]$ for the range of statements A and B, often under conditions (or suppositions) that constrain the probability values of other statements. Logically speaking, a direct inference rule such as G-DIP is merely an additional axiomatic constraint. Any function P that satisfies the other axioms, but violates the direct inference rule, is “ruled out” for failing to be an “appropriate” credence (or confirmation) function.^{Footnote 7} However, the further issue of how a rational agent is supposed to apply these functions, given the situation in which she finds herself, including her current state of knowledge, is not a purely logical matter. Carnap realized this long ago. His Requirement of Total Evidence is merely a way to make explicit our usual implicit assumptions about how an agent is supposed to apply her credence (or confirmation) function. Here is a fairly close paraphrase of Carnap’s requirement, adapted to apply to the P functions of G-DIP.

Total Evidence Requirement: Suppose that the logic of credence functions (or confirmation functions) supplies a result of form ‘$P[A \,|\, B] = r$’, where A and B are statements, r is a real number between 0 and 1, and P is the rational initial credence function (or the confirmation function) for an agent. If B expresses this agent’s total available evidence at the time t, then she is justified at t in believing A to the degree r, and hence in betting that A is true with a betting quotient no higher than r.^{Footnote 8} (Compare Carnap 1962, p. 211.)

For an agent to apply our version of the direct inference principle, G-DIP, the agent’s total evidence should be captured by ‘$(Rc \cdot E)$’. What about the chance claim ‘$ch(Ax,Rx)=r$’ (the chance claim X, for Lewis)? Applications of the direct inference principle need not require that the chance claim itself be part of the agent’s total evidence, nor need the agent know it to be true. Here is a close paraphrase of what Lewis says about this point (Lewis 1980, p. 267, continued):

If in addition you are sure that the chance claim $ch(Ax,Rx)=r$ is true (i.e. if $P[ch(Ax,Rx)=r \,|\, Rc \cdot E]=1$, where $(Rc \cdot E)$ is your total evidence), it follows also that $r = P[Ac \,|\, Rc \cdot E]$ is your present unconditional degree of belief that Ac is true. More generally, whether or not you are sure about the chance claim $ch(Ax,Rx)=r$, your unconditional degree of belief that Ac is true is given by summing over alternative hypotheses about chance:

$P[Ac \,|\, Rc \cdot E] = \sum _{q}\,\, q\times P[ch(Ax,Rx)=q \,|\, Rc \cdot E].$^{Footnote 9}
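The sum over chance hypotheses can be illustrated numerically. The following is a minimal sketch, assuming that E is admissible for each chance hypothesis (so that G-DIP fixes the credence in Ac at q under the hypothesis $ch(Ax,Rx)=q$). The function name and the two-hypothesis example (fair versus loaded dice, with credences 3/4 and 1/4) are hypothetical and serve only as illustration.

```python
from fractions import Fraction

def credence_from_chance_hypotheses(weights):
    """Return sum_q q * P[ch(Ax,Rx)=q | Rc.E].

    `weights` maps each candidate chance value q to the agent's credence
    that ch(Ax,Rx)=q, given her total evidence (Rc.E).
    """
    if sum(weights.values()) != 1:
        raise ValueError("credences over the chance hypotheses must sum to 1")
    return sum(q * w for q, w in weights.items())

# Hypothetical agent: credence 3/4 that the chance of seven is 1/6 (fair dice)
# and credence 1/4 that it is 1/3 (loaded dice).
weights = {Fraction(1, 6): Fraction(3, 4), Fraction(1, 3): Fraction(1, 4)}
credence = credence_from_chance_hypotheses(weights)
print(credence)  # 5/24
```

When the agent is sure of a single chance hypothesis, the sum collapses to that hypothesis's chance value, which is the first case described in the paraphrase above.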

We investigate several kinds of cases where, on purely logical grounds, direct inference outcomes must fail to be probabilistically independent of a statement D. Thus, any adequate account of admissibility should rule the defeating statement D to be inadmissible. We call such statements logically inadmissible with respect to the direct inferences they defeat. In some cases we show precisely how much the addition of these defeaters to the premises of a direct inference must divert the credence value from the associated chance value. We argue that some of these logically inadmissible statements may be easily acquired by an agent, thus tainting her total evidence and inhibiting her warrant to engage in legitimate direct inferences about these chance events.

Here is how we’ll proceed. In Sect. 2 we prove results^{Footnote 10} that show that material conditional and biconditional statements involving the conclusions of direct inferences must be inadmissible on purely logical grounds. This may present some surprising challenges for Bayesian direct inference principles.

In Sect. 3 we show that in an important class of cases the evidential relevance of a statement D to an outcome Ac implies the logical inadmissibility of D. It seems to be relatively easy for an agent to acquire this kind of information. Thus, an agent’s ability to engage in direct inferences is shown to be somewhat fragile.

In Sect. 4 we consider some fairly mild conditions on credence functions that make them “inappropriate” for G-DIP, because any credence function that satisfies these conditions must get straightforward direct inferences wrong.

In Sect. 5 we discuss direct inferences in cases where several reference classes may compete. We argue that direct inference probabilities are best characterized as expected values over credences of possible observational statements or over extensive chance theories. We show how this fact is problematic for Bayesian direct inference principles.

The authors of this paper are divided over what these results show. One of us (Wallmann) thinks that many of these logically inadmissible statements should not defeat direct inferences. Rather, an agent who has such information as part of her total evidence should still conform her rational credences, and her betting behavior, to the objective chances. Therefore, this author reads these troubles as showing that the Bayesian account of direct inference fails, that having P satisfy the axioms of conditional probability is incompatible with a correct account of direct inference. The other author thinks that the logically inadmissible statements explored in this paper should indeed defeat direct inferences, so the Bayesian account gets it right. We will elaborate our reasons for disagreement in the main body of the paper. In any case, the paper explores a wide range of statements of a kind that must turn out to be inadmissible on any Bayesian account of direct inference.

2 Logical Admissibility Troubles

The troubles we will raise for direct inference principles in this section and the next are quite general. They plague all Bayesian accounts where the P notion satisfies the usual axioms of conditional probability, regardless of whether the conception of objective chance applies to full propositions (as does Lewis’s Principal Principle) or is couched in terms of generic probabilities (containing only open sentences, as in G-DIP, above). All the admissibility failures we’ll discuss draw on cases where probabilistic independence must fail on purely logical grounds. We will first investigate several kinds of such logically inadmissible statements. Section 3 will go on to provide a more general characterization of an important class of logically inadmissible statements.

2.1 Logically Inadmissible Biconditionals

Consider the following situation. John and Maria are standing next to the craps table watching the action. Let H represent the chance hypotheses associated with a fair pair of dice tossed onto a flat surface in the usual (fair) way. In particular, R says that a pair of fair dice is tossed onto a flat surface in the usual (fair) way, and A says that the outcome of a toss is seven. According to chance hypothesis H, the chance of outcome A for a system in state R is 1/6, i.e. $ch(Ax,Rx)=1/6$, which is the usual objective chance for getting seven on a (fair) toss of a pair of fair dice. Let c be the event consisting of the next toss of the dice, so Rc says that the next toss is that of a pair of fair dice (fairly) tossed onto a flat surface, and Ac says that the next toss comes up seven. Let E represent Maria’s background knowledge about dice and craps tables, and perhaps about human relationships, and about anything else that may be relevant to the following situation (including the fact that Maria trusts John to keep his word). Surely E is itself admissible with respect to possible chance outcomes for $(H \cdot Rc)$; otherwise we will already have trouble applying direct inference principles to this kind of chance situation. Thus, we should have the direct inference $P[Ac \,|\, H \cdot Rc \cdot E] = 1/6$, where P is Maria’s (initial) credence function.

Now, John says to Maria, “I’ll buy you dinner this evening if, but only if, the next toss comes up seven.” That is, John sincerely asserts a statement of form $(F \equiv Ac)$, where Maria understands F to say that John will pay for Maria’s dinner this evening (provided that no extraordinary circumstance arises, e.g. provided that Maria permits it, and John doesn’t fall ill beforehand, etc.).^{Footnote 11}

Taking John at his word, Maria adds $(F \equiv Ac)$ to her total body of evidence. Thus, the premise for the direct inference regarding Ac, based on her total body of evidence, becomes $(H \cdot Rc \cdot E \cdot (F \equiv Ac))$. Should Maria’s rational credence that the dice will come up seven on the next toss now differ from the objective chance value? That is, does ${P[Ac \,|\, H \cdot Rc \cdot E \cdot (F \equiv Ac)]}$ differ from 1/6? Or has Maria’s total information, $(E \cdot (F \equiv Ac))$, become inadmissible, undermining the direct inference? More urgently, should Maria still be willing to bet on the next toss turning up seven at the usual fair odds (which are 5 to 1 against, corresponding to the chance of occurrence being 1/6)? You might well think so!^{Footnote 12}

As it happens, probability theory itself guarantees that this kind of biconditional information is almost always logically inadmissible for the relevant direct inference. For, whenever $P[F \,|\, H \cdot Rc \cdot E \cdot Ac]\ne P[\lnot F \,|\, H \cdot Rc \cdot E \cdot \lnot Ac]$, Ac cannot be probabilistically independent of $(F \equiv Ac)$ given $(H \cdot Rc \cdot E)$. And any such failure of probabilistic independence entails inadmissibility. Worse yet, we will see that, according to her credence function, the odds at which Maria should be willing to bet that seven turns up may differ significantly from the usual fair betting-odds suggested by the objective chance.

Theorem 1

Inadmissible Biconditionals.

Let r be any real number such that $0< r < 1$. Suppose $P[Ac \,|\, H \cdot Rc \cdot E] = r$ and $1> P[(F \equiv Ac) \,|\, H \cdot Rc \cdot E] > 0$. Then both $P[F \,|\, H \cdot Rc \cdot E \cdot Ac] = s$ and $P[\lnot F \,|\, H \cdot Rc \cdot E \cdot \lnot Ac] = t$ are well-defined (for some s and t), and

(1) either $s > 0$ or $t > 0$, and either $s < 1$ or $t < 1$, and

(2) $P[Ac \,|\, H \cdot Rc \cdot E \cdot (F \equiv Ac)] = 1 \,/\, [1 \,+\, ((1-r)/r)\times (t/s)]$.

Furthermore,

$P[Ac \,|\, H \cdot Rc \cdot E \cdot (F \equiv Ac)] > r$ if and only if $s > t$,

$P[Ac \,|\, H \cdot Rc \cdot E \cdot (F \equiv Ac)] < r$ if and only if $s < t$,

$P[Ac \,|\, H \cdot Rc \cdot E \cdot (F \equiv Ac)] = r$ if and only if $s = t$.

If, in addition, $P[Ac \,|\, H \cdot Rc \cdot E \cdot F] = P[Ac \,|\, H \cdot Rc \cdot E]$ (i.e. if Ac is probabilistically independent of F given $H \cdot Rc \cdot E$), then $P[Ac \,|\, H \cdot Rc \cdot E \cdot (F \equiv Ac)] = r$ if and only if $P[F \,|\, H \cdot Rc \cdot E] = 1/2$.
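Clause (2) of the theorem can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, and the formula is written as $rs/(rs + (1-r)t)$, which is algebraically the same as clause (2) whenever $s > 0$ and avoids dividing by s. A second function recomputes the same value from an explicit four-world distribution over (Ac, F), conditional on $(H \cdot Rc \cdot E)$.

```python
from fractions import Fraction

def biconditional_posterior(r, s, t):
    """P[Ac | H.Rc.E.(F == Ac)] from r = P[Ac | H.Rc.E],
    s = P[F | H.Rc.E.Ac] and t = P[not-F | H.Rc.E.not-Ac]."""
    return (r * s) / (r * s + (1 - r) * t)

def posterior_by_enumeration(r, s, t):
    """Same quantity, computed by conditioning a joint distribution over (Ac, F)."""
    joint = {
        (True, True): r * s,
        (True, False): r * (1 - s),
        (False, True): (1 - r) * (1 - t),
        (False, False): (1 - r) * t,
    }
    # The material biconditional holds exactly in the worlds where Ac and F agree.
    agree = {world: p for world, p in joint.items() if world[0] == world[1]}
    return sum(p for (ac, _), p in agree.items() if ac) / sum(agree.values())

r = Fraction(1, 6)  # chance of seven in the craps example
print(biconditional_posterior(r, Fraction(1, 2), Fraction(1, 2)))   # 1/6  (s = t)
print(biconditional_posterior(r, Fraction(3, 4), Fraction(1, 4)))   # 3/8  (s > t)
print(biconditional_posterior(r, Fraction(1, 10), Fraction(9, 10))) # 1/46 (s < t)
```

The three printed cases line up with the three "if and only if" clauses: the credence stays at r only when $s = t$.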

Thus, when John says to Maria, “I’ll buy you dinner this evening if, but only if, the next roll comes up seven”, almost everyone who overhears this assertion, and who takes John to be sincere, should employ credences, based on the total available evidence, that fail to match the objective chances of the dice coming up seven on the next roll. Only one kind of exception is possible. Those individuals whose credences remain faithful to the objective chance are just those individuals who, before hearing John’s statement, happen to find the conditional credibility of the claim “John will buy Maria dinner this evening” given seven comes up on the next roll (i.e. ${P[F \,|\, H \cdot Rc \cdot Ac \cdot E]}$) equal to the conditional credibility of the claim “John won’t buy Maria dinner this evening” given seven does not come up on the next roll (i.e. ${P[\lnot F \,|\, H \cdot Rc \cdot \lnot Ac \cdot E]}$) — where both credence conditions include the agent’s total available evidence E together with the relevant chance claims, $(H \cdot Rc)$.

Indeed, before hearing John’s statement ($F \equiv Ac$), perhaps Maria and most bystanders will have taken “seven comes up on the next roll” to be probabilistically independent of “John buys Maria dinner this evening”, given $(H \cdot Rc \cdot E)$. Such an agent cannot have her credence that “the next roll turns up seven” remain faithful to the objective chance unless she happens to assign $P[F \,|\, H \cdot Rc \cdot E] = 1/2$. Thus, the Bayesian account of direct inference apparently implies a form of the principle of indifference (Hawthorne et al. 2017). However, it seems highly doubtful that most agents will assign the value 1/2 to ${P[F \,|\, H \cdot Rc \cdot E]}$. For, in place of F, John might well have asserted biconditionals involving any number of distinct alternative conditions, $F_1$, $F_2$, $F_3$, ..., etc. (e.g., “I’ll buy you dinner at McDonald’s”, “I’ll buy you dinner at Chez Panisse”, ..., etc.). But the statements $F_k$ for the resulting biconditional claims, ($F_k \equiv Ac$), cannot all have conditional credence values $P[F_k \,|\, H \cdot Rc \cdot E] = 1/2$. Thus, the agent’s direct inference credence $P[Ac \,|\, H \cdot Rc \cdot E \cdot (F_k \equiv Ac)]$ must deviate from the objective chance value 1/6 for almost all such claims, $F_k$.

When the value of $s = P[F \,|\, H \cdot Rc \cdot Ac \cdot E]$ is much closer to 0 than the value of $t = P[\lnot F \,|\, H \cdot Rc \cdot \lnot Ac \cdot E]$, the value of $P[Ac \,|\, H \cdot Rc \cdot E\cdot (F \equiv Ac)]$ must be very close to 0, as the theorem shows.^{Footnote 13} So, if Maria (and eavesdropping bystanders) takes John’s offer to be very unlikely before he asserts it, then her total-evidence credence for seven on the next toss should be very close to 0! Thus, if the objective chance values provide the correct betting odds, then Maria (and bystanders) should be willing to accept wagers against seven at incorrect odds that are extremely unfavorable to themselves. This is true regardless of whether there is any evidence available for Maria (or the bystanders) that justifies assigning low credence to John paying for the dinner. We will discuss situations in which credences based on no evidence whatsoever lead to defeat of direct inferences in more detail in Sect. 5.2.
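To see how strongly the prior credence in F pulls the credence in seven, here is a small sketch for the craps example. It assumes that F is probabilistically independent of Ac given $(H \cdot Rc \cdot E)$, so that $s = p$ and $t = 1 - p$, where $p = P[F \,|\, H \cdot Rc \cdot E]$. The function name and the sample values of p are hypothetical.

```python
from fractions import Fraction

def posterior_given_independent_offer(r, p):
    """P[Ac | H.Rc.E.(F == Ac)] when F is independent of Ac given H.Rc.E
    and p = P[F | H.Rc.E]."""
    s, t = p, 1 - p  # independence: P[F | Ac] = p and P[not-F | not-Ac] = 1 - p
    return (r * s) / (r * s + (1 - r) * t)

r = Fraction(1, 6)  # objective chance of seven
for p in (Fraction(1, 100), Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    print(p, posterior_given_independent_offer(r, p))
# 1/100 1/496
# 1/10 1/46
# 1/2 1/6
# 9/10 9/14
```

Only the prior $p = 1/2$ leaves the credence at the chance value 1/6. A prior of 1/100 for John's offer pushes the credence in seven down to 1/496.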

2.2 Some Other Logically Inadmissible Statements

Similar to biconditionals, material conditionals and disjunctions involving the outcome Ac must be logically inadmissible. The extent to which the resulting probabilities deviate from the corresponding direct inference probabilities will be characterized precisely here. We will also prove a result for the case where adding a further statement to the body of evidence defeats a defeater and restores the original direct inference.

A statement is a defeater just in case its negation is also a defeater. The only exceptions are cases where the candidate statement has probability 1 or 0, given the premise of the direct inference. This suggests an easy algorithm for generating a host of inadmissible statements: (1) find an obvious inadmissible statement (e.g. $(\lnot F \cdot \lnot Ac)$); then (2) take its negation (e.g. $\lnot (\lnot F \cdot \lnot Ac)$, which is logically equivalent to $(F \vee Ac)$). The following result establishes this claim.

Theorem 2

Defeater just when Negation-Defeater.

It follows immediately that whenever $0< P[D \,|\, H \cdot Rc \cdot E] < 1$, we have $P[Ac \,|\, H \cdot Rc \cdot E \cdot D] \ne P[Ac \,|\, H \cdot Rc \cdot E]$ if and only if $P[Ac \,|\, H \cdot Rc \cdot E \cdot \lnot D] \ne P[Ac \,|\, H \cdot Rc \cdot E]$. It also follows immediately that disjunctions and material conditionals involving the outcome Ac are inadmissible. From $P[Ac \,|\, H \cdot Rc \cdot E] = r > 0$ and $P[\lnot (Ac \vee F)\,|\, H \cdot Rc \cdot E] > 0$, we have $0 = P[Ac \,|\, H \cdot Rc \cdot E \cdot \lnot (Ac \vee F)] \ne P[Ac \,|\, H \cdot Rc \cdot E]$; so (via the previous result) $P[Ac \,|\, H \cdot Rc \cdot E \cdot (Ac \vee F)] \ne P[Ac \,|\, H \cdot Rc \cdot E]$. Similarly, from $P[Ac \,|\, H \cdot Rc \cdot E] = r < 1$ and $P[\lnot (Ac \supset F)\,|\, H \cdot Rc \cdot E] > 0$, we have $1 = P[Ac \,|\, H \cdot Rc \cdot E \cdot \lnot (Ac \supset F)] \ne P[Ac \,|\, H \cdot Rc \cdot E]$; so (via the previous result) $P[Ac \,|\, H \cdot Rc \cdot E \cdot (Ac \supset F)] \ne P[Ac \,|\, H \cdot Rc \cdot E]$. The following theorem extends this result by showing more precisely the degree to which $P[Ac \,|\, H \cdot Rc \cdot E \cdot (F \vee Ac)]$ differs from $P[Ac \,|\, H \cdot Rc \cdot E]$.
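The link between a defeater and its negation rests on the law of total probability: $P[Ac \,|\, H \cdot Rc \cdot E]$ is a weighted average of the credences in Ac given D and given $\lnot D$. The following sketch solves that identity for the credence given $\lnot D$. The function name and the numerical values are hypothetical, chosen only so that all results are coherent probabilities.

```python
from fractions import Fraction

def credence_given_not_d(r, p_d, ac_given_d):
    """Solve r = P[Ac|D]*P[D] + P[Ac|not-D]*(1 - P[D]) for P[Ac|not-D].

    All probabilities are implicitly conditional on (H.Rc.E).
    """
    if not 0 < p_d < 1:
        raise ValueError("the result requires 0 < P[D | H.Rc.E] < 1")
    return (r - ac_given_d * p_d) / (1 - p_d)

r = Fraction(1, 6)
# D raises the credence in Ac to 1/4, so not-D must lower it below 1/6.
print(credence_given_not_d(r, Fraction(1, 2), Fraction(1, 4)))  # 1/12
# D leaves the credence at 1/6, so not-D leaves it there too.
print(credence_given_not_d(r, Fraction(1, 2), Fraction(1, 6)))  # 1/6
```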

Theorem 3

Inadmissible Disjunctions.

Let r be any real number such that $0< r < 1$.

Suppose $P[Ac \,|\, H \cdot Rc \cdot E] = r$ and $P[(Ac \vee F) \,|\, H \cdot Rc \cdot E] < 1$.

Then $P[F \,|\, H \cdot Rc \cdot E \cdot \lnot Ac] = s$ is well-defined for some value of $s < 1$, and $P[Ac \,|\, H \cdot Rc \cdot E \cdot (Ac \vee F)] = 1 \,/\, [1 \,+\, ((1-r)/r) \times s] > r$.

It follows immediately that:

Corollary 4

Inadmissible Material Conditionals.

Let r be any real number such that $0< r < 1$ and suppose $P[Ac \,|\, H \cdot Rc \cdot E] = r$.

1. If $P[Ac \supset F \,|\, H \cdot Rc \cdot E] < 1$, then $P[F \,|\, H \cdot Rc \cdot E \cdot Ac] = s$ is well-defined for some $s < 1$, and $P[Ac \,|\, H \cdot Rc \cdot E \cdot (Ac \supset F)] = 1 / [1 + ((1-r)/r)/s] < r$.

2. If $P[\lnot Ac \supset F \,|\, H \cdot Rc \cdot E] < 1$, then $P[F \,|\, H \cdot Rc \cdot E \cdot \lnot Ac] = s$ is well-defined for some $s < 1$, and $P[Ac \,|\, H \cdot Rc \cdot E \cdot (\lnot Ac \supset F)] = 1 / [1 + ((1-r)/r)\times s] > r$.

3. If $P[F \supset Ac \,|\, H \cdot Rc \cdot E] < 1$, then $P[F \,|\, H \cdot Rc \cdot E \cdot \lnot Ac] = 1-s$ is well-defined for some $s < 1$, and $P[Ac \,|\, H \cdot Rc \cdot E \cdot (F\supset Ac)] = 1 / [1 + ((1-r)/r)\times s] > r$.

This corollary characterizes additional counter-intuitive defeaters for Bayesian direct inference. Suppose that in our craps example from Sect. 2.1 John says “If seven comes up on the next toss, I’ll buy you dinner this evening”. Then, where $r=1/6$, for $s=0.5$, $P[Ac \,|\, H \cdot Rc \cdot E \cdot (Ac \supset F)] = 1/11$. Furthermore, if, believing that John is stingy, Maria considers “John buys Maria dinner this evening”, F, to be highly unlikely (given $H \cdot Rc \cdot E$), say $s=0.01$, then $P[Ac \,|\, H \cdot Rc \cdot E \cdot (Ac \supset F)] = 1/501 \ll 1/6$. Thus, such (material) conditional claims turn out to overwhelmingly defeat the direct inference. This is true regardless of whether Maria has any evidence that justifies her in considering John stingy.
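The figures in this example can be reproduced with exact arithmetic. The sketch below is illustrative only: the function names are hypothetical, and each formula is written in the equivalent form $x/(x + y)$ rather than $1/[1 + y/x]$ so that no division by s is needed.

```python
from fractions import Fraction

def disjunction_posterior(r, s):
    """Theorem 3: P[Ac | H.Rc.E.(Ac v F)], with s = P[F | H.Rc.E.not-Ac]."""
    return r / (r + (1 - r) * s)

def material_conditional_posterior(r, s):
    """Corollary 4, clause 1: P[Ac | H.Rc.E.(Ac > F)], with s = P[F | H.Rc.E.Ac]."""
    return (r * s) / (r * s + (1 - r))

r = Fraction(1, 6)
print(material_conditional_posterior(r, Fraction(1, 2)))    # 1/11
print(material_conditional_posterior(r, Fraction(1, 100)))  # 1/501
print(disjunction_posterior(r, Fraction(1, 2)))             # 2/7
```

Clause 2 of the corollary uses the same function as the disjunction, since $(\lnot Ac \supset F)$ is logically equivalent to $(Ac \vee F)$. Clause 3 does too, once s is read as $P[\lnot F \,|\, H \cdot Rc \cdot E \cdot \lnot Ac]$.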

In some cases a defeated direct inference may be restored by the addition of information. Consider, for example, the case where $(Ac \vee F)$ is a defeater for the direct inference to Ac, but where F is not itself a defeater. In that case, although $P[Ac \,|\, H \cdot Rc \cdot E \cdot (Ac \vee F)] \ne P[Ac \,|\, H \cdot Rc \cdot E]$, adding F as a premise restores the direct inference, since $P[Ac \,|\, H \cdot Rc \cdot E \cdot (F \vee Ac)\cdot F] = P[Ac \,|\, H \cdot Rc \cdot E \cdot F] = P[Ac \,|\, H \cdot Rc \cdot E]$. In this case the statement F is a defeater–defeater for the defeater $(Ac \vee F)$. An earlier result (Theorem 2) showed that the negation of a defeater must also be a defeater. So, one may well wonder whether the negation of a defeater–defeater may also be a defeater–defeater. The following theorem shows that this never happens. The negation of a defeater–defeater can never restore the previously defeated direct inference.

Theorem 5

Negations of Defeater–Defeaters cannot be Defeater–Defeaters.

Suppose $P[Ac \,|\, H \cdot Rc \cdot E \cdot D] \ne P[Ac \,|\, H \cdot Rc \cdot E]$, but for G such that $1> P[G \,|\, H \cdot Rc \cdot E \cdot D] > 0$ we have $P[Ac \,|\, H \cdot Rc \cdot E \cdot D\cdot G] = P[Ac \,|\, H \cdot Rc \cdot E]$; i.e. suppose that D defeats the direct inference $P[Ac \,|\, H \cdot Rc \cdot E] = r$ but G defeats the defeater, restoring the direct inference. Then $1> P[\lnot G \,|\, H \cdot Rc \cdot E \cdot D] > 0$ and $P[Ac \,|\, H \cdot Rc \cdot E \cdot D \cdot \lnot G] \ne P[Ac \,|\, H \cdot Rc \cdot E]$; i.e. $\lnot G$ cannot also defeat the defeater D.
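A brief sketch of why this holds, using only the law of total probability. Abbreviate $K = (H \cdot Rc \cdot E)$ and $g = P[G \,|\, K \cdot D]$ (both abbreviations are introduced here only for readability). Since $0< g < 1$,

$$\begin{aligned} P[Ac \,|\, K \cdot D] = g \times P[Ac \,|\, K \cdot D \cdot G] \,+\, (1-g) \times P[Ac \,|\, K \cdot D \cdot \lnot G]. \end{aligned}$$

If both $P[Ac \,|\, K \cdot D \cdot G]$ and $P[Ac \,|\, K \cdot D \cdot \lnot G]$ were equal to $P[Ac \,|\, K]$, the right-hand side would reduce to $P[Ac \,|\, K]$, contradicting the supposition that D is a defeater.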

The next subsection provides an important example of a defeater–defeater.

2.3 Escape from These Troubles via Stronger Conditionals

The craps table examples presented in Sects. 2.1 and 2.2 show how easy it can be to taint an agent’s total body of evidence with statements that defeat her direct inferences. But perhaps our way of interpreting these examples is mistaken. For, although direct inferences are indeed defeated by such material conditionals and biconditionals (in which the antecedents are the target statement of the direct inference, or its negation, Ac or $\lnot Ac$), perhaps such defeating conditionals and biconditionals may not be so easily introduced into an agent’s total body of evidence in such a way that they function as defeaters. If this suggestion is right, then although the formal results about material conditional and biconditional defeaters are correct, the intuitive examples we used to illustrate the impact of these formal results may be misleading. Properly represented, the intuitive examples might not give rise to direct inference defeaters after all. Here is what we have in mind.

We first treat the case of simple conditional statements, before turning to the biconditional case. Consider John’s conditional assertion to Maria, “If seven comes up on the next toss, I’ll buy you dinner this evening.” As usually understood, such an assertion suggests a clear causal asymmetry between John’s dinner offer (i.e. “I’ll buy Maria dinner this evening”) and the outcome of the dice roll (i.e. “seven comes up on the next toss”). John may wait for the outcome of the toss and may then act in such a way that the conditional will be true. So, perhaps the representation of the example in terms of a mere material conditional is inadequate. Perhaps the conditional involved is more adequately represented by some stronger kind of indicative or causal conditional. Let’s formally represent John’s assertion this way: $(Ac \rightarrow F)$, where $\rightarrow $ represents some kind of strong, causal or indicative conditional. Then, the central issue is whether or not $P[Ac \,|\, H \cdot Rc \cdot E \cdot (Ac \rightarrow F)] = P[Ac \,|\, H \cdot Rc \cdot E]$ may hold for direct inference $P[Ac \,|\, H \cdot Rc \cdot E] = r$. The following result will prove useful.

$P[Ac \,|\, H \cdot Rc \cdot E \cdot (Ac \rightarrow F)] = P[Ac \,|\, H \cdot Rc \cdot E]$—i.e. the direct inference remains undefeated by $(Ac \rightarrow F)$—whenever

$P[(Ac \rightarrow F) \,|\, H \cdot Rc \cdot Ac \cdot E] = P[(Ac \rightarrow F) \,|\, H \cdot Rc \cdot E]$—i.e. whenever Ac provides no evidence for (or against) $(Ac \rightarrow F)$, given $(H \cdot Rc \cdot E)$.^{Footnote 14}
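This result is an instance of Bayes’ theorem. As a sketch, abbreviate $K = (H \cdot Rc \cdot E)$ and $C = (Ac \rightarrow F)$ (both abbreviations are introduced here only for readability), and assume $P[C \,|\, K] > 0$ and $P[Ac \,|\, K] > 0$ so that the conditional probabilities involved are well-defined. Then

$$\begin{aligned} P[Ac \,|\, K \cdot C] = \frac{P[C \,|\, K \cdot Ac] \times P[Ac \,|\, K]}{P[C \,|\, K]} = P[Ac \,|\, K] \quad \text{whenever}\quad P[C \,|\, K \cdot Ac] = P[C \,|\, K]. \end{aligned}$$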

Arguably, in the craps-table example the claim Ac (given $(H \cdot Rc \cdot E)$) does not provide evidence for or against a strong (causal or indicative) conditional claim of form $(Ac \rightarrow F)$.^{Footnote 15} Thus, our example of easy defeat for an agent’s direct inference may be side-stepped. Supplying the agent with a convincing conditional claim involving the target statement of her direct inference, Ac, need not defeat her direct inference after all, unless that convincing conditional claim is merely a material conditional claim. A truly convincing example of easy defeat via the acquisition of knowledge of a conditional claim will have to show how the rational agent may (easily) become convinced of the material conditional claim in cases where she is not also convinced of the corresponding strong conditional claim.^{Footnote 16}

All of the previous points carry over fairly directly to the case of the biconditional defeater. In this context, John’s biconditional assertion to Maria, “I’ll buy you dinner this evening if, but only if, seven comes up on the next toss”, clearly suggests a causal asymmetry between John’s dinner offer and the outcome of the dice roll. So, perhaps John’s biconditional assertion is not adequately captured by the material biconditional. Perhaps it is more adequately represented by a conjunction of stronger, indicative or causal conditional claims, as follows: $((Ac \rightarrow F)\cdot (\lnot Ac \rightarrow \lnot F))$, where $\rightarrow $ again represents some kind of strong, causal or indicative conditional. Then, the issue is whether or not $P[Ac \,|\, H \cdot Rc \cdot E \cdot ((Ac \rightarrow F)\cdot (\lnot Ac \rightarrow \lnot F))] = P[Ac \,|\, H \cdot Rc \cdot E]$ may hold, where $P[Ac \,|\, H \cdot Rc \cdot E] = r$ is a direct inference. The following result should help.

The direct inference remains undefeated by the strong biconditional, i.e. $P[Ac \,|\, H \cdot Rc \cdot E \cdot ((Ac \rightarrow F)\cdot (\lnot Ac \rightarrow \lnot F))] = P[Ac \,|\, H \cdot Rc \cdot E]$, whenever Ac and $\lnot Ac$ each provide the same evidence for (or against) $((Ac \rightarrow F)\cdot (\lnot Ac \rightarrow \lnot F))$, given $(H \cdot Rc \cdot E)$; that is, whenever $P[((Ac \rightarrow F)\cdot (\lnot Ac \rightarrow \lnot F)) \,|\, H \cdot Rc \cdot E \cdot Ac] = P[((Ac \rightarrow F)\cdot (\lnot Ac \rightarrow \lnot F)) \,|\, H \cdot Rc \cdot E \cdot \lnot Ac]$.^{Footnote 17}
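
The reasoning behind this condition can be sketched in two steps. The abbreviations $K$ for $(H \cdot Rc \cdot E)$ and $B$ for the strong biconditional $((Ac \rightarrow F)\cdot (\lnot Ac \rightarrow \lnot F))$ are introduced here only for convenience, and we assume $0< P[Ac \,|\, K] < 1$ and $P[B \,|\, K] > 0$. Suppose the two likelihoods share a common value $b$, so that $P[B \,|\, K \cdot Ac] = P[B \,|\, K \cdot \lnot Ac] = b$. Then the law of total probability and Bayes’ theorem yield:

$$\begin{aligned} P[B \,|\, K]&= P[B \,|\, K \cdot Ac] \times P[Ac \,|\, K] + P[B \,|\, K \cdot \lnot Ac] \times P[\lnot Ac \,|\, K] = b,\\ P[Ac \,|\, K \cdot B]&= \frac{P[B \,|\, K \cdot Ac] \times P[Ac \,|\, K]}{P[B \,|\, K]} = \frac{b \times P[Ac \,|\, K]}{b} = P[Ac \,|\, K]. \end{aligned}$$

Thus the strong biconditional makes no difference to the credence in Ac when neither Ac nor $\lnot Ac$ favors it over the other.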

Arguably, in the context of the craps-table example, the claims Ac and $\lnot Ac$ should (given $(H \cdot Rc \cdot E)$) each provide the same amount of evidence for or against a strong (causal or indicative) biconditional claim of form $((Ac \rightarrow F)\cdot (\lnot Ac \rightarrow \lnot F))$.^{Footnote 18} Thus, the prospect of easy defeat for an agent’s direct inference about a future chance event, via the easy acquisition of a biconditional, may be averted. Supplying the agent with a convincing biconditional claim need not defeat her direct inference, unless that convincing biconditional claim involves only a material biconditional, rather than conditionals of some stronger kind.

None of this is to suggest that defeat via material conditionals and biconditionals is unimportant to Bayesian direct inferences; only that such conditionals should not be so easily acquired as the craps-table examples suggest. Furthermore, in cases where the chance event Ac has already occurred, when the agent’s total available evidence remains admissible for the relevant direct inference, her chance claims may continue to guide her credence that Ac holds via the usual kind of direct inference. However, in such cases an agent may more easily become informed of a material conditional or biconditional statement that informationally ties Ac to another statement F. When that happens, this additional information may well defeat her chance-based direct inference regarding the chance event Ac, as indicated by the defeater theorems presented in this section. From a Bayesian perspective, this may sound plausible. When F and Ac are informationally tied together by a material conditional or biconditional claim, and that claim is added to the agent’s total evidence, then whatever credence F itself already had will drag the credence of Ac away from its direct inference value.^{Footnote 19} This is true, however, even for the case where no evidence is available for or against F. In this case, it seems that defeat by biconditionals may be problematic. We will discuss situations in which credences based on no evidence whatsoever lead to defeat of direct inferences in more detail in Sect. 5.2.
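
A small worked case may make the “dragging” effect concrete. It is an illustrative sketch under assumptions that go beyond the text: we write $K$ for $(H \cdot Rc \cdot E)$ and $M$ for the material biconditional $(Ac \equiv F)$, we suppose that F is probabilistically independent of Ac given $K$, and we set $P[Ac \,|\, K] = r$ and $P[F \,|\, K] = f$ with $0< r < 1$. The values $r = 1/6$ and $f = 9/10$ are hypothetical. Since $M$ is true just in case either $(Ac \cdot F)$ or $(\lnot Ac \cdot \lnot F)$ holds,

$$\begin{aligned} P[Ac \,|\, K \cdot M]&= \frac{P[Ac \cdot F \,|\, K]}{P[Ac \cdot F \,|\, K] + P[\lnot Ac \cdot \lnot F \,|\, K]} = \frac{r f}{r f + (1-r)(1-f)},\\ \hbox {e.g.}\quad P[Ac \,|\, K \cdot M]&= \frac{(1/6)(9/10)}{(1/6)(9/10) + (5/6)(1/10)} = \frac{9}{14} \approx 0.64 \ne \frac{1}{6}. \end{aligned}$$

Under these assumptions the right-hand side equals $r$ only when $f = 1/2$, so any prior credence in F other than 1/2 pulls the credence in Ac away from its direct inference value, even when that prior credence rests on no evidence at all.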

3 Evidential Relevance and Admissibility

It is commonly supposed that chance hypotheses screen off “many propositions that one can easily come to know and that would otherwise be relevant to the proposition A under discussion.” (Schwarz 2014, p. 82). When this is so, the direct inference from the chance hypothesis is said to be resilient.^{Footnote 20} A high degree of resiliency for direct inferences is crucial. Otherwise, they may be largely inapplicable, given the total evidence available to agents. In this section we will characterize a broad class of statements that, on logical grounds, must defeat direct inferences. Thus, to the extent that such information is readily available to agents, direct inferences may turn out to be rather less resilient than usually supposed.

We investigate some quite general conditions under which a statement D may defeat direct inferences. Our results are general enough to apply to extensive chance hypotheses—i.e. chance hypotheses (and theories) that entail chance claims for an algebra of outcomes of initial chance states R, and may do so for any number of distinct initial chance states. We’ll say more about the nature of extensive chance hypotheses below.

We will characterize some classes of statements that must defeat direct inferences, and so must be inadmissible on any account. For example, one of our main results shows that, under very commonly met assumptions, evidential support of a statement D for Ac implies the inadmissibility of D in direct inferences for Ac. Informally, the result goes like this:

Let $A_1c$ and $A_2c$ be any two possible chance outcomes of initial state R for chance system c, and suppose E is admissible for the direct inferences from H to each of these two outcomes. Consider a statement D to which each of the possible chance events $(Rc \cdot A_1c)$ and $(Rc \cdot A_2c)$ is directly relevant. Indeed, suppose that each of these possible chance events is so directly relevant to D that it overrides (or screens-off) whatever relevance H might have to D, given E (for credence function P). Then, provided that D is more likely according to one of these two chance events than according to the other, given E (for P), D must defeat either the direct inference from $(H \cdot Rc \cdot E)$ to $A_1c$ or the direct inference from $(H \cdot Rc \cdot E)$ to $A_2c$ (for P). Thus, any such statement D, in conjunction with the admissible statement E, must be inadmissible for direct inferences from $(H \cdot Rc)$.

This section is mainly devoted to explicating several results of this kind.

We proceed by first characterizing extensive chance hypotheses, and generalizing the principle of direct inference, G-DIP, to cover them. Then we identify an important class of statements D that turn out to defeat direct inferences from chance hypothesis H: statements D to which some of H’s chance outcomes are “more directly relevant” than is H itself. We provide an illustrative example of such a case. Finally, we establish two general results that show the logical inadmissibility of such statements. The first result, stated informally above, provides sufficient conditions for such statements to defeat direct inferences. The second result provides necessary and sufficient conditions for such statements to defeat direct inferences, but under slightly stricter conditions (involving partitions of chance outcomes) than supposed by the first result.

3.1 Extensive Chance Hypotheses and Algebras of Attributes

Sophisticated chance hypotheses (or chance theories) entail chance claims for all Boolean combinations of possible outcome attributes of an initial chance state (or reference class) R. That is, whenever the hypothesis entails chance claims of form $ch(Ax,Rx)=r$ and $ch(Bx,Rx)=s$, it also entails chance claims of form $ch(\lnot Ax,Rx)=p$, $ch((Ax\vee Bx),Rx)=q$, and $ch((Ax\cdot Bx),Rx)=t$, where p, q, r, s, t are standard terms for real numbers between 0 and 1. Thus, associated with each chance state Rx is a Boolean algebra of outcome attributes $\Theta _R$ for R, where, whenever $\Theta _R$ contains Ax and Bx, it also contains $\lnot Ax$, $(Ax \vee Bx)$, and $(Ax \cdot Bx)$; and where $\Theta _R$ contains no other expressions.^{Footnote 21} Furthermore, for each initial state (or reference class) R treated by H, the associated chance function $ch(\ ,Rx)$ should satisfy the usual axioms of probability theory for its algebra of attributes, $\Theta _R$.^{Footnote 22} An extensive chance theory of this kind will often cover a variety of distinct initial states (or reference classes) Rx, and provide chance claims for Boolean algebras of outcomes, $\Theta _R$, for each such R.
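
To illustrate how the probability axioms fix the additional chance values, here is a brief worked case. The general identities follow from the axioms alone; the numerical example (a fair six-sided die, with A for “even” and B for “greater than 3”) is a hypothetical illustration, not one drawn from the text.

$$\begin{aligned} p&= ch(\lnot Ax,Rx) = 1 - r,\\ q&= ch((Ax\vee Bx),Rx) = r + s - t,\\ \hbox {e.g.}\quad r&= \tfrac{1}{2},\quad s = \tfrac{1}{2},\quad t = \tfrac{1}{3},\quad \hbox {so}\quad p = \tfrac{1}{2}\quad \hbox {and}\quad q = \tfrac{1}{2} + \tfrac{1}{2} - \tfrac{1}{3} = \tfrac{2}{3}. \end{aligned}$$

In the die example, $(Ax \cdot Bx)$ covers the faces 4 and 6, and $(Ax \vee Bx)$ covers the faces 2, 4, 5, and 6, which agrees with the computed values.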

One more bit of notation will prove useful. When a particular chance system c is in an initial chance state R, we denote the algebra of chance outcomes for event Rc by the term ‘$\Theta _R(c)$’, which represents the algebra of outcome attributes for R, $\Theta _R$, applied to the individual system c. That is, when Rc holds, for each Ax in $\Theta _R$, there is an associated possible outcome of Rc, Ac, in the algebra of associated outcomes $\Theta _R(c)$.

Throughout the remainder of this paper our treatment of chance and direct inference will apply to the kind of extensive chance hypotheses just described. We’ll use ‘H’ to represent chance hypotheses of this kind. Here is a generalization of the direct inference principle that applies to direct inferences from extensive chance hypotheses.

Generalized Generic Direct Inference Principle—GG-DIP:

Let P be an appropriate classical probability function (credence function) on a language that contains chance (or frequency) statements. Let H be any extensive chance hypothesis: that is, for each initial state (or reference class) R treated by H, for each $A_j$ in the associated Boolean algebra, $\Theta _R$, of possible outcome attributes for systems in state R, H entails a chance claim of form $ch(A_jx, Rx) = r_j$, where $r_j$ is a standard term for a real number between 0 and 1 (inclusive), and where each chance function $ch(\ , Rx)$ satisfies the usual axioms of probability theory on $\Theta _R$. Then, for each outcome attribute $A_j$ in $\Theta _R$, for each chance system c,
$$\begin{aligned} P[A_jc \,|\, H \cdot Rc \cdot E] = r_j, \end{aligned}$$
provided that E is both consistent with $(H \cdot Rc)$ and admissible with respect to $(H \cdot Rc)$ over $\Theta _R(c)$ (where tautologies are always considered admissible).

A statement E may defeat some of the direct inferences based on $(H \cdot Rc)$, while leaving others intact. That is, we may have $P[A_jc \,|\, H \cdot Rc \cdot E] = r_j$ for some possible outcomes $A_jc$, while $P[A_kc \,|\, H \cdot Rc \cdot E] \ne r_k$ for some other possible outcomes. In that case E should count as inadmissible for the direct inferences from $(H \cdot Rc)$ to the outcomes in $\Theta _R(c)$, regardless of the fact that some of these chance outcomes happen to be probabilistically independent of E. For, when an agent’s total body of evidence consists of $(Rc \cdot E)$ and she is contemplating bets on outcomes of Rc, no proper account of admissibility should count her total evidence as admissible for some of the possible outcomes but inadmissible for others (admissible, say, for the dice coming up six, but inadmissible for their coming up nine). Any proper account of admissibility involves more than mere probabilistic independence. Any specific notion of admissibility is supposed to provide a rationale for probabilistic independence in direct inference contexts, and that rationale should apply to all the possible outcomes of an initial chance state Rc for a chance system c.

At the beginning of this section we introduced the notion of resiliency for direct inferences. The idea is that the alignment of credences with chances should not be undermined by the addition of easily acquired information. Otherwise, the ability to apply direct inferences becomes unstable. Resiliency is meant to capture this kind of desired stability for direct inferences. A direct inference is highly stable provided that nearly all of the kinds of information that might become available to an agent who is in a position to apply that direct inference fall within its “sphere of resiliency”. It will prove useful to specify this notion formally.

Definition 6

Resiliency Spheres.

For a credence function P, an extensive chance hypothesis H, and a chance system c in initial state R covered by chance claims in H, the resiliency sphere for direct inferences from $(H \cdot Rc)$ is the collection of statements E such that, for every outcome Ac in algebra $\Theta _R(c)$ of outcomes for Rc (according to H), $P[Ac \,|\, H \cdot Rc \cdot E] = P[Ac \,|\, H \cdot Rc]$.

Notice that a resiliency sphere surrounds not merely individual chance outcomes, taken one at a time, but the whole algebra of outcomes of chance state Rc. A statement E that is probabilistically independent of one outcome of Rc, given $(H \cdot Rc)$, but fails to be probabilistically independent of another of its outcomes, falls outside the resiliency sphere.^{Footnote 23}
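
A simple hypothetical case shows how this can happen. Suppose, purely for illustration, that $B_1c$, $B_2c$, $B_3c$ are mutually exclusive and jointly exhaustive outcomes of Rc, that H assigns each of them chance 1/3, and that a statement E shifts credence between the second and third outcomes only. The numerical values below are assumptions chosen for the example.

$$\begin{aligned} P[B_1c \,|\, H \cdot Rc]&= \tfrac{1}{3},&P[B_2c \,|\, H \cdot Rc]&= \tfrac{1}{3},&P[B_3c \,|\, H \cdot Rc]&= \tfrac{1}{3},\\ P[B_1c \,|\, H \cdot Rc \cdot E]&= \tfrac{1}{3},&P[B_2c \,|\, H \cdot Rc \cdot E]&= \tfrac{1}{2},&P[B_3c \,|\, H \cdot Rc \cdot E]&= \tfrac{1}{6}. \end{aligned}$$

Here E is probabilistically independent of $B_1c$, given $(H \cdot Rc)$, yet it changes the credences for $B_2c$ and $B_3c$. So, by Definition 6, E falls outside the resiliency sphere for direct inferences from $(H \cdot Rc)$.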

The resiliency sphere for $(H \cdot Rc)$ will usually be broader than its class of admissible statements, depending on how the notion of admissibility is specified. To see why, notice how GG-DIP (and G-DIP) is supposed to work. Any application of GG-DIP presupposes some concrete notion of admissibility, specified in advance of identifying associated credence functions P. That is, a concrete notion of admissibility specifies, for each chance statement in H and its initial state Rc (for arbitrary systems c), exactly what statements E are to count as admissible. It will usually do so in terms of the information carried by the chance claims in H, the information carried by Rc and its associated chance outcomes in $\Theta _R(c)$, and the information carried by statements E. This will usually involve conditions that take into account whether the information in E is (or is not) “directly relevant” to outcomes Ac in $\Theta _R(c)$.^{Footnote 24} The specification of admissibility doesn’t depend in any way on the particular credence function considered. Rather, after a specific account of admissibility is spelled out, GG-DIP (or G-DIP) does its work by ruling out those credence functions P that either fail to make $P[Ac \,|\, H \cdot Rc] = r$ when H entails $ch(Ax,Rx)=r$, or that fail to make $P[Ac \,|\, H \cdot Rc \cdot E] = P[Ac \,|\, H \cdot Rc]$ when E has been deemed admissible by the account of admissibility on offer. All credence functions P that are not ruled out in this way may count as “appropriate” for some agent, provided that they satisfy whatever other constraints are deemed proper (e.g. for Lewis they must also satisfy regularity). The point is, for a credence function P that passes these hurdles, and so succeeds in satisfying GG-DIP, there may well be a number of statements E that are not designated as admissible but that still yield $P[Ac \,|\, H \cdot Rc \cdot E] = P[Ac \,|\, H \cdot Rc] = r$ for all Ac in $\Theta _R(c)$. Thus, the resiliency sphere of $(H \cdot Rc)$ for P may well contain more than the class of admissible statements for $(H \cdot Rc)$ specified by a specific account of admissibility. However, any statement E that falls outside the resiliency sphere of $(H \cdot Rc)$ for P must be inadmissible for $(H \cdot Rc)$ according to every possible coherent account of admissibility.

3.2 When Chance Outcomes of a Hypothesis H Override Its Relevance to a Statement D

Typically, the relevance of a chance hypothesis H to a statement D will be overridden by outcomes of an initial chance state Rc in the following kind of situation. Statement D contains information about possible chance outcome Ac (and its alternatives), so Ac is evidentially relevant to D given $(Rc \cdot E)$. And because hypothesis H is relevant to chance outcome Ac, it will be relevant to (information in) D as well. But, the chance claim $ch(Ax,Rx)=r$ entailed by H is more directly about outcome Ac than about D, so the relevance of H to D derives from its relevance to Ac. When that’s the case, the information contained in outcomes Ac and $\lnot Ac$ may override what information H contains (about possible outcomes) that is relevant to D, given $(Rc\cdot E)$, because the information Ac and $\lnot Ac$ contain is more directly tied to D than the information contained in H. Thus:

$$\begin{aligned}&P[D \,|\, Ac\cdot H \cdot Rc \cdot E] = P[D \,|\, Ac\cdot Rc \cdot E]~ \hbox {and}\\&\quad P[D \,|\, \lnot Ac\cdot H \cdot Rc \cdot E] = P[D \,|\, \lnot Ac\cdot Rc \cdot E]. \end{aligned}$$

In such cases let’s say that the relevance of chance hypothesis H to statement D is overridden by the associated chance outcomes of chance state Rc. It turns out that whenever this condition holds and D is evidentially relevant to Ac ($P[D \,|\, Ac\cdot H \cdot Rc \cdot E] \ne P[D \,|\, Rc \cdot E]$), D (together with admissible E) must defeat the direct inference from $(H \cdot Rc)$ to Ac.

Definition 7

Chance Outcomes with Overriding Relevance to D.

The relevance of chance hypothesis H to statement D is overridden by its direct inference outcomes in $\Gamma _R(c) = \{A_ic, A_jc, \dots , A_kc\}$ (which is some subset of the algebra of outcomes $\Theta _R(c)$ for chance state Rc), given admissible E, just in case for each of the chance outcomes $A_jc$ in $\Gamma _R(c)$ (associated with direct inferences based on $(H \cdot Rc)$, for admissible E), $P[D \,|\, A_jc\cdot H \cdot Rc \cdot E] = P[D \,|\, A_jc\cdot Rc \cdot E]$.

Here is an illustration of a case where chance outcomes $\{Ac, \lnot Ac\}$ of a chance hypothesis H are overridingly relevant to a statement D.

Let H be a theory about the chances that people who fit some particular profile R have the attribute, “will develop Alzheimer’s disease by age 70”, attribute A. Thus, H entails $ch(Ax, Rx)=r$, for some specific value r (e.g. perhaps $r = .83$). Suppose that a 50-year-old male named Chuck, c, fits the profile, so Rc holds. Thus, for admissible background information E, $P[Ac \,|\, H \cdot Rc \cdot E] = r$ is a perfectly good direct inference about Chuck’s chances of developing Alzheimer’s by age 70. E may include whatever admissible background information we may know about medical conditions and medical testing (including brain imaging), about the chance theory H, about Chuck himself, etc.

We may be interested in other indications of whether Chuck will develop Alzheimer’s by age 70, indications that are independent of the information provided by chance theory H. Suppose that by means of an imaging technique it is possible to detect brain plaque of the kind usually associated with Alzheimer’s. The detection of a “moderate accumulation” of this plaque (in a patient like Chuck) does not guarantee that the patient will acquire Alzheimer’s as he ages, but it is an indication of a significantly increased risk of developing the disease. Included among the admissible background knowledge E may be information about this technique and its implications. Let statement Fc state the fact that Chuck undergoes the imaging technique at age 50, and let statement D say that the image of Chuck’s brain shows that a “moderate accumulation” of plaque is present. Presumably, absent the result D, Fc taken together with the other information in E is admissible, so let’s suppose that Fc is included within E. However, the result of this procedure, D, may well be evidentially relevant to whether or not Chuck will develop Alzheimer’s by age 70. Suppose it indicates an increased likelihood of the onset of Alzheimer’s by age 70: $P[Ac \,|\, D \cdot Rc \cdot E] > P[Ac \,|\, Rc \cdot E]$.

Regardless of whatever relevance a person’s chances of developing Alzheimer’s by age 70, H, may have to his likelihoods of exhibiting a “moderate accumulation” of brain plaques by age 50, D, the relevance of that chance claim H to image result D is overridden by the claim that the individual will indeed develop Alzheimer’s by age 70, Ac. That is, the fact that a person will develop the disease, Ac, is predictive enough about the amount of plaque build up over time that it overrides the relevance of the chances of developing the disease (expressed by H) to the likelihood of outcome D from a brain scan at age 50. Thus, $P[D \,|\, Ac \cdot H \cdot Rc \cdot E] = P[D \,|\, Ac \cdot Rc \cdot E]$. Similarly, the fact that a person will not develop the disease, $\lnot Ac$, is predictive enough about the amount of plaque build up over time that it overrides the relevance of the chances of developing the disease (expressed by H) to the likelihood of outcome D from a brain scan at age 50. Thus, $P[D \,|\, \lnot Ac \cdot H \cdot Rc \cdot E] = P[D \,|\, \lnot Ac \cdot Rc \cdot E]$.

Thus, in the order discussed, we have the following:

1. $P[Ac \,|\, H \cdot Rc \cdot E] = r$ is a direct inference about Chuck’s chances of developing Alzheimer’s by age 70, given he fits profile R.
2. $1> P[Ac \,|\, D \cdot Rc \cdot E]> P[Ac \,|\, Rc \cdot E] > 0$: given membership in risk group R, the fact that a person’s brain scan at age 50 shows a “moderate accumulation” of plaque is positive evidence that the person will develop Alzheimer’s by age 70.
3. $P[D \,|\, Ac \cdot H \cdot Rc \cdot E] = P[D \,|\, Ac \cdot Rc \cdot E]$ and $P[D \,|\, \lnot Ac \cdot H \cdot Rc \cdot E] = P[D \,|\, \lnot Ac \cdot Rc \cdot E]$: relevance of the chances of developing Alzheimer’s by age 70 (according to hypothesis H) to whether a person’s brain scan at age 50 shows a “moderate accumulation” (statement D) is overridden by the claim that the person will (or will not) develop Alzheimer’s by age 70 (the direct inference outcomes of H in $\{Ac, \lnot Ac\}$), given admissible E.

Therefore, the claim that Chuck’s brain scan shows a “moderate accumulation” of plaque, D, defeats the direct inference regarding Chuck’s chances, r, of developing Alzheimer’s by age 70: $P[Ac \,|\, D \cdot H \cdot Rc \cdot E] \ne P[Ac \,|\, H \cdot Rc \cdot E] = r$, for admissible E. Thus, D (in conjunction with E) must be inadmissible for this direct inference.
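
A numerical sketch may make the size of the effect vivid. The value $r = .83$ is the one suggested in the text; the two likelihoods, $P[D \,|\, Ac \cdot Rc \cdot E] = .6$ and $P[D \,|\, \lnot Ac \cdot Rc \cdot E] = .2$, are hypothetical values chosen only for illustration. Applying Bayes’ theorem, and then using the overriding condition to drop H from the likelihoods:

$$\begin{aligned} P[Ac \,|\, D \cdot H \cdot Rc \cdot E]&= \frac{P[D \,|\, Ac \cdot Rc \cdot E] \times r}{P[D \,|\, Ac \cdot Rc \cdot E] \times r + P[D \,|\, \lnot Ac \cdot Rc \cdot E] \times (1-r)}\\&= \frac{.6 \times .83}{.6 \times .83 + .2 \times .17} = \frac{.498}{.532} \approx .936. \end{aligned}$$

On these assumed values, the scan result raises the credence that Chuck will develop Alzheimer’s by age 70 from .83 to roughly .94, so the direct inference value r no longer holds once D is added.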

Here is the relevant formal result. It shows that whenever a chance hypothesis H satisfies the above “overridden relevance to D” condition for its outcomes $\{Ac, \lnot Ac\}$, given $(Rc \cdot E)$, statement D must defeat the direct inference from $(H \cdot Rc \cdot E)$ to Ac if and only if D is evidentially relevant to Ac, given $(Rc \cdot E)$.

Corollary of Theorem 9

Inadmissible Evidence for Outcomes.^{Footnote 25}

We assume throughout that $P[D \cdot H \cdot Rc \cdot E] > 0$ (so that all the conditional probabilities are well-defined).

Let $P[Ac \,|\, H \cdot Rc \cdot E] = r$ for $0< r < 1$, be a direct inference for admissible E.
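
The connection between defeat and evidential relevance described before this corollary can be checked by a short Bayesian calculation. What follows is a sketch rather than the authors’ proof. The abbreviations $a = P[D \,|\, Ac \cdot Rc \cdot E]$, $b = P[D \,|\, \lnot Ac \cdot Rc \cdot E]$, and $s = P[Ac \,|\, Rc \cdot E]$ are introduced here; we assume $0< s < 1$, and we assume that the relevance of H to D is overridden by both Ac and $\lnot Ac$, given $(Rc \cdot E)$.

$$\begin{aligned} P[Ac \,|\, D \cdot H \cdot Rc \cdot E]&= \frac{r a}{r a + (1-r) b},&P[Ac \,|\, D \cdot Rc \cdot E]&= \frac{s a}{s a + (1-s) b}. \end{aligned}$$

Since $0< r < 1$, the first expression equals r exactly when $a = b$; since $0< s < 1$, the second equals s exactly when $a = b$. Hence, under these assumptions, $P[Ac \,|\, D \cdot H \cdot Rc \cdot E] \ne r$ if and only if $P[Ac \,|\, D \cdot Rc \cdot E] \ne P[Ac \,|\, Rc \cdot E]$.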

3.3 The Main Results

The next two theorems provide the main formal results of this section. Each result has two parts. Near the beginning of this section we summarized the first part of the first theorem. Here is an interpretive account of both parts of the first theorem.

Let P be any classical probability function (or rational credence function) that satisfies GG-DIP for the direct inferences from $(H \cdot Rc \cdot E)$ for admissible E. Let $A_1c$ and $A_2c$ be any two possible chance outcomes of initial state R for chance system c. Suppose that (according to the credences represented by function P) each of these two chance events overrides (or screens-off) whatever relevance H might have to D, given E (according to P):

$P[D \,|\, A_kc \cdot H \cdot Rc \cdot E] = P[D \,|\, A_kc \cdot Rc \cdot E]$ for $k=1, 2$. Then:

(1) If D is more likely according to $(A_1c \cdot Rc \cdot E)$ than according to $(A_2c \cdot Rc \cdot E)$ (as represented by P), then $(D \cdot E)$ must defeat one of the direct inferences based on $(H \cdot Rc)$; that is, $(D \cdot E)$ falls outside the resiliency sphere of $(H \cdot Rc)$ for P.

(2) If, given $(Rc \cdot E)$, either $A_1c$ is positively supported by D and $A_2c$ is not positively supported by it, or $A_2c$ is negatively supported by D and $A_1c$ is not negatively supported by it, then $(D \cdot E)$ must defeat one of the direct inferences based on $(H \cdot Rc)$; that is, $(D \cdot E)$ falls outside the resiliency sphere of $(H \cdot Rc)$ for P.

Here is the formal statement of this result.

Theorem 8

Sufficient Condition for Inadmissible Evidence.

We assume throughout that $P[D \cdot H \cdot Rc \cdot E] > 0$ (so that all conditional probabilities are well-defined).

Let $A_1c$ and $A_2c$ be any two outcomes of initial state Rc such that, for admissible E, the following direct inferences hold:

$$\begin{aligned} P[A_kc \,|\, H \cdot Rc \cdot E] = P[A_kc \,|\, H \cdot Rc] = r_k,\quad \hbox {where}~ 1> r_k > 0,\,\, \hbox {for}~ k=1, 2. \end{aligned}$$

Suppose, for $k=1, 2$: $P[D \,|\, A_kc \cdot H \cdot Rc \cdot E] = P[D \,|\, A_kc \cdot Rc \cdot E]$.
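
Part (1) of the interpretive account can be verified from these suppositions by a brief calculation, offered here as a sketch rather than as the authors’ proof. Write $d_k$ for $P[D \,|\, A_kc \cdot Rc \cdot E]$ (our abbreviation). Bayes’ theorem, together with the supposition that each $A_kc$ overrides the relevance of H to D, gives, for $k=1, 2$:

$$\begin{aligned} P[A_kc \,|\, D \cdot H \cdot Rc \cdot E] = \frac{P[D \,|\, A_kc \cdot H \cdot Rc \cdot E] \times r_k}{P[D \,|\, H \cdot Rc \cdot E]} = \frac{d_k \times r_k}{P[D \,|\, H \cdot Rc \cdot E]}. \end{aligned}$$

If neither direct inference were defeated, both left-hand sides would equal $r_k > 0$, which would require $d_1 = P[D \,|\, H \cdot Rc \cdot E] = d_2$. So whenever $d_1 > d_2$, at least one of the two direct inferences must fail once D is added to E.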

Whereas the first theorem applies for any two chance outcomes of chance hypothesis H, the next theorem relies on outcomes that form a partition. The payoff for this stronger supposition is a biconditional connection between support for (or by) D and the failure of direct inferences.

The first part of this theorem shows that whenever, for each $B_jc$ in a partition of outcomes of Rc, the support for D by chance hypothesis H is overridden by the support afforded to D by $(Rc \cdot B_jc)$, according to P, the following result holds:

D falls outside the resiliency sphere for the direct inferences based on $(H \cdot Rc \cdot E)$ if and only if D is supported more (or less) by $B_ic$ than by $B_jc$, given $(Rc \cdot E)$, for some $B_ic$ and $B_jc$ in the partition.

The second part of this theorem shows that under the same conditions stated above for the first part, the following result holds:

D falls outside the resiliency sphere for the direct inferences based on $(H \cdot Rc \cdot E)$ if and only if $B_kc$ is either positively or negatively supported by D, given $(Rc \cdot E)$, for at least one of the $B_kc$ in the partition.
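
The biconditional in the first part can be made plausible by the following sketch, which concerns only the direct inferences to members of the partition and is not a substitute for the formal proof. Write $r_k$ for the direct inference value $P[B_kc \,|\, H \cdot Rc \cdot E]$ and $d_k$ for $P[D \,|\, B_kc \cdot Rc \cdot E]$ (the latter is our abbreviation), and assume each $r_k > 0$. Bayes’ theorem, the law of total probability over the partition, and the overriding condition give:

$$\begin{aligned} P[B_kc \,|\, D \cdot H \cdot Rc \cdot E] = \frac{r_k \times d_k}{\sum _j r_j \times d_j}. \end{aligned}$$

The left-hand side equals $r_k$ for every k just in case each $d_k$ equals the weighted average $\sum _j r_j \times d_j$, which (since the $r_j$ are positive and sum to 1) happens just in case all the $d_k$ are equal. So D disturbs at least one of these direct inferences if and only if $d_i \ne d_j$ for some $B_ic$ and $B_jc$ in the partition.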

Theorem 9

Necessary and Sufficient Condition for Inadmissible Evidence.

We assume throughout that $P[D \cdot H \cdot Rc \cdot E] > 0$ (so that all conditional probabilities are well-defined).

Let $\Delta _R(c) = \{B_1c, B_2c, \dots \}$ be some partition of outcomes of initial state Rc for $P[\ \,|\, H \cdot Rc \cdot E]$ such that, for each $B_kc$ in $\Delta _R(c)$, the following direct inferences hold for admissible E:

$$\begin{aligned} P[B_kc \,|\, H \cdot Rc \cdot E] = P[B_kc \,|\, H \cdot Rc] = r_k, \quad \hbox {for}~\, r_k > 0. \end{aligned}$$

Suppose, for each $B_kc$ in $\Delta _R(c)$, $P[D \,|\, B_kc \cdot H \cdot Rc \cdot E] = P[D \,|\, B_kc \cdot Rc \cdot E]$.

Then we have the following result:

4 “Inappropriate” Credence Functions

It should be pretty clear that, given a specific account of admissibility, not all credence functions are “appropriate” in the way required by G-DIP and GG-DIP. Our next result shows that the axioms of classical probability put tight constraints on precisely which credence functions can get direct inference right. Let P be any “appropriate” initial credence function, which gets direct inferences from $(H \cdot Rc \cdot E)$ to chance outcomes $A_jc$ right, where E is admissible. Let Q be any credence function that varies from P by even a small shift in the non-direct inference credence for a chance outcome—i.e., such that $Q[A_jc \,|\, Rc \cdot E] \ne P[A_jc \,|\, Rc \cdot E]$. Then, provided that Q satisfies an additional weak condition, it cannot get all the direct inferences right.

One example of the additional weak condition is that P and Q agree on the amount of evidential support that $(Rc \cdot A_kc \cdot E)$ would provide to H, for each $A_kc$ in a partition. Another example is where Q comes from P via certain instances of Jeffrey Conditionalization (see Jeffrey 1990). Thus, some rather minor variants of credence functions that satisfy GG-DIP (including some that come about via the kinematics of Jeffrey updating) must fail to satisfy GG-DIP—they fail to count among the “appropriate” credence functions for direct inferences.

For the sake of clarity, we first present our results for binary chance outcomes, Ac and $\lnot Ac$. We generalize these results in a later subsection.

4.1 Examples of “Inappropriate” Credence Functions

Consider the Alzheimer’s example described in Sect. 3. Chance hypothesis H says that the chance of an individual in reference class R getting Alzheimer’s by age 70 is r; Rc says that Chuck is in reference class R; and Ac says that Chuck will get Alzheimer’s by age 70. Suppose that Maria and John agree on the amount of evidential support that $(Rc \cdot Ac)$, were it true, would supply to chance hypothesis H, given all their other relevant evidence E (on which they completely agree): $Q[H \,|\, Rc \cdot Ac \cdot E] = P[H \,|\, Rc \cdot Ac \cdot E]$, where P is Maria’s credence function and Q is John’s credence function. And also suppose they agree on the amount of evidential support that $(Rc \cdot \lnot Ac)$, were it true, would supply to chance hypothesis H, given all their other relevant evidence E: $Q[H \,|\, Rc \cdot \lnot Ac \cdot E] = P[H \,|\, Rc \cdot \lnot Ac \cdot E]$. However, Maria is somewhat more optimistic than John about Chuck’s future health, particularly his prospects of getting Alzheimer’s by age 70; thus, $Q[Ac \,|\, Rc \cdot E] > P[Ac \,|\, Rc \cdot E]$.

Although neither Maria nor John is confident that chance hypothesis H is true, both want to draw the correct direct inference value, r, when H is added to their total admissible evidence $(Rc \cdot E)$: $P[Ac \,|\, H \cdot Rc \cdot E] = r$ and $Q[Ac \,|\, H \cdot Rc \cdot E] = r$. However, it turns out that at least one of them must get the direct inference wrong, since: $P[Ac \,|\, H \cdot Rc \cdot E] \ne Q[Ac \,|\, H \cdot Rc \cdot E]$. That is, if Maria gets the direct inference right, then John must get it wrong.

Corollary of Theorem 10

“Inappropriate” Credence Functions.

Suppose, for admissible E, $P[Ac \,|\, H \cdot Rc \cdot E] = r > 0$ is a direct inference. (We assume $P[H \cdot Rc \cdot Ac \cdot E] > 0$ and $P[H \cdot Rc \cdot \lnot Ac \cdot E] > 0$, so that all the conditional probabilities are well-defined.)

Suppose that probability function Q is related to P in the following way (where $Q[Rc \cdot Ac \cdot E] > 0$ and $Q[Rc \cdot \lnot Ac \cdot E] > 0$):

$Q[H \,|\, Rc \cdot Ac \cdot E] = P[H \,|\, Rc \cdot Ac \cdot E]$ and $Q[H \,|\, Rc \cdot \lnot Ac \cdot E] = P[H \,|\, Rc \cdot \lnot Ac \cdot E]$.

Then, $Q[Ac \,|\, Rc \cdot E] \ne P[Ac \,|\, Rc \cdot E]$ if and only if $Q[Ac \,|\, H \cdot Rc \cdot E] \ne P[Ac \,|\, H \cdot Rc \cdot E]$.

And, if $Q[H \,|\, Rc \cdot E] \ne P[H \,|\, Rc \cdot E]$, then $Q[Ac \,|\, H \cdot Rc \cdot E] \ne P[Ac \,|\, H \cdot Rc \cdot E]$.

Proof

Follows immediately from setting $\Delta (c)=\{Ac, \lnot Ac\}$ in the more general Theorem 10 below. $\square $
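
The binary case can also be checked directly, which may help to show where the result comes from. The following is a sketch under the corollary’s own suppositions; the abbreviations $\alpha = P[H \,|\, Rc \cdot Ac \cdot E]$, $\beta = P[H \,|\, Rc \cdot \lnot Ac \cdot E]$, and the function g are introduced here and are not the authors’ notation. By supposition, Q assigns these same two values $\alpha $ and $\beta $, and both are positive. For either credence function, with s standing for its value for Ac given $(Rc \cdot E)$, Bayes’ theorem gives:

$$\begin{aligned} g(s)&= \frac{\alpha s}{\alpha s + \beta (1-s)},&\frac{d g}{d s}&= \frac{\alpha \beta }{(\alpha s + \beta (1-s))^2} > 0, \end{aligned}$$

where $g(s)$ is the resulting credence for Ac given $(H \cdot Rc \cdot E)$. Because g is strictly increasing, P and Q agree on the credence for Ac given $(H \cdot Rc \cdot E)$ if and only if they agree on the credence for Ac given $(Rc \cdot E)$. Moreover, the credence for H given $(Rc \cdot E)$ is $\alpha s + \beta (1-s)$ for each function, so a disagreement about H given $(Rc \cdot E)$ requires a disagreement about s, and hence a disagreement about the direct inference.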

Jeffrey Conditionalization is the best known approach to the representation of learning based on uncertain new evidence. It deals with cases where, rather than learning by becoming certain of new information F, the agent has an experience or an insight that directly changes her confidence in the truth of each alternative among some range of possibilities, $\{F_1, F_2, \dots , F_n\}$. Formally, when P is the agent’s initial credence function, her new information induces a new credence function Q that directly assigns new credence values to the directly affected alternative possibilities in $\{F_1, F_2, \dots , F_n\}$ as follows: $Q[F_i \cdot F_k] = 0$ for $i \ne k$ (they are alternative possibilities), $\sum _{j=1}^n Q[F_j] = 1$ (they are a complete collection of alternative possibilities). The relationship between the old credence function P and the new credence function Q is this: $Q[G \,|\, F_j] = P[G \,|\, F_j]$, for all statements G, for each $F_j$ in $\{F_1, F_2, \dots , F_n\}$. That is, were the agent to become certain of any one of the statements $F_j$, her new credence value, $Q[G \,|\, F_j]$, should be identical to the old credence value, $P[G \,|\, F_j]$ (for each statement G). It follows immediately that, for each statement G, the new credence value is given by $Q[G] = \sum _{j=1}^n P[G \,|\, F_j] \times Q[F_j]$.^{Footnote 26} We now consider a case where Jeffrey Conditionalization (or a similar update method) induces a new credence function that must get direct inferences wrong.
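
For readers who want to see the Jeffrey update formula at work, here is the one-line derivation followed by a toy calculation. The derivation uses only the conditions just stated; the numbers (two alternatives with $P[G \,|\, F_1] = .8$, $P[G \,|\, F_2] = .3$, an old credence $P[F_1] = .5$, and a new credence $Q[F_1] = .7$) are hypothetical and chosen only for illustration. We assume each $Q[F_j] > 0$, so that the conditional credences are well-defined.

$$\begin{aligned} Q[G]&= \sum _{j=1}^n Q[G \cdot F_j] = \sum _{j=1}^n Q[G \,|\, F_j] \times Q[F_j] = \sum _{j=1}^n P[G \,|\, F_j] \times Q[F_j],\\ P[G]&= .8 \times .5 + .3 \times .5 = .55,\\ Q[G]&= .8 \times .7 + .3 \times .3 = .65. \end{aligned}$$

The shift in credence from $F_2$ toward $F_1$ thus carries over to G in proportion to how strongly each alternative supports G, while the conditional credences themselves stay fixed.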

Consider once again the Alzheimer’s example from Sect. 3. As before, chance hypothesis H says that the chance of an individual in reference class R getting Alzheimer’s by age 70 is r; Rc says that Chuck is in reference class R; and Ac says that Chuck will get Alzheimer’s by age 70; statement D says that Chuck’s brain scan at age 50 shows a “moderate accumulation” of plaque. Suppose (this time) that Maria considers the relevance of the chance claim H to brain imaging result D to be overridden by the claim, “Chuck gets Alzheimer’s by age 70” (if added as a premise): $P[D \,|\, H \cdot Rc \cdot Ac \cdot E] = P[D \,|\, Rc \cdot Ac \cdot E]$. Similarly, suppose Maria considers the relevance of the chance claim H to brain imaging result D to be overridden by the claim, “Chuck does not get Alzheimer’s by age 70” (if added as a premise): $P[D \,|\, H \cdot Rc \cdot \lnot Ac \cdot E] = P[D \,|\, Rc \cdot \lnot Ac \cdot E]$. Furthermore, suppose Maria isn’t privy to the result of Chuck’s brain scan, but she overhears two technicians talking about it. What she hears is vague (mostly tone of voice), but her impression changes her credence from $P[Ac \,|\, Rc \cdot E] = s$ to $Q[Ac \,|\, Rc \cdot E] = t > s$. Maria updates her credences via Jeffrey Conditionalization, according to (1) and (2) below. Thus, her new credence function must get the direct inference (concerning Chuck having Alzheimer’s by age 70) wrong: $Q[Ac \,|\, H \cdot Rc \cdot E]\ne r = P[Ac \,|\, H \cdot Rc \cdot E]$, for admissible E.

Corollary of Theorem 11

“Inappropriate” Credence Functions, Extended.

Proof

Follows immediately from setting $\Delta (c)=\{Ac, \lnot Ac\}$ and $\Gamma =\{D,\lnot D\}$ in the more general Theorem 11 below. $\square $

Our result here fits the pattern of Jeffrey Conditionalization, but our result is more general. For, the result itself doesn’t assume that every statement is updated via the Jeffrey update formula; it only supposes that the update formula applies to $(Rc \cdot Ac)$, $(Rc \cdot \lnot Ac)$, $(H \cdot Rc \cdot Ac)$, and $(H \cdot Rc \cdot \lnot Ac)$. Furthermore, the result itself says nothing about updating, and need not be interpreted that way. Rather, the result applies to any pair of credence functions, Q and P, whatever their origins. The result says that for any credence function P that satisfies the initial suppositions, and for any credence function Q related to P as specified by conditions (1) and (2), when they disagree on the credence values for chance outcome Ac based on $(Rc \cdot E)$ alone, then (and only then) at least one of them must get the direct inference wrong; so at least one of them must be an “inappropriate” credence function according to GG-DIP.
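The pattern in the Alzheimer’s example can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: it applies full Jeffrey Conditionalization on the partition $\{D, \lnot D\}$ (a special case of the result), holds $Rc \cdot E$ fixed in the background, and uses hypothetical credence values chosen so that H is irrelevant to D once Ac (or $\lnot Ac$) is given. None of the names or numbers come from the text.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical prior credences (Rc and E are held fixed in the background).
p_A = Fr(1, 5)                                  # P[Ac]
p_H_given = {True: Fr(1, 2), False: Fr(1, 4)}   # P[H | Ac], P[H | not-Ac]
p_D_given = {True: Fr(4, 5), False: Fr(1, 5)}   # P[D | Ac], P[D | not-Ac]

def bern(p, value):
    """Probability that a two-valued statement takes the given truth value."""
    return p if value else 1 - p

# Worlds are triples (h, a, d). Building the joint this way makes H and D
# independent given Ac and given not-Ac, as the example supposes.
P = {(h, a, d): bern(p_A, a) * bern(p_H_given[a], h) * bern(p_D_given[a], d)
     for h, a, d in product([True, False], repeat=3)}

def prob(F, pred):
    return sum(p for world, p in F.items() if pred(*world))

def cond(F, pred, given):
    """Conditional credence F[pred | given]."""
    joint = prob(F, lambda h, a, d: pred(h, a, d) and given(h, a, d))
    return joint / prob(F, given)

def jeffrey_on_D(F, q_D):
    """Jeffrey update on {D, not-D}, moving the credence in D to q_D."""
    p_D = prob(F, lambda h, a, d: d)
    return {(h, a, d): p * (q_D / p_D if d else (1 - q_D) / (1 - p_D))
            for (h, a, d), p in F.items()}

is_A = lambda h, a, d: a
is_H = lambda h, a, d: h

r = cond(P, is_A, is_H)            # direct inference value under P
Q = jeffrey_on_D(P, Fr(3, 4))      # vague evidence raises credence in D
print(prob(P, is_A), prob(Q, is_A))    # credence in Ac rises
print(r, cond(Q, is_A, is_H))          # Q[Ac | H] no longer equals r
```

In this model $P[Ac \,|\, H] = 1/3$, but after the update $Q[Ac \,|\, H] = 106/189$. If the update leaves the credence in D (and hence in Ac) unchanged, the two values coincide, which matches the "then (and only then)" clause above.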

4.2 Generalization to Algebras of Outcomes

We now state the main results of this section in a more general form. The corollaries stated earlier follow directly from these.

Theorem 10

“Inappropriate” Credence Functions.

Let $\Delta (c) = \{B_1c, B_2c, \dots \}$ be a partition for $P[\ \,|\, Rc \cdot E]$, where according to H the members of $\Delta = \{B_1, B_2, \dots \}$ are chance outcome attributes for systems in state R, and where, for admissible E, $P[B_kc \,|\, H \cdot Rc \cdot E] = r_k > 0$ are direct inferences. (We assume $P[H \cdot Rc \cdot B_kc \cdot E] > 0$ for each $B_kc$ in $\Delta (c)$, so that all the conditional probabilities are well-defined.)

Suppose probability function Q is related to P in the following way, where $Q[Rc \cdot B_kc \cdot E] > 0$ for each $B_kc$ in $\Delta (c)$:

The next theorem applies to all cases where probability function Q comes from function P via Jeffrey Conditionalization, but it applies to lots of other Q functions as well. Conditions (3.1) and (3.2) of the theorem only require the weaker claim that the $Q[\ \,|\, E]$ values for expressions $(Rc \cdot B_jc)$ and $(H\cdot Rc \cdot B_jc)$ (for each $B_jc$ in $\Delta (c)$) are related to their $P[\ \,|\, E]$ values by Jeffrey’s formula on partition $\Gamma = \{D_1, D_2, \dots \}$. Full Jeffrey Conditionalization would require the stronger claim that the $Q[\ \,|\, E]$ values for all expressions are related to their $P[\ \,|\, E]$ values via Jeffrey’s formula on partition $\Gamma = \{D_1, D_2, \dots \}$. When full Jeffrey Conditionalization applies, the supposition that $\Delta (c)$ is a partition for $Q[\ \,|\, Rc \cdot E]$ (supposition (2)) is derivable from Jeffrey’s formula, since $\Delta (c)$ is a partition for $P[\ \,|\, Rc \cdot E]$.

Theorem 11

“Inappropriate” Credence Functions, Extended.

Let $\Delta (c) = \{B_1c, B_2c, \dots \}$ be a partition for $P[\ \,|\, Rc \cdot E]$, where according to H the members of $\Delta = \{B_1, B_2, \dots \}$ are chance outcome attributes for systems in state R, and where, for admissible E, $P[B_kc \,|\, H \cdot Rc \cdot E] = r_k > 0$ are direct inferences. (We assume $P[H \cdot Rc \cdot B_kc \cdot E] > 0$ for each $B_kc$ in $\Delta (c)$, so that all the conditional probabilities are well-defined.)

Suppose probability function Q is related to P in the following way, where $Q[Rc \cdot B_kc \cdot E] > 0$ for each $B_kc$ in $\Delta (c)$, and where $\Gamma = \{D_1, D_2, \dots \}$ is a partition for $Q[\ \,|\, E]$, with each $Q[D_i \,|\, E] > 0$:

5 Reference Class Problems

Accounts of direct inference, Bayesian or not, often encounter troubles in dealing with overlapping reference classes or initial chance states. Lots of ink has been spilt trying to sort out these problems.^{Footnote 27} In this section we raise some troubles for Bayesian accounts. We focus on issues that arise when the object language notion, ch, is some kind of objective chance. (Frequency accounts have distinct troubles of their own.) We will suggest some ways a Bayesian account may deal with these troubles.

5.1 Defeat by Outcome Attributes

Consider the case where an extensive chance hypothesis H entails chances for at least two distinct outcome attributes, Ax and Bx, for initial state R—i.e. Ax and Bx are members of the algebra of outcome attributes $\Theta _R$. Then it will usually be the case that possible outcome Bc for system c defeats the direct inference from $(H \cdot Rc \cdot E)$ to outcome Ac, for admissible E:^{Footnote 28}

$$\begin{aligned} P[Ac \,|\, H \cdot Rc \cdot Bc \cdot E] \ne P[Ac \,|\, H \cdot Rc \cdot E] = r. \end{aligned}$$

Defeat of this kind turns out to be easy to finesse. Indeed, when H is an extensive chance hypothesis, as defined earlier, defeat of this kind turns into a direct inference success. For, whenever an extensive chance hypothesis H entails $ch(Ax,Rx) =r$, and Bx is a chance attribute for Rx according to H, then H must also entail $ch(Bx,Rx)=s$ and $ch(Ax\cdot Bx,Rx)=t$, where s and t are standard terms for real numbers. Thus, for admissible E, the following two direct inferences result:

$$\begin{aligned} P[Bc \,|\, H \cdot Rc \cdot E] = s ~\hbox {and}~ P[Ac\cdot Bc \,|\, H \cdot Rc \cdot E] = t. \end{aligned}$$

So, although Bc defeats the simple direct inference to Ac, we still obtain the direct inference we should want, but we get it via the following “complex direct inference”:

$$\begin{aligned} P[Ac \,|\, H \cdot Rc \cdot Bc \cdot E] = P[Ac\cdot Bc \,|\, H \cdot Rc \cdot E] / P[Bc \,|\, H \cdot Rc \cdot E] = t/s. \end{aligned}$$

This is exactly the value we should want for $P[Ac \,|\, H \cdot Rc \cdot Bc \cdot E]$. And we’ve gotten it without complicating the account of chance by taking on a notion of conditional chance. That is, when Bx is an outcome attribute for Rx, the Bayesian machinery yields the desired direct inference value for Ax without needing to draw on chance expressions of form $ch(Ax ,Rx\cdot Bx)=q$.^{Footnote 29} This approach avoids drawing on the notion of conditional chance, and the attendant difficulties identified by Humphreys (1985). It also benefits by not requiring the account of chance to make sense of expressions that conditionalize on outcome attributes: when Bx is an outcome attribute for Rx, what does an expression of form $ch(Ax ,Rx\cdot Bx)=q$ say?^{Footnote 30}
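A small worked instance may help fix ideas about the complex direct inference. The sketch assumes a hypothetical extensive chance hypothesis H for a fair six-sided die, with Rx the state of being rolled, Ax the attribute "lands even", and Bx the attribute "lands greater than 3". These choices are illustrative only and do not appear in the text.

```python
from fractions import Fraction as Fr

# Hypothetical extensive chance hypothesis H: each face has chance 1/6.
faces = range(1, 7)
ch_face = {k: Fr(1, 6) for k in faces}

def chance(attribute):
    """ch(attribute, Rx) according to H, summed over the faces that have it."""
    return sum(ch_face[k] for k in faces if attribute(k))

A = lambda k: k % 2 == 0      # lands even
B = lambda k: k > 3           # lands greater than 3

r = chance(A)                              # ch(Ax, Rx)
s = chance(B)                              # ch(Bx, Rx)
t = chance(lambda k: A(k) and B(k))        # ch(Ax.Bx, Rx)

# Complex direct inference: P[Ac | H.Rc.Bc.E] = t / s.
complex_value = t / s
print(r, s, t, complex_value)   # prints 1/2 1/2 1/3 2/3
```

Here Bc defeats the simple direct inference to Ac (the value moves from 1/2 to 2/3), yet the new value is fixed entirely by the unconditional chances s and t that H supplies.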

One more point before moving on. The treatment described above works well for extensive chance hypotheses. But what about cases where H is not extensive, say, where H only entails one of $ch(Bx,Rx)=s$ or $ch(Ax\cdot Bx,Rx)=t$? In that case, although Bc should defeat the direct inference to Ac,

$$\begin{aligned} P[Ac \,|\, H \cdot Rc \cdot Bc \cdot E] \ne P[Ac \,|\, H \cdot Rc \cdot E] = r, \end{aligned}$$

the Bayesian direct inference approach doesn’t produce a chance-based value for ${P[Ac \,|\, H \cdot Rc \cdot Bc \cdot E]}$. Is this a problem for the Bayesian account?

By Bayesian lights, not at all. The incomplete, non-extensive chance hypothesis cannot supply the desired direct inference, but this is just as it should be! First, recall that the present account of direct inference doesn’t suppose that the agent is certain of the chance hypothesis involved. Application of the Bayesian direct inference principle (GG-DIP or G-DIP) only supposes that the agent’s total evidence is expressed by $(Rc \cdot E)$, or by $(Rc \cdot Bc \cdot E)$ in this case, and contemplates the appropriate credence value when a chance hypothesis is added (as an additional premise) to this evidence. It does not suppose that the agent’s total evidence contains the chance hypotheses on which the direct inferences depend. The main issue for the theory of direct inference is to determine the conditions under which the addition of a chance hypothesis (however well confirmed) to an agent’s total evidence specifies appropriate direct inferences to possible outcomes. In this regard, the direct inference principle does not privilege any one chance hypothesis over another.

So, one plausible Bayesian line goes like this. It is not at all surprising that an incomplete chance hypothesis may fail to produce a direct inference when it fails to specify appropriate chance claims. The failure of the Bayesian account to produce direct inferences in such cases is not a fault of the account. Indeed, when hypothesis H doesn’t include the chance claim $ch(Ax,Rx)=r$, it is no fault of the Bayesian account that it fails to produce the direct inference $P[Ac \,|\, H \cdot Rc \cdot E] = r$. Similarly, when an incomplete chance hypothesis H fails to supply one of the chance claims $ch(Bx,Rx)=s$ or $ch(Ax\cdot Bx,Rx)=t$, it is no fault of the Bayesian account that it fails to produce one of the direct inferences $P[Bc \,|\, H \cdot Rc \cdot E] = s$ or $P[Ac\cdot Bc \,|\, H \cdot Rc \cdot E] = t$, and so fails to produce the appropriate direct inference $P[Ac \,|\, H \cdot Rc \cdot Bc \cdot E] = t/s$. In such a case, a more filled-out extension of H hypothesizes specific chance values for ch(Bx, Rx) and $ch(Ax\cdot Bx,Rx)$, and can thereby supply the appropriate direct inferences. If an agent lacks confidence in any of the filled-out extensions of H, then she simply needs to acquire more evidence for (or against) them, in the usual Bayesian way.

5.2 Competing Chance Claims

We now turn to cases where two chance claims may compete for direct inference priority. This can only happen when two chance claims about the same outcome attribute have “overlapping reference classes”—i.e. when some chance systems can be in two distinct initial chance states, R and S, at the same time, and where both initial chance states provide chances for the same outcome attribute, A. Bayesian direct inference runs into some trouble in trying to accommodate this situation. We’ll suggest some ways that the Bayesian account may deal with these troubles.

Let $P[Ac \,\,|\,\, ch(Ax,Rx)=r \cdot Rc \cdot E] = r$ be a perfectly good direct inference (for admissible E). Then, presumably, $P[Ac \,\,|\,\, ch(Ax,Rx)=r \cdot Rc \cdot ch(Ax, Sx)=s \cdot E] = r$, where $s\ne r$, should also be a perfectly good direct inference. The addition of some chance claim $ch(Ax,Sx)=s$ should not be problematic for such straightforward direct inferences. Otherwise, extended chance hypotheses, involving multiple chance claims, would be unable to ground direct inferences. Now, the usual way to raise “multiple reference class problems” for direct inference goes like this. Suppose we add Sc as a premise to this direct inference. This clearly must defeat the direct inference, since we have two equally good but incompatible direct inferences:

$P[Ac \,\,|\,\, ch(Ax,Rx)=r \cdot Rc \cdot ch(Ax,Sx)=s \cdot Sc \cdot E] = r \ne s = P[Ac \,\,|\,\, ch(Ax,Rx)=r \cdot Rc \cdot ch(Ax,Sx)=s \cdot Sc \cdot E]$.

Thus, we must have $P[Ac \,\,|\,\, ch(Ax,Rx)=r \cdot Rc \cdot ch(Ax,Sx)=s \cdot Sc \cdot E] \ne r$ (or $\ne s$).^{Footnote 31}

What happens when $\lnot Sc$, instead of Sc, is added as a premise? Since Sc defeats the direct inference, its negation must also defeat it (see Theorem 2), so: $P[Ac \,\,|\,\, ch(Ax,Rx)=r \cdot Rc \cdot ch(Ax,Sx)=s \cdot \lnot Sc \cdot E] \ne r$.

Now, on the usual story, this kind of defeat may be averted when state Sx is a sub-state of Rx, that is, when every possible system in state Sx must also be in state Rx. We may express this as $\forall x (Sx\supset Rx)$ if the quantifier is taken to range over all possible systems, or modally as $\Box \forall x (Sx\supset Rx)$ when the quantifier is more restricted. The sub-state claim can then be expressed by adding this statement to the premise of the direct inference. However, for our purposes the same idea can be expressed by replacing the chance claim $ch(Ax,Sx)=s$ in the above example with the claim $ch(Ax,Rx\cdot Sx)=s$. With this replacement, the following should be a perfectly good direct inference: $P[Ac \,\,|\,\, ch(Ax,Rx)=r \cdot Rc \cdot ch(Ax,Rx\cdot Sx)=s \cdot Sc \cdot E] = s$.^{Footnote 32}

That’s the usual idea. But it presents problems in a Bayesian context. Here is why. Let H be $(ch(Ax,Rx)=r \cdot ch(Ax,Rx\cdot Sx)=s)$. Let’s suppose (as seems reasonable) that s can be quite far away from r.^{Footnote 33} Consider the following equation, which follows from the axioms of probability theory, assuming that $0< P[H \cdot Rc \cdot Sc \cdot E] < 1$ and $0< P[H \cdot Rc \cdot \lnot Sc \cdot E] < 1$:

$$\begin{aligned} r= & {} P[Ac \,\,|\,\, H \cdot Rc \cdot E] \\= & {} P[Ac \,\,|\,\, H \cdot Rc \cdot Sc \cdot E] \times P[Sc \,\,|\,\, H \cdot Rc \cdot E] + P[Ac \,\,|\,\, H \cdot Rc \cdot \lnot Sc \cdot E] \\&\times \, (1 - P[Sc \,\,|\,\, H \cdot Rc \cdot E]) \\= & {} s \times P[Sc \,\,|\,\, H \cdot Rc \cdot E] + P[Ac \,\,|\,\, H \cdot Rc \cdot \lnot Sc \cdot E] \\&\times \, (1 - P[Sc \,\,|\,\, H \cdot Rc \cdot E]), \end{aligned}$$

provided that $(Rc \cdot E)$ and $(Rc \cdot Sc \cdot E)$, respectively, are admissible for the two direct inferences $P[Ac \,\,|\,\, H \cdot Rc \cdot E] = r$ and $P[Ac \,\,|\,\, H \cdot Rc \cdot Sc \cdot E] = s$. However, in the normal course of events an agent’s total evidence may push the value of her credence, $P[Sc \,\,|\,\, H \cdot Rc \cdot E]$, close to 1. When that happens, the value of $P[Ac \,\,|\,\, H \cdot Rc \cdot E]$ must approach s.^{Footnote 34} This contradicts the supposition that $P[Ac \,\,|\,\, ch(Ax,Rx)=r \cdot Rc \cdot ch(Ax,Rx\cdot Sx)=s \cdot E]$ equals r, the value the direct inference should apparently have.
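The tension can be made concrete by solving the displayed equation for the remaining term. Writing $p = P[Sc \,\,|\,\, H \cdot Rc \cdot E]$ and $x = P[Ac \,\,|\,\, H \cdot Rc \cdot \lnot Sc \cdot E]$, the equation $r = s \times p + x \times (1 - p)$ gives $x = (r - s p)/(1 - p)$, and x must lie in $[0, 1]$. The sketch below computes x and the largest p compatible with both direct inferences. The function names and the numerical values are hypothetical, and the bound is derived here from the displayed equation rather than quoted from the text.

```python
from fractions import Fraction as Fr

def not_S_value(r, s, p):
    """Solve r = s*p + x*(1 - p) for x = P[Ac | H.Rc.not-Sc.E]; needs p < 1."""
    return (r - s * p) / (1 - p)

def max_credence_in_S(r, s):
    """Largest p = P[Sc | H.Rc.E] that keeps x within [0, 1].

    x >= 0 requires p <= r/s, and x <= 1 requires p <= (1 - r)/(1 - s).
    Assumes 0 < s < 1.
    """
    return min(Fr(1), r / s, (1 - r) / (1 - s))

# Hypothetical chance values with s far from r.
r, s = Fr(1, 10), Fr(1, 2)

p_max = max_credence_in_S(r, s)
print(p_max)                           # prints 1/5
print(not_S_value(r, s, p_max))        # prints 0
print(not_S_value(r, s, Fr(1, 2)))     # prints -3/10, not a probability
```

With these values, any credence in Sc above 1/5 is incompatible with holding both direct inference values fixed, so a credence in Sc near 1 forces $P[Ac \,\,|\,\, H \cdot Rc \cdot E]$ away from r.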

Notice that this analysis doesn’t really depend on whether E itself provides evidence for or against Sc. Even in cases where the evidence E says nothing about Sc, the value an agent assigns to $P[Sc \,\,|\,\, H \cdot Rc \cdot E]$ (perhaps only due to her gut feeling) may force her credence value for Ac to significantly depart from the direct inference value based on $ch(Ax,Rx)=r$.

One Bayesian response to this problem is to restrict the agent’s possible credence values for Sc so as not to permit the defeat of the direct inference unless E contains explicit evidence for or against Sc. Direct inference restricts other credence values, including the value for Sc. That should not be at all surprising. Any axiom or constraint added to the usual axioms for conditional probabilities is bound to result in the propagation of constraints on credence values throughout the system. Given the way that the credence value for Sc depends on the recommended direct inference value for Ac, one may simply maintain that the direct inference rule provides a kind of objectivist Bayesian constraint on what credence values Sc may take.^{Footnote 35}

However, Bayesians are also free to reject this kind of constraint on credence values for Sc, provided they can find some other way to accommodate the above analysis. For instance, they may adopt a more straightforward response to this problem: simply void (or invalidate) the direct inference $P[Ac \,\,|\,\, H \cdot Rc \cdot E] = r$ in all cases where H contains a chance claim, $ch(Ax,Rx\cdot Sx)=s$, based on a more specific initial chance state than Rx. This may be a more coherent view than the “objectivist” approach described above. For, clearly in some cases the credence for Sc may be near 1 based on good evidence, stated within E. In such cases the agent’s credence for Ac should be close to s rather than r. But then, precisely how much evidence, and of what kind, must occur within E to warrant a value of $P[Sc \,\,|\,\, H \cdot Rc \cdot E]$ that can break the direct inference based on $ch(Ax,Rx)=r$? Rather than try to parse this tricky issue (which may have no clear solution), it may make better sense to simply let the presence of the more specific chance claim override the weaker chance claim, as the above analysis seemed to initially suggest.

One of the authors (Wallmann) takes the overall thrust of this analysis to show that Bayesian direct inference cannot work properly—that it should be rejected in favor of some more lenient, more intuitively plausible account of direct inference. The idea that a direct inference based on $(ch(Ax,Rx)=r \cdot Rc)$ should be defeated simply by the presence of some additional chance claim that draws on a more specific chance state, $ch(Ax,Rx\cdot Sx)=s$, absent an assertion of the applicability of that chance claim, $(Rc\cdot Sc)$, just seems too implausible. The other author finds the above Bayesian response both acceptable and reasonable, although he finds it somewhat surprising that the Bayesian account of direct inference leads to this view.

A further move in the spirit of the “straightforward approach” suggested above is a Bayesian approach that rules out the very possibility of overlapping initial chance-states that have outcome attributes in common.^{Footnote 36} This chance-state overlap restriction has an important precedent. Our best indeterministic scientific theory, quantum theory, does not draw on overlapping initial quantum states. Each quantum system is in precisely one basic quantum state at any given time, and that state completely accounts for chances of quantum outcomes (upon system collapse, or upon “measurement”). To make good on this view, we need an account of how the usual kinds of chance models of macro-systems can be accommodated within the Bayesian direct inference framework without drawing on overlapping initial states that have outcome attributes in common.

When a chance hypothesis asserts that the chance of Ax (dying by age 75) for systems in chance state Rx (male in good health at age 50) is r, the applicability of the chance claim, $ch(Ax,Rx)=r$, is of little import if it fails to account for important risk factors. For instance, if it hasn’t taken into account whether (and how much) an individual smokes, Sx, then it doesn’t tell you much of anything about anyone’s individual chances. So, perhaps $ch(Ax,Rx\cdot Sx)=s$ is the more relevant chance claim for Chuck. And if state Sx is relevant, so is state $\lnot Sx$, which yields some chance claim $ch(Ax,Rx\cdot \lnot Sx)=t$. Indeed, the amount an individual smokes is relevant, so instead of Sx and $\lnot Sx$, perhaps a range of alternatives, describing amount smoked, and for how many years, is in order: $ch(Ax,Rx\cdot S_jx)=s_j$ for a range of categories $S_jx$. So, supposing Chuck is a 50-year-old male in good health who has never smoked, does $ch(Ax,Rx\cdot S_0x)=s_0$ capture his chances of dying by age 75? How much does Chuck drink? Is he engaged in a particularly hazardous occupation? The point is that Chuck’s chances depend on the most specific relevant chance state to which he belongs, according to the most specific, accurate chance hypothesis we can develop (and evidentially support) about people in various initial states of health. Anything less is at best an approximation of Chuck’s real chances.^{Footnote 37}

A Bayesian approach that excludes overlapping initial chance states will need to draw on hypotheses about approximate chance models, where these chance models rely on most basic initial chance states, that is, chance states that are most basic according to the model. Associated with any given chance model is a chance hypothesis that asserts that the model fits the real world to some specified degree of approximation. Fitting the world means capturing the most significant causal factors and their associated chances for producing various kinds of outcomes. Evidence for such hypotheses confirms those that do the best job of capturing the most significant causal factors. Such approximations of chance mechanisms are the best we can hope for within the special sciences. So, the fact that a Bayesian approach to direct inference needs to draw on hypotheses about chance models for macroscopic systems (and the basic initial chance states posited by such models) is no defect. Any theory of direct inference, Bayesian or not, will need to accommodate hypotheses about approximate chance models, since that’s the best the special sciences can offer. And each such model will have chance states that are most basic for that model.

6 Conclusion

In this paper we’ve identified a variety of different kinds of statements that are logically inadmissible for Bayesian direct inference. Such statements must defeat direct inferences on any coherent Bayesian account. In particular, whenever such information is available to the Bayesian agent, it supplies credence values for chance outcomes that significantly depart from the fair betting odds represented by objective chance statements. One of the authors (Wallmann) finds these results so counter-intuitive that he advocates giving up Bayesian direct inference.^{Footnote 38} He favors some alternative account on which direct inferences remain intact when faced with such information. The other author thinks that whenever an agent is in possession of such information, those deviations from objective chance values required by the Bayesian account make good sense. We agree that the Bayesian account places severe constraints on the theory of chance. Whether the costs imposed by these constraints are paid for by the avowed Bayesian benefits remains unresolved, for now.

Notes

See Peirce (1883), Venn (1888), Reichenbach (1949), Salmon (1971), Kyburg (1961, 1974), Levi (1977), Pollock (1990), Bacchus (1990), Thorn (2012, 2018), and Wallmann (2017).
Direct inference principles have been proposed by a number of probabilistic logicians. Prominent among them are proposals by Carnap (1962), Kyburg and Teng (2001), Levi (1977), Lewis (1980), Pollock (2011), Bacchus (1990), and Thorn (2012, 2018). These accounts differ in their interpretations of the P and ch notions. Carnap and Kyburg take the ch notion to be frequencies (of attributes among members of reference classes), Pollock interprets it as nomic probability (or proportions among physically possible objects), and Bacchus and Thorn explicate it as a kind of expected frequency; Levi and Lewis both take ch to be some kind of objective chance, although their accounts of chance differ in significant ways (e.g. Lewis takes chance statements to apply to whole propositions at specific times, while Levi takes them to apply to predicates containing free variables, as in G-DIP). These accounts also interpret the P notion in several distinct ways. Carnap, Levi, and Lewis take the P notion to be Bayesian probability functions of some kind, although they differ on the interpretation of these probability functions (e.g. for Carnap they are logical, for Levi they are credal probability functions (relative to a potential corpus of certain knowledge, K), for Lewis they are reasonable initial credence functions).
Contrary to what the term direct inference suggests, probability$_1$ statements are not strictly inferred from probability$_2$ statements. G-DIP is a statement about what value certain conditional probabilities should attain. However, since the name ‘direct inference’ has regularly been used for principles like G-DIP, we use it here as well.
When ‘ch(Ax, Rx)’ represents the relative frequency of A among R, perhaps the premise must also include a statement saying that c is a random member of R with respect to being an A. The proper way to spell out the account of randomness is controversial.
This account of admissibility seems to work just fine, provided that chance is taken to be fundamental. The well-known “bug” in this approach only bites those already infected by the Lewisian-Humean best systems account of the nature of chance. For discussion of this “bug” see Lewis (1994), Hall (1994), and Thau (1994).
This terminology parallels its use for defeasible conditionals. Direct inferences are defeasible in much the same way that some kinds of strong conditionals are defeasible. For such conditionals, when $(C \rightarrow A)$ holds, the addition of some statements D to the antecedent may defeat the connection between C and A: thus, ${((C\cdot D) \rightarrow A)}$ may fail to hold.
Lewis (1980) places additional constraints on “appropriate” credence functions. One of his requirements is that they be initial credence functions. That is, for direct inference to work properly, the same initial credence function must be maintained throughout. For, in order for any account of direct inference to work properly, careful account must be kept of whatever statements E is conditionalized upon, so that their admissibility for a proposed direct inference may be assessed. When probabilistic updating occurs in the usual Bayesian kinematic way, via $P_{new}[S] = P[S \,|\, K]$ for an agent who learns K, the updated function $P_{new}$ suppresses the learned information K by assigning $P_{new}[K] = 1$. This introduces complications with admissibility assessments of information that might well defeat a proposed direct inference. To assess whether $P_{new}[Ac \,|\, ch(Ax,Rx)=r\cdot Rc] = r$ holds, one needs to keep track of any update-information K from which $P_{new}$ results, so that one can assess its admissibility. In cases where all updating is via explicit information K, this is easy enough to accomplish, but not significantly different than simply making information K explicit as a premise in the initial credence function, $P[S \,|\, K]$. However, whenever $P_{new}$ results from P in a less direct way, such as via Jeffrey conditionalization, the resulting credence function may be deflected from the (seemingly appropriate) direct inference value, with no justification via the inadmissibility of some explicit statement K. Later, in Sect. 4, we will construct a specific example of this kind. So, Lewis’s approach to direct inference largely bypasses the kinematics of Bayesian updating. Rather, the Bayesian agent is taken to employ the same initial credence function throughout. On this approach, Bayesian updating simply amounts to what the logic of credence functions implies about the results of conditionalizing on additional premises.
When an agent’s betting quotient is r (or less), she should be willing to place a bet that loses her 1 dollar (or less) if A turns out to be false, but gains her at least $(1-r)/r$ dollars if A turns out to be true (supposing the utility curve is linear for the amounts of money involved).
Lewis’s article suggests a continuum of possible chance values for ch(Ax, Rx); so it makes sense to read “summing” to mean integrating, the limit of summing over arbitrarily small intervals: $P[Ac \,|\, Rc \cdot E] = \int _{0}^{1} q\times p[ch(Ax,Rx)=q \,|\, Rc \cdot E]\, dq$. The function $p[ch(Ax,Rx)=q \,|\, Rc \cdot E]$ is a density function such that $P[u< ch(Ax,Rx)\le v \,|\, Rc \cdot E] = \int _{u}^{v} p[ch(Ax,Rx)=q \,|\, Rc \cdot E]\, dq$.
Detailed proofs of all theorems can be found in the “Appendix”.
It seems that there is a causal asymmetry between John’s offer to pay for dinner and the outcome of the dice roll. We will consider this issue below in Sect. 2.3, so hold it aside for now.
White (2010) uses a similar example to argue that imprecise credences lead to irrationality. However, White’s argument assumes both $P[Ac \,|\, H \cdot Rc \cdot E \cdot (F \equiv Ac)] = P[Ac \,|\, H \cdot Rc \cdot E]$ and $P[F \,|\, H \cdot Rc \cdot E \cdot (F \equiv Ac)] = P[F \,|\, H \cdot Rc \cdot E]$. Given these assumptions, it’s straightforward to show that $P[F \,|\, H \cdot Rc \cdot E \cdot (F \equiv Ac)] = P[Ac \,|\, H \cdot Rc \cdot E \cdot (F \equiv Ac)]$. It then follows that $P[F \,|\, H \cdot Rc \cdot E] = P[Ac \,|\, H \cdot Rc \cdot E]$, i.e. that $P[F \,|\, H \cdot Rc \cdot E]$ must have the same value, r, as the direct inference $P[Ac \,|\, H \cdot Rc \cdot E] = r$, regardless of the content of statement F. In our example this means that the value of $P[F \,|\, H \cdot Rc \cdot E]$ must be 1/6 for Maria: her rational credence that John will buy her dinner this evening must be 1/6 before John even brings up the subject. This is absurd! For, by similar reasoning, had it happened that John instead told Maria that he’d buy her dinner just in case seven does not turn up on the next toss (had he told her $(F \equiv \lnot Ac)$), then Maria’s rational credence must be $P[F \,|\, H \cdot Rc \cdot E] = 5/6$ before John even brings up the subject. However, the second premise of White’s argument might be challenged. Our argument here won’t make any such assumption. White’s argument has attracted quite a debate: Hawthorne et al. (2017), White (2010), and Sturgeon (2010) endorse it, while Joyce (2010), Pedersen and Wheeler (2014), Pettigrew (2018), Hart and Titelbaum (2015), and Titelbaum and Hart (2018) reject it.
$P[Ac \,|\, H \cdot Rc \cdot (F \equiv Ac) \cdot E] = 1\,/\,[1 + ((1-r)/r)\times (t/s)] = 1\,/\,[1 + ((5/6)/(1/6))\times (t/s)] = 1/[1 + 5 (t/s)] = 1/(\text{large-number})$. E.g. if t is 10 times the size of s, then $P[Ac \,|\, H \cdot Rc \cdot (F \equiv Ac) \cdot E] = 1/51 \ll 1/6$.
Proof: $P[Ac \,|\, H \cdot Rc \cdot E \cdot (Ac \rightarrow F)] = P[Ac\cdot (Ac \rightarrow F) \,|\, H \cdot Rc \cdot E] / P[(Ac \rightarrow F) \,|\, H \cdot Rc \cdot E] = P[Ac \,|\, H \cdot Rc \cdot E]\times P[(Ac \rightarrow F) \,|\, H \cdot Rc \cdot E \cdot Ac] / P[(Ac \rightarrow F) \,|\, H \cdot Rc \cdot E]$.
i.e., given $(H \cdot Rc \cdot E)$, but in the absence of any additional information about whether F or $\lnot F$ holds, Ac provides no evidence for or against the strong conditional claim.
Notice that the stronger conditional claim ‘$(Ac \rightarrow F)$’ should logically entail the material conditional claim ‘$(Ac \supset F)$’. Proof: Presumably, $((Ac \rightarrow F) \cdot Ac)$ logically entails F; so $(Ac \rightarrow F)$ logically entails $(Ac \supset F)$ via the deduction theorem for deductive logic. So, when the agent adds the strong conditional claim of form ‘$(Ac \rightarrow F)$’ to her total evidence $(H \cdot Rc \cdot E)$, her total evidence will also contain the material conditional claim ‘$(Ac \supset F)$’. But, although the material conditional claim is a defeater when on its own, its ability to defeat the direct inference is mitigated by the presence of the strong conditional claim that logically implies it. That is, technically, although $(Ac \supset F)$ is a defeater of the direct inference from $(H \cdot Rc \cdot E)$ to Ac, the stronger claim ‘$(Ac \rightarrow F)$’ is a defeater–defeater—adding ‘$(Ac \rightarrow F)$’ to $(H \cdot Rc \cdot E \cdot (Ac \supset F))$ restores the direct inference: $P[Ac \,|\, H \cdot Rc \cdot E \cdot (Ac \supset F) \cdot (Ac \rightarrow F)] = P[Ac \,|\, H \cdot Rc \cdot E \cdot (Ac \rightarrow F)] = P[Ac \,|\, H \cdot Rc \cdot E]$.
Proof: Let ‘BC’ abbreviate the strong biconditional claim ‘$((Ac \rightarrow F)\cdot (\lnot Ac \rightarrow \lnot F))$’; let $P[Ac \,|\, H \cdot Rc \cdot E] = r$, $P[BC \,|\, H \cdot Rc \cdot E \cdot Ac] = s$, and $P[BC \,|\, H \cdot Rc \cdot E \cdot \lnot Ac] = t$. Then, $P[Ac \,|\, H \cdot Rc \cdot E \cdot BC] = P[Ac\cdot BC \,|\, H \cdot Rc \cdot E] / P[BC \,|\, H \cdot Rc \cdot E] = s\times r / P[BC \,|\, H \cdot Rc \cdot E]$. Similarly, $P[\lnot Ac \,|\, H \cdot Rc \cdot E \cdot BC] = t\times (1-r) / P[BC \,|\, H \cdot Rc \cdot E]$. Then, $P[Ac \,|\, H \cdot Rc \cdot E \cdot BC] = 1 / (1 + P[\lnot Ac \,|\, H \cdot Rc \cdot E \cdot BC] / P[Ac \,|\, H \cdot Rc \cdot E \cdot BC]) = 1 / [1 + ((1-r)/r)\times (t/s)]$, which equals r just when $t = s$.
i.e., given $(H \cdot Rc \cdot E)$, but in the absence of any additional information about whether F or $\lnot F$ holds, Ac should provide no more evidence for (or against) the strong biconditional claim than does $\lnot Ac$.
And vice versa, the credence-based direct inference value that Ac already had will drag the credence of F away from its previous value. With the addition to the total evidence of a material conditional or a material biconditional between Ac and F, Bayesian credences cannot hold the value of one of them fixed and only readjust the other. That’s essentially what the theorems in this section establish.
Skyrms (1977) introduced the notion of resiliency for chances. Some philosophers have adapted Skyrms’ idea to the Principal Principle, e.g. Schwarz (2014).
When chances are represented in terms of sets, rather than attribute–predicates of form ‘Ax’, the associated collection of sets is called a field of sets.
Let ‘$\vdash Fx$’ say that Fx has the form of a tautology. For each attribute Ax and Bx in $\Theta _R$: (1) $0 \le ch(Ax, Rx) \le 1$; (2) if $\vdash Ax$, then $ch(Ax, Rx) = 1$; (3) if $\vdash \lnot (Ax \cdot Bx)$, then $ch((Ax \vee Bx), Rx) = ch(Ax, Rx) + ch(Bx, Rx)$. This suffices to guarantee that $ch(\ , Rx)$ satisfies all the usual theorems of probability theory.
For the same reason that admissibility should work this way—see above.
Lewis (1980) does this in terms of the times at which events occur: if E consists entirely of propositions about matters of particular fact at times before the chance outcome occurs, then it is admissible for that chance outcome, and for all the alternative outcomes.
Follows from Theorem 9 by taking $\Delta (c)=\{Ac, \lnot Ac\}$.
Substituting $P[G \;|\, F_j] = Q[G \;|\, F_j]$ into the following equation, which comes from the axioms for probability theory: $Q[G] = \sum _{j=1}^n Q[G \;|\, F_j] \times Q[F_j]$.
For instance, Kyburg’s account is plagued by reference class troubles. (See Harper (1981)).
A direct inference from $(H \cdot Rc \cdot Bc \cdot E)$ to Ac is applicable when Bc is part of the agent’s total evidence, $(Rc \cdot Bc \cdot E)$.
Where, in the previous example $q = t/s$.
Presumably this expression makes some sort of future-possible conditional claim. It says something like this: “the chance that a system in initial chance state Rx, if it comes to acquire outcome attribute Bx, will also come to acquire outcome attribute Ax, is q”.
The credence value might happen to be r or s, but not due to direct inference. If the credence value is r, then the following argument should be run with Rc and Sc exchanged throughout. So, without loss, we assume the credence value is not equal to r.
$\lnot Sc$ must still be a defeater: $P[Ac \,\,|\,\, ch(Ax,Rx)=r \cdot Rc \cdot ch(Ax,Rx\cdot Sx)=s \cdot \lnot Sc \cdot E] \ne r$.
In that case, the value of $P[Ac \,\,|\,\, H \cdot Rc \cdot \lnot Sc \cdot E]$ must also be quite far away from r, provided that r is not too close to 0 or 1.
Similarly, total evidence may push the value of her credence, $P[Sc \,\,|\,\, H \cdot Rc \cdot E]$, close to 0. In that case the value of $P[Ac \,\,|\,\, H \cdot Rc \cdot E]$ must approach the value of $P[Ac \,\,|\,\, H \cdot Rc \cdot \lnot Sc \cdot E]$, which we already saw, cannot equal r.
Here is one reason a Bayesian may want to reject this particular “objectivist” approach. Suppose that along with $ch(Ax,Rx)=r$, H contains both $ch(Ax,Rx\cdot Sx)=s$ and $ch(Ax,Rx\cdot \lnot Sx)=t$. Then the objectivist commitment to $P[Ac \,\,|\,\, H \cdot Rc \cdot E] = r$ implies that the credence value for Sc is fixed once and for all. For, it follows from the direct inferences with values r, s, and t, that $P[Sc \,\,|\,\, H \cdot Rc \cdot E] = (r-t)/(s-t)$. So, no amount of evidence E can change this value for $P[Sc \,\,|\,\, H \cdot Rc \cdot E]$, unless E breaks one of the direct inferences by being inadmissible for it.
A Bayesian account of objective chance relies on a collection of axioms for the theory of chance. All credence functions “appropriate” for direct inference should give these axioms credence value 1. These include axioms that make the function $ch(\ ,Rx)$, for each initial state R, satisfy the axioms of probability theory, as described in an earlier footnote. One way to get the theory of chance to rule out overlapping initial chance states is to add the following axiom schema: $(\exists u ch(Ax, Rx)=u \cdot \exists v ch(Ax, Sx)=v) \supset \lnot \exists x (Rx \cdot Sx)$, where u and v are variables restricted to real numbers, and where $\exists u ch(Ax, Rx)=u$ is a way to express the claim that Rx is an initial chance state (for at least one attribute Ax).
Fetzer (1982), for instance, argues for a view that relativizes single-case chances to all causally relevant factors.
Other authors have claimed that Bayesianism and direct inference may be incompatible (Kyburg 1977; Thorn 2014). However, these authors argue that it is the updating rule, Bayesian Conditionalisation, that is problematic for direct inference. The troubles already start with the axioms of conditional probability; no particular rule of updating is needed, as explained in the Introduction.
Whenever a set of sentences $\Lambda = \{Z_1, Z_2, \dots \}$ is a partition for a probability function $P[\ \,|\, X]$ and $P[Y \,|\, X] > 0$, then $\Lambda $ must also be a partition for the probability function $P[\ \,|\, Y \cdot X]$. Proof: First note that $0 < P[Y \,|\, X] = \sum _{\{Z_k \in \Lambda \}} P[Z_k \cdot Y \,|\, X]$, so for at least one $Z_j$ in $\Lambda $, $P[Z_j \cdot Y \,|\, X] > 0$—and when $P[Z_j \cdot Y \,|\, X] > 0$, then $P[Z_j \,|\, Y \cdot X] > 0$; furthermore, for any $Z_j$ in $\Lambda $ such that $P[Y \,|\, X] \ne P[Z_j \cdot Y \,|\, X] > 0$, $1> P[Z_j \,|\, Y \cdot X] > 0$; (i) if $Z_j$ and $Z_k$ are in $\Lambda $ and $P[Z_j \cdot Z_k \,|\, X] = 0$, then $P[Z_j \cdot Z_k \cdot Y \,|\, X] = 0$, so $P[Z_j \cdot Z_k \,|\, Y \cdot X] = 0$; (ii) $P[Y \,|\, X] = \sum _{\{Z_k \in \Lambda \}} P[Z_k \cdot Y \,|\, X]$, so $1 = \sum _{\{Z_k \in \Lambda \}} P[Z_k \cdot Y \,|\, X] / P[Y \,|\, X] = \sum _{\{Z_k \in \Lambda \}} P[Z_k \,|\, Y \cdot X]$.
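
Two of the closed-form values derived in the notes above can be checked with exact arithmetic. The following is only an illustrative sketch: the function names and the sample values of $r$, $s$, and $t$ are hypothetical choices made for this example and do not come from the text. The first two functions compute the credence in Ac given the strong biconditional BC, once by Bayes' theorem and once in the odds form $1 / [1 + ((1-r)/r)\times (t/s)]$; the third solves $r = s \times p + t \times (1-p)$ for $p = P[Sc \,|\, H \cdot Rc \cdot E]$, which gives $(r-t)/(s-t)$.

```python
from fractions import Fraction


def posterior_given_bc(r, s, t):
    """P[Ac | H.Rc.E.BC] by Bayes' theorem, with r = P[Ac | H.Rc.E],
    s = P[BC | H.Rc.E.Ac], and t = P[BC | H.Rc.E.~Ac]."""
    return (s * r) / (s * r + t * (1 - r))


def odds_form(r, s, t):
    """The same value in odds form: 1 / [1 + ((1-r)/r) * (t/s)]."""
    return 1 / (1 + ((1 - r) / r) * (t / s))


def credence_in_sc(r, s, t):
    """Solve r = s*p + t*(1-p) for p = P[Sc | H.Rc.E]; requires s != t."""
    return (r - t) / (s - t)


# Hypothetical sample values, chosen only for illustration.
r = Fraction(3, 10)

# Equal likelihoods (s == t): the direct inference value r is preserved.
same = posterior_given_bc(r, Fraction(1, 4), Fraction(1, 4))

# Unequal likelihoods (s != t): the credence in Ac moves away from r.
moved = posterior_given_bc(r, Fraction(1, 2), Fraction(1, 4))
agrees = odds_form(r, Fraction(1, 2), Fraction(1, 4)) == moved

# With r = 3/10, s = 1/2, t = 1/5, the credence in Sc is fixed by the chances.
p_sc = credence_in_sc(Fraction(3, 10), Fraction(1, 2), Fraction(1, 5))
```

With these sample values the credence in Ac stays at 3/10 when $s = t$ and moves to 6/13 when $s = 1/2$ and $t = 1/4$; the credence in Sc comes out as 1/3 regardless of E.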

References

Bacchus, F. (1990). Representing and reasoning with probabilistic knowledge. Cambridge: MIT Press.
Carnap, R. (1962). Logical foundations of probability. Chicago: University of Chicago Press.
Fetzer, J. H. (1982). Probabilistic explanations. In PSA: Proceedings of the biennial meeting of the Philosophy of Science Association (pp. 194–207). JSTOR.
Hall, N. (1994). Correcting the guide to objective chance. Mind, 103(412), 505–518.
Harper, W. L. (1981). Kyburg on direct inference. In R. J. Bogdan (Ed.), Henry E. Kyburg, Jr. & Isaac Levi (pp. 97–127). Dordrecht: Springer.
Hart, C., & Titelbaum, M. G. (2015). Intuitive dilation? Thought: A Journal of Philosophy, 4(4), 252–262.
Hawthorne, J., Landes, J., Wallmann, C., & Williamson, J. (2017). The principal principle implies the principle of indifference. The British Journal for the Philosophy of Science, 68(1), 123–131.
Humphreys, P. (1985). Why propensities cannot be probabilities. The Philosophical Review, 94(4), 557–570.
Jeffrey, R. C. (1990). The logic of decision. Chicago: University of Chicago Press.
Joyce, J. M. (2010). A defense of imprecise credences in inference and decision making. Philosophical Perspectives, 24(1), 281–323.
Kyburg, H. E, Jr., & Teng, C. M. (2001). The theory of probability. Cambridge: Cambridge University Press.
Kyburg, H. E, Jr. (1977). Randomness and the right reference class. The Journal of Philosophy, 74(9), 501–521.
Kyburg, H. E, Jr. (1961). Probability and the logic of rational belief. Middletown: Wesleyan University Press.
Kyburg, H. E, Jr. (1974). The logical foundations of statistical inference. Boston: Reidel.
Levi, I. (1977). Direct inference. The Journal of Philosophy, 74(1), 5–29.
Lewis, D. (1980). A subjectivist’s guide to objective chance. In W. L. Harper, R. Stalnaker, & G. Pearce (Eds.), Ifs (pp. 267–297). Dordrecht: Springer.
Lewis, D. (1994). Humean supervenience debugged. Mind, 103(412), 473–490.
Pedersen, A., & Wheeler, G. (2014). Demystifying dilation. Erkenntnis, 79(6), 1305–1342.
Peirce, C. S. (1883). A theory of probable inference. In C. S. Peirce (Ed.), Studies in logic (pp. 126–203). Boston: Little, Brown and Company.
Pettigrew, R. (2018). The principal principle does not imply the principle of indifference. The British Journal for the Philosophy of Science, axx060. https://doi.org/10.1093/bjps/axx060.
Pollock, J. L. (1990). Nomic probability and the foundations of induction. Oxford: Oxford University Press.
Pollock, J. L. (2011). Reasoning defeasibly about probabilities. Synthese, 181(2), 317–352.
Reichenbach, H. (1949). The theory of probability. Berkeley: University of California Press.
Salmon, W. C. (1971). Statistical explanation. In W. C. Salmon (Ed.), Statistical explanation and statistical relevance (pp. 29–87). Pittsburgh: University of Pittsburgh Press.
Schwarz, W. (2014). Proving the principal principle. In A. Wilson (Ed.), Chance and temporal asymmetry (pp. 81–99). Oxford: Oxford University Press.
Skyrms, B. (1977). Resiliency, propensities, and causal necessity. The Journal of Philosophy, 74(11), 704–713.
Sturgeon, S. (2010). Confidence and coarse-grained attitudes. Oxford Studies in Epistemology, 3, 126–149.
Thau, M. (1994). Undermining and admissibility. Mind, 103(412), 491–504.
Thorn, P. D. (2012). Two problems of direct inference. Erkenntnis, 76(3), 299–318.
Thorn, P. D. (2014). Defeasible conditionalization. Journal of Philosophical Logic, 43(2–3), 283–302.
Thorn, P. D. (2018). On the preference of more specific reference classes. Synthese, 194, 2025–2051. https://doi.org/10.1007/s11229-016-1035-y.
Titelbaum, M. G., & Hart, C. (2018). The principal principle does not imply the principle of indifference, because conditioning on biconditionals is counterintuitive. The British Journal for the Philosophy of Science, axy011. https://doi.org/10.1093/bjps/axy011.
Venn, J. (1888). The logic of chance (3rd ed.). London: Macmillan.
Wallmann, C. (2017). A Bayesian solution to the conflict of narrowness and precision in direct inference. Journal for General Philosophy of Science, 48(3), 485–500.
White, R. (2010). Evidential symmetry and mushy credence. Oxford Studies in Epistemology, 3, 161–186.


Acknowledgements

Open access funding provided by University of Applied Sciences Upper Austria. We want to thank Jon Williamson for his stimulating, insightful and very supportive discussions on various versions of the manuscript. We also want to thank Jan Willem Romeijn for valuable feedback on earlier drafts of this paper. We also want to thank Richard Pettigrew, Gregory Wheeler and Mike Titelbaum for very stimulating discussions. Finally, we thank two anonymous reviewers for their very helpful and detailed comments.

Funding

Christian Wallmann is grateful to the Arts and Humanities Research Council (AHRC) for supporting this research as part of the project Evaluating Evidence in Medicine (AH/M005917/1). Christian Wallmann was also supported by the Federal Ministry of Science, Research and Economy of the Republic of Austria (BMWFW) in cooperation with the Austrian Agency for International Mobility and Cooperation in Education, Science and Research (OeAD-GmbH) (Grant: Marietta Blau).

Author information

Authors and Affiliations

Department of Logistics, University of Applied Sciences Upper Austria, Steyr, Wehrgrabengasse 1-3, Steyr, 4400, Austria
Christian Wallmann
Department of Philosophy, University of Oklahoma, 605 Dale Hall Tower, Norman, OK, 73019-2006, USA
James Hawthorne

Corresponding author

Correspondence to Christian Wallmann.

Appendix

1.1 Proof of Theorem 1

Proof

Suppose $P[Ac \,|\, H \cdot Rc \cdot E] = r$ and $1> P[(F \equiv Ac) \,|\, H \cdot Rc \cdot E] > 0$.

Then $P[Ac \,|\, H \cdot Rc \cdot E] = r > 0$, so $P[F \,|\, H \cdot Rc \cdot E \cdot Ac] =$

$P[F \cdot Ac \,|\, H \cdot Rc \cdot E] / P[Ac \,|\, H \cdot Rc \cdot E]$ is well-defined—let it have value s. Similarly, $P[\lnot Ac \,|\, H \cdot Rc \cdot E] = 1-r > 0$, so $P[\lnot F \,|\, H \cdot Rc \cdot E \cdot \lnot Ac] =$

$P[\lnot F \cdot \lnot Ac \,|\, H \cdot Rc \cdot E] / P[\lnot Ac \,|\, H \cdot Rc \cdot E]$ is well-defined—let it have value t. Notice that $P[(F \equiv Ac) \,|\, H \cdot Rc \cdot E] = P[(F \cdot Ac) \vee (\lnot F \cdot \lnot Ac) \,|\, H \cdot Rc \cdot E] = P[(F \cdot Ac) \,|\, H \cdot Rc \cdot E] + P[(\lnot F \cdot \lnot Ac) \,|\, H \cdot Rc \cdot E] =$

$P[F \,|\, H \cdot Rc \cdot E \cdot Ac] \times P[Ac \,|\, H \cdot Rc \cdot E] ~+~ P[\lnot F \,|\, H \cdot Rc \cdot E \cdot \lnot Ac] \times P[\lnot Ac \,|\, H \cdot Rc \cdot E] = s \times r ~+~ t \times (1-r)$; so $P[(F \equiv Ac) \,|\, H \cdot Rc \cdot E] = s \times r ~+~ t \times (1-r)$. Now, if both $s = 0$ and $t = 0$, then $0 = P[(F \equiv Ac) \,|\, H \cdot Rc \cdot E] > 0$, contradiction; and if both $s = 1$ and $t = 1$, then 1 = $P[(F \equiv Ac) \,|\, H \cdot Rc \cdot E] < 1$, contradiction; so, either $s > 0$ or $t > 0$, and either $s < 1$ or $t < 1$. Then, $P[Ac \,|\, H \cdot Rc \cdot E \cdot (F \equiv Ac)] =$

$P[Ac \cdot (F \equiv Ac) \,|\, H \cdot Rc \cdot E] / P[(F \equiv Ac) \,|\, H \cdot Rc \cdot E] =$

$P[Ac \cdot F \,|\, H \cdot Rc \cdot E] / [s \times r ~+~ t \times (1-r)] =$

$P[F \,|\, H \cdot Rc \cdot E \cdot Ac] \times P[Ac \,|\, H \cdot Rc \cdot E] / [s \times r ~+~ t \times (1-r)] = (s \times r) / [s \times r ~+~ t \times (1-r)]$.

By supposition, $r > 0$; and notice that (since either $s > 0$ or $t > 0$ and either $s < 1$ or $t < 1$) we have: $s = t$ iff $0< s = t < 1$. Then,

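$(r \times s) / [r \times s ~+~ (1-r)\times t] = r$ iff $s = [r \times s ~+~ (1-r)\times t]$ iff $s - r \times s = (1-r)\times t$ iff $(1-r) \times s = (1-r)\times t$ iff $s = t$. Similarly, $(r \times s) / [r \times s ~+~ (1-r)\times t] > r$ iff $s > t$. Similarly, $(r \times s) / [r \times s ~+~ (1-r)\times t] < r$ iff $s < t$. That covers everything but the final claim.

For the final claim, suppose in addition that $P[Ac \,|\, H \cdot Rc \cdot E \cdot F] = P[Ac \,|\, H \cdot Rc \cdot E]$. Then $s = P[F \,|\, H \cdot Rc \cdot E \cdot Ac] = P[Ac \,|\, H \cdot Rc \cdot E \cdot F] \times P[F \,|\, H \cdot Rc \cdot E] / P[Ac \,|\, H \cdot Rc \cdot E] = P[F \,|\, H \cdot Rc \cdot E]$, and $t = P[\lnot F \,|\, H \cdot Rc \cdot E \cdot \lnot Ac] =$

$1 - (P[F \cdot \lnot Ac \,|\, H \cdot Rc \cdot E] / P[\lnot Ac \,|\, H \cdot Rc \cdot E]) =$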

$1 - (P[\lnot Ac \,|\, H \cdot Rc \cdot E \cdot F] \times P[F \,|\, H \cdot Rc \cdot E] / P[\lnot Ac \,|\, H \cdot Rc \cdot E]) = 1 - P[F \,|\, H \cdot Rc \cdot E] = P[\lnot F \,|\, H \cdot Rc \cdot E]$ (since $P[\lnot Ac \,|\, H \cdot Rc \cdot E \cdot F] = 1 - P[Ac \,|\, H \cdot Rc \cdot E \cdot F] = 1 - P[Ac \,|\, H \cdot Rc \cdot E] = P[\lnot Ac \,|\, H \cdot Rc \cdot E]$); thus $t = P[\lnot F \,|\, H \cdot Rc \cdot E] = 1 - P[F \,|\, H \cdot Rc \cdot E] = 1-s$. Then, given what we’ve already established above, $P[Ac \,|\, H \cdot Rc \cdot E \cdot (F \equiv Ac)] ~=~ r$ if and only if $s = t$ if and only if $s = 1-s$ if and only if $s+s = 1$ if and only if $s = 1/2$ if and only if $P[F \,|\, H \cdot Rc \cdot E] = 1/2$. $\square $
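
The algebra in this proof can be spot-checked by building a small joint credence over Ac and F and conditioning on the biconditional directly. The sketch below is illustrative only: the function names and the sample values are hypothetical, and the four-cell table simply encodes $r = P[Ac \,|\, H \cdot Rc \cdot E]$, $s = P[F \,|\, H \cdot Rc \cdot E \cdot Ac]$, and $t = P[\lnot F \,|\, H \cdot Rc \cdot E \cdot \lnot Ac]$ as defined above.

```python
from fractions import Fraction


def joint(r, s, t):
    """Joint credence over (Ac, F) given H.Rc.E, keyed by truth values.
    r = P[Ac], s = P[F | Ac], t = P[~F | ~Ac]."""
    return {
        (True, True): r * s,
        (True, False): r * (1 - s),
        (False, True): (1 - r) * (1 - t),
        (False, False): (1 - r) * t,
    }


def credence_in_ac_given_biconditional(r, s, t):
    """P[Ac | H.Rc.E.(F == Ac)], obtained by conditioning the joint table."""
    p = joint(r, s, t)
    # (F == Ac) holds in exactly the two cells where the truth values agree.
    biconditional = p[(True, True)] + p[(False, False)]
    return p[(True, True)] / biconditional


# Hypothetical sample values, chosen only for illustration.
r = Fraction(1, 5)

# s == t: the biconditional leaves the credence in Ac at r.
kept = credence_in_ac_given_biconditional(r, Fraction(2, 5), Fraction(2, 5))

# s > t: the biconditional raises the credence in Ac above r.
raised = credence_in_ac_given_biconditional(r, Fraction(3, 5), Fraction(2, 5))

# F independent of Ac with P[F] = 3/4, so s = 3/4 and t = 1/4: r is not preserved.
independent = credence_in_ac_given_biconditional(r, Fraction(3, 4), Fraction(1, 4))

# F independent of Ac with P[F] = 1/2, so s = t = 1/2: r is preserved.
independent_half = credence_in_ac_given_biconditional(r, Fraction(1, 2), Fraction(1, 2))
```

For $r = 1/5$ the conditioned credence is 1/5 when $s = t$, 3/11 when $s = 3/5$ and $t = 2/5$, and 3/7 when F is independent of Ac with credence 3/4, in line with the case analysis in the proof.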

1.2 Proof of Theorem 2

Proof

We show the left-to-right direction of the biconditional. The proof of the other direction is similar, but with ‘$\lnot D$’ and ‘D’ interchanged.

Suppose throughout that $P[D \,|\, H \cdot Rc \cdot E] > 0$ and

$P[Ac \,|\, H \cdot Rc \cdot E \cdot D] \ne P[Ac \,|\, H \cdot Rc \cdot E]$.

To establish that $P[\lnot D \,|\, H \cdot Rc \cdot E] > 0$, suppose (for reductio) $P[\lnot D \,|\, H \cdot Rc \cdot E] = 0$. Then $P[D \,|\, H \cdot Rc \cdot E] = 1$ and $0 = P[\lnot D \,|\, H \cdot Rc \cdot E]\ge P[Ac \cdot \lnot D \,|\, H \cdot Rc \cdot E] \ge 0$, so $P[Ac \cdot \lnot D \,|\, H \cdot Rc \cdot E] = 0$; then $P[Ac \,|\, H \cdot Rc \cdot E] = P[Ac \cdot D \,|\, H \cdot Rc \cdot E] + P[Ac \cdot \lnot D \,|\, H \cdot Rc \cdot E] = P[Ac \cdot D \,|\, H \cdot Rc \cdot E] = P[Ac \cdot D \,|\, H \cdot Rc \cdot E] / P[D \,|\, H \cdot Rc \cdot E] = P[Ac \,|\, H \cdot Rc \cdot E \cdot D]$—contradiction!

Thus, $P[\lnot D \,|\, H \cdot Rc \cdot E] > 0$.

In addition, $P[D \,|\, H \cdot Rc \cdot E] = 1 - P[\lnot D \,|\, H \cdot Rc \cdot E] < 1$. So we have $0< P[D \,|\, H \cdot Rc \cdot E] < 1$; then also, $0< P[\lnot D \,|\, H \cdot Rc \cdot E] < 1$.

To establish that $P[Ac \,|\, H \cdot Rc \cdot E \cdot \lnot D] \ne P[Ac \,|\, H \cdot Rc \cdot E]$, suppose (for reductio) $P[Ac \,|\, H \cdot Rc \cdot E \cdot \lnot D] = P[Ac \,|\, H \cdot Rc \cdot E]$. Then $P[Ac \,|\, H \cdot Rc \cdot E] = P[Ac \,|\, H \cdot Rc \cdot E \cdot D] \times P[D \,|\, H \cdot Rc \cdot E] + P[Ac \,|\, H \cdot Rc \cdot E \cdot \lnot D] \times P[\lnot D \,|\, H \cdot Rc \cdot E] = P[Ac \,|\, H \cdot Rc \cdot E \cdot D] \times (1 - P[\lnot D \,|\, H \cdot Rc \cdot E]) + P[Ac \,|\, H \cdot Rc \cdot E] \times P[\lnot D \,|\, H \cdot Rc \cdot E] = P[Ac \,|\, H \cdot Rc \cdot E \cdot D] + (P[Ac \,|\, H \cdot Rc \cdot E] - P[Ac \,|\, H \cdot Rc \cdot E \cdot D]) \times P[\lnot D \,|\, H \cdot Rc \cdot E]$; so $P[Ac \,|\, H \cdot Rc \cdot E] - P[Ac \,|\, H \cdot Rc \cdot E \cdot D] = (P[Ac \,|\, H \cdot Rc \cdot E] - P[Ac \,|\, H \cdot Rc \cdot E \cdot D]) \times P[\lnot D \,|\, H \cdot Rc \cdot E]$; then, because $0< P[\lnot D \,|\, H \cdot Rc \cdot E] < 1$, we must have $P[Ac \,|\, H \cdot Rc \cdot E] - P[Ac \,|\, H \cdot Rc \cdot E \cdot D] = 0$, which contradicts the supposition that $P[Ac \,|\, H \cdot Rc \cdot E \cdot D] \ne P[Ac \,|\, H \cdot Rc \cdot E]$. Thus, $P[Ac \,|\, H \cdot Rc \cdot E \cdot \lnot D] \ne P[Ac \,|\, H \cdot Rc \cdot E]$. $\square $

1.3 Proof of Theorem 3

Proof

Suppose $P[Ac \,|\, H \cdot Rc \cdot E] = r$, for $0< r < 1$, and $P[(F \vee Ac) \,|\, H \cdot Rc \cdot E] < 1$.

Then $P[\lnot Ac \,|\, H \cdot Rc \cdot E] = 1-r > 0$, so $P[F \,|\, H \cdot Rc \cdot E \cdot \lnot Ac] =$

$P[F \cdot \lnot Ac \,|\, H \cdot Rc \cdot E] / P[\lnot Ac \,|\, H \cdot Rc \cdot E]$ is well-defined—let it have some value s. Notice that $P[(F \vee Ac) \,|\, H \cdot Rc \cdot E] = P[(F \cdot \lnot Ac) \vee Ac \,|\, H \cdot Rc \cdot E] = P[F \cdot \lnot Ac \,|\, H \cdot Rc \cdot E] + P[Ac \,|\, H \cdot Rc \cdot E] = P[F \,|\, H \cdot Rc \cdot E \cdot \lnot Ac] \times P[\lnot Ac \,|\, H \cdot Rc \cdot E] + r = s \times (1-r) + r$; so $P[(F \vee Ac) \,|\, H \cdot Rc \cdot E] = s \times (1-r) + r$. Now, if $s = 1$ we have $s \times (1-r) + r = 1 = P[(F \vee Ac) \,|\, H \cdot Rc \cdot E] < 1$, contradiction; so we must have $s < 1$. Then, $P[Ac \,|\, H \cdot Rc \cdot E \cdot (F \vee Ac)] = P[Ac \cdot (F \vee Ac) \,|\, H \cdot Rc \cdot E] / P[F \vee Ac \,|\, H \cdot Rc \cdot E]$ = $P[Ac \,|\, H \cdot Rc \cdot E] / [s \times (1-r) + r]$ = $r / [r + (1-r) \times s] > r$ (i.e. since $s < 1$, we must have $[r + (1-r) \times s] < 1$).$\square $
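
The closing inequality can be checked numerically. The sketch below is a hedged illustration, not part of the proof: the function name and the grid of sample values are hypothetical. It evaluates $r / [r + (1-r) \times s]$ over a grid with $0< r < 1$ and $s < 1$ and confirms that the result always exceeds $r$.

```python
from fractions import Fraction


def credence_given_disjunction(r, s):
    """P[Ac | H.Rc.E.(F v Ac)] = r / [r + (1 - r) * s],
    where r = P[Ac | H.Rc.E] and s = P[F | H.Rc.E.~Ac]."""
    return r / (r + (1 - r) * s)


tenths = [Fraction(k, 10) for k in range(10)]

# r ranges over 1/10 .. 9/10 (so 0 < r < 1); s ranges over 0 .. 9/10 (so s < 1).
all_raised = all(
    credence_given_disjunction(r, s) > r
    for r in tenths[1:]
    for s in tenths
)

# One hypothetical case worked through: r = 1/4 and s = 1/2.
example = credence_given_disjunction(Fraction(1, 4), Fraction(1, 2))
```

In the worked case, learning $(F \vee Ac)$ moves the credence in Ac from 1/4 to 2/5.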

1.4 Proof of Corollary 4

Proof

Part 1 and Part 2 follow by substituting logical equivalences into Theorem 3. For Part 3, observe that $P[(Ac \supset F) \,|\, H \cdot Rc \cdot E] = P[(Ac \cdot F) \vee \lnot Ac \,|\, H \cdot Rc \cdot E] = P[F \cdot Ac \,|\, H \cdot Rc \cdot E] + P[\lnot Ac \,|\, H \cdot Rc \cdot E]$; the rest of the proof is then similar to the proof of Theorem 3. $\square $

1.5 Proof of Theorem 5

Proof

Suppose $1> P[G \,|\, H \cdot Rc \cdot E \cdot D] > 0$ and $P[Ac \,|\, H \cdot Rc \cdot E \cdot D] \ne P[Ac \,|\, H \cdot Rc \cdot E] = P[Ac \,|\, H \cdot Rc \cdot E \cdot D \cdot G]$.

Clearly, $1> P[\lnot G \,|\, H \cdot Rc \cdot E \cdot D] > 0$.

To establish that $P[Ac \,|\, H \cdot Rc \cdot E \cdot D \cdot \lnot G] \ne P[Ac \,|\, H \cdot Rc \cdot E]$, suppose (for reductio) that $P[Ac \,|\, H \cdot Rc \cdot E \cdot D \cdot \lnot G] = P[Ac \,|\, H \cdot Rc \cdot E]$.

Then $P[Ac \,|\, H \cdot Rc \cdot E \cdot D] = P[Ac \cdot G \,|\, H \cdot Rc \cdot E \cdot D] + P[Ac \cdot \lnot G \,|\, H \cdot Rc \cdot E \cdot D] =$

$P[Ac \,|\, H \cdot Rc \cdot E \cdot D \cdot G] \times P[G \,|\, H \cdot Rc \cdot E \cdot D] + P[Ac \,|\, H \cdot Rc \cdot E \cdot D \cdot \lnot G] \times P[\lnot G \,|\, H \cdot Rc \cdot E \cdot D] =$

$P[Ac \,|\, H \cdot Rc \cdot E] \times P[G \,|\, H \cdot Rc \cdot E \cdot D] + P[Ac \,|\, H \cdot Rc \cdot E] \times P[\lnot G \,|\, H \cdot Rc \cdot E \cdot D] =$

$P[Ac \,|\, H \cdot Rc \cdot E] \times (P[G \,|\, H \cdot Rc \cdot E \cdot D] + P[\lnot G \,|\, H \cdot Rc \cdot E \cdot D]) = P[Ac \,|\, H \cdot Rc \cdot E]$.

So, $P[Ac \,|\, H \cdot Rc \cdot E \cdot D] = P[Ac \,|\, H \cdot Rc \cdot E]$—contradiction!

Thus, $P[Ac \,|\, H \cdot Rc \cdot E \cdot D \cdot \lnot G] \ne P[Ac \,|\, H \cdot Rc \cdot E]$.$\square $
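
The mixture identity behind this reductio can also be read constructively. Writing $g = P[G \,|\, H \cdot Rc \cdot E \cdot D]$ and $r = P[Ac \,|\, H \cdot Rc \cdot E] = P[Ac \,|\, H \cdot Rc \cdot E \cdot D \cdot G]$, the identity $P[Ac \,|\, H \cdot Rc \cdot E \cdot D] = r \times g + x \times (1-g)$ can be solved for $x = P[Ac \,|\, H \cdot Rc \cdot E \cdot D \cdot \lnot G]$, which gives $x - r = (P[Ac \,|\, H \cdot Rc \cdot E \cdot D] - r)/(1-g)$. The following sketch uses a hypothetical function name and hypothetical sample values to illustrate that $x$ differs from $r$ whenever D is a defeater.

```python
from fractions import Fraction


def credence_given_d_and_not_g(p_ac_given_d, r, g):
    """Solve P[Ac | D] = r * g + x * (1 - g) for x = P[Ac | D.~G],
    where r = P[Ac | H.Rc.E] = P[Ac | H.Rc.E.D.G] and g = P[G | H.Rc.E.D].
    Requires 0 < g < 1."""
    return (p_ac_given_d - r * g) / (1 - g)


# Hypothetical sample values: D pushes the credence in Ac from 1/2 to 7/10,
# and G (with credence 1/4 given D) restores it to 1/2.
r = Fraction(1, 2)
p_ac_given_d = Fraction(7, 10)
g = Fraction(1, 4)

x = credence_given_d_and_not_g(p_ac_given_d, r, g)

# The gap x - r equals (P[Ac | D] - r) / (1 - g), so it is nonzero here.
gap_from_r = x - r
```

With these values $x = 23/30$, so $(D \cdot \lnot G)$ shifts the credence in Ac even further from $r$ than D alone does.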

1.6 Proof of Theorem 8

Proof

Assume all the antecedent conditions for the theorem.

1.7 Proof of Theorem 9

Proof

Assume the antecedent conditions for the theorem. Notice that $\Delta (c)$ must also be a partition for $P[\ \,|\, H \cdot Rc \cdot E \cdot D]$.^{Footnote 39}

We show:

Suppose, for every pair $B_ic$, $B_jc$ in $\Delta (c)$, $P[D \,|\, B_jc \cdot Rc \cdot E] = P[D \,|\, B_ic \cdot Rc \cdot E]$.

so $P[B_jc \,|\, H \cdot Rc \cdot E \cdot D] \times P[B_ic \,|\, H \cdot Rc \cdot E] = P[B_ic \,|\, H \cdot Rc \cdot E \cdot D] \times P[B_jc \,|\, H \cdot Rc \cdot E]$, for each pair $B_ic$, $B_jc$ in $\Delta (c)$. Then, summing over all the $B_ic$ in $\Delta (c)$,

$\sum _{B_ic \,\in \Delta (c)} P[B_jc \,|\, H \cdot Rc \cdot E \cdot D] \times P[B_ic \,|\, H \cdot Rc \cdot E] =$

$\sum _{B_ic \,\in \Delta (c)} P[B_ic \,|\, H \cdot Rc \cdot E \cdot D] \times P[B_jc \,|\, H \cdot Rc \cdot E]$, for each $B_jc$ in $\Delta (c)$.

Thus, $P[B_jc \,|\, H \cdot Rc \cdot E \cdot D] = P[B_jc \,|\, H \cdot Rc \cdot E]$, for each $B_jc$ in $\Delta (c)$.

[3] The theorem’s second biconditional claim is established by supposing (in addition to the theorem’s other suppositions) that $\Delta (c)$ is a partition for $P[\ \,|\, Rc \cdot E]$, and showing that:

The second biconditional follows from this together with the first (previously established) biconditional.

So, in addition to the other suppositions of the theorem, suppose $\Delta (c)$ is a partition for $P[\ \,|\, Rc \cdot E]$.

[3.1] Suppose, for each pair $B_ic$, $B_jc$ in $\Delta (c)$, $P[D \,|\, B_jc \cdot Rc \cdot E] = P[D \,|\, B_ic \cdot Rc \cdot E]$.

Then for each pair $B_ic$, $B_jc$ in $\Delta (c)$, $P[B_jc \,|\, Rc \cdot E \cdot D]\times P[D \,|\, Rc \cdot E] / P[B_jc \,|\, Rc \cdot E] = P[B_ic \,|\, Rc \cdot E \cdot D]\times P[D \,|\, Rc \cdot E] / P[B_ic \,|\, Rc \cdot E]$; then for each pair $B_ic$, $B_jc$ in $\Delta (c)$, $P[B_jc \,|\, Rc \cdot E \cdot D]\times P[B_ic \,|\, Rc \cdot E] = P[B_ic \,|\, Rc \cdot E \cdot D]\times P[B_jc \,|\, Rc \cdot E]$; then for each $B_jc$ in $\Delta (c)$, $\sum _{B_ic \,\in \Delta (c)} P[B_jc \,|\, Rc \cdot E \cdot D]\times P[B_ic \,|\, Rc \cdot E] = \sum _{B_ic \,\in \Delta (c)} P[B_ic \,|\, Rc \cdot E \cdot D]\times P[B_jc \,|\, Rc \cdot E]$; so, for each $B_jc$ in $\Delta (c)$, $P[B_jc \,|\, Rc \cdot E \cdot D] = P[B_jc \,|\, Rc \cdot E]$.

Thus, if for each pair $B_ic$, $B_jc$ in $\Delta (c)$, $P[D \,|\, B_jc \cdot Rc \cdot E] = P[D \,|\, B_ic \cdot Rc \cdot E]$, then for each $B_jc$ in $\Delta (c)$, $P[B_jc \,|\, Rc \cdot E \cdot D] = P[B_jc \,|\, Rc \cdot E]$.

[3.2] Suppose, for each $B_jc$ in $\Delta (c)$, $P[B_jc \,|\, Rc \cdot E \cdot D] = P[B_jc \,|\, Rc \cdot E]$.

From [3.1] and [3.2] the desired result follows directly.$\square $
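
Step [3.1] can be illustrated with a finite partition: when every cell of $\Delta (c)$ gives D the same likelihood, conditioning on D returns the prior credences over the cells. The sketch below is only an illustration under stated assumptions: the function name, the three-cell partition, and all numerical values are hypothetical, and the update is written as ordinary Bayes' theorem over a finite partition.

```python
from fractions import Fraction


def posterior(prior, likelihood):
    """Bayes' theorem over a finite partition.
    prior[i] = P[B_i c | Rc.E]; likelihood[i] = P[D | B_i c . Rc . E].
    Returns the list of P[B_i c | Rc.E.D]."""
    evidence = sum(p * l for p, l in zip(prior, likelihood))
    return [p * l / evidence for p, l in zip(prior, likelihood)]


# A hypothetical three-cell partition with prior credences summing to 1.
prior = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

# Equal likelihoods across the cells: D leaves every credence unchanged.
flat = posterior(prior, [Fraction(2, 5)] * 3)

# Unequal likelihoods: D shifts the credences over the cells.
skewed = posterior(prior, [Fraction(2, 5), Fraction(2, 5), Fraction(1, 5)])
```

With the unequal likelihoods the credences become 6/11, 4/11, and 1/11, so D is relevant to the partition exactly when it discriminates among its cells.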

1.8 Proof of Theorem 10

Proof

Suppose the antecedent conditions of the theorem. We show:

Proving these two claims will suffice, since the theorem’s biconditional claim follows immediately from [1] and [2], and its simple conditional claim follows immediately from [1].

$\square $

1.9 Proof of Theorem 11

Proof

Suppose the antecedent conditions of the theorem.

We first prove an intermediate result, which we will label (*).

From supposition (3.2), then (2), then (3.1), we have, for each $B_jc$ in $\Delta (c)$,

$\square $

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article

Cite this article

Wallmann, C., Hawthorne, J. Admissibility Troubles for Bayesian Direct Inference Principles. Erkenn 85, 957–993 (2020). https://doi.org/10.1007/s10670-018-0070-0


Received: 20 October 2017
Accepted: 22 September 2018
Published: 24 November 2018
Issue Date: August 2020
DOI: https://doi.org/10.1007/s10670-018-0070-0
