Admissibility Troubles for Bayesian Direct Inference Principles

Direct inferences identify certain probabilistic credences or confirmation-function-likelihoods with values of objective chances or relative frequencies. The best known version of a direct inference principle is David Lewis’s Principal Principle. Certain kinds of statements undermine direct inferences. Lewis calls such statements inadmissible. We show that on any Bayesian account of direct inference several kinds of intuitively innocent statements turn out to be inadmissible. This may pose a significant challenge to Bayesian accounts of direct inference. We suggest some ways in which these challenges may be addressed.


Introduction
Direct inferences identify values of some probabilistic credences with values of objective chances or relative frequencies. The main idea has been around for a long time. It goes by various names and has been articulated in a variety of ways. 1 Peirce calls it "probable deduction." Contemporary logicians sometimes call it "statistical syllogism." David Lewis's Principal Principle is perhaps the most widely known version of an explicit direct inference principle (Lewis 1980). Accounts of direct inference usually draw on two distinct notions of probability: an object-language notion, either relative frequency or some notion of objective chance, and a higher-level metalinguistic notion that applies to object-language expressions, usually characterized as some kind of logical probability or as a probabilistic measure of rational credence. Carnap (1962), for instance, calls the object-language notion probability₂, and takes it to represent relative frequencies of attributes among members of populations. He calls the metalanguage notion probability₁, and takes it to be a kind of degree of logical entailment, which he calls "degree of confirmation." For notational convenience we write 'P' for the probability₁ notion and 'ch' for the probability₂ notion. Although we will often take the ch function to represent some kind of objective chance, in most contexts the reader may interpret it to be either a chance function or a relative frequency function. In either case, expressions involving the function ch will take the form: 'ch(Ax, Rx) = r'. On a reading of 'ch' as relative frequency, this expression says that the frequency of objects (or systems, or events) possessing attribute A among those in reference class R is r. On the reading of 'ch' as chance, this expression says that the chance that a system in initial state R will acquire attribute A is r.
Letting P represent the probability₁ notion and taking ch to represent the probability₂ notion, here is a generic version of a direct inference principle. Later we'll extend it to more complex chance hypotheses. 2
Generic Direct Inference Principle (G-DIP): 3 Let P be an "appropriate" probability function on a language that contains chance (or frequency) claims. Let 'ch(Ax, Rx) = r' be an object-language statement that says that the chance that a system in state R acquires attribute A is r (alternatively, that the frequency of possessing attribute A among objects in reference class R is r), where r is a standard term for a real number between 0 and 1 (inclusive). Let 'Rc' say that system c is in state (or reference class) R, and let 'Ac' say that system c acquires (or possesses) attribute A. Then, P[Ac | ch(Ax, Rx) = r · Rc · E] = r, provided that E is both consistent with (ch(Ax, Rx) = r · Rc) and admissible with respect to (ch(Ax, Rx) = r · Rc) (where tautologies are always considered admissible). 4
2 Direct inference principles have been proposed by a number of probabilistic logicians. Prominent among them are proposals by Carnap (1962), Kyburg and Teng (2001), Levi (1977), Lewis (1980), Pollock (2011), Bacchus (1990), and Thorn (2012, 2018). These accounts differ in their interpretations of the P and ch notions. Carnap and Kyburg take the ch notion to be frequencies (of attributes among members of reference classes), Pollock interprets it as nomic probability (or proportions among physically possible objects), and Bacchus and Thorn explicate it as a kind of expected frequency; Levi and Lewis both take ch to be some kind of objective chance, although their accounts of chance differ in significant ways; e.g. Lewis takes chance statements to apply to whole propositions at specific times, while Levi takes them to apply to predicates containing free variables, as in G-DIP. These accounts also interpret the P notion in several distinct ways. Carnap, Levi, and Lewis take the P notion to be Bayesian probability functions of some kind, although they differ on the interpretation of these probability functions (e.g. for Carnap they are logical, for Levi they are credal probability functions (relative to a potential corpus of certain knowledge, K), for Lewis they are reasonable initial credence functions).
3 Contrary to what the term 'direct inference' suggests, probability₁ statements are not strictly inferred from probability₂ statements. G-DIP is a statement about what value certain conditional probabilities should attain. However, since the name 'direct inference' has regularly been used for principles like G-DIP, we use it here as well.
We won't attempt to spell out an account of admissibility. Doing so is a complex and controversial undertaking. But, for our purposes, no specific account of admissibility need be supposed. Thau's proposal works well enough for our purposes: "A proposition is inadmissible if it provides direct information about what the outcome of some chance event is." (Thau 1994, p. 500, emphasis added) Since tautologies are always admissible, the admissibility of any other statement E requires that E be probabilistically independent of Ac, given (ch(Ax, Rx) = r · Rc) (for P). However, admissibility does not simply reduce to probabilistic independence; rather, it is designed to motivate probabilistic independence in appropriate cases. For instance, Lewis' substantive account (in Lewis 1980) declares a statement admissible for a direct inference provided that it contains only information about particular matters of fact that occur before the time at which the associated chance outcome occurs. On this account, all future statements about particular matters of fact are inadmissible, even those that may happen to be probabilistically independent of Ac given chance claim (ch(Ax, Rx) = r · Rc). 5 When a statement D fails to be probabilistically independent of Ac, given (ch(Ax, Rx) = r · Rc · E) for admissible E (for probability function P), then we say that D defeats the corresponding direct inference. That is, defeat of a direct inference by D just means that P[Ac | D · ch(Ax, Rx) = r · Rc · E] ≠ P[Ac | ch(Ax, Rx) = r · Rc · E] = P[Ac | ch(Ax, Rx) = r · Rc] = r for admissible E.
Notice that if D is a defeater, then on any adequate account of admissibility, (D · E) must be inadmissible for the direct inference, since failure of probabilistic independence is a sure-fire way for admissibility to fail. But it's also possible for admissibility to fail in cases where probabilistic independence remains intact. In such a case, although D (or (D · E)) is inadmissible, D does not count as a direct inference defeater, at least not as we use that term in this paper. Thus, as we use the term, a direct inference defeater is a particularly strong kind of inadmissible statement. 6 We will investigate several kinds of cases where, on purely logical grounds, when P satisfies the classical axioms of probability, direct inference outcomes must fail to be probabilistically independent of a statement D. Thus, any account of direct inference based on G-DIP will rule the defeating statement D to be inadmissible, regardless of the particular account of admissibility employed. These are the kinds of troubles we consider. They pose significant challenges whenever an agent wants to apply these probability functions to the epistemic situation she finds herself in. One such application is to determine one's current credence via the total evidence requirement.
For Bayesians, the logic of credence functions (or confirmation functions) is captured by the way in which the axioms of probability theory constrain the numerical values of P[A | B] for the range of statements A and B, often under conditions (or suppositions) that constrain the probability values of other statements. Logically speaking, a direct inference rule such as G-DIP is merely an additional axiomatic constraint. Any function P that satisfies the other axioms, but violates the direct inference rule, is "ruled out" for failing to be an "appropriate" credence (or confirmation) function. 7 However, the further issue of how a rational agent is supposed to apply these functions, given the situation in which she finds herself, including her current state of knowledge, is not a purely logical matter. Carnap realized this long ago. His Requirement of Total Evidence is merely a way to make explicit our usual implicit assumptions about how an agent is supposed to apply her credence (or confirmation) function. Here is a fairly close paraphrase of Carnap's requirement, adapted to apply to the P functions of G-DIP.
Total Evidence Requirement: Suppose that the logic of credence functions (or confirmation functions) supplies a result of form 'P[A | B] = r', where A and B are statements, r is a real number between 0 and 1, and P is the rational initial credence function (or the confirmation function) for an agent. If B expresses this agent's total available evidence at the time t, then she is justified at t in believing A to the degree r, and hence in betting that A is true with a betting quotient no higher than r. 8 (Compare Carnap 1962, p. 211.)
7 Lewis (1980) places additional constraints on "appropriate" credence functions. One of his requirements is that they be initial credence functions. That is, for direct inference to work properly, the same initial credence function must be maintained throughout. For, in order for any account of direct inference to work properly, careful account must be kept of whatever statements E is conditionalized upon, so that their admissibility for a proposed direct inference may be assessed. When probabilistic updating occurs in the usual Bayesian kinematic way, via P_new[S] = P[S | K] for an agent who learns K, the updated function P_new suppresses the learned information K by assigning P_new[K] = 1. This introduces complications with admissibility assessments of information that might well defeat a proposed direct inference. To assess whether P_new[Ac | ch(Ax, Rx) = r · Rc] = r holds, one needs to keep track of any update-information K from which P_new results, so that one can assess its admissibility. In cases where all updating is via explicit information K, this is easy enough to accomplish, but not significantly different from simply making information K explicit as a premise in the initial credence function, P[S | K]. However, whenever P_new results from P in a less direct way, such as via Jeffrey conditionalization, the resulting credence function may be deflected from the (seemingly appropriate) direct inference value, with no justification via the inadmissibility of some explicit statement K. Later, in Sect. 4, we will construct a specific example of this kind. So, Lewis's approach to direct inference largely bypasses the kinematics of Bayesian updating. Rather, the Bayesian agent is taken to employ the same initial credence function throughout. On this approach, Bayesian updating simply amounts to what the logic of credence functions implies about the results of conditionalizing on additional premises.
8 When an agent's betting quotient is r (or less), she should be willing to place a bet that loses her $1 (or less) if A turns out to be false, but gains her at least (1 − r)/r dollars if A turns out to be true (supposing the utility curve is linear for the amounts of money involved).
For an agent to apply our version of the direct inference principle, G-DIP, the agent's total evidence should be captured by '(Rc · E)'. What about the chance claim 'ch(Ax, Rx) = r' (the chance claim X, for Lewis)? Applications of the direct inference principle need not require that the chance claim itself be part of the agent's total evidence, nor need the agent know it to be true. Here is a close paraphrase of what Lewis says about this point (Lewis 1980, p. 267, continued): If in addition you are sure that the chance claim ch(Ax, Rx) = r is true (i.e. if P[ch(Ax, Rx) = r | Rc · E] = 1, where (Rc · E) is your total evidence), it follows also that r = P[Ac | Rc · E] is your present unconditional degree of belief that Ac is true. More generally, whether or not you are sure about the chance claim ch(Ax, Rx) = r, your unconditional degree of belief that Ac is given by summing over alternative hypotheses about chance: P[Ac | Rc · E] = Σ_r r × P[ch(Ax, Rx) = r | Rc · E].
We investigate several kinds of cases where, on purely logical grounds, direct inference outcomes must fail to be probabilistically independent of a statement D. Thus, any adequate account of admissibility should rule the defeating statement D to be inadmissible. We call such statements logically inadmissible with respect to the direct inferences they defeat. In some cases we show precisely how much the addition of these defeaters to the premises of a direct inference must divert the credence value from the associated chance value. We argue that some of these logically inadmissible statements may be easily acquired by an agent, thus tainting her total evidence and inhibiting her warrant to engage in legitimate direct inferences about these chance events.
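The "summing over alternative hypotheses about chance" that Lewis describes can be illustrated with a small numerical sketch. The Python function below computes an unconditional credence in Ac as a credence-weighted average of candidate chance values. The function name and the example weights are illustrative assumptions, not taken from Lewis or Carnap, and the sketch presupposes that the total evidence (Rc · E) is admissible for every candidate chance hypothesis.

```python
from fractions import Fraction


def credence_from_chance_hypotheses(weights):
    """Return the sum over r of r * P[ch(Ax, Rx) = r | Rc.E].

    `weights` maps each candidate chance value r to the agent's credence
    that ch(Ax, Rx) = r, given her total evidence (Rc.E). Hypothetical
    helper: it assumes (Rc.E) is admissible for every candidate
    hypothesis, so that G-DIP fixes P[Ac | ch(Ax, Rx) = r . Rc.E] = r.
    """
    if sum(weights.values()) != 1:
        raise ValueError("credences over chance hypotheses must sum to 1")
    return sum(r * w for r, w in weights.items())


# Illustrative credences (an assumption): the dice are probably fair
# (chance of seven 1/6), but might be loaded (chance of seven 1/3).
example = {Fraction(1, 6): Fraction(3, 4), Fraction(1, 3): Fraction(1, 4)}
credence = credence_from_chance_hypotheses(example)
print(credence)  # 5/24, i.e. 1/6 * 3/4 + 1/3 * 1/4
```

When all credence is concentrated on a single chance hypothesis, the function simply returns that chance value, which is the special case in which the agent is sure of the chance claim.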
Here is how we'll proceed. In Sect. 2 we prove results 10 that show that material conditional and biconditional statements involving the conclusions of direct inferences must be inadmissible on purely logical grounds. This may present some surprising challenges for Bayesian direct inference principles.
In Sect. 3 we show that in an important class of cases the evidential relevance of a statement D to an outcome Ac implies the logical inadmissibility of D. It seems to be relatively easy for an agent to acquire this kind of information. Thus, an agent's ability to engage in direct inferences is shown to be somewhat fragile.
In Sect. 4 we consider some fairly mild conditions on credence functions that make them "inappropriate" for G-DIP, because any credence function that satisfies these conditions must get straightforward direct inferences wrong.
In Sect. 5 we discuss direct inferences in cases where several reference classes may compete. We argue that direct inference probabilities are best characterized as expected values over credences of possible observational statements or over extensive chance theories. We show how this fact is problematic for Bayesian direct inference principles.
The authors of this paper are divided over what these results show. One of us (Wallmann) thinks that many of these logically inadmissible statements should not defeat direct inferences. Rather, an agent who has such information as part of her total evidence should still conform her rational credences, and her betting behavior, to the objective chances. Therefore, this author reads these troubles as showing that the Bayesian account of direct inference fails: having P satisfy the axioms of conditional probability is incompatible with a correct account of direct inference. The other author thinks that the logically inadmissible statements explored in this paper should indeed defeat direct inferences, so the Bayesian account gets it right. We will elaborate our reasons for disagreement in the main body of the paper. In any case, the paper explores a wide range of statements of a kind that must turn out to be inadmissible on any Bayesian account of direct inference.
9 Lewis' article suggests a continuum of possible chance values for ch(Ax, Rx); so it makes sense to read "summing" to mean integrating, i.e. taking the limit of sums over arbitrarily small intervals.
10 Detailed proofs of all theorems can be found in the "Appendix".

Logical Admissibility Troubles
The troubles we will raise for direct inference principles in this section and the next are quite general. They plague all Bayesian accounts where the P notion satisfies the usual axioms of conditional probability, regardless of whether the conception of objective chance applies to full propositions (as does Lewis's Principal Principle) or is couched in terms of generic probabilities (containing only open sentences, as in G-DIP, above). All the admissibility failures we'll discuss draw on cases where probabilistic independence must fail on purely logical grounds. We will first investigate several kinds of such logically inadmissible statements. Section 3 will go on to provide a more general characterization of an important class of logically inadmissible statements.

Logically Inadmissible Biconditionals
Consider the following situation. John and Maria are standing next to the craps table watching the action. Let H represent the chance hypotheses associated with a fair pair of dice tossed onto a flat surface in the usual (fair) way. In particular, R says that a pair of fair dice is tossed onto a flat surface in the usual (fair) way, and A says that the outcome of a toss is seven. According to chance hypothesis H, the chance of outcome A for a system in state R is 1/6, ch(Ax, Rx) = 1/6, which is the usual objective chance for getting seven on a (fair) toss of a pair of fair dice. Let c be the event consisting of the next toss of the dice, so Rc says that the next toss is that of a pair of fair dice (fairly) tossed onto a flat surface, and Ac says that the next toss comes up seven. Let E represent Maria's background knowledge about dice and craps tables, and perhaps about human relationships, and about anything else that may be relevant to the following situation (including the fact that Maria trusts John to keep his word). Surely E is itself admissible with respect to possible chance outcomes for (H · Rc); otherwise we will already have trouble applying direct inference principles to this kind of chance situation. Thus, we should have the direct inference P[Ac | H · Rc · E] = 1/6, where P is Maria's (initial) credence function. Now, John says to Maria, "I'll buy you dinner this evening if, but only if, the next toss comes up seven." That is, John sincerely asserts a statement of form (F ≡ Ac), where Maria understands F to say that John will pay for Maria's dinner this evening (provided that no extraordinary circumstance arises, e.g. provided that Maria permits it, and John doesn't fall ill beforehand, etc.). 11 Taking John at his word, Maria adds (F ≡ Ac) to her total body of evidence. Thus, the premise for the direct inference regarding Ac, based on her total body of evidence, becomes (H · Rc · E · (F ≡ Ac)).
Should Maria's rational credence that the dice will come up seven on the next toss now differ from the objective chance value? That is, does P[Ac | H · Rc · E · (F ≡ Ac)] differ from 1/6? Or has Maria's total information, (E · (F ≡ Ac)), become inadmissible, undermining the direct inference? More urgently, should Maria still be willing to bet on the next toss turning up seven at the usual fair odds (which is 5 to 1 against, corresponding to the chance of occurrence being 1/6)? You might well think so! 12 As it happens, probability theory itself guarantees that this kind of biconditional information is almost always logically inadmissible for the relevant direct inference. For, whenever P[F | H · Rc · E · Ac] ≠ P[¬F | H · Rc · E · ¬Ac], Ac cannot be probabilistically independent of (F ≡ Ac) given (H · Rc · E). And any such failure of probabilistic independence entails inadmissibility. Worse yet, we will see that, according to her credence function, the odds at which Maria should be willing to bet that seven turns up may differ significantly from the usual fair betting-odds suggested by the objective chance.

are well-defined (for some s and t), and
(1) either s > 0 or t > 0, and either s < 1 or t < 1, and
Thus, when John says to Maria, "I'll buy you dinner this evening if, but only if, the next roll comes up seven", almost everyone who overhears this assertion, and who takes John to be sincere, should employ credences, based on the total available evidence, that fail to match the objective chances of the dice coming up seven on the next roll. Only one kind of exception is possible. Those individuals whose credences remain faithful to the objective chance are just those individuals who, before hearing John's statement, happen to find the conditional credibility of the claim "John will buy Maria dinner this evening" given seven comes up on the next roll (i.e. P[F | H · Rc · Ac · E]) equal to the conditional credibility of the claim "John won't buy Maria dinner this evening" given seven does not come up on the next roll (i.e. P[¬F | H · Rc · ¬Ac · E]), where both credence conditions include the agent's total available evidence E together with the relevant chance claims, (H · Rc).
Indeed, before hearing John's statement (F ≡ Ac), perhaps Maria and most bystanders will have taken "seven comes up on the next roll" to be probabilistically independent of "John buys Maria dinner this evening", given (H · Rc · E). Such an agent cannot have her credence that "the next roll turns up seven" remain faithful to the objective chance unless she happens to assign P[F | H · Rc · E] = 1/2. Thus, the Bayesian account of direct inference apparently implies a form of the principle of indifference (Hawthorne et al. 2017). However, it seems highly doubtful that most agents will assign the value 1/2 to P[F | H · Rc · E]. For, in place of F, John might well have asserted biconditionals involving any number of distinct alternative conditions, F₁, F₂, F₃, …, etc. (e.g., "I'll buy you dinner at McDonald's", "I'll buy you dinner at Chez Panisse", …, etc.). But the statements Fₖ for the resulting biconditional claims, (Fₖ ≡ Ac), cannot all have conditional credence values P[Fₖ | H · Rc · E] = 1/2. Thus, the agent's direct inference credence P[Ac | H · Rc · E · (Fₖ ≡ Ac)] must deviate from the objective chance value 1/6 for almost all such claims Fₖ.
When the value of s = P[F | H · Rc · Ac · E] is much closer to 0 than the value of t = P[¬F | H · Rc · ¬Ac · E], the value of P[Ac | H · Rc · E · (F ≡ Ac)] must be very close to 0, as the theorem shows. 13 So, if Maria (and eavesdropping bystanders) takes John's offer to be very unlikely before he asserts it, then her total-evidence credence for seven on the next toss should be very close to 0! Thus, if the objective chance values provide the correct betting odds, then Maria (and bystanders) should be willing to accept wagers against seven at incorrect odds that are extremely unfavorable to themselves. This is true regardless of whether there is any evidence available for Maria (or the bystanders) that justifies assigning low credence to John paying for the dinner. We will discuss situations in which credences based on no evidence whatsoever lead to defeat of direct inferences in more detail in Sect. 5.2.
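The dependence of the biconditional-conditioned credence on s and t can be made concrete with a short Python sketch. The closed form used here is a routine application of Bayes' theorem to the quantities r, s and t defined in the text; it is offered as an illustration under that assumption, not as a quotation of the theorem. The function name and the example values of s and t are hypothetical.

```python
from fractions import Fraction


def credence_given_biconditional(r, s, t):
    """P[Ac | H.Rc.E.(F == Ac)] computed from
    r = P[Ac | H.Rc.E], s = P[F | H.Rc.Ac.E], t = P[not-F | H.Rc.not-Ac.E].

    The biconditional is true in exactly two cases: (Ac and F), with
    probability r*s, and (not-Ac and not-F), with probability (1-r)*t.
    Conditioning on it keeps only those two cases.
    """
    both_true = r * s
    both_false = (1 - r) * t
    if both_true + both_false == 0:
        raise ValueError("the biconditional has probability 0")
    return both_true / (both_true + both_false)


r = Fraction(1, 6)  # chance of seven on a fair toss of fair dice

# s == t: the credence stays at the chance value.
unchanged = credence_given_biconditional(r, Fraction(1, 4), Fraction(1, 4))

# F independent of Ac with P[F] = 1/10 (illustrative), so s = 1/10, t = 9/10.
stingy = credence_given_biconditional(r, Fraction(1, 10), Fraction(9, 10))

print(unchanged, stingy)  # 1/6 1/46
```

In the independence case s = P[F | H · Rc · E] and t = 1 − s, so the credence returns to 1/6 only when that prior credence in F is exactly 1/2.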

Some Other Logically Inadmissible Statements
Similar to biconditionals, material conditionals and disjunctions involving the outcome Ac must be logically inadmissible. The extent to which the resulting probabilities deviate from the corresponding direct inference probabilities will be characterized precisely here. We will also prove a result for the case where adding a further statement to the body of evidence defeats a defeater and restores the original direct inference.
A statement is a defeater just in case its negation is also a defeater. The only exceptions are cases where the candidate statement has probability 1 or 0, given the premise of the direct inference. This suggests an easy algorithm for generating a host of inadmissible statements: (1) find an obvious inadmissible statement (e.g. (¬F ·¬Ac)); then (2) take its negation (e.g. ¬(¬F ·¬Ac), which is logically equivalent to (F ∨ Ac)).
The following result establishes this claim.

Theorem 2 Defeater just when Negation
It follows immediately that whenever 0 It also follows immediately that disjunctions and material conditionals involving the outcome Ac are inadmissible.
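The claim that a statement defeats a direct inference just when its negation does can be checked on a toy example. The Python sketch below uses a hypothetical joint credence distribution over Ac and a candidate defeater D, understood as already conditional on (H · Rc · E). The numbers are illustrative assumptions chosen so that P[Ac | H · Rc · E] = 1/6 and 0 < P[D | H · Rc · E] < 1.

```python
from fractions import Fraction


def conditional(joint, event, given):
    """P[event | given] for a finite joint distribution.

    `joint` maps possible worlds (here, pairs of truth values for Ac
    and D) to probabilities; `event` and `given` are predicates on worlds.
    """
    p_given = sum(p for w, p in joint.items() if given(w))
    p_both = sum(p for w, p in joint.items() if given(w) and event(w))
    return p_both / p_given


# Illustrative credences, conditional on (H.Rc.E); worlds are (Ac, D).
joint = {
    (True, True): Fraction(1, 12),
    (True, False): Fraction(1, 12),
    (False, True): Fraction(1, 6),
    (False, False): Fraction(2, 3),
}

r = conditional(joint, lambda w: w[0], lambda w: True)
given_d = conditional(joint, lambda w: w[0], lambda w: w[1])
given_not_d = conditional(joint, lambda w: w[0], lambda w: not w[1])
print(r, given_d, given_not_d)  # 1/6 1/3 1/9
```

Because r is a weighted average of the two conditional credences, with weights P[D] and P[¬D], one of them can equal r only if the other does too. Here D pushes the credence above 1/6, so ¬D must push it below.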
The following theorem extends this result by showing more precisely the degree to which P

Theorem 3 Inadmissible Disjunctions. Let r be any real number such that
It follows immediately that:

Corollary 4 Inadmissible Material Conditionals.
Let r be any real number such that 0 < r < 1 and suppose P[Ac | H · Rc · E] = r.
This corollary characterizes additional counter-intuitive defeaters for Bayesian direct inference. Suppose that in our craps example from Sect. 2.1 John says "If seven comes up on the next toss, I'll buy you dinner this evening". Then, where r = 1/6, for s = 0.5, P[Ac | H · Rc · E · (Ac ⊃ F)] = 1/11. Furthermore, if, believing that John is stingy, Maria considers "John buys Maria dinner this evening", F, to be highly unlikely (given H · Rc · E), say s = 0.01, then P[Ac | H · Rc · E · (Ac ⊃ F)] = 1/501 << 1/6. Thus, such (material) conditional claims turn out to overwhelmingly defeat the direct inference. This is true regardless of whether Maria has any evidence that justifies her in considering John stingy.
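The two values just cited, 1/11 and 1/501, can be reproduced with exact arithmetic. The Python sketch below conditions on the material conditional (Ac ⊃ F) directly, taking s to be the credence in F given Ac (which coincides with the credence in F given H · Rc · E when F and Ac are independent). The function name is hypothetical, and the formula is derived from Bayes' theorem rather than copied from the corollary.

```python
from fractions import Fraction


def credence_given_material_conditional(r, s):
    """P[Ac | H.Rc.E.(Ac > F)], where r = P[Ac | H.Rc.E] and
    s = P[F | H.Rc.E.Ac].

    (Ac > F) is false only when Ac is true and F is false, so
    conditioning on it keeps (Ac and F), with probability r*s,
    together with all of not-Ac, with probability 1 - r.
    """
    return (r * s) / (r * s + (1 - r))


r = Fraction(1, 6)
even_odds = credence_given_material_conditional(r, Fraction(1, 2))
stingy = credence_given_material_conditional(r, Fraction(1, 100))
print(even_odds, stingy)  # 1/11 1/501
```

Only s = 1 leaves the credence at r, which is the case in which the conditional adds no information beyond what Ac already guarantees.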
In some cases a defeated direct inference may be restored by the addition of information. Consider, for example, the case where (Ac ∨ F) is a defeater for the direct inference to Ac, but where F is not itself a defeater. In that case, although P[Ac | H · Rc · E · (Ac ∨ F)] ≠ r, we have P[Ac | H · Rc · E · (Ac ∨ F) · F] = P[Ac | H · Rc · E · F] = r, because ((Ac ∨ F) · F) is logically equivalent to F. In this case the statement F is a defeater-defeater for the defeater (Ac ∨ F). An earlier result (Theorem 2) showed that the negation of a defeater must also be a defeater. So, one may well wonder whether the negation of a defeater-defeater may also be a defeater-defeater. The following theorem shows that this never happens. The negation of a defeater-defeater can never restore the previously defeated direct inference.

Theorem 5 Negations of Defeater-Defeaters cannot be Defeater-Defeaters. Suppose P[Ac
suppose that D defeats the direct inference P[Ac | H · Rc · E] = r but G defeats the defeater, restoring the direct inference. Then
The next subsection provides an important example of a defeater-defeater.
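The defeater-defeater pattern discussed in this subsection can be illustrated numerically. In the Python sketch below, D is the disjunction (Ac ∨ F) and G is F, with F assumed to be probabilistically independent of Ac given (H · Rc · E), so that F is not itself a defeater. The prior credence 1/2 in F and the helper name are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def conditional(joint, event, given):
    """P[event | given] for a finite joint distribution over worlds."""
    p_given = sum(p for w, p in joint.items() if given(w))
    p_both = sum(p for w, p in joint.items() if given(w) and event(w))
    return p_both / p_given


r, p = Fraction(1, 6), Fraction(1, 2)  # P[Ac | H.Rc.E] and P[F | H.Rc.E]

# Worlds are (Ac, F); independence makes the joint credence a product.
joint = {
    (ac, f): (r if ac else 1 - r) * (p if f else 1 - p)
    for ac, f in product([True, False], repeat=2)
}


def is_ac(w):
    return w[0]


# D = (Ac or F) defeats the direct inference.
defeated = conditional(joint, is_ac, lambda w: w[0] or w[1])
# Adding G = F restores it, since (Ac or F) and F is equivalent to F.
restored = conditional(joint, is_ac, lambda w: (w[0] or w[1]) and w[1])
# Adding not-F instead cannot restore it: the premise then entails Ac.
not_restored = conditional(joint, is_ac, lambda w: (w[0] or w[1]) and not w[1])

print(defeated, restored, not_restored)  # 2/7 1/6 1
```

In this example the negation of the defeater-defeater does not merely fail to restore the chance value; it drives the credence in Ac all the way to 1.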

Escape from These Troubles via Stronger Conditionals
The craps table examples presented in Sects. 2.1 and 2.2 show how easy it can be to taint an agent's total body of evidence with statements that defeat her direct inferences. But perhaps our way of interpreting these examples is mistaken. For, although direct inferences are indeed defeated by such material conditionals and biconditionals (in which the antecedents are the target statement of the direct inference, or its negation, Ac or ¬Ac), perhaps such defeating conditionals and biconditionals may not be so easily introduced into an agent's total body of evidence in such a way that they function as defeaters. If this suggestion is right, then although the formal results about material conditional and biconditional defeaters are correct, the intuitive examples we used to illustrate the impact of these formal results may be misleading. Properly represented, the intuitive examples might not give rise to direct inference defeaters after all. Here is what we have in mind.
We first treat the case of simple conditional statements, before turning to the biconditional case. Consider John's conditional assertion to Maria, "If seven comes up on the next toss, I'll buy you dinner this evening." As usually understood, such an assertion suggests a clear causal asymmetry between John's dinner offer (i.e. "I will buy Maria dinner this evening") and the outcome of the dice roll (i.e. "seven comes up on the next toss"). John may wait for the outcome of the toss and may then act in such a way that the conditional will be true. So, perhaps the representation of the example in terms of a mere material conditional is inadequate. Perhaps the conditional involved is more adequately represented by some stronger kind of indicative or causal conditional. Let's formally represent John's assertion this way: (Ac → F), where → represents some kind of strong, causal or indicative conditional. Then, the central issue is whether P[Ac | H · Rc · E · (Ac → F)] = P[Ac | H · Rc · E]. The following result will prove useful.
The direct inference remains undefeated by the strong conditional, i.e. P[Ac | H · Rc · E · (Ac → F)] = P[Ac | H · Rc · E], whenever Ac provides no evidence for (or against) (Ac → F), given (H · Rc · E). 14 Arguably, in the craps-table example the claim Ac (given (H · Rc · E)) does not provide evidence for or against a strong (causal or indicative) conditional claim of form (Ac → F). 15 Thus, our example of easy defeat for an agent's direct inference may be side-stepped. Supplying the agent with a convincing conditional claim involving the target statement of her direct inference, Ac, need not defeat her direct inference after all, unless that convincing conditional claim is merely a material conditional claim. A truly convincing example of easy defeat via the acquisition of knowledge of a conditional claim will have to show how the rational agent may (easily) become convinced of the material conditional claim in cases where she is not also convinced of the corresponding strong conditional claim. 16
All of the previous points carry over fairly directly to the case of the biconditional defeater. In this context, John's biconditional assertion to Maria, "I'll buy you dinner this evening if, but only if, seven comes up on the next toss", clearly suggests a causal asymmetry between John's dinner offer and the outcome of the dice roll. So, perhaps John's biconditional assertion is not adequately captured by the material biconditional. Perhaps it is more adequately represented by a conjunction of stronger, indicative or causal conditional claims, as follows: ((Ac → F) · (¬Ac → ¬F)), where → again represents some kind of strong, causal or indicative conditional. Then, the issue is whether P[Ac | H · Rc · E · ((Ac → F) · (¬Ac → ¬F))] = P[Ac | H · Rc · E].
The direct inference remains undefeated by the strong biconditional, i.e. P[Ac | H · Rc · E · ((Ac → F) · (¬Ac → ¬F))] = P[Ac | H · Rc · E], whenever Ac and ¬Ac each provide the same evidence for (or against) ((Ac → F) · (¬Ac → ¬F)), given (H · Rc · E). Arguably, in the context of the craps-table example, the claims Ac and ¬Ac should (given (H · Rc · E)) each provide the same amount of evidence for or against a strong (causal or indicative) biconditional claim of form ((Ac → F) · (¬Ac → ¬F)). 18 Thus, the prospect of easy defeat for an agent's direct inference about a future chance event, via the easy acquisition of a biconditional, may be averted. Informing the agent with a convincing biconditional claim need not defeat her direct inference, unless that convincing biconditional claim involves only a material biconditional, rather than conditionals of some stronger kind.
None of this is to suggest that defeat via material conditionals and biconditionals is unimportant to Bayesian direct inferences; only that such statements may not be so easily acquired as the craps-table examples suggest. Furthermore, in cases where the chance event Ac has already occurred, when the agent's total available evidence remains admissible for the relevant direct inference, her chance claims may continue to guide her credence that Ac holds via the usual kind of direct inference. However, in such cases an agent may more easily become informed of a material conditional or biconditional statement that informationally ties Ac to another statement F. When that happens, this additional information may well defeat her chance-based direct inference regarding the chance event Ac, as indicated by the defeater theorems presented in this section. From a Bayesian perspective, this may sound plausible. When F and Ac are informationally tied together by a material conditional or biconditional claim, and that claim is added to the agent's total evidence, then whatever credence F itself already had will drag the credence of Ac away from its direct inference value. 19 This is true, however, even for the case where no evidence is available for or against F. In this case, it seems that defeat by biconditionals may be problematic. We will discuss situations in which credences based on no evidence whatsoever lead to defeat of direct inferences in more detail in Sect. 5.2.

Evidential Relevance and Admissibility
It is commonly supposed that chance hypotheses screen off "many propositions that one can easily come to know and that would otherwise be relevant to the proposition A under discussion." (Schwarz 2014, p. 82). When this is so, the direct inference from the chance hypothesis is said to be resilient. 20 A high degree of resiliency for direct inferences is crucial. Otherwise, they may be largely inapplicable, given the total evidence available to agents. In this section we will characterize a broad class of statements that, on logical grounds, must defeat direct inferences. Thus, to the extent that such information is readily available to agents, direct inferences may turn out to be rather less resilient than usually supposed.
We investigate some quite general conditions under which a statement D may defeat direct inferences. Our results are general enough to apply to extensive chance hypotheses, i.e., chance hypotheses (and theories) that entail chance claims for an algebra of outcomes of initial chance states R, and may do so for any number of distinct initial chance states. We'll say more about the nature of extensive chance hypotheses below.
We will characterize some classes of statements that must defeat direct inferences, and so must be inadmissible on any account. For example, under assumptions very commonly met, one of our main results shows that evidential support of a statement D for Ac implies inadmissibility of D in direct inferences for Ac. It goes like this. Let A₁c and A₂c be any two possible chance outcomes of initial state R for chance system c, and suppose E is admissible for the direct inferences from H to each of these two outcomes. Consider a statement D to which each of the possible chance events (Rc · A₁c) and (Rc · A₂c) is directly relevant. Indeed, suppose that each of these possible chance events is so directly relevant to D that it overrides (or screens off) whatever relevance H might have to D, given E (for credence function P). Then, provided that D is more likely according to one of these two chance events than according to the other, given E (for P), D must defeat either the direct inference from (H · Rc · E) to A₁c or the direct inference from (H · Rc · E) to A₂c (for P). Thus, any such statement D, in conjunction with the admissible statement E, must be inadmissible for direct inferences from (H · Rc).
This section is mainly devoted to explicating several results of this kind.
We proceed by first characterizing extensive chance hypotheses, and generalizing the principle of direct inference, G-DIP, to cover them. Then we identify an important class of statements D that turn out to defeat direct inferences from chance hypothesis H : statements D to which some of H 's chance outcomes are "more directly relevant" than is H itself. We provide an illustrative example of such a case. Finally, we establish two general results that show the logical inadmissibility of such statements. The first result, stated informally above, provides sufficient conditions for such statements to defeat direct inferences. The second result provides necessary and sufficient conditions for such statements to defeat direct inferences, but under slightly stricter conditions (involving partitions of chance outcomes) than supposed by the first result.

Extensive Chance Hypotheses and Algebras of Attributes
Sophisticated chance hypotheses (or chance theories) entail chance claims for all Boolean combinations of possible outcome attributes of an initial chance state (or reference class) R. That is, whenever the hypothesis entails chance claims of form ch(Ax, Rx) = r and ch(Bx, Rx) = s, it also entails chance claims of form ch(¬Ax, Rx) = p, ch((Ax ∨ Bx), Rx) = q, and ch((Ax · Bx), Rx) = t, where p, q, r , s, t are standard terms for real numbers between 0 and 1. Thus, associated with each chance state Rx is a Boolean algebra of outcome attributes R for R, where, whenever R contains Ax and Bx, it also contains ¬Ax, (Ax ∨ Bx), and (Ax · Bx); and where R contains no other expressions. 21 Furthermore, for each initial state (or reference class) R treated by H , the associated chance function ch( , Rx) should satisfy the usual axioms of probability theory for its algebra of attributes, R . 22 An extensive chance theory of this kind will often cover a variety of distinct initial states (or reference classes) Rx, and provide chance claims for Boolean algebras of outcomes, R , for each such R. One more bit of notation will prove useful. When a particular chance system c is in an initial chance state R, we denote the algebra of chance outcomes for event Rc by the term ' R (c)', which represents the algebra of outcome attributes for R, R , applied to the individual system c. That is, when Rc holds, for each Ax in R , there is an associated possible outcome of Rc, Ac, in the algebra of associated outcomes R (c). Throughout the remainder of this paper our treatment of chance and direct inference will apply to the kind of extensive chance hypotheses just described. We'll use 'H ' to represent chance hypotheses of this kind. Here is a generalization of the direct inference principle that applies to direct inferences from extensive chance hypotheses.

Generalized Generic Direct Inference Principle (GG-DIP):
Let P be an appropriate classical probability function (credence function) on a language that contains chance (or frequency) statements. Let H be any extensive chance hypothesis: that is, for each initial state (or reference class) R treated by H, for each Aⱼ in the associated Boolean algebra, R, of possible outcome attributes for systems in state R, H entails a chance claim of form ch(Aⱼx, Rx) = rⱼ, where rⱼ is a standard term for a real number between 0 and 1 (inclusive), and where each chance function ch( , Rx) satisfies the usual axioms of probability theory on R. Then, for each outcome attribute Aⱼ in R, for each chance system c, P[Aⱼc | H · Rc · E] = rⱼ, provided that E is both consistent with (H · Rc) and admissible with respect to (H · Rc) over R (c) (where tautologies are always considered admissible).
21 When chances are represented in terms of sets, rather than attribute-predicates of form 'Ax', the associated collection of sets is called a field of sets.
22 For each attribute Ax and Bx in R: (1) 0 ≤ ch(Ax, Rx) ≤ 1; (2) if Ax has the form of a tautology, then ch(Ax, Rx) = 1; (3) if ¬(Ax · Bx) has the form of a tautology, then ch((Ax ∨ Bx), Rx) = ch(Ax, Rx) + ch(Bx, Rx). This suffices to guarantee that ch( , Rx) satisfies all the usual theorems of probability theory.
A statement E may defeat some of the direct inferences based on (H · Rc), while leaving others intact. That is, we may have P[Aⱼc | H · Rc · E] = rⱼ for some possible outcomes Aⱼc, while P[Aₖc | H · Rc · E] ≠ rₖ for some other possible outcomes. In that case E should count as inadmissible for the direct inferences from (H · Rc) to the outcomes in R (c), regardless of the fact that some of these chance outcomes happen to be probabilistically independent of E. For, when an agent's total body of evidence consists of (Rc · E) and she is contemplating bets on outcomes of Rc, no proper account of admissibility should count her total evidence as admissible for some of the possible outcomes but inadmissible for others: admissible for the dice coming up six, but inadmissible for coming up nine. Any proper account of admissibility involves more than mere probabilistic independence. Any specific notion of admissibility is supposed to provide a rationale for probabilistic independence in direct inference contexts, and that rationale should apply to all the possible outcomes of an initial chance state Rc for a chance system c.
At the beginning of this section we introduced the notion of resiliency for direct inferences. The idea is that the alignment of credences with chances should not be undermined by the addition of easily acquired information. Otherwise, the ability to apply direct inferences becomes unstable. Resiliency is meant to capture this kind of desired stability for direct inferences. A direct inference is highly stable provided that nearly all of the kinds of information that might become available to an agent who is in a position to apply that direct inference fall within its "sphere of resiliency". It will prove useful to specify this notion formally.

Definition 6 Resiliency Spheres.
For a credence function P, an extended chance hypothesis H , and a chance system c in initial state R covered by chance claims in H , the resiliency sphere for direct inferences from (H · Rc) is the collection of statements E such that, for every outcome Ac in algebra R (c) of outcomes for Rc (according to H ), P[Ac|H ·Rc·E] = P[Ac|H ·Rc].
Notice that a resiliency sphere surrounds not merely individual chance outcomes, taken one at a time, but the whole algebra of outcomes of chance state Rc. A statement E that is probabilistically independent of one outcome of Rc, given (H · Rc), but fails to be probabilistically independent of another of its outcomes, falls outside the resiliency sphere. 23 The resiliency sphere for (H · Rc) will usually be broader than its class of admissible statements, depending on how the notion of admissibility is specified. To see why, notice how GG-DIP (and G-DIP) is supposed to work. Any application of GG-DIP presupposes some concrete notion of admissibility, specified in advance of identifying associated credence functions P. That is, a concrete notion of admissibility specifies, for each chance statement in H and its initial state Rc (for arbitrary systems c), exactly what statements E are to count as admissible. It will usually do so in terms of the information carried by the chance claims in H, the information carried by Rc and its associated chance outcomes in R (c), and the information carried by statements E. This will usually involve conditions that take into account whether the information in E is (or is not) "directly relevant" to outcomes Ac in R (c). 24 The specification of admissibility doesn't depend in any way on the particular credence function considered. Rather, after a specific account of admissibility is spelled out, GG-DIP (or G-DIP) does its work by ruling out those credence functions P that either fail to make P[Ac | H · Rc] = r when H entails ch(Ax, Rx) = r, or that fail to make P[Ac | H · Rc · E] = P[Ac | H · Rc] when E has been deemed admissible by the account of admissibility on offer. All credence functions P that are not ruled out in this way may count as "appropriate" for some agent, provided that they satisfy whatever other constraints are deemed proper (e.g. for Lewis they must also satisfy regularity).
The point is, for a credence function P that passes these hurdles, and so succeeds in satisfying GG-DIP, there may well be a number of statements E not designated as admissible but that still yield P[Ac | H · Rc · E] = P[Ac | H · Rc] = r for all Ac in R (c). Thus, the resiliency sphere of (H · Rc) for P may well contain more than the class of admissible statements for (H · Rc) specified by a specific account of admissibility. However, any statement E that falls outside the resiliency sphere of (H · Rc) for P must be inadmissible for (H · Rc) according to every possible coherent account of admissibility.
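Membership in a resiliency sphere can be checked mechanically on a small finite model. The following is a minimal sketch, not a general implementation: the toy credence function is read as already conditional on (H · Rc) for a fair die, and the two candidate statements (the colour of the table, a witness report) are hypothetical illustrations.

```python
from fractions import Fraction
from itertools import chain, combinations, product

FACES = range(1, 7)

def weight(face, green, report_six):
    """Toy credence of a world (face, green, report_six), given (H · Rc)."""
    p_face = Fraction(1, 6)      # direct inference values supplied by H
    p_green = Fraction(1, 2)     # table colour: unrelated to the roll
    # Hypothetical witness whose report tracks whether a six came up.
    p_report = Fraction(9, 10) if face == 6 else Fraction(1, 10)
    return (p_face
            * (p_green if green else 1 - p_green)
            * (p_report if report_six else 1 - p_report))

WORLDS = {w: weight(*w) for w in product(FACES, [True, False], [True, False])}

def prob(event, given=lambda w: True):
    """Conditional credence of `event` given `given` over the toy worlds."""
    num = sum(p for w, p in WORLDS.items() if given(w) and event(w))
    den = sum(p for w, p in WORLDS.items() if given(w))
    return num / den

def outcome_algebra():
    """All Boolean combinations of faces, represented as sets of faces."""
    faces = list(FACES)
    subsets = chain.from_iterable(
        combinations(faces, k) for k in range(len(faces) + 1))
    return [set(s) for s in subsets]

def in_resiliency_sphere(statement):
    """True iff `statement` leaves the credence of every outcome unchanged."""
    return all(
        prob(lambda w: w[0] in outcome, given=statement)
        == prob(lambda w: w[0] in outcome)
        for outcome in outcome_algebra()
    )

green_table = lambda w: w[1]
witness_says_six = lambda w: w[2]
print(in_resiliency_sphere(green_table), in_resiliency_sphere(witness_says_six))
```

The check quantifies over the whole algebra of outcomes, not over a single outcome, which mirrors the definition: a statement that shifts the credence of even one outcome falls outside the sphere.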

When Chance Outcomes of a Hypothesis H Override Its Relevance to a Statement D
Typically, the relevance of a chance hypothesis H to a statement D will be overridden by outcomes of an initial chance state Rc in the following kind of situation. Statement D contains information about possible chance outcome Ac (and its alternatives), so Ac is evidentially relevant to D given (Rc · E). And because hypothesis H is relevant to chance outcome Ac, it will be relevant to (information in) D as well. But the chance claim ch(Ax, Rx) = r entailed by H is more directly about outcome Ac than about D, so the relevance of H to D derives from its relevance to Ac. When that's the case, the information contained in outcomes Ac and ¬Ac may override what information H contains (about possible outcomes) that is relevant to D, given (Rc · E), because the information Ac and ¬Ac contain is more directly tied to D than the information contained in H. Thus: P[D | Ac · H · Rc · E] = P[D | Ac · Rc · E] and P[D | ¬Ac · H · Rc · E] = P[D | ¬Ac · Rc · E]. In such cases let's say that the relevance of chance hypothesis H to statement D is overridden by the associated chance outcomes of chance state Rc. It turns out that whenever this condition holds and D is evidentially relevant to Ac (P[D | Ac · H · Rc · E] ≠ P[D | Rc · E]), D (together with admissible E) must defeat the direct inference from (H · Rc) to Ac.
23 For the same reason that admissibility should work this way; see above.
Here is an illustration of a case where chance outcomes {Ac, ¬Ac} of a chance hypothesis H are overridingly relevant to a statement D.
Let H be a theory about the chances that people who fit some particular profile R have the attribute, "will develop Alzheimer's disease by age 70", attribute A. Thus, H entails ch(Ax, Rx) = r , for some specific value r (e.g. perhaps r = .83). Suppose that a 50 year old male named Chuck, c, fits the profile, so Rc holds. Thus, for admissible background information E, P[Ac | H · Rc · E] = r is a perfectly good direct inference about Chuck's chances of developing Alzheimer's by age 70. E may include whatever admissible background information we may know about medical conditions and medical testing (including brain imaging), about the chance theory H , about Chuck himself, etc.
We may be interested in other indications of whether Chuck will develop Alzheimer's by age 70, indications that are independent of the information provided by chance theory H. Suppose that by means of an imaging technique it is possible to detect brain plaque of the kind usually associated with Alzheimer's. The detection of a "moderate accumulation" of this plaque (in a patient like Chuck) does not guarantee that the patient will acquire Alzheimer's as he ages, but it is an indication of a significantly increased risk of developing the disease. Included among the admissible background knowledge E may be information about this technique and its implications. Let statement Fc state the fact that Chuck undergoes the imaging technique at age 50, and let statement D say that the image of Chuck's brain shows that a "moderate accumulation" of plaque is present. Presumably, absent the result D, Fc taken together with the other information in E is admissible, so let's suppose that Fc is included within E. However, the result of this procedure, D, may well be evidentially relevant to whether or not Chuck will develop Alzheimer's by age 70. Suppose it indicates an increased likelihood of the onset of Alzheimer's by age 70: 1 > P[Ac | D · Rc · E] > P[Ac | Rc · E] > 0.
Regardless of whatever relevance a person's chances of developing Alzheimer's by age 70, H, may have to his likelihoods of exhibiting a "moderate accumulation" of brain plaques by age 50, D, the relevance of that chance claim H to image result D is overridden by the claim that the individual will indeed develop Alzheimer's by age 70, Ac. That is, the fact that a person will develop the disease, Ac, is predictive enough about the amount of plaque build up over time that it overrides the relevance of the chances of developing the disease (expressed by H) to the likelihood of outcome D from a brain scan at age 50. Thus, P[D | Ac · H · Rc · E] = P[D | Ac · Rc · E]. Similarly, the fact that a person will not develop the disease, ¬Ac, is predictive enough about the amount of plaque build up over time that it overrides the relevance of the chances of developing the disease (expressed by H) to the likelihood of outcome D from a brain scan at age 50. Thus, P[D | ¬Ac · H · Rc · E] = P[D | ¬Ac · Rc · E].
Thus, in the order discussed, we have the following:
1. P[Ac | H · Rc · E] = r is a direct inference about Chuck's chances of developing Alzheimer's by age 70, given he fits profile R.
2. 1 > P[Ac | D · Rc · E] > P[Ac | Rc · E] > 0: given membership in risk group R, the fact that a person's brain scan at age 50 shows a "moderate accumulation" of plaque is positive evidence that the person will develop Alzheimer's by age 70.
3. P[D | Ac · H · Rc · E] = P[D | Ac · Rc · E] and P[D | ¬Ac · H · Rc · E] = P[D | ¬Ac · Rc · E]: relevance of the chances of developing Alzheimer's by age 70 (according to hypothesis H) to whether a person's brain scan at age 50 shows a "moderate accumulation" (statement D) is overridden by the claim that the person will (or will not) develop Alzheimer's by age 70 (the direct inference outcomes of H in {Ac, ¬Ac}), given admissible E.
Therefore, the claim that Chuck's brain scan shows a "moderate accumulation" of plaque, D, defeats the direct inference regarding Chuck's chances, r, of developing Alzheimer's by age 70: P[Ac | D · H · Rc · E] ≠ P[Ac | H · Rc · E] = r, for admissible E. Thus, D (in conjunction with E) must be inadmissible for this direct inference.
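The way the scan result defeats the direct inference can be sketched numerically via Bayes' theorem. When the outcomes Ac and ¬Ac screen off H from D, the likelihoods of D given (Ac · H · Rc · E) and (¬Ac · H · Rc · E) reduce to the likelihoods given (Ac · Rc · E) and (¬Ac · Rc · E). The value r = .83 is the one mentioned in the example; the two likelihood values below are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def credence_given_scan(r, d_if_a, d_if_not_a):
    """P[Ac | D · H · Rc · E] by Bayes' theorem.

    r:          direct inference value P[Ac | H · Rc · E]
    d_if_a:     P[D | Ac · Rc · E]   (stands in for P[D | Ac · H · Rc · E])
    d_if_not_a: P[D | ¬Ac · Rc · E]  (stands in for P[D | ¬Ac · H · Rc · E])
    The substitution relies on the screening-off condition.
    """
    return d_if_a * r / (d_if_a * r + d_if_not_a * (1 - r))

r = Fraction(83, 100)

# Hypothetical likelihoods: plaque is more likely for those who will develop the disease.
defeated = credence_given_scan(r, Fraction(3, 5), Fraction(1, 5))

# If D were equally likely either way, the direct inference value would survive.
intact = credence_given_scan(r, Fraction(2, 5), Fraction(2, 5))
print(defeated, intact)
```

With unequal likelihoods the credence moves away from r (here to 249/266, about .94); with equal likelihoods it stays at r, so D would then carry no evidential relevance to Ac.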
Here is the relevant formal result. It shows that whenever a chance hypothesis H satisfies the above "overridden relevance to D" condition for its outcomes {Ac, ¬Ac}, given (Rc · E), statement D must defeat the direct inference from (H · Rc · E) to Ac if and only if D is evidentially relevant to Ac, given (Rc · E).

Corollary of Theorem 9 Inadmissible Evidence for Outcomes. 25
We assume throughout that P[D · H · Rc · E] > 0 (so that all the conditional probabilities are well-defined). Let P[Ac | H · Rc · E] = r, for 0 < r < 1, be a direct inference for admissible E. Suppose that the relevance of H to D is overridden by the outcomes {Ac, ¬Ac}, given (Rc · E), in the sense described above. Then D is evidentially relevant to Ac, given (Rc · E), if and only if (D · E) falls outside the resiliency sphere of (H · Rc) (since P[Ac | D · H · Rc · E] ≠ P[Ac | H · Rc · E]).

The Main Results
The next two theorems provide the main formal results of this section. Each result has two parts. Near the beginning of this section we summarized the first part of the first theorem. Here is an interpretive account of both parts of the first theorem.
Let P be any classical probability function (or rational credence function) that satisfies GG-DIP for the direct inferences from (H · Rc · E) for admissible E. Let A₁c and A₂c be any two possible chance outcomes of initial state R for chance system c. Suppose that (according to the credences represented by function P) each of these two chance events overrides (or screens off) whatever relevance H might have to D, given E (according to P): P[D | Aₖc · H · Rc · E] = P[D | Aₖc · Rc · E], for k = 1, 2.
Here is the formal statement of this result.

Theorem 8 Sufficient Condition for Inadmissible Evidence.
We assume throughout that P[D · H · Rc · E] > 0 (so that all conditional probabilities are well-defined). Let A₁c and A₂c be any two outcomes of initial state Rc such that, for admissible E, the following direct inferences hold: Suppose, for k = 1, 2: It follows that:
Whereas the first theorem applies for any two chance outcomes of chance hypothesis H, the next theorem relies on outcomes that form a partition. The payoff for this stronger supposition is a biconditional connection between support for (or by) D and the failure of direct inferences.
The first part of this theorem shows that whenever, for each Bⱼc in a partition of outcomes of Rc, the support for D by chance hypothesis H is overridden by the support afforded to D by (Rc · Bⱼc), according to P, the following result holds: D falls outside the resiliency sphere for the direct inferences based on (H · Rc · E) if and only if D is supported more (or less) by Bᵢc than by Bⱼc, given (Rc · E), for some Bᵢc and Bⱼc in the partition.
The second part of this theorem shows that under the same conditions stated above for the first part, the following result holds: D falls outside the resiliency sphere for the direct inferences based on (H · Rc · E) if and only if Bₖc is either positively or negatively supported by D, given (Rc · E), for at least one of the Bₖc in the partition.

Theorem 9 Necessary and Sufficient Condition for Inadmissible Evidence.
We assume throughout that P[D · H · Rc · E] > 0 (so that all conditional probabilities are well-defined). Let R (c) = {B 1 c, B 2 c, . . . } be some partition of outcomes of initial state Rc for P[ | H · Rc · E] such that, for each B k c in R (c), the following direct inferences hold for admissible E: Then we have the following result: if and only if (D · E) falls outside the resiliency sphere of (H · Rc) for P (since, for some B k . Furthermore, when R (c) is a partition for P[ | Rc · E], from the same suppositions we get the following result: if and only if (D · E) falls outside the resiliency sphere of (H · Rc) for P (since, for some B k
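The partition case can be illustrated with a small numerical sketch. It assumes the screening-off condition, so that the likelihood of D given each (Bₖc · H · Rc · E) equals its likelihood given (Bₖc · Rc · E); the three-cell partition and all numbers are hypothetical.

```python
from fractions import Fraction

def update_partition(direct_inference_values, likelihoods_of_d):
    """Credences P[Bₖc | D · H · Rc · E] for each cell of a partition.

    direct_inference_values: the values rₖ = P[Bₖc | H · Rc · E]
    likelihoods_of_d:        the values P[D | Bₖc · Rc · E], used in place of
                             P[D | Bₖc · H · Rc · E] (screening-off assumption)
    """
    joint = [r * d for r, d in zip(direct_inference_values, likelihoods_of_d)]
    total = sum(joint)  # P[D | H · Rc · E], by the law of total probability
    return [j / total for j in joint]

r = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

# D equally likely on every cell: all direct inference values survive.
flat = update_partition(r, [Fraction(1, 4)] * 3)

# D more likely on the first cell than on the others: the values shift.
tilted = update_partition(r, [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)])
print(flat, tilted)
```

In the tilted case every cell's credence departs from its direct inference value, including the two cells on which D is equally likely, since the credences must still sum to one.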

"Inappropriate" Credence Functions
It should be pretty clear that, given a specific account of admissibility, not all credence functions are "appropriate" in the way required by G-DIP and GG-DIP. Our next result shows that the axioms of classical probability put tight constraints on precisely which credence functions can get direct inference right. Let P be any "appropriate" initial credence function, which gets direct inferences from (H · Rc · E) to chance outcomes Aⱼc right, where E is admissible. Let Q be any credence function that varies from P by even a small shift in the non-direct inference credence for a chance outcome, i.e., Q[Aⱼc | Rc · E] ≠ P[Aⱼc | Rc · E] for some chance outcome Aⱼc. Then, provided that Q satisfies an additional weak condition, it cannot get all the direct inferences right. One example of the additional weak condition is that P and Q agree on the amount of evidential support that (Rc · Aₖc · E) would provide to H, for each Aₖc in a partition. Another example is where Q comes from P via certain instances of Jeffrey Conditionalization (see Jeffrey 1990). Thus, some rather minor variants of credence functions that satisfy GG-DIP (including some that come about via the kinematics of Jeffrey updating) must fail to satisfy GG-DIP: they fail to count among the "appropriate" credence functions for direct inferences.
For the sake of clarity, we first present our results for binary chance outcomes, Ac and ¬Ac. We generalize these results in a later subsection.

Examples of "Inappropriate" Credence Functions
Consider the Alzheimer's example described in Sect. 3. Chance hypothesis H says that the chance of an individual in reference class R getting Alzheimer's by age 70 is r; Rc says that Chuck is in reference class R; and Ac says that Chuck will get Alzheimer's by age 70. Suppose that Maria and John agree on the amount of evidential support that (Rc · Ac), were it true, would supply to chance hypothesis H, given all their other relevant evidence E (on which they completely agree): Q[H | Rc · Ac · E] = P[H | Rc · Ac · E], where P is Maria's credence function and Q is John's credence function. And also suppose they agree on the amount of evidential support that (Rc · ¬Ac), were it true, would supply to chance hypothesis H, given all their other relevant evidence E: Q[H | Rc · ¬Ac · E] = P[H | Rc · ¬Ac · E]. However, Maria is somewhat more optimistic than John about Chuck's future health, particularly his prospects of getting Alzheimer's by age 70; thus, Q[Ac | Rc · E] < P[Ac | Rc · E].
Although neither Maria nor John is confident that chance hypothesis H is true, both want to draw the correct direct inference value, r, when H is added to their total admissible evidence (Rc · E): P[Ac | H · Rc · E] = r and Q[Ac | H · Rc · E] = r. However, it turns out that at least one of them must get the direct inference wrong, since: P[Ac | H · Rc · E] ≠ Q[Ac | H · Rc · E]. That is, if Maria gets the direct inference right, then John must get it wrong.
Proof. Follows immediately from setting (c) = {Ac, ¬Ac} in the more general Theorem 10 below.
Jeffrey Conditionalization is the best known approach to the representation of learning based on uncertain new evidence. It deals with cases where, rather than learning by becoming certain of new information F, the agent has an experience or an insight that directly changes her confidence in the truth of each alternative among some range of possibilities, {F₁, F₂, . . . , Fₙ}. Formally, when P is the agent's initial credence function, her new information induces a new credence function Q that directly assigns new credence values to the directly affected alternative possibilities in {F₁, F₂, . . . , Fₙ} as follows: Q[Fᵢ · Fₖ] = 0 for i ≠ k (they are alternative possibilities), and Q[F₁] + Q[F₂] + . . . + Q[Fₙ] = 1 (they are a complete collection of alternative possibilities). The relationship between the old credence function P and the new credence function Q is this:
Q[G | Fⱼ] = P[G | Fⱼ],
for all statements G, for each Fⱼ in {F₁, F₂, . . . , Fₙ}. That is, were the agent to become certain of any one of the statements Fⱼ, her new credence value, Q[G | Fⱼ], should be identical to the old credence value, P[G | Fⱼ] (for each statement G). It follows immediately that, for each statement G, the new credence value is given by
Q[G] = P[G | F₁] · Q[F₁] + P[G | F₂] · Q[F₂] + . . . + P[G | Fₙ] · Q[Fₙ]. 26
We now consider a case where Jeffrey Conditionalization (or a similar update method) induces a new credence function that must get direct inferences wrong.
Consider once again the Alzheimer's example from Sect. 3. As before, chance hypothesis H says that the chance of an individual in reference class R getting Alzheimer's by age 70 is r; Rc says that Chuck is in reference class R; and Ac says that Chuck will get Alzheimer's by age 70; statement D says that Chuck's brain scan at age 50 shows a "moderate accumulation" of plaque. Suppose (this time) that Maria considers the relevance of the chance claim H to brain imaging result D to be overridden by the claim, "Chuck gets Alzheimer's by age 70" (if added as a premise): P[D | H · Rc · Ac · E] = P[D | Rc · Ac · E]. Similarly, suppose Maria considers the relevance of the chance claim H to brain imaging result D to be overridden by the claim, "Chuck does not get Alzheimer's by age 70" (if added as a premise): P[D | H · Rc · ¬Ac · E] = P[D | Rc · ¬Ac · E]. Furthermore, suppose Maria isn't privy to the result of Chuck's brain scan, but she overhears two technicians talking about it. What she hears is vague (mostly tone of voice), but her impression changes her credence from P[Ac | Rc · E] = s to Q[Ac | Rc · E] = t > s. Maria updates her credences via Jeffrey Conditionalization, according to (1) and (2) below. Thus, her new credence function must get the direct inference (concerning Chuck having Alzheimer's by age 70) wrong: Q[Ac | H · Rc · E] ≠ r.

Corollary of Theorem 11 "Inappropriate" Credence Functions, Extended. Suppose, for admissible E, P[Ac| H ·
and P[D | H · Rc · ¬Ac · E] = P[D | Rc · ¬Ac · E]. Let probability function Q be related to P in the following way,
Our result here fits the pattern of Jeffrey Conditionalization, but our result is more general. For, the result itself doesn't assume that every statement is updated via the Jeffrey update formula; it only supposes that the update formula applies to (Rc · Ac), (Rc · ¬Ac), (H · Rc · Ac), and (H · Rc · ¬Ac). Furthermore, the result itself says nothing about updating, and need not be interpreted that way. Rather, the result applies to any pair of credence functions, Q and P, whatever their origins. The result says that for any credence function P that satisfies the initial suppositions, and for any credence function Q related to P as specified by conditions (1) and (2), when they disagree on the credence values for chance outcome Ac based on (Rc · E) alone, then (and only then) at least one of them must get the direct inference wrong; so at least one of them must be an "inappropriate" credence function according to GG-DIP.
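A small model makes the point concrete. The sketch below performs a Jeffrey update on the partition {Ac, ¬Ac} over four toy worlds, with every credence read as already conditional on (Rc · E). It is an illustration under assumed numbers, not a rendering of conditions (1) and (2); the helper names and all values are hypothetical.

```python
from fractions import Fraction

def make_credence(p_a, p_h_given_a, p_h_given_not_a):
    """Credence over worlds (H, A), read as conditional on (Rc · E)."""
    return {
        (True, True): p_a * p_h_given_a,
        (False, True): p_a * (1 - p_h_given_a),
        (True, False): (1 - p_a) * p_h_given_not_a,
        (False, False): (1 - p_a) * (1 - p_h_given_not_a),
    }

def prob(cred, event, given=lambda w: True):
    """Conditional credence of `event` given `given`."""
    num = sum(p for w, p in cred.items() if given(w) and event(w))
    den = sum(p for w, p in cred.items() if given(w))
    return num / den

is_h = lambda w: w[0]
is_a = lambda w: w[1]

def jeffrey_update(cred, new_p_a):
    """Jeffrey update on {Ac, ¬Ac}: rescale each cell to its new credence,
    leaving credences conditional on each cell unchanged."""
    old_p_a = prob(cred, is_a)
    return {
        w: p * (new_p_a / old_p_a if w[1] else (1 - new_p_a) / (1 - old_p_a))
        for w, p in cred.items()
    }

# Hypothetical initial credences, chosen so that P[Ac | H · Rc · E] = 3/4.
P = make_credence(Fraction(1, 2), Fraction(3, 5), Fraction(1, 5))
Q = jeffrey_update(P, Fraction(2, 3))   # credence in Ac shifts from 1/2 to 2/3

print(prob(P, is_a, given=is_h), prob(Q, is_a, given=is_h))
```

The update leaves Q[H | Rc · Ac · E] equal to P[H | Rc · Ac · E], yet the credence in Ac given H moves from 3/4 to 6/7. So if 3/4 is the correct direct inference value, the updated function gets the direct inference wrong.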

Generalization to Algebras of Outcomes
We now state the main results of this section in a more general form. The corollaries stated earlier follow directly from these.
The next theorem applies to all cases where probability function Q comes from function P via Jeffrey Conditionalization, but it applies to lots of other Q functions as well. Conditions (3.1) and (3.2) for each B j c in (c) and D i in ; for each B j c in (c). Then, Q[B k

Reference Class Problems
Accounts of direct inference, Bayesian or not, often encounter troubles in dealing with overlapping reference classes or initial chance states. Lots of ink has been spilt trying to sort out these problems. 27 In this section we raise some troubles for Bayesian accounts. We focus on issues that arise when the object language notion, ch, is some kind of objective chance. (Frequency accounts have distinct troubles of their own.) We will suggest some ways a Bayesian account may deal with these troubles.

Defeat by Outcome Attributes
Consider the case where an extensive chance hypothesis H entails chances for at least two distinct outcome attributes, Ax and Bx, for initial state R, i.e., Ax and Bx are members of the algebra of outcome attributes R. Then it will usually be the case that possible outcome Bc for system c defeats the direct inference from (H · Rc · E) to outcome Ac, for admissible E: P[Ac | H · Rc · Bc · E] ≠ P[Ac | H · Rc · E] = r. 28 Defeat of this kind turns out to be easy to finesse. Indeed, when H is an extensive chance hypothesis, as defined earlier, defeat of this kind turns into a direct inference success. For, whenever an extensive chance hypothesis H entails ch(Ax, Rx) = r, and Bx is a chance attribute for Rx according to H, then H must also entail ch(Bx, Rx) = s and ch(Ax · Bx, Rx) = t, where s and t are standard terms for real numbers. Thus, for admissible E, the following two direct inferences result: P[Bc | H · Rc · E] = s and P[Ac · Bc | H · Rc · E] = t. So, although Bc defeats the simple direct inference to Ac, we still obtain the direct inference we should want, but we get it via the following "complex direct inference": P[Ac | H · Rc · Bc · E] = P[Ac · Bc | H · Rc · E] / P[Bc | H · Rc · E] = t/s. This is exactly the value we should want for P[Ac | H · Rc · Bc · E]. And we've gotten it without complicating the account of chance by taking on a notion of conditional chance. That is, when Bx is an outcome attribute for Rx, the Bayesian machinery yields the desired direct inference value for Ax without needing to draw on chance expressions of form ch(Ax, Rx · Bx) = q. 29 This approach avoids drawing on the notion of conditional chance, and the attendant difficulties identified by Humphreys (1985). It also benefits by not requiring the account of chance to make sense of expressions that conditionalize on outcome attributes: when Bx is an outcome attribute for Rx, what does an expression of form ch(Ax, Rx · Bx) = q say? 30
One more point before moving on. The treatment described above works well for extensive chance hypotheses. But what about cases where H is not extensive, say, where H only entails one of ch(Bx, Rx) = s or ch(Ax · Bx, Rx) = t? In that case, although Bc should defeat the direct inference to Ac, the Bayesian direct inference approach doesn't produce a chance-based value for P[Ac | H · Rc · Bc · E]. Is this a problem for the Bayesian account?
By Bayesian lights, not at all. The incomplete, non-extensive chance hypothesis cannot supply the desired direct inference, but this is just as it should be! First, recall that the present account of direct inference doesn't suppose that the agent is certain of the chance hypothesis involved. Application of the Bayesian direct inference principle (GG-DIP or G-DIP) only supposes that the agent's total evidence is expressed by (Rc · E), or by (Rc · Bc · E) in this case, and contemplates the appropriate credence value when a chance hypothesis is added (as an additional premise) to this evidence. It does not suppose that the agent's total evidence contains the chance hypotheses on which the direct inferences depend. The main issue for the theory of direct inference is to determine the conditions under which the addition of a chance hypothesis (however well confirmed) to an agent's total evidence specifies appropriate direct inferences to possible outcomes. In this regard, the direct inference principle does not privilege any one chance hypothesis over another.
So, one plausible Bayesian line goes like this. It is not at all surprising that an incomplete chance hypothesis may fail to produce a direct inference when it fails to specify appropriate chance claims. The failure of the Bayesian account to produce direct inferences in such cases is not a fault of the account. Indeed, when hypothesis H doesn't include the chance claim ch(Ax, Rx) = r, it is no fault of the Bayesian account that it fails to produce the direct inference P[Ac | H · Rc · E] = r. Similarly, when an incomplete chance hypothesis H fails to supply one of the chance claims ch(Bx, Rx) = s or ch(Ax · Bx, Rx) = t, it is no fault of the Bayesian account that it fails to produce one of the direct inferences P[Bc | H · Rc · E] = s or P[Ac · Bc | H · Rc · E] = t, and so fails to produce the appropriate direct inference P[Ac | H · Rc · Bc · E] = t/s. In such a case, a more filled-out extension of H hypothesizes specific chance values for ch(Bx, Rx) and ch(Ax · Bx, Rx), and can thereby supply the appropriate direct inferences. If an agent lacks confidence in any of the filled-out extensions of H, then she simply needs to acquire more evidence for (or against) them, in the usual Bayesian way.
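The "complex direct inference" value t/s can be computed for a simple case. The sketch below assumes, purely for illustration, that H is the hypothesis that a six-sided die is fair, that Ax is "comes up even", and that Bx is "comes up greater than three"; none of these choices comes from the text.

```python
from fractions import Fraction

FACES = range(1, 7)

def ch(attribute):
    """Chance of an outcome attribute under the assumed fair-die hypothesis H."""
    return Fraction(sum(1 for face in FACES if attribute(face)), len(FACES))

is_even = lambda face: face % 2 == 0   # plays the role of Ax
above_three = lambda face: face > 3    # plays the role of Bx

r = ch(is_even)                                            # ch(Ax, Rx) = r
s = ch(above_three)                                        # ch(Bx, Rx) = s
t = ch(lambda face: is_even(face) and above_three(face))   # ch(Ax · Bx, Rx) = t

# P[Ac | H · Rc · Bc · E] = P[Ac · Bc | H · Rc · E] / P[Bc | H · Rc · E] = t/s
complex_direct_inference = t / s
print(r, s, t, complex_direct_inference)
```

Here Bc defeats the simple direct inference, since t/s = 2/3 differs from r = 1/2, yet the two unconditional chance claims suffice to fix the credence in Ac given Bc, with no appeal to a conditional chance.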

Competing Chance Claims
We now turn to cases where two chance claims may compete for direct inference priority. This can only happen when two chance claims about the same outcome attribute have "overlapping reference classes," i.e. when some chance systems can be in two distinct initial chance states, R and S, at the same time, and where both initial chance states provide chances for the same outcome attribute, A. Bayesian direct inference runs into some trouble in trying to accommodate this situation. We'll suggest some ways that the Bayesian account may deal with these troubles.
Let P[Ac | ch(Ax, Rx) = r · Rc · E] = r be a perfectly good direct inference (for admissible E). Then, presumably, P[Ac | ch(Ax, Rx) = r · Rc · ch(Ax, Sx) = s · E] = r, where s ≠ r, should also be a perfectly good direct inference. The addition of some chance claim ch(Ax, Sx) = s should not be problematic for such straightforward direct inferences. Otherwise, extended chance hypotheses, involving multiple chance claims, would be unable to ground direct inferences. Now, the usual way to raise "multiple reference class problems" for direct inference goes like this. Suppose we add Sc as a premise to this direct inference. This clearly must defeat the direct inference, since we have two equally good but incompatible direct inferences: P[Ac | ch(Ax, Rx) = r · Rc · ch(Ax, Sx) = s · Sc · E] = r ≠ s = P[Ac | ch(Ax, Rx) = r · Rc · ch(Ax, Sx) = s · Sc · E]. Thus, we must have P[Ac | ch(Ax, Rx) = r · Rc · ch(Ax, Sx) = s · Sc · E] ≠ r (or ≠ s). 31 What happens when ¬Sc, instead of Sc, is added as a premise? Since Sc defeats the direct inference, its negation must also defeat it (see Theorem 2), so: P[Ac | ch(Ax, Rx) = r · Rc · ch(Ax, Sx) = s · ¬Sc · E] ≠ r. Now, on the usual story, this kind of defeat may be averted when state Sx is a sub-state of Rx, that is, when every possible system in state Sx must also be in state Rx. We may express this as ∀x(Sx ⊃ Rx) if the quantifier is taken to range over all possible systems, or modally as □∀x(Sx ⊃ Rx) when the quantifier is more restricted. The sub-state claim can then be expressed by adding this statement to the premise of the direct inference. However, for our purposes the same idea can be expressed by replacing the chance claim ch(Ax, Sx) = s in the above example with the claim ch(Ax, Rx · Sx) = s. With this replacement, the following should be a perfectly good direct inference: P[Ac | ch(Ax, Rx) = r · Rc · ch(Ax, Rx · Sx) = s · Sc · E] = s. 32 That's the usual idea. But it presents problems in a Bayesian context. Here is why.
Let H be (ch(Ax, Rx) = r · ch(Ax, Rx · Sx) = s). Let's suppose (as seems reasonable) that s can be quite far away from r. 33 Consider the following equation, which follows from the axioms of probability theory, assuming that 0 < P[H · Rc · Sc · E] < 1 and 0 < P[H · Rc · ¬Sc · E] < 1:

P[Ac | H · Rc · E] = P[Ac | H · Rc · Sc · E] × P[Sc | H · Rc · E] + P[Ac | H · Rc · ¬Sc · E] × (1 − P[Sc | H · Rc · E]),

where P[Ac | H · Rc · E] = r and P[Ac | H · Rc · Sc · E] = s, provided that (Rc · E) and (Rc · Sc · E), respectively, are admissible for these two direct inferences. However, in the normal course of events an agent's total evidence may push the value of her credence, P[Sc | H · Rc · E], close to 1. When that happens, the value of P[Ac | H · Rc · E] must approach s. 34 This contradicts the supposition that P[Ac | ch(Ax, Rx) = r · Rc · ch(Ax, Rx · Sx) = s · E] equals r, the value the direct inference should apparently have.
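A small numerical sketch may make the drift toward s concrete. It uses only the law of total probability, P[Ac | H · Rc · E] = P[Ac | H · Rc · Sc · E] × P[Sc | H · Rc · E] + P[Ac | H · Rc · ¬Sc · E] × (1 − P[Sc | H · Rc · E]), with the first factor set to s. The particular numbers for r, s, and q (our label for P[Ac | H · Rc · ¬Sc · E]) are hypothetical and serve only as an illustration.

```python
def credence_in_A(p_S, s, q):
    """Law of total probability for P[Ac | H·Rc·E].

    p_S = P[Sc | H·Rc·E]
    s   = P[Ac | H·Rc·Sc·E]   (direct inference from ch(Ax, Rx·Sx) = s)
    q   = P[Ac | H·Rc·¬Sc·E]  (no direct inference value assumed)
    """
    return s * p_S + q * (1.0 - p_S)

# Hypothetical values: r is the value the direct inference from
# ch(Ax, Rx) = r would recommend; s is far from r.
r, s, q = 0.2, 0.8, 0.1

for p_S in (0.5, 0.9, 0.99, 0.999):
    value = credence_in_A(p_S, s, q)
    print(f"P[Sc]={p_S:<6} P[Ac]={value:.4f}  gap from r={abs(value - r):.4f}")
```

As the credence for Sc approaches 1, the computed credence for Ac approaches s whatever value q takes, so it cannot remain at r when s is far from r.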
Notice that this analysis doesn't really depend on whether E itself provides evidence for or against Sc. Even in cases where the evidence E says nothing about Sc, the value an agent assigns to P[Sc | H · Rc · E] (perhaps only due to her gut feeling) may force her credence value for Ac to significantly depart from the direct inference value based on ch(Ax, Rx) = r.

31 The credence value might happen to be r or s, but not due to direct inference. If the credence value is r, then the following argument should be run with Rc and Sc exchanged throughout. So, without loss of generality, we assume the credence value is not equal to r.
32 ¬Sc must still be a defeater: P[Ac | ch(Ax, Rx) = r · Rc · ch(Ax, Rx · Sx) = s · ¬Sc · E] ≠ r.
One Bayesian response to this problem is to restrict the agent's possible credence values for Sc so as not to permit the defeat of the direct inference unless E contains explicit evidence for or against Sc. Direct inference restricts other credence values, including the value for Sc. That should not be at all surprising. Any axiom or constraint added to the usual axioms for conditional probabilities is bound to result in the propagation of constraints on credence values throughout the system. Given the way that the credence value for Sc depends on the recommended direct inference value for Ac, one may simply maintain that the direct inference rule provides a kind of objectivist Bayesian constraint on what credence values Sc may take. 35 However, Bayesians are also free to reject this kind of constraint on credence values for Sc, provided they can find some other way to accommodate the above analysis. For instance, they may adopt a more straightforward response to this problem: simply void (or invalidate) the direct inference P[Ac | H · Rc · E] = r in all cases where H contains a chance claim, ch(Ax, Rx · Sx) = s, based on a more specific initial chance state than Rx. This may be a more coherent view than the "objectivist" approach described above. For, clearly in some cases the credence for Sc may be near 1 based on good evidence, stated within E. In such cases the agent's credence for Ac should be close to s rather than r. But then, precisely how much evidence, and of what kind, must occur within E to warrant a value of P[Sc | H · Rc · E] that can break the direct inference based on ch(Ax, Rx) = r? Rather than try to parse this tricky issue (which may have no clear solution), it may make better sense to simply let the presence of the more specific chance claim override the weaker chance claim, as the above analysis seemed to initially suggest.
One of the authors (Wallmann) takes the overall thrust of this analysis to show that Bayesian direct inference cannot work properly, and that it should be rejected in favor of some more lenient, more intuitively plausible account of direct inference. The idea that a direct inference based on (ch(Ax, Rx) = r · Rc) should be defeated simply by the presence of some additional chance claim that draws on a more specific chance state, ch(Ax, Rx · Sx) = s, absent an assertion of the applicability of that chance claim, (Rc · Sc), just seems too implausible. The other author finds the above Bayesian response both acceptable and reasonable, although he finds it somewhat surprising that the Bayesian account of direct inference leads to this view.
A further move in the spirit of the "straightforward approach" suggested above is a Bayesian approach that rules out the very possibility of overlapping initial chance states that have outcome attributes in common. 36 This chance-state overlap restriction has an important precedent. Our best indeterministic scientific theory, quantum theory, does not draw on overlapping initial quantum states. Each quantum system is in precisely one basic quantum state at any given time, and that state completely accounts for chances of quantum outcomes (upon system collapse, or upon "measurement"). To make good on this view, we need an account of how the usual kinds of chance models of macro-systems can be accommodated within the Bayesian direct inference framework without drawing on overlapping initial states that have outcome attributes in common.

35 Here is one reason a Bayesian may want to reject this particular "objectivist" approach. Suppose that along with ch(Ax, Rx) = r, H contains both ch(Ax, Rx · Sx) = s and ch(Ax, Rx · ¬Sx) = t. Then the objectivist commitment to P[Ac | H · Rc · E] = r implies that the credence value for Sc is fixed once and for all. For, it follows from the direct inferences with values r, s, and t, that P[Sc | H · Rc · E] = (r − t)/(s − t). So, no amount of evidence E can change this value for P[Sc | H · Rc · E], unless E breaks one of the direct inferences by being inadmissible for it.
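The fixed value stated in footnote 35 can be checked with a short sketch. If the three direct inferences hold with values r, s, and t, then the law of total probability gives r = s × p + t × (1 − p), where p stands for P[Sc | H · Rc · E], and solving for p yields (r − t)/(s − t). The numerical values below are hypothetical, and the function name is ours.

```python
from fractions import Fraction

def fixed_credence_in_S(r, s, t):
    """Solve r = s*p + t*(1 - p) for p = P[Sc | H·Rc·E]."""
    if s == t:
        raise ValueError("s and t must differ for p to be determined")
    return (r - t) / (s - t)

# Hypothetical chance values for ch(Ax, Rx), ch(Ax, Rx·Sx), ch(Ax, Rx·¬Sx).
r, s, t = Fraction(3, 10), Fraction(3, 5), Fraction(1, 5)

p = fixed_credence_in_S(r, s, t)
print(p)

# Consistency check: plugging p back in recovers r.
recovered_r = s * p + t * (1 - p)
print(recovered_r == r)
```

Since r, s, and t are fixed by H, the resulting p does not depend on E at all, which is the point of the footnote.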
When a chance hypothesis asserts a chance of Ax (dying by age 75) for systems in chance state Rx (male in good health at age 50), the applicability of the chance claim, ch(Ax, Rx) = r, is of little import if it fails to account for important risk factors. For instance, if it hasn't taken into account whether (and how much) an individual smokes, Sx, then it doesn't tell you much of anything about anyone's individual chances. So, perhaps ch(Ax, Rx · Sx) = s is the more relevant chance claim for Chuck. And if state Sx is relevant, so is state ¬Sx, which yields some chance claim ch(Ax, Rx · ¬Sx) = t. Indeed, the amount an individual smokes is relevant, so instead of Sx and ¬Sx, perhaps a range of alternatives, describing amount smoked, and for how many years, is in order: ch(Ax, Rx · Sⱼx) = sⱼ for a range of categories Sⱼx. So, supposing Chuck is a 50-year-old male in good health who has never smoked, does ch(Ax, Rx · S₀x) = s₀ capture his chances of dying by age 75? How much does Chuck drink? Is he engaged in a particularly hazardous occupation? The point is that Chuck's chances depend on the most specific relevant chance state to which he belongs, according to the most specific, accurate chance hypothesis we can develop (and evidentially support) about people in various initial states of health. Anything less is at best an approximation of Chuck's real chances. 37 A Bayesian approach that excludes overlapping initial chance states will need to draw on hypotheses about approximate chance models, where these chance models rely on most basic initial chance states, that is, chance states that are most basic according to the model. Associated with any given chance model is a chance hypothesis that asserts that the model fits the real world to some specified degree of approximation. Fitting the world means capturing the most significant causal factors and their associated chances for producing various kinds of outcomes.
Evidence for such hypotheses confirms those that do the best job of capturing the most significant causal factors. Such approximations of chance mechanisms are the best we can hope for within the special sciences. So, the fact that a Bayesian approach to direct inference needs to draw on hypotheses about chance models for macroscopic systems (and the basic initial chance states posited by such models) is no defect. Any theory of direct inference, Bayesian or not, will need to accommodate hypotheses about approximate chance models, since that's the best the special sciences can offer. And each such model will have chance states that are most basic for that model.

36 A Bayesian account of objective chance relies on a collection of axioms for the theory of chance. All credence functions "appropriate" for direct inference should give these axioms credence value 1. These include axioms that make the function ch( , Rx), for each initial state R, satisfy the axioms of probability theory, as described in an earlier footnote. One way to get the theory of chance to rule out overlapping initial chance states is to add the following axiom schema: (∃u ch(Ax, Rx) = u · ∃v ch(Ax, Sx) = v) ⊃ ¬∃x(Rx · Sx), where u and v are variables restricted to real numbers, and where ∃u ch(Ax, Rx) = u is a way to express the claim that Rx is an initial chance state (for at least one attribute Ax).
37 Fetzer (1982), for instance, argues for a view that relativizes single-case chances to all causally relevant factors.
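The idea of a chance model whose most basic initial states do not overlap can be sketched in code. Everything here is a hypothetical illustration: the smoking categories, the chance values, and the helper names are ours, not drawn from any actual model. Checking a finite list of individuals is also only a rough stand-in for the non-overlap axiom schema, which quantifies over all systems.

```python
from itertools import combinations

# Hypothetical approximate chance model: each basic initial state is a
# predicate on an individual's description, paired with an illustrative
# chance of the outcome attribute A (dying by age 75).
basic_states = {
    "R·S0": (lambda x: x["healthy_male_50"] and x["packs_per_day"] == 0, 0.15),
    "R·S1": (lambda x: x["healthy_male_50"] and 0 < x["packs_per_day"] <= 1, 0.30),
    "R·S2": (lambda x: x["healthy_male_50"] and x["packs_per_day"] > 1, 0.45),
}

def states_of(x):
    """Names of all basic states that individual x is in."""
    return [name for name, (pred, _) in basic_states.items() if pred(x)]

def overlapping_pairs(population):
    """Pairs of basic states jointly satisfied by some individual."""
    bad = set()
    for x in population:
        for a, b in combinations(states_of(x), 2):
            bad.add((a, b))
    return bad

def chance_of_A(x):
    """Chance of A from the unique basic state x is in, per the model."""
    names = states_of(x)
    if len(names) != 1:
        raise ValueError(f"expected exactly one basic state, found {names}")
    return basic_states[names[0]][1]

chuck = {"healthy_male_50": True, "packs_per_day": 0}
population = [
    chuck,
    {"healthy_male_50": True, "packs_per_day": 0.5},
    {"healthy_male_50": True, "packs_per_day": 2},
]

print(overlapping_pairs(population))
print(chance_of_A(chuck))
```

On this design no chance is attached to Rx alone, so no coarser chance claim is available to compete with the one drawn from the basic state.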

Conclusion
In this paper we've identified a variety of different kinds of statements that are logically inadmissible for Bayesian direct inference. Such statements must defeat direct inferences on any coherent Bayesian account. In particular, whenever such information is available to the Bayesian agent, it supplies credence values for chance outcomes that significantly depart from the fair betting odds represented by objective chance statements. One of the authors (Wallmann) finds these results so counter-intuitive that he advocates giving up Bayesian direct inference. 38 He favors some alternative account on which direct inferences remain intact when faced with such information. The other author thinks that whenever an agent is in possession of such information, those deviations from objective chance values required by the Bayesian account make good sense. We agree that the Bayesian account places severe constraints on the theory of chance. Whether the costs imposed by these constraints are paid for by the avowed Bayesian benefits remains unresolved, for now.

Proof of Theorem 8
Proof Assume all the antecedent conditions for the theorem.