Foundationalism, Probability, and Mutual Support

  • Original article
Erkenntnis

Abstract

The phenomenon of mutual support presents a specific challenge to the foundationalist epistemologist: Is it possible to model mutual support accurately without using circles of evidential support? We argue that the appearance of loops of support arises from a failure to distinguish different synchronic lines of evidential force. The ban on loops should be clarified to exclude loops within any such line, and basing should be understood as taking place within lines of evidence. Uncertain propositions involved in mutual support relations are conduits to each other of independent evidence originating ultimately in the foundations. We examine several putative examples of benign loops of support and show that, given the distinctions noted, they can be accurately modeled in a foundationalist fashion. We define an evidential “tangle,” a relation among three propositions that appears to require a loop for modeling, and prove that all such tangles are trivial in a sense that precludes modeling them with an evidential circle.


Notes

  1. Rigidity requires that the conditional probabilities (in this case P_old(H2|H1) and P_old(H2|∼H1)) must be the same in the new distribution. When this condition is satisfied, the Jeffrey formula is simply a special case of the Theorem on Total Probability. It has been notoriously hard to specify conditions in which the rigidity conditions are satisfied, but this is because of Jeffrey’s own desire to have the change from the old to the new distribution arise as a probabilistic surd from experience without the addition of new certain evidence. When the shift is induced by the acquisition of a new certainty, as in the cases where we wish to use it, screening of the new certain evidence by both the assertion and the negation of H1 guarantees the relevant rigidity conditions.

  2. It has been pointed out to us by a reviewer for Erkenntnis that there is a parallel to what we are doing (when screening conditions are fulfilled) in a formula for what is known as “convolution.”

  3. More generally, we should call these “lines of evidence,” since evidence can be either positively or negatively relevant.

  4. We discuss this issue at greater length in McGrew and McGrew 2007. See pp. 72–73 and the various discussions of Johnny Wideawake, pp. 76–77, 94, 121.

  5. This condition blocks the swelling of a basis by the addition of irrelevant sub-components. If, in the context, C is quite irrelevant to H, then there is no point in tacking C or its negation onto the end of the propositions that constitute the basis, even if the resulting partition would also screen H from the given evidence.

  6. Jim Hawthorne, personal communication.

  7. Hawthorne uses e to stand for an observation or experience, which, thus described, need not be propositional in nature. Our own position is that foundational evidence should be thought of as propositional and certain and as justified by experience in virtue of the fact that we have direct referential access to our own experiences (McGrew 1995).

  8. It will, of course, be pertinent to P(H2), but only in the sense that we can calculate P(H2) in terms of P(H1) and the relevant conditional probabilities. But the question of how H1 supports H2 is different and more interesting.

  9. The qualification “at time t” is important, though the reasons are too complex for detailed treatment here. According to the strong foundationalist model one is gaining and losing certainties across time. Therefore, a full treatment of the dynamics of belief will have to take account of the fact that probabilities of 0 and 1 are not, as in the standard Bayesian picture, immune to revision. But changes in foundational beliefs will not be driven by an updating rule.

  10. This is the way that Bayesian networks operate, updating probabilities acyclically throughout a distribution from new values given at the bottom nodes. While a Bayes net is not the same thing as an evidence tree, working with Bayes nets can be useful heuristically in avoiding mistakes in building evidence trees and in seeing how evidential force is propagated. An understanding of Bayes nets can also help to clarify the fact that conditionalization can never be done twice, thus driving home the impossibility of a real and probabilistically significant “loop of support”.

  11. We want to distinguish the notion of basing from the notion of confirmation. This is important because we want to discuss negative as well as positive relevance. But there is a more complicated reason for maintaining the distinction. Tomoji Shogenji (2003) has proven that, when ±Y screens off X from Z, and the inequalities P(Y|X) > P(Y) and P(Z|Y) > P(Z) both hold, then P(Z|X) > P(Z). In the case where a given partition acts as a basis for e with respect to H, the items in the partition that take the place of ±Y might be complex, and although there are several useful extensions of Shogenji’s theorem for such cases, they all require some additional constraints for the transitivity of confirmation to hold with regard to any particular member of the partition.

  12. We owe this objection to an anonymous reviewer for Erkenntnis.

  13. This is almost certainly a case where a detailed analysis of the situation requires a complex node. We cannot say that the negation of “4 across is ‘ruby’” screens the impact of the clue from the proposition “2 down is ‘irate’,” as 4 across might be, for example, ‘opal’ or ‘jade’. The clue raises the probability that 4 across is one of those words (with ‘ruby’ being the best fit for the clue), and if we were given that 4 across is not ‘ruby’, we would still have to consider the negative impact of ‘opal’ or ‘jade’ upon the proposition that 2 down is ‘irate’. Hence, it appears that what screens the evidence of the clue from the claim that 2 down is ‘irate’ will have to be a partition consisting of, e.g., “4 across is ‘ruby’ (hence not ‘jade’ or ‘opal’),” “4 across is ‘jade’ (hence not ‘ruby’ or ‘opal’),” and so forth. In other words, the node in question will have to take account of all the four-letter words from the domain of possibilities that are compatible with the clue. But this technical point does not change the fact that a node consisting of statements about the specific word represented by 4 across channels the independent impact of the clue indirectly to various possibilities for 2 down.

  14. Personal communication. Peter subsequently presented the example in commentary on our papers at the Formal Epistemology Workshop at Berkeley, May 2006.

  15. This set of conditions, screening with relevance, is known as “strong screening.” The example provides a counterexample to the conjecture that strong screening is transitive: though ±Y strongly screens X from Z and ±Z strongly screens Y from W, it is not the case that ±Z strongly screens X from W; in fact ±Z does not screen X from W.

  16. The posteriors P(W|±Z) are not rigid from the old distribution (before F was given) to the new one. See note 1.

  17. It is tempting to think that there must be some function on Y, W such that a node could be drawn through a partition on that function – e.g. (Y & W) – rather than through Y and W separately. This, however, is not the case. In fact, none of the possible combinations (Y & W), (Y & ∼W), etc., screens X from Z. Modulo any of these combinations, X affects Z directly. This curious fact is semantically explicable, but the explanation would be tedious to spell out.
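
Note 1's claim that, under rigidity, the Jeffrey formula is a special case of the Theorem on Total Probability can be checked numerically. The following is a minimal sketch and is not taken from the article: the function names and the numbers in the usage note are illustrative assumptions, and e stands for a newly acquired certainty that ±H1 screens from H2.

```python
import math

def jeffrey_update(p_h2_given_h1, p_h2_given_not_h1, p_new_h1):
    """Jeffrey formula for H2 over the partition {H1, ~H1}.

    Assumes rigidity: the conditional probabilities of H2 given H1 and
    given ~H1 are the same in the old and the new distribution.
    """
    return p_h2_given_h1 * p_new_h1 + p_h2_given_not_h1 * (1.0 - p_new_h1)

def new_probability_of_h1(p_h1, p_e_given_h1, p_e_given_not_h1):
    """P(H1 | e) by Bayes's theorem: the new probability of H1 once e is certain."""
    joint_e_h1 = p_e_given_h1 * p_h1
    joint_e_not_h1 = p_e_given_not_h1 * (1.0 - p_h1)
    return joint_e_h1 / (joint_e_h1 + joint_e_not_h1)

def posterior_by_conditioning(p_h1, p_e_given_h1, p_e_given_not_h1,
                              p_h2_given_h1, p_h2_given_not_h1):
    """P(H2 | e) computed directly from the joint distribution.

    Assumes screening: P(H2 | e & H1) = P(H2 | H1) and
    P(H2 | e & ~H1) = P(H2 | ~H1).
    """
    joint_e_h1 = p_e_given_h1 * p_h1
    joint_e_not_h1 = p_e_given_not_h1 * (1.0 - p_h1)
    p_e = joint_e_h1 + joint_e_not_h1
    p_e_and_h2 = joint_e_h1 * p_h2_given_h1 + joint_e_not_h1 * p_h2_given_not_h1
    return p_e_and_h2 / p_e
```

With, say, P(H1) = 0.3, P(e|H1) = 0.9, P(e|∼H1) = 0.2, P(H2|H1) = 0.7 and P(H2|∼H1) = 0.1, passing `new_probability_of_h1(0.3, 0.9, 0.2)` to `jeffrey_update(0.7, 0.1, ...)` should agree with `posterior_by_conditioning(0.3, 0.9, 0.2, 0.7, 0.1)` up to rounding, because screening by ±H1 is what makes the conditional probabilities rigid.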

References

  • Dancy, J. (2003). A defense of coherentism. In L. Pojman (Ed.), The theory of knowledge: Classical and contemporary readings (pp. 206–215). Belmont, CA: Wadsworth.

  • Haack, S. (1993). Evidence and inquiry. Oxford: Blackwell.

  • Hawthorne, J. (2004). Three models of sequential belief updating on uncertain evidence. Journal of Philosophical Logic, 33, 89–123.

  • Jeffrey, R. (1990). The logic of decision. Chicago: University of Chicago Press.

  • Jeffrey, R. (1992). Conditioning, kinematics, and exchangeability. In Probability and the art of judgment (pp. 117–153). Cambridge: Cambridge University Press.

  • McGrew, T. (1995). The foundations of knowledge. Lanham, MD: Littlefield Adams Books.

  • McGrew, T. (1999). How foundationalists do crossword puzzles. Philosophical Studies, 96, 333–350.

  • McGrew, T., & McGrew, L. (2000). Foundationalism, transitivity, and confirmation. Journal of Philosophical Research, 25, 47–66.

  • McGrew, T., & McGrew, L. (2007). Internalism and epistemology: The architecture of reason. London: Routledge.

  • Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Mateo, CA: Morgan Kaufmann Publishers.

  • Post, J. (1996). Epistemology. In D. Weissman (Ed.), Discourse on the method and meditations on first philosophy (pp. 236–271). New Haven: Yale University Press.

  • Post, J., & Turner, D. (2000). Sic transitivity: Reply to McGrew and McGrew. Journal of Philosophical Research, 25, 67–82.

  • Shogenji, T. (2003). A condition for transitivity in probabilistic support. British Journal for the Philosophy of Science, 54, 613–616.

Acknowledgements

We wish to thank Jim Hawthorne for patient help with several technical points and for consistent encouragement during our work on this paper, Carl Wagner for help on technical points and for checking over the appendix, Peter Vranas for challenging us with his attempted counterexample and discussions of it, and Mike Titelbaum for several helpful discussions. We are also grateful to the participants at FEW 2006, where we presented an earlier version of the paper, and to the referees for Erkenntnis, whose thoughtful comments helped us to improve the paper.

Author information

Corresponding author

Correspondence to Lydia McGrew.

Appendix: All Tangles are Trivial

1.1 Part I: Proof that all Tangles are Trivial

1.1.1 Definitions

Tangle: A set of propositions {A, B, C}, all with regular (non-extremal) probabilities, constitutes a tangle iff:

  1*. A is relevant to both B and C,

  2*. ±B screens A from C, and

  3*. ±C screens A from B.

The tangle is trivial iff:

  4*. P(C|B) = 0 (equivalently, P(B|C) = 0) or P(C|B) = P(B|C) = 1.

Triviality, then, arises when the propositions B and C exclude each other (in the sense that the probability of each given the other is 0) or each entails the other (in the sense that the probability of each given the other is 1).
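
The definitions can be made concrete with a small checker. This is a minimal sketch under stated assumptions, not part of the article: a joint distribution is represented as a dictionary from worlds (a, b, c), the truth values of A, B and C, to probabilities; all function names are hypothetical; and a screening condition whose conditional probability is undefined is treated as failing.

```python
import math
from itertools import product

def prob(joint, event):
    """P(event), where event is a predicate on worlds (a, b, c)."""
    return sum(p for world, p in joint.items() if event(*world))

def cond(joint, event, given):
    """P(event | given), or None when P(given) = 0."""
    p_given = prob(joint, given)
    if p_given == 0:
        return None
    return prob(joint, lambda a, b, c: event(a, b, c) and given(a, b, c)) / p_given

def equal(x, y):
    """Numerical equality; an undefined (None) value never counts as equal."""
    return x is not None and y is not None and math.isclose(x, y, abs_tol=1e-12)

def A(a, b, c): return a
def B(a, b, c): return b
def C(a, b, c): return c

def relevant(joint, x, z):
    """x is relevant to z: P(z | x) differs from P(z)."""
    return not equal(cond(joint, z, x), prob(joint, z))

def screens(joint, mid, x, z):
    """±mid screens x from z: P(z | x & ±mid) = P(z | ±mid) for both cells."""
    for truth in (True, False):
        def cell(a, b, c, truth=truth):
            return mid(a, b, c) == truth
        def x_and_cell(a, b, c, cell=cell):
            return x(a, b, c) and cell(a, b, c)
        if not equal(cond(joint, z, x_and_cell), cond(joint, z, cell)):
            return False
    return True

def is_tangle(joint):
    """Conditions 1*-3*, plus regularity of P(A), P(B) and P(C)."""
    regular = all(0 < prob(joint, v) < 1 for v in (A, B, C))
    return (regular
            and relevant(joint, A, B) and relevant(joint, A, C)  # 1*
            and screens(joint, B, A, C)                          # 2*
            and screens(joint, C, A, B))                         # 3*

def is_trivial(joint):
    """Condition 4*: B and C exclude each other, or each entails the other."""
    c_given_b, b_given_c = cond(joint, C, B), cond(joint, B, C)
    return equal(c_given_b, 0.0) or (equal(c_given_b, 1.0) and equal(b_given_c, 1.0))

# Illustrative distribution in which B and C always agree.
COEXTENSIVE = {(True, True, True): 0.4, (True, False, False): 0.1,
               (False, True, True): 0.1, (False, False, False): 0.4}

def chain_joint():
    """Illustrative chain A -> B -> C with noise at each step."""
    joint = {}
    for a, b, c in product((True, False), repeat=3):
        p_b = 0.8 if a else 0.2
        p_c = 0.9 if b else 0.3
        joint[(a, b, c)] = 0.5 * (p_b if b else 1 - p_b) * (p_c if c else 1 - p_c)
    return joint
```

On these illustrative inputs, `COEXTENSIVE` satisfies 1*–3* and is trivial in the sense of 4*, whereas the noisy chain is not a tangle: ±B screens A from C, but ±C does not screen A from B.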

1.1.2 Conditions

Let {A, B, C} constitute a tangle. Then:

   

Suppose, further, that the tangle is non-trivial. Then, without loss of generality:

  7. 0 < P(C|B) < 1
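
The displayed conditions 1–6 do not appear at this point. One reading that fits definitions 1*–3* and the way conditions 1 and 3 are cited in the lemmas below is sketched here. The numbering of 1–3 follows those citations; the assignment of the numbers 4–6 to particular screening equalities is an assumption.

```latex
\begin{align*}
1.\quad & P(B \mid A) \neq P(B)                    && \text{($A$ is relevant to $B$)} \\
2.\quad & P(C \mid A) \neq P(C)                    && \text{($A$ is relevant to $C$)} \\
3.\quad & P(C \mid A \wedge B) = P(C \mid B)       && \text{($\pm B$ screens $A$ from $C$)} \\
4.\quad & P(C \mid A \wedge \neg B) = P(C \mid \neg B) \\
5.\quad & P(B \mid A \wedge C) = P(B \mid C)       && \text{($\pm C$ screens $A$ from $B$)} \\
6.\quad & P(B \mid A \wedge \neg C) = P(B \mid \neg C)
\end{align*}
```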

To simplify notation, let P* be a probability distribution arrived at from P by simple conditioning on A, so that P*(−) = P(−|A). Then we can rewrite these seven conditions thus:

   

Lemma 1

P*(B), P*(C) > 0.

P(B) > 0, by the definition of a regular probability function. Suppose P*(B) = 0. Then P*(C|B) is undefined. But since P(B) > 0, P(C|B) is defined. But by 3, P*(C|B) = P(C|B). Hence both conditional probabilities must be well defined. Therefore, P*(B) > 0. Exactly similar reasoning establishes that P*(C) > 0.

From 1, without loss of generality, P*(B) = kP(B), for some k ≠ 1; that is, P*(B)/P(B) = k. Since P*(B) > 0 by Lemma 1, k > 0.

Lemma 2

P*(B&C)/P(B&C) = k.

1.1.3 Proof of Lemma 2

   

Lemma 3

P*(C)/P(C) = k.

1.1.4 Proof of Lemma 3

   

Lemma 4

P*(B&∼C)/P(B&∼C) = k.

1.1.5 Proof of Lemma 4

   

Lemma 5

P*(∼C)/P(∼C) ≠ k.

1.1.6 Proof of Lemma 5

We know that k ≠ 1 [from 1]; therefore either k > 1 or k < 1. Suppose k > 1. Then, by Lemma 3, P*(C) > P(C); but in that case P*(∼C) < P(∼C), so P*(∼C)/P(∼C) < 1, in which case P*(∼C)/P(∼C) ≠ k.

Suppose k < 1. Then, by Lemma 3, P*(C) < P(C); but in that case P*(∼C) > P(∼C), so P*(∼C)/P(∼C) > 1, in which case P*(∼C)/P(∼C) ≠ k. Therefore, P*(∼C)/P(∼C) ≠ k.

Theorem

Conditions 1–7 are not simultaneously satisfiable.

1.1.7 Proof of Theorem

   

This contradicts the assumption. Hence the conditions are not simultaneously satisfiable.

Hence, all tangles are trivial. QED
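
The displayed steps of Lemmas 2–4 and of the Theorem are not reproduced above. One way to fill them in is sketched below; it is offered as a reconstruction, not as the authors' own derivation. It uses only the screening equalities in P* form, namely P*(C|B) = P(C|B), P*(B|C) = P(B|C) and P*(B|∼C) = P(B|∼C), together with P*(B) = kP(B) and condition 7, which guarantees P(B&C) > 0 and P(B&∼C) > 0. The conditional probabilities involved are assumed to be well defined, by the same reasoning as in Lemma 1.

```latex
\begin{align*}
P^*(B \wedge C) &= P^*(C \mid B)\,P^*(B) = P(C \mid B)\,k\,P(B) = k\,P(B \wedge C)
  && \text{(Lemma 2)} \\
P^*(C) &= \frac{P^*(B \wedge C)}{P^*(B \mid C)} = \frac{k\,P(B \wedge C)}{P(B \mid C)} = k\,P(C)
  && \text{(Lemma 3)} \\
P^*(B \wedge \neg C) &= P^*(B) - P^*(B \wedge C) = k\,P(B) - k\,P(B \wedge C) = k\,P(B \wedge \neg C)
  && \text{(Lemma 4)} \\
P^*(\neg C) &= \frac{P^*(B \wedge \neg C)}{P^*(B \mid \neg C)} = \frac{k\,P(B \wedge \neg C)}{P(B \mid \neg C)} = k\,P(\neg C)
  && \text{(Theorem)}
\end{align*}
```

The last line gives P*(∼C)/P(∼C) = k, which is exactly what Lemma 5 rules out.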

1.2 Part II: On the Triviality of Tangles Under Exclusion

Recall that, by 4*, a tangle is trivial iff:

  • P(C|B) = 0 (equivalently, P(B|C) = 0) or P(C|B) = P(B|C) = 1.

The proof below shows that if the tangle is trivial in the sense that it exemplifies the exclusion condition, P(B|C) = 0, it follows that P(∼B & ∼C) = 0 as well, so the distinctions between B and ∼C, on the one hand, and between C and ∼B, on the other, collapse.

Lemma 1

If P(B|C) = 0, P(B&C) = 0.

1.2.1 Proof

   

Note that a comparable lemma holds for P(C|B) = 0; then, also, P(B&C) = 0.

Theorem

Under conditions 1–6 plus the exclusion condition, P(∼B&∼C) = 0. See Fig. 7.

This figure represents the remaining probability space under the initial distribution P since it has been shown by Lemma 1 that P(B&C) = 0. The variables a, b, and c represent P(∼B & ∼C), P(B), and P(C) respectively. P(∼B) = a + c; P(∼C) = a + b.

In the following chart, all twelve logically possible combinations of changes in a, b, and c in the shift from P to P* are shown, assuming that a > 0. Each variable either goes up (u), goes down (d), or remains the same (s). We know from conditions 1 and 2 that b and c must change, so we can dispense with the third option (s) for them. See Table 1.

Fig. 7 Remaining probability space in P under the exclusion condition

Table 1 Impossibility of a > 0

Since every possible combination is ruled out by the conditions given, the assumption that a > 0 is false. But a represents a probability, namely P(∼B & ∼C), and therefore it cannot be negative. Therefore, a = 0. So P(∼B & ∼C) = 0. QED.
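
The body of Table 1 is not reproduced above. The case analysis it summarizes can be sketched as follows, on the assumption that, once P(B&C) = 0, screening by ±B keeps c/(a + c) fixed and screening by ±C keeps b/(a + b) fixed. With a > 0, each of these ratios stays fixed only if the variable in its numerator moves in the same direction as a, and that is incompatible with a + b + c = 1 both before and after the shift. The function names and the wording of the reasons are illustrative, not the authors'.

```python
from itertools import product

def violated_constraint(da, db, dc):
    """Name a constraint that the direction pattern violates, or None.

    da, db, dc are the directions of change of a, b and c in the shift
    from P to P*: "u" (up), "d" (down) or "s" (same). Assumes a > 0.
    """
    if dc != da:
        return "screening by ±B: c/(a + c) stays fixed, so c must move with a"
    if db != da:
        return "screening by ±C: b/(a + b) stays fixed, so b must move with a"
    if da in ("u", "d"):
        return "normalization: a, b and c cannot all rise or all fall"
    return None

def all_patterns():
    """The twelve patterns: a may stay the same, but b and c must change."""
    return list(product("uds", "ud", "ud"))

def impossibility_table():
    """Map each of the twelve patterns to the constraint it violates."""
    return {pattern: violated_constraint(*pattern) for pattern in all_patterns()}

if __name__ == "__main__":
    for pattern, reason in impossibility_table().items():
        print(pattern, "->", reason)
```

Every one of the twelve patterns receives a reason, which matches the conclusion that a > 0 cannot hold.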

1.3 Part III: Triviality Explained

Consider the triviality condition under which P(B|C) = P(C|B) = 1. Under this condition, P(C & ∼B) = 0 and P(∼C & B) = 0. Therefore, {B & C, ∼B & ∼C} is a partition of the probability space.

Therefore, this partition can be treated as a node of the evidence tree in itself. Since B and C are coextensive and ∼B and ∼C are coextensive, this node can be treated as a single variable: either simply as B, ∼B or simply as C, ∼C. The nodes for B and for C have collapsed into a single node. Thus, evidence A that influences both B and C, where the relevant screening and relevance conditions hold to create a “tangle,” influences both together, and a line in the evidence tree can be drawn from evidence A to this node alone. There is no need to try to draw arrows from B to C or vice versa to show the impact of A on both, and such arrows would have no meaning if one did draw them, as A is not influencing one by way of the other. So this tangle is trivial.

A simple semantic model of this sort of triviality would be a case where two people are co-owners of the same lottery ticket and where neither owns any other ticket. Hence, either both win or both lose, and the node can be thought of simply in terms of the winning or losing of either person (or the winning or losing of the ticket they both own).

Consider the other triviality condition under which P(B|C) = P(C|B) = 0. Under this condition, P(B & C) = 0 and (even more interestingly), modulo the “tangle” conditions, P(∼B & ∼C) = 0. Therefore, {B & ∼C, C & ∼B} is a partition of the probability space.

Therefore, this partition also can be treated as a single node of the evidence tree. And, since B and ∼C are coextensive and C and ∼B are coextensive, this node can be treated as a single variable, as above. Here, too, though in a different way, the nodes for B and for C have collapsed into a single node.

A simple example of this sort of triviality would be a case of a lottery with a guaranteed winner having only two people entered, each with a different ticket. They cannot both win and they cannot both lose. So, again, the node can be thought of in terms simply of the winning or losing of one ticket-holder.
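
The two lottery models can be put side by side in a short sketch. It is illustrative only: the probabilities are made-up numbers, the function name is hypothetical, and B and C are read as "the first person wins" and "the second person wins." The check is simply whether all of the probability lies on worlds where C agrees with B, or all of it on worlds where C disagrees with B; in either case one binary variable suffices for the node.

```python
def collapsed_node(joint_bc, tol=1e-12):
    """Classify a joint distribution over (B, C) truth values.

    Returns "same" if all probability lies on worlds where C agrees with B,
    "opposite" if it all lies on worlds where C disagrees with B, and None
    if neither holds, so that B and C need separate nodes.
    """
    mass_same = sum(p for (b, c), p in joint_bc.items() if b == c)
    mass_opposite = sum(p for (b, c), p in joint_bc.items() if b != c)
    if mass_opposite <= tol:
        return "same"
    if mass_same <= tol:
        return "opposite"
    return None

# Co-owners of a single ticket: both win or both lose.
SHARED_TICKET = {(True, True): 0.01, (False, False): 0.99}

# Two entrants, one guaranteed winner: exactly one of them wins.
TWO_ENTRANTS = {(True, False): 0.5, (False, True): 0.5}

# Contrast case: probabilistically independent wins, which do not collapse.
INDEPENDENT = {(True, True): 0.0001, (True, False): 0.0099,
               (False, True): 0.0099, (False, False): 0.9801}
```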

About this article

Cite this article

McGrew, L., McGrew, T. Foundationalism, Probability, and Mutual Support. Erkenn 68, 55–77 (2008). https://doi.org/10.1007/s10670-007-9062-1

  • DOI: https://doi.org/10.1007/s10670-007-9062-1
