Stability and Scepticism in the Modelling of Doxastic States: Probabilities and Plain Beliefs

Abstract

There are two prominent ways of formally modelling human belief. One is in terms of plain beliefs (yes-or-no beliefs, beliefs simpliciter), i.e., sets of propositions. The second one is in terms of degrees of beliefs, which are commonly taken to be representable by subjective probability functions. In relating these two ways of modelling human belief, the most natural idea is a thesis frequently attributed to John Locke: a proposition is or ought to be believed (accepted) just in case its subjective probability exceeds a contextually fixed probability threshold \(t<1\). This idea is known to have two serious drawbacks: first, it denies that beliefs are closed under conjunction, and second, it may easily lead to sets of beliefs that are logically inconsistent. In this paper I present two recent accounts of aligning plain belief with subjective probability: the Stability Theory of Leitgeb (Ann Pure Appl Log 164(12):1338–1389, 2013; Philos Rev 123(2):131–171, 2014; Proc Aristot Soc Suppl Vol 89(1):143–185, 2015a; The stability of belief: an essay on rationality and coherence. Oxford University Press, Oxford, 2015b) and the Probalogical Theory (or Tracking Theory) of Lin and Kelly (Synthese 186(2):531–575, 2012a; J Philos Log 41(6):957–981, 2012b). I argue that Leitgeb’s theory may be too sceptical for the purposes of real life.
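
The two drawbacks of the Lockean thesis can be made concrete with the familiar lottery case. The following Python sketch is only an illustration: the ten-ticket fair lottery, the threshold 0.85 and the function name `lockean_beliefs` are assumptions chosen for the example and are not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations

def lockean_beliefs(prob, t):
    """All propositions (frozensets of possibilities) with probability above t."""
    worlds = list(prob)
    beliefs = []
    for k in range(len(worlds) + 1):
        for combo in combinations(worlds, k):
            if sum(prob[w] for w in combo) > t:
                beliefs.append(frozenset(combo))
    return beliefs

# Hypothetical fair lottery: possibility i means "ticket i wins".
n = 10
prob = {i: Fraction(1, n) for i in range(n)}
t = Fraction(85, 100)  # assumed Lockean threshold, t < 1

beliefs = lockean_beliefs(prob, t)

# "Ticket i loses" is the set of all possibilities except i.
loses = [frozenset(prob) - {i} for i in range(n)]
all_lose = frozenset.intersection(*loses)

# Each "ticket i loses" is believed, but a conjunction of two is not.
print(all(l in beliefs for l in loses))
print((loses[0] & loses[1]) in beliefs)
# The believed propositions are jointly inconsistent (empty intersection).
print(all_lose == frozenset())
```

Each proposition "ticket i loses" has probability 9/10 and so passes the threshold, whereas the conjunction of any two of them has probability 8/10 and does not; the conjunction of all ten is the empty proposition.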

Notes

  1. Cohen (1992) had some strong arguments that this should be called ‘(rational) acceptance’ rather than ‘(rational) belief’. I will neglect this distinction in the present paper.

  2. ‘Belief’ here means plain belief, yes-or-no belief, or belief simpliciter.

  3. Thresholds need not be fully determined by the context. It is more realistic to assume that a particular context determines only an interval of suitable thresholds.

  4. On Easwaran’s (2016) value-theoretic justification of the Lockean thesis, the thresholds \(\frac {2}{3}\) and \(\frac {3}{4}\) are obtained if the value lost by believing a falsity is twice or, respectively, thrice the value gained by believing a truth.

  5. Leitgeb (2013, pp. 1375–1378, 2014, pp. 143–145, 2015a, footnote 14; 2015b, pp. 197–198).

  6. Formal epistemologists do not apply these labels as historians of philosophy, so we should not be too fussy about where exactly in Hume’s writings one can find this thesis.

  7. See Lin and Kelly (2012a, b).

  8. According to Leitgeb (2014, p. 133, 2015b, p. 105), ‘the essential point’ in his interpretation of the Lockean idea lies in a distinction between a claim of the form (1) ‘there is an \(r<1\), ... for all P ...’ and a claim of the form (2) ‘for all P ... there is an \(r<1\) ...’. I agree with Leitgeb that requirement (1) is too strong. Why should there be a single real number r that is suitable for all actual (and, perhaps, for all conceivable) representations of degrees of belief? But the problem in fact goes deeper. Even a person with a single, unchanging probability function may be facing varying standards of rigour for belief that find expression in varying thresholds (recall our comparison of the courtroom interrogation and the party conversation above). And this point relates to the reason why we think that Leitgeb’s (2) is too weak. The mere existence of some threshold that can do the work for his Stability Theory of belief is not enough; the threshold must be large enough to serve the purposes of the situation.

  9. Leitgeb (2013, p. 1375) starts by saying, ‘for simplicity, let \(r=\frac {1}{2}\)’; then, after two pages of discussion of this case, he adds: ‘If \(r > \frac {1}{2}\), then a diagram similar to [the diagram for the case \(r = \frac {1}{2}\)] can be drawn, with all of the interior straight line segments being pushed towards the three vertices to an extent that is proportional to the magnitude of r.’ (p. 1377; similarly Leitgeb 2015b, p. 198) and ‘That is: the more cautious one aims to be in terms of one’s threshold r, the harder it gets to find a probability measure that allows for non-trivial P-stable\(^r\) sets.’ (p. 1378) But this remains a perceptive side remark; Leitgeb does not further discuss the epistemological significance of higher thresholds. In Leitgeb (2014, p. 140), the very notion of P-stability is tied to the threshold \(\frac {1}{2}\). And in footnote 14 of Leitgeb (2015a), he raises the question ‘[h]ow severe ... the constraint that HT\(^r\) [the Humean Thesis with threshold r] imposes on Bel and P’ is, without mentioning any restrictions on the values of the parameter r. This may perhaps be misinterpreted as implying that the claims made in footnote 14 hold for all r, which is not true (as Leitgeb is keenly aware; there is a correction in his 2015b, footnote 130).

  10. What we call ‘possibility’ is mostly called ‘possible world’ by Leitgeb and ‘hypothesis’ (or ‘potential answer’ to a question) by Lin and Kelly. The latter way of speaking is philosophically somewhat more precise, because one cannot lump together (or split) possible worlds, while one can lump together (or refine) propositions, i.e., sets of possibilities. It is only because there are many occurrences of the letter ‘p’ in this paper anyway that I will be using Leitgeb’s ‘w’ as a variable for possibilities.

  11. Leitgeb (2013, p. 1376, 2014, p. 144) and Lin and Kelly (2012a, b) use the equiangular standard 2-simplex.

  12. Or our total ignorance as to which of these probability functions characterises the agent’s actual mental state. As a more general assumption, one would naturally want to take a Dirichlet distribution, which is the conjugate prior distribution of the categorical distribution in Bayesian statistics. The uniform distribution over the standard 2-simplex in \({\mathbb {R}}^3\) is the symmetric Dirichlet distribution with parameter \(\alpha =\langle 1,1,1\rangle\).

  13. More precisely, this is one formulation of the Lockean thesis, the one used by Leitgeb. In Lin and Kelly’s (2012a, b) terminology, the Lockean acceptance rule has high probability as a sufficient, but not as a necessary condition of belief/acceptance. What Leitgeb calls the Lockean thesis, they call the Kyburgian acceptance rule.

  14. Lower thresholds don’t seem to make much sense. It is not clear whether anyone has ever advocated a non-parameterised version of the Lockean thesis in one of the following ways: ‘There is a threshold t such that for all probability functions P, (\({\hbox {LT}}_{\underset{(-)}{>}}^t\))’ or ‘For all probability functions P there is a threshold t such that (\({\hbox {LT}}_{\underset{(-)}{>}}^t\))’. Any such version raises difficult issues of interpretation.

  15. On the value-theoretic approach taken by Easwaran (2016), both (\({\hbox {LT}}_>^t\)) and (\({\hbox {LT}}_\ge ^t\)) maximise expected epistemic utility for a particular choice of the threshold value t.

  16. In order to keep this paper reasonably self-contained, the proofs of the more interesting equivalences of this section are given in the Appendix.

  17. The term ‘basic outclassing condition’ is due to Makinson (2015, p. 4). I write \(\overline{X }\) for \(W-X\) and P(w) for \(P(\{w\})\).

  18. This picks up on Benferhat et al.’s (1997) talk of ‘big-stepped’ probability functions. Compare Leitgeb (2013, p. 1349, 2015b, p. 163).

  19. Leitgeb does not address the range question. A few of his formulations might be misread as suggesting that the Lockean threshold applied in (\({\hbox {LT}}_\ge ^t\)) is uniquely determined to be P(X); cf. Leitgeb (2013, p. 1362, note 26, 2014, pp. 141, 143, 2015a, p. 174, 2015b, pp. 88, 113, 115). Instead of listing the series of sharp Lockean thresholds 0.54, 0.882, 0.97994, 0.99794, 0.99994 and 1 in Leitgeb (2014, p. 143), it would have been less open to misreading to list a series of intervals of Lockean thresholds: (0.46, 0.54], (0.658, 0.882], (0.96006, 0.97994], (0.982, 0.99794], (0.998, 0.99994] and [1]. Real numbers from one of these intervals are considered to be natural thresholds by Leitgeb; real numbers that lie between these intervals are considered unnatural.

  20. I am slightly changing Leitgeb’s terminology and will drop the explicit reference to the probability function P in the following. For the concept of P-stability with the threshold value \(\frac {1}{2}\), cf. Leitgeb (2013, pp. 1348, 1359), for the identical concept of P-stability (simpliciter), cf. Leitgeb (2014, pp. 139–140) and (2015b, p. 112). I have also slightly adjusted Leitgeb’s concept here and transferred the condition that \(P(B)>0\) from the antecedent to the consequent of the conditional involved in the restricted quantification. Notice that it is not required here that \(P(X\vert B) > t\).

  21. The existence of stability gaps is mentioned (but not analysed) by Leitgeb (2014, p. 151, 2015b, p. 125).

  22. This point is acknowledged and discussed by Leitgeb (2014, pp. 148–152, 2015b, pp. 120–126). Whereas Leitgeb distances himself from what Ross and Schroeder (2014) call ‘pragmatic credal reductivism’, my arguments in this paper are fully consonant with it.

  23. Compare Rott (2009, p. 327): ‘Belief is a vague notion, and the threshold, if there really is one, is certainly context-dependent. We would set the threshold high in the courtroom interrogation, and we would set it low in a casual chat over lunch.’

  24. Lin and Kelly (2012a, p. 536).

  25. See Leitgeb (2015a, pp. 152, 163), notation adapted.

  26. Leitgeb (2013, p. 1359, 2015b, p. 176). Leitgeb’s official definition has r vary between 0 and 1, but from his informal remarks it seems clear that for him only thresholds greater than (or equal to) \(\frac {1}{2}\) make good sense. I have refrained from making the slight adjustment to the concept of r-stability here that I made to the concept of \(\frac {1}{2}\)-stability before; cf. footnote 20 above.

  27. See Leitgeb (2013, p. 1369, Observation 9, 2015b, pp. 193–194, Observation 21).

  28. The generalisation of \(\frac {1}{2}\)-stability to r-stability does not affect the correctness of the determination of t as in condition (6). The only thing that higher values of r change is that they guarantee a certain positive width of the range of potential Locke thresholds (see Sect. 2.2). If a proposition X is r-stable, then this width is no smaller than \(\frac{2r-1}{r}{\cdot } \min _{w\in X}P(w)\).

  29. Leitgeb (2013, p. 1363, 2015b, pp. 184–185).

  30. See Makinson (2015, p. 4).

  31. Here I presuppose that the corresponding variants of the Lockean and Humean theses are used, i.e., that \(({\hbox {LT}}_>^t)\) is combined with \(({\hbox {HT}}_>^r)\) or \(({\hbox {LT}}_\ge^t)\) is combined with \(({\hbox {HT}}_\ge ^r)\).

  32. That is, if a proposition is r-stable, any other r-stable proposition must be either a subset or a superset of it (unless both propositions have probability 1). Cf. Leitgeb (2013, p. 1364, Theorem 4; 2015b, p. 186, Theorem 18).

  33. For an extended discussion of this question, see Leitgeb (2015b, pp. 217–223, Appendix C).

  34. Despite its complexity, the graph of the total likelihood function within the interval [0.5,1] is surprisingly close to the linear function \(y = 2{\cdot }(1-x)\). Outside this interval it starts deviating dramatically very soon.

  35. Lin and Kelly (2012a, p. 537, 2012b, p. 970). Lin and Kelly are not completely explicit about the range of values for their threshold s, in particular they don’t comment on the extremal values of 0 and 1. I think that 0 should be excluded and 1 included.—I will disregard in the following the strict variant (\({\hbox {LK}}_<^s\)), as well as Lin and Kelly’s (2012a, pp. 550–551) more general camera shutter rules.

  36. Lin and Kelly mention that their rule has been discussed earlier by Isaac Levi at the very end of his book For the Sake of Argument (1996, pp. 286–288). Levi’s ratio inductive expansion principle is actually more general, but it reduces to Lin and Kelly’s acceptance rule if the potential answers (‘elements of the ultimate partition’ in Levi’s terms) have equal informational content (‘informational value’). Lin and Kelly address this question in (2012a, pp. 572–574).

  37. There is of course no objective standard that could tell us how r and s really relate. Our threshold transformation is the only one that allows for a direct comparison.

  38. The threshold value r plays the role of an absolute threshold in the definition of stability. But in the way we just presented a condition that is equivalent to stability, r (or rather its transform \(\frac{1-r}{r}\)) looks more like a threshold for a relation between probabilities.

  39. The calculations are rather complex, but not interesting. Details are available on the author’s website.

  40. From my personal point of view, there are two reasons why the theory of Lin and Kelly is particularly attractive. First, I have been collecting in earlier work a variety of arguments (from base revision, from multiple criteria, and from imperfect discrimination) that we should approve incomparabilities between possibilities or propositions and renounce Rational Monotony. Lin and Kelly’s acknowledgement of incomparabilities and their abandoning of Rational Monotony is consonant with these arguments. Second, Lin and Kelly’s (2012b) surprising discovery that (iterated) qualitative belief revision can track (iterated) probabilistic conditionalisation, if their acceptance rule is used, is in line with my general worry that the assignment of (sharp) numerical probability values might over-represent real people’s belief states. Lin and Kelly’s result may be interpreted as showing that what matters can be captured in qualitative (purely relational) terms.—There are other considerations that favour Leitgeb’s theory over Lin and Kelly’s; see, for instance, Leitgeb (2015b, pp. 173, 221–223, and footnote 108).

  41. Condition (iii) is very plausible for unconditional probabilities (appealed to in the Lockean Thesis), but somewhat less so for conditional probabilities (appealed to in the Humean Thesis).

  42. In my view, the use of the same (high) threshold value for both unconditional and conditional probabilities is the distinctive new trait of Leitgeb’s Stability Theory. But Hannes Leitgeb has told me in personal communication that this interpretation is not in accord with his intentions. Leitgeb thus rejects the first part of premise (ii) and premise (iv). I think that the best defence would be to argue against premise (i), or also against the second part of premise (ii). Unfortunately, this point cannot be discussed any further here. Rott (2016) has more thoughts on stability.

  43. Leitgeb (2013, p. 1360, 2015b, p. 177).
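
The intervals in note 19 and the width bound in note 28 can be checked numerically. The following Python sketch assumes an eight-possibility distribution chosen so that its cumulative sums reproduce the thresholds quoted in note 19; that distribution, the function names and the closed-form stability test (every possibility w in X satisfies \((1-r){\cdot }P(w) > r{\cdot }P(\overline{X })\), obtained by taking \(B=\overline{X } \cup \{w\}\) as in the Appendix) are assumptions of the sketch, and only sets X with \(P(X)<1\) are considered.

```python
from fractions import Fraction
from itertools import combinations

# Assumed distribution (hypothetical here): its cumulative sums give the
# thresholds 0.54, 0.882, 0.97994, 0.99794, 0.99994 and 1 quoted in note 19.
probs = [Fraction(s) for s in
         ("0.54", "0.342", "0.058", "0.03994", "0.018", "0.002", "0.00006", "0")]

def stable_sets(probs, r=Fraction(1, 2)):
    """Non-empty X with P(X) < 1 such that (1-r)*P(w) > r*P(not-X) for all w in X."""
    worlds = range(len(probs))
    found = []
    for k in range(1, len(probs) + 1):
        for X in combinations(worlds, k):
            p_x = sum(probs[w] for w in X)
            weakest = min(probs[w] for w in X)
            if p_x < 1 and (1 - r) * weakest > r * (1 - p_x):
                found.append(X)
    return found

def lockean_interval(probs, X):
    """Endpoints of the range (1 - min P(w), P(X)] for the >= variant."""
    return 1 - min(probs[w] for w in X), sum(probs[w] for w in X)

def width_bound(probs, X, r):
    """Lower bound (2r - 1)/r * min P(w) on the interval width (note 28)."""
    return (2 * r - 1) / r * min(probs[w] for w in X)

for X in stable_sets(probs):
    lo, hi = lockean_interval(probs, X)
    print(X, f"({float(lo)}, {float(hi)}]")

# With a more demanding r, fewer sets remain stable, and each remaining
# interval is at least as wide as the bound from note 28.
r = Fraction(9, 10)
for X in stable_sets(probs, r):
    lo, hi = lockean_interval(probs, X)
    print(X, hi - lo >= width_bound(probs, X, r))
```

Under the assumed distribution, the five intervals printed for \(r=\frac {1}{2}\) coincide with the first five intervals listed in note 19. The probability-1 case is left out because its treatment depends on how possibilities of probability 0 are handled.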

References

  • Benferhat, S., Dubois, D., & Prade, H. (1997). Possibilistic and standard probabilistic semantics of conditional knowledge. Journal of Logic and Computation, 9(6), 873–895.

  • Cohen, L. J. (1992). An essay on belief and acceptance. Oxford: Clarendon Press.

  • Easwaran, K. (2016). Dr. Truthlove or: How I learned to stop worrying and love Bayesian probabilities. Noûs, 50(4), 816–853.

  • Leitgeb, H. (2013). Reducing belief simpliciter to degrees of belief. Annals of Pure and Applied Logic, 164(12), 1338–1389.

  • Leitgeb, H. (2014). The stability theory of belief. Philosophical Review, 123(2), 131–171.

  • Leitgeb, H. (2015a). The Humean thesis on belief. Proceedings of the Aristotelian Society, Supplementary Volume, 89(1), 143–185.

  • Leitgeb, H. (2015b). The stability of belief: An essay on rationality and coherence. Oxford: Oxford University Press. (Draft of 3 March 2015, retrieved from www.academia.edu).

  • Levi, I. (1996). For the sake of argument: Ramsey test conditionals, inductive inference, and nonmonotonic reasoning. Cambridge: Cambridge University Press.

  • Lin, H., & Kelly, K. T. (2012a). A geo-logical solution to the lottery paradox, with applications to conditional logic. Synthese, 186(2), 531–575.

  • Lin, H., & Kelly, K. T. (2012b). Propositional reasoning that tracks probabilistic reasoning. Journal of Philosophical Logic, 41(6), 957–981.

  • Makinson, D. (2015). The scarcity of stable belief sets. Last revised 22 February 2015, scheduled for a volume on the stability theory of beliefs, retrieved from https://sites.google.com/site/davidcmakinson/listofpublications.

  • Ross, J., & Schroeder, M. (2014). Belief, credence and pragmatic encroachment. Philosophy and Phenomenological Research, 88(2), 259–288.

  • Rott, H. (2009). Degrees all the way down: Beliefs, non-beliefs and disbeliefs. In F. Huber & C. Schmidt-Petri (Eds.), Degrees of belief (pp. 301–339). Dordrecht: Springer.

  • Rott, H. (2016). Unstable knowledge, unstable belief. Unpublished manuscript, August 2016.

Acknowledgements

I am grateful to audiences in Etelsen, Regensburg, Patras, Uppsala and Maastricht, to John Cantwell, Paul Égré, Tim Kraft, an anonymous referee of this journal, and most of all to Hannes Leitgeb for valuable discussions of various versions of this paper. I have checked the correctness of my calculations for the space of four possibilities (Sect. 5) by determining the values for particular thresholds in numerous special cases. In doing this, I have made extensive use of the websites www.rechneronline.de/function-graphs and www.polymake.org. I am grateful to the people running these sites.

Author information

Correspondence to Hans Rott.

Appendix: Proofs for Section 2.1

Proof of the equivalence of (1) and (2). That (2) implies (1) (and even closure under infinite intersections) is trivial. For the converse, suppose that \(LS_t^P\) is closed under finite intersections. Since we are assuming that the field for P is finite, we get that \(\bigcap LS_t^P = LC_t^P\) is in \(LS_t^P\). Thus, by the definition of \(LS_t^P\), \(P(LC_t^P) \underset{(-)}{>}t\). By the laws of probability, we then get for all A such that \(LC_t^P \subseteq A\), \(P(A) \underset{(-)}{>}t\), that is \(A \in LS_t^P\). This is one direction of (2). For the other direction, suppose that A is in \(LS_t^P\). Then \(\bigcap LS_t^P \subseteq A\), i.e., \(LC_t^P \subseteq A\), as desired.
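
The equivalence just proved can be illustrated on small examples. In the following Python sketch, conditions (1) and (2) are read off the proof above (closure of \(LS_t^P\) under finite intersections, and membership in \(LS_t^P\) being the same as including the core \(LC_t^P\)); the strict variant of the threshold condition, the two three-possibility distributions and all function names are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import chain, combinations

def powerset(worlds):
    """All subsets of `worlds`, as frozensets."""
    ws = list(worlds)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(ws, k) for k in range(len(ws) + 1))]

def lockean_set(prob, t):
    """LS_t^P for the strict variant: every proposition A with P(A) > t."""
    return {A for A in powerset(prob) if sum(prob[w] for w in A) > t}

def closed_under_intersection(ls):
    """Condition (1): the intersection of any two members is again a member."""
    return all((A & B) in ls for A in ls for B in ls)

def generated_by_core(prob, ls):
    """Condition (2): A is in LS exactly when A includes the core LC (the intersection of LS)."""
    core = frozenset.intersection(*ls)
    return all((A in ls) == (core <= A) for A in powerset(prob))

# Two hypothetical distributions over three possibilities.
skewed = {"a": Fraction(7, 10), "b": Fraction(2, 10), "c": Fraction(1, 10)}
uniform = {w: Fraction(1, 3) for w in "abc"}
t = Fraction(6, 10)

for prob in (skewed, uniform):
    ls = lockean_set(prob, t)
    print(closed_under_intersection(ls), generated_by_core(prob, ls))
```

For the skewed distribution both conditions hold, with core {a}; for the uniform distribution both fail, since the three two-element propositions pass the threshold but their pairwise intersections do not.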

Proof of the equivalence of (4) and (5), with identical propositions X. Assume (4), fix a suitable proposition X from (4) and let \(w\in X\). By the left-to-right part of (4), \(P(W-\{w\}) \underset{)-(}{<}t\), i.e., \(P(w) \underset{)-(}{>}1-t\). By the right-to-left part of (4), \(P(X)\underset{(-)}{>}t\), i.e., \(P(\overline{X } )\underset{(-)}{<}1-t\). Taking this together, we get (5), for the same proposition X as in (4). For the converse, assume (5) and fix a suitable proposition X from (5). In order to prove (4), let first \(P(A)\underset{(-)}{>}t\). Suppose for reductio that \(X \not \subseteq A\). Then there is some w in X with \(w \notin A\), so \(A \subseteq W-\{w\}\) and thus \(P(A) \le 1-P(w)\). So, for this w, \(P(A) \le 1 - P(w) \underset{)-(}{<}1-(1-t) = t\), by (5), and we get a contradiction. Conversely, let \(X\subseteq A\). Then \(P(X)\le P(A)\). But by (5), \(P(X)\underset{(-)}{>}t\), so \(P(A)\underset{(-)}{>}t\), as desired for (4), with the same proposition X as in (5).

Proof that (5) implies (6). Take the X from (5). We first check the range of the threshold t. For version (\({\hbox {LT}}_>^t\)) of the Lockean thesis, (5) requires us to have \(P(\overline{X } ) < 1-t \le \min _{w\in X} P(w)\) which means that t is in the interval \([1-\min _{w\in X} P(w),P(X))\). For version (\({\hbox {LT}}_\ge ^t\)) of the Lockean thesis, (5) requires us to have \(P(\overline{X } ) \le 1-t < \min _{w\in X} P(w)\) which means that t is in the interval \((1-\min _{w\in X} P(w),P(X)]\). But this is just what the first part of (6) says. For the second part of (6), let B be such that \(B\cap X \not = \emptyset\). Then, since by (5) \(P(w)>0\) for every \(w\in X\), \(P(B)>0\). Moreover, \(P(X\vert B) = \frac{P(X\cap B)}{P(B)} = \frac{P(X\cap B)}{P(X\cap B)+P(\overline{X } \cap B)}\). Now on the one hand \(P(\overline{X } \cap B) \le P(\overline{X } ) \underset{(-)}{<}1-t\) by (5), and on the other hand \(P(X\cap B) \ge P(w)\) for some w in X and \(P(w)\underset{)-(}{>}1-t\) by (5) again, so \(P(X\cap B)\underset{)-(}{>}1-t\). Thus clearly \(P(\overline{X } \cap B)<P(X\cap B)\), and thus \(\frac {1}{2} < P(X\vert B) \le 1\).

Proof that (6) implies (5). Take X from (6), choose an arbitrary \(w \in X\) and put \(B:=\overline{X } \cup \{w\}\). From (6), we then get \(P(\overline{X } \cup \{w\})>0\) and \(P(X\vert \overline{X } \cup \{w\}) > \frac {1}{2}\), that is \(\frac{P(X\cap (\overline{X } \cup \{w\}))}{P(\overline{X } \cup \{w\})} > \frac {1}{2}\). Since \(\overline{X }\) and \(\{w\}\) are disjoint, this reduces to \(\frac{P(w)}{P(\overline{X } )+P(w)} > \frac {1}{2}\), or just \(P(w) > P(\overline{X } )\). Since w was chosen arbitrarily from X and the threshold condition is checked as in the direction from (5) to (6), this establishes (5).
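
As a sanity check on the equivalences of (4), (5) and (6), the three conditions can be compared by brute force on a small grid of distributions. Since the conditions themselves are not restated in this appendix, the Python sketch below uses formulations read off the proofs above, for the strict variant (\({\hbox {LT}}_>^t\)) and for non-empty X only; these formulations, the grid and the function names are assumptions of the sketch.

```python
from fractions import Fraction
from itertools import chain, combinations, product

def powerset(worlds):
    """All subsets of `worlds`, as frozensets."""
    ws = list(worlds)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(ws, k) for k in range(len(ws) + 1))]

def P(prob, A):
    """Probability of the proposition A."""
    return sum((prob[w] for w in A), Fraction(0))

def cond4(prob, X, t):
    """(4), strict variant: P(A) > t holds exactly for the supersets A of X."""
    return all((P(prob, A) > t) == (X <= A) for A in powerset(prob))

def cond5(prob, X, t):
    """(5), strict variant: P(not-X) < 1 - t <= min over w in X of P(w)."""
    return 1 - P(prob, X) < 1 - t <= min(prob[w] for w in X)

def cond6(prob, X, t):
    """(6): t in [1 - min P(w), P(X)), and P(X|B) > 1/2 whenever B meets X."""
    in_range = 1 - min(prob[w] for w in X) <= t < P(prob, X)
    stable = all(P(prob, B) > 0 and P(prob, X & B) / P(prob, B) > Fraction(1, 2)
                 for B in powerset(prob) if X & B)
    return in_range and stable

def agree_everywhere(n_worlds=3, denom=10):
    """True if (4), (5) and (6) agree for all grid distributions, X and t."""
    worlds = range(n_worlds)
    for weights in product(range(denom + 1), repeat=n_worlds):
        if sum(weights) != denom:
            continue  # not a probability distribution
        prob = {w: Fraction(k, denom) for w, k in zip(worlds, weights)}
        for X in powerset(worlds):
            if not X:
                continue  # the conditions are only considered for non-empty X
            for num in range(1, 2 * denom):
                t = Fraction(num, 2 * denom)  # thresholds strictly between 0 and 1
                verdicts = {cond4(prob, X, t), cond5(prob, X, t), cond6(prob, X, t)}
                if len(verdicts) != 1:
                    return False
    return True

print(agree_everywhere())
```

Exact rational arithmetic is used so that boundary cases such as \(1-t = \min _{w\in X} P(w)\) are not blurred by rounding. A grid check of this kind is of course no substitute for the proofs.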

Cite this article

Rott, H. Stability and Scepticism in the Modelling of Doxastic States: Probabilities and Plain Beliefs. Minds & Machines 27, 167–197 (2017). https://doi.org/10.1007/s11023-016-9415-0

  • DOI: https://doi.org/10.1007/s11023-016-9415-0

Keywords

  • Plain belief
  • Subjective probability
  • Formal epistemology
  • Lockean thesis
  • Stability Theory of belief
  • Leitgeb
  • Lin
  • Kelly