Expert deference as a belief revision schema

Abstract

When an agent learns of an expert’s credence in a proposition about which they are an expert, the agent should defer to the expert and adopt that credence as their own. This is a popular thought about how agents ought to respond to (ideal) experts. In a Bayesian framework, it is often modelled by endowing the agent with a set of priors that achieves this result. But this model faces a number of challenges, especially when applied to non-ideal agents (who nevertheless interact with ideal experts). I outline these problems, and use them as desiderata for the development of a new model. Taking inspiration from Richard Jeffrey’s development of Jeffrey conditioning, I develop a model in which expert reports are taken as exogenous constraints on the agent’s posterior probabilities. I show how this model can handle a much wider class of expert reports (for example reports of conditional probabilities), and can be naturally extended to cover propositions for which the agent has no prior.
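To make the idea of treating an expert report as a constraint on the posterior concrete, here is a minimal sketch of ordinary Jeffrey conditioning on the partition formed by a proposition H and its negation. It is an illustration of the general technique, not the model developed in the paper. The four-world outcome space, the prior values, the expert's credence of 4/5, and the function names `prob` and `jeffrey_update` are all assumptions introduced for the example.

```python
from fractions import Fraction


def prob(p, event):
    """Probability of an event (a set of worlds) under distribution p."""
    return sum(p[w] for w in event)


def jeffrey_update(prior, partition, weights):
    """Jeffrey-condition `prior` so that each cell of `partition` receives the
    matching entry of `weights`, keeping probabilities conditional on each
    cell fixed. Assumes every cell has positive prior probability."""
    posterior = {}
    for cell, weight in zip(partition, weights):
        mass = prob(prior, cell)
        for w in cell:
            posterior[w] = prior[w] * weight / mass
    return posterior


# Hypothetical worlds: (H, E) truth-value pairs; the numbers are illustrative.
WORLDS = [(True, True), (True, False), (False, True), (False, False)]
prior = dict(zip(WORLDS, [Fraction(3, 10), Fraction(1, 10),
                          Fraction(2, 10), Fraction(4, 10)]))
H = {w for w in WORLDS if w[0]}
NOT_H = {w for w in WORLDS if not w[0]}
E = {w for w in WORLDS if w[1]}

# The expert's reported credence in H is imposed on the posterior directly.
expert_credence = Fraction(4, 5)
posterior = jeffrey_update(prior, [H, NOT_H],
                           [expert_credence, 1 - expert_credence])

print(prob(posterior, H))                            # 4/5
print(prob(posterior, E & H) / prob(posterior, H))   # 3/4, same as the prior
```

In this sketch the agent needs no prior over what the expert might report: the report fixes the posterior probability of H, and the credences conditional on H and on its negation are carried over unchanged.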

Notes

  1.

    For an explicit discussion of these two kinds of norms in epistemology see Simion et al. (2016, §4.1), and for a similar discussion in decision theory see Buchak (2013, Ch. 1) and Thoma (2019). The Bayesian statistics papers referenced in this section almost all have a prescriptive element, but for a particularly clear example see French (1980).

  2.

    This section, and indeed this paper, is not intended as a complete survey of the literature on Bayesian approaches to expert disagreement and, where I do survey the literature, my review is partial to philosophy. There is a Bayesian statistics literature on the topic of expert testimony covering both supra-Bayesianism and deference, and I engage with it here only partially. Part of the difficulty in using that literature arises from the difference in focus. Statistics papers often assume that orthodox Bayesianism is the right norm, while I wish to evaluate that claim. They work through how a real agent might reason in the kinds of cases under consideration, and regularly assume a particular form for the agents’ priors and likelihoods (i.e., assuming particular distributions) in order to make progress. While valuable for building understanding of Bayesianism and its implications, they are rarely directly concerned with my topic here.

  3.

    The label “supra-Bayesianism” comes from Keeney and Raiffa (1976). It has been much discussed in the Bayesian statistics literature (see Genest and Zidek 1986), and I do not claim that these problems are without possible responses. In particular, much work has been done on how to make it more tractable in cases where particular symmetries, or known distributions, simplify the updating required. Lindley (1982) notes cases in which it reduces to the very simple expert deference. Others have studied when it reduces to averaging. French (1980) is an early analysis of how thinking through the procedure a real agent might use to enact supra-Bayesianism can generate plausible simplifications.

  4.

    One might also worry that this demand, taken literally, means that the simple model above won’t work. Experts report probability values, and so these reports are themselves continuous random variables. Strictly speaking your prior for \(\ulcorner W(H)=x \urcorner \) should thus be zero, for any x. I won’t dwell on this problem, as the issues it raises aren’t core criticisms of supra-Bayesianism and I believe that a more complex model could work around it.

  5.

    There are alternate definitions of expert out there. For example, Easwaran et al. (2016) define experts as reliable witnesses. For them, \(P^1\) is an expert for P, in some domain D, when the following holds: for any \(X\in D\), when \(P^1(X)>P(X)\), P takes \(P^1\)’s credence in X as evidence for X and raises their credence. The same applies to lower credences as evidence against.

  6.

    We may of course wish to defer to experts on matters which occur only once, in which case this notion of calibration to frequencies isn’t useful.

  7.

    Lindley (1982, p. 118) notes the connection between calibration and deference in an early discussion of supra-Bayesianism and deference. When the result of supra-Bayesian updating matches the expert’s report, Lindley calls the expert “probability calibrated” for that novice. Due to the differences in how we approach the problem, I won’t use Lindley’s terminology. As DeGroot (1988, p. 299) says: “it would be unnecessary to use the term ‘well calibrated’ in this paper because that property is now simply the defining characteristic of an expert.”

  8.

    Experts report non-probabilistic information too, but here I’ll neglect such reports. We can perhaps assume, as many probabilists do, that categorical statements (e.g., “it will rain tomorrow”) are expressions of high credence (\(P(\text{rain tomorrow})\approx 1\)).

  9.

    This is linked to Jeffrey’s rejection of what he calls “dogmatic empiricism”, the view under which there is some basic, sense-data proposition capturing exactly what the agent learns.

  10.

    This suggestion is due to Steele (2012).

  11.

    In this I follow Jeffrey himself (e.g., Jeffrey and Hendrickson 1989) and much of the Bayesian statistics literature.

  12.

    I take this term from Bradley (2017).

  13.

    Bradley takes these two options to be exhaustive; other forms of awareness growth are reducible to combinations of them, perhaps in combination with two corresponding forms of awareness contraction. I do not need this to be true in what follows.

  14.

    Note a persistent idealisation here: \(\models \) is the implication relation which ordered the old algebra, and it also orders the new propositions. So, the agents that we model in this framework are logically omniscient (as is standard) and this omniscience extends to propositions they were previously unaware of. The problem of logical omniscience is a significant one for someone with my non-ideal theory interests. However, treating it is notoriously difficult. I therefore put up with this idealisation, noting that allowing agents to be unaware does mitigate the force of the problem of logical omniscience.

  15.

    Bradley’s approach fits naturally with my topic of expert deference. I want the posterior attitudes to the new propositions to come from the expert reports, and therefore it is helpful to maintain the separation between these steps. Karni and Vierø (2013) do not provide anything like this clean separation. In other cases, this methodological separation may not be desirable: Steele and Stefánsson (forthcoming) argue that this two-step procedure is baseless. In my case I think the value of the separation is clear.

  16.

    Put another way: \(\mathbb {P}^\oplus \) has no constraints on conditional probabilities involving H that aren’t just consequences of Rigid Extension.

  17.

    Wagner presents his results as an extension of those which stipulate sufficient conditions for Jeffrey conditioning commuting—see the next footnote. For Wagner, these conditions are two identities for the Bayes factors generated by each of two experiences, relative to two different partitions \(\mathbb {X}\) and \(\mathbb {Y}\). Let \(\beta _{1\mathbb {X}}\) represent having the 1st Jeffrey experience occurring relative to partition \(\mathbb {X}\). Then Wagner’s identities are \(\beta _{1\mathbb {X}}(X_i,X_j)=\beta _{2\mathbb {X}}(X_i,X_j),\, \forall i,j\) and the same for \(\mathbb {Y}\). But as Wagner notes, Bayes factors are the right way of representing what is learned in an experience in a prior-free way. So if we stipulate that the experiences are identical, then Wagner’s Bayes factor identities hold, and therefore Jeffrey conditioning commutes across the order of the experiences.

  18.

    Why only “may be” non-commutative? In the finite setting, Diaconis and Zabell (1982) provide necessary and sufficient conditions for commutativity. Consider two partitions and the sequences of probabilities assigned to them in a Jeffrey update: \(\{\mathbb {X}, \langle x_j \rangle \}\) and \(\{\mathbb {Y}, \langle y_k \rangle \}\). \(\mathbb {X}\) and \(\mathbb {Y}\) are Jeffrey independent with respect to P, \(\langle x_j \rangle \) and \(\langle y_k \rangle \), if \(P_{\mathbb {X}}(Y_k)=P(Y_k)\) and \(P_{\mathbb {Y}}(X_j)=P(X_j)\) hold for all \(j\) and \(k\). Then successive Jeffrey updates commute, \(P_{\mathbb {XY}}=P_{\mathbb {YX}}\), if and only if \(\mathbb {X}\) and \(\mathbb {Y}\) are Jeffrey independent with respect to P, \(\langle x_j \rangle \) and \(\langle y_k \rangle \) (Diaconis and Zabell 1982, Theorem 3.2). This turns out to be a weaker condition than probabilistic independence, so that if \(\mathbb {X}\) and \(\mathbb {Y}\) are probabilistically independent with respect to P, successive updates on them will commute for any update probabilities (Diaconis and Zabell 1982, Theorem 3.3). So, while some sequences of Jeffrey updates will commute, in general we should expect them not to.

  19.

    In a recent presentation of some work in progress, James Joyce suggested a hierarchy of ideal experts, such that reports from higher-ranked experts trump reports from lower-ranked experts. In his example, learning the truth trumps learning the chances. This proposal could be built into my model, by taking only the report of the highest-ranked expert as a constraint (and perhaps retaining some memory of the source of one’s credences, so as not to later have a lower-ranked expert override a higher). This could in principle be extended to cover any disagreeing experts, so long as the agent could order them by reliability, and assuming that the fact of their disagreement did not undermine the grounds for deference to them.
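The commutativity condition described in note 18 can be checked numerically on a small example. The following is a minimal sketch, assuming a four-world space with binary partitions for H and E. The two priors, the update weights, and the helper names `jeffrey_update`, `jeffrey_independent`, and `commutes` are hypothetical choices made for illustration and are not taken from Diaconis and Zabell (1982).

```python
from fractions import Fraction as F


def prob(p, event):
    """Probability of an event (a set of worlds) under distribution p."""
    return sum(p[w] for w in event)


def jeffrey_update(prior, partition, weights):
    """Jeffrey update: cell i of `partition` gets probability weights[i].
    Assumes every cell has positive prior probability."""
    posterior = {}
    for cell, weight in zip(partition, weights):
        mass = prob(prior, cell)
        for w in cell:
            posterior[w] = prior[w] * weight / mass
    return posterior


def jeffrey_independent(prior, X, x, Y, y):
    """Check P_X(Y_k) = P(Y_k) for all k and P_Y(X_j) = P(X_j) for all j."""
    p_x = jeffrey_update(prior, X, x)
    p_y = jeffrey_update(prior, Y, y)
    return (all(prob(p_x, c) == prob(prior, c) for c in Y)
            and all(prob(p_y, c) == prob(prior, c) for c in X))


def commutes(prior, X, x, Y, y):
    """Compare updating on X then Y with updating on Y then X."""
    p_xy = jeffrey_update(jeffrey_update(prior, X, x), Y, y)
    p_yx = jeffrey_update(jeffrey_update(prior, Y, y), X, x)
    return p_xy == p_yx


# Worlds are (H, E) truth-value pairs; all numbers below are illustrative.
WORLDS = [(True, True), (True, False), (False, True), (False, False)]
X = [{w for w in WORLDS if w[0]}, {w for w in WORLDS if not w[0]}]
Y = [{w for w in WORLDS if w[1]}, {w for w in WORLDS if not w[1]}]
x = [F(4, 5), F(1, 5)]     # new probabilities for H, not-H
y = [F(7, 10), F(3, 10)]   # new probabilities for E, not-E

# A prior on which H and E are correlated, and one on which they are independent.
correlated = dict(zip(WORLDS, [F(3, 10), F(1, 10), F(2, 10), F(4, 10)]))
independent = dict(zip(WORLDS, [F(1, 5), F(1, 5), F(3, 10), F(3, 10)]))

for name, prior in [("correlated", correlated), ("independent", independent)]:
    print(name,
          "Jeffrey independent:", jeffrey_independent(prior, X, x, Y, y),
          "commutes:", commutes(prior, X, x, Y, y))
```

On the correlated prior the two orders of updating disagree and the Jeffrey independence check fails. On the probabilistically independent prior both succeed, in line with the two theorems cited in note 18.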

References

  1. Bradley, R. (2005). Radical probabilism and Bayesian conditioning. Philosophy of Science, 72(2), 342–364. https://doi.org/10.1086/432427.

  2. Bradley, R. (2017). Decision theory with a human face. Cambridge: Cambridge University Press.

  3. Buchak, L. (2013). Risk and rationality. Oxford: Oxford University Press.

  4. DeGroot, M. H. (1988). A Bayesian view of assessing uncertainty and comparing expert opinion. Journal of Statistical Planning and Inference, 20, 295–306.

  5. Diaconis, P., & Zabell, S. L. (1982). Updating subjective probability. Journal of the American Statistical Association, 77(380), 822–30.

  6. Dietrich, F., List, C., & Bradley, R. (2016). Belief revision generalized: A joint characterization of Bayes’ and Jeffrey’s rules. Journal of Economic Theory, 162, 352–371. https://doi.org/10.1016/j.jet.2015.11.006.

  7. Easwaran, K., Fenton-Glynn, L., Hitchcock, C., & Velasco, J. D. (2016). Updating on the credences of others: Disagreement, agreement, and synergy. Philosopher’s Imprint, 16(11), 1–39.

  8. Elga, A. (2007). Reflection and disagreement. Nous, 41(3), 478–502.

  9. Eva, B., Hartmann, S., & Rad, S. R. (2019). Learning from conditionals. Mind. https://doi.org/10.1093/mind/fzz025.

  10. French, S. (1980). Updating of belief in the light of someone else’s opinion. Journal of the Royal Statistical Society Series A (General), 143(1), 43. https://doi.org/10.2307/2981768.

  11. Gaifman, H. (1988). A theory of higher order probabilities. In Causation, chance, and credence (vol. 1, pp. 191–220). Kluwer.

  12. Genest, C., & Zidek, J. V. (1986). Combining probability distributions: A critique and an annotated bibliography. Statistical Science, 1(1), 114–135. https://doi.org/10.1214/ss/1177013825.

  13. Jeffrey, R. (1983). The logic of decision (2nd ed.). Chicago, IL: University of Chicago Press.

  14. Jeffrey, R., & Hendrickson, M. (1989). Probabilizing pathology. Proceedings of the Aristotelian Society, 89(1), 211–226. https://doi.org/10.1093/aristotelian/89.1.211.

  15. Joyce, J. M. (1998). A nonpragmatic vindication of probabilism. Philosophy of Science, 65, 575–603.

  16. Joyce, J. M. (2007). Epistemic deference: The case of chance. Proceedings of the Aristotelian Society, 107, 187–206. https://doi.org/10.1111/j.1467-9264.2007.00218.x.

  17. Karni, E., & Vierø, M. L. (2013). “Reverse Bayesianism”: A choice-based theory of growing awareness. American Economic Review, 103(7), 2790–2810. https://doi.org/10.1257/aer.103.7.2790.

  18. Keeney, R. L., & Raiffa, H. (1976). Decisions with multiple objectives: Preferences and value tradeoffs. New York: Wiley.

  19. Lindley, D. (1982). The improvement of probability judgements. Journal of the Royal Statistical Society, Series A, 145(1), 117–126.

  20. Pettigrew, R. (2016). Accuracy and the laws of credence. Oxford: Oxford University Press.

  21. Simion, M., Kelp, C., & Ghijsen, H. (2016). Norms of belief. Philosophical Issues, 26(1), 374–392. https://doi.org/10.1111/phis.12077.

  22. Steele, K. (2012). Testimony as evidence: More problems for linear pooling. Journal of Philosophical Logic, 41(6), 983–999. https://doi.org/10.1007/s10992-012-9227-5.

  23. Steele, K., & Stefánsson, H. O. (forthcoming). Belief revision for growing awareness. Mind.

  24. Thoma, J. (2019). Decision theory. In R. Pettigrew & J. Weisberg (Eds.), The open handbook of formal epistemology. Jackson Park, CA: PhilPapers Foundation.

  25. Titelbaum, M. G. (forthcoming). Fundamentals of Bayesian epistemology. Oxford University Press.

  26. van Fraassen, B. (1981). A problem for relative information minimizers in probability kinematics. The British Journal for the Philosophy of Science, 32(4), 375–379. https://doi.org/10.1093/bjps/32.4.375.

  27. Wagner, C. (2002). Probability kinematics and commutativity. Philosophy of Science, 69(2), 266–78.

Acknowledgements

Thanks to Richard Bradley for feedback and discussion on drafts of this paper, and to Richard Pettigrew for pressing me on the objections to supra-Bayesianism. Thanks also to the members of the LSE Choice Group for their comments. This research was partly conducted in the project Climate Ethics and Future Generations supported by Riksbankens Jubileumsfond (Grant M17-0372:1) and Institute for Futures Studies.

Author information

Corresponding author

Correspondence to Joe Roussos.

Cite this article

Roussos, J. Expert deference as a belief revision schema. Synthese (2020). https://doi.org/10.1007/s11229-020-02942-3

Keywords

  • Belief revision
  • Expert deference
  • Expert disagreement
  • Awareness