# Learning and Pooling, Pooling and Learning

## Abstract

We explore which types of probabilistic updating commute with convex IP pooling (Stewart and Ojea Quintana 2017). Positive results are stated for Bayesian conditionalization (and a mild generalization of it), imaging, and a certain parameterization of Jeffrey conditioning. This last observation is obtained with the help of a slight generalization of a characterization of (precise) externally Bayesian pooling operators due to Wagner (Log J IGPL 18(2):336–345, 2009). These results strengthen the case that pooling should go by imprecise probabilities since no precise pooling method is as versatile.

1. Not all merging-of-opinions results require probabilities to converge to certainty (Blackwell and Dubins 1962). Under certain conditions, Bayesian conditionalizing can bring probabilities close even if they do not converge to 1 or 0.

2. $$\Omega$$ may be thought of as a partition of a space of agent-relative serious possibilities determined by consistency with a state of full belief. Like a state of full belief, $$\Omega$$ is open to being revised, refined, etc., as judged appropriate (Levi 1980).

3. Notice that, due to the way geometric pooling is defined, there are profiles for which $$F(\varvec{p}_1,\ldots , \varvec{p}_n)(\omega ) = 0$$ for all $$\omega \in \Omega$$, in violation of the probability axioms. Such a situation arises if for each $$\omega \in \Omega$$ there is a $$\varvec{p}_i \in (\varvec{p}_1,\ldots , \varvec{p}_n)$$ such that $$\varvec{p}_i(\omega ) = 0$$. To circumvent this problem, Wagner restricts the domain of pooling operators to the set of profiles for which this does not happen. That is, the domain of a pooling function is the set of profiles such that there is some $$\omega \in \Omega$$ for which $$\varvec{p}_i(\omega ) > 0$$ for all $$i=1,\ldots , n$$.

4. See Schervish and Seidenfeld (1990), Herron et al. (1997) for studies of convergence relevant to IP.

5. Within the IP research community, convexity is a matter of some controversy. For attacks on the requirement, see Seidenfeld et al. (1989, 2010), Kyburg and Pittarelli (1992). For defenses, see Levi (1990, 2009).

6. In the IP setting, conditionalization can actually lead to greater uncertainty in the short run, a very interesting phenomenon known as dilation (Seidenfeld and Wasserman 1993; Pedersen and Wheeler 2014).

7. For any $$A \in \mathscr {A},\quad \varvec{p}^E(A) = \frac{\varvec{p}(A \cap E)}{\varvec{p}(E)} = \frac{\sum _{\omega \in A \cap E}\varvec{p}(\omega )}{\sum _{\omega \in E}\varvec{p}(\omega )}$$. By the definition of a probability measure, $$\varvec{p}(A) = \sum _{\omega \in A} \varvec{p}(\omega )$$, so $$\sum _{\omega \in A} \varvec{p}^\lambda (\omega ) = \frac{\sum _{\omega \in A} \varvec{p}(\omega )\lambda (\omega )}{\sum _{\omega ' \in \Omega } \varvec{p}(\omega ')\lambda (\omega ')}$$ gives us $$\varvec{p}^\lambda (A)$$. We show that these two fractions are equal by showing the equality of both the numerators and denominators. Since, for all $$\omega \in A$$, $$\varvec{p}(\omega )\lambda (\omega ) = \varvec{p}(\omega )$$ if $$\omega \in E$$ and 0 otherwise, $$\sum _{\omega \in A}\varvec{p}(\omega )\lambda (\omega ) = \sum _{\omega \in A \cap E} \varvec{p}(\omega ) = \varvec{p}(A \cap E)$$. Hence, the numerators are equal. And since, for all $$\omega ' \in \Omega , \varvec{p}(\omega ')\lambda (\omega ') = \varvec{p}(\omega ')$$ if $$\omega ' \in E$$ and 0 otherwise, we have $$\sum _{\omega ' \in \Omega } \varvec{p}(\omega ')\lambda (\omega ') = \sum _{\omega ' \in E} \varvec{p}(\omega ') = \varvec{p}(E)$$. Hence, the denominators are equal, too. So, $$\varvec{p}^E = \varvec{p}^\lambda$$.

8. Thanks to Paul Pedersen for emphasizing this point to us.

9. Wagner contends that identical learning should be thought of as identical Bayes factors rather than identical posteriors. One alleged reason is that posteriors are tainted by the prior, whereas Bayes factors are an uncontaminated measure of the impact of the evidence. How do Bayes factors measure the impact of the evidence in isolation from the prior? Consider the case in which $$\varvec{q}$$ comes from $$\varvec{p}$$ by Bayesian conditionalization on E. Then,

\begin{aligned} \varvec{q}(A)/\varvec{q}(B) = \frac{\varvec{p}(A|E)}{\varvec{p}(B|E)} \end{aligned}

and

\begin{aligned} {\mathcal {B}}(\varvec{q}, \varvec{p}; A:B) = \frac{\varvec{p}(A|E)/\varvec{p}(B|E)}{\varvec{p}(A)/\varvec{p}(B)}. \end{aligned}

So, $${\mathcal {B}}(\varvec{q}, \varvec{p}; A:B)$$ is a measure of the change the evidence, E, induces in favor of A over B. $${\mathcal {B}}(\varvec{q}, \varvec{p}; A:B)$$ can also be rearranged using Bayes’ theorem.

\begin{aligned} \frac{\varvec{q}(A)}{\varvec{q}(B)} = \frac{\varvec{p}(A|E)}{\varvec{p}(B|E)} = \frac{\frac{\varvec{p}(A)\varvec{p}(E|A)}{\varvec{p}(E)}}{\frac{\varvec{p}(B)\varvec{p}(E|B)}{\varvec{p}(E)}} = \frac{\varvec{p}(A)\varvec{p}(E|A)}{\varvec{p}(B)\varvec{p}(E|B)} = \frac{\varvec{p}(A)}{\varvec{p}(B)} \times \frac{\varvec{p}(E|A)}{\varvec{p}(E|B)} \end{aligned}

Dividing now by $$\frac{\varvec{p}(A)}{\varvec{p}(B)}$$, the denominator of $${\mathcal {B}}(\varvec{q}, \varvec{p}; A:B)$$, gives us

\begin{aligned} {\mathcal {B}}(\varvec{q}, \varvec{p}; A:B) = \frac{\varvec{p}(E|A)}{\varvec{p}(E|B)} \end{aligned}

The quantity $$\varvec{p}(E|A) \big / \varvec{p}(E|B)$$ is sometimes referred to as the likelihood ratio. So, the Bayes factor is a ratio of the non-prior quantities involved in Bayes’ theorem, the quantities that revise the prior.

10. Wagner’s version of commutativity with Jeffrey conditionalization involves some additional technical assumptions. First, that $$\varvec{p}_i(E_k) > 0$$ for all i and all k. Second, that $$b_1 = 1$$ and $$\sum _k b_k \varvec{p}_i(E_k) < \infty$$ for $$i = 1,\ldots , n$$. Third, where $$\varvec{q}_i(\omega ) = \frac{\sum _k b_k \varvec{p}_i(\omega )[\omega \in E_k]}{\sum _k b_k \varvec{p}_i(E_k)}$$, it is the case that $$0< \sum _k b_k F(\varvec{p}_1,\ldots , \varvec{p}_n)(E_k) < \infty$$. In the IP setting, this last assumption may be adjusted to be a requirement for each $$\varvec{p}\in {\mathcal {F}}(\varvec{p}_1,\ldots , \varvec{p}_n)$$.

11. In finite spaces, any revision method can be represented as conditionalization in a richer space via superconditioning, provided the posterior probability is absolutely continuous with respect to the prior.

12. A metaphysically deflationary conception of possible worlds has it that a possible world is just a maximally complete set of sentences in some propositional language, instead of a “possible totality of facts.”

13. Others, however, have offered more uniform accounts of supposition (e.g., Levi 1996).

14. Though, as Diaconis and Zabell’s aforementioned result shows us, in a range of cases there is no mathematical necessity in adopting Jeffrey conditionalization in order to obtain the results of Jeffrey conditionalization.

15. Though it is not uncontroversial that conditionalization or some other type of updating represents learning. Isaac Levi, for instance, writes, “All conditions of rationality are equilibrium conditions. In a sense they are synchronic conditions [...] Furthermore, in stating conditions of rational equilibrium, no prescription is made regarding the psychological path to be taken in moving from disequilibrium or from one equilibrium position to another. In other words, there are no norms prescribing rational learning processes” (Levi 1970).
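The identities in notes 7 and 9 can be checked numerically. The following is a minimal sketch, not taken from the paper: the four-world prior and the events `E`, `A`, and `B` are hypothetical values chosen for illustration, and the helper names (`update`, `prob`, `bayes_factor`, `likelihood`) are assumptions of this sketch. It confirms that updating on the indicator likelihood of E agrees with conditionalizing on E, and that the Bayes factor for a conditionalization equals the likelihood ratio.

```python
from fractions import Fraction as Fr

def prob(p, A):
    """p(A) as the sum of the point masses of the worlds in A."""
    return sum(p[w] for w in A)

def update(p, lam):
    """Return p^lambda: reweight each p(w) by lam(w), then renormalize."""
    norm = sum(p[w] * lam[w] for w in p)
    return {w: p[w] * lam[w] / norm for w in p}

def bayes_factor(q, p, A, B):
    """B(q, p; A:B) = (q(A)/q(B)) / (p(A)/p(B))."""
    return (prob(q, A) / prob(q, B)) / (prob(p, A) / prob(p, B))

def likelihood(p, E, A):
    """p(E | A)."""
    return prob(p, E & A) / prob(p, A)

# Hypothetical prior and events (illustration only).
p = {"w1": Fr(1, 10), "w2": Fr(2, 10), "w3": Fr(3, 10), "w4": Fr(4, 10)}
E = {"w1", "w2", "w3"}
A = {"w1", "w4"}
B = {"w2", "w3"}

# Note 7: the indicator of E, used as a likelihood, reproduces conditionalization.
indicator = {w: Fr(int(w in E)) for w in p}
q = update(p, indicator)
conditional = {w: (p[w] / prob(p, E) if w in E else Fr(0)) for w in p}
assert q == conditional

# Note 9: the Bayes factor equals the likelihood ratio p(E|A) / p(E|B).
assert bayes_factor(q, p, A, B) == likelihood(p, E, A) / likelihood(p, E, B)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equalities hold without floating-point tolerance.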

## References

1. Arló-Costa, H. (2007). The logic of conditionals. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2014 ed.). Stanford University: Metaphysics Research Lab.

2. Baratgin, J., & Politzer, G. (2010). Updating: A psychologically basic situation of probability revision. Thinking & Reasoning, 16(4), 253–287.

3. Blackwell, D., & Dubins, L. (1962). Merging of opinions with increasing information. The Annals of Mathematical Statistics, 33, 882–886.

4. Christensen, D. (2009). Disagreement as evidence: The epistemology of controversy. Philosophy Compass, 4(5), 756–767.

5. de Finetti, B. (1964). Foresight: Its logical laws, its subjective sources. In H. E. Kyburg & H. E. Smokler (Eds.), Studies in Subjective Probability. Hoboken: Wiley.

6. Diaconis, P., & Zabell, S. L. (1982). Updating subjective probability. Journal of the American Statistical Association, 77(380), 822–830.

7. Dietrich, F., & List, C. (2014). Probabilistic opinion pooling. In A. Hájek & C. Hitchcock (Eds.), Oxford Handbook of Probability and Philosophy. Oxford: Oxford University Press.

8. Elga, A. (2007). Reflection and disagreement. Noûs, 41(3), 478–502.

9. Elkin, L., & Wheeler, G. (2016). Resolving peer disagreements through imprecise probabilities. Noûs. doi:10.1111/nous.12143.

10. Field, H. (1978). A note on Jeffrey conditionalization. Philosophy of Science, 45, 361–367.

11. Gaifman, H., & Snir, M. (1982). Probabilities over rich languages, testing and randomness. The Journal of Symbolic Logic, 47(03), 495–548.

12. Gaifman, H., & Vasudevan, A. (2012). Deceptive updating and minimal information methods. Synthese, 187(1), 147–178.

13. Gärdenfors, P. (1982). Imaging and conditionalization. The Journal of Philosophy, 79, 747–760.

14. Genest, C. (1984). A characterization theorem for externally Bayesian groups. The Annals of Statistics, 12, 1100–1105.

15. Genest, C., McConway, K. J., & Schervish, M. J. (1986). Characterization of externally Bayesian pooling operators. The Annals of Statistics, 14, 487–501.

16. Genest, C., & Wagner, C. G. (1987). Further evidence against independence preservation in expert judgement synthesis. Aequationes Mathematicae, 32(1), 74–86.

17. Genest, C., & Zidek, J. V. (1986). Combining probability distributions: A critique and an annotated bibliography. Statistical Science, 1, 114–135.

18. Girón, F. J., & Ríos, S. (1980). Quasi-Bayesian behaviour: A more realistic approach to decision making? Trabajos de Estadística y de Investigación Operativa, 31(1), 17–38.

19. Good, I. J. (1983). Good Thinking: The Foundations of Probability and Its Applications. Minneapolis: University of Minnesota Press.

20. Hájek, A., & Hall, N. (1994). The hypothesis of the conditional construal of conditional probability. In E. Eells & B. Skyrms (Eds.), Probability and conditionals: Belief revision and rational decision (pp. 75–112). Cambridge: Cambridge University Press.

21. Hartmann, S. (2014). A new solution to the problem of old evidence. In Philosophy of Science Association 24th Biennial Meeting, Chicago, IL.

22. Herron, T., Seidenfeld, T., & Wasserman, L. (1997). Divisive conditioning: Further results on dilation. Philosophy of Science, 64, 411–444.

23. Huttegger, S. M. (2015). Merging of opinions and probability kinematics. The Review of Symbolic Logic, 8(04), 611–648.

24. Jeffrey, R. (2004). Subjective Probability: The Real Thing. Cambridge: Cambridge University Press.

25. Joyce, J. M. (1999). The Foundations of Causal Decision Theory. Cambridge: Cambridge University Press.

26. Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. The Annals of Mathematical Statistics, 22, 79–86.

27. Kyburg, H. E. (1987). Bayesian and non-Bayesian evidential updating. Artificial Intelligence, 31(3), 271–293.

28. Kyburg, H. E., & Pittarelli, M. (1992). Some problems for convex Bayesians. In Proceedings of the Eighth International Conference on Uncertainty in Artificial Intelligence (pp. 149–154). Morgan Kaufmann Publishers Inc.

29. Leitgeb, H. (2016). Imaging all the people. Episteme. doi:10.1017/epi.2016.14.

30. Levi, I. (1967). Probability kinematics. British Journal for the Philosophy of Science, 18(3), 197–209.

31. Levi, I. (1970). Probability and evidence. In M. Swain (Ed.), Induction, Acceptance, and Rational Belief (pp. 134–156). New York: Humanities Press.

32. Levi, I. (1978). Irrelevance. In C. Hooker, J. Leach, & E. McClennen (Eds.), Foundations and Applications of Decision Theory (Vol. 1, pp. 263–273). Boston: Springer.

33. Levi, I. (1980). The Enterprise of Knowledge. Cambridge, MA: MIT Press.

34. Levi, I. (1985). Consensus as shared agreement and outcome of inquiry. Synthese, 62(1), 3–11.

35. Levi, I. (1990). Pareto unanimity and consensus. The Journal of Philosophy, 87(9), 481–492.

36. Levi, I. (1996). For the Sake of the Argument: Ramsey Test Conditionals, Inductive Inference and Nonmonotonic Reasoning. Cambridge: Cambridge University Press.

37. Levi, I. (2009). Why indeterminate probability is rational. Journal of Applied Logic, 7(4), 364–376.

38. Lewis, D. (1976). Probabilities of conditionals and conditional probabilities. The Philosophical Review, 85, 297–315.

39. Madansky, A. (1964). Externally Bayesian Groups. Santa Monica, CA: RAND Corporation.

40. Nau, R. F. (2002). The aggregation of imprecise probabilities. Journal of Statistical Planning and Inference, 105(1), 265–282.

41. Pedersen, A. P., & Wheeler, G. (2014). Demystifying dilation. Erkenntnis, 79(6), 1305–1342.

42. Raiffa, H. (1968). Decision analysis: Introductory lectures on choices under uncertainty. Random House.

43. Ramsey, F. P. (1990). Truth and probability. In D. H. Mellor (Ed.), Philosophical Papers (pp. 52–109). Cambridge University Press.

44. Russell, J. S., Hawthorne, J., & Buchak, L. (2015). Groupthink. Philosophical Studies, 172(5), 1287–1309.

45. Savage, L. (1972, originally published in 1954). The Foundations of Statistics. New York: Wiley.

46. Schervish, M., & Seidenfeld, T. (1990). An approach to consensus and certainty with increasing evidence. Journal of Statistical Planning and Inference, 25(3), 401–414.

47. Seidenfeld, T. (1986). Entropy and uncertainty. Philosophy of Science, 53, 467–491.

48. Seidenfeld, T., Kadane, J. B., & Schervish, M. J. (1989). On the shared preferences of two Bayesian decision makers. The Journal of Philosophy, 86(5), 225–244.

49. Seidenfeld, T., Schervish, M. J., & Kadane, J. B. (2010). Coherent choice functions under uncertainty. Synthese, 172(1), 157–176.

50. Seidenfeld, T., & Wasserman, L. (1993). Dilation for sets of probabilities. The Annals of Statistics, 21(3), 1139–1154.

51. Skyrms, B. (1986). Choice and Chance: An Introduction to Inductive Logic (3rd ed.). Belmont: Wadsworth Publishing Company.

52. Spohn, W. (2012). The Laws of Belief: Ranking Theory and Its Philosophical Applications. Oxford: Oxford University Press.

53. Stewart, R. T., & Ojea Quintana, I. (2017). Probabilistic opinion pooling with imprecise probabilities. Journal of Philosophical Logic. doi:10.1007/s10992-016-9415-9.

54. van Fraassen, B. C. (1989). Laws and Symmetry. Oxford: Clarendon Press.

55. Wagner, C. (2002). Probability kinematics and commutativity. Philosophy of Science, 69(2), 266–278.

56. Wagner, C. (2009). Jeffrey conditioning and external Bayesianity. Logic Journal of IGPL, 18(2), 336–345.

57. Williams, P. M. (1980). Bayesian conditionalisation and the principle of minimum information. British Journal for the Philosophy of Science, 31, 131–144.

## Acknowledgements

The bulk of this work was done while we were on a Junior Group Visiting Fellowship at the Munich Center for Mathematical Philosophy. The paper benefited from conversations with Stephan Hartmann and Hannes Leitgeb. We would especially like to thank Greg Wheeler for feedback, numerous relevant discussions, and support. We are grateful to Matt Duncan, Robby Finley, Arthur Heller, Isaac Levi, Michael Nielsen, Rohit Parikh, Paul Pedersen, Teddy Seidenfeld, and Reuben Stern for their excellent comments on drafts or presentations of the paper. Finally, thanks to an anonymous referee for his or her meticulous and valuable review.

## Author information

### Corresponding author

Correspondence to Rush T. Stewart.

## Appendices

### Proof

We follow through Wagner’s proof for the precise case (2009, Theorem 3.3), adapting it for IP where necessary.

$$(\Rightarrow )$$ Assume that $${\mathcal {F}}$$ is externally Bayesian, i.e., for all profiles and any likelihood function, $${\mathcal {F}}^\lambda (\varvec{p}_1,\ldots , \varvec{p}_n) = {\mathcal {F}}(\varvec{p}_1^\lambda ,\ldots , \varvec{p}_n^\lambda )$$. We want to show that, for all partitions $$\varvec{E} = \{E_k\}$$ of $$\Omega$$ and all profiles in $${\mathbb {P}}^n$$,

\begin{aligned} {\mathcal {F}}_J^{\varvec{E}}(\varvec{p}_1,\ldots , \varvec{p}_n)= & {} \left\{ \dfrac{\sum _k b_k \varvec{p}[\cdot \in E_k]}{\sum _k b_k \varvec{p}(E_k)}: \varvec{p}\in {\mathcal {F}}(\varvec{p}_1,\ldots , \varvec{p}_n)\right\} \\= & {} {\mathcal {F}}\left( \dfrac{\sum _k b_k \varvec{p}_1[\cdot \in E_k]}{\sum _k b_k \varvec{p}_1(E_k)},\ldots , \dfrac{\sum _k b_k \varvec{p}_n[\cdot \in E_k]}{\sum _k b_k \varvec{p}_n(E_k)}\right) \\= & {} {\mathcal {F}}(\varvec{p}_{1J}^{\varvec{E}},\ldots , \varvec{p}_{nJ}^{\varvec{E}}) \end{aligned}

where the first and last equalities are definitional. Recall the definition of $$b_k$$: $$b_k = {\mathcal {B}}(\varvec{q},\varvec{p};E_k:E_1) = \dfrac{\varvec{q}(E_k)/\varvec{q}(E_1)}{\varvec{p}(E_k)/\varvec{p}(E_1)}$$, $$k = 1, 2,\ldots$$. Set $$\lambda (\omega ) = \sum _k b_k [\omega \in E_k]$$. Wagner observes that the following chain of equalities then holds for $$\varvec{p}_i, i = 1,\ldots , n$$ (2009, (3.10), p. 342):

\begin{aligned} (\star )\sum _{\omega \in \Omega } \lambda (\omega )\varvec{p}_i(\omega ) = \sum _{\omega \in \Omega }\varvec{p}_i(\omega )\sum _k b_k [\omega \in E_k] = \sum _k b_k \sum _{\omega \in \Omega }\varvec{p}_i(\omega )[\omega \in E_k] = \sum _k b_k \varvec{p}_i(E_k) \end{aligned}

Since each of the terms $$b_k \varvec{p}_i(E_k)$$ is positive and $$\sum _k b_k \varvec{p}_i(E_k) < \infty$$, $$\lambda$$ is a likelihood function for $$\varvec{p}_i,$$ with $$\varvec{p}^{\lambda}_{i}$$ a defined, updated pmf for $$i = 1,\ldots , n.$$ Using $$(\star )$$, we can obtain

\begin{aligned} {\mathcal {F}}(\varvec{p}_{1J}^{\varvec{E}},\ldots , \varvec{p}_{nJ}^{\varvec{E}}) = {\mathcal {F}}\left( \frac{\varvec{p}_1\lambda (\cdot )}{\sum _{\omega ' \in \Omega }\varvec{p}_1(\omega ')\lambda (\omega ')},\ldots , \frac{\varvec{p}_n\lambda (\cdot )}{\sum _{\omega ' \in \Omega } \varvec{p}_n(\omega ')\lambda (\omega ')}\right) \end{aligned}

by substituting, for each $$i=1,\ldots , n$$, $$\lambda (\cdot )$$ for $$\sum _k b_k [\omega \in E_k]$$ in the numerator and $$\sum _{\omega ' \in \Omega } \varvec{p}_i(\omega ')\lambda (\omega ')$$ for $$\sum _k b_k \varvec{p}_i(E_k)$$ in the denominator. But by definition,

\begin{aligned} {\mathcal {F}}\left( \frac{\varvec{p}_1\lambda (\cdot )}{\sum _{\omega ' \in \Omega }\varvec{p}_1(\omega ')\lambda (\omega ')},\ldots , \frac{\varvec{p}_n\lambda (\cdot )}{\sum _{\omega ' \in \Omega } \varvec{p}_n(\omega ')\lambda (\omega ')}\right) = {\mathcal {F}}(\varvec{p}_1^\lambda ,\ldots , \varvec{p}_n^\lambda ) \end{aligned}

and by assumption $${\mathcal {F}}(\varvec{p}_1^\lambda ,\ldots , \varvec{p}_n^\lambda )={\mathcal {F}}^\lambda (\varvec{p}_1,\ldots , \varvec{p}_n)$$. By definition, $${\mathcal {F}}^\lambda (\varvec{p}_1,\ldots , \varvec{p}_n) = \{\varvec{p}^\lambda : \varvec{p}\in {\mathcal {F}}(\varvec{p}_1,\ldots , \varvec{p}_n)\}$$. But, for all $$\varvec{p}\in {\mathcal {F}}(\varvec{p}_1,\ldots , \varvec{p}_n)$$, $$\varvec{p}^\lambda = \frac{\sum _k b_k \varvec{p}[\cdot \in E_k]}{\sum _k b_k \varvec{p}(E_k)}$$. Hence, $${\mathcal {F}}^\lambda (\varvec{p}_1,\ldots , \varvec{p}_n) = {\mathcal {F}}_J^{\varvec{E}}(\varvec{p}_1,\ldots , \varvec{p}_n)$$. So, $${\mathcal {F}}_J^{\varvec{E}}(\varvec{p}_1,\ldots , \varvec{p}_n) = {\mathcal {F}}(\varvec{p}_{1J}^{\varvec{E}},\ldots , \varvec{p}_{nJ}^{\varvec{E}})$$ follows from the assumption.

$$(\Leftarrow )$$ Suppose that $${\mathcal {F}}$$ satisfies $$\textit{CJC}_W$$ and that $$\lambda$$ is a likelihood function for $$\varvec{p}_i, i = 1,\ldots , n$$. Let $$(\omega _1, \omega _2,\ldots )$$ be a list of all of those $$\omega \in \Omega$$ such that $$\lambda (\omega ) > 0$$, and let $$\varvec{E} = \{E_1, E_2,\ldots \},$$ where $$E_i:\,= \{\omega _i\}.$$ Setting $$b_k = \frac{\lambda (\omega _k)}{\lambda (\omega _1)}$$ for $$k = 1, 2,\ldots$$, it follows that $$b_k>0$$ and that $$b_1=1$$. Since $$\lambda$$ is a likelihood for $$\varvec{p}_i, i = 1,\ldots , n,$$ we have $$\sum _k b_k \varvec{p}_i(E_k)<\infty , i = 1,\ldots , n,$$ and that $$(\varvec{q}_1,\ldots , \varvec{q}_n) \in {\mathbb {P}}^n,$$ where $$\varvec{q}_i(\omega ):\,= \frac{\sum _k b_k \varvec{p}_i(\omega )[\omega \in E_k]}{\sum _k b_k \varvec{p}_i(E_k)}.$$ From $$\textit{CJC}_W$$, it follows that $$1)\ 0< \sum _k b_k \varvec{p}(E_k) < \infty$$ for all $$\varvec{p}\in {\mathcal {F}}(\varvec{p}_1,\ldots , \varvec{p}_n),$$ and that $$2)\ {\mathcal {F}}_J^{\varvec{E}}(\varvec{p}_1,\ldots , \varvec{p}_n) = {\mathcal {F}}(\varvec{p}_{1J}^{\varvec{E}},\ldots , \varvec{p}_{nJ}^{\varvec{E}})$$. 1) implies that $$0<\sum _{\omega \in \Omega } \lambda (\omega ) \varvec{p}(\omega ) < \infty$$ for all $$\varvec{p}\in {\mathcal {F}}(\varvec{p}_1,\ldots , \varvec{p}_n)$$, and 2) implies that $${\mathcal {F}}^\lambda (\varvec{p}_1,\ldots , \varvec{p}_n) = {\mathcal {F}}(\varvec{p}_1^\lambda ,\ldots , \varvec{p}_n^\lambda )$$ (since substituting the definition of $$b_k$$ in terms of $$\lambda$$ in $$\frac{\sum _k b_k \varvec{p}_i(\omega )[\omega \in E_k]}{\sum _k b_k \varvec{p}_i(E_k)}$$, the formula for obtaining the $$\varvec{q}_i$$, reduces that formula to the formula for updating on that $$\lambda$$). $$\square$$
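The two facts doing the work in this proof can be checked on a small case: the identity $$(\star )$$, and the agreement between Jeffrey conditioning parameterized by Bayes factors and updating on $$\lambda (\omega ) = \sum _k b_k [\omega \in E_k]$$. The sketch below is illustrative only. The four-world space, the partition, the Bayes factors `b`, and the pmfs `p1` and `p2` are hypothetical, and the function names are assumptions of this sketch. The last check shows, for one mixture, that updating a convex combination on $$\lambda$$ yields a convex combination of the updated pmfs with reweighted coefficients.

```python
from fractions import Fraction as Fr

# Hypothetical data (illustration only).
worlds = ["w1", "w2", "w3", "w4"]
partition = [{"w1", "w2"}, {"w3", "w4"}]   # E_1, E_2
b = [Fr(1), Fr(3)]                         # Bayes factors, with b_1 = 1
p1 = dict(zip(worlds, [Fr(1, 4), Fr(1, 4), Fr(1, 4), Fr(1, 4)]))
p2 = dict(zip(worlds, [Fr(1, 10), Fr(2, 5), Fr(1, 5), Fr(3, 10)]))

def prob(p, A):
    return sum(p[w] for w in A)

def lam(w):
    """lambda(w) = sum_k b_k [w in E_k]."""
    return sum(bk for bk, Ek in zip(b, partition) if w in Ek)

def norm_lambda(p):
    """Left-hand side of (star): sum_w lambda(w) p(w)."""
    return sum(lam(w) * p[w] for w in worlds)

def update(p):
    """p^lambda: reweight by lambda and renormalize."""
    z = norm_lambda(p)
    return {w: p[w] * lam(w) / z for w in worlds}

def jeffrey_bf(p):
    """sum_k b_k p(w)[w in E_k] / sum_k b_k p(E_k)."""
    z = sum(bk * prob(p, Ek) for bk, Ek in zip(b, partition))
    return {w: sum(bk * p[w] for bk, Ek in zip(b, partition) if w in Ek) / z
            for w in worlds}

def mix(a, p, q):
    return {w: a * p[w] + (1 - a) * q[w] for w in worlds}

for p in (p1, p2):
    # (star): sum_w lambda(w) p(w) = sum_k b_k p(E_k)
    assert norm_lambda(p) == sum(bk * prob(p, Ek) for bk, Ek in zip(b, partition))
    # Jeffrey conditioning via Bayes factors coincides with updating on lambda.
    assert update(p) == jeffrey_bf(p)

# Updating a mixture gives a mixture of the updates, with a reweighted coefficient.
a = Fr(1, 3)
z1, z2 = norm_lambda(p1), norm_lambda(p2)
a_new = a * z1 / (a * z1 + (1 - a) * z2)
assert update(mix(a, p1, p2)) == mix(a_new, update(p1), update(p2))
```

The reweighted coefficient `a_new` is why the comparison is made at the level of sets of pmfs: an individual mixture weight is not preserved by the update, but membership in the convex hull is.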

### Proof

We provide a case in which convex IP pooling and Jeffrey conditionalization as standardly construed do not commute. Let $$\varvec{q}_i$$ come from $$\varvec{p}_i$$ by Jeffrey conditionalization, and let $$\varvec{q}$$ be a common posterior distribution over partition $$\varvec{E}$$ for $$\varvec{p}_i$$, $$i = 1,\ldots , n$$. Let $${\mathcal {F}}_{J}^{\varvec{E}}(\varvec{p}_1,\ldots , \varvec{p}_n)$$ come from $${\mathcal {F}}(\varvec{p}_1,\ldots , \varvec{p}_n)$$ by Jeffrey conditionalizing each $$\varvec{p}_i$$ using $$\varvec{q}$$, the common posterior distribution over $$\varvec{E}$$. We offer a counterexample to commutativity in which $${\mathcal {F}}_J^{\varvec{E}}(\varvec{p}_1,\ldots , \varvec{p}_n) \ne {\mathcal {F}}(\varvec{q}_1,\ldots , \varvec{q}_n)$$.

Let $$\Omega = \{\omega _1, \omega _2, \omega _3, \omega _4\}$$, and consider the two pmfs listed in Table 2. Let $$\varvec{E} = \{E_1, E_2\}$$ with $$E_1 = \{\omega _1, \omega _2\}$$ and $$E_2 = \{\omega _3, \omega _4\}$$ be a partition of $$\Omega$$. Jeffrey updating both pmfs using $$\varvec{q}$$, where $$\varvec{q}(E_1) = 2/3$$ and $$\varvec{q}(E_2) = 1/3$$, we obtain the posteriors listed in Table 3.

Consider the 50-50 mixture of $$\varvec{p}_1$$ and $$\varvec{p}_2$$, $$\varvec{p}^\star = 0.5\varvec{p}_1 + 0.5\varvec{p}_2$$. It is clear that $$\varvec{p}^\star \in {\mathcal {F}}(\varvec{p}_1, \varvec{p}_2)$$. Jeffrey conditionalizing $$\varvec{p}^\star$$ with $$\varvec{q}$$ gives us $$\varvec{q}^\star$$. In particular, $$\varvec{q}^\star (\omega _1) = 2/9$$ and $$\varvec{q}^\star (\omega _3) = 4/21$$. It is clear that $$\varvec{q}^\star \in {\mathcal {F}}_J^{\varvec{E}}(\varvec{p}_1, \varvec{p}_2)$$. Any $$\varvec{q}_\star \in {\mathcal {F}}(\varvec{q}_1, \varvec{q}_2)$$ is of the form $$\varvec{q}_\star = \alpha \varvec{q}_1 + (1 - \alpha ) \varvec{q}_2$$ for $$\alpha \in [0, 1]$$.

Suppose that $${\mathcal {F}}_J^{\varvec{E}}(\varvec{p}_1, \varvec{p}_2) = {\mathcal {F}}(\varvec{q}_1, \varvec{q}_2)$$. Then, there is a $$\varvec{q}_\star \in {\mathcal {F}}(\varvec{q}_1, \varvec{q}_2)$$ such that $$\varvec{q}^\star = \varvec{q}_\star$$. In particular, $$\varvec{q}_\star (\omega _1) = 2/9$$ and $$\varvec{q}_\star (\omega _3) = 4/21$$. Letting $$\varvec{q}_\star (\omega _1) = 2/9$$, we can compute $$\alpha$$.

\begin{aligned} \tfrac{2}{9} = \varvec{q}_\star (\omega _1) = \alpha \varvec{q}_1(\omega _1) + (1 - \alpha )\varvec{q}_2(\omega _1) = \tfrac{1}{3}\alpha + \tfrac{2}{15}(1 - \alpha ) \end{aligned}

Solving, we get $$\alpha = 4/9$$. However, we are supposed to have $$\varvec{q}_\star (\omega _3) = 4/21$$. For $$\alpha = 4/9$$, that is not the case.

\begin{aligned} \varvec{q}_\star (\omega _3) = \alpha \varvec{q}_1(\omega _3) + (1 - \alpha ) \varvec{q}_2(\omega _3) = \tfrac{4}{9}\cdot \tfrac{1}{6} + \tfrac{5}{9}\cdot \tfrac{2}{9} = \tfrac{16}{81} > \tfrac{4}{21} = \varvec{q}^\star (\omega _3) \end{aligned}

It follows that $${\mathcal {F}}_J^{\varvec{E}}(\varvec{p}_1, \varvec{p}_2) \ne {\mathcal {F}}(\varvec{q}_1, \varvec{q}_2)$$. $$\square$$
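The arithmetic of the counterexample can be replayed with exact fractions. This sketch uses only the values stated in the proof above (the entries of $$\varvec{q}_1$$, $$\varvec{q}_2$$, and $$\varvec{q}^\star$$ at $$\omega _1$$ and $$\omega _3$$); the dictionary layout and variable names are assumptions of the sketch, and the full tables are not reconstructed.

```python
from fractions import Fraction as Fr

# Values quoted in the proof; other table entries are not needed here.
q1 = {"w1": Fr(1, 3), "w3": Fr(1, 6)}
q2 = {"w1": Fr(2, 15), "w3": Fr(2, 9)}
q_star = {"w1": Fr(2, 9), "w3": Fr(4, 21)}   # Jeffrey update of the 50-50 mixture

# Solve q_star(w1) = alpha*q1(w1) + (1 - alpha)*q2(w1) for alpha.
alpha = (q_star["w1"] - q2["w1"]) / (q1["w1"] - q2["w1"])

# The same alpha would have to reproduce q_star at w3 as well.
mixed_w3 = alpha * q1["w3"] + (1 - alpha) * q2["w3"]

print(alpha, mixed_w3)            # prints: 4/9 16/81
assert mixed_w3 != q_star["w3"]   # so no single mixture weight fits both worlds
```

Because $$\alpha$$ is pinned down uniquely by $$\omega _1$$, the mismatch at $$\omega _3$$ is enough to show that $$\varvec{q}^\star$$ lies outside the convex hull of $$\varvec{q}_1$$ and $$\varvec{q}_2$$.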

### Proof

We want to show that $${\mathcal {F}}(\varvec{q}_1,\ldots , \varvec{q}_n) = {\mathcal {F}}_I^E(\varvec{p}_1,\ldots , \varvec{p}_n)$$, where $$\varvec{q}_i$$ comes from $$\varvec{p}_i$$ by general imaging on E, and $${\mathcal {F}}_I^E(\varvec{p}_1,\ldots , \varvec{p}_n)$$ comes from $${\mathcal {F}}(\varvec{p}_1,\ldots , \varvec{p}_n)$$ by general imaging each $$\varvec{p}\in {\mathcal {F}}(\varvec{p}_1,\ldots , \varvec{p}_n)$$ on E. Again, we show both inclusions. In the proofs, we appeal to the fact that any element of a convex set is some convex combination of the generating, extreme points: For any $$\varvec{p}\in {\mathcal {F}}(\varvec{p}_1,\ldots , \varvec{p}_n), \varvec{p}=\sum _{i=1}^n \alpha _i\varvec{p}_i$$, where $$\alpha _i \ge 0$$ for $$i = 1,\ldots , n$$, and $$\sum _{i=1}^n \alpha _i= 1$$ (see, e.g., Stewart and Ojea Quintana 2017, Lemma 1).

Let $$\varvec{q}\in {\mathcal {F}}(\varvec{q}_1,\ldots , \varvec{q}_n)$$. So, $$\varvec{q}= \sum _{i=1}^n \alpha _i\varvec{q}_i$$. Since $$\varvec{q}$$ is a linear pool of $$\varvec{q}_i$$ for $$i = 1,\ldots , n$$, by Gärdenfors’ result, Theorem 5, $$\varvec{q}$$ is also the result of imaging $$\varvec{p}= \sum _{i=1}^n\alpha _i\varvec{p}_i$$ on E, because linear pooling and general imaging commute. Since $$\varvec{p}\in {\mathcal {F}}(\varvec{p}_1,\ldots , \varvec{p}_n)$$, it follows that $$\varvec{q}\in {\mathcal {F}}_I^E(\varvec{p}_1,\ldots , \varvec{p}_n)$$.

For the other direction, assume that $$\varvec{q}\in {\mathcal {F}}_I^E(\varvec{p}_1,\ldots , \varvec{p}_n)$$. So, $$\varvec{q}$$ is the result of general imaging some $$\varvec{p}\in {\mathcal {F}}(\varvec{p}_1,\ldots , \varvec{p}_n)$$ on E. For any $$\varvec{p}\in {\mathcal {F}}(\varvec{p}_1,\ldots , \varvec{p}_n), \varvec{p}= \sum _{i=1}^n\alpha _i\varvec{p}_i$$. By Gärdenfors’ result, $$\varvec{q}= \sum _{i=1}^n \alpha _i \varvec{q}_i$$, where the $$\varvec{q}_i$$ come from the $$\varvec{p}_i$$ by general imaging on E, because general imaging and linear pooling commute. But then it follows that $$\varvec{q}\in {\mathcal {F}}(\varvec{q}_1,\ldots , \varvec{q}_n)$$. $$\square$$
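The commutativity of general imaging with linear pooling, which both inclusions rely on, comes down to imaging being linear in the pmf. The sketch below illustrates this on a toy case. It assumes that general imaging on E is given by a transfer function that moves each world's mass onto worlds in E in fixed shares; the three-world space, the transfer table `T`, the pmfs, and the weights are all hypothetical.

```python
from fractions import Fraction as Fr

worlds = ["w1", "w2", "w3"]
E = {"w1", "w2"}

# Hypothetical transfer function for imaging on E: T[w][v] is the share of
# w's probability moved to v, with v in E and the shares for each w summing to 1.
T = {
    "w1": {"w1": Fr(1)},
    "w2": {"w2": Fr(1)},
    "w3": {"w1": Fr(1, 4), "w2": Fr(3, 4)},
}

def image(p):
    """General imaging of p on E under the transfer function T."""
    q = {w: Fr(0) for w in worlds}
    for w in worlds:
        for v, share in T[w].items():
            q[v] += p[w] * share
    return q

def pool(weights, pmfs):
    """Linear pool: the convex combination sum_i alpha_i p_i."""
    return {w: sum(a * p[w] for a, p in zip(weights, pmfs)) for w in worlds}

# Hypothetical pmfs and weights (illustration only).
p1 = {"w1": Fr(1, 2), "w2": Fr(1, 4), "w3": Fr(1, 4)}
p2 = {"w1": Fr(1, 5), "w2": Fr(1, 5), "w3": Fr(3, 5)}
weights = [Fr(2, 5), Fr(3, 5)]

# Imaging the pool equals pooling the images, with the same weights.
assert image(pool(weights, [p1, p2])) == pool(weights, [image(p1), image(p2)])
```

Unlike updating on a likelihood, imaging involves no renormalization, so the mixture weights carry over unchanged.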
