
Bayesian learning with multiple priors and nonvanishing ambiguity

  • Research Article
  • Published in: Economic Theory

Abstract

The existing models of Bayesian learning with multiple priors by Marinacci (Stat Pap 43:145–151, 2002) and by Epstein and Schneider (Rev Econ Stud 74:1275–1303, 2007) formalize the intuitive notion that ambiguity should vanish through statistical learning in a one-urn environment. Moreover, the multiple priors decision maker of these models will eventually learn the “truth.” To accommodate nonvanishing violations of Savage’s (The foundations of statistics, Wiley, New York, 1954) sure-thing principle, as reported in Nicholls et al. (J Risk Uncertain 50:97–115, 2015), we construct and analyze a model of Bayesian learning with multiple priors for which ambiguity does not necessarily vanish in a one-urn environment. Our decision maker only forms posteriors from priors that survive a prior selection rule which discriminates, with probability one, against priors whose expected Kullback–Leibler divergence from the “truth” is too far from the minimal expected Kullback–Leibler divergence over all priors. The “stubbornness” parameter of our prior selection rule thereby governs how much ambiguity remains in the limit of our learning model.




Notes

  1. An alternative (and under specific circumstances formally equivalent) class of models that accommodates ambiguity attitudes is the class of Choquet decision-making/Choquet expected utility models (Schmeidler 1989; Gilboa 1987). These Choquet models express ambiguity attitudes through nonadditive probability measures.

  2. The seminal contribution is Doob’s (1949) consistency theorem. For generalizations and further references, see, e.g., Diaconis and Freedman (1986), Chapter 1 in Ghosh and Ramamoorthi (2003), and Lijoi et al. (2004).

  3. Of course, the utility function u is only unique up to some positive affine transformation.

  4. Gilboa and Schmeidler (1989) axiomatize MEU within an Anscombe and Aumann (1963) framework where the set of consequences Z contains all lotteries over some non-degenerate set of deterministic prizes. Under this Gilboa and Schmeidler (1989) axiomatization, \(\mathcal {P}\) is uniquely pinned down as a non-empty, closed and convex set of finitely additive probability measures. We ignore here this specific axiomatic foundation and also allow for, e.g., non-convex \(\mathcal {P}\).

  5. For a more realistic generalization of MEU, see the \(\alpha \)-MEU concept of Ghirardato et al. (2004).

  6. We exclude \(\theta =0\) and \(\theta =60\) out of convenience since we do not want to take a stand on Bayesian updating in light of events that the decision maker perceives as impossible. For example, we want to avoid the case that a prior attaches probability one to zero yellow balls in the urn, but the decision maker observes a yellow ball drawn from the urn.

  7. In the literature, \(\left( \Theta ,\mathcal {F}\right) \) is also called the (possibly multiple) parameter space.

  8. As a generalization of the single-likelihood environment, ES-2007 consider a “multiple-likelihoods” environment where an index \(\theta \) in \(\Theta \) corresponds to a set of \(\theta \)-conditional probability measures. While the formal results of this paper are derived exclusively for the single-likelihood environment, see Sect. 6 for an outlook on future research.

  9. Since there is a one-to-one correspondence between all probability measures on \(\left( \Theta ,\mathcal {F}\right) \) and the points in \(\triangle ^{n}\), we slightly abuse notation and write \(\mu _{0}\equiv \left( \mu _{0}^{1},\ldots ,\mu _{0}^{n}\right) \in \triangle ^{n}\) for the additive probability measure \(\mu _{0}:\mathcal {F}\rightarrow \left[ 0,1\right] \) such that, for all non-empty \(\Theta ^{\prime }\in \mathcal {F}\),

    $$\begin{aligned} \mu _{0}\left( \Theta ^{\prime }\right) =\Sigma _{\left\{ \theta _{j}\in \Theta ^{\prime }\right\} }\mu _{0}^{j}\text {.} \end{aligned}$$
  10. Whenever we henceforth speak of emerging posteriors or emerging priors, the qualification “with probability one” is implicitly included.

  11. An accessible proof can be found in Section 1.3.3 of Ghosh and Ramamoorthi (2003).

  12. The KL-divergence is asymmetric and does not satisfy the triangle inequality.

  13. By Bayes’ rule, we have that

    $$\begin{aligned} \frac{\mathrm{{d}}\varphi _{\theta ^{*}}\left( x\right) }{\mathrm{{d}}\varphi _{\theta }\left( x\right) }=\frac{\pi _{\mu _{0}}\left( \theta ^{*}\mid x\right) /\mu _{0}\left( \theta ^{*}\right) }{\pi _{\mu _{0}}\left( \theta \mid x\right) /\mu _{0}\left( \theta \right) }\text {.} \end{aligned}$$
  14. The set of all cluster points of a given sequence of sets is also called the topological lim sup of this sequence (Aliprantis and Border 2006, p. 114) or the upper limit of this sequence (Berge 1997, p. 119).

  15. The set of all limit points of a given sequence of sets is also called the topological lim inf of this sequence (Aliprantis and Border 2006, p. 114) or the lower limit of this sequence (Berge 1997, p. 119).

  16. Due to the degenerate priors, this is here equivalent to the maximum expected loglikelihood rule.

  17. ES-2007 restrict attention to a finite state space \(\Omega \). Because ES-2007 allow for non-finite index sets \(\Theta \), they impose weak compactness of \(\mathcal {M}_{0}\) and also require that \(\mu _{0}\left( \theta ^{*}\right) \) be uniformly bounded away from zero if \(\theta ^{*}\) is in the support of \(\mu _{0}\). For our finite index sets, \(\mathcal {M}_{0}\) is weakly compact if, and only if, it is closed, and the bounded-away-from-zero condition is automatically satisfied. For further details about their regularity assumptions, see Theorem 1 (ES-2007, p. 1288).

  18. A negative expected cross-entropy is impossible for the finite but not for the continuous case.

  19. Note that this assumption holds under Gilboa and Schmeidler’s (1989, p. 142) “extreme case.”

  20. Note that, for all \(\theta \) and \(\theta ^{\prime }=60-\theta \),

    $$\begin{aligned} D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })=D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta ^{\prime }})\text {,} \end{aligned}$$

    so that there might be priors in \(\mathcal {M}_{0}\), e.g.,

    $$\begin{aligned} 0.5\delta _{\theta }+0.5\delta _{\theta ^{\prime }}\text {,} \end{aligned}$$

    with two different KL-divergence minimizers in their support. Emerging posteriors formed from such priors are not necessarily Dirac measures.

  21. Also see p. 570 in Aliprantis and Border (2006).
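The two facts recorded in footnotes 12 and 20 are easy to check numerically. The following Python sketch is only an illustration: the distributions p, q, r are arbitrary, and the urn specification (90 balls, 30 red, \(\theta \) yellow, \(60-\theta \) black, with true composition \(\theta ^{*}=30\)) follows the paper’s three-color example.

```python
import math

def kl(p, q):
    # Kullback-Leibler divergence D_KL(p || q) for finite distributions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Footnote 12: D_KL is asymmetric and violates the triangle inequality
p, q, r = (0.9, 0.1), (0.5, 0.5), (0.1, 0.9)
print(kl(p, q), kl(q, p))              # two different values
print(kl(p, r) > kl(p, q) + kl(q, r))  # True: triangle inequality fails

def phi(theta):
    # Outcome distribution (red, yellow, black) of the 90-ball urn
    return (30 / 90, theta / 90, (60 - theta) / 90)

# Footnote 20: with theta* = 30, the divergence from the truth is
# identical for theta and theta' = 60 - theta
for theta in range(1, 60):
    assert math.isclose(kl(phi(30), phi(theta)), kl(phi(30), phi(60 - theta)))
```

As the footnote observes, a prior that mixes \(\delta _{\theta }\) and \(\delta _{60-\theta }\) therefore has two distinct KL-divergence minimizers in its support.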

References

  • Aliprantis, C.D., Border, K.C.: Infinite Dimensional Analysis. Springer, Berlin (2006)

  • Anscombe, F.J., Aumann, R.J.: A definition of subjective probability. Ann. Math. Stat. 34, 199–205 (1963)

  • Berge, C.: Topological Spaces. Dover Publications, New York (1997)

  • Berk, R.H.: Limiting behaviour of posterior distributions when the model is incorrect. Ann. Math. Stat. 37, 51–58 (1966)

  • Billingsley, P.: Probability and Measure. Wiley, New York (1995)

  • Chung, K.L., Fuchs, W.H.J.: On the distribution of values of sums of random variables. Mem. Am. Math. Soc. 6, 1–12 (1951). Reprinted in: AitSahlia, F., Hsu, E., Williams, R. (eds.) Selected Works of Kai Lai Chung, pp. 157–168. World Scientific, New Jersey (2008)

  • Diaconis, P., Freedman, D.: On the consistency of Bayes estimates. Ann. Stat. 14, 1–26 (1986)

  • Doob, J.L.: Application of the theory of martingales. In: Le Calcul des Probabilités et ses Applications, Colloques Internationaux du Centre National de la Recherche Scientifique, vol. 13, pp. 23–27. CNRS, Paris (1949)

  • Dow, J., Madrigal, V., Werlang, S.R.: Preferences, Common Knowledge, and Speculative Trade. Mimeo, New York (1990)

  • Ellsberg, D.: Risk, ambiguity and the Savage axioms. Q. J. Econ. 75, 643–669 (1961)

  • Epstein, L.G., Schneider, M.: Learning under ambiguity. Rev. Econ. Stud. 74, 1275–1303 (2007)

  • Ghirardato, P., Maccheroni, F., Marinacci, M.: Differentiating ambiguity and ambiguity attitude. J. Econ. Theory 118, 133–173 (2004)

  • Ghosh, J.K., Ramamoorthi, R.V.: Bayesian Nonparametrics. Springer, Berlin (2003)

  • Gilboa, I.: Expected utility with purely subjective non-additive probabilities. J. Math. Econ. 16, 65–88 (1987)

  • Gilboa, I., Schmeidler, D.: Maxmin expected utility with non-unique priors. J. Math. Econ. 18, 141–153 (1989)

  • Halevy, Y.: The possibility of speculative trade between dynamically consistent agents. Games Econ. Behav. 46, 189–198 (2004)

  • Harrison, M., Kreps, D.: Speculative investor behavior in a stock market with heterogeneous expectations. Q. J. Econ. 92, 323–336 (1978)

  • Jaffray, J.-Y.: Dynamic decision making with belief functions. In: Yager, R.R., Fedrizzi, M., Kacprzyk, J. (eds.) Advances in the Dempster–Shafer Theory of Evidence, pp. 331–352. Wiley, New York (1994)

  • Kleijn, B.J.K., van der Vaart, A.W.: Misspecification in infinite-dimensional Bayesian statistics. Ann. Stat. 34, 837–877 (2006)

  • Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22, 79–86 (1951)

  • Lijoi, A., Pruenster, I., Walker, S.G.: Extending Doob’s consistency theorem to nonparametric densities. Bernoulli 10, 651–663 (2004)

  • Marinacci, M.: Learning from ambiguous urns. Stat. Pap. 43, 145–151 (2002)

  • Mehra, R., Prescott, E.C.: The equity premium: a puzzle. J. Monet. Econ. 15, 145–161 (1985)

  • Mehra, R., Prescott, E.C.: The equity premium in retrospect. In: Constantinides, G.M., Harris, M., Stulz, R.M. (eds.) Handbook of the Economics of Finance, pp. 808–887. Elsevier, Amsterdam (2003)

  • Muth, J.F.: Rational expectations and the theory of price movements. Econometrica 29, 315–335 (1961)

  • Nicholls, N., Romm, A.T., Zimper, A.: The impact of statistical learning on violations of the sure-thing principle. J. Risk Uncertain. 50, 97–115 (2015)

  • Savage, L.J.: The Foundations of Statistics. Wiley, New York (1954)

  • Schmeidler, D.: Subjective probability and expected utility without additivity. Econometrica 57, 571–587 (1989)

  • Wakker, P.P.: Prospect Theory for Risk and Ambiguity. Cambridge University Press, Cambridge (2010)

  • Werner, J.: Speculative trade under ambiguity. Working paper (2014)

  • Wu, G., Gonzalez, R.: Nonlinear decision weights in choice under uncertainty. Manag. Sci. 45, 74–85 (1999)

  • Zimper, A.: Half empty, half full and why we can agree to disagree forever. J. Econ. Behav. Organ. 71, 283–299 (2009)


Author information

Correspondence to Alexander Zimper.

Additional information

We thank Larry Epstein, Massimo Marinacci, Daniele Pennesi, an anonymous referee, and especially Jan Werner for helpful comments and suggestions. Financial support from ERSA (Economic Research Southern Africa) is gratefully acknowledged.

Appendix: formal proofs


Existence of an emerging posterior To see that the limit (43) exists, rewrite, for any \(\Theta ^{\prime }\), \(\pi _{\mu _{0}}\left( \Theta ^{\prime }\mid X_{1},\ldots ,X_{t}\right) \) as the conditional expectation of the indicator function of \(\Theta ^{\prime }\) with respect to the induced probability measure P on the joint index and sample space \(\left( \Theta \times \Omega ^{\infty },\mathcal {F\otimes }\Sigma ^{\infty }\right) \). To be precise, in the notation of our setup, it holds that, for all \(\theta \in \Theta \) and \(B\in \Sigma ^{\infty }\),

$$\begin{aligned} P_{\theta }\left( B\right) \equiv P\left( B\mid \theta \right) \equiv P\left( \Theta \times B\mid \left\{ \theta \right\} \times \Omega ^{\infty }\right) \end{aligned}$$

as well as, for all \(\Theta ^{\prime }\in \mathcal {F}\),

$$\begin{aligned} \mu _{0}\left( \Theta ^{\prime }\right) \equiv P\left( \Theta ^{\prime }\right) \equiv P\left( \Theta ^{\prime }\times \Omega ^{\infty }\right) \text {.} \end{aligned}$$

By Theorem 35.6 in Billingsley (1995) (which is an implication of the martingale convergence theorem), we obtain

$$\begin{aligned} \pi _{\mu _{0}}\left( \Theta ^{\prime }\mid X_{1},\ldots ,X_{t}\right)\equiv & {} E \left[ I_{\Theta ^{\prime }}\left( \theta \right) ,P\left( \theta \mid X_{1},\ldots ,X_{t}\right) \right] \\\rightarrow & {} E\left[ I_{\Theta ^{\prime }}\left( \theta \right) ,P\left( \theta \mid X_{1},X_{2},\ldots \right) \right] \equiv \pi _{\mu _{0}}^{\infty }\left( \Theta ^{\prime }\right) \end{aligned}$$

whereby convergence holds with P-probability one.\(\square \)
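For intuition, this almost-sure convergence of the posteriors can be illustrated by simulation. The following sketch uses assumed specifics that are not taken from the paper (a hypothetical finite index set \(\Theta =\left\{ 20,30,40\right\} \), true index \(\theta ^{*}=30\), and a uniform single prior \(\mu _{0}\)); it tracks the posterior formed from t draws of the 90-ball urn.

```python
import math
import random

random.seed(0)

def phi(theta):
    # Outcome distribution (red, yellow, black) of the 90-ball urn
    return (30 / 90, theta / 90, (60 - theta) / 90)

Theta = [20, 30, 40]                 # hypothetical finite index set
theta_star = 30                      # assumed true index
prior = {th: 1 / 3 for th in Theta}  # uniform prior mu_0

# Draw t i.i.d. observations from the true urn
draws = random.choices(range(3), weights=phi(theta_star), k=3000)

# Bayesian updating via accumulated log-likelihoods (log-space for stability)
loglik = {th: sum(math.log(phi(th)[x]) for x in draws) for th in Theta}
log_post = {th: math.log(prior[th]) + loglik[th] for th in Theta}
m = max(log_post.values())
weights = {th: math.exp(lp - m) for th, lp in log_post.items()}
z = sum(weights.values())
posterior = {th: w / z for th, w in weights.items()}

print(posterior)  # mass concentrates on theta_star = 30
```

In this identifiable single-prior setting, the emerging posterior is the Dirac measure on \(\theta ^{*}\), in line with Doob-style consistency.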

Proof of Proposition 2

For notational convenience, we write

$$\begin{aligned} \pi _{\mu _{0}}^{t}\equiv \pi _{\mu _{0}}\left( \cdot \mid X_{1},\ldots ,X_{t}\right) \text {.} \end{aligned}$$
  • Step 1 Suppose that there exists some \(\pi _{\mu _{0}^{*}}^{\infty }\in \Pi _{R}^{\infty }\) but

    $$\begin{aligned} \pi _{\mu _{0}^{*}}^{\infty }\notin \overline{\lim }\bigcup \limits _{\mu _{0}\in \mathcal {M}_{0,R}^{t}}\left\{ \pi _{\mu _{0}}^{t}\right\} \text {, a.s. }P_{\theta ^{*}}\text {.} \end{aligned}$$
    (113)

    Since \(\mu _{0}^{*}\in \mathcal {M}_{0,R}^{\infty }\) and \(\mathcal {M}_{0}\) is compact, there must be (a.s. \(P_{\theta ^{*}}\)) some subsequence \(\left\{ \mu _{0}^{t_{k}}\right\} _{k\in \mathbb {N}}\) with \(\mu _{0}^{t_{k}}\in \mathcal {M}_{0,R}^{t_{k}}\) for all \(k=1,2,\ldots \), which converges to \(\mu _{0}^{*}\). But then \(\pi _{\mu _{0}^{t_{k}}}^{t_{k}}\in \Pi _{R}^{t_{k}}\) for all \(k=1,2,\ldots \), implying

    $$\begin{aligned} \pi _{\mu _{0}^{*}}^{\infty }\in \overline{\lim }\bigcup \limits _{\mu _{0}\in \mathcal {M}_{0,R}^{t}}\left\{ \pi _{\mu _{0}}^{t}\right\} \text {, a.s. }P_{\theta ^{*}}\text {,} \end{aligned}$$
    (114)

    a contradiction to (113).

  • Step 2 Next suppose that there exists some

    $$\begin{aligned} \pi ^{\infty }\in \overline{\lim }\bigcup \limits _{\mu _{0}\in \mathcal {M} _{0,R}^{t}}\left\{ \pi _{\mu _{0}}^{t}\right\} \text {, a.s. }P_{\theta ^{*}} \end{aligned}$$
    (115)

    but

    $$\begin{aligned} \pi ^{\infty }\notin \Pi _{R}^{\infty }. \end{aligned}$$
    (116)

    By (115), there must be (a.s. \(P_{\theta ^{*}}\)) some subsequence \(\left\{ \pi _{\mu _{0}^{t_{k}}}^{t_{k}}\right\} _{k\in \mathbb {N}}\) with \(\pi _{\mu _{0}^{t_{k}}}^{t_{k}}\in \Pi _{R}^{t_{k}}\) for all \(k=1,2,\ldots \), which converges to \(\pi ^{\infty }\). By compactness of \(\mathcal {M}_{0}\), we can extract a subsequence \(\left\{ \mu _{0}^{t_{k^{\prime }}}\right\} _{k^{\prime }\in \mathbb {N}}\) from \(\left\{ \mu _{0}^{t_{k}}\right\} _{k\in \mathbb {N}}\) which converges to some \(\mu _{0}^{*}\in \mathcal {M}_{0}\). As this \(\mu _{0}^{*}\) is a cluster point of \(\left\{ \mathcal {M}_{0,R}^{t}\right\} _{t\in \mathbb {N}}\), we have that \(\mu _{0}^{*}\in \mathcal {M}_{0,R}^{\infty }\) so that \(\pi _{\mu _{0}^{*}}^{\infty }\in \Pi _{R}^{\infty }\). Since \(\left\{ \pi _{\mu _{0}^{t_{k^{\prime }}}}^{t_{k^{\prime }}}\right\} _{k^{\prime }\in \mathbb {N}}\) converges to \(\pi _{\mu _{0}^{*}}^{\infty }\) as well as to \(\pi ^{\infty }\), we must have that \(\pi _{\mu _{0}^{*}}^{\infty }=\pi ^{\infty }\), a contradiction to (116).

Collecting Steps 1 and 2 proves the proposition. \(\square \)

Proof of Theorem 2

  • Step 1 By Definition 7, the expected loglikelihood maximizing prior(s) will be in \(\mathcal {M}_{0,\gamma }^{t}\) for all t, i.e.,

    $$\begin{aligned} \arg \max _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }\ln \prod _{i=1}^{t}\frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}\left( \theta \right) \subseteq \mathcal {M}_{0,\gamma }^{t}\text {.} \end{aligned}$$
    (117)

    By a similar formal argument as under Step 3 below, it can be shown that \(\mathcal {M}_{0,\gamma }^{\infty }\) is (a.s. \(P_{\theta ^{*}}\)) never empty since

    $$\begin{aligned} \arg \min _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })\cdot \mu _{0}\left( \theta \right) \subseteq \mathcal {M}_{0,\gamma }^{\infty }\text { a.s. } P_{\theta ^{*}}\text {,} \end{aligned}$$
    (118)

    i.e., the expected Kullback–Leibler divergence minimizers belong asymptotically to the expected \(\gamma \)-loglikelihood maximizers for any value of \(\gamma \).

  • Step 2 Observe that any

    $$\begin{aligned} \mu _{0}^{\prime }\in \arg \min _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })\cdot \mu _{0}\left( \theta \right) \end{aligned}$$
    (119)

    belongs to (88) if (88) is non-empty.

  • Step 3 Suppose now that

    $$\begin{aligned} \mu _{0}^{\prime }\in \mathcal {M}_{0,\gamma }^{\infty } \end{aligned}$$
    (120)

    but

    $$\begin{aligned} \mu _{0}^{\prime }\notin \arg \min _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })\cdot \mu _{0}\left( \theta \right) \text {.} \end{aligned}$$
    (121)

    This is only possible if there exists some subsequence \(\left\{ t_{k}\right\} _{k\in \mathbb {N}}\subseteq \left\{ t\right\} _{t\in \mathbb {N} }\) such that

    $$\begin{aligned} \sum _{\theta \in \Theta }\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}^{\prime }\left( \theta \right)\ge & {} \gamma \cdot \max _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}\left( \theta \right) \Leftrightarrow \end{aligned}$$
    (122)
    $$\begin{aligned} \sum _{\theta \in \Theta }\frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}^{\prime }\left( \theta \right)\ge & {} \gamma \cdot \max _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta } \frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m} \cdot \mu _{0}\left( \theta \right) \Rightarrow \nonumber \\ \end{aligned}$$
    (123)
    $$\begin{aligned} \lim _{t_{k}\rightarrow \infty }\sum _{\theta \in \Theta }\frac{1}{t_{k}} \sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}^{\prime }\left( \theta \right)\ge & {} \lim _{t_{k}\rightarrow \infty }\gamma \cdot \max _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta } \frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m} \cdot \mu _{0}\left( \theta \right) \text {.}\nonumber \\ \end{aligned}$$
    (124)

Focus on the l.h.s. term of (124). Because \(\Theta \) is finite, we can switch the sum and the limit to obtain

$$\begin{aligned} \lim _{t_{k}\rightarrow \infty }\sum _{\theta \in \Theta }\frac{1}{t_{k}} \sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}^{\prime }\left( \theta \right) =\sum _{\theta \in \Theta }\lim _{t_{k}\rightarrow \infty }\frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}^{\prime }\left( \theta \right) \text {.} \end{aligned}$$
(125)

Turn now to the r.h.s. term of (124). We are going to argue, via Berge’s (1997) maximum theorem, that we can switch the max and the limit. To this end, define the following inner product

$$\begin{aligned} f\left( y_{t_{k}},\mu _{0}\right)\equiv & {} \sum _{\theta \in \Theta }\frac{1}{ t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}\left( \theta \right) \end{aligned}$$
(126)
$$\begin{aligned}= & {} y_{t_{k}}\cdot \mu _{0} \end{aligned}$$
(127)

where

$$\begin{aligned} y_{t_{k}}=\left( \frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta _{1}}}{\mathrm{{d}}m},\ldots ,\frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{ \mathrm{{d}}\varphi _{\theta _{n}}}{\mathrm{{d}}m}\right) \end{aligned}$$
(128)

and

$$\begin{aligned} \mu _{0}=\left( \mu _{0}\left( \theta _{1}\right) ,\ldots ,\mu _{0}\left( \theta _{n}\right) \right) \text {.} \end{aligned}$$
(129)

Next define the value function of (126) as

$$\begin{aligned} M\left( y_{t_{k}}\right) =\max _{\mu _{0}\in \mathcal {M}_{0}}f\left( y_{t_{k}},\mu _{0}\right) \text {.} \end{aligned}$$
(130)

Since \(\mathcal {M}_{0}\) is, as a closed subset of \(\triangle ^{n}\), compact and f is continuous, we know from Berge’s (1997, p. 116) maximum theorem that the value function \(M\left( y_{t_{k}}\right) \) is continuous. Consequently, if \(\lim _{t_{k}\rightarrow \infty }y_{t_{k}}\) exists, then

$$\begin{aligned} \lim _{t_{k}\rightarrow \infty }\gamma \cdot M\left( y_{t_{k}}\right) =\gamma \cdot M\left( \lim _{t_{k}\rightarrow \infty }y_{t_{k}}\right) \text {.} \end{aligned}$$
(131)

In other words, if

$$\begin{aligned} \lim _{t_{k}\rightarrow \infty }\frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m} \end{aligned}$$
(132)

exists (a.s. \(P_{\theta ^{*}}\)) for all \(\theta \), which we will show in a moment, then

$$\begin{aligned}&\lim _{t_{k}\rightarrow \infty }\gamma \cdot \max _{\mu _{0}\in \mathcal {M} _{0}}\sum _{\theta \in \Theta }\frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}\left( \theta \right) \end{aligned}$$
(133)
$$\begin{aligned}&\quad =\gamma \cdot \max _{\mu _{0}\in \mathcal {M}_{0}}\lim _{t_{k}\rightarrow \infty }\sum _{\theta \in \Theta }\frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}} \ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}\left( \theta \right) \end{aligned}$$
(134)
$$\begin{aligned}&\quad =\gamma \cdot \max _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }\lim _{t_{k}\rightarrow \infty }\frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}\left( \theta \right) \text {.} \end{aligned}$$
(135)

Recall that the law of large numbers implies for the i.i.d. random variables

$$\begin{aligned} \ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\left( X_{1}\right) ,\ldots ,\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\left( X_{n}\right) \end{aligned}$$
(136)

that

$$\begin{aligned} \lim _{t_{k}\rightarrow \infty }\frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}=E_{\varphi _{\theta ^{*}}}\left[ \ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\right] \text { a.s. }P_{\theta ^{*}} \end{aligned}$$
(137)

for any \(\theta \). By (137) and using (125) and (135), we obtain that (124) is (a.s. \(P_{\theta ^{*}}\)) equivalent to

$$\begin{aligned}&\sum _{\theta \in \Theta }E_{\varphi _{\theta ^{*}}}\left[ \ln \frac{ \mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\right] \cdot \mu _{0}^{\prime }\left( \theta \right) \ge \gamma \cdot \max _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }E_{\varphi _{\theta ^{*}}}\left[ \ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\right] \cdot \mu _{0}\left( \theta \right) \Leftrightarrow \end{aligned}$$
(138)
$$\begin{aligned}&\quad \sum _{\theta \in \Theta }E_{\varphi _{\theta ^{*}}}\left[ \ln \frac{ \mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\right] \cdot \mu _{0}^{\prime }\left( \theta \right) \ge \gamma \cdot \left( -\min _{\mu _{0}\in \mathcal {M} _{0}}\sum _{\theta \in \Theta }-E_{\varphi _{\theta ^{*}}}\left[ \ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\right] \cdot \mu _{0}\left( \theta \right) \right) \qquad \qquad \end{aligned}$$
(139)
$$\begin{aligned}&\quad \Leftrightarrow \nonumber \\&\quad -\sum _{\theta \in \Theta }E_{\varphi _{\theta ^{*}}}\left[ \ln \frac{ \mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\right] \cdot \mu _{0}^{\prime }\left( \theta \right) +\gamma \cdot E_{\varphi _{\theta ^{*}}}\left[ \ln \frac{\mathrm{{d}}\varphi _{\theta ^{*}}}{\mathrm{{d}}m}\right] \end{aligned}$$
(140)
$$\begin{aligned}&\quad \le \gamma \cdot \min _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }-E_{\varphi _{\theta ^{*}}}\left[ \ln \frac{\mathrm{{d}}\varphi _{\theta }}{ \mathrm{{d}}m}\right] \cdot \mu _{0}\left( \theta \right) +\gamma \cdot E_{\varphi _{\theta ^{*}}}\left[ \ln \frac{\mathrm{{d}}\varphi _{\theta ^{*}}}{\mathrm{{d}}m}\right] \nonumber \\&\quad \Leftrightarrow \nonumber \\&\quad \gamma \cdot \sum _{\theta \in \Theta }D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })d\mu _{0}^{\prime }\left( \theta \right) -\left( 1-\gamma \right) \sum _{\theta \in \Theta }E_{\varphi _{\theta ^{*}}} \left[ \ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\right] \cdot \mu _{0}^{\prime }\left( \theta \right) \end{aligned}$$
(141)
$$\begin{aligned}&\quad \le \gamma \cdot \min _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })d\mu _{0}\left( \theta \right) \nonumber \\&\quad \Leftrightarrow \nonumber \\&\quad \sum _{\theta \in \Theta }D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })\cdot \mu _{0}^{\prime }\left( \theta \right) \\&\quad \le \min _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })d\mu _{0}\left( \theta \right) +\frac{\left( 1-\gamma \right) }{\gamma }\sum _{\theta \in \Theta }E_{\varphi _{\theta ^{*}}}\left[ \ln \frac{\mathrm{{d}}\varphi _{\theta }}{ \mathrm{{d}}m}\right] \cdot \mu _{0}^{\prime }\left( \theta \right) \text {.} \nonumber \end{aligned}$$
(142)

This proves that any

$$\begin{aligned} \mu _{0}^{\prime }\notin \arg \min _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })\cdot \mu _{0}\left( \theta \right) \end{aligned}$$
(143)

is in \(\mathcal {M}_{0,\gamma }^{\infty }\) if, and only if, \(\mu _{0}^{\prime }\) is in (88).

  • Step 4 Combining the last argument with Step 1 shows that

$$\begin{aligned} \arg \min _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })\cdot \mu _{0}\left( \theta \right) =\mathcal {M}_{0,\gamma }^{\infty } \end{aligned}$$
(144)

whenever (88) is empty.

Collecting results proves the theorem. \(\square \)
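Numerically, the limiting criterion established here amounts to a simple comparison of expected log-likelihoods. The following Python sketch applies the rule to a hypothetical finite set \(\mathcal {M}_{0}\) of priors on \(\Theta =\left\{ 20,30,40\right\} \) with \(\theta ^{*}=30\) and m the counting measure; all of these specifics are illustrative assumptions, not the paper’s example.

```python
import math

def phi(theta):
    # Outcome distribution (red, yellow, black) of the 90-ball urn
    return (30 / 90, theta / 90, (60 - theta) / 90)

def expected_loglik(theta, theta_star):
    # E_{phi_theta*}[ln d phi_theta / dm] with m the counting measure,
    # i.e. the negative cross-entropy of phi_theta under the truth
    return sum(p * math.log(q) for p, q in zip(phi(theta_star), phi(theta)))

def surviving_priors(priors, Theta, theta_star, gamma):
    # A prior survives asymptotically iff its expected log-likelihood is
    # at least gamma times the maximum over all priors in M_0
    def value(mu0):
        return sum(expected_loglik(th, theta_star) * mu0[th] for th in Theta)
    best = max(value(mu0) for mu0 in priors)
    return [mu0 for mu0 in priors if value(mu0) >= gamma * best]

Theta = (20, 30, 40)
theta_star = 30
# Hypothetical M_0: Dirac priors on each index plus a fifty-fifty mixture
M0 = [{20: 1, 30: 0, 40: 0},
      {20: 0, 30: 1, 40: 0},
      {20: 0, 30: 0, 40: 1},
      {20: 0.5, 30: 0, 40: 0.5}]

print(len(surviving_priors(M0, Theta, theta_star, 1.0)))  # 1: only the truth
print(len(surviving_priors(M0, Theta, theta_star, 1.1)))  # 4: all survive
```

Since the expected log-likelihoods are negative, a stubbornness parameter \(\gamma >1\) weakens the cutoff, so that larger \(\gamma \) retains more priors in the limit; this is how nonvanishing ambiguity arises in the model.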

Proof of Proposition 4

  • Step 1 Consider the a priori decision situation. Analogously to the argument for Proposition 1, we have that

    $$\begin{aligned} \mathrm{{MEU}}\left( f_{E}h,\mathcal {M}_{0}\circ \Phi \right) =\frac{1}{3}\text { and } \mathrm{{MEU}}\left( g_{E}h^{\prime },\mathcal {M}_{0}\circ \Phi \right) =\frac{2}{3} \text {.} \end{aligned}$$
    (145)

    Further, note that

    $$\begin{aligned} \mathrm{{MEU}}\left( g_{E}h,\mathcal {M}_{0}\circ \Phi \right)\le & {} \mathrm{{MEU}}\left( g_{E}h,\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \mu _{0}^{\prime }\left( \theta \right) \right) \end{aligned}$$
    (146)
    $$\begin{aligned}\le & {} EU\left( g_{E}h,\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \delta _{29}\right) \end{aligned}$$
    (147)
    $$\begin{aligned}= & {} EU\left( g_{E}h,\varphi _{29}\right) \end{aligned}$$
    (148)
    $$\begin{aligned}= & {} \frac{29}{90}<\frac{1}{3} \end{aligned}$$
    (149)

    as well as

    $$\begin{aligned} \mathrm{{MEU}}\left( f_{E}h^{\prime },\mathcal {M}_{0}\circ \Phi \right)\le & {} \mathrm{{MEU}}\left( f_{E}h^{\prime },\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \mu _{0}^{\prime \prime }\left( \theta \right) \right) \end{aligned}$$
    (150)
    $$\begin{aligned}\le & {} \mathrm{{EU}}\left( f_{E}h^{\prime },\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \delta _{31}\right) \end{aligned}$$
    (151)
    $$\begin{aligned}= & {} \mathrm{{EU}}\left( f_{E}h^{\prime },\varphi _{31}\right) \end{aligned}$$
    (152)
    $$\begin{aligned}= & {} \frac{1}{3}+\frac{29}{90}<\frac{2}{3}\text {.} \end{aligned}$$
    (153)

    Consequently, the inequalities (22)–(23) hold.

  • Step 2 Consider the a posteriori decision situation. Note that

    $$\begin{aligned} \mathrm{{MEU}}\left( f_{E}h,\Pi _{\gamma }^{\infty }\circ \Phi \right) =\frac{1}{3} \text { and }\mathrm{{MEU}}\left( g_{E}h^{\prime },\Pi _{\gamma }^{\infty }\circ \Phi \right) =\frac{2}{3}\text {.} \end{aligned}$$
    (154)

    The specifications of \(\mu _{0}^{\prime }\) and \(\mu _{0}^{\prime \prime }\) imply, by Corollary 1, that, for some sufficiently large \(\gamma \), there exist some

    $$\begin{aligned} \delta _{\theta ^{\prime }},\delta _{\theta ^{\prime \prime }}\in \Pi _{\gamma }^{\infty } \end{aligned}$$
    (155)

    such that \(\theta ^{\prime }<30<\theta ^{\prime \prime }\). Consequently,

    $$\begin{aligned} \mathrm{{MEU}}\left( g_{E}h,\Pi _{\gamma }^{\infty }\circ \Phi \right)\le & {} EU\left( g_{E}h,\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \delta _{\theta ^{\prime }}\right) \end{aligned}$$
    (156)
    $$\begin{aligned}\le & {} EU\left( g_{E}h,\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \delta _{29}\right) \end{aligned}$$
    (157)
    $$\begin{aligned}< & {} \frac{1}{3} \end{aligned}$$
    (158)

    as well as

    $$\begin{aligned} \mathrm{{MEU}}\left( f_{E}h^{\prime },\Pi _{\gamma }^{\infty }\circ \Phi \right)\le & {} \mathrm{{EU}}\left( f_{E}h^{\prime },\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \delta _{\theta ^{\prime \prime }}\right) \end{aligned}$$
    (159)
    $$\begin{aligned}\le & {} \mathrm{{EU}}\left( f_{E}h^{\prime },\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \delta _{31}\right) \end{aligned}$$
    (160)
    $$\begin{aligned}< & {} \frac{2}{3}\text {,} \end{aligned}$$
    (161)

    which proves the inequalities (24)–(25).

\(\square \)

Proof of Proposition 5

By the proof of Theorem 2 (cf. inequality (138) as well as Step 2), \(\mu _{0}^{\prime }\in \mathcal {M}_{0,\gamma }^{\infty }\) if, and only if,

$$\begin{aligned} \sum _{\theta ^{\prime }\in \Theta }E_{\varphi _{\theta ^{*}}}\left[ \ln \frac{\mathrm{{d}}\varphi _{\theta ^{\prime }}}{\mathrm{{d}}m}\right] \cdot \mu _{0}^{\prime }\left( \theta ^{\prime }\right) \ge \gamma \cdot \max _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta ^{\prime }\in \Theta }E_{\varphi _{\theta ^{*}}}\left[ \ln \frac{\mathrm{{d}}\varphi _{\theta ^{\prime }}}{\mathrm{{d}}m}\right] \cdot \mu _{0}\left( \theta ^{\prime }\right) \text {.} \end{aligned}$$
(162)

By Assumption 2, we have, for any \(\theta \in \Theta \),

$$\begin{aligned} \delta _{\theta }\in \Pi _{\gamma }^{\infty } \end{aligned}$$
(163)

if, and only if,

(164)
(165)
(166)
$$\begin{aligned} \frac{\mathrm{{d}}\varphi _{\theta ^{*}}\left( \omega _{2}\right) \cdot \ln \mathrm{{d}}\varphi _{\theta }\left( \omega _{2}\right) +\left( \frac{2}{3}-\mathrm{{d}}\varphi _{\theta ^{*}}\left( \omega _{2}\right) \right) \cdot \ln \left( \frac{2 }{3}-\mathrm{{d}}\varphi _{\theta }\left( \omega _{2}\right) \right) }{\mathrm{{d}}\varphi _{\theta ^{*}}\left( \omega _{2}\right) \cdot \ln \mathrm{{d}}\varphi _{\theta ^{*}}\left( \omega _{2}\right) +\left( \frac{2}{3}-\mathrm{{d}}\varphi _{\theta ^{*}}\left( \omega _{2}\right) \right) \cdot \ln \left( \frac{2}{3} -\mathrm{{d}}\varphi _{\theta ^{*}}\left( \omega _{2}\right) \right) }\\\le & {} \gamma \nonumber \\\Leftrightarrow & {} \nonumber \end{aligned}$$
(167)
$$\begin{aligned} \frac{\frac{1}{3}\cdot \ln \frac{\theta }{90}+\frac{1}{3}\cdot \ln \left( \frac{60-\theta }{90}\right) }{\frac{2}{3}\cdot \ln \frac{1}{3}}\le & {} \gamma \text {,}\nonumber \\ \end{aligned}$$
(168)

which proves the proposition. \(\square \)
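Condition (168) can be evaluated directly. The following sketch computes, for each \(\theta \), the left-hand side of (168), i.e., the smallest stubbornness level at which \(\delta _{\theta }\) survives, and lists the surviving degenerate priors for an arbitrarily chosen illustrative level \(\gamma =1.05\).

```python
import math

def gamma_threshold(theta):
    # Left-hand side of (168): the smallest stubbornness gamma for
    # which delta_theta belongs to Pi_gamma^infty
    num = (1 / 3) * math.log(theta / 90) + (1 / 3) * math.log((60 - theta) / 90)
    den = (2 / 3) * math.log(1 / 3)
    return num / den

gamma = 1.05  # hypothetical stubbornness parameter, chosen for illustration
survivors = [theta for theta in range(1, 60) if gamma_threshold(theta) <= gamma]
print(survivors[0], survivors[-1])  # 21 39
```

The threshold equals one exactly at \(\theta =30\) and increases symmetrically as \(\theta \) moves away from 30, so raising \(\gamma \) enlarges the surviving interval of degenerate priors around the truth.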


Cite this article

Zimper, A., Ma, W. Bayesian learning with multiple priors and nonvanishing ambiguity. Econ Theory 64, 409–447 (2017). https://doi.org/10.1007/s00199-016-1007-y
