Abstract
The existing models of Bayesian learning with multiple priors by Marinacci (Stat Pap 43:145–151, 2002) and by Epstein and Schneider (Rev Econ Stud 74:1275–1303, 2007) formalize the intuitive notion that ambiguity should vanish through statistical learning in a one-urn environment. Moreover, the multiple priors decision maker of these models will eventually learn the “truth.” To accommodate nonvanishing violations of Savage’s (The Foundations of Statistics, Wiley, New York, 1954) sure-thing principle, as reported in Nicholls et al. (J Risk Uncertain 50:97–115, 2015), we construct and analyze a model of Bayesian learning with multiple priors for which ambiguity does not necessarily vanish in a one-urn environment. Our decision maker forms posteriors only from priors that survive a prior selection rule, which discriminates, with probability one, against priors whose expected Kullback–Leibler divergence from the “truth” is too far from the minimal expected Kullback–Leibler divergence over all priors. The “stubbornness” parameter of our prior selection rule thereby governs how much ambiguity remains in the limit of our learning model.
Notes
An alternative (and under specific circumstances formally equivalent) class of models that accommodates ambiguity attitudes is that of Choquet decision making/Choquet expected utility (Schmeidler 1989; Gilboa 1987). These Choquet models express ambiguity attitudes through nonadditive probability measures.
Of course, the utility function u is only unique up to some positive affine transformation.
Gilboa and Schmeidler (1989) axiomatize MEU within an Anscombe and Aumann (1963) framework where the set of consequences Z contains all lotteries over some non-degenerate set of deterministic prizes. Under this Gilboa and Schmeidler (1989) axiomatization, \(\mathcal {P}\) is uniquely pinned down as a non-empty, closed and convex set of finitely additive probability measures. We set aside this specific axiomatic foundation here and also allow for, e.g., non-convex \(\mathcal {P}\).
For a more realistic generalization of MEU, see the \(\alpha \)-MEU concept of Ghirardato et al. (2004).
We exclude \(\theta =0\) and \(\theta =60\) for convenience since we do not want to take a stand on Bayesian updating in light of events that the decision maker perceives as impossible. For example, we want to avoid the case that a prior attaches probability one to zero yellow balls in the urn, but the decision maker observes a yellow ball drawn from the urn.
In the literature, \(\left( \Theta ,\mathcal {F}\right) \) is also called the (possibly multiple) parameter space.
As a generalization of the single-likelihood environment, ES-2007 consider a “multiple-likelihoods” environment where an index \(\theta \) in \(\Theta \) corresponds to a set of \(\theta \)-conditional probability measures. Although the formal results of this paper are derived exclusively for the single-likelihood environment, see Sect. 6 for an outlook on future research.
Since there is a one-to-one correspondence between all probability measures on \(\left( \Theta ,\mathcal {F}\right) \) and the points in \(\triangle ^{n}\), we slightly abuse notation and write \(\mu _{0}\equiv \left( \mu _{0}^{1},\ldots ,\mu _{0}^{n}\right) \in \triangle ^{n}\) for the additive probability measure \(\mu _{0}:\mathcal {F}\rightarrow \left[ 0,1\right] \) such that, for all non-empty \(\Theta ^{\prime }\in \mathcal {F}\),
$$\begin{aligned} \mu _{0}\left( \Theta ^{\prime }\right) =\Sigma _{\left\{ \theta _{j}\in \Theta ^{\prime }\right\} }\mu _{0}^{j}\text {.} \end{aligned}$$

Whenever we henceforth speak of emerging posteriors or emerging priors, the qualification “with probability one” is implicitly included.
An accessible proof can be found in Section 1.3.3 of Ghosh and Ramamoorthi (2003).
The KL-divergence is asymmetric and does not satisfy the triangle inequality.
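As a quick numerical illustration of this asymmetry (the two binary distributions below are hypothetical and only for illustration), one can compute the divergence in both directions:

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D_KL(p||q) for finite pmfs.

    Requires q_i > 0 wherever p_i > 0; terms with p_i = 0 contribute zero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]

# D_KL is asymmetric: the two directions generally differ.
print(kl(p, q))  # about 0.511
print(kl(q, p))  # about 0.368
assert abs(kl(p, q) - kl(q, p)) > 1e-6
```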
By Bayes’ rule, we have that
$$\begin{aligned} \frac{\mathrm{{d}}\varphi _{\theta ^{*}}\left( x\right) }{\mathrm{{d}}\varphi _{\theta }\left( x\right) }=\frac{\pi _{\mu _{0}}\left( \theta ^{*}\mid x\right) /\mu _{0}\left( \theta ^{*}\right) }{\pi _{\mu _{0}}\left( \theta \mid x\right) /\mu _{0}\left( \theta \right) }\text {.} \end{aligned}$$

Due to the degenerate priors, this is here equivalent to the maximum expected loglikelihood rule.
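The identity can be verified numerically for a toy finite setup (the two pmfs and the prior below are hypothetical): the likelihood ratio at an observed outcome equals the ratio of the posterior-to-prior ratios.

```python
# Hypothetical finite setup: two indices "t*" and "t", three outcomes.
phi = {
    "t*": [0.5, 0.3, 0.2],  # phi_{theta*}
    "t":  [0.2, 0.3, 0.5],  # phi_{theta}
}
mu0 = {"t*": 0.4, "t": 0.6}  # prior mu_0 over the index set

x = 0  # observed outcome
# Bayes' rule: posterior(theta | x) proportional to phi_theta(x) * mu0(theta).
joint = {th: phi[th][x] * mu0[th] for th in phi}
Z = sum(joint.values())
post = {th: joint[th] / Z for th in phi}

lhs = phi["t*"][x] / phi["t"][x]                          # likelihood ratio at x
rhs = (post["t*"] / mu0["t*"]) / (post["t"] / mu0["t"])   # posterior/prior ratios
assert abs(lhs - rhs) < 1e-12
print(lhs)  # 2.5
```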
ES-2007 restrict attention to a finite state space \(\Omega \). Because ES-2007 allow for non-finite index sets \(\Theta \), they impose weak compactness of \(\mathcal {M}_{0}\) and additionally require that \(\mu _{0}\left( \theta ^{*}\right) \) be uniformly bounded away from zero whenever \( \theta ^{*}\) is in the support of \(\mu _{0}\). For our finite index sets, \(\mathcal {M}_{0}\) is weakly compact if, and only if, it is closed, and the bounded-away-from-zero condition is automatically satisfied. For further details about their regularity assumptions, see Theorem 1 (ES-2007, p. 1288).
A negative expected cross-entropy is impossible in the finite case but not in the continuous case.
Note that this assumption holds under Gilboa and Schmeidler’s (1989, p. 142) “extreme case.”
Note that, for all \(\theta \) and \(\theta ^{\prime }=60-\theta \),
$$\begin{aligned} D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })=D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta ^{\prime }}) \end{aligned}$$

so that there might be priors in \(\mathcal {M}_{0}\), e.g.,
$$\begin{aligned} 0.5\delta _{\theta }+0.5\delta _{\theta ^{\prime }}\text {,} \end{aligned}$$

with two different KL-divergence minimizers in their support. Emerging posteriors formed from such priors are not necessarily Dirac measures.
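This symmetry is easy to check numerically. The sketch below assumes the standard three-color-urn parametrization with 30 red balls and \(\theta \) yellow balls out of 90, and \(\theta ^{*}=30\); since \(\varphi _{30}\) is symmetric in yellow and black, swapping \(\theta \) for \(60-\theta \) leaves the divergence unchanged.

```python
import math

def kl(p, q):
    # D_KL(p||q) for finite pmfs with q_i > 0 wherever p_i > 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def phi(theta):
    # Draw distribution over (red, yellow, black): 30 red, theta yellow,
    # 60 - theta black out of 90 balls.
    return (30 / 90, theta / 90, (60 - theta) / 90)

theta_star = 30  # phi_30 is uniform over the three colors

# theta = 0 and theta = 60 are excluded, as in the paper.
for theta in range(1, 60):
    d1 = kl(phi(theta_star), phi(theta))
    d2 = kl(phi(theta_star), phi(60 - theta))
    assert abs(d1 - d2) < 1e-12  # mirrored compositions are equidistant from truth
```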
Also see p. 570 in Aliprantis and Border (2006).
References
Aliprantis, C.D., Border, K.C.: Infinite Dimensional Analysis. Springer, Berlin (2006)
Anscombe, F.J., Aumann, R.J.: A definition of subjective probability. Ann. Math. Stat. 34, 199–205 (1963)
Berge, C.: Topological Spaces. Dover Publications, New York (1997)
Berk, R.H.: Limiting behavior of posterior distributions when the model is incorrect. Ann. Math. Stat. 37, 51–58 (1966)
Billingsley, P.: Probability and Measure. Wiley, New York (1995)
Chung, K.L., Fuchs, W.H.J.: On the distribution of values of sums of random variables. Mem. Am. Math. Soc. 6, 1–12 (1951). Reprinted in: AitSahlia, F., Hsu, E., Williams, R. (eds.) Selected Works of Kai Lai Chung, pp. 157–168. World Scientific, New Jersey (2008)
Diaconis, P., Freedman, D.: On the consistency of Bayes estimates. Ann. Stat. 14, 1–26 (1986)
Doob, J.L.: Application of the theory of martingales. In: Le Calcul des Probabilites et ses Applications, Colloques Internationaux du Centre National de la Recherche Scientifique, vol. 13, pp. 23–27. Paris: CNRS (1949)
Dow, J., Madrigal, V., Werlang, S.R.: Preferences, Common Knowledge, and Speculative Trade. Mimeo, New York (1990)
Ellsberg, D.: Risk, ambiguity and the Savage axioms. Q. J. Econ. 75, 643–669 (1961)
Epstein, L.G., Schneider, M.: Learning under ambiguity. Rev. Econ. Stud. 74, 1275–1303 (2007)
Ghirardato, P., Maccheroni, F., Marinacci, M.: Differentiating ambiguity and ambiguity attitude. J. Econ. Theory 118, 133–173 (2004)
Gilboa, I.: Expected utility with purely subjective non-additive probabilities. J. Math. Econ. 16, 65–88 (1987)
Gilboa, I., Schmeidler, D.: Maxmin expected utility with non-unique priors. J. Math. Econ. 18, 141–153 (1989)
Ghosh, J.K., Ramamoorthi, R.V.: Bayesian Nonparametrics. Springer, Berlin (2003)
Halevy, Y.: The possibility of speculative trade between dynamically consistent agents. Games Econ. Behav. 46, 189–198 (2004)
Harrison, M., Kreps, D.: Speculative investor behavior in a stock market with heterogeneous expectations. Q. J. Econ. 92, 323–336 (1978)
Jaffray, J.-Y.: Dynamic decision making with belief functions. In: Yager, R.R., Fedrizzi, M., Kacprzyk, J. (eds.) Advances in the Dempster-Shafer Theory of Evidence, pp. 331–352. Wiley, New York (1994)
Kleijn, B.J.K., van der Vaart, A.W.: Misspecification in infinite-dimensional Bayesian statistics. Ann. Stat. 34, 837–877 (2006)
Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22, 79–86 (1951)
Lijoi, A., Pruenster, I., Walker, S.G.: Extending Doob’s consistency theorem to nonparametric densities. Bernoulli 10, 651–663 (2004)
Marinacci, M.: Learning from ambiguous urns. Stat. Pap. 43, 145–151 (2002)
Mehra, R., Prescott, E.C.: The equity premium: a puzzle. J. Monet. Econ. 15, 145–161 (1985)
Mehra, R., Prescott, E.C.: The equity premium in retrospect. In: Constantinides, G.M., Harris, M., Stulz, R.M. (eds.) Handbook of the Economics of Finance, pp. 808–887. Elsevier, Amsterdam (2003)
Muth, J.F.: Rational expectations and the theory of price movements. Econometrica 29, 315–335 (1961)
Nicholls, N., Romm, A.T., Zimper, A.: The impact of statistical learning on violations of the sure-thing principle. J. Risk Uncertain. 50, 97–115 (2015)
Savage, L.J.: The Foundations of Statistics. Wiley, New York (1954)
Schmeidler, D.: Subjective probability and expected utility without additivity. Econometrica 57, 571–587 (1989)
Wakker, P.P.: Prospect Theory for Risk and Ambiguity. Cambridge University Press, Cambridge (2010)
Werner, J.: Speculative trade under ambiguity. Working paper (2014)
Wu, G., Gonzalez, R.: Nonlinear decision weights in choice under uncertainty. Manag. Sci. 45, 74–85 (1999)
Zimper, A.: Half empty, half full and why we can agree to disagree forever. J. Econ. Behav. Organ. 71, 283–299 (2009)
Additional information
We thank Larry Epstein, Massimo Marinacci, Daniele Pennesi, an anonymous referee, and especially Jan Werner for helpful comments and suggestions. Financial support from ERSA (Economic Research Southern Africa) is gratefully acknowledged.
Appendix: formal proofs
Existence of an emerging posterior To see that the limit (43) exists, rewrite, for any \(\Theta ^{\prime }\), \(\pi _{\mu _{0}}\left( \Theta ^{\prime }\mid X_{1},\ldots ,X_{t}\right) \) as the conditional expectation of the indicator function of \(\Theta ^{\prime }\) with respect to the induced probability measure P on the joint index and state space \(\left( \Theta \times \Omega ^{\infty },\mathcal {F\otimes }\Sigma ^{\infty }\right) \). To be precise, in the notation of our setup, it holds that, for all \( \theta \in \Theta \) and \(B\in \Sigma ^{\infty }\),
as well as, for all \(\Theta ^{\prime }\in \mathcal {F}\),
By Theorem 35.6 in Billingsley (1995) (which is an implication of the martingale convergence theorem), we obtain
where convergence holds with P-probability one. \(\square \)
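The convergence of posteriors can be illustrated with a minimal simulation sketch (the Bernoulli parametrization and all numbers below are hypothetical, not the paper's setup): starting from a uniform prior over a finite index set that contains the truth, recursive Bayesian updating drives the posterior toward the Dirac measure on \(\theta ^{*}\).

```python
import math
import random

random.seed(0)

# Hypothetical finite single-likelihood environment: each theta indexes
# a Bernoulli success probability; the truth theta* = 0.5 is in the set.
thetas = [0.3, 0.5, 0.7]
theta_star = 0.5
mu = {th: 1 / 3 for th in thetas}  # uniform prior mu_0 over Theta

for t in range(5000):
    x = 1 if random.random() < theta_star else 0
    # One Bayes step: new weight proportional to likelihood times old weight.
    lik = {th: (th if x == 1 else 1 - th) * mu[th] for th in thetas}
    Z = sum(lik.values())
    mu = {th: lik[th] / Z for th in lik}

# The emerging posterior concentrates on theta*.
assert mu[0.5] > 0.99
```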
Proof of Proposition 2
For notational convenience, we write
Step 1 Suppose that there exists some \(\pi _{\mu _{0}^{*}}^{\infty }\in \Pi _{R}^{\infty }\) but
$$\begin{aligned} \pi _{\mu _{0}^{*}}^{\infty }\notin \overline{\lim }\bigcup \limits _{\mu _{0}\in \mathcal {M}_{0,R}^{t}}\left\{ \pi _{\mu _{0}}^{t}\right\} \text {, a.s. }P_{\theta ^{*}}\text {.} \end{aligned}$$(113)

Since \(\mu _{0}^{*}\in \mathcal {M}_{0,R}^{\infty }\) and \(\mathcal {M}_{0}\) is compact, there must be (a.s. \(P_{\theta ^{*}}\)) some subsequence \(\left\{ \mu _{0}^{t_{k}}\right\} _{k\in \mathbb {N}}\) with \(\mu _{0}^{t_{k}}\in \mathcal {M}_{0,R}^{t_{k}}\) for all \(k=1,2,\ldots \), which converges to \(\mu _{0}^{*}\). But then \(\pi _{\mu _{0}^{t_{k}}}^{t_{k}}\in \Pi _{R}^{t_{k}}\) for all \(k=1,2,\ldots \), implying
$$\begin{aligned} \pi _{\mu _{0}^{*}}^{\infty }\in \overline{\lim }\bigcup \limits _{\mu _{0}\in \mathcal {M}_{0,R}^{t}}\left\{ \pi _{\mu _{0}}^{t}\right\} \text {, a.s. }P_{\theta ^{*}}\text {,} \end{aligned}$$(114)

a contradiction to (113).
Step 2 Next suppose that there exists some
$$\begin{aligned} \pi ^{\infty }\in \overline{\lim }\bigcup \limits _{\mu _{0}\in \mathcal {M}_{0,R}^{t}}\left\{ \pi _{\mu _{0}}^{t}\right\} \text {, a.s. }P_{\theta ^{*}} \end{aligned}$$(115)

but
$$\begin{aligned} \pi ^{\infty }\notin \Pi _{R}^{\infty }. \end{aligned}$$(116)

By (115), there must be (a.s. \(P_{\theta ^{*}}\)) some subsequence \(\left\{ \pi _{\mu _{0}^{t_{k}}}^{t_{k}}\right\} _{k\in \mathbb {N}}\) with \(\pi _{\mu _{0}^{t_{k}}}^{t_{k}}\in \Pi _{R}^{t_{k}}\) for all \(k=1,2,\ldots \), which converges to \(\pi ^{\infty }\). By compactness of \(\mathcal {M}_{0}\), we can extract a subsequence \(\left\{ \mu _{0}^{t_{k^{\prime }}}\right\} _{k^{\prime }\in \mathbb {N}}\) from \(\left\{ \mu _{0}^{t_{k}}\right\} _{k\in \mathbb {N}}\) which converges to some \(\mu _{0}^{*}\in \mathcal {M}_{0}\). As this \(\mu _{0}^{*}\) is a cluster point of \(\left\{ \mathcal {M}_{0,R}^{t}\right\} _{t\in \mathbb {N}}\), we have \(\mu _{0}^{*}\in \mathcal {M}_{0,R}^{\infty }\), so that \(\pi _{\mu _{0}^{*}}^{\infty }\in \Pi _{R}^{\infty }\). Since \(\left\{ \pi _{\mu _{0}^{t_{k^{\prime }}}}^{t_{k^{\prime }}}\right\} _{k^{\prime }\in \mathbb {N}}\) converges to \(\pi _{\mu _{0}^{*}}^{\infty }\) as well as to \(\pi ^{\infty }\), we must have \(\pi _{\mu _{0}^{*}}^{\infty }=\pi ^{\infty }\), a contradiction to (116).
Collecting Steps 1 and 2 proves the proposition. \(\square \)
Proof of Theorem 2
Step 1 By Definition 7, the expected loglikelihood maximizing prior(s) will be in \(\mathcal {M}_{0,\gamma }^{t}\) for all t, i.e.,
$$\begin{aligned} \arg \max _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }\ln \prod _{i=1}^{t}\frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}\left( \theta \right) \subseteq \mathcal {M}_{0,\gamma }^{t}\text {.} \end{aligned}$$(117)

By a formal argument similar to that of Step 3 below, it can be shown that \(\mathcal {M}_{0,\gamma }^{\infty }\) is (a.s. \(P_{\theta ^{*}}\)) never empty since
$$\begin{aligned} \arg \min _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })\cdot \mu _{0}\left( \theta \right) \subseteq \mathcal {M}_{0,\gamma }^{\infty }\text { a.s. } P_{\theta ^{*}}\text {,} \end{aligned}$$(118)

i.e., the expected Kullback–Leibler divergence minimizers belong asymptotically to the expected \(\gamma \)-loglikelihood maximizers for any value of \(\gamma \).
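The selection rule behind (117) can be sketched for a discrete toy case (the Bernoulli likelihoods, the candidate prior set, the sample size, and the value \(\gamma =1.05\) are all hypothetical): a prior survives at t if its expected cumulative loglikelihood is at least \(\gamma \) times the maximal one. Since loglikelihoods are negative, a larger \(\gamma \) lowers the threshold and so keeps more priors, which is the "stubbornness" reading of the parameter.

```python
import math
import random

random.seed(1)

thetas = [0.3, 0.5, 0.7]
theta_star = 0.5
priors = [  # candidate set M_0: weights over Theta
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
    (0.5, 0.0, 0.5),
]
gamma = 1.05  # gamma = 1 would keep only the maximizers

xs = [1 if random.random() < theta_star else 0 for _ in range(2000)]
# Cumulative loglikelihood of each theta under the sample (pmf relative
# to counting measure, playing the role of d(phi_theta)/dm).
cum_ll = [sum(math.log(th if x else 1 - th) for x in xs) for th in thetas]

# Expected cumulative loglikelihood of each prior; all values are negative.
score = [sum(w * ll for w, ll in zip(mu0, cum_ll)) for mu0 in priors]
best = max(score)
survivors = [mu0 for mu0, s in zip(priors, score) if s >= gamma * best]
print(survivors)  # here only the prior concentrated on the truth survives
```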
Step 2 Observe that any
$$\begin{aligned} \mu _{0}^{\prime }\in \arg \min _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })\cdot \mu _{0}\left( \theta \right) \end{aligned}$$(119)
Step 3 Suppose now that
$$\begin{aligned} \mu _{0}^{\prime }\in \mathcal {M}_{0,\gamma }^{\infty } \end{aligned}$$(120)

but
$$\begin{aligned} \mu _{0}^{\prime }\notin \arg \min _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })\cdot \mu _{0}\left( \theta \right) \text {.} \end{aligned}$$(121)

This is only possible if there exists some subsequence \(\left\{ t_{k}\right\} _{k\in \mathbb {N}}\subseteq \left\{ t\right\} _{t\in \mathbb {N}}\) such that
$$\begin{aligned} \sum _{\theta \in \Theta }\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}^{\prime }\left( \theta \right)\ge & {} \gamma \cdot \max _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta }\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}\left( \theta \right) \Leftrightarrow \end{aligned}$$(122)$$\begin{aligned} \sum _{\theta \in \Theta }\frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}^{\prime }\left( \theta \right)\ge & {} \gamma \cdot \max _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta } \frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m} \cdot \mu _{0}\left( \theta \right) \Rightarrow \nonumber \\ \end{aligned}$$(123)$$\begin{aligned} \lim _{t_{k}\rightarrow \infty }\sum _{\theta \in \Theta }\frac{1}{t_{k}} \sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m}\cdot \mu _{0}^{\prime }\left( \theta \right)\ge & {} \lim _{t_{k}\rightarrow \infty }\gamma \cdot \max _{\mu _{0}\in \mathcal {M}_{0}}\sum _{\theta \in \Theta } \frac{1}{t_{k}}\sum \limits _{i=1}^{t_{k}}\ln \frac{\mathrm{{d}}\varphi _{\theta }}{\mathrm{{d}}m} \cdot \mu _{0}\left( \theta \right) \text {.}\nonumber \\ \end{aligned}$$(124)
Focus on the l.h.s. term of (124). Because \(\Theta \) is finite, we can switch the sum and the limit to obtain
Turn now to the r.h.s. term of (124). We are going to argue, via Berge’s (1997) maximum theorem, that we can switch the max and the limit. To this purpose, define the following inner product
where
and
Next define the value function of (126) as
Since \(\mathcal {M}_{0}\) is, as a closed subset of \(\triangle ^{n}\), compact and f is continuous, we know from Berge’s (1997, p. 116) maximum theorem that the value function \(M\left( y_{t_{k}}\right) \) is continuous. Consequently, if \(\lim _{t_{k}\rightarrow \infty }y_{t_{k}}\) exists, then
In other words, if
exists (a.s. \(P_{\theta ^{*}}\)) for all \(\theta \), which we will show in a moment, then
Recall that the law of large numbers implies for the i.i.d.
that
for any \(\theta \). By (137) and using (125) and (135), we obtain that (124) is (a.s. \(P_{\theta ^{*}}\)) equivalent to
This proves that
is in \(\mathcal {M}_{0,\gamma }^{\infty }\) if, and only if, \(\mu _{0}^{\prime }\) is in (88).
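The law-of-large-numbers step used above can be checked numerically: the sample average loglikelihood of a misspecified index converges to \(-D_\mathrm{{KL}}(\varphi _{\theta ^{*}}||\varphi _{\theta })\) minus the entropy of the truth. The Bernoulli parametrization below is a hypothetical stand-in for \(\varphi _{\theta ^{*}}\) and \(\varphi _{\theta }\).

```python
import math
import random

random.seed(2)

# Truth and a misspecified index, as hypothetical Bernoulli pmfs.
p_star, p = 0.5, 0.3
t = 200_000
xs = [1 if random.random() < p_star else 0 for _ in range(t)]

# Sample average loglikelihood (1/t) sum_i ln phi_theta(X_i).
avg_ll = sum(math.log(p if x else 1 - p) for x in xs) / t

# LLN limit: E_{phi*}[ln phi_theta] = -D_KL(phi*||phi_theta) - H(phi*).
d_kl = p_star * math.log(p_star / p) + (1 - p_star) * math.log((1 - p_star) / (1 - p))
entropy = -(p_star * math.log(p_star) + (1 - p_star) * math.log(1 - p_star))
assert abs(avg_ll - (-d_kl - entropy)) < 0.01
```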
Step 4 Combining the last argument with Step 1 shows that
whenever (88) is empty.
Collecting results proves the theorem. \(\square \)
Proof of Proposition 4
Step 1 Consider the a priori decision situation. Analogously to the argument for Proposition 1, we have that
$$\begin{aligned} \mathrm{{MEU}}\left( f_{E}h,\mathcal {M}_{0}\circ \Phi \right) =\frac{1}{3}\text { and } \mathrm{{MEU}}\left( g_{E}h^{\prime },\mathcal {M}_{0}\circ \Phi \right) =\frac{2}{3} \text {.} \end{aligned}$$(145)

Further, note that
$$\begin{aligned} \mathrm{{MEU}}\left( g_{E}h,\mathcal {M}_{0}\circ \Phi \right)\le & {} \mathrm{{MEU}}\left( g_{E}h,\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \mu _{0}^{\prime }\left( \theta \right) \right) \end{aligned}$$(146)$$\begin{aligned}\le & {} \mathrm{{EU}}\left( g_{E}h,\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \delta _{29}\right) \end{aligned}$$(147)$$\begin{aligned}= & {} \mathrm{{EU}}\left( g_{E}h,\varphi _{29}\right) \end{aligned}$$(148)$$\begin{aligned}= & {} \frac{29}{90}<\frac{1}{3} \end{aligned}$$(149)

as well as
$$\begin{aligned} \mathrm{{MEU}}\left( f_{E}h^{\prime },\mathcal {M}_{0}\circ \Phi \right)\le & {} \mathrm{{MEU}}\left( f_{E}h^{\prime },\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \mu _{0}^{\prime \prime }\left( \theta \right) \right) \end{aligned}$$(150)$$\begin{aligned}\le & {} \mathrm{{EU}}\left( f_{E}h^{\prime },\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \delta _{31}\right) \end{aligned}$$(151)$$\begin{aligned}= & {} \mathrm{{EU}}\left( f_{E}h^{\prime },\varphi _{31}\right) \end{aligned}$$(152)$$\begin{aligned}= & {} \frac{1}{3}+\frac{29}{90}<\frac{2}{3}\text {.} \end{aligned}$$(153)
Step 2 Consider the a posteriori decision situation. Note that
$$\begin{aligned} \mathrm{{MEU}}\left( f_{E}h,\Pi _{\gamma }^{\infty }\circ \Phi \right) =\frac{1}{3} \text { and }\mathrm{{MEU}}\left( g_{E}h^{\prime },\Pi _{\gamma }^{\infty }\circ \Phi \right) =\frac{2}{3}\text {.} \end{aligned}$$(154)

The specifications of \(\mu _{0}^{\prime }\) and \(\mu _{0}^{\prime \prime }\) imply, by Corollary 1, for some sufficiently large \(\gamma \) the existence of some
$$\begin{aligned} \delta _{\theta ^{\prime }},\delta _{\theta ^{\prime \prime }}\in \Pi _{\gamma }^{\infty } \end{aligned}$$(155)

such that \(\theta ^{\prime }<30<\theta ^{\prime \prime }\). Consequently,
$$\begin{aligned} \mathrm{{MEU}}\left( g_{E}h,\Pi _{\gamma }^{\infty }\circ \Phi \right)\le & {} \mathrm{{EU}}\left( g_{E}h,\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \delta _{\theta ^{\prime }}\right) \end{aligned}$$(156)$$\begin{aligned}\le & {} \mathrm{{EU}}\left( g_{E}h,\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \delta _{29}\right) \end{aligned}$$(157)$$\begin{aligned}< & {} \frac{1}{3} \end{aligned}$$(158)

as well as
$$\begin{aligned} \mathrm{{MEU}}\left( f_{E}h^{\prime },\Pi _{\gamma }^{\infty }\circ \Phi \right)\le & {} \mathrm{{EU}}\left( f_{E}h^{\prime },\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \delta _{\theta ^{\prime \prime }}\right) \end{aligned}$$(159)$$\begin{aligned}\le & {} \mathrm{{EU}}\left( f_{E}h^{\prime },\sum _{\theta \in \Theta }\varphi _{\theta }\left( \omega \right) \delta _{31}\right) \end{aligned}$$(160)$$\begin{aligned}< & {} \frac{2}{3}\text {.} \end{aligned}$$(161)
\(\square \)
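The MEU evaluations used in the proof can be sanity-checked with a small sketch. The acts below are hypothetical 0/1 bets on the three-color urn and merely stand in for the paper's acts: the maxmin expected utility of an act is the minimum of its expected utility over the set of predictive measures, and betting on the ambiguous color yields a strictly lower MEU than betting on the unambiguous one.

```python
def eu(act, p):
    # Expected utility of a 0/1 act under one probability vector.
    return sum(u * pi for u, pi in zip(act, p))

def meu(act, measures):
    # Maxmin expected utility: worst-case EU over the set of measures.
    return min(eu(act, p) for p in measures)

# States (red, yellow, black); utilities normalized to 0/1.
f = (1, 0, 0)  # bet on red (unambiguous)
g = (0, 1, 0)  # bet on yellow (ambiguous)

# Predictive measures phi_theta for theta yellow balls out of the 60
# non-red balls; theta = 0 and theta = 60 are excluded, as in the paper.
measures = [(30 / 90, th / 90, (60 - th) / 90) for th in range(1, 60)]

assert abs(meu(f, measures) - 1 / 3) < 1e-12   # same value under every measure
assert abs(meu(g, measures) - 1 / 90) < 1e-12  # worst case at theta = 1
```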
Proof of Proposition 5
By the proof of Theorem 2 (cf. inequality (138) as well as Step 2), \(\mu _{0}^{\prime }\in \mathcal {M}_{0,\gamma }^{\infty }\) if, and only if,
By Assumption 2, we have, for any \(\theta \in \Theta \),
if, and only if,
which proves the proposition. \(\square \)
Zimper, A., Ma, W. Bayesian learning with multiple priors and nonvanishing ambiguity. Econ Theory 64, 409–447 (2017). https://doi.org/10.1007/s00199-016-1007-y
Keywords
- Ambiguity
- Bayesian learning
- Misspecified priors
- Berk’s Theorem
- Kullback–Leibler divergence
- Ellsberg paradox