Conditional expectation of correspondences and economic applications

  • Research Article
  • Economic Theory

Abstract

We characterize the properties of convexity, compactness and preservation of upper hemicontinuity for conditional expectation of correspondences via the condition of “nowhere equivalence,” and hence extend the classical results on integration of correspondences. To illustrate the economic applications of those properties, we present new results on large games, abstract economies with asymmetric information and stochastic games.

Notes

  1. For some classical references, see Hildenbrand (1974), Matheron (1975), Castaing and Valadier (1977), Klein and Thompson (1984), and Yannelis (1991) and the references therein.

  2. A \({\mathcal {T}}\)-measurable set A in a measure space \((T, {\mathcal {T}}, \lambda )\) is said to be an atom if \(\lambda (A) \ne 0\), and for any \({\mathcal {T}}\)-measurable subset B of A, \(\lambda (B) =0\) or \(\lambda (B) = \lambda (A)\). A measure space is atomless if it has no atoms.

  3. See, for example, Aumann (1965), Hildenbrand (1974) and Aubin and Frankowska (2009).

  4. In this case, the conditional expectation is simply an integration, and hence all those regularity properties continue to hold.

  5. For example, if a correspondence is not convex valued, then its conditional expectation with respect to the whole \(\sigma \)-algebra is the set of all measurable selections of the correspondence, which is not convex.

  6. If \({\mathcal {T}}\) does coincide with \({\mathcal {G}}\) modulo null sets on a subset \(D\in {\mathcal {T}}\) of strictly positive measure, then D is said to be a \({\mathcal {G}}\)-atom; see the book Jacobs (1978) for further discussions.

  7. Two sub-\(\sigma \)-algebras \({\mathcal {F}}\) and \({\mathcal {G}}\) of \({\mathcal {T}}\) are said to be independent if for any \(F \in {\mathcal {F}}\) and \(G \in {\mathcal {G}}\), \(\lambda (F \cap G) = \lambda (F) \cdot \lambda (G)\).

  8. For the precise definitions of nowhere equivalence and atomless independent supplement, see Definition 1 below.

  9. As noted in Sect. 5.2, such a condition is also necessary.

  10. Theorem 1.2 in Dynkin and Evstigneev (1977) showed the scalar version of such a property under some other condition on an atomless probability space. Lemma 1 below indicates that their condition is equivalent to the nowhere equivalence condition in the case of an atomless probability space.

  11. The theory of large games has been extensively studied; see the survey by Khan and Sun (2002). For developments on large games in the last decade, see Sun and Yannelis (2007a, b), Qiao et al. (2016) and Sun et al. (2017); Martins-da-Rocha and Topuzu (2008) and Barelli and Duggan (2015); Yannelis (2009); Carmona and Podczeck (2014); Yu (2014); Balbus et al. (2015) and Bilancini and Boncinelli (2016); Khan et al. (2015); and Sun and Zhang (2015), which consider the respective issues of incentive compatibility, general preferences, abstract economies, discontinuous payoffs, rationalizability, strategic complementarities, relationship between different equilibrium notions and infinite-dimensional actions.

  12. See, for example, Schmeidler (1973).

  13. It is easy to see that mixed-strategy equilibria exist for an arbitrary set of players with the whole action profile as the externality part; see, for example, Ma (1969). The purpose of Khan et al. (2013) and He et al. (2017) was to study pure-strategy equilibria in a large game with traits under suitable conditions, where the externality part was formulated as the joint distribution of the traits and actions. In these two papers, the \(\sigma \)-algebra generated by both the payoff functions and the traits is strictly smaller than the original \(\sigma \)-algebra, restricted on any non-trivial collection of players. In our setting, such a measurability constraint is only imposed on the \(\sigma \)-algebra generated by the traits, but not on the \(\sigma \)-algebra generated by the payoff functions.

  14. The condition “many more agents than strategies” was studied in Rustichini and Yannelis (1991), and further developed in Tourky and Yannelis (2001). Recently, Greinecker and Podczeck (2016a) studied core equivalence results in atomless economies with possibly infinite-dimensional and non-separable commodity spaces based on a generalization of this condition; see Greinecker and Podczeck (2016b) for more discussions.

  15. It covers the earlier existence results in Nowak and Raghavan (1992), Duffie et al. (1994) and Duggan (2012).

  16. We shall use the common notation of a conditional probability \(\nu (D\mid {\mathcal {G}})\) to denote the conditional expectation \({\mathbb {E}}^{\nu }({\mathbf {1}}_{D}|{\mathcal {G}})\) of \({\mathbf {1}}_{D}\) given \({\mathcal {G}}\) under the positive measure \(\nu \), where \({\mathbf {1}}_{D}\) is the indicator function of the set D.

  17. By Corollary 1 in Sect. 5.1, this condition is the same as nowhere equivalence with respect to \(\lambda _i\) for all \(1 \le i \le m\).

  18. To prove Theorem 1 in the “Appendix,” we first prove Proposition 1, which shows that the conditional expectation of the constant correspondence \(\{0,1\}\) is equal to the conditional expectation of its convex hull—the constant correspondence [0, 1], in terms of the vector measure \(\lambda \). When \(\lambda \) is a scalar measure, such a result can be found in Theorem 1.14 of Jacobs (1978); see also Lemma 2 in Maharam (1942) for a similar result in the scalar case. A result from Soler (1970), which is similar to Proposition 1 here, was stated in Dynkin and Evstigneev (1977, p. 337) in terms of a probability measure; see Footnote 29 for more details.

  19. For the definition of upper hemicontinuity of a correspondence, see Definition 1 in Hildenbrand (1974, p. 21).

  20. Note that \(L_p((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\) is a reflexive Banach space for \(1< p < \infty \). Thus, the weak compactness (resp. weak upper hemicontinuity) and the weak\(^*\) compactness (resp. weak\(^*\) upper hemicontinuity) are equivalent in \(L_p((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\) for \(1< p < \infty \).

  21. For discussions on the relationship between randomized strategies and related notions in large games, see, for example, Khan et al. (2015).

  22. A pure-strategy profile f can be viewed as a randomized-strategy profile \(\bar{f}\) by letting \(\bar{f}(\omega )\) be the Dirac measure concentrated at \(f(\omega )\).

  23. Since \({\mathcal {G}}\) is countably generated, \(L_{1}((\Omega ,{\mathcal {G}},\tau ),{\mathbb {R}}^m)\) is separable, which implies that the closed ball with radius C in \(L_{\infty } ((\Omega ,{\mathcal {G}},\tau ),{\mathbb {R}}^m)\) is metrizable in the weak\(^*\) topology [see Royden and Fitzpatrick (2010, Corollary 11, p. 306)]. Hence, \(\Gamma \) is also metrizable.

  24. See, for example, Khan and Sun (2002) for detailed discussions.

  25. A probability space with such a property characterizes the class of saturated probability spaces; see Subsection 4.5 of He et al. (2017) for further discussions.

  26. See, for example, Podczeck (2008, p. 840) and He et al. (2017, p. 793) for the discussions on related notions.

  27. The total variation distance of two probability measures \(\mu \) and \(\nu \) on \((S, {\mathcal {S}})\) is \(\left\| \mu -\nu \right\| _{{TV}} = \sup _{{D\in {{\mathcal {S}}}}}|\mu (D)-\nu (D)|\). A sequence of probability measures \(\{\mu _n\}\) is said to be convergent to a probability measure \(\mu _0\) in total variation if \(\lim _{n \rightarrow \infty } \left\| \mu _n -\mu _0 \right\| _{{TV}} = 0\).

  28. For simplicity, we have assumed our underlying measure \(\lambda \) to be atomless. Of course, the convexity type results need this assumption. However, the results on compactness and preservation of upper hemicontinuity of integration of correspondences do allow \(\lambda \) to have atoms; see, for example, Propositions 7 and 8 on page 73 of Hildenbrand (1974). By treating the atomless and purely atomic parts separately, one can also generalize the results in Theorems 3 and 4 to such a setting.

  29. As noted in Footnote 18, the scalar version of this result can be found in Theorem 1.14 of Jacobs (1978) [Maharam (1942, Lemma 2) also stated a similar result]. Let g be an integrable function from an atomless probability space \((T, {\mathcal {T}}, \nu )\) to \({\mathbb {R}}^m\). Assume that \({\mathcal {T}}\) is nowhere equivalent to a sub-\(\sigma \)-algebra \({\mathcal {G}}\) under \(\nu \). The following result from Soler (1970) was stated in Dynkin and Evstigneev (1977, p. 337). For every \({\mathcal {T}}\)-measurable mapping \(f :T \rightarrow [0,1]\), there exists some set \(D \in {\mathcal {T}}\) such that \({\mathbb {E}}^{\nu } (f g| {\mathcal {G}}) = {\mathbb {E}}^{\nu } ({\mathbf {1}}_{D} g | {\mathcal {G}})\). This statement can be derived from Proposition 1 as follows. Without loss of generality, assume that g is positive. For each \(1 \le i \le m\), let \(g_i\) be the i-th component of g and \(\lambda _i\) the measure whose Radon–Nikodym derivative with respect to \(\nu \) is \(g_i\). Then, it can be verified that \({\mathbb {E}}^{\nu } (f g_i| {\mathcal {G}}) = {\mathbb {E}}^{\lambda _i} (f | {\mathcal {G}}) \cdot {\mathbb {E}}^{\nu } (g_i| {\mathcal {G}})\) and also \({\mathbb {E}}^{\nu } ({\mathbf {1}}_{D} g_i| {\mathcal {G}}) = {\mathbb {E}}^{\lambda _i} ({\mathbf {1}}_{D} | {\mathcal {G}}) \cdot {\mathbb {E}}^{\nu } (g_i| {\mathcal {G}})\). It is obvious that \(\mu = \sum _{1\le i \le m} \lambda _i\) is absolutely continuous with respect to \(\nu \). By Lemma 2, \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\mu \). Hence Proposition 1 implies that \({\mathbb {E}}^{\nu } (f g| {\mathcal {G}}) = {\mathbb {E}}^{\nu } ({\mathbf {1}}_{D} g | {\mathcal {G}})\). We were unable to read Soler (1970), which is not in English. For the sake of completeness, we give a complete proof of Proposition 1.

  30. See He et al. (2017, Lemma 5) and the first paragraph in the Proof of (ii) \(\Longrightarrow \) (i) in He et al. (2017, p. 810).

  31. Otherwise, we can rescale \(\lambda _1\) to be a probability measure.

  32. This result is well known. See, for example, Lemma 6 of He et al. (2017).

  33. Otherwise, one can work with \(f'_1\), \(f'_2\) and \(F'\), where \(f'_i = f_i + |f_1| + |f_2| + 1\), and \(F'(t) = \left\{ a + |f_1(t)| + |f_2(t)| + 1 :a\in F(t) \right\} \).

  34. For simplicity, the target space of the correspondence is \({\mathbb {R}}\). One can easily define a new correspondence on \({\mathbb {R}}^l\) such that each of the other \(l-1\) dimensions only contains 0.

  35. See, for example, Aumann (1965, p. 610).

  36. Since \((D,{\mathcal {T}}^D,\lambda _1^D)\) is an atomless probability space, for each \(k \ge 1\), one can divide the set D into \(2^k\) disjoint \({\mathcal {T}}^D\)-measurable subsets \(\{D_{k}^{j}\}_{1 \le j \le 2^k}\) such that (1) \(\cup _{1 \le j \le 2^k}D_{k}^{j} = D\), (2) \(\lambda _1^D(D_{k}^{j}) = \frac{1}{2^k}\) for \(1 \le j \le 2^k\), and (3) \(D_{k+1}^{2j-1} \cup D_{k+1}^{2j} = D_k^j\) for \(k \ge 1\) and \(1 \le j \le 2^k\). For \(t \in D_k^j\), let \(\varphi _k(t) = 1\) when j is odd, and \(-1\) when j is even. Then \(\{\varphi _k\}_{k\in {\mathbb {N}}}\) is an orthonormal sequence with the needed property.
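
The construction in Note 36 can be checked on a toy model. The sketch below assumes, purely for illustration, that D is the unit interval with the uniform probability and that the sets \(D_k^j\) are the dyadic intervals of length \(2^{-k}\); the names `phi`, `inner` and `K` are hypothetical and do not come from the paper. It computes the Gram matrix of \(\varphi _1, \ldots , \varphi _K\) exactly on the finest dyadic grid.

```python
from fractions import Fraction

def phi(k, cell, K):
    """Value of phi_k on the finest-level cell `cell` (0-based) among 2**K dyadic cells."""
    # 1-based index j of the level-k set D_k^j that contains this cell.
    j = cell // (2 ** (K - k)) + 1
    return 1 if j % 2 == 1 else -1

def inner(k, l, K):
    """Exact L2 inner product of phi_k and phi_l under the uniform probability."""
    total = sum(phi(k, c, K) * phi(l, c, K) for c in range(2 ** K))
    return Fraction(total, 2 ** K)

K = 6  # assumed maximal level for the check
gram = {(k, l): inner(k, l, K) for k in range(1, K + 1) for l in range(1, K + 1)}
```

In this model the diagonal entries of `gram` equal 1 and the off-diagonal entries vanish, because on each level-k set the finer function \(\varphi _l\) takes the values 1 and \(-1\) on halves of equal mass.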

References

  • Aubin, J.-P., Frankowska, H.: Set-Valued Analysis. Springer, Berlin (2009)

  • Aumann, R.J.: Integrals of set-valued functions. J. Math. Anal. Appl. 12, 1–12 (1965)

  • Balbus, Ł., Dziewulski, P., Reffett, K., Woźny, Ł.: Differential information in large games with strategic complementarities. Econ. Theory 59, 201–243 (2015). doi:10.1007/s00199-014-0827-x

  • Balder, E.J.: A unifying approach to existence of Nash equilibria. Int. J. Game Theory 24, 79–94 (1995)

  • Barelli, P., Duggan, J.: Extremal choice equilibrium with applications to large games, stochastic games, & endogenous institutions. J. Econ. Theory 155, 95–130 (2015)

  • Bilancini, E., Boncinelli, L.: Strict Nash equilibria in non-atomic games with strict single crossing in players (or types) and actions. Econ. Theory Bull. 4, 95–109 (2016)

  • Blackwell, D.: Discounted dynamic programming. Ann. Math. Stat. 36, 226–235 (1965)

  • Bogachev, V.I.: Measure Theory, vol. 1. Springer, Berlin (2007)

  • Carmona, G., Podczeck, K.: Existence of Nash equilibrium in games with a measure space of players and discontinuous payoff functions. J. Econ. Theory 152, 130–178 (2014)

  • Castaing, C., Valadier, M.: Convex Analysis and Measurable Multifunctions, Lecture Notes in Mathematics No. 580. Springer, Berlin (1977)

  • Chow, Y.S., Teicher, H.: Probability Theory: Independence, Interchangeability, Martingales. Springer, New York (2012)

  • Debreu, G.: A social equilibrium existence theorem. Proc. Natl. Acad. Sci. USA 38, 803–886 (1952)

  • Duffie, D., Geanakoplos, J., Mas-Colell, A., McLennan, A.: Stationary Markov equilibria. Econometrica 62, 745–781 (1994)

  • Duggan, J.: Noisy stochastic games. Econometrica 80, 2017–2045 (2012)

  • Dynkin, E.B., Evstigneev, I.V.: Regular conditional expectations of correspondences. Theory Probab. Appl. 21, 325–338 (1977)

  • Greinecker, M., Podczeck, K.: Edgeworth’s conjecture and the number of agents and commodities. Econ. Theory 62, 93–130 (2016a). doi:10.1007/s00199-015-0866-y

  • Greinecker, M., Podczeck, K.: Core equivalence with differentiated commodities. Working paper (2016b)

  • Hanen, A., Neveu, J.: Atomes conditionnels d’un espace de probabilité. Acta Mathematica Hungarica 17, 443–449 (1966)

  • He, W., Sun, Y.: Stationary Markov perfect equilibria in discounted stochastic games. J. Econ. Theory 169, 35–61 (2017)

  • He, W., Sun, X., Sun, Y.: Modeling infinitely many agents. Theor. Econ. 12, 771–815 (2017)

  • Hildenbrand, W.: Core and Equilibria of a Large Economy. Princeton University Press, Princeton (1974)

  • Jacobs, K.: Measure and Integral. Academic Press, London (1978)

  • Khan, M.A., Sun, Y.: Non-cooperative games with many players. In: Aumann, R.J., Hart, S. (eds.) Handbook of Game Theory, Chapter 46, vol. 3, pp. 1761–1808. North-Holland, Amsterdam (2002)

  • Khan, M.A., Rath, K.P., Sun, Y., Yu, H.: Large games with a bio-social typology. J. Econ. Theory 148, 1122–1149 (2013)

  • Khan, M.A., Rath, K.P., Sun, Y., Yu, H.: Strategic uncertainty and the ex-post Nash property in large games. Theor. Econ. 10, 103–129 (2015)

  • Klein, E., Thompson, A.C.: Theory of Correspondences. Wiley, New York (1984)

  • Loeb, P.A.: Real Analysis. Birkhäuser, Boston (2016)

  • Ma, T.-W.: On sets with convex sections. J. Math. Anal. Appl. 27, 413–416 (1969)

  • Maharam, D.: On homogeneous measure algebras. Proc. Natl. Acad. Sci. 28, 108–111 (1942)

  • Matheron, G.: Random Sets and Integral Geometry. Wiley, London (1975)

  • Martins-da-Rocha, V.F., Topuzu, M.: Cournot-Nash equilibria in continuum games with non-ordered preference. J. Econ. Theory 140, 314–327 (2008)

  • Neveu, J.: Atomes conditionnels d’espaces de probabilité et théorie de l’information. In: Symposium on Probability Methods in Analysis. Springer, Berlin, pp. 256–271 (1967)

  • Nowak, A.S., Raghavan, T.E.S.: Existence of stationary correlated equilibria with symmetric information for discounted stochastic games. Math. Oper. Res. 17, 519–527 (1992)

  • Podczeck, K.: On the convexity and compactness of the integral of a Banach space valued correspondence. J. Math. Econ. 44, 836–852 (2008)

  • Qiao, L., Sun, Y., Zhang, Z.: Conditional exact law of large numbers and asymmetric information economies with aggregate uncertainty. Econ. Theory 62, 43–64 (2016). doi:10.1007/s00199-014-0855-6

  • Rath, K.P.: A direct proof of the existence of pure strategy equilibria in games with a continuum of players. Econ. Theory 2, 427–433 (1992). doi:10.1007/BF01211424

  • Rauh, M.T.: Non-cooperative games with a continuum of players whose payoffs depend on summary statistics. Econ. Theory 21, 901–906 (2003). doi:10.1007/s00199-001-0252-9

  • Royden, H.L., Fitzpatrick, P.M.: Real Analysis, 4th edn. Prentice Hall, Boston (2010)

  • Rustichini, A., Yannelis, N.C.: What is perfect competition. In: Khan, M.A., Yannelis, N.C. (eds.) Equilibrium Theory in Infinite Dimensional Spaces, pp. 249–265. Springer, Berlin (1991)

  • Schmeidler, D.: Equilibrium points of nonatomic games. J. Stat. Phys. 7, 295–300 (1973)

  • Shapley, L.: Stochastic games. Proc. Natl. Acad. Sci. USA 39, 1095–1100 (1953)

  • Soler, J.-L.: Notion de liberté en statistique mathématique. Université Joseph-Fourier - Grenoble I, Modélisation et simulation (1970). (in French)

  • Sun, Y., Yannelis, N.C.: Perfect competition in asymmetric information economies: compatibility of efficiency and incentives. J. Econ. Theory 134, 175–194 (2007a)

  • Sun, Y., Yannelis, N.C.: Core, equilibria and incentives in large asymmetric information economies. Games Econ. Behav. 61, 131–155 (2007b)

  • Sun, X., Zhang, Y.: Pure-strategy Nash equilibria in nonatomic games with infinite-dimensional action spaces. Econ. Theory 58, 161–182 (2015). doi:10.1007/s00199-013-0795-6

  • Sun, X., Sun, Y., Wu, L., Yannelis, N.C.: Equilibria and incentives in private information economies. J. Econ. Theory 169, 474–488 (2017)

  • Tourky, R., Yannelis, N.C.: Markets with many more agents than commodities: Aumann’s “Hidden” assumption. J. Econ. Theory 101, 189–221 (2001)

  • Yannelis, N.C.: Integration of Banach-valued correspondences. In: Khan, M.A., Yannelis, N.C. (eds.) Equilibrium Theory in Infinite Dimensional Spaces. Springer, Berlin (1991)

  • Yannelis, N.C.: Debreu’s social equilibrium theorem with asymmetric information and a continuum of agents. Econ. Theory 38, 419–432 (2009). doi:10.1007/s00199-007-0246-3

  • Yu, H.: Rationalizability in large games. Econ. Theory 55, 457–479 (2014). doi:10.1007/s00199-013-0756-0

  • Yu, H., Zhu, W.: Large games with transformed summary statistics. Econ. Theory 26, 237–241 (2005). doi:10.1007/s00199-004-0516-2

Author information

Corresponding author

Correspondence to Wei He.

Additional information

The authors thank Bin Wu and Nicholas C. Yannelis for helpful suggestions. This version owes substantially to the careful reading and expository suggestions of three anonymous referees. The research was supported in part by the Singapore Ministry of Education Academic Research Fund Tier 1 Grants R-122-000-227-112 and R-146-000-170-112.

Appendix

Following the notation in Sect. 2.1, let \({\mathcal {T}}\) be a \(\sigma \)-algebra on a nonempty set T and \({\mathcal {G}}\) a sub-\(\sigma \)-algebra of \({\mathcal {T}}\) throughout this appendix.

1.1 Technical preparations

The following lemma shows that nowhere equivalence is preserved under absolute continuity of measures.

Lemma 2

If \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under some finite positive measure \(\nu \), then \({\mathcal {T}}\) is also nowhere equivalent to \({\mathcal {G}}\) under any finite positive measure \(\hat{\nu }\) which is absolutely continuous with respect to \(\nu \).

Proof

Suppose that \(\hat{\nu }\) is absolutely continuous with respect to \(\nu \), and \({\mathcal {T}}\) is not nowhere equivalent to \({\mathcal {G}}\) under \(\hat{\nu }\). Then one can find a set \(D'\in {\mathcal {T}}\) with \(\hat{\nu }(D')>0\) such that for any \({\mathcal {T}}\)-measurable subset \(D_0'\) of \(D'\), there exists a set \(D_1' \in {\mathcal {G}}^{D'}\) with \(\hat{\nu }(D_0'\triangle D_1')=0\).

Let \(\hat{\rho }\) be the Radon–Nikodym derivative of \(\hat{\nu }\) with respect to \(\nu \). Let \(E_0 = \{t\in T:\hat{\rho }(t)>0\}\) and \(D=D'\cap E_0\). We have

$$\begin{aligned} 0<\hat{\nu }(D')=\int _{D'}\hat{\rho }{{\mathrm{d}}}\nu =\int _{D'\cap E_0}\hat{\rho }{{\mathrm{d}}}\nu =\int _D\hat{\rho }{{\mathrm{d}}}\nu = \hat{\nu } (D), \end{aligned}$$

and hence \(\nu (D)>0\). For any \(\hat{\nu }\)-null set \(B\subseteq D\ (\subseteq E_0)\), we have \(0=\hat{\nu }(B)=\int _B\hat{\rho }{{\mathrm{d}}}\nu \), which implies that \(\nu (B)=0\) because \(\hat{\rho }>0\) on B.

Any \({\mathcal {T}}\)-measurable subset \(D_0\) of D is also a subset of \(D'\). Thus, there exists a set \(E_1\in {\mathcal {G}}\) such that \(\hat{\nu }(D_0\triangle (D'\cap E_1))=0\). Let \(E_2=D\cap E_1\). Then \(E_2\in {\mathcal {G}}^{D}\). We have

$$\begin{aligned} \hat{\nu }(E_2\setminus D_0) = \hat{\nu }((D\cap E_1) \setminus D_0) \le \hat{\nu }((D'\cap E_1) \setminus D_0)=0, \end{aligned}$$

and

$$\begin{aligned} \hat{\nu }(D_0\setminus E_2)&= \hat{\nu }(D_0\setminus (D\cap E_1)) \\&\le \hat{\nu }(D_0\setminus (D'\cap E_1)) + \hat{\nu }(D_0\cap (D'\setminus D)) \\&=0. \end{aligned}$$

Therefore, \(\hat{\nu }(D_0\triangle E_2)=0\). Since \(D_0\triangle E_2 \subseteq D\), the observation above that every \(\hat{\nu }\)-null subset of D is \(\nu \)-null yields \(\nu (D_0\triangle E_2)=0\).

As a result, we have shown that \(\nu (D)>0\) and for any \({\mathcal {T}}\)-measurable subset \(D_0\) of D, there exists a set \(E_1 \in {\mathcal {G}}\) such that \(\nu (D_0\triangle (D\cap E_1))=0\). This contradicts the assumption that \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\nu \). Therefore, \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\hat{\nu }\). \(\square \)
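
The measure-theoretic step used in the proof above (restricting to \(E_0\) loses no \(\hat{\nu }\)-mass, and \(\hat{\nu }\)-null subsets of \(E_0\) are \(\nu \)-null) can be illustrated on a finite toy space. This is only a sketch under assumed data: the five points, the weights `nu` and the density `rho_hat` are hypothetical, and a finite space cannot itself satisfy nowhere equivalence, so the example checks the density argument only.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical five-point space; nu gives the last point zero mass.
points = ["a", "b", "c", "d", "e"]
nu = dict(zip(points, [Fraction(1, 4)] * 4 + [Fraction(0)]))
# Hypothetical density rho_hat = d(nu_hat)/d(nu); it vanishes on "b" and "d".
rho_hat = dict(zip(points, [Fraction(2), Fraction(0), Fraction(2), Fraction(0), Fraction(5)]))
nu_hat = {p: rho_hat[p] * nu[p] for p in points}

def measure(weights, subset):
    """Total mass of `subset` under the point weights `weights`."""
    return sum((weights[p] for p in subset), Fraction(0))

def all_subsets(items):
    """All subsets of a finite list, as Python sets."""
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

E0 = [p for p in points if rho_hat[p] > 0]   # the set where the density is positive
D_prime = ["a", "b", "e"]                    # plays the role of D'
D = [p for p in D_prime if p in E0]          # D = D' intersected with E0

same_mass = measure(nu_hat, D_prime) == measure(nu_hat, D)
null_sets_agree = all(
    measure(nu, B) == 0 for B in all_subsets(E0) if measure(nu_hat, B) == 0
)
# Outside E0 the implication can fail: {"b"} is nu_hat-null but not nu-null.
fails_outside_E0 = measure(nu_hat, {"b"}) == 0 and measure(nu, {"b"}) > 0
```

The last flag shows why the proof passes from \(D'\) to \(D = D' \cap E_0\) before transferring null sets from \(\hat{\nu }\) to \(\nu \).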

The following corollary implies that the nowhere equivalence condition for a vector measure is the same as the nowhere equivalence condition for each of its components.

Corollary 1

Let \(\lambda = (\lambda _1, \ldots , \lambda _m)\) be a vector measure, and \(\mu = \sum _{1\le i \le m} \lambda _i\).

  1. If \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\mu \), then \({\mathcal {T}}\) is also nowhere equivalent to \({\mathcal {G}}\) under \(\lambda _i\) for each i.

  2. If \({\mathcal {T}}\) has a \({\mathcal {G}}\)-atom D under \(\mu \), then D is a \({\mathcal {G}}\)-atom of \({\mathcal {T}}\) under \(\lambda _i\) for any i such that \(\lambda _i(D) > 0\).

Proof

  (1) Since \(\lambda _i\) is absolutely continuous with respect to \(\mu \) for each i, the claim holds by Lemma 2.

  (2) Suppose that D is a \({\mathcal {G}}\)-atom under \(\mu \). That is, \(\mu (D)>0\) and for any \({\mathcal {T}}\)-measurable subset \(D_0\) of D, there exists a set \(D_1 \in {\mathcal {G}}^{D}\) such that \(\mu (D_0\triangle D_1)=0\). By the definition of \(\mu \), we have

    $$\begin{aligned} \sum _{1\le i \le m} \lambda _i(D_0\triangle D_1) = 0, \end{aligned}$$

    which implies that \(\lambda _i(D_0\triangle D_1) = 0\) for each i, since each \(\lambda _i\) is a positive measure. Thus, D is a \({\mathcal {G}}\)-atom of \({\mathcal {T}}\) under \(\lambda _i\) for any i such that \(\lambda _i(D) > 0\). The proof is complete.

\(\square \)

1.2 Proofs of the results in Sect. 2.2

Following Sect. 2.1, we fix some notation for this subsection. Let \(\lambda _i\) be an atomless finite positive measure on \((T,{\mathcal {T}})\) for \(1 \le i \le m\), \(\lambda = (\lambda _1, \ldots , \lambda _m)\), and \(\mu = \sum _{1\le i \le m} \lambda _i\).

To prove Theorem 1, we first prove the following proposition.

Proposition 1

Suppose that \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\mu \). For every \({\mathcal {T}}\)-measurable mapping \(f :T \rightarrow [0,1]\), there exists some set \(D \in {\mathcal {T}}\) such that \({\mathbb {E}}^{\lambda } (f | {\mathcal {G}}) = {\mathbb {E}}^{\lambda } ({\mathbf {1}}_{D} | {\mathcal {G}})\) (see Footnote 29).

Proof

Without loss of generality, we can assume that \((T, {\mathcal {G}}, \mu )\) is an atomless measure space. Otherwise, one can work with a larger sub-\(\sigma \)-algebra \(\hat{{\mathcal {G}}}\) such that \({\mathcal {G}}\subseteq \hat{{\mathcal {G}}}\), \((T, \hat{{\mathcal {G}}}, \mu )\) is an atomless measure space and \({\mathcal {T}}\) is nowhere equivalent to \(\hat{{\mathcal {G}}}\) under \(\mu \) (see Footnote 30). It is clear that \({\mathbb {E}}^{\lambda } (f | \hat{{\mathcal {G}}}) = {\mathbb {E}}^{\lambda } ({\mathbf {1}}_{D} | \hat{{\mathcal {G}}})\) implies \({\mathbb {E}}^{\lambda } (f | {\mathcal {G}}) = {\mathbb {E}}^{\lambda } ({\mathbf {1}}_{D} | {\mathcal {G}})\).

Below, we prove the result by induction.

Step 1 We first prove the result for \(\lambda _1\). Since \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\mu \), it is also nowhere equivalent to \({\mathcal {G}}\) under \(\lambda _1\) by Corollary 1. Without loss of generality, we assume that \(\lambda _1\) is a probability measure (see Footnote 31). By Lemma 1, \({\mathcal {G}}\) has a (countably generated) atomless independent supplement \({\mathcal {H}}\) in \({\mathcal {T}}\) under \(\lambda _1\).

Let \(\tilde{f} = {\mathbb {E}}^{\lambda _1} (f | {\mathcal {G}})\), and \(\tilde{{\mathcal {G}}}\) be a countably generated sub-\(\sigma \)-algebra of \({\mathcal {G}}\) such that \((T, \tilde{{\mathcal {G}}}, \lambda _1)\) is atomless and \(\tilde{f}\) is a \(\tilde{{\mathcal {G}}}\)-measurable function from T to [0, 1].

There exist two mappings g and h from T to [0, 1] such that g (resp. h) is a measure-preserving mapping from \((T,\tilde{{\mathcal {G}}},\lambda _1)\) (resp. \((T,{\mathcal {H}},\lambda _1)\)) to the unit interval I with its Borel \(\sigma \)-algebra \({\mathcal {B}}\) and Lebesgue measure \(\eta \), and for any \(E\in \tilde{{\mathcal {G}}}\) (resp. \(E \in {\mathcal {H}}\)), there exists some set \(E' \in {\mathcal {B}}\) with \(\lambda _1(E\triangle g^{-1}(E'))=0\) (resp. \(\lambda _1(E\triangle h^{-1}(E'))=0\)); see Footnote 32. Then \((g, h)\) is a \({\mathcal {T}}\)-measurable function from T to \([0,1]\times [0,1]\), which, since \(\tilde{{\mathcal {G}}}\) and \({\mathcal {H}}\) are independent, induces the uniform distribution \(\eta \otimes \eta \) on the unit square \([0,1]\times [0,1]\).

Since \(\tilde{f}\) is \(\tilde{{\mathcal {G}}}\)-measurable and \(\tilde{{\mathcal {G}}}\) is essentially generated by g, there exists some \({\mathcal {B}}\)-measurable mapping \(\psi :[0,1] \rightarrow [0,1]\) such that \(\tilde{f}(t) = \psi (g(t))\) for \(\lambda _1\)-almost all \(t \in T\). Define a \({\mathcal {B}}\otimes {\mathcal {B}}\)-measurable mapping \(\tilde{\psi } :[0,1] \times [0,1] \rightarrow \{0,1\}\) as follows: \(\tilde{\psi }(a,b) = 1\) if \(0 \le b \le \psi (a)\), and 0 otherwise. Denote \(D = (g,h)^{-1}\left( \tilde{\psi }^{-1}(\{1\}) \right) \).

By the choice of the function g, for any set \(E \in {\mathcal {G}}\), there exists some \({\mathcal {B}}\)-measurable function \(e :[0,1] \rightarrow [0,1]\) such that \({\mathbb {E}}^{\lambda _1} ({\mathbf {1}}_{E} | \tilde{{\mathcal {G}}})(t) = e(g(t))\) for \(\lambda _1\)-almost all \(t \in T\). Since \({\mathcal {G}}\) and \({\mathcal {H}}\) are independent sub-\(\sigma \)-algebras of \({\mathcal {T}}\), it follows from Theorem 1 in Chow and Teicher (2012, p. 230) that \({\mathbb {E}}^{\lambda _1}({\mathbf {1}}_{E} | \sigma (\tilde{{\mathcal {G}}}, {\mathcal {H}})) = {\mathbb {E}}^{\lambda _1} ({\mathbf {1}}_{E} | \tilde{{\mathcal {G}}})\), where \(\sigma (\tilde{{\mathcal {G}}}, {\mathcal {H}})\) is the \(\sigma \)-algebra generated by \(\{\tilde{{\mathcal {G}}}, {\mathcal {H}}\}\). We have

$$\begin{aligned} \int _{T} {\mathbf {1}}_{D}(t) {\mathbf {1}}_{E}(t) \lambda _1({{\mathrm{d}}}t)&= \int _T {\mathbf {1}}_{[0,\psi (g(t))]}(h(t)) {\mathbf {1}}_{E}(t)\lambda _1({{\mathrm{d}}}t) \\&= \int _T {\mathbb {E}}^{\lambda _1} \left( {\mathbf {1}}_{[0,\psi (g(t))]}(h(t)) {\mathbf {1}}_{E}(t) | \sigma (\tilde{{\mathcal {G}}}, {\mathcal {H}}) \right) \lambda _1({{\mathrm{d}}}t) \\&= \int _T {\mathbf {1}}_{[0,\psi (g(t))]}(h(t)) {\mathbb {E}}^{\lambda _1} ({\mathbf {1}}_{E} | \tilde{{\mathcal {G}}})\lambda _1({{\mathrm{d}}}t) \\&= \int _T {\mathbf {1}}_{[0,\psi (g(t))]}(h(t)) e(g(t))\lambda _1({{\mathrm{d}}}t) \\&= \int _{[0,1]}\int _{[0,1]} {\mathbf {1}}_{[0,\psi (a)]}(b) e(a) \eta ({{\mathrm{d}}}b)\eta ({{\mathrm{d}}}a) \\&= \int _{[0,1]} \psi (a) e(a) \eta ({{\mathrm{d}}}a) \\&= \int _{T} \psi (g(t)) e(g(t)) \lambda _1({{\mathrm{d}}}t) \\&= \int _{T} \psi (g(t)) {\mathbb {E}}^{\lambda _1} ({\mathbf {1}}_{E} | \tilde{{\mathcal {G}}}) \lambda _1({{\mathrm{d}}}t) \\&= \int _{T} \tilde{f}(t) {\mathbf {1}}_{E}(t) \lambda _1({{\mathrm{d}}}t). \end{aligned}$$

Thus, \({\mathbb {E}}^{\lambda _1} (f | {\mathcal {G}}) = {\mathbb {E}}^{\lambda _1} ({\mathbf {1}}_{D} | {\mathcal {G}})\).
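
The construction of D in Step 1 can be visualized in a concrete model. The sketch below assumes, for illustration only, that T is the unit square with Lebesgue measure, that g and h are the two coordinate projections, and that \(f(a,b) = ab\), so that \(\psi (a) = a/2\); the function names and the grid size `n` are hypothetical. It compares the conditional expectations of f and of \({\mathbf {1}}_D\) given the first coordinate by a midpoint rule.

```python
def mean_over_b(func, a, n=2000):
    """Midpoint-rule approximation of the integral of func(a, b) over b in [0, 1]."""
    return sum(func(a, (j + 0.5) / n) for j in range(n)) / n

def f(a, b):
    # Hypothetical [0, 1]-valued function on the unit square.
    return a * b

def psi(a):
    # Conditional expectation of f given the first coordinate: integral of a*b over b.
    return a / 2.0

def indicator_D(a, b):
    # D = {(a, b) : 0 <= b <= psi(a)}, mirroring the set built in Step 1.
    return 1.0 if b <= psi(a) else 0.0

def max_gap(points, n=2000):
    """Largest difference between E(f | a) and E(1_D | a) over the sample points."""
    return max(
        abs(mean_over_b(f, a, n) - mean_over_b(indicator_D, a, n)) for a in points
    )

gap = max_gap([0.1, 0.3, 0.5, 0.7, 0.9])
```

In this model `gap` is bounded by the discretization error of the grid, reflecting that the vertical section of D over a has length \(\psi (a)\), which is exactly the conditional mean of f at a.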

Step 2 Suppose that we have shown the result for any k atomless positive measures. Consider the case \((\lambda _1, \ldots , \lambda _{k+1})\). Let \(\mu _{k+1} = \sum _{1 \le i \le k+1} \lambda _{i}\). Then \({\mathcal {T}}\) is also nowhere equivalent to \({\mathcal {G}}\) under \(\mu _{k+1}\). For \(i = 1, \ldots , k+1\), let \(\tilde{\rho }_i\) be the Radon–Nikodym derivative of \(\lambda _i\) with respect to \(\mu _{k+1}\). Let W be the set of all \({\mathcal {T}}\)-measurable functions from T to [0, 1], which is convex and weak\(^*\) compact in \(L_{\infty }((T, {\mathcal {T}},\mu _{k+1}),{\mathbb {R}})\) [by Theorem V.1 in Castaing and Valadier (1977)].

Note that \(\lambda _i\) is absolutely continuous with respect to \(\mu _{k+1}\), with Radon–Nikodym derivative \(\tilde{\rho }_i\). Each function \(w \in W\) is bounded, so it belongs to \(L_{\infty }((T, {\mathcal {T}},\lambda _i),{\mathbb {R}})\), and hence also to \(L_{1}((T, {\mathcal {T}},\lambda _i),{\mathbb {R}})\). Define a mapping \(\Phi \) from W to \(\prod _{1 \le i \le k+1}L_{\infty }((T, {\mathcal {G}},\lambda _i),{\mathbb {R}})\) as follows:

$$\begin{aligned} \Phi (w) = \left( {\mathbb {E}}^{\lambda _1}(w|{\mathcal {G}}),\ldots , {\mathbb {E}}^{\lambda _{k+1}}(w|{\mathcal {G}}) \right) . \end{aligned}$$

Then \(\Phi \) is an affine function. We shall show that \(\Phi \) is continuous, where both W and \(\prod _{1 \le i \le k+1}L_{\infty }((T, {\mathcal {G}},\lambda _i),{\mathbb {R}})\) are endowed with weak\(^*\) topology.

Suppose that \(\{w_{\alpha }\}_{\alpha \in D}\) is a net which weak\(^*\) converges to \(\tilde{w}_{0}\) in W, where D is a directed set. For \(1 \le i \le k+1\) and \(q \in L_{1}((T, {\mathcal {G}},\lambda _i),{\mathbb {R}})\),

$$\begin{aligned} \int _T {\mathbb {E}}^{\lambda _i}(w_{\alpha }|{\mathcal {G}})(t) q (t) \lambda _i({{\mathrm{d}}}t)&= \int _T {\mathbb {E}}^{\lambda _i}(w_{\alpha } q|{\mathcal {G}})(t) \lambda _i({{\mathrm{d}}}t) \\&= \int _T w_{\alpha }(t) q(t) \lambda _i({{\mathrm{d}}}t) \\&= \int _T w_{\alpha }(t) q(t) \tilde{\rho }_i(t) \mu _{k+1}({{\mathrm{d}}}t) \\&\rightarrow \int _T \tilde{w}_{0}(t) q(t) \tilde{\rho }_i(t) \mu _{k+1}({{\mathrm{d}}}t) \\&= \int _T \tilde{w}_{0}(t) q(t) \lambda _i({{\mathrm{d}}}t) \\&= \int _T {\mathbb {E}}^{\lambda _i}(\tilde{w}_{0}|{\mathcal {G}})(t) q (t) \lambda _i({{\mathrm{d}}}t). \end{aligned}$$

The convergence holds since \(w_{\alpha }\) weak\(^*\) converges to \(\tilde{w}_{0}\) in \(L_{\infty }((T, {\mathcal {T}},\mu _{k+1}),{\mathbb {R}})\) and \(q\tilde{\rho }_i \in L_{1}((T, {\mathcal {T}},\mu _{k+1}),{\mathbb {R}})\) (because \(q \in L_{1}((T, {\mathcal {G}},\lambda _i),{\mathbb {R}})\) and \(\int _T |q| \tilde{\rho }_i {{\mathrm{d}}}\mu _{k+1} = \int _T |q| {{\mathrm{d}}}\lambda _i < \infty \)). Thus, \({\mathbb {E}}^{\lambda _i}(w_{\alpha }|{\mathcal {G}})\) weak\(^*\) converges to \({\mathbb {E}}^{\lambda _i}(\tilde{w}_{0}|{\mathcal {G}})\) in \(L_{\infty }((T, {\mathcal {G}},\lambda _i),{\mathbb {R}})\) for each i, which implies that \(\Phi \) is continuous under the weak\(^*\) topologies for both W and \(\prod _{1 \le i \le k+1}L_{\infty }((T, {\mathcal {G}},\lambda _i),{\mathbb {R}})\). As a result, \(\Phi (W)\) is convex and weak\(^*\) compact in \(\prod _{1 \le i \le k+1}L_{\infty }((T, {\mathcal {G}},\lambda _i),{\mathbb {R}})\).

Now fix a \({\mathcal {T}}\)-measurable function f from T to [0, 1]. Then \(f \in W\). Let \(W_f = \Phi ^{-1}({\mathbb {E}}^{\lambda _1}(f|{\mathcal {G}}),\ldots , {\mathbb {E}}^{\lambda _{k+1}}(f|{\mathcal {G}}))\). Since \(\Phi \) is affine and weak\(^*\) continuous, \(W_f\) is a nonempty (it contains f), convex and weak\(^*\) compact subset of \(L_{\infty }((T, {\mathcal {T}},\mu _{k+1}),{\mathbb {R}})\). By the Krein–Milman theorem [see Royden and Fitzpatrick (2010, p. 296)], \(W_f\) has an extreme point \(w_0\). We next show by contradiction that \(w_0\) is an indicator function in the sense that \(w_0(t) = 0\) or 1 for \(\mu _{k+1}\)-almost all \(t \in T\). Suppose, to the contrary, that there exist some number \(0< \epsilon < \frac{1}{2}\) and a subset \(\hat{E} \in {\mathcal {T}}\) such that \(\mu _{k+1}(\hat{E}) > 0\) and \(\epsilon \le w_0(t) \le 1 -\epsilon \) for \(t \in \hat{E}\).

Let \(\mu _{k+1}^{\hat{E}}\) be the probability measure on \((\hat{E}, {\mathcal {T}}^{\hat{E}})\) rescaled from the restriction of \(\mu _{k+1}\) on \((\hat{E}, {\mathcal {T}}^{\hat{E}})\). Since \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\mu _{k+1}\), \({\mathcal {T}}^{\hat{E}}\) is also nowhere equivalent to \({\mathcal {G}}^{\hat{E}}\) under \(\mu _{k+1}^{\hat{E}}\). By Lemma 1, there exists a subset \(E \subseteq \hat{E}\) such that the event E is independent of the \(\sigma \)-algebra \({\mathcal {G}}^{\hat{E}}\) in the probability space \((\hat{E}, {\mathcal {T}}^{\hat{E}}, \mu _{k+1}^{\hat{E}})\), and \(\mu _{k+1}^{\hat{E}}(E) = \mu _{k+1}^{\hat{E}}(\hat{E} \setminus E) = \frac{1}{2}\).

By the induction hypothesis, there exist some \({\mathcal {T}}\)-measurable subsets \(E_1 \subseteq E\) and \(E_2 \subseteq (\hat{E}\setminus E)\) such that for \(2 \le i \le k+1\),

$$\begin{aligned} \frac{1}{2}{\mathbb {E}}^{\lambda _i} ({\mathbf {1}}_E | {\mathcal {G}}) = {\mathbb {E}}^{\lambda _i} ({\mathbf {1}}_{E_1} | {\mathcal {G}}) \end{aligned}$$

and

$$\begin{aligned} \frac{1}{2}{\mathbb {E}}^{\lambda _i} ({\mathbf {1}}_{\hat{E}\setminus E} | {\mathcal {G}}) = {\mathbb {E}}^{\lambda _i} ({\mathbf {1}}_{E_2} | {\mathcal {G}}). \end{aligned}$$

Define two \({\mathcal {G}}\)-measurable mappings \(\xi _1\) and \(\xi _2\) on T as follows:

$$\begin{aligned} \xi _1(t) = {\left\{ \begin{array}{ll} \epsilon , &{} \text{ if } {\mathbb {E}}^{\mu _{k+1}} (\frac{1}{2}{\mathbf {1}}_E - {\mathbf {1}}_{E_1} | {\mathcal {G}})(t) = 0; \\ \epsilon \cdot {\mathbb {E}}^{\mu _{k+1}} (\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}})(t), &{} \text{ if } {\mathbb {E}}^{\mu _{k+1}} (\frac{1}{2}{\mathbf {1}}_E - {\mathbf {1}}_{E_1} | {\mathcal {G}})(t) \ne 0; \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} \xi _2(t) = {\left\{ \begin{array}{ll} \epsilon , &{} \text{ if } {\mathbb {E}}^{\mu _{k+1}} (\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}})(t) = 0; \\ \epsilon \cdot {\mathbb {E}}^{\mu _{k+1}} (\frac{1}{2}{\mathbf {1}}_E - {\mathbf {1}}_{E_1} | {\mathcal {G}})(t), &{} \text{ if } {\mathbb {E}}^{\mu _{k+1}} (\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}})(t) \ne 0. \end{array}\right. } \end{aligned}$$

Then we have

  1. \(\xi _1\) and \(\xi _2\) are both \({\mathcal {G}}\)-measurable;

  2. for each t, \(-\epsilon \le \xi _1(t), \xi _2(t) \le \epsilon \);

  3. \(\xi _1(t) \cdot {\mathbb {E}}^{\mu _{k+1}} (\frac{1}{2}{\mathbf {1}}_E - {\mathbf {1}}_{E_1} | {\mathcal {G}})(t) = \xi _2(t) \cdot {\mathbb {E}}^{\mu _{k+1}} (\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}})(t)\).
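The second and third properties can be checked directly from the definitions. As a sketch, write \(u = {\mathbb {E}}^{\mu _{k+1}} (\frac{1}{2}{\mathbf {1}}_E - {\mathbf {1}}_{E_1} | {\mathcal {G}})\) and \(v = {\mathbb {E}}^{\mu _{k+1}} (\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}})\); this shorthand is introduced here for illustration and is not part of the original notation. Since \(E_1 \subseteq E\) and \(E_2 \subseteq \hat{E}\setminus E\), both integrands take values in \(\{-\frac{1}{2}, 0, \frac{1}{2}\}\), so versions of u and v can be chosen with \(|u| \le \frac{1}{2}\) and \(|v| \le \frac{1}{2}\), which gives the bound \(-\epsilon \le \xi _1, \xi _2 \le \epsilon \). For the product identity, the four possible cases at a point t are

$$\begin{aligned} u = 0,\ v = 0:&\quad \xi _1 = \epsilon ,\ \xi _2 = \epsilon ,\quad \xi _1 u = 0 = \xi _2 v; \\ u = 0,\ v \ne 0:&\quad \xi _1 = \epsilon ,\ \xi _2 = \epsilon u = 0,\quad \xi _1 u = 0 = \xi _2 v; \\ u \ne 0,\ v = 0:&\quad \xi _1 = \epsilon v = 0,\ \xi _2 = \epsilon ,\quad \xi _1 u = 0 = \xi _2 v; \\ u \ne 0,\ v \ne 0:&\quad \xi _1 = \epsilon v,\ \xi _2 = \epsilon u,\quad \xi _1 u = \epsilon u v = \xi _2 v. \end{aligned}$$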

Let

$$\begin{aligned} \xi _0 = \xi _1 \left( \frac{1}{2}{\mathbf {1}}_E - {\mathbf {1}}_{E_1} \right) - \xi _2 \left( \frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} \right) . \end{aligned}$$

We claim that \(\xi _0\) is not the zero function. For this, it suffices that \(\xi _1\) is not a zero constant function on E or that \(\xi _2\) is not a zero constant function on \(\hat{E}\setminus E\).

  1. If \({\mathbb {E}}^{\mu _{k+1}} (\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}})(t)\) is not a zero constant function on E, then \(\xi _1\) is not a zero constant function on E, which implies that \(\xi _0\) is not a constant function with zero value.

  2. Suppose that \({\mathbb {E}}^{\mu _{k+1}} (\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}})(t)\) is a zero constant function on E. For any \(D_1 \in {\mathcal {G}}\), let \(D = D_1 \cap E\). We have

    $$\begin{aligned} 0&= \int _{T} {\mathbb {E}}^{\mu _{k+1}} \left( \frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}}\right) {\mathbf {1}}_{D} {{\mathrm{d}}}\mu _{k+1} \\&= \int _{T} {\mathbb {E}}^{\mu _{k+1}} \left( \frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}}\right) {\mathbf {1}}_{D_1} {\mathbf {1}}_{E} {{\mathrm{d}}}\mu _{k+1} \\&= \int _{T} {\mathbb {E}}^{\mu _{k+1}} \left( {\mathbf {1}}_{D_1} (\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2}) | {\mathcal {G}}\right) {\mathbf {1}}_{E} {{\mathrm{d}}}\mu _{k+1} \\&= \int _{\hat{E}} {\mathbb {E}}^{\mu _{k+1}} \left( {\mathbf {1}}_{D_1} (\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2}) | {\mathcal {G}}\right) {\mathbf {1}}_{E} {{\mathrm{d}}}\mu _{k+1} \\&= \frac{1}{\mu _{k+1}(\hat{E})} \cdot \int _{\hat{E}} {\mathbb {E}}^{\mu _{k+1}} \left( {\mathbf {1}}_{D_1} (\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2}) | {\mathcal {G}}\right) {\mathbf {1}}_{E} \mu _{k+1}^{\hat{E}}({{\mathrm{d}}}t) \\&= \frac{1}{\mu _{k+1}(\hat{E})} \cdot \int _{\hat{E}} {\mathbb {E}}^{\mu _{k+1}} \left( {\mathbf {1}}_{D_1} (\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2}) | {\mathcal {G}}\right) \mu _{k+1}^{\hat{E}}({{\mathrm{d}}}t) \cdot \int _{\hat{E}} {\mathbf {1}}_{E} \mu _{k+1}^{\hat{E}}({{\mathrm{d}}}t) \\&= \frac{1}{2}\int _{\hat{E}} {\mathbb {E}}^{\mu _{k+1}} \left( \frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}}\right) {\mathbf {1}}_{D_1} {{\mathrm{d}}}\mu _{k+1}. \end{aligned}$$

    The first equality holds since \({\mathbb {E}}^{\mu _{k+1}} \left( \frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}}\right) (t)\) is zero on E. The sixth equality is due to the fact that \({\mathbb {E}}^{\mu _{k+1}} \left( {\mathbf {1}}_{D_1} (\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2}) | {\mathcal {G}}\right) \) is \({\mathcal {G}}^{\hat{E}}\)-measurable on \(\hat{E}\) and E is independent of \({\mathcal {G}}^{\hat{E}}\) under the probability measure \(\mu _{k+1}^{\hat{E}}\). The rest is clear. As a result, we have

    $$\begin{aligned} \int _{\hat{E}} {\mathbb {E}}^{\mu _{k+1}} \left( \frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}}\right) {\mathbf {1}}_{D_1} {{\mathrm{d}}}\mu _{k+1} = 0 \end{aligned}$$

    for any \(D_1 \in {\mathcal {G}}\), and hence \({\mathbb {E}}^{\mu _{k+1}} (\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}})(t) = 0\) for \(\mu _{k+1}\)-almost all \(t \in \hat{E}\). By the definition of \(\xi _2\), it is \(\epsilon \) for \(\mu _{k+1}\)-almost all \(t \in \hat{E}\). Thus, \(\xi _0\) is not a zero constant function.

For \(2 \le i \le k+1\),

$$\begin{aligned} {\mathbb {E}}^{\lambda _i} (\xi _0 | {\mathcal {G}})&= {\mathbb {E}}^{\lambda _i} \left( \xi _1(\frac{1}{2}{\mathbf {1}}_E - {\mathbf {1}}_{E_1}) | {\mathcal {G}}\right) - {\mathbb {E}}^{\lambda _i} \left( \xi _2(\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2}) | {\mathcal {G}}\right) \\&= \xi _1 {\mathbb {E}}^{\lambda _i} \left( \frac{1}{2}{\mathbf {1}}_E - {\mathbf {1}}_{E_1} | {\mathcal {G}}\right) - \xi _2 {\mathbb {E}}^{\lambda _i} \left( \frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}}\right) \\&= 0. \end{aligned}$$

The third equality holds since for \(2 \le i \le k+1\), \(\frac{1}{2}{\mathbb {E}}^{\lambda _i} ({\mathbf {1}}_E | {\mathcal {G}}) = {\mathbb {E}}^{\lambda _i} ({\mathbf {1}}_{E_1} | {\mathcal {G}})\) and \(\frac{1}{2}{\mathbb {E}}^{\lambda _i} ({\mathbf {1}}_{\hat{E}\setminus E} | {\mathcal {G}}) = {\mathbb {E}}^{\lambda _i} ({\mathbf {1}}_{E_2} | {\mathcal {G}})\). Moreover, we have

$$\begin{aligned} {\mathbb {E}}^{\mu _{k+1}} (\xi _0 | {\mathcal {G}})&= {\mathbb {E}}^{\mu _{k+1}} \left( \xi _1(\frac{1}{2}{\mathbf {1}}_E - {\mathbf {1}}_{E_1}) | {\mathcal {G}}\right) - {\mathbb {E}}^{\mu _{k+1}} \left( \xi _2(\frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2}) | {\mathcal {G}}\right) \\&= \xi _1 {\mathbb {E}}^{\mu _{k+1}} \left( \frac{1}{2}{\mathbf {1}}_E - {\mathbf {1}}_{E_1} | {\mathcal {G}}\right) - \xi _2 {\mathbb {E}}^{\mu _{k+1}} \left( \frac{1}{2}{\mathbf {1}}_{\hat{E}\setminus E} - {\mathbf {1}}_{E_2} | {\mathcal {G}}\right) \\&= 0. \end{aligned}$$

The third equality is due to the definitions of \(\xi _1\) and \(\xi _2\). Since \(\mu _{k+1} = \sum _{1 \le i \le k+1} \lambda _{i}\), we have \({\mathbb {E}}^{\lambda _1} (\xi _0 | {\mathcal {G}}) = 0\).
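The last step can be spelled out as follows. This is a sketch in which g denotes an arbitrary bounded \({\mathcal {G}}\)-measurable function (a symbol introduced here for illustration). Because \(\lambda _1 = \mu _{k+1} - \sum _{2 \le i \le k+1} \lambda _i\) and \(\xi _0\) is bounded and vanishes outside \(\hat{E}\),

$$\begin{aligned} \int _T \xi _0 g {{\mathrm{d}}}\lambda _1&= \int _T \xi _0 g {{\mathrm{d}}}\mu _{k+1} - \sum _{2 \le i \le k+1} \int _T \xi _0 g {{\mathrm{d}}}\lambda _i \\&= \int _T {\mathbb {E}}^{\mu _{k+1}} (\xi _0 | {\mathcal {G}}) g {{\mathrm{d}}}\mu _{k+1} - \sum _{2 \le i \le k+1} \int _T {\mathbb {E}}^{\lambda _i} (\xi _0 | {\mathcal {G}}) g {{\mathrm{d}}}\lambda _i \\&= 0. \end{aligned}$$

Since g is arbitrary, this gives \({\mathbb {E}}^{\lambda _1} (\xi _0 | {\mathcal {G}}) = 0\).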

Let \(w_0^1 = w_0 + \xi _0\) and \(w_0^2 = w_0 - \xi _0\). Notice that \(\xi _0(t) = 0\) for \(t \notin \hat{E}\), and \(-\epsilon \le \xi _0(t) \le \epsilon \) for \(t \in \hat{E}\). Since \(\epsilon \le w_0(t) \le 1 -\epsilon \) for \(t \in \hat{E}\), we know that \(w_0^1, w_0^2 \in W\). It is then clear that \(\Phi (w_0^1) = \Phi (w_0^2) = \Phi (w_0)= \Phi (f)\). Hence, \(w_0^1, w_0^2 \in W_f\) and \(w_0 = \frac{1}{2}(w_0^1 + w_0^2)\). As a result, \(w_0\) is not an extreme point of \(W_f\), which is a contradiction. Therefore, for any extreme point \(w_0\) of \(W_f\), \(w_0(t) = 0\) or 1 for \(\mu _{k+1}\)-almost all \(t \in T\). Let \(D = \{t \in T :w_0(t) = 1\}\). Then, \(\Phi ( {\mathbf {1}}_{D} ) = \Phi (w_0)= \Phi (f)\). That is, \({\mathbb {E}}^{\lambda _i} (f | {\mathcal {G}}) = {\mathbb {E}}^{\lambda _i} ({\mathbf {1}}_{D} | {\mathcal {G}})\) for \(1 \le i \le k+1\).

This completes the proof. \(\square \)

Before completing the proof of Theorem 1, we first prove Theorems 2, 3 and 5. In the following, we show that the nowhere equivalence condition is both necessary and sufficient for the convexity of the conditional expectation of correspondences under vector measures.

Proof of Theorem 2

First, we assume that \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\mu \). Pick two \({\mathcal {T}}\)-measurable and integrable selections \(f_1\) and \(f_2\) of F and fix some \(\alpha \in [0,1]\). Without loss of generality, we assume that \(f_1\) and \(f_2\) are strictly positive (see footnote 33).

Define a new vector probability measure

$$\begin{aligned} \tilde{\lambda } = (\lambda ^{1}_1, \lambda ^{2}_1, \ldots ,\lambda ^{1}_m, \lambda ^{2}_m) \end{aligned}$$

as follows: for any \(E \in {\mathcal {T}}\) and \(1 \le i \le m\),

  1. \(\lambda ^{1}_i(E) = \frac{1}{\int _T f_1 {{\mathrm{d}}}\lambda _i} \int _E f_1 {{\mathrm{d}}}\lambda _i\),

  2. \(\lambda ^{2}_i(E) = \frac{1}{\int _T f_2 {{\mathrm{d}}}\lambda _i} \int _E f_2 {{\mathrm{d}}}\lambda _i\).

Then \(\lambda ^{j}_i\) is absolutely continuous with respect to \(\mu \) for \(j = 1,2\) and \(1 \le i \le m\). By Proposition 1, the set \(\left\{ {\mathbb {E}}^{\tilde{\lambda }}({\mathbf {1}}_{E} | {\mathcal {G}}) :E \in {\mathcal {T}}\right\} \) is convex. This set contains the constant vectors 1 (take \(E = T\)) and 0 (take \(E = \emptyset \)), and hence also their convex combination with weight \(\alpha \). As a result, there exists some set \(E \in {\mathcal {T}}\) such that \({\mathbb {E}}^{\lambda ^{j}_i}({\mathbf {1}}_{E} | {\mathcal {G}}) = \alpha \) for each \(j = 1,2\) and \(1 \le i \le m\). That is, for any bounded \({\mathcal {G}}\)-measurable function g, \(j =1,2\) and \(1 \le i \le m\), we have

$$\begin{aligned} \int _T {\mathbb {E}}^{\lambda ^{j}_i}({\mathbf {1}}_{E} | {\mathcal {G}}) g {{\mathrm{d}}}\lambda ^{j}_i = \int _T \alpha g {{\mathrm{d}}}\lambda ^{j}_i = \frac{\alpha }{\int _T f_j {{\mathrm{d}}}\lambda _i} \int _T f_j g {{\mathrm{d}}}\lambda _i. \end{aligned}$$

In addition,

$$\begin{aligned} \int _T {\mathbb {E}}^{\lambda ^{j}_i}({\mathbf {1}}_{E} | {\mathcal {G}}) g {{\mathrm{d}}}\lambda ^{j}_i = \int _T {\mathbf {1}}_{E} g {{\mathrm{d}}}\lambda ^{j}_i = \frac{1}{\int _T f_j {{\mathrm{d}}}\lambda _i} \int _T {\mathbf {1}}_{E} f_j g {{\mathrm{d}}}\lambda _i. \end{aligned}$$

That is,

$$\begin{aligned} \alpha \int _T f_j g {{\mathrm{d}}}\lambda _i = \int _T {\mathbf {1}}_{E} f_j g {{\mathrm{d}}}\lambda _i, \end{aligned}$$

which implies that \({\mathbb {E}}^{\lambda _i}({\mathbf {1}}_{E} f_j | {\mathcal {G}}) = \alpha {\mathbb {E}}^{\lambda _i}(f_j | {\mathcal {G}})\).

Define a new function \(f_0\) by \(f_0(t) = f_1(t)\) if \(t \in E\) and \(f_0(t) = f_2(t)\) if \(t \in T\setminus E\), that is, \(f_0 = {\mathbf {1}}_{E} f_1 + {\mathbf {1}}_{T\setminus E} f_2\). Then for \(1 \le i \le m\),

$$\begin{aligned} {\mathbb {E}}^{\lambda _i}( f_0 | {\mathcal {G}}) = {\mathbb {E}}^{\lambda _i}({\mathbf {1}}_{E} f_1 | {\mathcal {G}}) + {\mathbb {E}}^{\lambda _i}({\mathbf {1}}_{T\setminus E} f_2 | {\mathcal {G}}) = \alpha {\mathbb {E}}^{\lambda _i}(f_1 | {\mathcal {G}}) + (1 - \alpha ) {\mathbb {E}}^{\lambda _i}(f_2 | {\mathcal {G}}). \end{aligned}$$
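The second equality above can be verified from the identity \({\mathbb {E}}^{\lambda _i}({\mathbf {1}}_{E} f_j | {\mathcal {G}}) = \alpha {\mathbb {E}}^{\lambda _i}(f_j | {\mathcal {G}})\), applied with \(j = 1\) for the first term and with \(j = 2\) for the second. For the second term, a short computation gives

$$\begin{aligned} {\mathbb {E}}^{\lambda _i}({\mathbf {1}}_{T\setminus E} f_2 | {\mathcal {G}}) = {\mathbb {E}}^{\lambda _i}(f_2 | {\mathcal {G}}) - {\mathbb {E}}^{\lambda _i}({\mathbf {1}}_{E} f_2 | {\mathcal {G}}) = (1 - \alpha ) {\mathbb {E}}^{\lambda _i}(f_2 | {\mathcal {G}}). \end{aligned}$$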

Since \(f_0\) is obviously a \({\mathcal {T}}\)-measurable selection of F, \({\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda )}_F\) is convex.

Conversely, suppose that \({\mathcal {T}}\) has a \({\mathcal {G}}\)-atom D under \(\mu \). By Corollary 1, \({\mathcal {T}}\) also has a \({\mathcal {G}}\)-atom under \(\lambda _i\) for any i such that \(\lambda _i(D) > 0\). As a result, we can assume without loss of generality that \(\lambda _1(D) > 0\). In addition, we assume \(\lambda _1\) to be a probability measure for simplicity. Define a correspondence

$$\begin{aligned} F(t) = {\left\{ \begin{array}{ll} \{0,1\} &{} t\in D;\\ \{0\} &{} t\notin D. \end{array}\right. } \end{aligned}$$

It has been shown in Proposition 4 of He and Sun (2017) that \({\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda _1)}_F\) is not convex, and hence \({\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda )}_F\) is not convex. This is a contradiction (see footnote 34). \(\square \)

Next, we prove Theorem 5.

Proof of Theorem 5

We first prove the sufficiency result. The inclusion \({\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda )}_F \subseteq {\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda )}_{\text{ co } (F)}\) is obvious, so we only need to prove the reverse inclusion.

Fix a \({\mathcal {T}}\)-measurable selection \(\phi \) of \(\text{ co } (F)\). By the standard argument (see footnote 35), there exist \({\mathcal {T}}\)-measurable functions \(\alpha _1,\ldots ,\alpha _{l+1}, \gamma _1,\ldots , \gamma _{l+1}\) such that for \(\mu \)-almost all \(t \in T\),

  1. \(\phi (t) = \sum _{1 \le j \le l+1} \alpha _j(t) \gamma _j(t)\);

  2. \(\sum _{1 \le j \le l+1} \alpha _j(t) = 1\), \(\alpha _j\) is \({\mathcal {T}}\)-measurable and \(0 \le \alpha _j(t) \le 1\) for \(1 \le j \le l+1\);

  3. \(\gamma _j\) is a \({\mathcal {T}}\)-measurable selection of F for \(1 \le j \le l+1\).

Define a new correspondence \(\tilde{F}\) as \(\tilde{F}(t) = \{\gamma _1(t), \ldots , \gamma _{l+1}(t)\}\). Then \(\phi \) is a \({\mathcal {T}}\)-measurable selection of \(\text{ co } (\tilde{F})\).

We shall show that for any correspondence \(G(t) = \{\beta _1(t), \ldots , \beta _{l+1}(t)\}\) with \(\beta _j :T \rightarrow {\mathbb {R}}\) being a \({\mathcal {T}}\)-measurable and integrable mapping under \(\mu \) for \(1 \le j \le l+1\), and for any \({\mathcal {T}}\)-measurable selection \(\chi \) of \(\text{ co } (G)\), there exists a \({\mathcal {T}}\)-measurable selection of G which has the same conditional expectation as \(\chi \) under the vector measure \(\lambda \). This claim will be proved by an induction argument.

Given any correspondence \(G_1(t) = \{\beta _1(t)\}\) with \(\beta _1 :T \rightarrow {\mathbb {R}}\) being \({\mathcal {T}}\)-measurable and integrable under \(\mu \), the claim is automatically satisfied. We assume that for some \(1 \le n \le l\), given any correspondence \(G_n(t) = \{\beta _1(t), \ldots , \beta _{n}(t)\}\) with \(\beta _j :T \rightarrow {\mathbb {R}}\) being \({\mathcal {T}}\)-measurable and integrable under \(\mu \) for \(1 \le j \le n\), the claim holds if \(\chi \) is a \({\mathcal {T}}\)-measurable selection of \(\text{ co } (G_n)\). We need to show that the claim also holds if \(\chi \) is a \({\mathcal {T}}\)-measurable selection of \(\text{ co } (G_{n+1})\), where \(G_{n+1}(t) = \{\beta _1(t), \ldots , \beta _{n+1}(t)\}\) with \(\beta _j :T \rightarrow {\mathbb {R}}\) being \({\mathcal {T}}\)-measurable and integrable under \(\mu \) for \(1 \le j \le n+1\).

Suppose that \(\chi \) is a \({\mathcal {T}}\)-measurable selection of \(\text{ co } (G_{n+1})\). By a similar argument as above, there exist \({\mathcal {T}}\)-measurable functions \(\tilde{\alpha }_1,\ldots ,\tilde{\alpha }_{n+1}\) such that for \(\mu \)-almost all \(t \in T\),

  1. \(\chi (t) = \sum _{1 \le j \le n+1} \tilde{\alpha }_j(t) \beta _j(t)\);

  2. \(\sum _{1 \le j \le n+1} \tilde{\alpha }_j(t) = 1\) and \(0 \le \tilde{\alpha }_j(t) \le 1\) for \(1 \le j \le n+1\).

Define a function \(\tilde{\chi }\) as follows:

$$\begin{aligned} \tilde{\chi }(t) = {\left\{ \begin{array}{ll} \frac{1}{1 - \tilde{\alpha }_{n+1}(t)}\sum _{1 \le j \le n} \tilde{\alpha }_j(t) \beta _j(t), &{} \text{ if } 0 \le \tilde{\alpha }_{n+1}(t) < 1; \\ \beta _1(t), &{} \text{ otherwise }. \end{array}\right. } \end{aligned}$$

Since \(\sum _{1 \le j \le n} \tilde{\alpha }_j(t) = 1 - \tilde{\alpha }_{n+1}(t)\), \(\tilde{\chi }\) is a \({\mathcal {T}}\)-measurable selection of \(\text{ co }(G_n) = \text{ co }\{\beta _1(t), \ldots , \beta _{n}(t)\}\). Notice that

$$\begin{aligned} \chi (t) = \sum _{1 \le j \le n+1} \tilde{\alpha }_j(t) \beta _j(t) = \tilde{\chi }(t) + \tilde{\alpha }_{n+1}(t)(\beta _{n+1}(t) - \tilde{\chi }(t)). \end{aligned}$$
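As a check of the last equality, consider the two cases in the definition of \(\tilde{\chi }\). Where \(\tilde{\alpha }_{n+1}(t) < 1\), the definition gives \(\sum _{1 \le j \le n} \tilde{\alpha }_j(t) \beta _j(t) = (1 - \tilde{\alpha }_{n+1}(t)) \tilde{\chi }(t)\), and hence

$$\begin{aligned} \chi (t) = (1 - \tilde{\alpha }_{n+1}(t)) \tilde{\chi }(t) + \tilde{\alpha }_{n+1}(t) \beta _{n+1}(t) = \tilde{\chi }(t) + \tilde{\alpha }_{n+1}(t)(\beta _{n+1}(t) - \tilde{\chi }(t)). \end{aligned}$$

Where \(\tilde{\alpha }_{n+1}(t) = 1\), we have \(\tilde{\alpha }_j(t) = 0\) for \(1 \le j \le n\), so \(\chi (t) = \beta _{n+1}(t)\), and the right-hand side equals \(\tilde{\chi }(t) + (\beta _{n+1}(t) - \tilde{\chi }(t)) = \beta _{n+1}(t)\) as well.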

Let \(\beta ^1_{n+1} = \max \{0, \beta _{n+1} - \tilde{\chi }\} + 1\) and \(\beta ^2_{n+1} = \max \{0, -\beta _{n+1} + \tilde{\chi }\} + 1\). Then \(\beta _{n+1} - \tilde{\chi } = \beta ^1_{n+1} - \beta ^2_{n+1}\). Define a new vector measure

$$\begin{aligned} \tilde{\lambda } = (\lambda ^{1}_1, \lambda ^{2}_1, \ldots ,\lambda ^{1}_m, \lambda ^{2}_m) \end{aligned}$$

as follows: for any \(E \in {\mathcal {T}}\) and \(1 \le i \le m\), let \(\lambda ^{1}_i(E) = \int _E \beta ^1_{n+1} {{\mathrm{d}}}\lambda _i\) and \(\lambda ^{2}_i(E) = \int _E \beta ^2_{n+1} {{\mathrm{d}}}\lambda _i\). By Proposition 1, there exists a set \(D_{n+1} \in {\mathcal {T}}\) such that for \(j = 1, 2\) and \(1 \le i \le m\), \({\mathbb {E}}^{\lambda ^j_i}(\tilde{\alpha }_{n+1}|{\mathcal {G}}) = {\mathbb {E}}^{\lambda ^j_i}({\mathbf {1}}_{D_{n+1}} | {\mathcal {G}})\). For any bounded \({\mathcal {G}}\)-measurable mapping g, \(j = 1, 2\) and \(1 \le i \le m\),

$$\begin{aligned} \int _{T} {\mathbb {E}}^{\lambda ^j_i}(\tilde{\alpha }_{n+1}|{\mathcal {G}}) g {{\mathrm{d}}}\lambda ^j_i= & {} \int _{T} \tilde{\alpha }_{n+1} g {{\mathrm{d}}}\lambda ^j_i = \int _{T} \tilde{\alpha }_{n+1} g \beta ^j_{n+1} {{\mathrm{d}}}\lambda _i,\\ \int _{T} {\mathbb {E}}^{\lambda ^j_i}({\mathbf {1}}_{D_{n+1}} | {\mathcal {G}}) g {{\mathrm{d}}}\lambda ^j_i= & {} \int _{T} {\mathbf {1}}_{D_{n+1}} g {{\mathrm{d}}}\lambda ^j_i = \int _{T} {\mathbf {1}}_{D_{n+1}} g \beta ^j_{n+1} {{\mathrm{d}}}\lambda _i. \end{aligned}$$

As a result,

$$\begin{aligned} \int _{T} \tilde{\alpha }_{n+1} [\beta _{n+1} - \tilde{\chi }] g {{\mathrm{d}}}\lambda _i = \int _{T} {\mathbf {1}}_{D_{n+1}} [\beta _{n+1} - \tilde{\chi }] g {{\mathrm{d}}}\lambda _i, \end{aligned}$$

which implies that

$$\begin{aligned} {\mathbb {E}}^{\lambda _i} \left( \tilde{\alpha }_{n+1}(\beta _{n+1} - \tilde{\chi })|{\mathcal {G}}\right) = {\mathbb {E}}^{\lambda _i} \left( {\mathbf {1}}_{D_{n+1}}(\beta _{n+1} - \tilde{\chi }) | {\mathcal {G}}\right) . \end{aligned}$$
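The integral equality used in this step can be obtained by subtraction; the following is a sketch of that computation. For \(j = 1, 2\), the choice of \(D_{n+1}\) together with the two identities for \(\lambda ^j_i\) gives \(\int _{T} \tilde{\alpha }_{n+1} g \beta ^j_{n+1} {{\mathrm{d}}}\lambda _i = \int _{T} {\mathbf {1}}_{D_{n+1}} g \beta ^j_{n+1} {{\mathrm{d}}}\lambda _i\). Subtracting the case \(j = 2\) from the case \(j = 1\) and using \(\beta _{n+1} - \tilde{\chi } = \beta ^1_{n+1} - \beta ^2_{n+1}\),

$$\begin{aligned} \int _{T} \tilde{\alpha }_{n+1} [\beta _{n+1} - \tilde{\chi }] g {{\mathrm{d}}}\lambda _i&= \int _{T} \tilde{\alpha }_{n+1} g \beta ^1_{n+1} {{\mathrm{d}}}\lambda _i - \int _{T} \tilde{\alpha }_{n+1} g \beta ^2_{n+1} {{\mathrm{d}}}\lambda _i \\&= \int _{T} {\mathbf {1}}_{D_{n+1}} g \beta ^1_{n+1} {{\mathrm{d}}}\lambda _i - \int _{T} {\mathbf {1}}_{D_{n+1}} g \beta ^2_{n+1} {{\mathrm{d}}}\lambda _i \\&= \int _{T} {\mathbf {1}}_{D_{n+1}} [\beta _{n+1} - \tilde{\chi }] g {{\mathrm{d}}}\lambda _i. \end{aligned}$$

Since this holds for every bounded \({\mathcal {G}}\)-measurable g, the two conditional expectations coincide.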

For \(1 \le j \le n\), let \(\hat{\beta }_j(t) = \beta _j(t)\) if \(t \in T\setminus D_{n+1}\), and 0 if \(t \in D_{n+1}\). Define \(\hat{G}_{n}(t) = \{\hat{\beta }_1(t), \ldots , \hat{\beta }_n(t)\}\). Then the mapping \(\tilde{\chi } {\mathbf {1}}_{T \setminus D_{n+1}}\) is a \({\mathcal {T}}\)-measurable selection of \(\text{ co }(\hat{G}_n)\). By the induction hypothesis, there exists a \({\mathcal {T}}\)-measurable selection \(\hat{\chi }\) of \(\hat{G}_{n}\) such that \(\hat{\chi }\) and \(\tilde{\chi } {\mathbf {1}}_{T \setminus D_{n+1}}\) have the same conditional expectation under \(\lambda \). Then there exist n disjoint sets \(\{D_j\}_{1 \le j \le n} \subseteq {\mathcal {T}}\) such that \(\cup _{1 \le j \le n} D_j = T\setminus D_{n+1}\) and \(\hat{\chi }(t) = \beta _j(t)\) if \(t \in D_j\), \(1 \le j \le n\). That is, for \(1 \le i \le m\),

$$\begin{aligned} {\mathbb {E}}^{\lambda _i}(\tilde{\chi } {\mathbf {1}}_{T \setminus D_{n+1}} | {\mathcal {G}}) = \sum _{1 \le j \le n}{\mathbb {E}}^{\lambda _i}({\mathbf {1}}_{D_j}\beta _j | {\mathcal {G}}). \end{aligned}$$

Therefore, for any \(1 \le i \le m\),

$$\begin{aligned} {\mathbb {E}}^{\lambda _i}(\chi | {\mathcal {G}})&= {\mathbb {E}}^{\lambda _i}(\tilde{\chi } | {\mathcal {G}}) + {\mathbb {E}}^{\lambda _i}(\tilde{\alpha }_{n+1}(\beta _{n+1} - \tilde{\chi }) | {\mathcal {G}}) \\&= {\mathbb {E}}^{\lambda _i}(\tilde{\chi } | {\mathcal {G}}) + {\mathbb {E}}^{\lambda _i}({\mathbf {1}}_{D_{n+1}}(\beta _{n+1} - \tilde{\chi }) | {\mathcal {G}}) \\&= {\mathbb {E}}^{\lambda _i}(\tilde{\chi } {\mathbf {1}}_{T \setminus D_{n+1}} | {\mathcal {G}}) + {\mathbb {E}}^{\lambda _i}({\mathbf {1}}_{D_{n+1}}\beta _{n+1} | {\mathcal {G}}) \\&= \sum _{1 \le j \le n}{\mathbb {E}}^{\lambda _i}({\mathbf {1}}_{D_j}\beta _j | {\mathcal {G}}) + {\mathbb {E}}^{\lambda _i}({\mathbf {1}}_{D_{n+1}}\beta _{n+1} | {\mathcal {G}}) \\&= \sum _{1 \le j \le n+1}{\mathbb {E}}^{\lambda _i}({\mathbf {1}}_{D_j}\beta _j | {\mathcal {G}}). \end{aligned}$$

Let \(\overline{\chi } = \sum _{1 \le j \le n+1} {\mathbf {1}}_{D_j}\beta _j\). Then \(\overline{\chi }\) is a \({\mathcal {T}}\)-measurable selection of \(G_{n+1}\). This completes the induction argument.

Since \(\phi \) is a \({\mathcal {T}}\)-measurable selection of \(\text{ co } (\tilde{F})\), there exists a \({\mathcal {T}}\)-measurable selection \(\hat{\phi }\) of \(\tilde{F}\) which has the same conditional expectation as \(\phi \) under the vector measure \(\lambda \). It is obvious that \(\hat{\phi }\) is also a \({\mathcal {T}}\)-measurable selection of F. As a result, \({\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda )}_{\text{ co } (F)} \subseteq {\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda )}_F\), and hence \({\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda )}_{\text{ co } (F)} = {\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda )}_F\).

The necessity part is obvious. If \({\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda )}_{\text{ co } (F)} = {\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda )}_F\), then \({\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda )}_F\) is convex, because \(\text{ co } (F)\) is convex valued. By Theorem 2, \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\mu \). \(\square \)

For a sequence of sets \(\{A_k\}_{k\in {\mathbb {N}}}\) in a topological space X, let \({\text{ Ls }}(A_k)\) be the set of all \(x \in X\) such that for any neighborhood \(O_x\) of \(x \in X\), there are infinitely many k with \(O_x\cap A_k \ne \emptyset \). The following lemma is needed in the proofs of Theorems 3 and 4.

Lemma 3

Let \(\{\phi _k\}_{k\in {\mathbb {N}}}\) be a sequence of \({\mathcal {T}}\)-measurable mappings from T to \({\mathbb {R}}^l\) that is p-integrably bounded for some fixed p with \(1\le p < \infty \). For \(1 \le i \le m\) and \(k\in {\mathbb {N}}\), let \(h_k^i = {\mathbb {E}}^{\lambda _i}(\phi _k|{\mathcal {G}})\) and \(h_k = (h_k^1,\ldots , h_k^m)\). Suppose that \(h_k\) weakly converges to some \(h_0 \in \prod _{1 \le i \le m}L_{p}((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\) as \(k \rightarrow \infty \). If \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\mu \), then there exists a \({\mathcal {T}}\)-measurable mapping \(\phi _0 \in L_p((T,{\mathcal {T}},\mu ),{\mathbb {R}}^l)\) such that

  1. \(\phi _0(t) \in {\text{ Ls }}\big ( \phi _k(t) \big )\) for \(\mu \)-almost all \(t\in T\), and

  2. \({\mathbb {E}}^{\lambda }(\phi _0|{\mathcal {G}}) = h_0\).

Proof

Since the sequence \(\{\phi _k\}_{k\in {\mathbb {N}}}\) is p-integrably bounded in \(L_p((T,{\mathcal {T}},\mu ),{\mathbb {R}}^l)\), it has a weakly convergent subsequence by the Riesz/Dunford-Pettis weak compactness theorem [see Royden and Fitzpatrick (2010, pp. 408/412)]. Without loss of generality, we assume that \(\phi _k\) weakly converges to some \(\phi \in L_p((T,{\mathcal {T}},\mu ),{\mathbb {R}}^l)\).

Given any \(g\in L_{q}((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\) with \(\frac{1}{p}+\frac{1}{q} = 1\), \(g\rho _i \in L_{q}((T,{\mathcal {T}},\mu ),{\mathbb {R}}^l)\) and

$$\begin{aligned} \int _T h^i_k g{{\mathrm{d}}}\lambda _i&= \int _T {\mathbb {E}}^{\lambda _i}(\phi _k |{\mathcal {G}}) g {{\mathrm{d}}}\lambda _i = \int _T {\mathbb {E}}^{\lambda _i}(\phi _k g|{\mathcal {G}}) {{\mathrm{d}}}\lambda _i \\&= \int _T \phi _k g {{\mathrm{d}}}\lambda _i = \int _T \phi _k g \rho _i {{\mathrm{d}}}\mu \\&\rightarrow \int _T \phi g \rho _i {{\mathrm{d}}}\mu = \int _T \phi g {{\mathrm{d}}}\lambda _i \\&= \int _T {\mathbb {E}}^{\lambda _i}(\phi g|{\mathcal {G}}) {{\mathrm{d}}}\lambda _i = \int _T {\mathbb {E}}^{\lambda _i}(\phi |{\mathcal {G}}) g {{\mathrm{d}}}\lambda _i. \end{aligned}$$

Thus, \(h_k\) weakly converges to \({\mathbb {E}}^{\lambda }(\phi |{\mathcal {G}})\) in \(\prod _{1 \le i \le m}L_{p}((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\), which implies that \(h_0 = {\mathbb {E}}^{\lambda }(\phi |{\mathcal {G}})\).

Notice that \(\{\phi _k,\phi _{k+1},\ldots \}\) also weakly converges to \(\phi \) in \(L_p((T,{\mathcal {T}},\mu ),{\mathbb {R}}^l)\) for each \(k\in {\mathbb {N}}\). By Theorem 29 of Royden and Fitzpatrick (2010, p. 293), there is a sequence of convex combinations of \(\{\phi _k,\phi _{k+1},\ldots \}\) that converges to \(\phi \) in \(L_p\) norm. For each \(k\in {\mathbb {N}}\), let \(\varphi _k\) be a convex combination of \(\{\phi _k,\phi _{k+1},\ldots \}\) such that \(\Vert \varphi _k - \phi \Vert _{p} \le \frac{1}{k}\). Thus, there is a subsequence of \(\{\varphi _k\}\), say itself, which converges to \(\phi \) \(\mu \)-almost everywhere.

Fix \(t\in T\) such that \(\varphi _k(t)\) converges to \(\phi (t)\). By Carathéodory’s convexity theorem [see Royden and Fitzpatrick (1974, p. 37)], \(\varphi _k (t) = \sum _{j=0}^l \alpha _{jk} \gamma _{jk} (t)\), where

  1. for each \(k\in {\mathbb {N}}\), \(\alpha _{jk} \ge 0\) for any j and \( \sum _{j=0}^l \alpha _{jk} = 1\);

  2. for each \(k\in {\mathbb {N}}\), \(\gamma _{0k} (t), \ldots , \gamma _{lk} (t) \in \left\{ \phi _k(t),\phi _{k+1}(t),\ldots \right\} \).

Without loss of generality (passing to a subsequence if necessary), assume that for each \(0\le j \le l\), \(\alpha _{jk}\rightarrow \alpha _j\) and \(\gamma _{jk}(t) \rightarrow \gamma _j(t)\). Then \(\alpha _{0},\ldots , \alpha _l \ge 0\) and \( \sum _{j=0}^l \alpha _{j} = 1\). Moreover, \(\gamma _j(t) \in {\text{ Ls }}(\phi _k(t))\). Let \(G(t) = {\text{ Ls }}(\phi _k(t))\). Then \(\phi (t) = \lim _{k \rightarrow \infty } \varphi _k(t) = \sum _{j=0}^l \alpha _{j} \gamma _{j}(t) \in \text{ co }(G(t))\).

Since \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\mu \) and G is p-integrably bounded and closed valued, Theorem 5 implies that \({\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda )}_G = {\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda )}_{\text{ co } (G)}\). Thus, there exists a \({\mathcal {T}}\)-measurable selection \(\phi _0\) of G such that \({\mathbb {E}}^{\lambda _i}(\phi _0|{\mathcal {G}}) = {\mathbb {E}}^{\lambda _i}(\phi |{\mathcal {G}})\) for each i, which completes the proof. \(\square \)

Now we are ready to prove Theorem 3.

Proof of Theorem 3

Suppose that \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\mu \). For the case \(1 \le p < \infty \), we first show that \({\mathcal {I}}_{F}^{({\mathcal {T}},{\mathcal {G}},\lambda )}\) is weakly sequentially compact in \(\prod _{1 \le i \le m}L_p((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\). Fix an arbitrary sequence of \({\mathcal {T}}\)-measurable selections \(\{\phi _k\}_{k\in {\mathbb {N}}}\) of F. Let \(h_k = {\mathbb {E}}^{\lambda }(\phi _k|{\mathcal {G}})\) for each \(k\in {\mathbb {N}}\). We need to show that there is a subsequence of \(\{h_k\}_{k\in {\mathbb {N}}}\) which weakly converges in \(\prod _{1 \le i \le m}L_p((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\) to some point in \({\mathcal {I}}_{F}^{({\mathcal {T}},{\mathcal {G}},\lambda )}\). Since the sequence \(\{\phi _k\}_{k\in {\mathbb {N}}}\) is p-integrably bounded, it has a weakly convergent subsequence in \(L_p((T,{\mathcal {T}},\mu ),{\mathbb {R}}^l)\) due to the Riesz/Dunford-Pettis weak compactness theorem [see Royden and Fitzpatrick (2010, pp. 408/412)]. Without loss of generality, assume that \(\phi _k\) weakly converges to some \(\phi \in L_p((T,{\mathcal {T}},\mu ),{\mathbb {R}}^l)\). As shown in the proof of Lemma 3, \(h_k\) also weakly converges to \({\mathbb {E}}^{\lambda }(\phi |{\mathcal {G}})\) in \(\prod _{1 \le i \le m}L_p((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\). By Lemma 3, there exists a \({\mathcal {T}}\)-measurable selection \(\phi _0\) of \({\text{ Ls }}(\phi _k)\) such that \({\mathbb {E}}^{\lambda }(\phi _0|{\mathcal {G}}) = {\mathbb {E}}^{\lambda }(\phi |{\mathcal {G}})\). Since F is compact valued, \({\text{ Ls }}(\phi _k(t)) \subseteq F(t)\) for \(\mu \)-almost all \(t\in T\). Thus, \(\phi _0\) is a selection of F, which implies that \({\mathcal {I}}_{F}^{({\mathcal {T}},{\mathcal {G}},\lambda )}\) is weakly sequentially compact. 
By the Eberlein–Smulian theorem [see Bogachev (2007, Theorem 4.7.10)], \({\mathcal {I}}_{F}^{({\mathcal {T}},{\mathcal {G}},\lambda )}\) is relatively weakly compact in \(\prod _{1 \le i \le m}L_p((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\).

Suppose that there exists a sequence \(\{\tilde{h}_k\}_{k \ge 1} \subseteq {\mathcal {I}}_{F}^{({\mathcal {T}},{\mathcal {G}},\lambda )}\) such that \(\tilde{h}_k \rightarrow \tilde{h}_0\) in the \(L_p\)-norm. Then \(\tilde{h}_k\) also weakly converges to \(\tilde{h}_0\). Since \({\mathcal {I}}_{F}^{({\mathcal {T}},{\mathcal {G}},\lambda )}\) is weakly sequentially compact, \(\tilde{h}_0 \in {\mathcal {I}}_{F}^{({\mathcal {T}},{\mathcal {G}},\lambda )}\). Thus, \({\mathcal {I}}_{F}^{({\mathcal {T}},{\mathcal {G}},\lambda )}\) is norm closed. By Theorem 2, \({\mathcal {I}}_{F}^{({\mathcal {T}},{\mathcal {G}},\lambda )}\) is convex, hence \({\mathcal {I}}_{F}^{({\mathcal {T}},{\mathcal {G}},\lambda )}\) is also weakly closed by Mazur’s theorem [see Royden and Fitzpatrick (2010, p. 292)].

Next, we consider the case \(p = \infty \). Given that \(\prod _{1 \le i \le m} L_{\infty }((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\) is a subset (and the dual) of \(\prod _{1 \le i \le m}L_{1}((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\), it follows that the relative weak topology on \(\prod _{1 \le i \le m} L_{\infty }((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\) is weaker than the weak\(^*\) topology on \(\prod _{1 \le i \le m} L_{\infty }((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\). Suppose that F is essentially bounded by some constant \(C > 0\). By Alaoglu’s theorem [see Theorem 11.7.1 in Loeb (2016)], the closed ball with radius C (the C-ball) is weak\(^*\) compact in \(L_{\infty }((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\). Any weak\(^*\) closed subset of the C-ball in \(L_{\infty }((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\) is hence weak\(^*\) compact, and hence weakly compact. As a result, the weak\(^*\) topology and the weak topology coincide with each other on the C-ball in \(L_{\infty }((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\). Since F is essentially bounded by C, it is also integrably bounded. By the argument above, \({\mathcal {I}}_F^{({\mathcal {T}},{\mathcal {G}},\lambda )}\) is weakly compact in \(\prod _{1 \le i \le m}L_{1}((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\). Hence, \({\mathcal {I}}_F^{({\mathcal {T}},{\mathcal {G}},\lambda )}\) is weak\(^*\) compact in \(\prod _{1 \le i \le m} L_{\infty }((T,{\mathcal {G}},\lambda _i),{\mathbb {R}}^l)\).

Conversely, suppose that \({\mathcal {T}}\) has a \({\mathcal {G}}\)-atom D under \(\mu \). By Corollary 1, \({\mathcal {T}}\) also has a \({\mathcal {G}}\)-atom D under \(\lambda _i\) for any i such that \(\lambda _i(D) > 0\). We only need to show that if the set \({\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda _1)}_F\) is weakly compact (resp. weak\(^*\) compact) in \(L_{p}((T,{\mathcal {G}},\lambda _1),{\mathbb {R}})\) when \(1 \le p < \infty \) (resp. \(p = \infty \)) for any p-integrably bounded and closed valued correspondence F, then \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\lambda _1\).

Suppose that \({\mathcal {T}}\) has a \({\mathcal {G}}\)-atom D with \(\lambda _1(D) > 0\). Consider the correspondence F as defined in the proof of Theorem 2. Without loss of generality, we assume that \(\lambda _1^D\) is a probability measure. Pick an orthonormal subset \(\{\varphi _k\}_{k\in {\mathbb {N}}}\) of \(L_2((D,{\mathcal {T}}^D,\lambda _1^D),{\mathbb {R}})\) on the atomless probability space \((D,{\mathcal {T}}^D,\lambda _1^D)\) such that \(\varphi _k\) takes values in \(\{-1,1\}\) and \(\int _D \varphi _k{{\mathrm{d}}}\lambda _1^D =0\) for each \(k\in {\mathbb {N}}\) (see Footnote 36).

Let

$$\begin{aligned} \phi _k(t) = {\left\{ \begin{array}{ll} \frac{\varphi _k(t)+1}{2} &{} t\in D;\\ 0 &{} t\notin D. \end{array}\right. } \end{aligned}$$

Then \(\phi _k\) is a \({\mathcal {T}}\)-measurable selection of F for each \(k\in {\mathbb {N}}\).

Pick a set \(E\in {\mathcal {T}}^D\). By Bessel’s inequality [see Theorem 8.8.1 in Loeb (2016)], \(\int _D {\mathbf {1}}_E \varphi _k{{\mathrm{d}}}\lambda _1^D \rightarrow 0\) as \(k \rightarrow \infty \). Thus, for any \(E_1\in {\mathcal {T}}\),

$$\begin{aligned} \int _T {\mathbf {1}}_{E_1} \phi _k {{\mathrm{d}}}\lambda _1 = \frac{1}{2}\int _D {\mathbf {1}}_{E_1\cap D} \varphi _k {{\mathrm{d}}}\lambda _1 +\frac{1}{2}\lambda _1(E_1\cap D) \rightarrow \frac{1}{2}\lambda _1(E_1\cap D). \end{aligned}$$
(3)
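As a numerical illustration of Eq. (3), the following sketch takes \(D = [0,1]\) with Lebesgue measure and uses the Rademacher functions \(r_k(t) = (-1)^{\lfloor 2^k t \rfloor }\) as the orthonormal family \(\{\varphi _k\}\). Both choices, as well as the function names, are assumptions made only for illustration and are not part of the proof. For a test set \(E_1 = [0,a]\), the integral of \(\phi _k = (r_k+1)/2\) over \(E_1\) can be computed exactly, and it approaches \(a/2\) as k grows.

```python
from fractions import Fraction


def rademacher_integral(k: int, a: Fraction) -> Fraction:
    """Exact integral of r_k over [0, a], where r_k(t) = (-1)^floor(2^k t).

    After rescaling u = 2^k * a, the antiderivative of the square wave
    is a triangle wave of period 2 with values in [0, 1].
    """
    u = a * 2**k
    r = u % 2
    tri = r if r <= 1 else 2 - r
    return tri / 2**k


def phi_integral(k: int, a: Fraction) -> Fraction:
    """Exact integral over [0, a] of phi_k = (r_k + 1) / 2."""
    return (rademacher_integral(k, a) + a) / 2


a = Fraction(1, 3)  # test set E_1 = [0, 1/3], an arbitrary choice
for k in range(1, 11):
    # The distance to a/2 is at most 2^-(k+1), since |integral of r_k| <= 2^-k.
    print(k, float(phi_integral(k, a)), float(a / 2))
```

In this sketch each \(\phi _k\) takes values in \(\{0,1\}\), yet the integrals over every interval converge to those of the constant \(\frac{1}{2}\), which is the mechanism behind the weak\(^*\) limit \(\frac{1}{2}{\mathbf {1}}_D\) derived in the text.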

Given any nonnegative function \(\psi \in L_1((T,{\mathcal {T}},\lambda _1),{\mathbb {R}})\), \(\psi \) is the increasing limit of a sequence of nonnegative simple functions \(\{\psi _k\}_{k\in {\mathbb {N}}}\) (finite linear combinations of measurable indicator functions). Fix any \(\epsilon > 0\). By the dominated convergence theorem, there exists a positive integer \(K_0\) such that \(\int _T |\psi - \psi _k| {{\mathrm{d}}}\lambda _1 < \epsilon \) for any \(k\ge K_0\). Since \(0 \le \phi _k \le 1\), we then have

$$\begin{aligned}&\left| \int _T \psi \phi _k {{\mathrm{d}}}\lambda _1 - \frac{1}{2}\int _T \psi {\mathbf {1}}_D {{\mathrm{d}}}\lambda _1 \right| \le \left| \int _T \psi \phi _k {{\mathrm{d}}}\lambda _1 - \int _T \psi _{K_0} \phi _k {{\mathrm{d}}}\lambda _1 \right| \\&\qquad + \left| \int _T \psi _{K_0} \phi _k {{\mathrm{d}}}\lambda _1 - \frac{1}{2}\int _T \psi _{K_0} {\mathbf {1}}_D {{\mathrm{d}}}\lambda _1 \right| + \left| \frac{1}{2}\int _T \psi _{K_0} {\mathbf {1}}_D {{\mathrm{d}}}\lambda _1 - \frac{1}{2}\int _T \psi {\mathbf {1}}_D {{\mathrm{d}}}\lambda _1 \right| \\&\quad \le \int _T | \psi - \psi _{K_0}| {{\mathrm{d}}}\lambda _1 + \left| \int _T \psi _{K_0} \phi _k {{\mathrm{d}}}\lambda _1 - \frac{1}{2}\int _T \psi _{K_0} {\mathbf {1}}_D {{\mathrm{d}}}\lambda _1 \right| + \int _T |\psi _{K_0} - \psi | {{\mathrm{d}}}\lambda _1. \end{aligned}$$

The first and the third terms are less than \(\epsilon \). By Eq. (3) and the fact that \(\psi _{K_0}\) is a simple function, the second term goes to 0 as \(k \rightarrow \infty \). Hence, \(\int _T \psi \phi _k {{\mathrm{d}}}\lambda _1 \rightarrow \frac{1}{2}\int _T \psi {\mathbf {1}}_D {{\mathrm{d}}}\lambda _1\) as \(k \rightarrow \infty \). Given any \(\psi \in L_1((T,{\mathcal {T}},\lambda _1),{\mathbb {R}})\), we can obtain \(\int _T \psi \phi _k {{\mathrm{d}}}\lambda _1 \rightarrow \frac{1}{2}\int _T \psi {\mathbf {1}}_D {{\mathrm{d}}}\lambda _1\) as \(k \rightarrow \infty \) by writing \(\psi \) as the sum of its positive and negative parts. Therefore, \(\phi _k\) weak\(^*\) converges to \(\phi = \frac{1}{2}{\mathbf {1}}_D\) in \(L_\infty ((T,{\mathcal {T}},\lambda _1),{\mathbb {R}})\). Thus, \({\mathbb {E}}^{\lambda _1}(\phi _k|{\mathcal {G}}) \in {\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda _1)}_F\) weak\(^*\) converges to \(\frac{1}{2} {\mathbb {E}}^{\lambda _1}({\mathbf {1}}_D|{\mathcal {G}})\) in \(L_\infty ((T,{\mathcal {G}},\lambda _1),{\mathbb {R}})\) as shown in the proof of Lemma 3. It has been shown in He and Sun (2017) that \(\frac{1}{2} {\mathbb {E}}^{\lambda _1}({\mathbf {1}}_D|{\mathcal {G}}) \notin {\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda _1)}_F\), which implies that \({\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda _1)}_F\) is not weak\(^*\) compact in \(L_\infty ((T,{\mathcal {G}},\lambda _1),{\mathbb {R}})\). This is a contradiction.

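For \(1\le p < \infty \), notice that F is also p-integrably bounded, and \(\phi _k\) weakly converges to \(\phi = \frac{1}{2}{\mathbf {1}}_D\) in \(L_{p}((T,{\mathcal {T}},\lambda _1),{\mathbb {R}})\). The same argument as in the case \(p = \infty \) then shows that \({\mathcal {I}}^{({\mathcal {T}},{\mathcal {G}},\lambda _1)}_F\) is not weakly compact in \(L_{p}((T,{\mathcal {G}},\lambda _1),{\mathbb {R}})\), which is again a contradiction. \(\square \)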

Proof of Theorem 1

Theorem 1 follows immediately by applying Theorems 2 and 3 to the correspondence \(G = \{0,1\}\).

Finally, we are ready to prove Theorem 4.

Proof of Theorem 4

Suppose that \({\mathcal {T}}\) is nowhere equivalent to \({\mathcal {G}}\) under \(\mu \) and \(1\le p < \infty \). We need to prove that \(H(y) = {\mathcal {I}}_{F_y}^{({\mathcal {T}},{\mathcal {G}},\lambda )}\) is weakly upper hemicontinuous.

If H is not weakly upper hemicontinuous, then there exist some point \(y_0 \in Y\) and a weakly open set V with \(H(y_0) \subseteq V\) such that for every \(k \ge 1\), there exist \(y_k \in B(y_0, 1/k)\) and \(h_k \in H(y_k)\) with \(h_k \notin V\), where \(B(y_0, 1/k)\) is the open ball centered at \(y_0\) with radius \(1/k\). Since \({\mathcal {I}}_G^{({\mathcal {T}},{\mathcal {G}},\lambda )}\) is weakly compact, \(\{h_k\}_{k \ge 1}\) has a weakly convergent subsequence, say itself, with a limit \(h_0 \in {\mathcal {I}}_G^{({\mathcal {T}},{\mathcal {G}},\lambda )}\). Since V is weakly open, its complement is weakly closed; as \(h_k \notin V\) for every k, it follows that \(h_0 \notin V\).

Let \(\phi _k\) be a \({\mathcal {T}}\)-measurable selection of \(F_{y_k}\) such that \(h_k = {\mathbb {E}}^{\lambda }(\phi _k|{\mathcal {G}})\) for each \(k \ge 1\). Since \(h_k\) weakly converges to \(h_0\) and \(y_k\) converges to \(y_0\in Y\), by Lemma 3, there exists a \({\mathcal {T}}\)-measurable selection \(\phi _0\) of \({\text{ Ls }}(\phi _k)\) such that \(h_0 = {\mathbb {E}}^{\lambda }(\phi _0|{\mathcal {G}})\). Since \(F_t(\cdot )\) is upper hemicontinuous for \(\mu \)-almost all \(t\in T\), \(\phi _0(t) \in {\text{ Ls }}(\phi _k(t)) \subseteq {\text{ Ls }}( F_{y_k}(t)) \subseteq F_{y_0}(t)\) for \(\mu \)-almost all \(t\in T\). That is, \(\phi _0\) is a \({\mathcal {T}}\)-measurable selection of \(F_{y_0}\) and \(h_0 \in H(y_0) \subseteq V\). This is a contradiction. Thus, H is weakly upper hemicontinuous.

The case \(p = \infty \) follows as in the third paragraph of the proof of Theorem 3.

Conversely, suppose that \({\mathcal {T}}\) has a \({\mathcal {G}}\)-atom D with \(\lambda _1(D) > 0\). Let G be the correspondence as in Theorem 2,

$$\begin{aligned} G(t) = {\left\{ \begin{array}{ll} \{0,1\} &{} t\in D;\\ \{0\} &{} t\notin D. \end{array}\right. } \end{aligned}$$

Let \(Y = \{\frac{1}{k}\}_{k \ge 1} \cup \{0\}\) endowed with the usual metric, \(F(t,0) = G(t)\) and \(F(t, \frac{1}{k}) = \{\phi _k(t)\}\) for all \(t\in T\) and \(k \ge 1\), where \(\phi _k\) is the same as in the converse part of the proof of Theorem 3. Then G is compact valued and bounded, and \(F(t,\cdot )\) is upper hemicontinuous for all \(t\in T\).

Consider the correspondence G. For \(1\le p < \infty \), since \({\mathcal {I}}_G^{({\mathcal {T}},{\mathcal {G}},\lambda _1)}\) is p-integrably bounded, it is relatively weakly sequentially compact in \(L_p((T,{\mathcal {G}},\lambda _1),{\mathbb {R}})\) due to the Riesz/Dunford-Pettis weak compactness theorem in Royden and Fitzpatrick (2010, pp. 408/412) and hence relatively weakly compact in \(L_p((T,{\mathcal {G}},\lambda _1),{\mathbb {R}})\). For \(p = \infty \), \({\mathcal {I}}_G^{({\mathcal {T}},{\mathcal {G}},\lambda _1)}\) is relatively weak\(^*\) compact in \(L_{\infty }((T,{\mathcal {G}},\lambda _1),{\mathbb {R}})\) due to Alaoglu’s theorem. Thus, \({\mathcal {I}}_{F_y}^{({\mathcal {T}},{\mathcal {G}},\lambda _1)}\) is a subset of a fixed weakly (resp. weak\(^*\)) compact set for all \(y \in Y\) when \(1\le p < \infty \) (resp. \(p=\infty \)).

For the sequence \(\{\frac{1}{k}\}\), \(\frac{1}{k} \rightarrow 0\) and \(\phi _k\) is a selection of \(F_{\frac{1}{k}}\). As shown in the proof above, \({\mathbb {E}}^{\lambda _1}(\phi _k|{\mathcal {G}})\) weakly (resp. weak\(^*\)) converges to \(\frac{1}{2} {\mathbb {E}}^{\lambda _1}({\mathbf {1}}_D|{\mathcal {G}})\) in \(L_p((T,{\mathcal {G}},\lambda _1),{\mathbb {R}})\) for \(1 \le p < \infty \) (resp. \(p = \infty \)), but there is no \({\mathcal {T}}\)-measurable selection \(\phi _0\) of G such that \({\mathbb {E}}^{\lambda _1}(\phi _0|{\mathcal {G}}) = \frac{1}{2} {\mathbb {E}}^{\lambda _1}({\mathbf {1}}_D|{\mathcal {G}})\). Therefore, \(\frac{1}{2} {\mathbb {E}}^{\lambda _1}({\mathbf {1}}_D|{\mathcal {G}}) \notin {\mathcal {I}}_{F_0}^{({\mathcal {T}},{\mathcal {G}},\lambda _1)}\), which implies that \({\mathcal {I}}_{F_y}^{({\mathcal {T}},{\mathcal {G}},\lambda _1)}\) is neither weakly upper hemicontinuous in \(L_p((T,{\mathcal {G}},\lambda _1),{\mathbb {R}})\) for \(1 \le p < \infty \) nor weak\(^*\) upper hemicontinuous in \(L_\infty ((T,{\mathcal {G}},\lambda _1),{\mathbb {R}})\). As a result, H(y) is neither weakly upper hemicontinuous in \(\prod _{1\le i \le m}L_p((T,{\mathcal {G}},\lambda _i),{\mathbb {R}})\) for \(1 \le p < \infty \) nor weak\(^*\) upper hemicontinuous in \(\prod _{1\le i \le m}L_\infty ((T,{\mathcal {G}},\lambda _i),{\mathbb {R}})\). This completes the proof. \(\square \)

1.3 Proof of Theorem 9

It follows from Remark 4 that there is a countably generated sub-\(\sigma \)-algebra \({\mathcal {G}}'\) of \({\mathcal {G}}\) such that \(q_j(\cdot |s,x)\) is \({\mathcal {G}}'\)-measurable for all \(s\in S\), \(x\in X\), and \(1 \le j \le J\). Without loss of generality, we shall assume that \({\mathcal {G}}\) is countably generated in the rest of this proof.

For \(1 \le j \le J\), let \(\tau _j\) be the finite measure on \((S, {\mathcal {S}})\) with \(\rho _j\) being the Radon–Nikodym derivative with respect to \(\tau \), \(\tau _0 = \tau \), and \(\tilde{\tau } = (\tau _0, \tau _1, \ldots , \tau _J)\).

Let

$$\begin{aligned} V = \left\{ x =(x_1, \ldots , x_m) \in {\mathbb {R}}^m :|x_i| \le C, 1 \le i \le m \right\} , \end{aligned}$$

where C is the upper bound of the stage payoff function u. We slightly abuse the notation by also viewing V as a constant correspondence from S to \({\mathbb {R}}^{m}\). Then V is \({\mathcal {S}}\)-measurable, bounded and compact valued. Let \(\tilde{V} = {\mathcal {I}}^{({\mathcal {S}},{\mathcal {G}},\tilde{\tau })}_V\). By Theorems 2 and 3, and Footnote 23, \(\tilde{V} \subseteq \prod _{0 \le j \le J} L_{\infty }((S,{\mathcal {G}},\tau _j),{\mathbb {R}}^{m})\) is nonempty, convex, compact and metrizable under the weak\(^*\) topology.

Given any \(s\in S\) and \(\tilde{v} = (\tilde{v}_0, \tilde{v}_1, \ldots , \tilde{v}_J) \in \tilde{V}\), consider the game \(\Gamma (\tilde{v}, s)\). The action space for player i is \(A_i(s)\). The payoff of player i with the action profile \(x\in A(s)\) is given by

$$\begin{aligned} \Phi _i(s,x,\tilde{v}) = (1-\beta _i) u_i(s,x) + \beta _i \sum _{1\le j\le J} \int _{S} \tilde{v}_{ji}(s_1) q_j(s_1|s,x) \tau _j({{\mathrm{d}}}s_1), \end{aligned}$$

where \(\tilde{v}_{ji}\) is the i-th dimension of \(\tilde{v}_{j}\).

For any \(\tilde{v} \in \tilde{V}\), there exists an \({\mathcal {S}}\)-measurable selection v of V such that \(\tilde{v} = {\mathbb {E}}^{\tilde{\tau }} (v|{\mathcal {G}})\). For each j and \((s,x)\), \(q_j(\cdot ,s,x)\) is \({\mathcal {G}}\)-measurable, so it can be moved inside the conditional expectation. We have

$$\begin{aligned}&\sum _{1 \le j \le J} \int _{S} \tilde{v}_{ji}(s_1) q_j(s_1, s,x) \tau _j({{\mathrm{d}}}s_1) = \sum _{1 \le j \le J} \int _{S} {\mathbb {E}}^{\tau _j}(v_i q_j(\cdot , s,x)|{\mathcal {G}})(s_1) \tau _j({{\mathrm{d}}}s_1) \\&\quad = \sum _{1 \le j \le J} \int _{S} v_i(s_1) q_j(s_1, s,x) \tau _j({{\mathrm{d}}}s_1) = \sum _{1 \le j \le J} \int _{S} v_i(s_1) \rho _j(s_1) q_j(s_1, s,x) \tau ({{\mathrm{d}}}s_1) \\&\quad = \int _{S} v_i(s_1) Q({{\mathrm{d}}}s_1| s,x). \end{aligned}$$

As a result,

$$\begin{aligned} \Phi _i(s,x,\tilde{v})= & {} (1-\beta _i) u_i(s,x) + \beta _i \int _{S} v_i(s_1) Q({{\mathrm{d}}}s_1|s,x). \end{aligned}$$
(4)
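The chain of equalities leading to Eq. (4) rests on the fact that \(q_j(\cdot ,s,x)\) is \({\mathcal {G}}\)-measurable. The following is a minimal finite-state sketch of that step, under assumptions that are not in the original: S has six points, \({\mathcal {G}}\) is generated by a finite partition, and the names `cond_exp`, `integral`, `tau`, `partition`, `v`, `q` and `q_bad` are hypothetical. It checks that integrating \({\mathbb {E}}(v|{\mathcal {G}})\, q\) gives the same value as integrating \(v\, q\) when q is constant on the blocks of the partition, and that the identity can fail otherwise.

```python
from fractions import Fraction as Fr


def cond_exp(v, tau, partition):
    """Conditional expectation of v given the sigma-algebra generated by a
    finite partition, under the (positive) point masses tau."""
    out = [None] * len(v)
    for block in partition:
        mass = sum(tau[s] for s in block)
        avg = sum(v[s] * tau[s] for s in block) / mass
        for s in block:
            out[s] = avg
    return out


def integral(f, tau):
    """Integral of f with respect to the point masses tau."""
    return sum(fs * ts for fs, ts in zip(f, tau))


tau = [Fr(1, 10), Fr(2, 10), Fr(3, 10), Fr(1, 10), Fr(1, 10), Fr(2, 10)]
partition = [[0, 1], [2, 3, 4], [5]]            # generates the coarse sigma-algebra
v = [Fr(3), Fr(-1), Fr(2), Fr(0), Fr(5), Fr(-4)]  # arbitrary bounded payoff
q = [Fr(1, 2), Fr(1, 2), Fr(2), Fr(2), Fr(2), Fr(1, 3)]  # constant on blocks

v_tilde = cond_exp(v, tau, partition)

# q measurable with respect to the partition: the two integrals agree.
lhs = integral([a * b for a, b in zip(v_tilde, q)], tau)
rhs = integral([a * b for a, b in zip(v, q)], tau)

# q_bad is NOT constant on the first block: the identity need not hold.
q_bad = [Fr(1), Fr(0), Fr(0), Fr(0), Fr(0), Fr(0)]
lhs_bad = integral([a * b for a, b in zip(v_tilde, q_bad)], tau)
rhs_bad = integral([a * b for a, b in zip(v, q_bad)], tau)

print(lhs == rhs, lhs_bad == rhs_bad)
```

Exact rational arithmetic is used so that the comparison is an equality check rather than a floating-point tolerance test.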

Lemma 4

For each \(i \in I,\, s \in S\), \(\Phi _i(s,\cdot ,\cdot )\) is jointly continuous on \(A(s) \times \tilde{V}\), where \(\tilde{V}\) is endowed with the weak\(^*\) topology and \(A(s) \times \tilde{V}\) is endowed with the product topology.

Proof

Suppose that \(x^n \rightarrow x^0\) in A(s) and \(\tilde{v}^{n}\) weak\(^*\) converges to \(\tilde{v}^{0}\) in \(\tilde{V}\). For each \(n \ge 0\), there exists an \({\mathcal {S}}\)-measurable selection \(v^{n}\) of V such that \(\tilde{v}^{n} = {\mathbb {E}}^{\tilde{\tau }}(v^{n}|{\mathcal {G}})\).

First, we have

$$\begin{aligned}&\left| \int _{S} v^{n}_i(s_1) Q({{\mathrm{d}}}s_1| s,x^{n}) - \int _{S} v^{n}_i(s_1) Q({{\mathrm{d}}}s_1| s,x^{0}) \right| \\&\quad \le C\cdot \left\| Q(\cdot | s,x^{n}) - Q(\cdot | s,x^{0}) \right\| \rightarrow 0, \end{aligned}$$

where \(\Vert \cdot \Vert \) here is the total variation norm. By Eq. (4),

$$\begin{aligned} \left| \Phi _i(s,x^{n},\tilde{v}^{n}) - \Phi _i(s,x^{0},\tilde{v}^{n}) \right| \rightarrow 0. \end{aligned}$$

Furthermore, since \(\tilde{v}^{n}\) weak\(^*\) converges to \(\tilde{v}^{0}\), we have

$$\begin{aligned} \left| \Phi _i(s,x^{0},\tilde{v}^{n}) - \Phi _i(s,x^{0},\tilde{v}^{0}) \right| \rightarrow 0. \end{aligned}$$

As a result,

$$\begin{aligned} \left| \Phi _i(s,x^{n},\tilde{v}^{n}) - \Phi _i(s,x^{0},\tilde{v}^{0}) \right|&\le \left| \Phi _i(s,x^{n},\tilde{v}^{n}) - \Phi _i(s,x^{0},\tilde{v}^{n}) \right| \\&\quad + \left| \Phi _i(s,x^{0},\tilde{v}^{n}) - \Phi _i(s,x^{0},\tilde{v}^{0}) \right| \\&\quad \rightarrow 0. \end{aligned}$$

The proof is completed. \(\square \)
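The first estimate in the proof above bounds the change in expected continuation value by C times the total variation distance between the transition probabilities. A small sketch of this bound on a three-point state space follows. The specific measures `q0` and `q1`, the mixing sequence `q_n`, the sign-alternating functions `v_n` and the value of `C` are all hypothetical choices made for illustration, and the total variation norm is taken here as the sum of absolute differences of point masses.

```python
from fractions import Fraction as Fr

C = Fr(2)  # assumed uniform bound on |v|


def expect(v, q):
    """Integral of v with respect to the discrete measure q."""
    return sum(vs * qs for vs, qs in zip(v, q))


def tv_norm(q1, q2):
    """Total variation norm of q1 - q2 on a finite state space."""
    return sum(abs(a - b) for a, b in zip(q1, q2))


q0 = [Fr(1, 2), Fr(1, 4), Fr(1, 4)]  # limit transition probability
q1 = [Fr(0), Fr(1, 2), Fr(1, 2)]     # perturbing probability


def q_n(n):
    """Transition probability converging to q0 in total variation."""
    w = Fr(1, n)
    return [(1 - w) * a + w * b for a, b in zip(q0, q1)]


def v_n(n):
    """A function bounded by C that changes with n and has no limit."""
    return [C * (-1) ** (s + n) for s in range(3)]


def gap(n):
    """|integral of v_n under q_n minus integral of v_n under q0|."""
    v = v_n(n)
    return abs(expect(v, q_n(n)) - expect(v, q0))


for n in (1, 10, 100):
    print(n, float(gap(n)), float(C * tv_norm(q_n(n), q0)))
```

The point of letting `v_n` vary with n is that the bound is uniform over all functions bounded by C, which is what allows the selection \(v^{n}\) to change along the sequence in the proof.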

Fix any \(s\in S\) and \(\tilde{v} \in \tilde{V}\), and consider the game \(\Gamma (\tilde{v}, s)\). A mixed strategy of player i is an element in \({\mathcal {M}}(A_i(s))\), which is endowed with the topology of weak convergence of measures. A mixed-strategy profile is an element in \(\prod _{i\in I}{\mathcal {M}}(A_i(s))\) endowed with the product topology. The set of mixed-strategy Nash equilibria in the game \(\Gamma (\tilde{v}, s)\) is denoted by \(N(\tilde{v}, s)\), which is nonempty and compact under the corresponding product topology. Let \(P(\tilde{v}, s)\) (abbreviated as \(P_{(\tilde{v})}(s)\)) be the set of payoff vectors induced by the Nash equilibria in \(N(\tilde{v},s)\). The correspondence \(N(\tilde{v},\cdot )\) is \({\mathcal {S}}\)-measurable and compact valued.

Denote \(R(\tilde{v}) = {\mathcal {I}}^{({\mathcal {S}},{\mathcal {G}},\tilde{\tau })}_{P_{(\tilde{v})}}\). The correspondence R is nonempty, convex, weak\(^*\) compact valued and upper hemicontinuous by Theorems 2, 3 and 4. By Fan-Glicksberg’s fixed-point theorem, R has a fixed point \(\tilde{v}^{*} \in \tilde{V}\). Then there exists an \({\mathcal {S}}\)-measurable mapping \(v^{*} :S \rightarrow {\mathbb {R}}^m\) such that \(v^{*}\) is a selection of \(P_{(\tilde{v}^{*})}\), and \(\tilde{v}^{*} = {\mathbb {E}}^{\tilde{\tau }}(v^{*}|{\mathcal {G}})\). There exists an \({\mathcal {S}}\)-measurable mapping \(f^{*} :S \rightarrow \bigotimes _{i\in I}{\mathcal {M}}(X_i)\) such that \(f^{*}(s)\) is a mixed-strategy equilibrium of the game \(\Gamma (\tilde{v}^{*},s)\) and \(v^{*}(s)\) is the corresponding equilibrium payoff for each \(s\in S\). Since \(v^{*}\) is a measurable selection of \(P_{(\tilde{v}^{*})}\), Eq. (4) implies that Eqs. (1) and (2) hold for \(v^*\). Therefore, \(f^*\) is a stationary Markov perfect equilibrium.

Cite this article

He, W., Sun, Y. Conditional expectation of correspondences and economic applications. Econ Theory 66, 265–299 (2018). https://doi.org/10.1007/s00199-017-1067-7
