Multivariate Systemic Optimal Risk Transfer Equilibrium

A Systemic Optimal Risk Transfer Equilibrium (SORTE) was introduced in "Systemic optimal risk transfer equilibrium", Mathematics and Financial Economics (2021), for the analysis of the equilibrium among financial institutions or in insurance-reinsurance markets. A SORTE conjugates the classical Bühlmann notion of a risk exchange equilibrium with a capital allocation principle based on systemic expected utility optimization. In this paper we extend this notion to the case when the value function to be optimized is multivariate in a general sense, and is not simply given by the sum of univariate utility functions. This takes into account the fact that the preferences of single agents might depend on the actions of other participants in the game. Technically, the extension of SORTE to the new setup requires developing a theory for multivariate utility functions and selecting at the same time a suitable framework for the duality theory. Conceptually, this more general framework allows us to introduce and study a Nash Equilibrium property of the optimizer. We prove existence, uniqueness, and the Nash Equilibrium property of the newly defined Multivariate Systemic Optimal Risk Transfer Equilibrium.


Introduction
A Systemic Optimal Risk Transfer Equilibrium, denoted by SORTE, was introduced and analyzed in Biagini et al. (2021) [6]. The SORTE concept was inspired by Bühlmann's notion of a Risk Transfer Equilibrium in insurance-reinsurance markets. However, in Bühlmann's definition the vector assigning the budget constraints was given a priori. On the contrary, in the SORTE such a vector is endogenously determined by solving a systemic utility maximization problem. As remarked in [6], "SORTE gives priority to the systemic aspects of the problem, in order to optimize the overall systemic performance, rather than to individual rationality". For the precise definition of a SORTE, its existence, uniqueness and Pareto optimality we refer to [6]. In Section 1.1 we only very briefly recall its motivation and heuristic definition, in order to compare it with the results of the present paper. We do not address any integrability issues in this Introduction.
The capital allocation and risk sharing equilibrium that we consider in this new work, similarly to the one introduced in [6], can be applied to many contexts, such as: equilibrium among financial institutions, agents, or countries; insurance and reinsurance markets; capital allocation among business units of a single firm; wealth allocation among investors.
The key novelty in this work is that we consider preferences of agents which depend on other agents' choices. This is modeled using multivariate utility functions. Let (Ω, F, P) be a probability space and L^0(Ω, F, P) be the vector space of (equivalence classes of) real-valued F-measurable random variables. The sigma algebra F represents all possible measurable events at the terminal time. E_Q[·] denotes the expectation under a probability Q. For the sake of simplicity, we assume zero interest rate. In a one-period setup we consider N agents. The individual risk (or the random endowment, or the future profit and loss) plus the initial wealth of agent j is represented by the random variable X^j ∈ L^0(Ω, F, P). Thus the risk vector X = [X^1, ..., X^N] ∈ (L^0(Ω, F, P))^N denotes the original configuration of the system. We assume that the system has at its disposal a total amount of capital A ∈ R to be used in case of necessity. This amount could have been assigned by the Central Bank, could be the result of previous trading in the system, or could have been collected ad hoc by the agents. The amount A could represent an insurance pot, or a fund collected (as guarantee for future investments) in a community of homeowners. For further interpretation of A, see also the related discussion in Section 5.2 of Biagini et al. (2020) [8]. In any case, we consider the quantity A as exogenously determined, and our aim is to allocate this amount among the agents in order to optimize the overall systemic satisfaction.
In this paper we work with a multivariate utility function U : R^N → R that is strictly concave and strictly increasing with respect to the partial componentwise order. However, some results (see Corollary 6.2) hold also without strict concavity or strict monotonicity. We develop a condition on the multivariate utility U, see Definition 3.4, that plays the same role as the Inada conditions in the one-dimensional case. Details and precise assumptions are deferred to Section 1.2 and Section 3. Examples of utility functions U satisfying our assumptions are collected in Section 7.

Systemic optimal (deterministic) initial-time allocation
If we denote by a^j ∈ R the cash received (if positive) or provided (if negative) by agent j at the initial time, then the time-T wealth at the disposal of agent j will be X^j + a^j. The optimal allocation a_X ∈ R^N could then be determined as the solution to the following aggregate criterion:

\[
\Pi^{det}_A(X) := \sup\Big\{ E_P\big[U(X+a)\big] \,:\, a\in\mathbb{R}^N,\ \sum_{j=1}^N a^j = A \Big\}.
\]

As the vector a_X ∈ R^N is deterministic, it is known at the initial time, and therefore the allocation is distributed (only) at such initial time; this is to the advantage of each agent. Indeed, if agent j receives the fund a^j_X at the initial time, the agent may use this amount to prevent financial ruin or future default. By the monotonicity of U, we may formalize the budget constraint in the utility maximization problem (here and below) equivalently with the equality ∑_{j=1}^N a^j = A or the inequality ∑_{j=1}^N a^j ≤ A.

Systemic optimal (random) final-time allocation
We now replace in Π^det_A the constant vectors a ∈ R^N with random vectors Y = [Y^1, ..., Y^N] ∈ (L^0(Ω, F, P))^N representing final-time random allocations. Set

\[
\mathcal{C}_{\mathbb{R}} := \Big\{ Y\in (L^0(\Omega,\mathcal{F},\mathbb{P}))^N \,:\, \sum_{j=1}^N Y^j = c\ \ \mathbb{P}\text{-a.s., for some } c\in\mathbb{R} \Big\},
\]

and note that each component Y^j of a vector Y ∈ C_R is a random variable (measurable with respect to the sigma algebra at the terminal time), but the sum of the components is P-a.s. equal to some constant in R. We may impose additional constraints on the vectors Y of random allocations by requiring that they belong further to a prescribed set B of feasible allocations. It will be assumed that B satisfies the conditions made precise in Section 3 and that B is translation invariant: B + R^N = B. We consider a family Q_{B,V} of probability vectors (see (23)) associated to B and to the convex conjugate V of U, and we take L := ∩_{Q∈Q_{B,V}} L^1(Q) for L^1(Q) := L^1(Ω, F, Q^1) × ... × L^1(Ω, F, Q^N). With these notations, a different possibility to allocate the amount A among the agents is to consider the following criterion:

\[
\Pi^{ran}_A(X) := \sup\Big\{ E_P\big[U(X+Y)\big] \,:\, Y\in\mathcal{B},\ \sum_{j=1}^N Y^j = A \Big\}. \tag{2}
\]

It is clear that Π^det_A(X) ≤ Π^ran_A(X); thus random allocations obviously realize a greater systemic expected utility, as the dependence among the components X^j of the original risk can be taken into account by the terms X^j + Y^j. The condition ∑_{j=1}^N Y^j = A is instrumental for the allocation of the amount A.
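For concreteness, the gap between the two criteria can be illustrated on a toy finite probability space. The following sketch (the numbers and the choice U(x) = −e^{−x_1} − e^{−x_2} are our own illustrative assumptions, not taken from the paper) computes the deterministic value by grid search and exhibits one feasible random allocation, full pooling, that already dominates it when the two risks hedge each other:

```python
import math

# Two states, two agents, A = 0; U(x) = -exp(-x1) - exp(-x2).
# All numbers are illustrative assumptions, not data from the paper.
probs = [0.5, 0.5]
X1 = [1.0, -1.0]           # endowment of agent 1 in the two states
X2 = [-1.0, 1.0]           # endowment of agent 2 (anti-correlated with X1)

def EU(W1, W2):
    """Systemic expected utility E[U(W)] for state-wise wealths."""
    return sum(p * (-math.exp(-w1) - math.exp(-w2))
               for p, w1, w2 in zip(probs, W1, W2))

# Deterministic allocations a = (a, -a) with a^1 + a^2 = 0: grid search.
pi_det = max(EU([x + a for x in X1], [x - a for x in X2])
             for a in [i / 100.0 for i in range(-300, 301)])

# One feasible random allocation: full pooling Y^1 = -X^1, Y^2 = X^1,
# so Y^1 + Y^2 = 0 in every state; it lower-bounds Pi^ran_0.
pi_ran_lb = EU([0.0, 0.0], [0.0, 0.0])

print(pi_det, pi_ran_lb)
```

In this example the deterministic optimum is −(e + 1/e) ≈ −3.086, while the pooled random allocation already attains −2, so the inequality between the two criteria is strict here.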
The optimization problem in (2) can be seen as the maximization of systemic utility for the allocation of the amount A over feasible allocations Y ∈ B, in a regulatory approach. Indeed, only the utility of the whole system is taken into account in (2), while optimality for single agents is not required. The problem (2) is similar in spirit to classical risk sharing problems (see Barrieu and El Karoui (2005) [5]). Unlike in the classical risk sharing problems, we have a multivariate value function in place of the classical sum of univariate ones. We observe that the "budget" constraints in Π^ran_A are not expressed in the classical way using expectations under some (or many) probability measures, but are instead formalized as a P-a.s. equality. Only in the case N = 1 does the problem become trivial, as Π^ran_A(X) = E[U(X + A)]. On a technical level, our first main result of this paper is the detailed study of the problem Π^ran_A. We first show in Theorem 4.2 that Π^ran_A(X) can be rewritten with the budget constraint assigned by the family of probability vectors Q_{B,V}, namely

\[
\Pi^{ran}_A(X) = \sup\Big\{ E_P\big[U(X+Y)\big] \,:\, Y\in\mathcal{L},\ \sum_{j=1}^N E_{Q^j}[Y^j] \le A\ \ \forall\, Q\in\mathcal{Q}_{\mathcal{B},V} \Big\}. \tag{3}
\]

We prove (Theorem 4.2 and Corollary 4.7): (i) the existence of an optimizer Ŷ ∈ L of the problem in (3); (ii) its dual formulation as a minimization problem over Q_{B,V}; (iii) the existence of an optimizer Q̂ ∈ Q_{B,V} of such dual formulation; (iv) that (2) and (3) have the same optimizer. Additionally, in (35) we prove that for such an optimizer Q̂ we have

\[
\sum_{j=1}^N E_{\widehat{Q}^j}\big[\widehat{Y}^j\big] = A. \tag{4}
\]

We now present some more conceptual motivation for the analysis of this problem. As stated above, Π^ran_A(X) is greater than Π^det_A(X), and the optimizers Ŷ of Π^ran_A(X) are terminal-time allocations, as they are F-measurable. However, one may obviously split Ŷ into two components,

\[
\widehat{Y} = a + \widetilde{Y}, \tag{5}
\]

for some a ∈ R^N such that ∑_{j=1}^N a^j = A, which then represents an initial capital allocation a = (a^1, ..., a^N) of A, and a terminal-time risk exchange Ỹ satisfying ∑_{j=1}^N Ỹ^j = 0 P-a.s. We pose two natural questions:
1. Is there an "optimal" way to select such initial capital a ∈ R^N?
2. Could we discover a risk exchange equilibrium among the agents that is embedded in the problem Π^ran_A?
From the formulation in (4), one could conjecture that the amount a^j assigned as the expectation of the optimizer of Π^ran_A(X) under the probability Q̂^j, namely a^j := E_{Q̂^j}[Ŷ^j], could have a special relevance. We will show indeed that the optimal solution to the above problem Π^ran_A(X) coincides with a multivariate version of the Systemic Optimal Risk Transfer Equilibrium (SORTE) introduced in Biagini et al. (2021) [6].
In order to answer these questions more precisely we need to recall the notion of a risk exchange equilibrium, as proposed by Bühlmann (1980) [12] and (1984) [13].

Risk exchange equilibrium and Systemic Optimal Risk Transfer Equilibrium
We recall that in this paper we work with a multivariate utility function but, in order to illustrate the risk exchange equilibrium and the SORTE concepts, in this subsection we assume that the preferences of each agent j are given, via expected utility, by a strictly concave, increasing utility function u_j : R → R, j = 1, ..., N. In this case the corresponding multivariate utility function would be U(x) := ∑_{j=1}^N u_j(x^j), x ∈ R^N. The vector X = (X^1, ..., X^N) ∈ (L^0(Ω, F, P))^N denotes the original risk configuration of the system (the individual risk plus the initial wealth), and each agent is an expected utility maximizer. At terminal time a reinsurance mechanism is allowed to happen, in that each agent j agrees to receive (if positive) or to provide (if negative) the amount Ỹ^j(ω) at the final time in exchange for the amount E_{Q^j}[Ỹ^j] paid (if positive) or received (if negative) at the initial time, where Q = (Q^1, ..., Q^N) is some pricing probability vector (the equilibrium price system). The reinsurance nature of this reallocation comes from the fact that the clearing condition

\[
\sum_{j=1}^N \widetilde{Y}^j = 0 \quad \mathbb{P}\text{-a.s.} \tag{6}
\]

is required to hold, which models a terminal-time risk transfer mechanism. Integrability or boundedness conditions on Ỹ^j will be added later, when we rigorously formalize the setting. We use the tilde in the notation Ỹ for the sake of consistency with the previous work [6].
As defined in Bühlmann (1980) [12] and (1984) [13], a pair (Ỹ_X, Q_X) is a risk exchange equilibrium with respect to the vector X if: (α) for each j, Ỹ^j_X maximizes

\[
\widetilde{Y}^j \mapsto E_P\Big[u_j\Big(X^j + \widetilde{Y}^j - E_{Q^j_X}\big[\widetilde{Y}^j\big]\Big)\Big];
\]

(β) the clearing condition ∑_{j=1}^N Ỹ^j_X = 0 P-a.s. holds. The optimal value in (α) is denoted by

Remark 1.1. A key point of Bühlmann's risk exchange equilibrium, which carries over to SORTE and mSORTE, is that in (α) the single agent j is optimizing over all possible random variables Ỹ^j and not only over those that satisfy the clearing condition (β). Indeed, for the single myopic agent the clearing condition is irrelevant. Observe that if in (α) we consider a generic probability vector Q, the solutions Ỹ^j_X of the N single problems in (α) will typically not satisfy the clearing condition (β). It is only the selection of the equilibrium pricing vector Q_X in (α) that makes it possible to comply with the clearing condition (β).
Remark 1.2. Differently from Bühlmann's notion, we will also impose that the exchange vector Ỹ belongs to a prescribed set B of feasible allocations, as already mentioned. If there are no further constraints, i.e. if B is the whole space, the condition

(γ) Q^1_X = ... = Q^N_X

holds in Bühlmann's risk exchange equilibrium. But the presence of the constraints on Ỹ, represented by B, forces us to abandon the condition (γ) in Bühlmann's risk exchange equilibrium and to allow for a generic vector Q. A detailed discussion on several potentially different pricing measures in Bühlmann's risk exchange equilibrium with feasible allocation set B = C_R, as well as in SORTE, can be found in the Introduction of [6]. This prompts us to define a B-risk exchange equilibrium with respect to the vector X as a pair (Ỹ_X, Q_X) satisfying: (α) for each j, Ỹ^j_X maximizes the individual expected utility as above; (β) Ỹ_X ∈ B and the clearing condition (6) holds. After this review of the concept of risk exchange equilibrium, we now return to our problem of allocating the amount A ∈ R. Observe that if a ∈ R^N is allocated at the initial time among the agents and ∑_{j=1}^N a^j = A, then the initial risk configuration of each agent becomes (a^j + X^j), and they may enter in a risk exchange equilibrium with respect to such modified vector (a + X).
According to [6], a SORTE is then defined through the two-level optimization problem (8), where the initial allocation a ∈ R^N is itself optimally chosen. Observe that a SORTE explains how optima can be realized by combining optimality for the system as a whole (the optimization over a ∈ R^N in (8)) with convenience for single agents (the inner supremum in (8)). Under fairly general assumptions, existence, uniqueness and Pareto optimality of a SORTE were proven in [6].
In this paper we propose a multivariate version of this concept, which we label mSORTE, and prove that the optimizer Ŷ of Π^ran_A(X), the optimizer Q̂ of the dual formulation of Π^ran_A(X), and the selection a^j := E_{Q̂^j}[Ŷ^j] in the splitting (5) determine the (unique) mSORTE.

Multivariate Systemic Optimal Risk Transfer Equilibrium
Essentially, in this paper we answer the following question: what can we say about a concept of equilibrium similar to SORTE, with the same underlying exchange dynamics, when the preferences of each agent depend on the actions of the other agents in the system? In our analysis, we consider a multivariate utility function U : R^N → R for the system. Loosely speaking, U is a utility associated to the system as a whole. U determines the preferences of single agents in the system, who take into account the actions and choices of the others: if the random vector Z^{[-j]} = [Z^1, ..., Z^{j-1}, Z^{j+1}, ..., Z^N] models the positions of agents 1, ..., j-1, j+1, ..., N, we suppose that each agent j is an expected utility maximizer, in the sense that he/she seeks to maximize over W the quantity E_P[U([Z^{[-j]}; W])], where [Z^{[-j]}; W] := [Z^1, ..., Z^{j-1}, W, Z^{j+1}, ..., Z^N]. U can be thought of as a nontrivial aggregation of the preferences of single agents: if each agent has preferences given by univariate utility functions u_1, ..., u_N : R → R, then, given an aggregation function Γ : R^N → R which is concave and nondecreasing, we can consider U(x) = Γ(u_1(x^1), ..., u_N(x^N)), in the spirit of Liebrich and Svindland (2019) [34], Definition 4, as a natural candidate for U. Alternatively, U is a counterpart to the multivariate loss function ℓ considered e.g.
in [3] in the framework of Systemic Risk Measures. Simply setting U(x) = −ℓ(−x), x ∈ R^N, produces a natural candidate for our U starting from a loss function ℓ. The difference between loss functions and utility functions is not conceptual here, being instead just an effect of considering as positive amounts losses (as in the case of ℓ) or gains (as in the case of standard utility functions). Multivariate utility functions can also be employed to describe the case of a single firm having investments in N units, where the interconnections among the N desks are relevant. We will impose on our multivariate utility function U conditions which play the same role as the Inada conditions in the univariate case. Examples of utility functions satisfying our assumptions are collected in Section 7. Notice that the setup and results in [6] can be recovered from the ones in this paper by setting U(x) = ∑_{j=1}^N u_j(x^j), x ∈ R^N, as described in Section 7.1.
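As a quick numerical sanity check of the aggregation construction above, the following sketch builds a candidate U(x) = Γ(u_1(x_1), u_2(x_2)) from two exponential-type utilities and a concave, strictly increasing aggregator (both choices are our own illustrative examples, not taken from the paper), and verifies componentwise strict monotonicity and midpoint concavity on random sample points:

```python
import math
import random

def u(x):
    """Univariate concave increasing utility (illustrative choice)."""
    return 1.0 - math.exp(-x)

def Gamma(y1, y2):
    """Concave, strictly increasing aggregator: -log(e^{-y1} + e^{-y2})."""
    return -math.log(math.exp(-y1) + math.exp(-y2))

def U(x1, x2):
    """Candidate multivariate utility U(x) = Gamma(u(x1), u(x2))."""
    return Gamma(u(x1), u(x2))

random.seed(0)
ok_mono = ok_conc = True
for _ in range(1000):
    x1, x2 = random.uniform(-2, 2), random.uniform(-2, 2)
    y1, y2 = random.uniform(-2, 2), random.uniform(-2, 2)
    # strictly increasing in each component separately
    ok_mono = (ok_mono and U(x1 + 0.1, x2) > U(x1, x2)
               and U(x1, x2 + 0.1) > U(x1, x2))
    # midpoint concavity: a concave nondecreasing Gamma composed with
    # concave u_j yields a concave U
    mid = U((x1 + y1) / 2, (x2 + y2) / 2)
    ok_conc = ok_conc and mid >= (U(x1, x2) + U(y1, y2)) / 2 - 1e-12

print(ok_mono, ok_conc)
```

The checks succeed because composing a concave nondecreasing Γ with concave increasing u_j preserves concavity and monotonicity, which is exactly what makes the aggregated U a multivariate utility function.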
In this paper we introduce and analyze the following concept. A mSORTE is a triple (Ỹ_X, a_X, Q_X) of a terminal-time exchange Ỹ_X, an initial (deterministic) allocation a_X, and a pricing vector Q_X, where in particular the pricing vector Q_X is selected in such a way that the optimal solution Ỹ_X belongs to the set of feasible allocations B and satisfies the clearing condition (6).
We prove existence and uniqueness of a mSORTE. Quite remarkably, this generalization of a SORTE allows us to introduce and study a Nash Equilibrium property for a mSORTE, as shown in Section 4.2 (see Theorem 4.5). In Section 4.3 we provide an example of a class of exponential multivariate utility functions, together with explicit computations of the mSORTE.
From a technical perspective, our results can be considered as consequences of Theorem 4.2 and Theorem 4.3. The proof of Theorem 4.2, which is the most lengthy and complex, is split into several steps and collected in Section 6. We use a Komlós-type argument, in contrast with the gradient approach in [6]. This allows us to obtain existence of optimizers for both the primal and the dual problems without requiring differentiability of U(·), which is a rather unusual result in the literature. We also remark that, differently from [6], we need to construct the dual system (M^Φ, K^Φ), where M^Φ is a multivariate Orlicz Heart having as topological dual space the Köthe dual K^Φ. Here, we denote by Φ : (R_+)^N → R the multivariate Orlicz function Φ(x) := U(0) − U(−x) associated to the multivariate utility function U. Details of this construction are provided in Section 2.1.
As already mentioned, this paper is a natural continuation of [6]. Thus, as far as the conceptual aspects are concerned, we refer to the literature review in [6] for extended comments.
The paper is organized as follows. Section 2.1 is a short account of multivariate Orlicz spaces and of the relevant properties needed in the sequel of the paper. The multivariate utility functions used in this paper are introduced in Section 3, together with our setup and assumptions. The core of the paper is Section 4, where we formally present the key concepts and provide our main results. Section 5 collects some preliminary results on duality and utility maximization. Most of the proofs are deferred to Section 6. The Appendix collects some additional technical results and some of the proofs related to Section 2.1.

Preliminary notations and multivariate Orlicz spaces
Let (Ω, F, P) be a probability space and consider vectors Q = (Q^1, ..., Q^N) of probability measures on (Ω, F). For a vector of probability measures Q, we write L^0(Q), L^1(Q) and L^∞(Q) for the vector spaces of (equivalence classes of) Q-a.s. finite, Q-integrable and Q-essentially bounded random variables, respectively. Given a vector y ∈ R^N and n ∈ {1, ..., N}, we denote by y^{[-n]} the vector in R^{N-1} obtained by suppressing the n-th component of y for N ≥ 2 (and y^{[-n]} = ∅ if N = 1), and we set [y^{[-n]}; z] := [y^1, ..., y^{n-1}, z, y^{n+1}, ..., y^N] ∈ R^N for z ∈ R.
We write R_+ := [0, +∞) and R_{++} := (0, +∞), and ⟨x, y⟩ := ∑_{j=1}^N x^j y^j for the usual inner product of vectors x, y ∈ R^N. For a vector x ∈ R^N, (x)^± denote the vectors of positive and negative parts, respectively, of the components of x. The same applies to |x|.

Multivariate Orlicz spaces
Given a univariate Young function φ : R_+ → R, we can associate its conjugate function φ*(y) := sup_{x ∈ R_+}(xy − φ(x)) for y ∈ R_+. As in [39], we can associate to both φ and φ* the Orlicz spaces and Hearts L^φ, M^φ, L^{φ*}, M^{φ*}. For univariate utility functions, the economic motivation and the mathematical convenience of using Orlicz space theory in utility maximization problems were shown in [9]. We now introduce multivariate Orlicz functions and spaces induced by multivariate utility functions. The following definition is a slight modification of the one in Appendix B of [3].

Definition 2.1. A function Φ : (R_+)^N → R is said to be a multivariate Orlicz function if it is null in 0, convex, continuous, increasing in the usual partial order, and satisfies the following growth condition: there exist ...

For a given multivariate Orlicz function Φ we define, as in [3], the Orlicz space and the Orlicz Heart, respectively, where |X| := [|X^j|]_{j=1}^N is the componentwise absolute value and L^0((Ω, F, P); [−∞, +∞]) is the set of equivalence classes of [−∞, +∞]-valued F-measurable functions. We introduce the Luxemburg norm as the functional

\[
\|X\|_{\Phi} := \inf\big\{\lambda > 0 \,:\, E_P[\Phi(|X|/\lambda)] \le 1 \big\}.
\]

Lemma 2.2. Let Φ be a multivariate Orlicz function. Then:
1. the Luxemburg norm is finite on X if and only if X ∈ L^Φ;
2. the Luxemburg norm is in fact a norm on L^Φ, which makes it a Banach space;
3. M^Φ is a vector subspace of L^Φ, closed under the Luxemburg norm, and is a Banach space itself if endowed with the Luxemburg norm;
4. L^Φ is continuously embedded in (L^1(P))^N;
5. convergence in Luxemburg norm implies convergence in probability;
6. if |X| ≤ |Y| componentwise P-a.s. and Y ∈ L^Φ, then X ∈ L^Φ, and the same holds for the Orlicz Heart; in particular X ∈ L^Φ implies X^± ∈ L^Φ, and the same holds for the Orlicz Heart;
7. the topology of ‖·‖_Φ on M^Φ is order continuous (see [22], Definition 2.1.13, for the definition) with respect to the componentwise P-a.s. order, and M^Φ is the closure of (L^∞(P))^N in Luxemburg norm;
8. M^Φ and L^Φ are Banach lattices if endowed with the topology induced by ‖·‖_Φ and with the componentwise P-a.s. order.
Proof. Claims (1)-(5) follow as in [3]. (6) is trivial from the definitions. As to (7), sequential order continuity is an application of the Dominated Convergence Theorem, and order continuity follows from Theorem 1.1.3 in [22]. (8) is evident from the previous items.
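To make the Luxemburg norm concrete, the following sketch evaluates it by bisection on a finite probability space, for the illustrative choice Φ(x) = x_1^2 + x_2^2 (which is a multivariate Orlicz function); the state space, the probabilities and the closed-form benchmark are our own assumptions for this example, not data from the paper:

```python
import math

# Finite probability space with three states and a vector-valued X.
# Numerical sketch of  ||X||_Phi = inf{ lam > 0 : E[Phi(|X|/lam)] <= 1 }
# for the illustrative choice Phi(x1, x2) = x1^2 + x2^2.
probs = [0.25, 0.25, 0.5]
X = [(1.0, -2.0), (0.5, 1.0), (-1.0, 0.0)]   # X(omega) per state

def expected_phi(lam):
    """E[Phi(|X| / lam)] on the finite space."""
    return sum(p * ((abs(x1) / lam) ** 2 + (abs(x2) / lam) ** 2)
               for p, (x1, x2) in zip(probs, X))

# expected_phi is decreasing in lam, so bisection locates the norm.
lo, hi = 1e-9, 1e6
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if expected_phi(mid) <= 1.0:
        hi = mid
    else:
        lo = mid
lux_norm = hi

# For this quadratic Phi the norm is sqrt(E[X1^2 + X2^2]) in closed form.
closed_form = math.sqrt(sum(p * (x1 ** 2 + x2 ** 2)
                            for p, (x1, x2) in zip(probs, X)))
print(lux_norm, closed_form)
```

The bisection only uses that λ ↦ E[Φ(|X|/λ)] is decreasing, so the same scheme works for any multivariate Orlicz function on a finite space, not just the quadratic one chosen here.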
Now we need to work a bit on duality.
Definition 2.3. The Köthe dual K^Φ of the space L^Φ is defined as

\[
K^{\Phi} := \Big\{ Z \in (L^0(\Omega,\mathcal{F},\mathbb{P}))^N \,:\, \sum_{j=1}^N E_P\big[|X^j Z^j|\big] < +\infty \ \text{ for all } X \in L^{\Phi} \Big\}.
\]

Proposition 2.4. K^Φ can be identified with a subspace of the topological dual of L^Φ, and is a subset of (L^1(P))^N.
By Proposition 2.4, K^Φ is a normed space which can be naturally endowed with the dual norm of continuous linear functionals, which we denote by ‖·‖*_Φ. This norm will play here the role of the Orlicz norm, and the relation between the two norms ‖·‖_Φ and ‖·‖*_Φ is well understood in the univariate case (see Theorem 2.2.9 in [22]). The following proposition summarizes useful properties which show how the Köthe dual can play the role that the Orlicz space L^{Φ*} plays for M^φ in the univariate theory, and which are the counterparts to Corollary 2.2.10 in [22].
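For orientation, we recall the classical univariate relations alluded to above (this is the standard statement from the univariate Orlicz space literature, not a result of the present paper): denoting by ‖·‖_{φ*} the Luxemburg norm associated to the conjugate φ*, one has

\[
E_P\big[|XZ|\big] \le 2\,\|X\|_{\phi}\,\|Z\|_{\phi^*},
\qquad
\|Z\|_{\phi^*} \le \|Z\|^*_{\phi} \le 2\,\|Z\|_{\phi^*},
\]

so the dual (Orlicz) norm is equivalent, with universal constants, to the Luxemburg norm built from the conjugate function.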
Proposition 2.5. The following hold:

Proof. See Section A.3.

We now provide an example connecting the multivariate theory to the classical univariate one.
Remark 2.6. Even though we will not make this assumption in the rest of the paper, suppose in this remark that Φ(x) = ∑_{j=1}^N Φ_j(x^j) for univariate Orlicz functions Φ_1, ..., Φ_N, each separately satisfying Definition 2.1 with N = 1. Then we could consider the multivariate spaces L^Φ and M^Φ as above, or we could take the products of the corresponding univariate Orlicz spaces and Hearts. As shown in Section A.3, the two constructions yield the same sets. Observe that in the setup of this remark, from Proposition 2.5 Item 3, we have

Setup and assumptions

Definition 3.1. We say that U : R^N → R is a multivariate utility function if it is strictly concave and strictly increasing with respect to the partial componentwise order. When N = 1 we use the term univariate utility function instead. For a multivariate utility function U, we define the convex conjugate in the usual way by

\[
V(y) := \sup_{x\in\mathbb{R}^N}\big( U(x) - \langle x, y\rangle \big), \qquad y\in\mathbb{R}^N .
\]

Observe that by definition U(x) ≤ ⟨x, y⟩ + V(y) for every x, y ∈ R^N. For a multivariate utility function U, we also define the function

Remark 3.3. The well-known Inada conditions, for (one-dimensional) concave increasing utility functions u : R → R, have an evident economic significance and are very often assumed to hold true in order to solve utility maximization problems.

Inada(+∞): lim_{x→+∞} u'(x) = 0; Inada(−∞): lim_{x→−∞} u'(x) = +∞.
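For instance, the univariate exponential utility satisfies both conditions:

\[
u(x) = -e^{-x}, \qquad u'(x) = e^{-x}, \qquad \lim_{x\to+\infty} u'(x) = 0, \qquad \lim_{x\to-\infty} u'(x) = +\infty .
\]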
As is easy to check, these conditions can be equivalently rewritten as (14). Consider now the condition (15), which is weaker than Inada(−∞). One can again easily check that the two conditions (14) and (15) are equivalent to the following single statement: there exist an Orlicz function Φ : R_+ → R and a function f : R_+ → R such that (16) holds. Motivated by the above remark, we now introduce a condition that will replace (16) in the multivariate case and will play the same role as the Inada conditions in the one-dimensional case.
Definition 3.4. We say that a multivariate utility function U : R^N → R is well controlled if there exist a multivariate Orlicz function Φ : (R_+)^N → R and a function f : R_+ → R such that (17) holds.

Lemma 3.5. Suppose that the multivariate utility function U is well controlled. Then: ... (iii) for every ε > 0 small enough there exists a constant b_ε such that ...

Proof. Recall that by Definition 2.1 there exist ..., and the desired inequality follows letting ... in (20).

Remark 3.6. In the proofs in the multivariate setting, the inequalities (18) and (19) will play the same role that (14) and (15), respectively, have in the one-dimensional case. In Proposition 7.1 we use the aforementioned univariate Inada conditions to ensure that (17) holds when U has a particular form.
We observe that the inclusion opposite to the one in (ii) is a simple integrability requirement on X ∈ (L^0((Ω, F, P); [−∞, +∞]))^N, namely that the relevant expectation is finite for some λ > 0. This requirement is rather weak, and there are many examples of choices of U that guarantee this condition is met (see Section 7). Without further mention, the following two standing assumptions hold true throughout the paper.
Standing Assumption I. The function U : R^N → R is a multivariate utility function which is well controlled (Definition 3.4) and such that, for Φ in (17), the associated integrability condition holds.

Standing Assumption II. The set B of feasible allocations satisfies suitable structural properties (in particular it is translation invariant and contains 0), and the vector X belongs to the Orlicz Heart M^Φ.
Observe that Standing Assumption II implies that all constant vectors belong to B, so that all (deterministic) vectors of the form e_i − e_j (differences of elements of the canonical basis of R^N) belong to B ∩ M^Φ. We recall the following concept, introduced in [8], Definition 5.15, which was already used in [6].
As pointed out in [8], B = C_R is closed under truncation. The closedness under truncation property holds true for a rather wide class of constraints. For a more detailed explanation and examples, see also [6], Example 3.17 and Example 4.20. We will also need the following additional notation.
1. For any A ∈ R we consider the set of feasible random allocations

\[
\mathcal{B}_A := \Big\{ Y \in \mathcal{B} \,:\, \sum_{j=1}^N Y^j = A\ \ \mathbb{P}\text{-a.s.} \Big\}.
\]

2. Identifying Radon-Nikodym derivatives and measures in the natural way, the set Q can be described as follows: Q is the set of normalized (i.e. with componentwise expectations equal to 1), nonnegative vectors in the polar of B_0 ∩ M^Φ, in the dual system (M^Φ, K^Φ). Observe that M^Φ ⊆ L^1(Q) for all Q ∈ Q, and that Q depends on the set B.
3. In the definition of mSORTE we adopt the subset Q_{B,V} ⊆ Q of vectors of probability measures having "finite entropy", and the set L of the random allocations satisfying the integrability requirements defined by

\[
\mathcal{L} := \bigcap_{Q\in\mathcal{Q}_{\mathcal{B},V}} L^1(Q).
\]

Multivariate Systemic Optimal Risk Transfer Equilibrium

Main concept
We now provide the formal definition of the concept already illustrated in the Introduction (see equation (9)). It is the natural generalization of the SORTE introduced in [6], Definition 3.7.

Main results
We provide sufficient conditions for existence, uniqueness and the Nash Equilibrium property of a mSORTE; see Theorems 4.5 and 4.6 below.

Theorem 4.2. Under Assumption 3.8, the equality between the primal problem (25) and its dual formulation (26) holds. Moreover:
1. there exists an optimum Ŷ ∈ L to the problem in the RHS of (25); such an optimum satisfies Ŷ ∈ B_A ∩ L, and for any optimum (λ̂, Q̂) of (26) the following holds: ...
2. any optimum (λ̂, Q̂) of (26) satisfies λ̂ > 0 and Q̂ ∼ P;
3. there exists a unique optimum to the RHS of (25); if U is additionally differentiable, there exists a unique optimum (λ̂, Q̂) of (26).
Proof. The case A = 0 is covered in Theorem 6.1 and Corollary 6.8, the latter proving only the last equality in (26). In Section 6.3 we then explain how the same arguments used for A = 0 can be applied to the case A ≠ 0.
The following result is the counterpart to Theorem 4.2, once a vector Q ∈ Q B,V is fixed, and will be applied in Theorem 4.6.
Theorem 4.3. Under Assumption 3.8, for every Q ∈ Q_{B,V} and A ∈ R the following holds:

Proof. Consider first A = 0. By Equations (61) and (62), and by Remark 4.4 below, the case A = 0 is proved. The case A ≠ 0, instead, follows from Section 6.3.
Remark 4.4. From the definition of V we obtain the Fenchel inequality

\[
U(x) \le \langle x, y\rangle + V(y) \qquad \text{for all } x, y \in \mathbb{R}^N . \tag{29}
\]

Recall that M^Φ ⊆ L^1(Q) for all Q ∈ Q. For all X ∈ M^Φ, for all Q ∈ Q and all Y satisfying the stated integrability, the resulting upper bound holds, and the last expression is finite if Q ∈ Q_{B,V}.
On the existence of a mSORTE and Nash Equilibrium

Theorem 4.5. Under Assumption 3.8, there exists a mSORTE (Ỹ_X, Q_X, a_X) with E_{Q^j_X}[Ỹ^j_X] = 0 for every j = 1, ..., N. Furthermore, the following Nash Equilibrium property holds for any mSORTE (Ỹ_X, Q_X, a_X): for every j = 1, ..., N the component Ỹ^j_X solves the individual optimization problem in (30).

Proof. Take Ŷ as in Theorem 4.2 Item 1, Q̂ an optimizer of (26), and set a^j := E_{Q̂^j}[Ŷ^j] for j = 1, ..., N. Then, from (25) and (26), we obtain a chain of identities between suprema, where (32) is a simple reformulation of (31). By Item 1 of Theorem 4.2, the optimizer Ŷ ∈ L satisfies the constraints of the problem in (31), and by (27) the pair (Ŷ, a) yields an optimum for the problem in (32). By Lemma A.2, the triple obtained in this way is a mSORTE. Let now (Ỹ_X, Q_X, a_X) be a mSORTE as in Definition 4.1. We prove the Nash Equilibrium property (30). For any Z ∈ L^j, the claim follows using Item 1 of Definition 4.1.

On uniqueness of a mSORTE

Theorem 4.6. Under Assumption 3.8, suppose additionally that U is differentiable. Then there exists a unique mSORTE.

Proof. Let (Ỹ_X, Q_X, a_X) be a mSORTE and set Ŷ := Ỹ_X + a_X. We claim that Ŷ is an optimizer of the RHS of (25) and that Q_X is an optimizer of (26). Observe that Ŷ ∈ B_A ∩ L (using the fact that the set B is closed under truncation); by Lemma A.1 we then obtain the chain of (in)equalities (34)-(37), where (34) is a consequence of the Fenchel inequality (29), the expression in (35) is a reformulation of the one in the previous line, (36) follows from Lemma A.2, and (37) holds true because (Ỹ_X, Q_X, a_X) is a mSORTE and therefore (Ỹ_X, a_X) is an optimizer of the problem in (36). Notice that Theorem 4.3 guarantees that the inf in (34) is a min. We then deduce that all the above inequalities are equalities, that Ŷ is an optimizer of the RHS of (25), and that Q_X is an optimizer of (26). Now, take another mSORTE (Z̃_X, D_X, b_X) with E_{D^j_X}[Z̃^j_X] = 0 for every j = 1, ..., N. Arguing exactly as above for Ẑ := Z̃_X + b_X, we get that Ẑ is an optimizer of the RHS of (25) and D_X is an optimizer of (26). Theorem 4.2 Item 3 yields Ẑ = Ŷ and Q_X = D_X. Taking expectations componentwise, we get b^j_X = a^j_X for every j = 1, ..., N, which yields a_X = b_X. Finally, from the definitions of Ẑ and Ŷ we get Z̃_X = Ỹ_X.
Corollary 4.7. Under Assumption 3.8, if U is differentiable, there exists a unique optimum Ŷ for the problem Π^ran_A(X) in (2). Such an optimum is given by Ŷ = Ỹ_X + a_X, where (Ỹ_X, Q_X, a_X) is the unique mSORTE with E_{Q^j_X}[Ỹ^j_X] = 0 for every j = 1, ..., N and budget A.
Proof. Take Ŷ = Ỹ_X + a_X for (Ỹ_X, Q_X, a_X) as described in the statement (Theorems 4.2 and 4.6). By Lemma A.1, Π^ran_A(X) ≤ RHS of (25). Since Ŷ ∈ L ∩ B_A is an optimum for the RHS of (25) (see the proof of Theorem 4.6), existence follows. Uniqueness is a consequence of the strict concavity of U, by standard arguments. The arguments above also automatically show the link between the unique optimum for Π^ran_A(X) and the unique mSORTE for the budget A.

Explicit computation in an exponential framework
With a particular choice of the utility function U (see (38) below) we can explicitly compute the primal (Ŷ) and dual (Q̂) optima in Theorem 4.2. Consequently, the (unique) mSORTE can be explicitly computed. Recall from the proof of Theorem 4.5 that the mSORTE is produced by setting a^j := E_{Q̂^j}[Ŷ^j]. As we are able to find explicit formulas, we prefer to anticipate this example, even if the proof of Proposition 4.8 relies on two propositions in Section 7.
Proposition 4.8. Take α_1, ..., α_N > 0 and consider the exponential utility function U given in (38). Select B = C_R and define the quantities in (39)-(41). Then the dual optimum in the LHS of (26) is given by (39), and the primal optimum Ŷ in the LHS of (25) is given by (40).

Proof.
To begin with, one can easily check that the assumptions of Proposition 7.1 and Proposition 7.3 are satisfied, and so then is Standing Assumption I. Standing Assumption II is trivially satisfied once we take X ∈ M^Φ. The conjugate V, as well as its gradient, is computed explicitly in Lemma A.4. Our aim is to show that the primal and dual optima, whose existence and uniqueness are stated in Theorem 4.2 Items 2 and 3, are given by Ŷ, λ_X and Q̂ as in (40), (41) and (39), respectively. By direct computation one can then check the required identities, where in view of Proposition 7.1 and Proposition 7.3 we have the corresponding expressions. Recall now that U(−∇V(w)) = V(w) − ∑_{j=1}^N w_j (∂V/∂w_j)(w) for every w ∈ (0, +∞)^N (see [41], Chapter V). Then, given (81), we can write dQ̂/dP and λ_X as in (39) and (41), respectively. In view of Theorem 4.2, primal and dual optimality then follow once we check that Ŷ ∈ B_A ∩ M^Φ. Given Q̂ and λ_X, Ŷ can be directly computed as in (40), and we can check that ∑_{j=1}^N Ŷ^j = A, which completes the proof.
Remark 4.9. The key point in the proof of Proposition 4.8 above was guessing the particular form of Q. Such a guess is a consequence of imposing Σ_{j=1}^N Y j = A for the candidate optimum Y. Indeed, imposing this on the candidate optimum (yet to be found precisely at this initial stage), we get a representation involving some η ∈ R. Here we also used the explicit formula (80) for the gradient of V. This computation implies that Q needs to satisfy (39).
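The role of the exponential structure can be illustrated numerically. The sketch below does not use the paper's multivariate U of (38); it is a hypothetical stand-in with the simpler additive case U(x) = Σ_j −exp(−α_j x_j), for which the pointwise maximization over allocations with a fixed total is solved in closed form by the first-order conditions α_j exp(−α_j z_j) = λ:

```python
import math
import random

def exp_utility(z, alphas):
    """Aggregate utility sum_j -exp(-alpha_j * z_j)."""
    return sum(-math.exp(-a * x) for a, x in zip(alphas, z))

def borch_allocation(s, alphas):
    """Closed-form maximiser of exp_utility over {z : sum(z) = s}.

    The first-order conditions alpha_j * exp(-alpha_j z_j) = lam give
    z_j = beta_j * (log(alpha_j) - log(lam)) with beta_j = 1/alpha_j,
    and log(lam) is fixed by the budget constraint sum_j z_j = s.
    """
    betas = [1.0 / a for a in alphas]
    bar = sum(betas)
    log_lam = (sum(b * math.log(a) for a, b in zip(alphas, betas)) - s) / bar
    return [b * (math.log(a) - log_lam) for a, b in zip(alphas, betas)]

alphas = [1.0, 2.0, 4.0]
s = 1.5
z = borch_allocation(s, alphas)
assert abs(sum(z) - s) < 1e-12

# Concavity check: any budget-preserving perturbation can only lower utility.
random.seed(0)
best = exp_utility(z, alphas)
for _ in range(1000):
    d = [random.uniform(-1.0, 1.0) for _ in alphas]
    mean_d = sum(d) / len(d)
    d = [x - mean_d for x in d]  # keep the total equal to s
    assert exp_utility([zi + di for zi, di in zip(z, d)], alphas) <= best + 1e-12
```

The perturbation check relies only on concavity: any budget-preserving deviation from the first-order point cannot increase the aggregate utility.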

Dependence of the mSORTE on X
We study here the dependence of the mSORTE on the initial data X. We recall that both existence and uniqueness are guaranteed (see Theorems 4.5 and 4.6).
Proposition 4.10. Under the same assumptions as in Theorem 4.6 and for = 0 for every j = 1, . . ., N, the variables dQ X /dP and Proof. By Theorem 4.6 there exists a unique mSORTE. Recall the proof of Theorem 4.5, where we showed that the optimizers (Y, Q) in Theorem 4.2, together with Notice that in this specific case Y := e i 1 A − e j 1 A ∈ B ∩ M Φ for all i, j. The same argument used in the proof of [6] Proposition 4.18 can then be applied with obvious minor modifications (i.e. using V(·) in place of We stress the fact that, similarly to [6] Proposition 4.18, all the components of any Q ∈ Q B,V are equal.
We now focus on X + Y: consider Z := E P [X + Y | G] − X (the conditional expectation is taken componentwise). Then it is easy to check that We now prove that Z ∈ L = ∩ Q∈Q B,V L 1 (Q). This will imply that Observe first that for any given Q ≪ P, the measure Q G defined by dQ G /dP := E P To see this, recall that all the components of Q are equal, hence so are those of Q G. Moreover by the conditional Jensen inequality. Now, for any j = 1, . . ., N and Q ∈ Q B,V As a consequence, since by (42) L ⊆ L 1 (Q G ) and Y ∈ L, we get X + Y ∈ L 1 (Q), and the fact that Z ∈ L follows.
Finally, observe that, again by Jensen's inequality, Z is another optimum for the optimization problem in the RHS of (25). By strict concavity of U we then get Y = Z. Since X + Z is G-(essentially) measurable, so, clearly, is X + Y.
It is interesting to notice that this dependence on the componentwise sum of X also holds in the case of SORTE ([6] Section 4.5) and of Bühlmann's equilibrium (see [13] page 16, which partly inspired the proof above, and [10]).
Remark 4.11. In the case of clusters of agents, the above result can clearly be generalized (see [6] Remark 4.19).
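The message of Proposition 4.10 — the optimal total positions depend on X only through its componentwise sum — can be checked numerically in a toy sum-of-exponentials model (a hypothetical illustration with U(x) = Σ_j −exp(−α_j x_j), not the paper's multivariate U):

```python
import math

def optimal_total_positions(x, alphas):
    """X + Y for the pointwise problem max{sum_j -exp(-alpha_j p_j) : sum(p) = sum(x)}.

    The output depends on x only through s = sum(x), mirroring the
    dependence result of Proposition 4.10 in this toy model.
    """
    betas = [1.0 / a for a in alphas]
    bar = sum(betas)
    s = sum(x)
    log_lam = (sum(b * math.log(a) for a, b in zip(alphas, betas)) - s) / bar
    return [b * (math.log(a) - log_lam) for a, b in zip(alphas, betas)]

alphas = [1.0, 2.0, 3.0]
x1 = [0.5, -0.25, 1.25]
x2 = [1.0, 0.75, -0.25]  # a different vector with the same componentwise sum
assert abs(sum(x1) - sum(x2)) < 1e-12
p1 = optimal_total_positions(x1, alphas)
p2 = optimal_total_positions(x2, alphas)
assert all(abs(a - b) < 1e-12 for a, b in zip(p1, p2))
```

Two positions with equal aggregate wealth thus lead to identical post-transfer positions, as in the SORTE and Bühlmann cases recalled above.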

Systemic utility maximization and duality
In this Section we collect some remarks on, and properties of, the polar cone of B 0 ∩ M Φ, which will play an important role in what follows.
Remark 5.2. In the dual pair (M Φ , K Φ ) take the polar (B 0 ∩ M Φ ) 0 of B 0 ∩ M Φ. Since all (deterministic) vectors of the form e i − e j belong to B 0 ∩ M Φ, we have that for all Z ∈ (B 0 ∩ M Φ ) 0 and for all i, j ∈ {1, . . ., N } Recall the definition of Q provided in (22). We then see that (B 0 ∩ M Φ ) 0 is the cone generated by Q.
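A finite-dimensional analogue of Remark 5.2 can be tested directly: when a cone contains ±(e i − e j) for all i, j, every vector in its polar must have all components equal. The sketch below is purely illustrative (Ω reduced to a single point, so the spaces collapse to R^N):

```python
import itertools
import random

def in_polar(z, gens, tol=1e-9):
    """z lies in the polar of cone(gens) iff <z, v> <= 0 for every generator v."""
    return all(sum(zi * vi for zi, vi in zip(z, v)) <= tol for v in gens)

N = 4
# Generators e_i - e_j for ALL ordered pairs (i, j): the cone contains
# each such vector together with its opposite, as in Remark 5.2.
gens = []
for i, j in itertools.permutations(range(N), 2):
    v = [0.0] * N
    v[i], v[j] = 1.0, -1.0
    gens.append(v)

# Constant vectors (all components equal) lie in the polar ...
for c in (-2.0, 0.0, 3.5):
    assert in_polar([c] * N, gens)

# ... while any vector with two distinct components does not.
random.seed(1)
for _ in range(100):
    z = [random.uniform(-1.0, 1.0) for _ in range(N)]
    if max(z) - min(z) > 1e-6:
        assert not in_polar(z, gens)
```

In the paper's setting the same two-sided pairing argument forces the components of any Z in the polar to coincide, which is how the cone generated by Q arises.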
), so that the polars satisfy the opposite inclusion: In particular, take a > 0 for which (19) holds. Proof. Let Y ∈ B ∩ M Φ. Notice that the hypothesis R N + B = B implies that the vector Y 0 , defined by Theorem 5.5. Let C ⊆ M Φ be a convex cone with 0 ∈ C and e i − e j ∈ C for every i, j ∈ {1, . . ., N }.
Denote by C 0 the polar of the cone C in the dual pair (M Φ , K Φ ): Then the following holds: If either of the two expressions is strictly smaller than V (0) = sup R N U, then the condition λ ≥ 0 in (44) can be replaced by the condition λ > 0.
Proof. The proof can be obtained with minor and obvious modifications of the one in [6] Theorem A.3, by a suitable replacement of the underlying spaces. We also provide an analogous result when working with the pair ((L ∞ (P)) N , (L 1 (P)) N ) in place of (M Φ , K Φ ), which will be used in Section 6.2.
Theorem 5.6. Replacing M Φ with (L ∞ (P)) N and K Φ with (L 1 (P)) N in the statement of Theorem 5.5, all the claims in it remain valid.
Proof. As in Theorem 5.5, the proof can be obtained with minor and obvious modifications of the one in [6] Theorem A.3, using Theorem 4 of [40] in place of the Corollary on page 534 of [40].
6 Proof of Theorem 4.2
We now take care of the proof of Theorem 4.2 for the case A = 0. For the reader's convenience, we first provide a more detailed statement of all the results we prove, which also serves as a road map for the proof. It is easy to reconstruct the content of Theorem 4.2 from the statement below.
Furthermore, there exists a random vector Y ∈ (L 1 (P)) N such that:
2. Y satisfies: and for any optimizer
3. Y is the unique optimum of the following extended maximization problem:
Proof.
STEP 1: we show the equality chain in (45) and (46). We introduce for We begin by recognizing π 0 (X) as the LHS of (45) and observing that −∞ < π 0 (X) since B A ∩ M Φ ≠ ∅, and π 0 (X) < +∞ by the Fenchel inequality (combining Remark 5.3, which guarantees Q B,V ≠ ∅, Proposition 5.4 and the inequality chain in Remark 4.4). Again by Proposition 5.4 and Remark 4.4, it is enough to show that sup This equality follows by Theorem 5.5, taking C := B 0 ∩ M Φ and noticing that minima over Q can be substituted with minima over Q B,V since the expression in the LHS of (45) is finite. Observe at this point that, if any of the three expressions in (45) and (46) were strictly smaller than V(0) = sup x∈R N U(x), direct substitution of λ = 0 in the expression would give a contradiction, no matter what the optimal probability measure is. STEP 2: we show the existence of a vector Y as described in Items 2 and 3 of the statement, except for the first equality in (48). More precisely, we first (Step 2a) identify a natural candidate Y using a maximizing sequence, and we show that it satisfies Σ_{j=1}^N Y j = 0. Then (Step 2b) we show that this candidate satisfies the integrability conditions and inequalities in (47). Finally (Step 2c) we show the optimality stated in Item 3. The proof of the first equality in (48) is postponed to STEP 5.
Step 2a. First observe that X + Y ≥ −(|X| + |Y |) in the componentwise order, hence for Take now a maximizing sequence (Y n ) n in B 0 ∩ M Φ. W.l.o.g. we can assume that Σ_{j=1}^N Y j n = 0 for all n, P-a.s.,
since if this were not the case (i.e. if the inequality were strict) we could add an ε > 0 small enough to each component without decreasing the utility of the system or violating the constraint.

Now we apply Corollary A.7 with
We notice that by convexity the random vectors W H still belong to B 0 ∩ M Φ , and Y ∈ B 0 since B 0 is closed in probability (as so is B). Moreover, the second equality in (48) holds.
Step 2b. We first work on integrability. We proceed as follows: we first show that for any Q ∈ Q B,V we have Σ_{j=1}^N Y j dQ j /dP ∈ L 1 (P). Then we show that (Y)^− ∈ L Φ , and conclude the integrability conditions in (48). Let us begin with showing Σ_{j=1}^N Y j dQ j /dP ∈ L 1 (P). By the definition of V(·), we have This implies We prove integrability also for the positive part, assuming now Z = dQ/dP, Q ∈ Q B,V , and taking λ > 0 such that E P [V (λZ)] < +∞. By (51), W H → Y P-a.s. as H → ∞, so that Now since Also, by (52), sup Now use (18): sup We also have, Y 1 being the first element of the maximizing sequence, that inf by construction. Thus, continuing from (55), we get sup From (53), (54), (56) we conclude that To sum up, for Z ∈ Q B,V and λ s.t.
Next, we prove that (Y)^− ∈ L Φ. To see this, we observe from (17) that for ε > 0 sufficiently small By Fatou's Lemma we then conclude From this we infer (X + Y)^− ∈ L Φ.
We are now almost done showing integrability. Since by (21) L Φ = L Φ , we conclude that (X + Y)^− ∈ L Φ. By the very definition of K Φ and since Q B,V ⊆ K Φ , we have Σ_{j=1}^N (X j + Y j )^− dQ j /dP ∈ L 1 (P), which clearly implies (X j + Y j )^− dQ j /dP ∈ L 1 (P) for every j = 1, . . ., N. At the same time we observe that and all the terms in the RHS are in L 1 (P) (recall (57)). Thus, Σ_{j=1}^N (X j + Y j )^+ dQ j /dP ∈ L 1 (P) and (X j + Y j )^+ dQ j /dP ∈ L 1 (P) for every j = 1, . . ., N. We finally get that Y ∈ ∩ Q∈Q B,V L 1 (Q), and our integrability conditions in (47) are now proved. To conclude Step 2b, we need to show that Step 2c. Observe now that by concavity of U and the fact that (Y n h ) h is again a maximizing sequence. From the expression in Equation (59) we get that for every ε > 0, eventually in H, for H big enough (this sequence is bounded in (L 1 (P)) N by (51)), so that for every ε > 0 Clearly then Y satisfies Now recall that by (19), for some a > 0, b ∈ R, and since the RHS is in L 1 (P) we conclude that E P [U (X + Y )] < +∞. Hence: ≥ sup Eq. (45) = sup It is now enough to recall that, by (47), which we proved in Step 2b, Y satisfies the constraints in the RHS of (49): the optimality claimed in Item 3 then follows. STEP 3: we prove uniqueness of the optimum for the maximization problem in Item 3 and the condition λ > 0 for every optimum (λ, Q) of (46).
The uniqueness of the optimum follows from the strict concavity of U (see Standing Assumption I): if two distinct optima existed, any strict convex combination of the two would still satisfy the constraint and would produce a value for E P [U (X + ·)] strictly greater than the supremum. Recall now from STEP 1 that to prove the claimed λ > 0 it is enough to show that any of the three expressions in (45) and (46) is strictly smaller than sup x∈R N U(x). The property λ > 0 is easily obtained if sup z∈R N U(z) = +∞, since we proved that E P [U (X + Y )] < +∞ in Step 2c. Suppose instead that sup z∈R N U(z) < +∞ and notice that, setting which implies sup z∈R N U(z) = U(X + Y) P-a.s. In particular, since X + Y is finite almost surely, it would follow that U almost surely attains its supremum on some compact subset of R N , which is clearly a contradiction given that U is strictly increasing. STEP 4: we study a related optimization problem when a Q ∈ Q B,V is fixed. We show that for any fixed Q, π Q 0 (X) < +∞ follows from Remark 4.4. The equality between (61) and (62) follows from Theorem 5.5 and the fact that We stress the fact that (61) and (62) hold also if Assumption 3.8 is dropped. We observe that if (61) is strictly smaller than V(0) then the minimum in (62) can be taken over (0, +∞) in place of [0, +∞). Let now X ∈ M Φ be fixed and let π 0 (·), π Q 0 (·) be as in (50), (61) respectively. Then from STEP 1 together with (61) and (62) and whenever (λ, Q) is an optimum for (46), Q is an optimum for (63). STEP 5: we show that for any optimum (λ, Q) ∈ R + × Q B,V of (46) we have Q ∼ P and that the first equality in (48) holds.
We start by observing that for any optimal Q as in the claim we have, by STEP 4, The last equality in particular follows by observing that, by trivial set inclusions and the Fenchel inequality, We now prove that Q ∼ P, using an argument inspired by [27] Remark 3.32: if this were not the case then P(A k ) > 0, where we would get a contradiction arguing in the same way as in STEP 3. We move to Σ_{j=1}^N E Q j [Y j ] = 0: if this were not the case, by (47) we would have Σ_{j=1}^N E Q j [Y j ] < 0, so that adding an ε > 0 sufficiently small to each component of Y would give a vector still satisfying the constraints of the RHS of (64), but having a corresponding expected utility strictly greater than the supremum (U is strictly increasing, and Y is an optimum, as shown by (64)), which is a contradiction. STEP 6: we prove uniqueness of the optimum for (46) under the additional differentiability assumption. Take C := B 0 ∩ M Φ , and observe that (46) can be rewritten, by (43), as a minimization which, by strict convexity of V(·) ([41] Theorem 26.5), admits a unique optimum Z with 0 ≤ Z and Z ≠ 0.
We then get that, since λ = E P [Z] and d Q/dP = Z/E P [Z] (again by (43)), uniqueness of the optima in (46) follows.
We now consider the possibility of weakening the requirements of strict monotonicity and strict concavity of U. To do so, we introduce the additional condition B = −B. We stress that for B = C R both Assumption 3.8 and −B = B hold true. Corollary 6.2. Suppose that the function U : R N → R is (not necessarily strictly) concave and (not necessarily strictly) increasing with respect to the componentwise partial order. Suppose that there exist a multivariate Orlicz function Φ : R N + → R and a function f : R + → R such that (17) holds and L Φ = L Φ. Suppose that Assumption 3.8 holds and that −B = B. Then the following statements of Theorem 6.1 still hold: the equalities in (45)=(46), the fact that any optimum λ of (46) is strictly positive, all the claims in Item 2, and optimality (not uniqueness) in Item 3. Moreover, existence (not uniqueness) of a mSORTE holds.
Proof. Observe first that all the statements in Lemma 3.5 still hold, since strictness of concavity and monotonicity was not needed in their proofs. Looking back at the proof of Theorem 6.1, we observe that we used strict convexity and strict monotonicity only from STEP 5 on, and for the following reasons: on the one hand, to prove uniqueness of Y, λ, Q and the fact that Q ∼ P; on the other hand, to prove that Σ_{j=1}^N E Q j [Y j ] = 0. We now show that under the hypotheses of the Corollary the latter equality can indeed be established also when U is just increasing and concave, neither of the two being necessarily strict. More precisely, we show that for any optimum (λ, Q) of (46) we have Σ_{j=1}^N E Q j [Y j ] = 0 for the vector Y obtained in STEP 2.
To see this, it is indeed enough to observe that Y ∈ B 0 ⇔ −Y ∈ B 0 ; thus the sequence Y m we used in (76) satisfies E P [Σ_{j=1}^N Y j m dQ j /dP] = 0 for any Q ∈ Q B,V (recall the definition of Q B,V in (23)). Then the inequality in (77) is an equality, and our claim follows. To conclude, observe that from this point on literally the same arguments as in the proof of Theorem 6.1 yield the equalities in (45)=(46), the fact that any optimum λ of (46) is strictly positive, all the claims in Item 2 and optimality (not uniqueness) in Item 3. Also, Theorem 4.3 does not in fact require strict concavity or monotonicity in its proof, so it still holds if these assumptions are dropped. Finally, a counterpart to the existence in Theorem 4.5 can be obtained with the same argument: take indeed (Y, Q) as in Theorem 4.2 and adopt the notation from Lemma A.2. Then one can check that S Q (A) = H Q (A), and that the corresponding pair is optimal. To see this, it is enough to observe that the inequality S Q (A) ≥ H Q (A) can be obtained as in Lemma A.2, and that this pair is optimal for S Q (A) as in the proof of Theorem 4.5.

Replacing Assumption 3.8
In this Section we present a counterpart to some of our findings, with Assumption 3.8 replaced by the following one.
Notice in particular that the condition lim z→+∞ Φ j (z)/z = +∞ guarantees that Φ * j (z) < +∞ for every z ≥ 0. We now present two preliminary propositions before stating the main result of the section. In Orlicz space theory the well-known ∆ 2 condition on a Young function Φ : R → R guarantees that L Φ = M Φ. We say that Φ ∈ ∆ 2 if there exist y 0 ≥ 0 and K > 0 such that Φ(2y) ≤ KΦ(y) for all y with |y| ≥ y 0 .
First, we show how Assumption 6.3 is linked to the ∆ 2 condition for the conjugates of the Φ j . Proposition 6.4. Let Φ : R → R be a Young function differentiable on R \ {0} and let Φ * : R → R be its conjugate function. Then In particular, under Assumption 6.3 we have Φ * 1 , . . ., Φ * N ∈ ∆ 2 , which implies Proof. The equivalence of the two conditions in (65) can be checked along the lines of Theorem 2.3.3 in [39], observing that the argument still works in our slightly more general setup (use Proposition 2.2 of [39] in place of Theorem 2.2.(a) of [39]). We now prove the final claim. By Standing Assumption I and Remark 2.6, To conclude, under Assumption 6.3, Φ * j ∈ ∆ 2 by the previous part of this proof, which in turn implies L Φ * j = M Φ * j , j = 1, . . ., N.
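The ∆ 2 condition itself is easy to probe numerically: for a power Young function the ratio Φ(2y)/Φ(y) is constant, while for an exponential Young function it is unbounded, so the latter fails ∆ 2. A small sketch (the sample points are arbitrary, and a bounded sample is of course only consistent with, not a proof of, ∆ 2):

```python
import math

def delta2_ratio(phi, y_values):
    """Largest sampled ratio phi(2y)/phi(y); bounded iff consistent with Delta_2."""
    return max(phi(2.0 * y) / phi(y) for y in y_values)

ys = [0.5 * k for k in range(1, 200)]  # sample points y = 0.5, 1.0, ..., 99.5

power = lambda y: abs(y) ** 3            # satisfies Delta_2: phi(2y) = 8 * phi(y)
expo = lambda y: math.exp(abs(y)) - 1.0  # fails Delta_2: the ratio blows up

assert abs(delta2_ratio(power, ys) - 8.0) < 1e-9
assert delta2_ratio(expo, ys) > 1e6
```

This is exactly the dichotomy behind Proposition 6.4: polynomial-growth conjugates satisfy ∆ 2 (so L Φ* = M Φ*), while exponential growth destroys it.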
We also need a sequential w * -compactness result, see [21] Proposition 2.6.10, partly inspired by [20] Chapter II, proof of Theorem 24. A similar result is stated in [38], proof of Theorem 1, with a more technical (even though shorter) proof. For these reasons we omit the proof. Proposition 6.5. Assume that Φ, Φ * : R + → R are (univariate) conjugate Young functions, both everywhere finite valued. Then the balls in L Φ , endowed with the Orlicz norm, are σ(L Φ , M Φ * ) sequentially compact.
We are now ready to state and prove the main result of this Section.
Theorem 6.6. Theorems 4.2, 4.3, 4.5 and 6.1 hold true with Assumption 3.8 replaced by Assumption 6.3. Proof. We do not need to start from scratch: in fact, most of the proof of Theorem 6.1 carries over with no modifications. The only point which needed closedness under truncation was proving that If we show this in an alternative way, all the rest can be done exactly in the same way. We observe that by (58) we have for each j = 1, . . ., N, sup and by convexity of Φ j we have sup Now we apply Propositions 6.4 and 6.5. Given the sequences ((X j + W j H )^−) H , j = 1, . . ., N, a diagonalization argument yields a common subsequence such that ((X j + W j H )^−) H converges in σ(L Φ j , M Φ * j ) on L Φ j for every j. Call this limit Z j . The almost sure convergence (X j + W j H )^− → (X j + Y j )^− P-a.s.
implies Z = (X + Y)^−. Indeed, if this were not the case, assume without loss of generality that P(Z j > (X j + Y j )^−) > 0 for some j. On a measurable subset D of the event {Z j > (X j + Y j )^−} with P(D) > 0 the convergence is uniform (by the Egoroff Theorem, Theorem 10.38 in [2]). Consequently, by the Dominated Convergence Theorem together with σ(L Φ j (F), M Φ * j (F)) convergence and the fact that By Fatou's Lemma and where we used Equation (66) and the fact that Σ_{j=1}^N W j H is a numeric sequence converging (P-a.s.) to Σ_{j=1}^N Y j in order to pass from the lim inf to the sum of the limits. As a consequence Σ_{j=1}^N We get (X + Y)^± ∈ L 1 (Q), hence Y ∈ L 1 (Q), and rearranging terms in (67) In particular, since Y ∈ B 0 , we conclude that As mentioned before, all the remaining parts of the proof are identical to the ones under Assumption 3.8. Hence Theorem 6.1 holds true. Now we get counterparts to Theorems 4.2 and 4.3 with the exact same arguments, recalling for the latter that (61) and (62) still hold if Assumption 3.8 is dropped (see the proof of Theorem 6.1). Again using the same arguments as in Theorem 4.5, we then get its counterpart under the alternative Assumption 6.3.

Working on (L ∞ (P)) N
The following result is a counterpart to Theorem 6.1 Item 1 when working with the dual system ((L ∞ (P)) N , (L 1 (P)) N ) in place of (M Φ , K Φ ). Theorem 6.7. Under Assumption 3.8 the following holds: Proof. To check (68) we can apply the same argument used in Step 1 of the proof of Theorem 6.1, replacing Theorem 5.5 with Theorem 5.6. What is left to prove then is that for To see this, observe that as a consequence of Lemma A.5 we have N ⊆ K Φ . From this, by closedness under truncation, we have for any Proof. By Theorem 6.1 Item 1 and Theorem 6.7, both the LHS and the RHS of (69) are equal to the minimax expression

General case: total wealth A ∈ R
In this section we extend the previous results to cover the case in which the total wealth A might not be equal to 0. For A ∈ R and Q ∈ Q B,V recall the definition of π A (X) in (50) and introduce, coherently with (61), It is possible to reduce the maximization problem expressed by π A (X) (and similarly π Q A (X)) where the last line holds since R N + B = B under Standing Assumption II. We then recognize that π A (X) is just π 0 (·) with a different initial point, (X + a) in place of X.
The same technique adopted above can be exploited to show that for any a ∈ R N with The argument above shows how to generalize Theorem 6.1, Theorem 6.7 and Corollary 6.8 to cover the case A ≠ 0, exploiting the same results with X + a in place of X. Thus the statements of Theorem 6.1, Theorem 6.7 and Corollary 6.8 remain true replacing 0, B 0 with A, B A respectively, and Equation (62) (similarly for (44), (46), (68)) with hence by the Dominated Convergence Theorem where the inequality for the LHS comes from the fact that From the boundedness of we conclude that sup n Σ_{j=1}^N E P [(Z j n )^−] < +∞. Select a, b as in (19). Then we have where the coefficient b ε is the one in (18). Then Γ ε ≥ 0 and by Fatou's Lemma we have 2ε As a consequence Since the term multiplying ε is finite by hypothesis and the inequality holds for all ε > 0, we conclude that E P [U (Z)] ≥ B. where in (⋆) we exploited the minimax Theorem 2.4 in [36]. Now observe that, since w ≠ 0, if the infimum over h were attained at h = 0 we would get V (w) = +∞. Hence, recalling that w ∈ dom(V),

A.3 Results on multivariate Orlicz spaces
Proof of Proposition 2.4. We show that K Φ is a subspace of the topological dual of L Φ and is a subset of (L 1 (P)) N . For Z ∈ K Φ consider the well defined linear map φ : L Φ → L 1 (P), X ↦ Σ_{j=1}^N X j Z j . Suppose X n → X in L Φ and φ(X n ) → W; then we can extract a subsequence (X n k ) converging almost surely to X, since convergence in the Luxemburg norm implies convergence in probability (Lemma 2.2 Item 5). It is then clear that φ(X n k ) = Σ_{j=1}^N X j n k Z j → k Σ_{j=1}^N X j Z j = W P-a.s.; thus the graph of φ is closed in L Φ × L 1 (P) (endowed with the product topology). By the Closed Graph Theorem ([2] Theorem 5.20) the map is then continuous, thus any vector in K Φ identifies a continuous linear functional on L Φ . Finally, since [sign(Z j )] N j=1 ∈ (L ∞ (P)) N ⊆ M Φ ⊆ L Φ , we get Σ_{j=1}^N |Z j | ∈ L 1 (P), yielding K Φ ⊆ (L 1 (P)) N .
Proof of Proposition 2.5 Item 1. We show that for any extended real valued vector Z ∈ L 0 ((Ω, F, P); [−∞, +∞] N ) we have sup and that, moreover, Argue as in Proposition 2.2.8 of [22]: take any X ∈ L Φ and Z ∈ (L 0 (P)) N and assume w.l.o.g. that both are componentwise nonnegative (multiplying by signum functions does not affect Luxemburg norms, by definition). Take sequences of simple functions (Y j n ) n , j = 1, . . ., N, each converging to X j monotonically from below. Clearly ‖Y n ‖ Φ ≤ ‖X‖ Φ for each n, and by the Monotone Convergence Theorem This implies that sup The converse inequality is evident, so that (82) follows. Now suppose Observe (by using X j sgn(Z j ) in place of X j in the RHS below) that sup Since also for ‖X‖ Φ = 0 we have X = 0, and as a consequence ‖X j ‖ Φ j = 0 for j = 1, . . ., N, we have ‖X j ‖ Φ j ≤ ‖X‖ Φ , j = 1, . . ., N.
Moreover, for X ≠ 0 set λ := max j ‖X j ‖ Φ j . Then Hence for X ≠ 0 we have ‖X‖ Φ ≤ N max j ‖X j ‖ Φ j , and the same trivially holds for X = 0. In general then Now the inequalities (12) follow from the inequalities (84) and (85), and the claims are proved.
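The Luxemburg norm used throughout this appendix, ‖X‖ Φ = inf{λ > 0 : E P [Φ(|X|/λ)] ≤ 1}, can be computed by bisection for a discrete random variable, since the modular λ ↦ E P [Φ(|X|/λ)] is nonincreasing in λ. For Φ(y) = y^p the result must agree with the usual L^p norm, which gives a simple sanity check (the discrete distribution below is an arbitrary example):

```python
def luxemburg_norm(values, probs, phi, tol=1e-12):
    """inf{lam > 0 : E[phi(|X|/lam)] <= 1} for a discrete X, by bisection."""
    def modular(lam):
        return sum(p * phi(abs(v) / lam) for v, p in zip(values, probs))

    lo, hi = tol, 1.0
    while modular(hi) > 1.0:  # bracket the root from above
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if modular(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi

# Sanity check: for phi(y) = y^p the Luxemburg norm is the usual L^p norm.
values, probs, p = [1.0, 2.0, 3.0], [0.5, 0.3, 0.2], 2.0
lp_norm = sum(pr * v ** p for v, pr in zip(values, probs)) ** (1.0 / p)
lux = luxemburg_norm(values, probs, lambda y: y ** p)
assert abs(lux - lp_norm) < 1e-6
```

The equivalence-of-norms inequalities (12) compare exactly this quantity, computed componentwise with the Φ j , against the multivariate norm ‖·‖ Φ.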

A.4 On Komlós Theorem
We now recall the original Komlós Theorem. Theorem A.6 (Komlós). Let (f n ) n ⊆ L 1 ((Ω, F, P); R) be a sequence with bounded L 1 norms. Then there exist a subsequence (f n k ) k and a g ∈ L 1 such that for any further subsequence the Cesàro means satisfy (1/N) Σ_{i≤N} f n k i → g P-a.s. as N → +∞.
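In the special case of an i.i.d. integrable sequence, the subsequence in Komlós' Theorem can be taken to be the sequence itself and the limit g is the constant E[f 1]: the Cesàro means converge a.s. by the strong law of large numbers. A quick Monte Carlo sanity check of this special case (sample size and seed chosen arbitrarily):

```python
import random

# For an i.i.d. integrable sequence the Cesaro means already converge a.s.
# (strong law of large numbers), the simplest instance of Komlos' Theorem.
random.seed(42)
n = 200_000
sample = [random.uniform(-1.0, 1.0) for _ in range(n)]  # E[f] = 0

cesaro_mean = sum(sample) / n
assert abs(cesaro_mean) < 0.01  # close to the a.s. limit g = 0
```

The content of the theorem is of course much stronger: no independence is assumed, only L 1-boundedness, at the price of passing to a subsequence.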
The convergence above holds P-a.s. and is dominated: argue as in Lemma A.1. Thus for any Y ∈ B 0 ∩ M Φ we have, by the Dominated Convergence Theorem, that Σ_{j=1}^N E Q j [Y j ] ≤ 0. This completes the proof that N = Q B,V . Corollary 6.8. Under Assumption 3.8 we have sup

Furthermore, any optimum (Y, a) for S Q (A) produces an optimum (Y, a) for H Q (A) by setting Y = Y − a, and any optimum (Y, a) for H Q (A) produces an optimum (Y, a) for S Q (A) by setting Y = Y + a.

−he^{−α j x j} − x j w j , where in the last equality we set 2 log(h) = −x. Now, recalling that sup x∈R (−e^{−γx} − xw) = (w/γ) log(w/γ) − w/γ, we get from the equation above that (79) holds. Expressions (80) and (81) can then be obtained by direct computation.