Equilibrium in risk-sharing games

The large majority of risk-sharing transactions involve few agents, each of whom can heavily influence the structure and the prices of securities. In this paper, we propose a game where agents’ strategic sets consist of all possible sharing securities and pricing kernels that are consistent with Arrow–Debreu sharing rules. First, it is shown that agents’ best response problems have unique solutions. The risk-sharing Nash equilibrium admits a finite-dimensional characterisation, and it is proved to exist for an arbitrary number of agents and to be unique in the two-agent game. In equilibrium, agents declare beliefs on future random outcomes different from their actual probability assessments, and the risk-sharing securities are endogenously bounded, implying (among other things) loss of efficiency. In addition, an analysis regarding extremely risk-tolerant agents indicates that they profit more from the Nash risk-sharing equilibrium than from the Arrow–Debreu one.


Introduction
The structure of securities that optimally allocate risky positions under heterogeneous beliefs of agents has been a subject of ongoing research. Starting from the seminal works of [10,5,13,12], the existence and characterisation of welfare risk sharing of random positions in a variety of models has been extensively studied; see, among others, [7,19,1,16]. On the other hand, discrepancies amongst agents regarding their assessments on the probability of future random outcomes reinforce the existence of mutually beneficial trading opportunities (see e.g. [28,29,8]). However, market imperfections, such as asymmetric information, transaction costs and oligopolies, spur agents to act strategically and prevent markets from reaching maximum efficiency. In the financial risk-sharing literature, the impact of asymmetric or private information has been addressed under both static and dynamic models (see, among others, [6,21,22,23,31]). The importance of frictions like transaction costs has been highlighted in [3]; see also [14].
The present work aims to contribute to the risk-sharing literature by focusing on how over-the-counter (OTC) transactions with a small number of agents motivate strategic behaviour. The vast majority of real-world sharing instances involve only a few participants, each of whom may influence the way heterogeneous risks and beliefs are going to be allocated. (The seminal papers [20] and [30] highlight such transactions.) As an example, two financial institutions with possibly different beliefs, and in possession of portfolios with random future payoffs, may negotiate and design innovative asset-backed securities that mutually share their defaultable assets. Broader discussion on risk-sharing innovative securities is given in the classical reference [4] and in [27]; a list of widely used such securities is provided in [17].
As has been extensively pointed out in the literature (see e.g. [29] and [26]), it is reasonable, and perhaps even necessary, to assume that agents have heterogeneous beliefs, which we identify with subjective probability measures on the considered state space. In fact, differences in subjective beliefs do not necessarily stem from asymmetric information; agents usually apply different tools or models for the analysis and interpretation of common sets of information.
Formally, a risk-sharing transaction consists of security payoffs and their prices, and since only few institutions (typically, two) are involved, it is natural to assume that no social planner for the transaction exists and that the equilibrium valuation and payoffs will result as the outcome of a symmetric game played among the participating institutions. Since institutions' portfolios are (at least approximately) known, the main ingredient of risk-sharing transactions leaving room for strategic behaviour is the beliefs that each institution reports for the sharing. We propose a novel way of modelling such strategic actions where the agents' strategic set consists of the beliefs that each one chooses to declare (as opposed to their actual one) aiming to maximise individual utility, and the induced game leads to an equilibrium sharing. Our main insights are summarised below.

Main contributions
The payoff and valuation of the risk-sharing securities are endogenously derived as an outcome of agents' strategic behaviour under constant absolute risk-aversion (CARA) preferences. To the best of our knowledge, this work is the first instance that models the way agents choose the beliefs on future uncertain events that they are going to declare to their counterparties and studies whether such strategic behaviour results in equilibrium. Our results demonstrate how the game leads to risk-sharing inefficiency and security mispricing, both of which are quantitatively characterised in analytic forms. More importantly, it is shown that equilibrium securities have endogenous limited liability, a feature that, while usually suboptimal, is in fact observed in practice.
Although the agents' set of strategic choices is infinite-dimensional, one of our main contributions is to show that a Nash equilibrium admits a finite-dimensional characterisation with the dimensionality being one less than the number of participating agents. Not only does our characterisation provide a concrete algorithm for calculating the equilibrium transaction, it also allows us to prove the existence of Nash equilibrium for an arbitrary number of players. In the important case of two participating agents, we even show that a Nash equilibrium is unique. It has to be pointed out that the aforementioned results are obtained under complete generality on the probability space and the involved random payoffs; no assumption beyond CARA preferences is imposed. Whereas a certain qualitative analysis could potentially be carried out without the latter assumption on the entropic form of agent utilities, the advantage of CARA preferences utilised in the present paper is that they also allow a substantial quantitative analysis, as workable expressions are obtained for the Nash equilibrium.
Our notion of Nash risk-sharing equilibrium highlights the importance of agents' risk-tolerance level. More precisely, one of the main findings of this work is that agents with sufficiently low risk-aversion will prefer the risk-sharing game rather than the outcome of an Arrow-Debreu equilibrium that would have resulted from absence of strategic behaviour. Interestingly, the result is valid irrespective of their actual risky position or their subjective beliefs. It follows that even risk-averse agents, as long as their risk-aversion is sufficiently low, will prefer risk-sharing markets that are thin (i.e., where participating agents are few and have the power to influence the transaction), resulting in aggregate loss of risk-sharing welfare.

Discussion
Our model is introduced in Sect. 2 and consists of a two-period financial economy with uncertainty, containing possibly infinitely many states of the world. Such infinite-dimensionality is essential in our framework since in general the risks that agents encounter do not have a priori bounds, and we do not wish to enforce any restrictive assumption on the shape of the probability distribution or the support of agents' positions. Let us also note that even if the analysis was carried out in the simpler setup of a finite state space, there would not be any significant simplification in the mathematical treatment.
In our economy, we consider a finite number of agents, each of whom has subjective beliefs (probability measure) about the events at the time of uncertainty resolution. We also allow agents to be endowed with a (cumulative, up to the point of uncertainty resolution) random endowment.
Agents seek to increase their expected utilities through trading securities that allocate the discrepancies of their beliefs and risky exposures in an optimal way. The possible disagreement on agents' beliefs is assumed on the whole probability space, and not only on the laws of the to-be-shared risky positions. Such potential disagreement is important: it alone can give rise to mutually beneficial trading opportunities, even if agents have no risky endowments to share, by actually designing securities with payoffs written on the events where probability assessments are different.
Each sharing rule consists of the security payoff that each agent is going to obtain and a valuation measure under which all imaginable securities are priced. The sharing rules that efficiently allocate any submitted discrepancy of beliefs and risky exposures are the ones stemming from an Arrow-Debreu equilibrium. (Under CARA preferences, the optimal sharing rules have been extensively studied; see e.g. [10,13,7].) In principle, participating agents would opt for the highest possible aggregate benefit from the risk-sharing transaction, as this would increase their chance for personal gain. However, in the absence of a social planner that could potentially impose a truth-telling mechanism, it is reasonable to assume that agents do not negotiate the rules that will allocate the submitted endowments and beliefs. In fact, we assume that agents adopt the specific sharing rules that are consistent with the ones resulting from an Arrow-Debreu equilibrium, treating reported beliefs as actual ones, since these sharing rules are the most natural and universally regarded as efficient.
Agreement on the structure of risk-sharing securities is also consistent with what is observed in many OTC transactions involving security design, where the contracts signed by institutions are standardised and adjusted according to required inputs (in this case, the agents' reported beliefs). Such pre-agreement on sharing rules reduces negotiation time and hence also the related transaction costs. Examples are asset-backed securities, whose payoffs are backed by issuers' random incomes, traded among banks and investors in a standardised form, as well as credit derivatives, where portfolios of defaultable assets are allocated among financial institutions and investors.
Combinations of strategic and competitive stages are widely used in the literature of financial innovation and risk-sharing under a variety of different guises. The majority of this literature distinguishes participants among designers (or issuers) of securities and investors who trade them. In [15], a security-design game is played among exchanges, each aiming to maximise internal transaction volume; while security design throughout exchanges is the outcome of non-competitive equilibrium, investors trade securities in a competitive manner. Similarly, in [9], a Nash equilibrium determines not only the designed securities among financial intermediaries, but also the bid-ask spread that price-taking investors have to face in the second (perfect competition) stage of market equilibrium. In [14], it is entrepreneurs who strategically design securities that investors with non-securitised hedging need competitively trade. In [24], the role of security-designers is played by arbitrageurs who issue innovated securities in segmented markets. A mixture of strategic and competitive stages has also been used in models with asymmetric information. For instance, in [11], a two-stage equilibrium game is used to model security design among agents with private information regarding their effort. In a first stage, agents strategically issue novel financial securities; in the second stage, equilibrium on the issued securities is formed competitively.
Our framework models oligopolistic OTC security design, where participants are not distinguished regarding their information or ability to influence market equilibrium. Agents mutually agree to apply Arrow-Debreu sharing rules since these optimally allocate whatever is submitted for sharing, and also strategically choose the inputs of the sharing rules (their beliefs, in particular).
Given the agreed-upon rules, agents propose accordingly consistent securities and valuation measures, aiming to maximise their own expected utility. As explicitly explained in the text, proposing risk-sharing securities and a valuation kernel is in fact equivalent to agents reporting beliefs to be considered for sharing. Knowledge of the probability assessments of the counterparties may result in a readjustment of the probability measure an agent is going to report for the transaction. In effect, agents form a game by responding to other agents' submitted probability measures; the fixed point of this game (if it exists) is called a Nash risk-sharing equilibrium.
The first step of analysing Nash risk-sharing equilibria is to address the well-posedness of an agent's best response problem, which is the purpose of Sect. 3. Agents have a motive to exploit other agents' reported beliefs and hedging needs and drive the sharing transaction to maximise their own utility. Each agent's strategic choice set consists of all possible probability measures (equivalent to a baseline measure), and the optimal one is called the best probability response. Although this is a highly nontrivial infinite-dimensional maximisation problem, we use a bare-hands approach to establish that it admits a unique solution. It is shown that the beliefs that an agent declares coincide with the actual ones only in the special case where the agent's position cannot be improved by any transaction with other agents. By resorting to examples, one may gain more intuition on how future risk appears under the lens of agents' reported beliefs. Consider for instance two financial institutions adapting distinct models for estimating the likelihood of the involved risks. The sharing contract designed by the institutions will result from individual estimation of the joint distribution of the to-be-shared risky portfolios. According to the best probability response procedure, each institution tends to use a less favourable assessment for its own portfolio than the one based on its actual beliefs, and understates the downside risk of its counterparty's portfolio. Example 3.8 contains an illustration of such a case.
An important consequence of applying the best probability response is that the corresponding security that the agent wishes to acquire has bounded liability. If only one agent applies the proposed strategic behaviour, then the received security payoff is bounded below (but not necessarily bounded above). In fact, the arguments and results of the best response problem receive extra attention and discussion in the paper since they demonstrate in particular the value of the proposed strategic behaviour in terms of utility increase. This situation applies to markets where one large institution trades with a number of small agents, each of whom has negligible market power.
A Nash-type game occurs when all agents apply the best probability response strategy. In Sect. 4, we characterise a Nash equilibrium as the solution of a certain finite-dimensional problem. Based on this characterisation, we establish the existence of a Nash risk-sharing equilibrium for an arbitrary (finite) number of agents. In the special case of two-agent games, the Nash equilibrium is shown to be unique. The finite-dimensional characterisation of Nash equilibria also provides an algorithm that can be used to approximate the Nash equilibrium transaction by standard numerical procedures, such as Monte Carlo simulation.
Having characterised Nash equilibria, we are able to further perform a joint qualitative and quantitative analysis. Not only do we verify the expected fact that in any nontrivial case, Nash risk-sharing securities are different from the Arrow-Debreu ones, but we also provide analytic formulas for their shapes. Since the securities that correspond to the best probability response are bounded from below, the application of such a strategy by all agents yields that the Nash risk-sharing market-clearing securities are also bounded from above. This comes in stark contrast to the Arrow-Debreu equilibrium and implies in particular an important loss of efficiency. We measure the risk-sharing inefficiency that is caused by the game via the difference between the aggregate monetary utilities at Arrow-Debreu and Nash equilibria and provide an analytic expression for it. (Note that inefficient allocation of risk in symmetric-information thin market models may also occur when securities are exogenously given; see e.g. [25]. When securities are endogenously designed, [14] highlights that imperfect competition among issuers results in risk-sharing inefficiency, even if securities are traded among perfectly competitive investors.) One may wonder whether the agents' subjective beliefs revealed in a Nash equilibrium are far from their actual subjective probability measures, which would be unappealing from a modelling viewpoint. Extreme departures from actual beliefs are endogenously excluded in our model, as the distance of reported beliefs from the truth in a Nash equilibrium admits a priori bounds. Even though agents are free to choose any probability measure that supposedly represents their beliefs in a risk-sharing transaction, and they do indeed end up choosing probability measures different from their actual ones, this departure cannot be arbitrarily large if the market is to reach equilibrium.
Turning our attention to Nash-equilibrium valuation, we show that the pricing probability measure can be written as a convex combination of the individual agents' marginal indifference valuation measures. The weights of this convex combination depend on agents' relative risk-tolerance coefficients, and as it turns out, the Nash-equilibrium valuation measure is closer to the marginal valuation measure of the more risk-averse agents. This fact highlights the importance of risk-tolerance coefficients in assessing the gain or loss of utility for individual agents in a Nash risk-sharing equilibrium; in fact, it implies that more risk-tolerant agents tend to get better cash compensation as a result of the Nash game than what they would get in an Arrow-Debreu equilibrium.
Inspired by the involvement of the risk-tolerance coefficients in the agents' utility gain or loss, in Sect. 5, we focus on induced Arrow-Debreu and Nash equilibria of two-agent games when one of the agents' preferences approaches risk-neutrality. We first establish that both equilibria converge to well-defined limits. Notably, it is shown that an extremely risk-tolerant agent drives the market to the same equilibrium regardless of whether the other agent acts strategically or plainly submits true subjective beliefs. In other words, extremely risk-tolerant agents tend to dominate the risk-sharing transaction. The study of limiting equilibria indicates that although there is a loss of aggregate utility when agents act strategically, there is always a utility gain in the Nash transaction compared to an Arrow-Debreu equilibrium for the extremely risk-tolerant agent, regardless of the risk-tolerance level and subjective beliefs of the other agent. Extremely risk-tolerant agents are willing to undertake more risk in exchange for better cash compensation; under the risk-sharing game, they respond to the risk-averse agent's hedging needs and beliefs by driving the market to a higher price for the security they short. This implies that agents with sufficiently high risk-tolerance, although still not risk-neutral, will prefer thin markets. The case where both acting agents uniformly approach risk-neutrality is also treated, where it is shown that the limiting Nash equilibrium securities equal half of the limiting Arrow-Debreu equilibrium securities, hinting towards the fact that a Nash risk-sharing equilibrium results in loss of trading volume.
For convenience of reading, all the proofs of the paper are placed in the Appendix.
2 Optimal sharing of risk

Notation
The symbols N and R are used to denote the sets of all natural and real numbers, respectively. We have chosen to use the symbol R to denote (reported, or revealed) probabilities.
In all that follows, random variables are defined on a probability space (Ω, F, P). We stress that no finiteness restriction is enforced on the state space Ω. We use P for the class of all probabilities that are equivalent to the baseline probability P. For Q ∈ P, we use E_Q to denote the expectation under Q. The space L^0 consists of all (equivalence classes, modulo almost sure equality) finite-valued random variables with the topology of convergence in probability. This topology does not depend on the representative probability from P, and L^0 may be infinite-dimensional. For Q ∈ P, L^1(Q) consists of all X ∈ L^0 with E_Q[|X|] < ∞. We use L^∞ for the subset of L^0 consisting of essentially bounded random variables.
Whenever Q_1 ∈ P and Q_2 ∈ P, dQ_2/dQ_1 denotes the (strictly positive) density of Q_2 with respect to Q_1. The relative entropy of Q_2 ∈ P with respect to Q_1 ∈ P is defined as
H(Q_2 | Q_1) := E_{Q_2}[log(dQ_2/dQ_1)] = E_{Q_1}[(dQ_2/dQ_1) log(dQ_2/dQ_1)] ∈ [0, ∞].
For X ∈ L^0 and Y ∈ L^0, we write X ∼ Y when there exists c ∈ R such that Y = X + c. In particular, this notion of equivalence will ease notation on probability densities: for Q_1 ∈ P, Q_2 ∈ P and Y ∈ L^0, we write log(dQ_2/dQ_1) ∼ Y to mean that exp(Y) ∈ L^1(Q_1) and dQ_2/dQ_1 = exp(Y)/E_{Q_1}[exp(Y)].
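On a finite state space, the conventions above reduce to elementary computations with probability vectors. The following sketch (plain Python; the vectors and the random variable Y are invented for the example) computes a relative entropy and recovers a density from a log-density via the normalisation implicit in the ∼ convention.

```python
import math

# Finite-state illustration of the conventions above; the probability
# vectors and the random variable Y are invented for the example.
q1 = [0.5, 0.3, 0.2]   # plays the role of Q_1
q2 = [0.4, 0.4, 0.2]   # plays the role of Q_2

# Relative entropy H(Q_2 | Q_1) = E_{Q_2}[log(dQ_2/dQ_1)]
H = sum(b * math.log(b / a) for a, b in zip(q1, q2))
assert H >= 0.0        # relative entropy is nonnegative (Jensen)

# The "~" convention: log(dQ_2/dQ_1) ~ Y means the density of Q_2 is
# exp(Y) renormalised to integrate to one under Q_1.
Y = [0.3, -0.1, 0.4]
Z = sum(a * math.exp(y) for a, y in zip(q1, Y))           # E_{Q_1}[exp(Y)]
q2_from_Y = [a * math.exp(y) / Z for a, y in zip(q1, Y)]
assert abs(sum(q2_from_Y) - 1.0) < 1e-12
```

Note that shifting Y by a constant leaves q2_from_Y unchanged, which is exactly why the ∼ equivalence (equality up to a cash constant) is the natural notion here.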

Agents and preferences
We consider a market with a single future period, at which point all uncertainty is resolved. In this market, there are n + 1 economic agents, where n ∈ N; for concreteness, define the index set I = {0, . . . , n}. Agents derive utility only from the consumption of a numéraire in the future, and all considered security payoffs are expressed in units of this numéraire. In particular, future deterministic amounts have the same present value for the agents. The preference structure of agent i ∈ I over future random outcomes is numerically represented via the concave exponential utility functional
U_i(X) := −δ_i log E_{P_i}[exp(−X/δ_i)], X ∈ L^0, (2.1)
where δ_i ∈ (0, ∞) is the agent's risk-tolerance, and P_i ∈ P represents the agent's subjective beliefs. For any X ∈ L^0, agent i ∈ I is indifferent between the cash amount U_i(X) and the corresponding risky position X; in other words, U_i(X) is the certainty equivalent of X ∈ L^0 for agent i ∈ I. The functional −U_i is an entropic risk measure in the terminology of the convex risk measure literature; see, for example, [18, Chap. 4].
Define the aggregate risk-tolerance δ := Σ_{i∈I} δ_i and the relative risk-tolerance λ_i := δ_i/δ for all i ∈ I. Note that Σ_{i∈I} λ_i = 1. Finally, set δ_{−i} := δ − δ_i and λ_{−i} := 1 − λ_i for all i ∈ I.
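For intuition on the utility functional (2.1), the entropic certainty equivalent and the role of the risk-tolerance parameter can be illustrated on a finite state space. The sketch below (Python; the payoff, beliefs and δ values are invented for illustration) checks two basic properties: the certainty equivalent never exceeds the expected payoff, approaching it as δ grows, and adding cash shifts it one-for-one.

```python
import math

# Entropic certainty equivalent U(X) = -delta log E_P[exp(-X/delta)] on a
# finite state space; beliefs, payoff and delta values are illustrative.
def certainty_equivalent(payoff, beliefs, delta):
    return -delta * math.log(sum(p * math.exp(-x / delta)
                                 for p, x in zip(beliefs, payoff)))

P = [0.25, 0.5, 0.25]
X = [10.0, 0.0, -10.0]

u_low  = certainty_equivalent(X, P, 1.0)     # low risk tolerance
u_high = certainty_equivalent(X, P, 100.0)   # nearly risk-neutral
mean = sum(p * x for p, x in zip(P, X))

# Concavity: U(X) <= E_P[X], with the gap shrinking as delta grows.
assert u_low <= u_high <= mean + 1e-12
# Cash invariance: U(X + c) = U(X) + c, so U(X) is a certainty equivalent.
assert abs(certainty_equivalent([x + 5 for x in X], P, 1.0) - (u_low + 5)) < 1e-9
```

The cash-invariance property is what makes the ∼ equivalence (payoffs identified up to constants) natural throughout the paper.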

Subjective probabilities and endowments
Preference structures that are numerically represented via (2.1) are rich enough to include the possibility of already existing portfolios of random positions for acting agents. To wit, suppose that P̄_i ∈ P are the actual subjective beliefs of agent i ∈ I, who also carries a risky future payoff in units of the numéraire. Following standard terminology, we call this cumulative (up to the point of resolution of uncertainty) payoff random endowment and denote it by E_i ∈ L^0. In this setup, adding on top of E_i a payoff X ∈ L^0 for agent i ∈ I results in a numerical utility equal to
−δ_i log E_{P̄_i}[exp(−(E_i + X)/δ_i)] = Ū_i(E_i) + U_i(X) for all X ∈ L^0,
where Ū_i denotes the functional (2.1) under P̄_i, and U_i is defined as in (2.1) with subjective probability P_i ∈ P given by log(dP_i/dP̄_i) ∼ −E_i/δ_i. Hence, hereafter, the probability P_i is understood to incorporate any possible random endowment of agent i ∈ I, and the utility is measured in relative terms as the difference from the baseline level U_i(0).
Taking the above discussion into account, we stress that agents are completely characterised by their risk-tolerance level and (endowment-modified) subjective beliefs, that is, by the collection of pairs (δ i , P i ) i∈I . In other aspects, and unless otherwise noted, agents are considered symmetric (regarding information, bargaining power, cost of risk-sharing participation, etc.).

Geometric-mean probability
We introduce a method that produces a geometric mean of probabilities, which will play a central role in our discussion. Fix (R_i)_{i∈I} ∈ P^I. In view of Hölder's inequality, we have Π_{i∈I} (dR_i/dP)^{λ_i} ∈ L^1(P). Therefore, we may define Q ∈ P via log(dQ/dP) ∼ Σ_{i∈I} λ_i log(dR_i/dP). Since Σ_{i∈I} λ_i log(dR_i/dQ) ∼ 0, we are allowed to formally write
log dQ ∼ Σ_{i∈I} λ_i log dR_i. (2.2)
The fact that dR_i/dQ ∈ L^1(Q) implies log^+(dR_i/dQ) ∈ L^1(Q), and Jensen's inequality gives E_Q[log(dR_i/dQ)] ≤ 0 for all i ∈ I. Note that (2.2) implies the existence of c ∈ R such that Σ_{i∈I} λ_i log(dR_i/dQ) = c; therefore, we in fact have E_Q[log(dR_i/dQ)] ∈ (−∞, 0] for all i ∈ I. In particular, log(dR_i/dQ) ∈ L^1(Q), that is, H(Q | R_i) = −E_Q[log(dR_i/dQ)] < ∞, for all i ∈ I.
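On a finite state space, the geometric-mean probability of (2.2) is just a normalised weighted geometric mean of probability vectors. A minimal sketch (Python; the two probability vectors and weights are illustrative) that also verifies E_Q[log(dR_i/dQ)] ≤ 0 for each i:

```python
import math

# The geometric-mean probability of (2.2) on a finite state space: q_k is
# proportional to prod_i r_i[k]**lam[i]; the vectors below are illustrative.
def geometric_mean_prob(densities, lam):
    K = len(densities[0])
    raw = [math.prod(r[k] ** l for r, l in zip(densities, lam))
           for k in range(K)]
    return [x / sum(raw) for x in raw]

R = [[0.6, 0.3, 0.1],
     [0.2, 0.3, 0.5]]
lam = [0.5, 0.5]
Q = geometric_mean_prob(R, lam)

assert abs(sum(Q) - 1.0) < 1e-12
# E_Q[log(dR_i/dQ)] = -H(Q | R_i) <= 0 for every i, as in the text.
assert all(sum(q * math.log(rk / q) for q, rk in zip(Q, r)) <= 1e-12 for r in R)
```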

Securities and valuation
Discrepancies amongst agents' preferences provide an incentive to design securities, the trading of which could be mutually beneficial in terms of risk reduction. In principle, the ability to design and trade securities in any desirable way essentially leads to a complete market. In such a market, transactions amongst agents are characterised by a valuation measure (that assigns prices to all imaginable securities) and a collection of securities that will actually be traded. Since all future payoffs are measured under the same numéraire, (no-arbitrage) valuation corresponds to taking expectations with respect to probabilities in P. Given a valuation measure, agents agree on a collection (C_i)_{i∈I} ∈ (L^0)^I of zero-value securities satisfying the market-clearing condition Σ_{i∈I} C_i = 0. The security that agent i ∈ I takes a long position in as part of the transaction is C_i. As mentioned in the introductory section, our model can find applications in OTC markets. For instance, the design of asset-backed securities involves only a small number of financial institutions; in this case, P_i stands for the subjective beliefs of each institution i ∈ I and, in view of the discussion of Sect. 2.3, further incorporates any existing portfolios that back the security payoffs. In order to share their risky positions, the institutions agree on prices of future random payoffs and on the securities they are going to exchange. Other examples are the market of innovated credit derivatives or the market of asset swaps that involve exchanging a random payoff and a fixed payment.

Arrow-Debreu equilibrium
In the absence of any kind of strategic behaviour in designing securities, the agreed-upon transaction amongst agents will actually form an Arrow-Debreu equilibrium. The valuation measure will determine both trading and indifference prices, and securities will be constructed in a way that maximises each agent's respective utility.
Under risk preferences modelled by (2.1), a unique Arrow-Debreu equilibrium can be explicitly obtained. In other guises, Theorem 2.2 that follows has appeared in many works; see, for instance, [10,13,12]. Its proof is based on standard arguments; however, for reasons of completeness, we provide a short argument in Sect. A.1.

Theorem 2.2
In the above setting, there exists a unique Arrow-Debreu equilibrium (Q*, (C*_i)_{i∈I}). In fact, the valuation measure Q* ∈ P is such that
log dQ* ∼ Σ_{i∈I} λ_i log dP_i, (2.3)
and the equilibrium market-clearing securities (C*_i)_{i∈I} ∈ (L^0)^I are given by
C*_i = δ_i log(dP_i/dQ*) + δ_i H(Q* | P_i), i ∈ I, (2.4)
where the fact that H(Q* | P_i) < ∞ for all i ∈ I follows from Sect. 2.4.
The securities that agents obtain at an Arrow-Debreu equilibrium described in (2.4) provide higher payoffs on events where their individual subjective probabilities are higher than the "geometric mean" probability Q* of (2.3). In other words, discrepancies in beliefs result in allocations where agents receive higher payoffs on their corresponding relatively more likely events.
Let us note an interesting decomposition for the securities traded at an Arrow-Debreu equilibrium. To wit, in view of the string of equalities
U_i(δ_i log(dP_i/dQ*)) = −δ_i log E_{P_i}[exp(−log(dP_i/dQ*))] = −δ_i log E_{P_i}[dQ*/dP_i] = 0,
agent i is indifferent between no trading and the first "random" part δ_i log(dP_i/dQ*) of the security C*_i. The second "cash" part δ_i H(Q* | P_i) of C*_i is always nonnegative and represents the monetary gain of agent i resulting from the Arrow-Debreu transaction. After this transaction, the position of agent i has certainty equivalent
u*_i := U_i(C*_i) = δ_i H(Q* | P_i), i ∈ I.
The aggregate monetary value resulting from the Arrow-Debreu transaction equals
u* := Σ_{i∈I} u*_i = Σ_{i∈I} δ_i H(Q* | P_i).
Remark 2.3 In the setting and notation of Sect. 2.3, let (E_i)_{i∈I} be the collection of agents' random endowments. Furthermore, suppose that the agents' actual subjective beliefs coincide; for concreteness, assume that they all equal P. In this case, setting E := Σ_{i∈I} E_i, the equilibrium valuation measure from (2.3) satisfies log(dQ*/dP) ∼ −E/δ, and the equilibrium securities from (2.4) satisfy
C*_i ∼ λ_i E − E_i, i ∈ I.
In particular, note the well-known fact that the payoff of each shared security is a linear combination of the agents' random endowments.
For C_i ∈ L^1(Q*), an application of Jensen's inequality gives
U_i(C_i) ≤ E_{Q*}[C_i] + δ_i H(Q* | P_i),
with equality if and only if C_i ∼ C*_i. The last inequality shows that C*_i is indeed the optimally designed security for agent i ∈ I under the valuation measure Q*. Furthermore, for any collection (C_i)_{i∈I} with Σ_{i∈I} C_i = 0 and C_i ∈ L^1(Q*) for all i ∈ I, it follows that Σ_{i∈I} U_i(C_i) ≤ Σ_{i∈I} U_i(C*_i) = u*. A standard argument using the monotone convergence theorem extends the previous inequality to
Σ_{i∈I} U_i(C_i) ≤ u* whenever (C_i)_{i∈I} ∈ (L^0)^I satisfies Σ_{i∈I} C_i = 0,
with equality if and only if C_i ∼ C*_i for all i ∈ I. Therefore, (C*_i)_{i∈I} is a maximiser of the functional Σ_{i∈I} U_i(C_i) over all (C_i)_{i∈I} ∈ (L^0)^I with Σ_{i∈I} C_i = 0. In fact, the collection of all such maximisers is (z_i + C*_i)_{i∈I}, where (z_i)_{i∈I} ∈ R^I is such that Σ_{i∈I} z_i = 0. It can be shown that all Pareto-optimal securities are exactly of this form; see e.g. [19, Thm. 3.1] for a more general result. Because of this Pareto optimality, the collection (Q*, (C*_i)_{i∈I}) usually comes under the appellation of (welfare) optimal securities and valuation measure, respectively.
Of course, not every Pareto-optimal allocation (z_i + C*_i)_{i∈I}, where (z_i)_{i∈I} is such that Σ_{i∈I} z_i = 0, is economically reasonable. A minimal "fairness" requirement that has to be imposed is that the position of each agent after the transaction is at least as good as the initial state. Since the utility comes only at the terminal time, we obtain the requirement z_i ≥ −u*_i for all i ∈ I. Whereas there may be many choices satisfying the latter requirement in general, the choice z_i = 0 of Theorem 2.2 has the cleanest economic interpretation in terms of a complete financial market equilibrium.
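The explicit form of the Arrow-Debreu equilibrium lends itself to direct computation on a finite state space. The following sketch (Python; two agents with invented beliefs and risk tolerances) builds Q* as the λ-weighted geometric mean of the P_i and the securities C*_i = δ_i log(dP_i/dQ*) + δ_i H(Q* | P_i), then verifies market clearing, zero pricing under Q*, and the certainty equivalent u*_i = δ_i H(Q* | P_i).

```python
import math

# Sketch of the Arrow-Debreu equilibrium of Theorem 2.2 on a three-state
# space: Q* is the lambda-weighted geometric mean of the beliefs P_i, and
# C*_i = delta_i log(dP_i/dQ*) + delta_i H(Q* | P_i).  The beliefs and
# risk tolerances below are invented for illustration.
deltas = [1.0, 3.0]
beliefs = [[0.6, 0.3, 0.1],
           [0.2, 0.3, 0.5]]
lam = [d / sum(deltas) for d in deltas]

raw = [math.prod(p[k] ** l for p, l in zip(beliefs, lam)) for k in range(3)]
Q = [x / sum(raw) for x in raw]                 # valuation measure Q*

def entropy(q, p):
    """Relative entropy H(Q | P) = E_Q[log(dQ/dP)]."""
    return sum(qk * math.log(qk / pk) for qk, pk in zip(q, p))

C = [[d * (math.log(p[k] / Q[k]) + entropy(Q, p)) for k in range(3)]
     for d, p in zip(deltas, beliefs)]

def U(payoff, p, d):
    """Entropic certainty equivalent of (2.1)."""
    return -d * math.log(sum(pk * math.exp(-x / d) for pk, x in zip(p, payoff)))

# Market clearing, zero price under Q*, and u*_i = delta_i H(Q* | P_i).
assert all(abs(sum(c[k] for c in C)) < 1e-9 for k in range(3))
assert all(abs(sum(qk * ck for qk, ck in zip(Q, c))) < 1e-9 for c in C)
assert all(abs(U(c, p, d) - d * entropy(Q, p)) < 1e-9
           for c, p, d in zip(C, beliefs, deltas))
```

Note how the more risk-tolerant agent (larger δ_i) pulls Q* towards its own beliefs through the weight λ_i, in line with the discussion following (2.3).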

Remark 2.5
If we ignore potential transaction costs, the cases where an agent has no motive to enter a risk-sharing transaction are extremely rare. Indeed, agent i will not take part in the Arrow-Debreu transaction if and only if C*_i = 0, which happens exactly when P_i = Q*. In particular, agents will already be in an Arrow-Debreu equilibrium, and no transaction will take place, if and only if they all share the same subjective beliefs.

Strategic behaviour in risk sharing
In the Arrow-Debreu setting, the resulting equilibrium is based on the assumption that agents do not apply any kind of strategic behaviour. However, in the majority of practical risk-sharing situations, the modelling assumption of absence of strategic behaviour is unreasonable, resulting amongst other things in overestimation of market efficiency. When securities are negotiated among agents, their design and valuation depend not only on the agents' existing risky portfolios, but also on the beliefs about future outcomes that they report for sharing. In general, agents have an incentive to report subjective beliefs that may differ from their true views about future uncertainty; in fact, the beliefs an agent finds optimal to report also depend on the subjective beliefs reported by the other parties.
As discussed in Sect. 2.6, for a given set of agents' subjective beliefs, the optimal sharing rules are governed by the mechanism resulting in an Arrow-Debreu equilibrium, as these are the rules that efficiently allocate discrepancies of risks and beliefs among agents. It is then reasonable to assume that in the absence of a social planner, agents adopt this sharing mechanism for any collection (R_i)_{i∈I} ∈ P^I of subjective probabilities they choose to report; see also the related discussion in the introductory section. More precisely, in accordance with (2.3) and (2.4), the agreed-upon valuation measure Q ∈ P is such that log dQ ∼ Σ_{i∈I} λ_i log dR_i, and the collection of securities that agents will trade is given by
C_i = δ_i log(dR_i/dQ) + δ_i H(Q | R_i), i ∈ I.
Given the sharing rules consistent with an Arrow-Debreu equilibrium, agents respond to subjective beliefs that other agents have reported, with the goal of maximising their individual utility. In this way, a game is formed with the probability family P being the agents' set of strategic choices. The subject of the present Sect. 3 is to analyse the behaviour of individual agents, establish their best response problem, and show its well-posedness. The definition and analysis of the Nash risk-sharing equilibrium is taken up in Sect. 4.

Best response
We now describe how agents respond to the reported subjective probability assessments of their counterparties. For the purposes of Sect. 3.2, we fix an agent i ∈ I and a collection R_{-i} := (R_j)_{j∈I\{i}} ∈ P^{I\{i}} of reported probabilities of the remaining agents, and seek the subjective probability that is going to be submitted by agent i ∈ I. According to the rules described in Sect. 3.1, a reported probability R_i ∈ P from agent i ∈ I leads to entering a long position on the security C_i with C_i/δ_i ∼ log(dR_i/dQ^{(R_{-i},R_i)}) and E_{Q^{(R_{-i},R_i)}}[C_i] = 0. By reporting subjective beliefs R_i ∈ P, agent i ∈ I also indirectly affects the geometric-mean valuation probability Q^{(R_{-i},R_i)}, resulting in a highly nonlinear overall effect on the security C_i. With the above understanding, and given R_{-i} ∈ P^{I\{i}}, the response function of agent i ∈ I is defined as the certainty equivalent of the resulting position after the transaction; the finiteness H(Q^{(R_{-i},R_i)} | R_i) < ∞, which follows from the discussion of Sect. 2.4, ensures that it is well defined. The problem of agent i is to report the subjective probability that maximises the certainty equivalent of the resulting position after the transaction, that is, to identify a maximiser R_i^r in (3.1). Any R_i^r ∈ P satisfying (3.1) is called a best probability response. In contrast to the majority of the related literature, the agent's strategic set of choices in our model may be infinite-dimensional. This generalisation is important from a methodological viewpoint; for example, in the setting of Sect. 2.3, it allows for random endowments with infinite support, such as ones with a Gaussian distribution or arbitrarily fat tails, a substantial feature in the modelling of risk.

Remark 3.1
The best response problem (3.1) imposes no constraints on the shape of the agent's reported subjective probability, as long as it belongs to P. In principle, it is possible for agents to report subjective views that are far from their actual ones. Such severe departures may be deemed unrealistic and are undesirable from a modelling point of view. However, as will be argued in Sect. 4.3.2, extreme responses are endogenously excluded in our setup.
We show in the sequel (Theorem 3.7) that best responses in (3.1) exist and are unique. We start with a result that gives necessary and sufficient conditions for a best probability response.
The proof of Proposition 3.2 is given in Sect. A.2. The necessity of the stated conditions for a best response follows from the first-order optimality conditions. Establishing the sufficiency of the stated conditions is nontrivial due to the fact that it is far from clear (and in fact not known to us) whether the response function is concave.

Remark 3.3
In the context of Proposition 3.2, rewriting (3.2) and using the fact that C_i^r has zero price under the corresponding valuation measure, we obtain that R_i^r = P_i holds if and only if C_i^r = 0. In words, the best probability response and the actual subjective probability of an agent agree if and only if the agent has no incentive to participate in the risk-sharing transaction, given the reported subjective beliefs of the other agents. Hence, in any nontrivial case, agents' strategic behaviour implies a departure from reporting their true beliefs.

Remark 3.4
A message from (3.4) is that according to their best response process, agents report beliefs that understate (resp. overstate) the probability of their payoff being high (resp. low) relative to their true beliefs. Such behaviour is clearly driven by the desired post-transaction utility increase. More importantly, and in sharp contrast to the securities (C_i^*)_{i∈I} formed in an Arrow–Debreu equilibrium, the security that agent i ∈ I wishes to enter, after taking into account the aggregate reported beliefs of the rest and declaring the subjective probability R_i^r, has limited liability, as it is bounded from below by the constant −δ_{-i}.

Remark 3.5
Additional insight regarding best probability responses may be obtained from the discussion of Sect. 2.3, where P_i incorporates the random endowment E_i ∈ L^0 of agent i ∈ I in the sense that log(dP_i/dP̄_i) ∼ −E_i/δ_i, where P̄_i denotes the subjective probability of agent i. It then follows from (3.4) that, when agents share their risky endowment, they tend to put more weight on the probability of the downside of their risky exposure than on the upside. For an illustrative situation, see Example 3.8 later on.

Remark 3.6
In the course of the proof of Proposition 3.2, the constant in the equivalence (3.2) is explicitly computed; see (A.3). This constant has a particularly nice economic interpretation in the case of two agents. To wit, let I = {0, 1} and suppose that R_1 ∈ P is given. Then, from the vantage point of agent 0, (3.2) becomes an equation for C_0^r involving a constant ζ_0 ∈ ℝ, where U_1(·; R_1) denotes the utility functional of a "fictitious" agent with representative pair (δ_1, R_1). In words, ζ_0 is the post-transaction difference, denominated in units of risk tolerance, between the utility of agent 0 and the utility of agent 1 (who obtains the security −C_0^r), provided that the latter utility is measured with respect to the reported, as opposed to the actual subjective, beliefs of agent 1. In particular, when agent 1 does not behave strategically, in which case R_1 = P_1, ζ_0 is the difference of the agents' actual post-transaction utilities.

Proposition 3.2 sets a roadmap for proving existence and uniqueness in the best response problem via a one-dimensional parameterisation. Indeed, in accordance with (3.2), to find a best response, we consider for each z_i ∈ ℝ the unique random variable C_i(z_i) that satisfies the corresponding equation, and then determine the value of z_i consistent with (3.4), obtaining the unique best response of agent i ∈ I given R_{-i}. The technical details of the proof of Theorem 3.7 below are given in Sect. A.3.

The value of strategic behaviour
The increase in agents' utility caused by following the best probability response procedure can be regarded as a measure of the value of the strategic behaviour induced by problem (3.1). Consider, for example, the case where only a single agent, say 0 ∈ I, applies the best probability response strategy and the rest of the agents report their true beliefs, that is, R_j = P_j for j ∈ I \ {0}. As mentioned in the introductory section, this is a potential model of a transaction where only agent 0 possesses meaningful market power. Based on the results of Sect. 3.2, we may calculate the gains, relative to the Arrow–Debreu transaction, that agent 0 obtains by incorporating such strategic behaviour (which, among other things, implies limited liability of the security the agent takes a long position in). The main insights are illustrated in the following two-agent example.

Fig. 1
The solid black line is the pdf of the endowments E_0 and E_1 under the agents' common subjective probability measure, whereas the other curves illustrate the pdf of E_0 (dashed blue) and E_1 (dotted red) under the best probability response of agent 0. In this example, σ² = 1 and ρ = −0.5

Fig. 2
The solid black line is the pdf of the initial position E_0, the dashed blue line illustrates the pdf of the position E_0 + C_0^*, and the dotted red line is the pdf of the position E_0 + C_0^r, all under the common subjective probability measure. In this example, σ² = 1 and ρ = −0.5

Example 3.8 Consider two agents sharing risk under a common subjective probability measure. The agents are exposed to random endowments E_0 and E_1 that (under the common probability measure) have Gaussian laws with mean zero and common variance σ² > 0, and ρ ∈ [−1, 1] denotes the correlation coefficient of E_0 and E_1. In this case, it is straightforward to check that C_0^* = (E_1 − E_0)/2; therefore, after the Arrow–Debreu transaction, the position of agent 0 is E_0 + C_0^* = (E_0 + E_1)/2. On the other hand, if agent 1 reports true beliefs, then by (3.2) the security C_0^r corresponding to the best probability response of agent 0 satisfies 2C_0^r + log(1 + C_0^r) = ζ_0 + E_1 for an appropriate ζ_0 ∈ ℝ that is coupled with C_0^r. For σ² = 1 and ρ = −0.5, a straightforward Monte Carlo simulation allows the numerical approximation of the probability density function (pdf) of E_0 and E_1 under the best response probability R_0^r, illustrated in Fig. 1. As is apparent, the best probability response drives agent 0 to overstate the downside risk of E_0 and understate the downside risk of E_1.
The effect of such strategic behaviour is depicted in Fig. 2, where we compare the probability density functions of the positions of agent 0 under (i) no trading, (ii) the Arrow–Debreu transaction, and (iii) the transaction following the application of the best response strategic behaviour. Compared to the Arrow–Debreu position, the lower bound of the security C_0^r guarantees a heavier right tail of the agent's position after the best response transaction.

Nash risk-sharing equilibrium
We now consider the situation where every single agent follows the same strategic behaviour indicated by the best response problem of Sect. 3. As previously mentioned, sharing securities are designed following the sharing rules determined by Theorem 2.2 for any collection of reported subjective views. With the well-posedness of the best response problem established, we are now ready to examine whether the game among agents has an equilibrium point. In view of the analysis of Sect. 3, individual agents have a motive to declare subjective beliefs different from the actual ones. (In particular, in the setting of Sect. 2.3, agents tend to overstate the probability of their random endowments taking low values.) Each agent acts according to the best response mechanism as in (3.1), given what other agents have reported as subjective beliefs. In a sense, the best response mechanism indicates a negotiation scheme, the fixed point (if one exists) of which produces the Nash equilibrium valuation measure and risk-sharing securities.
Let us emphasise that the actual subjective beliefs of individual players are not necessarily assumed to be private knowledge; rather, it is assumed here that agents have agreed upon the rules that associate any reported subjective beliefs to securities and prices, even if the reported beliefs are not the actual ones. In fact, even if subjective beliefs constitute private knowledge initially, some information about them will necessarily be revealed in the negotiation process that leads to a Nash equilibrium.
There are two relevant points to consider here. Firstly, it is unreasonable for participants to attempt to invalidate the negotiation process based on the claim that other parties do not report their true beliefs, as the latter is after all a subjective matter. This point is reinforced by the a posteriori fact that reported subjective beliefs in a Nash equilibrium do not deviate far from the true ones, as was pointed out in Remark 3.1 and is further elaborated in Sect. 4.3.2. Secondly, it is exactly the limited number of participants, rather than private or asymmetric information, that gives rise to strategic behaviour: agents recognise their ability to influence the market, since securities and valuation are outputs of the collectively reported beliefs. Even with the understanding that other agents will not report true beliefs and that the negotiation will not produce an Arrow–Debreu equilibrium, agents still want to reach a Nash equilibrium, as they will improve their initial position. In fact, transactions with a limited number of participants typically equilibrate far from their competitive equivalents, as has also been highlighted in other models of thin financial markets with a symmetric information structure, like the ones in [14] and [25]; see also the related discussion in the introductory section.

Revealed subjective beliefs
Considering the model from a more pragmatic point of view, we may argue that agents do not actually report subjective beliefs, but rather agree on a valuation measure Q ∈ P and zero-price sharing securities (C i ) i∈I that clear the market. However, there is a one-to-one correspondence between reporting subjective beliefs and proposing a valuation measure and securities, as we describe below.
From the discussion of Sect. 3.1, a collection of subjective probabilities (R_i)_{i∈I} gives rise to a valuation measure Q ∈ P such that log dQ ∼ ∑_{i∈I} λ_i log dR_i and a collection (C_i)_{i∈I} of securities with C_i/δ_i ∼ log(dR_i/dQ) for all i ∈ I; the latter relation is then a necessary condition that an arbitrary collection of market-clearing securities (C_i)_{i∈I} must satisfy with respect to an arbitrary valuation probability Q ∈ P in order to be consistent with the aforementioned risk-sharing mechanism. The previous observations lead to a definition: for Q ∈ P, we define the class C_Q of securities that clear the market and are consistent with the valuation measure Q. Note that all expectations of C_i under Q in the definition of C_Q are well defined; indeed, the fact that exp(C_i/δ_i) ∈ L^1(Q) implies that the positive part of each C_i is Q-integrable. Given a valuation measure Q ∈ P and securities (C_i)_{i∈I} ∈ C_Q, we may define a collection (R_i)_{i∈I} ∈ P^I via log(dR_i/dQ) ∼ C_i/δ_i for i ∈ I and note that this is the unique collection in P^I that results in the valuation measure Q and the securities (C_i)_{i∈I} ∈ C_Q. In this way, the probabilities (R_i)_{i∈I} ∈ P^I can be considered as revealed by the valuation measure Q ∈ P and the securities (C_i)_{i∈I} ∈ C_Q. Hence, agents proposing risk-sharing securities and a valuation measure is equivalent to them reporting probability beliefs in the transaction. This viewpoint justifies and underlies Definition 4.1 below: the objects of a Nash equilibrium are the valuation measure and the designed securities, in consistency with the definition of an Arrow–Debreu equilibrium.
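The revealed-beliefs map can be made concrete on a finite state space; the following is a minimal sketch in which the state probabilities, risk tolerances and the security are purely illustrative assumptions.

```python
import numpy as np

# Finite-state illustration; all numerical values here are assumptions.
q = np.array([0.25, 0.25, 0.25, 0.25])   # valuation measure Q on four states
delta = np.array([1.0, 2.0])             # risk tolerances (delta_0, delta_1)
c0 = np.array([0.3, -0.1, 0.2, -0.4])
c0 = c0 - q @ c0                         # normalise agent 0's security to zero Q-price
C = np.vstack([c0, -c0])                 # market clearing: C_0 + C_1 = 0

# Revealed beliefs: dR_i/dQ is proportional to exp(C_i / delta_i)
dens = np.exp(C / delta[:, None])
R = q * dens / (dens @ q)[:, None]       # normalise each R_i to a probability

assert np.allclose(R.sum(axis=1), 1.0)   # each R_i is a probability measure
# Recover each C_i from (Q, R_i): delta_i * log(dR_i/dQ), centred to zero Q-price
X = delta[:, None] * np.log(R / q)
assert np.allclose(X - (X @ q)[:, None], C)
```

The final assertions illustrate the one-to-one correspondence discussed above: the securities determine the revealed beliefs, and conversely.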

Nash equilibrium and its characterisation
Following classic literature, we give the formal definition of a Nash risk-sharing equilibrium.

Definition 4.1
The collection (Q, (C_i)_{i∈I}) ∈ P × (L^0)^I is called a Nash equilibrium if (C_i)_{i∈I} ∈ C_Q and, with log(dR_i/dQ) ∼ C_i/δ_i for all i ∈ I denoting the corresponding revealed subjective beliefs, each R_i is a best probability response, in the sense of (3.1), to R_{-i} for every i ∈ I.

An application of Proposition 3.2 results in the characterisation Theorem 4.2, the proof of which is given in Sect. A.4. For this, we need to introduce the n-dimensional Euclidean space
Δ^I := {z = (z_i)_{i∈I} ∈ ℝ^I : ∑_{i∈I} z_i = 0}.    (4.1)

Theorem 4.2
The collection (Q, (C_i)_{i∈I}) ∈ P × (L^0)^I is a Nash equilibrium if and only if the following three conditions hold: (N1) C_i > −δ_{-i} for all i ∈ I, and there exists z = (z_i)_{i∈I} ∈ Δ^I such that (4.2) is satisfied; (N2) the valuation measure Q is given by (4.3); and (N3) E_Q[C_i] = 0 for all i ∈ I.

Remark 4.3
Suppose that the agents' preferences and risk exposures are such that no trade occurs in an Arrow–Debreu equilibrium, which happens when all P_i are the same (and equal to, say, P); see Remark 2.5. In this case, Q^* = P and C_i^* = 0 for all i ∈ I. It is then straightforward from Theorem 4.2 to see that a Nash equilibrium is also given by Q = P and C_i = 0 (and z_i = 0) for all i ∈ I. In fact, as argued in Sect. 4.3.4, this is the unique Nash equilibrium in this case. Conversely, suppose that a Nash equilibrium is given by Q = P and C_i = 0 for all i ∈ I. Then (4.3) shows that Q^* = Q = P, and (4.2) implies that C_i^* ∼ −z_i ∼ 0, which means that C_i^* = 0 for all i ∈ I. In words, the Nash risk-sharing equilibrium involves no risk transfer if and only if the agents are already in a Pareto-optimal situation.
In the important case of two acting agents, since C_0 = −C_1, simple algebra applied to (4.2) shows that a Nash equilibrium risk-sharing security C_0 is such that −δ_1 < C_0 < δ_0 and satisfies an equation derived from (4.2). In Theorem 4.7, the existence of a unique Nash equilibrium for the two-agent case will be shown. Furthermore, a one-dimensional root-finding algorithm presented in Sect. 4.4 allows us to calculate the Nash equilibrium and then to calculate and compare the final position of each individual agent. Consider, for instance, Example 3.8 and its symmetric situation illustrated in Fig. 2, where the limited liability of the security C_0^r implies less variability and a flatter right tail for the agent's position. Under the Nash equilibrium, as argued in Sect. 4.3.1, the security C_0 is bounded from above as well, which implies that the probability density function of the agent's final position is shifted to the left. This fact is illustrated, in the setting of Example 3.8, in Fig. 3.
In contrast to the above symmetric case, it is not necessarily true that all agents suffer a loss of utility under Nash equilibrium risk sharing. As we shall see in Sect. 5, for agents with sufficiently large risk tolerance, the negotiation game results in higher utility compared to the one gained through an Arrow–Debreu equilibrium.

Within equilibrium
According to Theorem 4.7, Nash equilibria in the sense of Definition 4.1 always exist. Throughout Sect. 4.3, we assume that (Q , (C i ) i∈I ) is a Nash equilibrium and provide a discussion on certain of its aspects, based on the characterisation in Theorem 4.2.

Endogenous bounds on traded securities
As pointed out in Remark 3.4, the security that each agent enters as a result of the best response procedure is bounded from below. When all participating agents follow the same strategic behaviour, Nash equilibrium securities are bounded from above as well. Indeed, since the market clears, the security that one agent takes a long position in is shorted by the rest of the agents, who similarly intend to bound their liabilities. Mathematically, since C_i > −δ_{-i} for all i ∈ I and ∑_{i∈I} C_i = 0, it also follows that C_i = −∑_{j∈I\{i}} C_j < ∑_{j∈I\{i}} δ_{-j} = (n − 1)δ + δ_i for all i ∈ I. Therefore, a consequence of the agents' strategic behaviour is that Nash risk-sharing securities are endogenously bounded. This is in sharp contrast to the Arrow–Debreu equilibrium of (2.4), where the risk transfer may involve securities with unbounded payoffs. An immediate consequence of the bounds on the securities is that the potential gain from the Nash risk-sharing transaction is also endogenously bounded. Naturally, the resulting endogenous bounds indicate how the game among agents restricts the risk-sharing transaction, which in turn may be the source of a large loss of efficiency. The next example illustrates such an inefficiency in a simple symmetric setting; in Fig. 3, the loss of utility in the two-agent Example 3.8 is visualised.
Example 4.4 Let X ∈ L^0 have the standard (zero mean, unit standard deviation) Gaussian law under the baseline probability P. For β ∈ ℝ, define P^β ∈ P via log(dP^β/dP) ∼ βX; under P^β, X has a Gaussian law with mean β and unit standard deviation. Fix β > 0 and set P_0 := P^β and P_1 := P^{−β}. In this case, it is straightforward to compute that C_0^* = βX = −C_1^*. It also follows that u_0^* = β²/2 = u_1^*. If β is large, then the discrepancy between the agents' beliefs results in large monetary profits to both after the Arrow–Debreu transaction. On the other hand, as established in Theorem 4.7, in the case of two agents there exists a unique Nash equilibrium. In fact, in this symmetric case we have −1 < C_0 < 1, and C_0 can be computed explicitly (see also (4.4) later). The loss of efficiency caused by the game becomes greater with increasing values of β > 0. In fact, if β converges to infinity, then it can be shown that C_0 converges to sign(X) = I_{X>0} − I_{X<0}; furthermore, both U_0(C_0) and U_1(C_1) converge to 1, which demonstrates the tremendous inefficiency of the Nash equilibrium transaction compared to the Arrow–Debreu one.
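The Arrow–Debreu quantities in this example are easy to check by simulation. A minimal sketch follows, assuming unit risk tolerances so that u_0^* equals the relative entropy H(Q^*|P_0) with Q^* = P, in line with the explicit computations above.

```python
import numpy as np

# Monte Carlo check of u*_0 = beta^2 / 2 in the symmetric example.
# Under P, X is standard Gaussian and log(dP_0/dP) = beta*X - beta^2/2,
# so H(Q*|P_0) = E_P[-(beta*X - beta^2/2)] with Q* = P.
rng = np.random.default_rng(0)
beta = 1.5
x = rng.standard_normal(400_000)  # X sampled under the baseline P

entropy_mc = np.mean(-(beta * x - beta**2 / 2))
assert abs(entropy_mc - beta**2 / 2) < 0.02  # agrees with u*_0 = beta^2 / 2
```

The quadratic growth of u_0^* in β makes concrete how large β produces large Arrow–Debreu gains, against which the Nash utilities (which stay below 1 in the limit) are to be compared.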
Note that the endogenous bounds −δ_{-i} < C_i < (n − 1)δ + δ_i depend only on the risk-tolerance profile of the agents, and not on their actual beliefs or risk exposures. In addition, these bounds become tighter in games played by more risk-averse agents, who become increasingly hesitant to undertake risk.

If trading, you never reveal your true beliefs
As discussed in Remark 3.3, agents' best probability responses differ from their actual subjective beliefs in any situation where risk transfer is involved. This result becomes more pronounced at the Nash risk-sharing equilibrium. To wit, if (R_i)_{i∈I} are the revealed subjective beliefs corresponding to a Nash equilibrium, then it is a consequence of Theorem 4.2 (see also (3.4)) that R_i = P_i holds if and only if C_i = 0, for any fixed i ∈ I; therefore, whenever agents take part (by actually trading) in a Nash equilibrium, their reported subjective beliefs are never the same as their actual ones. Even though in any nontrivial trading situation agents report subjective beliefs different from their actual ones, we argue below that (4.5) imposes endogenous constraints on the magnitude of the possible discrepancy; the discussion that follows expands on Remark 3.1. Start by writing (4.5) in the form dP_i/dR_i = exp(−κ_i)(1 + C_i/δ_{-i}), where Jensen's inequality and the fact that (Q, (C_i)_{i∈I}) is an Arrow–Debreu equilibrium for the fictitious agents' preference pairs (δ_i, R_i)_{i∈I} yield κ_i ≥ 0; this implies that dP_i/dR_i ≤ 1 + C_i/δ_{-i} for all i ∈ I. Defining the weights (α_i)_{i∈I} via α_i := δ_{-i}/(nδ) = λ_{-i}/n for all i ∈ I (note that 0 < α_i < 1/n for all i ∈ I and that ∑_{i∈I} α_i = 1), the market clearing condition ∑_{i∈I} C_i = 0 gives ∑_{i∈I} α_i (dP_i/dR_i) ≤ 1. A corresponding lower bound can be obtained as well. Indeed, using the endogenous bounds C_i ≤ (n − 1)δ + δ_i, it follows that κ_i ≤ −log α_i for all i ∈ I, which gives dP_i/dR_i ≥ α_i (1 + C_i/δ_{-i}) = α_i + C_i/(nδ). Using again the market clearing condition ∑_{i∈I} C_i = 0, it follows that ∑_{i∈I} (dP_i/dR_i) ≥ 1. To recapitulate, ∑_{i∈I} α_i (dP_i/dR_i) ≤ 1 ≤ ∑_{i∈I} (dP_i/dR_i), which imposes considerable a priori restrictions on the likelihood ratios dP_i/dR_i for all i ∈ I. (For example, there are no events on which all agents overstate, or all understate, their likelihood compared to their actual subjective beliefs.)
In particular, since 1/α_i = n/λ_{-i}, we obtain that dP_i/dR_i ≤ n/λ_{-i} for all i ∈ I. This upper bound on the likelihood ratio of P_i with respect to R_i depends only on the number of remaining agents n and on the relative risk-tolerance coefficients of the agents; it depends neither on the aggregate risk-tolerance level δ nor on the actual subjective beliefs of the other agents. Furthermore, (4.6) implies that H(P_i | R_i) ≤ log(n/λ_{-i}) for all i ∈ I, which gives an a priori endogenous estimate on the distance between the truth and the reported beliefs in a Nash equilibrium.

Loss of efficiency
As already mentioned, agents' strategic behaviour results in risk-sharing inefficiency which, since the utilities (U_i)_{i∈I} are numerically represented by certainty equivalents, can be measured through the difference between the aggregate monetary utility under the Arrow–Debreu transaction and the aggregate monetary utility under the Nash equilibrium risk-sharing transaction. Note that similar measures of inefficiency have been used in the risk-sharing literature; see, for example, [30] or [2]. Mathematically, the loss of efficiency equals u^* − u = ∑_{i∈I} u_i^* − ∑_{i∈I} u_i, where (u_i^*)_{i∈I} and u^* are defined in (2.5) and (2.6), whereas u_i := U_i(C_i) for all i ∈ I and u := ∑_{i∈I} u_i.

From (2.7), (4.2) and (4.3), recalling that E_Q[C_i] = 0 for all i ∈ I and noting an equality that holds in view of (4.3), we obtain (4.7). Adding up (4.7) over i ∈ I and using the fact that ∑_{i∈I} z_i = 0, we obtain an analytic expression, (4.8), for the loss of efficiency caused by the game. In other words, the Nash risk-sharing equilibrium always entails a strict loss of efficiency, except in the case where no trading occurs in the Nash equilibrium (which is equivalent to no trading occurring in an Arrow–Debreu equilibrium either).

A priori information on z
From (4.7) and (4.8), we obtain (4.9), which gives an economic interpretation of the quantities z_i = λ_i(u^* − u) + (u_i − u_i^*), i ∈ I. Indeed, λ_i(u^* − u) is the fraction, corresponding to agent i ∈ I, of the aggregate loss of utility caused by forming a Nash instead of an Arrow–Debreu equilibrium; on the other hand, u_i − u_i^* is the difference between the utility that agent i ∈ I acquires in a Nash equilibrium and the one acquired in an Arrow–Debreu equilibrium.
Although the aggregate utility u in Nash equilibrium risk sharing can never exceed the Arrow–Debreu aggregate utility u^*, it may happen that some agents benefit from the game, in the sense that their individual utility after the negotiation game is higher than the one gained in the Arrow–Debreu equilibrium. We address such cases in Sect. 5.
Equation (4.9) is useful in obtaining tight bounds on z = (z_i)_{i∈I}. Since u_i ≥ 0 for all i ∈ I and u ≤ u^*, it follows from (4.9) that z_i ≥ −u_i^* for all i ∈ I, which is (4.10). Combined with ∑_{i∈I} z_i = 0, these a priori bounds imply that z has to live in a compact simplex in Δ^I. The bounds in (4.10) are indeed sharp: in the no-trade setting of Remark 4.3, it follows that u_i^* = 0 for all i ∈ I, which implies that z_i ≥ 0 for all i ∈ I; since z ∈ Δ^I, it follows that z_i = 0 for all i ∈ I. This also shows that the trivial Nash equilibrium obtained in Remark 4.3 is unique.

Individual marginal indifference valuation
In view of (4.8) and the subsequent discussion, and recalling Remark 2.4, it follows that the allocation in a Nash equilibrium fails to be Pareto-optimal (except in the trivial no-trade case). Another way to demonstrate the inefficiency of a Nash equilibrium is through the disagreement between the individual agents' marginal (utility) indifference valuation measures after the Nash risk-sharing transaction.
Recall that given a position G_i ∈ L^0, the individual marginal indifference valuation measure Q_i of agent i ∈ I is defined via the requirement that the function ℝ ∋ q ↦ U_i(G_i + qX) − q E_{Q_i}[X] is maximised at q = 0 for all X ∈ L^∞; in other words, if prices are given by expectations under Q_i, agent i ∈ I has no incentive to take any position other than G_i. Using the first-order conditions, it is straightforward to show that log(dQ_i/dP_i) ∼ −G_i/δ_i.
In an Arrow–Debreu equilibrium, the collection (Q_i^*)_{i∈I} ∈ P^I with the property log(dQ_i^*/dP_i) ∼ −C_i^*/δ_i for i ∈ I, consisting of the individual marginal indifference valuation measures associated with the positions (C_i^*)_{i∈I} after the Arrow–Debreu risk-sharing transaction, satisfies Q_i^* = Q^* for all i ∈ I: all agents' marginal indifference valuation measures agree. Now denote by (Q_i)_{i∈I} the individual agents' marginal indifference valuation measures after the Nash risk-sharing transaction, for which we have log(dQ_i/dP_i) ∼ −C_i/δ_i for all i ∈ I. In view of log(dP_i/dQ^*) ∼ C_i^*/δ_i, (4.2) and (4.3), it follows that log(dQ_i/dQ) ∼ log(1 + C_i/δ_{-i}). Since E_Q[1 + C_i/δ_{-i}] = 1, it follows that dQ_i/dQ = 1 + C_i/δ_{-i} for all i ∈ I, which is (4.11). Pareto optimality would require all (Q_i)_{i∈I} to agree, which is possible only if C_i = 0 for all i ∈ I, that is, exactly when no trade occurs. All Nash securities (C_i)_{i∈I} have zero value under Q. For each individual agent i ∈ I, we can measure the marginal indifference value of C_i via E_{Q_i}[C_i] = E_Q[C_i(1 + C_i/δ_{-i})] = E_Q[C_i²]/δ_{-i}, which is (4.12). In particular, E_{Q_i}[C_i] ≥ 0 for all i ∈ I, with strict inequality whenever C_i is non-zero. This observation implies that (except in trivial situations of no trading) all agents would be better off if they could take a larger position in their individual securities: for all a ∈ ℝ_+, the collection (aC_i)_{i∈I} of securities clears the market, and for some a > 1, this collection of securities would result in higher utility for each agent than the securities (C_i)_{i∈I}. Of course, what prevents agents from doing so is that they would then find themselves in a (Nash) disequilibrium. The fact that agents will not agree on market-clearing collections (aC_i)_{i∈I} that for some a > 1 would be individually (and therefore also collectively) preferable indicates, in addition, that trading volume in a Nash equilibrium tends to be reduced.
The individual marginal indifference valuation measures (Q_i)_{i∈I} allow an interesting expression for the Nash valuation measure Q. To wit, recall from Sect. 4.3.2 the weights α_i = δ_{-i}/(nδ) for all i ∈ I; then from (4.11) and the market clearing condition ∑_{i∈I} C_i = 0, it follows that Q = ∑_{i∈I} α_i Q_i. (4.13) In words, the Nash valuation measure Q is a convex combination of the individual agents' marginal indifference valuation measures, assigning weight α_i to agent i ∈ I. Note also that more risk-averse agents carry more weight; however, since max_{i∈I} α_i < 1/n, Q is almost equal to the equally weighted average of (Q_i)_{i∈I} when the number of agents is large. Relation (4.13) highlights the importance of risk-tolerance levels for the gain or loss of utility of individual agents in a Nash equilibrium. Consider, for instance, the situation of two interacting agents, with one of them considerably more risk-tolerant than the other. In this case, Q will be very close to the risk-averse agent's marginal utility-based valuation measure, which will then agree with the quoted prices. On the other hand, the possible discrepancy of Q from the risk-tolerant agent's marginal utility-based valuation is beneficial to this agent, as it provides the opportunity to purchase a positive-value security at zero price. A limiting instructive scenario along these lines is treated in Sect. 5.
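The weights in (4.13) can be sanity-checked directly. A small sketch with a hypothetical risk-tolerance profile follows; the numerical values are assumptions, and n denotes the number of remaining agents, i.e., |I| − 1, consistent with the weights summing to one.

```python
import numpy as np

delta = np.array([0.5, 1.0, 1.5, 3.0])  # hypothetical risk tolerances
total = delta.sum()                     # aggregate risk tolerance
n = len(delta) - 1                      # number of remaining agents per agent
alpha = (total - delta) / (n * total)   # alpha_i = delta_{-i} / (n * delta)

assert np.isclose(alpha.sum(), 1.0)     # the alpha_i are convex weights
assert alpha.max() < 1.0 / n            # each weight stays below 1/n
# More risk-averse agents (smaller delta_i) carry larger weight in Q:
assert np.all(np.argsort(alpha) == np.argsort(-delta))
```

The final assertion verifies numerically that α_i is decreasing in δ_i, in line with the observation that more risk-averse agents carry more weight in Q.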
The marginal indifference valuation measures (Q_i)_{i∈I} of (4.11) can be used to provide interesting formulas for the utility gain in a Nash equilibrium and for the utility difference between the Nash and Arrow–Debreu transactions. Note first that (4.3) and (4.8), combined with (4.9) and the fact that C_i^* = δ_i log(dP_i/dQ^*) + u_i^*, imply (4.14). Using further (4.11) and taking expectations with respect to Q in (4.14), we obtain an expression that has to be compared with (2.5): as in an Arrow–Debreu equilibrium, agents in a Nash equilibrium benefit from the distance of the resulting valuation measure from their subjective views; however, unlike the Pareto-optimal Arrow–Debreu transaction, agents in the Nash transaction suffer a loss from the distance of the valuation measure from their respective marginal indifference valuation measures. From (4.14) and (4.11), it follows that C_i = −δ_i log(dQ_i/dP_i) + U_i(C_i); combining this with the fact that C_i^* = δ_i log(dP_i/dQ^*) + u_i^* and taking expectations with respect to Q^*, we obtain (4.15). The difference of individual agents' utilities in the two equilibria thus comes from two distinct sources. The first stems from the discrepancy (measured via relative entropy) of the Arrow–Debreu valuation from the individual marginal indifference valuation of agent i ∈ I in a Nash equilibrium. When the agent's marginal indifference valuation measure in a Nash equilibrium is close to the Arrow–Debreu measure, the loss of utility caused by the Nash game is small; in a sense, this is the part of the aggregate loss of utility that is "paid" by agent i ∈ I (see also (4.16) below). The other term on the right-hand side of (4.15) is the price, under the Arrow–Debreu valuation measure Q^*, of the actual security that agent i ∈ I buys in a Nash equilibrium.
Recalling that the Nash equilibrium prices of the securities (C_i)_{i∈I} are zero, positivity of E_{Q^*}[C_i] implies that the security C_i is undervalued in the Nash risk-sharing transaction. Again, note that if Q_i is close to Q^*, then the valuation E_{Q^*}[C_i] tends to be positive, since E_{Q_i}[C_i] is always nonnegative (see (4.12)). To recapitulate the previous discussion: agents whose marginal indifference valuation measure is close to the Arrow–Debreu one tend to benefit from the Nash game; as we shall see in Sect. 5, this happens, for example, when agent i ∈ I is sufficiently risk-tolerant. Due to the market clearing condition ∑_{i∈I} C_i = 0, the aggregate loss takes into account only the aggregate discrepancy of the individual marginal measures from the Arrow–Debreu optimal one: under-valuation of certain securities is balanced by over-valuation of others. Indeed, adding up (4.15) over all i ∈ I gives (4.16), which measures Nash inefficiency as the aggregate discrepancy of the individual agents' marginal indifference valuations in a Nash equilibrium from the optimal valuation. Equation (4.16) is the counterpart of (2.6), where the inefficiency of the complete absence of trading, as compared to Arrow–Debreu risk sharing, is considered.

Existence and uniqueness of Nash equilibrium via finite-dimensional root finding
Theorem 4.2 is used as a guide in the search for an equilibrium, parameterising candidates for optimal securities via the n-dimensional space Δ^I introduced in (4.1). Proposition 4.5 below, whose proof is the content of Sect. A.5, enables us to reduce the search for a Nash equilibrium, an inherently infinite-dimensional problem in our setting, to a finite-dimensional one. The latter problem provides the tools needed for numerical approximations of Nash equilibria (see also Example 4.8).

Proposition 4.5
For all z ∈ I , there exists a unique (C i (z)) i∈I ∈ (L 0 ) I with C i (z) > −δ −i satisfying (4.17). (Note that necessarily ∑ i∈I C i (z) = 0 for all z ∈ I .) Furthermore, (4.18) holds. In the notation of Proposition 4.5, for each z ∈ I , define the probability Q(z) accordingly. The uniform bounds −δ −i < C i (z) < (n − 1)δ + δ i follow as in Sect. 4.3.1 and imply exp(C i (z)/δ i ) ∈ L 1 (Q(z)) for all i ∈ I and z ∈ I . In particular, (C i (z)) i∈I ∈ C Q(z) for all z ∈ I . In view of Theorem 4.2, finding a Nash equilibrium amounts to finding z ∈ I such that E Q(z) [C i (z)] = 0 for all i ∈ I . We can in fact define a function from I to R + that gives a "distance from equilibrium". Since C i (z) > −δ −i for all z ∈ I , this function is well defined. Furthermore, the inequality log x ≤ x − 1, valid for x ∈ (0, ∞), combined with the fact that ∑ i∈I C i (z) = 0 for all z ∈ I , shows that the function is indeed R + -valued. The following result summarises the above discussion.
Proposition 4.6 With the previous notation, the following are true:
– Assume that (Q , (C i ) i∈I ) is a Nash equilibrium, and let z ≡ (z i ) i∈I ∈ I be as in (4.2). Then the distance function vanishes at z .
– Assume that there exists z ∈ I at which the distance function vanishes. Then the pair (Q(z ), (C i (z )) i∈I ) defined as in (4.17) and (4.19) is a Nash equilibrium.
Proposition 4.6 provides a one-to-one correspondence between Nash equilibria and roots of the distance function. Recalling the discussion in Sect. 4.3.4, any such root belongs to the compact subset of I consisting of (z i ) i∈I ∈ I with z i ≥ −u * i for all i. This allows the root search to be confined to a compact set. Its practical usefulness notwithstanding, Proposition 4.6 answers neither the question of actual existence of Nash equilibria nor, in the case of existence, that of their uniqueness. These issues are settled in Theorem 4.7, whose proof is the subject of Sect. A.6.
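In the two-agent case, Sect. A.6.2 below shows that z → E Q(z) [C 0 (z)] is continuous and strictly increasing, so the root search reduces to bisection over the compact set just described. A sketch of that numerical step; the map f0 below is a hypothetical monotone stand-in (the actual map depends on solving (4.17) state by state):

```python
import math

def bisect_root(f, lo, hi, tol=1e-10):
    """Find the unique root of a continuous, strictly increasing f on [lo, hi]."""
    assert f(lo) <= 0.0 <= f(hi), "the compact search set must bracket the root"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical stand-in for z -> E_{Q(z)}[C_0(z)]; any strictly increasing,
# continuous map is handled identically by the bisection above.
f0 = lambda z: math.tanh(z - 0.3)
z_star = bisect_root(f0, -5.0, 5.0)
assert abs(z_star - 0.3) < 1e-8
```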

Extreme risk-tolerance
As discussed in Sect. 4.3.5, risk-tolerance coefficients are crucial factors in the gain or loss caused by the game in each agent's utility. In this section, we investigate this issue more closely by studying and comparing the Arrow-Debreu and Nash risk-sharing equilibria when agents' risk preferences approach risk-neutrality in the sense that risk-tolerance approaches infinity. In order to focus on the economic interpretation of the results, we consider the simplified (but representative) case of two agents.
The analysis that follows examines two cases: firstly, when only one agent becomes extremely risk-tolerant, and secondly, when both agents' risk-tolerance coefficients uniformly approach infinity. Besides the interest of this analysis in its own right, it also allows us to substantiate the claim that highly risk-tolerant agents are the ones who in fact benefit from the risk-sharing game.

One extremely risk-tolerant agent
We start with the two-agent case I = {0, 1}, where the risk aversion of only one agent approaches zero. We keep the risk-tolerance δ 1 and subjective probability P 1 of agent 1 fixed. For agent 0, on the other hand, we consider a sequence of risk-tolerance coefficients (δ m 0 ) m∈N with lim m→∞ δ m 0 = ∞ and a fixed subjective probability P 0 . In this setup, the equilibrium characterisations of the previous sections apply for each m ∈ N. Given that lim m→∞ λ m 1 = 0, L 0 -lim m→∞ (dQ m, * /dP 0 ) = 1 readily follows from the dominated convergence theorem; in fact, with | · | TV denoting the total-variation norm, Scheffé's lemma implies that lim m→∞ |Q m, * − P 0 | TV = 0. Since the equilibrium characterisation holds for all m ∈ N and (Q m, * ) m∈N converges to P 0 , we expect the limiting relationship L 0 -lim m→∞ C m, * 0 = δ 1 log(dP 0 /dP 1 ) − δ 1 H(P 0 | P 1 ). Clearly, for the previous limit to be valid, the following (technical) assumption is necessary.
In Sect. A.7, it is shown that the latter assumption is also sufficient for the validity of Proposition 5.2, giving the limiting valuation and security in an Arrow-Debreu equilibrium and the limiting gains of both agents. It is indeed expected that the utility gain of a nearly risk-neutral agent is almost zero. To see this, compare the limiting valuation measure P 0 with the limiting utility of agent 0, which is the linear expectation with respect to P 0 . On the other hand, the only case where there is no limiting utility gain for agent 1 is when the two agents' subjective beliefs coincide.
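The limiting behaviour just described can be checked numerically on a finite state space, where (2.3)-(2.4) give the Arrow-Debreu equilibrium in closed form. A sketch with hypothetical beliefs p0, p1 and risk tolerance d1 (all values made up for illustration):

```python
import math

# Hypothetical three-state beliefs and fixed risk tolerance of agent 1.
p0 = [0.5, 0.3, 0.2]
p1 = [0.25, 0.25, 0.5]
d1 = 1.0

def arrow_debreu(d0):
    """Arrow-Debreu equilibrium of (2.3)-(2.4) on a finite state space:
    dQ* is the normalised lambda-geometric mean of dP_0 and dP_1, and
    C*_0 = d0 * log(dP_0/dQ*), normalised so that E_{Q*}[C*_0] = 0."""
    l0, l1 = d0 / (d0 + d1), d1 / (d0 + d1)
    q = [a ** l0 * b ** l1 for a, b in zip(p0, p1)]
    tot = sum(q)
    q = [x / tot for x in q]
    raw = [d0 * math.log(a / qs) for a, qs in zip(p0, q)]
    mean = sum(qs * r for qs, r in zip(q, raw))
    return q, [r - mean for r in raw]

# Claimed limits as delta_0 -> infinity: Q* -> P_0 (Scheffe's lemma) and
# C*_0 -> d1*log(dP_0/dP_1) - d1*H(P_0|P_1).
H01 = sum(a * math.log(a / b) for a, b in zip(p0, p1))
limit = [d1 * math.log(a / b) - d1 * H01 for a, b in zip(p0, p1)]

q_big, c_big = arrow_debreu(d0=1e6)
assert max(abs(a - b) for a, b in zip(q_big, p0)) < 1e-5
assert max(abs(a - b) for a, b in zip(c_big, limit)) < 1e-4
```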
We now turn to a Nash risk-sharing equilibrium. From (4.4), we obtain a characterisation of the equilibrium quantities for each m ∈ N. Accepting that the sequence (z m, 0 ) m∈N converges in R and that (C m, 0 ) m∈N converges in L 0 (these conjectures actually have to be proved as part of Theorem 5.4), we may pass to the limit, where we set z ∞, 0 := lim m→∞ z m, 0 . This heuristic discussion gives a method to compute the limit. For z ∈ R, define the random variable C ∞ 0 (z) by the equation (5.2). Since the map involved there is strictly increasing and continuous and maps (−1, ∞) onto (−∞, ∞), it follows that C ∞ 0 (z) is a well-defined (−δ 1 , ∞)-valued random variable for all z ∈ R. So we should have C ∞, 0 = C ∞ 0 (z ∞, 0 ). Although z ∞, 0 is given as the limit of (z m, 0 ) m∈N , we may actually identify a priori what its value will be. To make headway, note that an appropriate zero-valuation condition should be satisfied in the limit. The next result, whose proof is given in Sect. A.8, ensures that a unique such candidate z ∞, 0 ∈ R exists.

Lemma 5.3 In the notation of (5.2), there exists a unique z ∞, 0 ∈ R with the required property.

Before we state our main result on the limiting behaviour of a Nash equilibrium, we make a final observation. Note that in view of (4.5), for all m ∈ N, we have log(dR m, 1 /dP 1 ) ∼ − log(1 − C m, 0 /δ m 0 ). Since lim m→∞ δ m 0 = ∞ and, as it turns out, (C m, 0 ) m∈N is convergent, the revealed subjective probability R m, 1 of agent 1 is, for large m, close to the actual P 1 . (There is an alternative way to obtain the same intuition. From (4.6), note that dR m, 1 /dP 1 ≥ λ m 0 for all m ∈ N. Since lim m→∞ λ m 0 = 1 and (dR m, 1 /dP 1 ) m∈N has constant expectation 1 under P 1 , (R m, 1 ) m∈N must converge to P 1 .) This suggests the same asymptotic behaviour as in the case discussed in Sect. 3.3, where only agent 0 acts strategically, as indicated by the best probability response, whereas agent 1 reports the true subjective beliefs P 1 . Indeed, the following result, whose proof is given in Sect. A.9, implies that the limiting security structure is the same regardless of whether the risk-averse agent 1 enters the game or simply reports true subjective beliefs (in which case only the approximately risk-neutral agent behaves strategically).
Theorem 5.4 With the previous notation (in particular, that of Lemma 5.3), we have L 0 -lim m→∞ C m, 0 = C ∞ 0 (z ∞, 0 ) = L 0 -lim m→∞ C m,r 0 .

The equality of the limits of (C m, 0 ) m∈N and (C m,r 0 ) m∈N implies that the strategic behaviour of a risk-neutral agent dominates the risk-sharing transaction. Intuitively, agents with high risk-tolerance are willing to undertake more risk in the sharing transaction in return for a higher cash compensation. Thus, at the limit, the risk-neutral agent satisfies the reported hedging needs of other agents but achieves better prices by applying the best response strategy. On the other hand, for the risk-averse agent, the risk reduction is more important than the higher price to be paid. As a result, at equilibrium, the risk-averse agent prefers to submit true beliefs, even though this results in a higher price to be paid to the risk-neutral agent. The situation is totally different in an Arrow-Debreu equilibrium transaction, where agents act basically as price takers and the securities and prices are determined by the efficiency of the transaction.
We argued in Sect. 4.3 that in any risk-transfer situation, the Nash equilibrium incurs some loss of efficiency. Although the aggregate utility is reduced in a Nash equilibrium compared with the Arrow-Debreu one, certain agents may obtain a higher utility gain in risk-sharing games. In particular, Proposition 5.5 (whose proof is given in Sect. A.10) demonstrates that an agent with sufficiently high risk-tolerance enjoys higher utility in a Nash equilibrium transaction than in the Arrow-Debreu equilibrium sharing.
The limiting loss for the risk-averse agent comes from two sides. The first is the limiting utility gain of agent 0. The remaining quantity δ 1 H(Q ∞, * | Q ∞, ) is in fact the loss from the applied strategic behaviour, as opposed to sharing in a Pareto-optimal way. Both terms are strictly positive as long as C ∞, 0 is not identically equal to zero.
The message of Proposition 5.5 is clear. The introduction of strategic behaviour allows agents with high risk-tolerance to achieve better prices, which the more risk-averse agents are willing to pay in order to achieve risk reduction. In contrast to the Arrow-Debreu equilibrium, where prices are given by the optimal sharing measure, agents with sufficiently high risk-tolerance are willing to accept more risk in the Nash game, since their strategy drives the market to better cash compensation for them. In fact, a more risk-averse agent not only tends to bear all the efficiency loss caused by the game, but also fuels the utility gain of the (sufficiently) risk-tolerant counterparty.
Recalling the discussion and notation of Sect. 4.3.5, we may offer some more detailed comments. From (4.11) and Proposition 5.5, it follows that the marginal valuation measure of agent 0 approaches the limiting optimal valuation measure Q ∞, * . In turn, this implies that for large enough m ∈ N, the security that agent 0 gets in a Nash equilibrium is undervalued. According to (4.15) and the discussion that follows it, we readily get that the utility of agent 0 is increased. For the risk-averse agent, the situation is different. From (4.13), it follows that Q m, 1 is close to Q m, for large m ∈ N, which in turn is close to Q ∞, . Hence, for large enough m ∈ N, the security received by agent 1 in a Nash equilibrium is overvalued; on top of this, agent 1 also carries all the risk-sharing inefficiency of a Nash equilibrium.

Both agents being extremely risk-tolerant
We have seen before that the strategic behaviour of a highly risk-tolerant agent dominates the Nash game and drives the market to his preferred transaction, regardless of the actions of the other agent. Here, we examine what happens to the equilibria when both agents approach risk-neutrality at the same speed. More precisely, we fix λ 0 ∈ (0, 1) and λ 1 ∈ (0, 1) with λ 0 + λ 1 = 1 and consider a nondecreasing sequence (δ m ) m∈N with lim m→∞ δ m = ∞. Define δ m i := λ i δ m for all m ∈ N and i ∈ {0, 1}. In contrast to the setup of Sect. 5.1, here the subjective beliefs of the agents have to depend on m ∈ N. To obtain intuition on why and how the subjective probabilities must behave, note that according to Theorem 2.2, the security C m, * 0 is, for each m ∈ N, given as δ m 0 times a random variable whose dependence on the risk-tolerances comes only through λ 0 and λ 1 . Since the latter weights are fixed for each m ∈ N, to guarantee that the securities in an Arrow-Debreu equilibrium have a well-behaved limit, we make the following assumption.
Note that the condition E P [ξ i ] = 0 for i ∈ {0, 1} appearing in Assumption 5.6 is just a normalisation and does not constitute any loss of generality.
The proof of Theorem 5.7 is given in Sect. A.11. Interestingly, the near risk-neutrality of both agents drives the Nash equilibrium securities to half of the Arrow-Debreu securities, which is evidence of the market inefficiency caused by the strategic behaviour of (approximately) risk-neutral agents. The result of Theorem 5.7 is another manifestation of the claim (initially made in Sect. 4.3.5) that the trading volume in a Nash equilibrium tends to be lower than that of the Pareto-optimal allocations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

A.1 Proof of Theorem 2.2
Suppose that (Q * , (C * i ) i∈I ) is an Arrow-Debreu equilibrium. We show the necessity of (2.3) and (2.4). Fix X ∈ L ∞ with E Q * [X] = 0 and i ∈ I . Since the function R ∋ ε → U i (C * i + εX) ∈ R has a maximum at ε = 0, the first-order conditions and the dominated convergence theorem, using the fact that exp(−C * i /δ i ) ∈ L 1 (P i ), imply that E P i [exp(−C * i /δ i )X] = 0. The latter equality holds for all X ∈ L ∞ with E Q * [X] = 0 and all i ∈ I ; therefore, C * i ∼ δ i log(dP i /dQ * ) for all i ∈ I . Since E Q * [C * i ] = 0, (2.4) follows. Furthermore, ∑ i∈I C * i = 0 gives ∑ i∈I δ i log(dP i /dQ * ) ∼ 0, from which (2.3) follows. Assume now that (Q * , (C * i ) i∈I ) is given by (2.3) and (2.4). Together with E Q * [C * i ] = 0 for all i ∈ I , this implies ∑ i∈I C * i = 0. The fact that C * i is optimal for agent i ∈ I under the valuation measure Q * is argued in Remark 2.4. We have shown that (Q * , (C * i ) i∈I ) given by (2.3) and (2.4) is an Arrow-Debreu equilibrium. The necessity of (2.3) and (2.4) for an Arrow-Debreu equilibrium, proved in the previous paragraph, establishes its uniqueness.
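The characterisation just proved can be instantiated numerically: on a finite state space, (2.3) prescribes Q* as the normalised λ-geometric mean of the subjective beliefs, and (2.4) prescribes C*_i = δ_i log(dP_i/dQ*), normalised to zero price under Q*. A sketch with three agents; beliefs and risk tolerances are hypothetical:

```python
import math

# Hypothetical beliefs of three agents on a 4-state space, and risk tolerances.
beliefs = [[0.4, 0.3, 0.2, 0.1],
           [0.25, 0.25, 0.25, 0.25],
           [0.1, 0.2, 0.3, 0.4]]
deltas = [1.0, 2.0, 3.0]
lams = [d / sum(deltas) for d in deltas]

# (2.3): dQ* proportional to the lambda-geometric mean of the dP_i.
q = [math.prod(p[s] ** l for p, l in zip(beliefs, lams)) for s in range(4)]
tot = sum(q)
q = [x / tot for x in q]

# (2.4): C*_i = delta_i * log(dP_i/dQ*), normalised to zero price under Q*.
C = []
for p, d in zip(beliefs, deltas):
    raw = [d * math.log(p[s] / q[s]) for s in range(4)]
    mean = sum(q[s] * raw[s] for s in range(4))
    C.append([r - mean for r in raw])

# Equilibrium checks: market clearing in every state, zero valuation under Q*.
for s in range(4):
    assert abs(sum(C[i][s] for i in range(3))) < 1e-12
for i in range(3):
    assert abs(sum(q[s] * C[i][s] for s in range(4))) < 1e-12
```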

A.2 Proof of Proposition 3.2
To ease the reading, in the course of the proof of Proposition 3.2, we denote Q (R −i ,R r i ) by Q r i .

A.2.1 First-order conditions
We prove here the necessity of the stated conditions for a best response. Fix i ∈ I and R r i ∈ P attaining the best response value V i (R r i ; R −i ). (In the degenerate case, it is straightforward to check that the resulting contract for agent i ∈ I would be zero; therefore, this case can be excluded.) In particular, we have exp(−C r i /δ i ) ∈ L 1 (P i ), a fact that will be useful in several places for applying the dominated convergence theorem in the sequel.
Fix X ∈ L ∞ . For any ε ∈ R, let R i (ε) ∈ P be defined via the corresponding recipe, where the constant in the equivalence is cancelled by the definition of C i (ε). The dominated convergence theorem and simple differentiation, using also that E Q r i [C r i ] = 0, imply a first identity. Since V i (R i (ε); R −i ) = U i (C i (ε)) for all ε ∈ R, another application of the dominated convergence theorem gives a further identity. Noting that ∑ j∈I\{i} λ j log(dR j /dP i ) ∼ log(dQ r i /dP i ) − λ i log(dR r i /dP i ) yields a corresponding equivalence. The last equivalence relation allows us to write (A.1) as (A.2). Up to now, X ∈ L ∞ was fixed but arbitrary. Varying X over L ∞ in (A.2) gives a pointwise identity, and necessarily C r i > −δ −i should hold. Taking logarithms and rearranging (A.4) gives (3.2).

A.2.2 Optimality of candidates for best response
We now proceed to show that the necessary conditions for a best response are also sufficient. (As mentioned in the discussion following Theorem 3.7, we have not been able to show whether V i (·; R −i ) is concave; therefore, the first-order conditions do not immediately imply optimality.) Fixing R ∈ P and assuming the stated conditions, we show that V i (R; R −i ) ≤ V i (R r i ; R −i ). Similarly to the arguments in Sect. A.2.1, the contract that agent i ∈ I would obtain by the response R ∈ P can be expressed in terms of Q X ∈ P such that log(dQ X /dQ r i ) ∼ X; this leads to (A.5). Remark A.1 If E Q r i [exp(X + )X + ] = ∞ were true (equivalently, since exp(X)X is bounded below, if E Q r i [exp(X)X] = ∞ were true), the required inequality would hold trivially; therefore, we may assume integrability. Combining the previous facts, including (A.5) and Remark A.1, applying Jensen's inequality under the probability having density D r i , and then using the inequality exp(x) ≥ 1 + x for x ∈ R, it follows that it suffices to establish one final inequality. This follows from Jensen's inequality applied to a suitable convex function on (0, ∞), which is exactly what was required.

A.3 Proof of Theorem 3.7
Define R −i := (1/λ −i ) ∑ j∈I\{i} λ j log(dR j /dP i ) and note that Hölder's inequality gives exp(R −i ) ∈ L 1 (P i ). For z i ∈ R, implicitly define C i (z i ) ∈ L 0 as the (−δ −i , ∞)-valued random variable satisfying the corresponding equation. Note that the existence and uniqueness of the solution C i (z i ) for each z i ∈ R follow from the fact that the function (−1, ∞) ∋ y → (δ/δ i )y + log(1 + y) is strictly increasing from −∞ to ∞. For z i ∈ R, define also the (0, ∞)-valued random variable D i (z i ) := 1 + C i (z i )/δ −i and note that it is the unique solution to the associated equation. Observe that D i (and hence C i ) is increasing as a function of z i . Straightforward estimates, in view of Hölder's inequality and Lemma A.2, allow us to define, for each z i ∈ R, a probability Q i (z i ) ∈ P , together with the corresponding integrability properties for every z i ∈ R. As mentioned in the discussion following Proposition 3.2, to establish Theorem 3.7, we need to show that there exists a unique root ζ i of the associated equation. It is straightforward to check that f i is continuous by the dominated convergence theorem and Lemma A.2.
Let P̄ i be the probability measure in P such that log(dP̄ i /dP i ) ∼ R −i . Then, thanks to the equivalence relation log(dQ i (z i )/dP i ) ∼ D i (z i ) + R −i , we obtain the corresponding expression for f i (z i ) for all z i ∈ R. In fact, since the covariance of exp(D i (z i )) and D i (z i ) is nonnegative under any probability, we have f i (z i ) ≥ E P̄ i [D i (z i )] for all z i ∈ R. Using monotone convergence and (A.8), lim z i →∞ f i (z i ) = ∞ follows from lim z i →∞ D i (z i ) = ∞. Furthermore, lim z i →−∞ D i (z i ) = 0 and monotone convergence give the corresponding limit of f i as z i → −∞. We claim that f i is strictly increasing, which implies that ζ i is indeed unique. In preparation, note that differentiating (A.6) with respect to z i and rearranging give D ′ i (z i ) = q i (D i (z i )), where (0, ∞) ∋ y → q i (y) := λ i y/(λ i + y). In particular, since q i is an increasing function, the covariance between D ′ i (z i ) and D i (z i ) is nonnegative for all z i ∈ R under any probability. Straightforward computations using the definition of Q i (z i ) give that the derivative of f i is nonnegative for all z i ∈ R; in fact, f i is strictly increasing, and Theorem 3.7 has been proved.
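The proof above (like the uniqueness argument of Sect. A.6.2) rests on the fact that two increasing functions of the same random variable are nonnegatively correlated under any probability (Chebyshev's correlation inequality). A quick finite-state sanity check, with randomly generated values and weights (illustration only):

```python
import math
import random

def cov(xs, ys, w):
    """Covariance of two finite-state random variables under probability weights w."""
    mx = sum(wi * x for wi, x in zip(w, xs))
    my = sum(wi * y for wi, y in zip(w, ys))
    return sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))

# Chebyshev's correlation inequality with f = exp and g = identity (both
# increasing), the pair used repeatedly in the proofs: Cov(exp(X), X) >= 0.
random.seed(0)
for _ in range(100):
    x = [random.uniform(-2.0, 2.0) for _ in range(10)]
    w = [random.random() for _ in range(10)]
    tot = sum(w)
    w = [wi / tot for wi in w]
    assert cov([math.exp(v) for v in x], x, w) >= -1e-12
```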

A.4 Proof of Theorem 4.2
Suppose that (Q , (C i ) i∈I ) is a Nash equilibrium, and let (R i ) i∈I ∈ P I be the associated revealed subjective beliefs. We first prove relationship (4.3). In view of Proposition 3.2, since C i /δ i ∼ log(dR i /dQ ) and ∑ j∈I\{i} λ j log(dR j /dP i ) = log(dQ /dP i ) − λ i log(dR i /dP i ), we obtain log(dR i /dP i ) ∼ − log(1 + C i /δ −i ) for all i ∈ I , which is (4.5). Since log(dP i /dQ * ) ∼ C * i /δ i for all i ∈ I in view of (2.4), a corresponding relation follows for all i ∈ I . In turn, since ∑ j∈I C * j = 0, the latter gives a relation which, in view of (4.5), is exactly (4.3).
To prove (4.2), we add λ i log(1 + C i /δ −i ) ∼ −λ i log(dR i /dP i ) to (3.2). Combined with (4.3), this gives (4.2) for an appropriate tuple z := (z i ) i∈I ∈ R I . The market clearing conditions ∑ i∈I C i = 0 = ∑ i∈I C * i show that ∑ i∈I z i = 0, that is, z ∈ I . Finally, the fact that E Q [C i ] = 0 for all i ∈ I results directly from the first-order conditions of Proposition 3.2. For the proof of the converse implication, assume that conditions (N1)-(N3) hold for (Q , (C i ) i∈I ) and fix i ∈ I . Define the associated revealed beliefs (R i ) i∈I ∈ P I via log(dR i /dQ ) ∼ C i /δ i . Since C * i /δ i ∼ log(dP i /dQ * ), a combination of (4.2) and (4.3) gives that log(1 + C i /δ −i ) ∼ − log(dR i /dP i ). Using again (4.2) and (4.3), it follows that the sufficient conditions for optimality of Proposition 3.2 are satisfied for each i ∈ I . To show that (Q , (C i ) i∈I ) is a Nash equilibrium, it remains to verify that (C i ) i∈I ∈ C Q . Indeed, summing (4.2) over i implies that ∑ i∈I C i = 0, since z is assumed to belong to I . This fact, together with the requirement C i > −δ −i , implies the uniform boundedness of C i ; in particular, exp(C i /δ i ) ∈ L 1 (Q ) for all i ∈ I . Taking also (N3) into account, we conclude that (C i ) i∈I ∈ C Q , which completes the proof.

A.5 Proof of Proposition 4.5
Fix z ∈ I . Suppose for the moment that a solution to (4.17) exists, and set C i (z) as in (A.10). Plugging back into the definition of L(z) in (A.9), we obtain that (A.11) should be satisfied. We now proceed backwards by showing that (A.11) has a unique solution. In what follows, fix z ∈ I and define the function w : Ω × R → R via the recipe w(y) = y − ∑ i∈I λ i log θ i (z i + C * i + δ i y) for y ∈ R, where the dependence of w on ω ∈ Ω is suppressed. The derivative of w with respect to the spatial coordinate can be computed directly. Since θ i (y) behaves sublinearly as y → ∞, lim y↑∞ w(y) = ∞ follows in a straightforward way. Furthermore, by the definition of θ i , for all x ∈ (−∞, 0) and i ∈ I , we have x < δ i log θ i (x). This implies w(y) < y − ∑ i∈I (1/δ)(z i + C * i + δ i y) = 0 on the event {y < − ∑ j∈I ((z j + C * j )/δ j )}, showing at the same time that the equation w(L(z)) = 0 has a unique solution and yields the bound (A.12). Given the existence of a unique L(z) solving (A.11), C i (z) is specified for all i ∈ I via (A.10). Thus, combining (A.12) and the equality ∏ j∈I (1 + C j (z)/δ −j ) −λ j = exp(−L(z)) implies the validity of (4.18), which concludes the proof.

A.6 Proof of Theorem 4.7
We first establish the general existence result and then tackle uniqueness in the twoagent case.

A.6.1 Proof of existence of a Nash equilibrium
We use the notation from Proposition 4.5 and the discussion following it. For all z ∈ I and i ∈ I , define u i (z) := U i (C i (z)). Furthermore, for each z ∈ I , define u(z) := ∑ i∈I u i (z) and φ i (z) via the corresponding formula. Note that ∑ i∈I φ i (z) = 0 for all z ∈ I , so that φ := (φ i ) i∈I is I -valued. The obvious continuity of I ∋ z → L(z) from (A.11) and the domination relation given by (A.12) allow the application of the dominated convergence theorem to establish that φ : I → I is a continuous function.

Lemma A.3 z ∈ I corresponds to a Nash equilibrium if and only if it is a fixed point of φ.
Proof In view of the discussion in Sect. 4.3.4, if z ∈ I corresponds to a Nash equilibrium, then z is a fixed point of φ. Conversely, we show that any fixed point of φ corresponds to a Nash equilibrium. With L(z) as in (A.9) and recalling (4.17), start by observing a chain of equalities; from the last of these, a further identity follows, and adding it up over all agents gives an aggregate relation. Since log(dQ(z)/dQ * ) = −L(z) + λ(z) for an appropriate λ(z) ∈ R and the equality holds for all i ∈ I and z ∈ I , it follows that a corresponding identity holds for all i ∈ I and z ∈ I , where we note that the quantity λ(z) cancels in this equation. Now suppose that z ∈ I is a fixed point of φ. From the last equality, it follows that the quantities E Q(z ) [1 + C i (z )/δ −i ] have the same value, which we call x(z ), for all i ∈ I . In other words, E Q(z ) [C i (z )] = δ −i (x(z ) − 1) for all i ∈ I . Since ∑ i∈I C i (z ) = 0, we obtain that x(z ) = 1, which implies that E Q(z ) [C i (z )] = 0 for all i ∈ I , in turn implying that z corresponds to a Nash equilibrium.
In view of Lemma A.3, the existence of a Nash equilibrium follows if we can show that φ has at least one fixed point. For any z ∈ I and i ∈ I , the strong bound C i (z) > −δ −i implies u i (z) ≥ −δ −i . Furthermore, u(z) ≤ u * for all z ∈ I by the aggregate optimality of an Arrow-Debreu equilibrium. Therefore, it follows that φ i (z) ≥ −δ −i − u * i for all i ∈ I and z ∈ I . Define the set K := {z ∈ I | z i ≥ −δ −i − u * i , ∀i ∈ I } and note that K is a compact and convex subset of I . Since φ is continuous and maps K to K, Brouwer's fixed point theorem implies that φ has at least one fixed point on K, which establishes the claim. (In fact, according to the discussion in Sect. 4.3.4, any fixed point must actually lie in the smaller set {z ∈ I | z i ≥ −u * i , ∀i ∈ I }.)

A.6.2 Proof of uniqueness in the two-agent case
Note that (z 0 , z 1 ) ∈ I if and only if z 0 = −z 1 . In the course of the proof, we identify R and I via R ∋ z ↔ (z, −z) ∈ I , that is, by considering only the "zero" coordinate. Correspondingly, for z ∈ R, we write C i (z) instead of C i ((z, −z)) for i ∈ {0, 1}; similarly, for z ∈ R, we write L(z) instead of L((z, −z)) in (A.9).
In view of Proposition 4.6 and the equality C 1 (z) = −C 0 (z) for all z ∈ R, we need to prove the existence of a unique z ∈ R such that E Q(z ) [C 0 (z )] = 0; since existence was already established, we focus here only on uniqueness. Define the continuous function R ∋ z → f 0 (z) := E Q(z) [C 0 (z)]; then it suffices to show that f 0 is strictly increasing.
Recall that C 1 (z) = −C 0 (z) for all z ∈ R, and rewrite (4.17) accordingly. Differentiating with respect to z ∈ R, we obtain after some algebra an expression for C ′ 0 (z), valid for all z ∈ R.
In other words, upon defining the increasing map q on (−δ 1 , δ 0 ), the covariance between C ′ 0 (z) and −L ′ (z) is nonnegative under any probability for all z ∈ R. Continuing, if we take into account that log(dQ(z)/dP i ) ∼ log(dQ(z)/dQ * ) + log(dQ * /dP i ) ∼ −L(z) − C * i /δ i for all i ∈ I , it is straightforward to compute the derivative of f 0 for all z ∈ R. Since C ′ 0 (z) is a (0, ∞)-valued random variable and Cov Q(z) (C ′ 0 (z), L ′ (z)) ≤ 0 for all z ∈ R, the claim is proved.

A.9 Proof of Theorem 5.4
To ease the reading, throughout the proof, for all m ∈ N, define the (0, 1/λ m 1 )-valued random variable D m, 0 := 1 + C m, 0 /δ 1 and the (0, ∞)-valued random variable D m,r 0 := 1 + C m,r 0 /δ 1 . We use the obvious notation Q m, ∈ P for m ∈ N. As in (3.5), let Q m,r 0 ∈ P be defined correspondingly. Coupling the equivalence satisfied by C m, * with C m, 1 /δ 1 = −C m, 0 /δ 1 = 1 − D m, 0 , after some algebra we obtain log(dQ m, /dP 1 ) ∼ D m, 0 . Therefore, a chain of (in)equalities holds for all m ∈ N, where the last inequality follows from the nonnegativity under P 0 of the covariance involving exp(−δ 1 D m, 0 ). We then obtain that (log D m, 0 ) m∈N is bounded in L 0 , from which it follows that the relevant family of random variables is bounded in L 0 as well. The second and third terms of the right-hand side of (A.18) converge (to −H(P 0 | P 1 ) and zero, respectively), and the sequence (z m, /λ m 0 ) m∈N is bounded in R; therefore, the existence of a limit in R such that (A.17) holds readily follows. Further, log(dQ m,r 0 /dP 0 ) ∼ −λ m 0 log D m,r 0 + λ m 1 log(dP 1 /dP 0 ) due to (3.5). The last two facts give log(dQ m,r 0 /dP 1 ) ∼ D m,r 0 . Therefore, for m ∈ N, a corresponding chain of (in)equalities holds, where the last inequality follows from Cov P 1 (exp(D m,r 0 ), D m,r 0 ) ≥ 0 for all m ∈ N. It follows that E P 1 [D m,r 0 ] ≤ 1 for all m ∈ N, which implies that (D m,r 0 ) m∈N is L 0 -bounded. Hence, the family {(w m (D m,x 0 )) + : m ∈ N} is also bounded in L 0 . Since (C m, * 0 ) m∈N converges in L 0 , (A.16) implies that (z m,r 0 ) m∈N is bounded from above (in R). By way of contradiction, suppose that (z m,r 0 ) m∈N is not bounded from below. Passing to a subsequence if necessary, we may assume that (z m,r 0 ) m∈N is a sequence of negative numbers with lim m→∞ z m,r 0 = −∞. Hence, again by (A.16), we get lim m→∞ D m,r 0 = 0. Since x + log x ≤ 1/λ m 0 + (1/λ m 0 δ 1 )w m (x) for all x ≥ 1 and z m,r 0 ≤ 0 for all m ∈ N, we get that for all m, D m,r 0 + log D m,r 0 ≤ 1/λ m 0 + C m, * 0 /(λ m 0 δ 1 ) on {D m,r 0 > 1}.
Given (A.18), we conclude the existence of κ ∈ R such that for all m ∈ N, D m,r 0 + log D m,r 0 ≤ κ + log(dP 0 /dP 1 ) on {D m,r 0 > 1}. It follows that we can use the dominated convergence theorem on the right-hand side of the corresponding equality. Lemma 5.3 then yields z ∞ 0 = z ∞, 0 , which also implies C ∞ 0 = C ∞ 0 (z ∞, 0 ). We continue to deal with the best response case, in which the following identity holds for all k ∈ N.
By the domination relationship in (A.19), we may apply the dominated convergence theorem in the numerator to obtain the limit of E P 0 [(D m k ,r 0 ) −λ m k 0 (dP 1 /dP 0 ) λ m k 1 ] as k → ∞. Similarly, the domination relationship in (A.19) allows us to apply the dominated convergence theorem in the denominator to obtain lim k→∞ E P 0 [(D m k ,r 0 ) λ m k 1 (dP 1 /dP 0 ) λ m k 1 ] = 1.
We may now conclude the proof of Theorem 5.4. By way of contradiction, if L 0 -lim m→∞ C m, 0 = C ∞ 0 (z ∞, 0 ) were not true, then, along a subsequence, (C m, 0 ) m∈N would stay boundedly away from the limit; the arguments above would then produce a further subsequence that L 0 -converges to C ∞ 0 (z ∞, 0 ), a contradiction.

A.11 Proof of Theorem 5.7

Note that ψ m is strictly increasing with ψ m (0) = 0 for all m ∈ N; furthermore, (ψ m ) m∈N converges uniformly on compact subsets of R to ψ ∞ : R → R defined by ψ ∞ (y) = 2y.
Lemma A.7 The sequence (z m, 0 ) m∈N is bounded in R.
Proof We show that (z m, 0 ) m∈N is bounded above. A symmetric argument applied to agent 1 shows that (z m, 1 ) m∈N is bounded above; since z m, 1 = −z m, 0 , boundedness of (z m, 0 ) m∈N follows. We obtain that E P [exp(ξ 1 /δ m 1 )C m, 0 ] ≤ 0 for all m ∈ N. Suppose that (z m, 0 ) m∈N fails to be bounded above. By passing to a subsequence if necessary, we may assume that (z m, 0 ) m∈N is a sequence of nonnegative numbers with lim m→∞ z m, 0 = ∞. Recall that E P [exp(ξ 1 /δ m k 1 )C m k , 0 ] ≤ 0 for all k ∈ N, which was established in the proof of Lemma A.7. Furthermore, since (z m k , ) k∈N and (C m k , * 0 ) k∈N are convergent and in particular uniformly bounded from below (for the latter sequence of random variables, this follows from the fact that ξ i ∈ L ∞ for i ∈ {0, 1}) and ξ 1 ∈ L ∞ , we infer the existence of c ∈ (0, ∞) such that the uniform lower domination exp(ξ 1 /δ m k 1 )C m k , 0 ≥ −c is valid for all k ∈ N. An application of Fatou's lemma gives E P [ C ∞ 0 ] ≤ 0. The symmetric argument from the side of agent 1 gives E P [ C ∞ 0 ] ≥ 0, which implies that E P [ C ∞ 0 ] = 0. Since 2 C ∞ 0 = z ∞ + C ∞, * and E P [C ∞, * ] = 0, it follows that z ∞ = 0. We conclude that C ∞ 0 = C ∞, * 0 /2.
The proof of Theorem 5.7 can now be completed exactly as in the case of Theorem 5.4. If L 0 -lim m→∞ C m, 0 = C ∞, * 0 /2 were not true, then there would exist some ε ∈ (0, 1) and a subsequence (C m k , 0 ) k∈N of (C m, 0 ) m∈N such that E[1 ∧ |C m k , 0 − C ∞, * 0 /2|] > ε for all k ∈ N. Since the sequence (z m k , ) k∈N is bounded due to Lemma A.7, there exists a further subsequence of (z m k , ) k∈N that is convergent. Then Lemma A.8 implies that there exists a further subsequence of (C m k , 0 ) k∈N that L 0 -converges to C ∞, * 0 /2, which is a contradiction.