Uncertainty, Learning and International Environmental Agreements: The Role of Risk Aversion

This paper analyses the formation of international environmental agreements (IEAs) under uncertainty, learning and risk aversion. It bridges two strands of the IEA literature: (i) the role of learning when countries are risk neutral; (ii) the role of risk aversion under no learning. Combining learning and risk aversion seems appropriate as the uncertainties surrounding many international environmental problems are large, often highly correlated (e.g. climate change), but are gradually reduced over time through learning. The paper analyses three scenarios of learning. A key finding is that risk aversion can change the ranking of these three scenarios of learning in terms of welfare and membership. In particular, the negative conclusion about the role of learning in a strategic context under risk neutrality is qualified. When countries are significantly risk averse, it pays them to wait until uncertainties have been largely resolved before joining an IEA. This may suggest why it has been so difficult to reach an effective climate change agreement.

JEL-Classification: C72, D62, D80, Q54


Introduction
Environmental issues such as climate change pose three key challenges for economic analysis: (i) there are considerable uncertainties about the likely future costs of environmental damages and abatement; (ii) our understanding of these uncertainties changes over time as a result of learning more about climate science, possible technological responses and behavioral responses by households, firms and governments; (iii) the problem is global, but since there is no single global agency to tackle climate change, policies need to be negotiated through international environmental agreements (IEAs).1,2 Recently, these three issues have begun to be integrated in one framework. Two strands of literature can be distinguished.
The first strand of literature studies uncertainty and IEA formation with the focus on the role of learning, but under the assumption of risk neutrality. Ulph and Ulph (1996) and Ulph and Maddison (1997) compare the fully cooperative and the non-cooperative scenarios when countries face uncertainty about damage costs. They show that the value of learning about damage costs may be negative when countries act non-cooperatively and damage costs are correlated across countries. Na and Shin (1998), Ulph (2004), Kolstad (2007) and Kolstad and Ulph (2008, 2011) have considered how the prospect of future resolution of uncertainty affects the incentives for countries to join an IEA. Kolstad and Ulph consider a model where countries face common uncertainty about the level of environmental damage costs.3 Three scenarios of learning are considered: with full learning, uncertainty about damage costs is resolved before countries decide whether to join an IEA; with partial learning, uncertainty is resolved after countries decide whether to join an IEA, but before they choose their emissions levels; with no learning, uncertainty is resolved neither before stage 1 nor before stage 2. They showed that the prospect of learning, either full or partial, generally reduces the expected total payoff in stable IEAs. In particular, Kolstad and Ulph (2008, 2011) showed that partial learning would yield the highest total payoff for only a small proportion of parameter values. For a significant majority of parameter values, the highest expected total payoff arose under no learning. Hence, it is better to form an IEA rather than wait for better information: removing the "veil of uncertainty" seems to be detrimental to the success of international environmental cooperation.

1 On the first two issues, see for instance Arrow and Fisher (1974), Epstein (1980), Kolstad (1996a,b), Gollier, Julien and Treich (2000) as well as Narain, Fisher and Hanemann (2007).
2 On the third issue, see for instance the classic papers by Carraro and Siniscalco (1993) and Barrett (1994). The most influential papers have been collected in a volume by Finus and Caparros (2015) who also provide a survey.
3 By common uncertainty we mean that each country faces the same ex-ante distribution of possible damage costs, and when uncertainty is fully resolved they face the same ex-post level of damage costs, i.e. the risks they face are fully correlated across countries. Kolstad and Ulph (2011) extend this model to consider the case where the risks each country faces are uncorrelated. Uncorrelated uncertainty is also considered in a slightly different model in Finus and Pintassilgo (2013) and empirically investigated in a climate model with twelve world regions in Dellink and Finus (2012).
All these models have assumed that countries are risk neutral. However, in the climate context, risks are highly correlated and hence possibilities for risk sharing are limited, so that the assumption of risk aversion may be quite relevant. Therefore, we extend the two-stage coalition formation setting of Kolstad and Ulph (2008) by departing from the assumption of risk neutrality. In this paper, we allow countries to be risk averse, and show that if countries have a relatively high degree of risk aversion, then for a majority of parameter values full learning yields higher expected total utility than no learning. This may help to explain why it took so long to get from the start of the process of tackling climate change (the Kyoto Protocol) to a more substantial agreement in Paris: countries are risk averse and so needed to wait until they had more information about the risk of climate change before committing to significant action.
The second strand of literature studies uncertainty and IEA formation with the focus on the role of risk aversion, though under the assumption of no learning. Endres and Ohl (2003) show in a simple two-player prisoners' dilemma, using the mean-standard deviation approach to capture risk aversion, that risk aversion can increase the prospects of cooperation once it reaches a certain threshold. The reason is that the benefits of mutual cooperation increase relative to the payoffs of unilateral cooperation and no cooperation because cooperation reduces the variance of payoffs. The more risk averse players are, the more attractive cooperation becomes compared to free-riding. In their model, there is a first threshold above which the prisoners' dilemma turns into a chicken game and a second threshold above which the game turns into an assurance game. Compared to their paper, we allow for an arbitrary number of players, model cooperation as a two-stage coalition formation game and consider explicitly the role of learning.
Bramoullé and Treich (2009) [...] Boucher and Bramoullé (2010) consider the effects of risk aversion on coalition formation, but only with no learning. They analyze the formation of an international environmental treaty using a similar coalition game and payoff function as adopted in this paper. Using an expected utility approach, their analysis focuses on the effect of uncertainty and risk aversion on signatories' efforts, the participation level in an agreement and total expected utility. They show that if additional abatement reduces the variance of countries' payoffs, then, under risk aversion, an increase in uncertainty tends to increase abatement levels and may decrease equilibrium IEA membership, while the reverse is true if additional abatement increases the variance of countries' payoffs.4 In this paper, our model of no learning satisfies the first condition, but we extend the analysis of Boucher and Bramoullé (2010) by also considering partial learning and full learning.
Thus, taken together, our paper generalizes the analysis of Kolstad and Ulph (2008) by allowing for risk aversion, and the analyses of Boucher and Bramoullé (2010) and Endres and Ohl (2003) by considering the role of learning. The key findings are that, as countries become more risk averse, it is no longer the case that for most parameter values the scenario of "No Learning" yields the highest expected aggregate utility; increasingly, it is the scenario of "Full Learning". Moreover, the set of parameter values for which the scenario "Partial Learning" yields the highest expected aggregate utility, which is a small subset of such values when countries are risk neutral, becomes even smaller as countries become more risk averse.
Thus, we qualify the negative conclusion about the role of learning in a strategic context if players are sufficiently risk averse.
In our model, emissions last for just one period, which may seem restrictive in the context of dynamic environmental problems such as climate change. However, it has been shown in the literature on IEA formation under uncertainty that one-period models produce similar results to multi-period models. For instance, Kolstad and Ulph (2008), using a one-period model, and [...]

Taken together, the tension we seek to capture in our modeling is between No Learning, where countries decide to join an IEA and base their decisions forever on expected damage costs, ignoring any later information; Full Learning, where countries delay any decision to join an agreement until (almost) all uncertainty about damage costs has been resolved; and Partial Learning, where countries start the process of joining an IEA knowing that, as they get better information, they will be able to use it to adjust their emissions policies. This would seem to be particularly relevant to the kind of situation to which Weitzman (2009) has drawn attention: a small probability of catastrophic climate change.

4 Hong and Karp (2013) show that it does not matter whether one analyses the provision of a public good or the amelioration of a public bad. What matters is whether players' actions increase or decrease the volatility of payoffs. In our model, as in Endres and Ohl (2003) [...]
The paper proceeds as follows. In section 2, we set out the theoretical model and present our theoretical results in section 3. Section 4 presents some simulation results while Section 5 summarizes our main conclusions and implications for future research.

No Uncertainty
To establish the basic framework, we set out the model with no uncertainty. There are N [...]

with [...] a positive constant. In this simple model with a linear payoff function, following the literature, the (continuous) strategy space can be normalized to $x_i \in [0, 1]$.6 To make this model interesting, we make the following assumption: the individual benefit exceeds the individual unit damage cost from pollution, i.e. $1 > \gamma$ (so countries pollute in the Nash equilibrium), but does not exceed the global unit damage cost, i.e. $1 < N\gamma$ (so countries abate in the social optimum); the second condition is a sufficient [...], which we will need when we consider expected utility.7

In order to study coalition formation, we employ the widely used two-stage model of IEA formation (Carraro and Siniscalco, 1993 and Barrett, 1994), which is solved backwards. In stage 2, the emission game, for any arbitrary number of IEA members $n$, $1 \le n \le N$, the members of the IEA (which we denote by the symbol c for coalition countries) and the remaining countries (which we denote by the symbol f for fringe countries) set their emission levels as the outcome of a Nash game between the coalition and the fringe countries.8 That is, the coalition members together maximize the aggregate payoff to their coalition, whereas fringe countries maximize their own individual payoff. Given $1 > \gamma$, [...] coalition members will also pollute, [...] Knowing the payoffs to coalition and fringe countries for any arbitrary number of IEA members, we then determine the stable (Nash) equilibrium in stage 1, the membership game.

[...] Then the additional benefit from pollution of 1 falls short of the additional damage $n\gamma$, as by assumption $1 \le n\gamma$ in the initial situation with $n$ members. It is easily checked that such an equilibrium is also externally stable. The total payoff in a stable coalition is given by: [...]

Thus, this simple model provides a relationship between the unit damage cost $\gamma$ and the equilibrium number of coalition members. The equilibrium is a knife-edge equilibrium with $n^*(\gamma)$ countries forming the coalition, which de facto dissolves once a member leaves the coalition, as no country would abate anymore. The equilibrium coalition size weakly decreases in the cost-benefit ratio from emissions: the larger is $\gamma$, the smaller is the number of countries in a stable IEA.

5 In many papers with a dynamic payoff structure but fixed membership, results are qualitatively similar to the one-period emission game (e.g. Rubio and Casino 2005). The extension to flexible membership would be more interesting but is technically very challenging. See Rubio and Ulph (2007).
6 Either benefits from emission are lower than damage costs, in which case equilibrium emissions are zero, or the reverse is true, in which case equilibrium emissions would attain their maximum. Thus, the upper bound is set to 1 here.
7 The lowest possible payoff is derived if a country abates and all other countries free-ride, which is given by [...]
8 A sequential Stackelberg game in the second stage, as an alternative assumption (e.g. Barrett 1994), would make no difference here as players have dominant strategies. This also applies to Boucher and Bramoullé (2010).
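The membership logic of this section can be illustrated with a small numerical sketch. It is not the paper's code: it assumes a linear payoff of the form pi_i = alpha + x_i - gamma * X, with X denoting total emissions, in the spirit of Kolstad and Ulph (2008). The names `alpha`, `stable_size`, `payoffs` and the stability helpers are hypothetical labels chosen for this illustration.

```python
def stable_size(gamma: float, N: int) -> int:
    """Smallest n with n * gamma >= 1, i.e. the smallest coalition that abates."""
    for n in range(1, N + 1):
        if n * gamma >= 1.0:
            return n
    raise ValueError("no coalition abates; the sketch needs N * gamma >= 1")


def payoffs(n: int, N: int, gamma: float, alpha: float) -> tuple[float, float]:
    """Return (member payoff, fringe payoff) for a coalition of n countries.

    Assumed payoff: alpha + own emissions - gamma * total emissions.
    Fringe countries always pollute (x = 1) because gamma < 1.
    """
    x_c = 0.0 if n * gamma >= 1.0 else 1.0  # members abate once n * gamma >= 1
    total = n * x_c + (N - n)
    return alpha + x_c - gamma * total, alpha + 1.0 - gamma * total


def internally_stable(n: int, N: int, gamma: float, alpha: float) -> bool:
    """No member gains by leaving: member payoff at n >= fringe payoff at n - 1."""
    return payoffs(n, N, gamma, alpha)[0] >= payoffs(n - 1, N, gamma, alpha)[1]


def externally_stable(n: int, N: int, gamma: float, alpha: float) -> bool:
    """No fringe country gains by joining: fringe payoff at n >= member payoff at n + 1."""
    return n == N or payoffs(n, N, gamma, alpha)[1] >= payoffs(n + 1, N, gamma, alpha)[0]


if __name__ == "__main__":
    N, gamma, alpha = 10, 0.25, 9.0  # illustrative values only
    n_star = stable_size(gamma, N)
    member, fringe = payoffs(n_star, N, gamma, alpha)
    print(n_star, internally_stable(n_star, N, gamma, alpha),
          externally_stable(n_star, N, gamma, alpha))
    print(n_star * member + (N - n_star) * fringe)  # total payoff
```

Under these assumptions the stable size is the smallest integer at or above 1/gamma, which reproduces the knife-edge property: one member fewer and the remaining coalition no longer abates.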

Uncertainty, Risk Aversion and Learning
Now assume that the unit damage cost of global emissions is uncertain and equal for all countries, both ex-ante and ex-post. We denote its value by $\gamma_s$ in state of the world $s$, and hence (1) becomes: [...]

Following Kolstad and Ulph (2008) and Boucher and Bramoullé (2010), we [...] and make the following assumptions: [...]

Assumption 2(i) is just the analogue to Assumption 1(i). Assumption 2(ii) means that uncertainty matters, in the sense that it implies significant differences in the size of the stable IEAs that would arise if we knew for certain which state of the world prevailed.

To allow for risk aversion, we assume that each country has an identical utility function over payoffs: [...]

While ex-ante countries face uncertainty about the true value of unit damage costs, we want to allow for the possibility that countries may learn information during the course of the game which changes the risk they face. We shall follow Kolstad and Ulph (2008) and consider three scenarios of learning. With No Learning (m=NL), uncertainty is resolved neither before stage 1 nor before stage 2. With Full Learning (m=FL), countries learn the true value of unit damage costs before they have to take their decisions on membership (stage 1) and emissions (stage 2). With Partial Learning (m=PL), countries learn the true value of damage costs at the end of stage 1, that is, after they have made their membership decisions but before they make their emission decisions (stage 2). Thus, in this simple analysis, learning takes the form of revealing perfect information.11 We will compare the outcomes of the three scenarios of learning in terms of the expected size of IEAs and expected aggregate utility from an ex-ante perspective, i.e. before stage 1.

Analytical Results
In this section, we set out the equilibrium of the IEA model for each of the three models of learning with risk aversion, generalizing the results of Kolstad and Ulph (2008) who assumed risk neutrality. The proofs are provided in the Appendix.

Full Learning
We start with the benchmark scenario of Full Learning (FL). Players know the realization of the damage parameter $\gamma$ at the outset of the coalition formation game, i.e. before stage 1. Thus, the results follow directly from what we know from the game with certainty in section 2.1 above.

Note that with Full Learning, while the degree of risk aversion does not affect the expected size of the IEA, it will affect expected utility. Importantly, size and utility are computed from an ex-ante perspective to make the comparison with the other models of learning meaningful.
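A minimal numerical sketch of the Full Learning benchmark follows. It rests on several assumptions that are ours rather than the paper's: two states of the world with unit damage costs `gamma_l` and `gamma_h`, `p` as the probability of the low damage cost state, the linear payoff alpha + x_i - gamma * X, and the standard CRRA form u(pi) = A * pi^(1 - rho) / (1 - rho) (logarithmic at rho = 1). All function names are hypothetical.

```python
import math


def crra(pi: float, rho: float, A: float = 1.0) -> float:
    """Standard CRRA utility (assumed functional form); requires pi > 0."""
    if math.isclose(rho, 1.0):
        return A * math.log(pi)
    return A * pi ** (1.0 - rho) / (1.0 - rho)


def stable_size(gamma: float, N: int) -> int:
    """Smallest n with n * gamma >= 1 (stable IEA size under certainty)."""
    return next(n for n in range(1, N + 1) if n * gamma >= 1.0)


def full_learning(p: float, gamma_l: float, gamma_h: float,
                  N: int, alpha: float, rho: float) -> tuple[float, float]:
    """Ex-ante expected membership and expected aggregate utility under FL.

    In each state the certainty outcome applies: n* members abate,
    the N - n* fringe countries pollute.
    """
    exp_n, exp_u = 0.0, 0.0
    for prob, g in ((p, gamma_l), (1.0 - p, gamma_h)):
        n = stable_size(g, N)
        member = alpha - g * (N - n)        # members abate (x = 0)
        fringe = alpha + 1.0 - g * (N - n)  # fringe pollutes (x = 1)
        exp_n += prob * n
        exp_u += prob * (n * crra(member, rho) + (N - n) * crra(fringe, rho))
    return exp_n, exp_u


if __name__ == "__main__":
    for rho in (0.0, 2.0):  # illustrative degrees of risk aversion
        print(rho, full_learning(0.5, 0.125, 0.5, 10, 9.0, rho))
```

In this sketch expected membership is computed state by state from the certainty solution, so it cannot depend on `rho`; only the utility aggregation does.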

No Learning
In this section, we address the scenario of No Learning, in which players take their membership (stage 1) and emission (stage 2) decisions under uncertainty.12 We begin by solving for the optimal emissions of countries for any number of IEA members $n$. It is straightforward to see that fringe countries will always pollute. To solve for the optimal emissions for a coalition member for any $n$, which we denote by $x_c(n)$, we need to introduce some notation. We define: [...] $E\big(u(\pi_c(x_c(n), n))\big)$ is the expected utility to an IEA country when there are $n$ IEA members who set emissions $x_c$ and all fringe countries set emissions equal to 1. Then: [...] and [...] as the smallest value of $n$ such that: [...]

We summarise the results on emissions in the following Lemma: [...]

So the expected equilibrium coalition size is (weakly) smaller under risk aversion than under risk neutrality. With uncertainty, countries are unsure about the state of the world. With risk aversion, and hence concave utility, countries shy away from the commitment to be a member of an IEA, as members always have lower expected utility than fringe countries. This is in line with the findings in Boucher and Bramoullé (2010).
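The coalition's emission choice under No Learning can be explored with a simple grid search. The sketch below is illustrative only and assumes, beyond what the text states, a two-state distribution of damage costs (`gamma_l` with probability `p`, `gamma_h` otherwise), the linear payoff alpha + x_i - gamma * X and a standard CRRA utility; the function names are hypothetical.

```python
import math


def crra(pi: float, rho: float, A: float = 1.0) -> float:
    """Standard CRRA utility (assumed functional form); requires pi > 0."""
    if math.isclose(rho, 1.0):
        return A * math.log(pi)
    return A * pi ** (1.0 - rho) / (1.0 - rho)


def expected_member_utility(x_c: float, n: int, N: int, p: float,
                            gamma_l: float, gamma_h: float,
                            alpha: float, rho: float) -> float:
    """Expected utility of a member when all n members emit x_c and the fringe emits 1."""
    total = n * x_c + (N - n)
    return sum(prob * crra(alpha + x_c - g * total, rho)
               for prob, g in ((p, gamma_l), (1.0 - p, gamma_h)))


def best_member_emission(n: int, N: int, p: float, gamma_l: float,
                         gamma_h: float, alpha: float, rho: float,
                         steps: int = 1000) -> float:
    """Grid search over x_c in [0, 1] for the coalition's common emission level."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda x: expected_member_utility(
        x, n, N, p, gamma_l, gamma_h, alpha, rho))


if __name__ == "__main__":
    args = dict(N=10, p=0.5, gamma_l=0.125, gamma_h=0.5, alpha=9.0)  # illustrative
    for rho in (0.0, 5.0):
        print(rho, [best_member_emission(n, rho=rho, **args) for n in (3, 4)])
```

With these illustrative numbers, a risk-neutral coalition of three pollutes (three times the expected unit damage cost is below 1), whereas a coalition of three with rho = 5 abates. The reason is that total emissions scale the variance of payoffs, so lower emissions become more attractive as utility becomes more concave.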

Partial Learning
In the scenario of Partial Learning, countries have to make their decision on whether to join an IEA without knowing the true damage cost of emissions, but can make their subsequent emission decisions based on that knowledge. We have argued above that, of the three scenarios of learning we present, this is the one that most closely represents the situation the world faces.
It follows that the emission decisions of countries do not depend on risk aversion and so are the same as in Kolstad and Ulph (2008). [...], as in Kolstad and Ulph (2008). However, more generally, we have not been able to determine analytically how $\bar{p}$ varies with the degree of risk aversion. In section 4.2 we report our findings on this from our simulation results.

Proposition 3: Partial Learning
Since the second equilibrium Pareto-dominates the first equilibrium if it exists, expected membership is either [...]

As the degree of risk aversion affects $\bar{p}$, it has an effect on the likelihood of a second coalition with higher membership $n_l$ being stable. This effect is further explored in section 4.2, where we show that the likelihood of the larger equilibrium decreases with risk aversion.
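To make the role of the critical probability concrete, the following sketch computes a threshold from the internal stability condition of the larger coalition. This is our own back-of-the-envelope derivation, not the paper's: it assumes `p` is the probability of the low damage cost state, the linear payoff alpha + x_i - gamma_s * X, a standard CRRA utility, and that the remaining n_l - 1 members still abate in the high damage cost state after one member leaves. The name `critical_p` is hypothetical.

```python
import math


def crra(pi: float, rho: float, A: float = 1.0) -> float:
    """Standard CRRA utility (assumed functional form); requires pi > 0."""
    if math.isclose(rho, 1.0):
        return A * math.log(pi)
    return A * pi ** (1.0 - rho) / (1.0 - rho)


def critical_p(gamma_l: float, gamma_h: float, N: int,
               alpha: float, rho: float) -> float:
    """Smallest p at which a member of the n_l-coalition does not gain by leaving."""
    n_l = next(n for n in range(1, N + 1) if n * gamma_l >= 1.0)
    if (n_l - 1) * gamma_h < 1.0:
        raise ValueError("sketch assumes n_l - 1 members still abate in the high state")
    # Staying: all n_l members abate in both states.
    stay_l = alpha - gamma_l * (N - n_l)
    stay_h = alpha - gamma_h * (N - n_l)
    # Leaving: in the low state the coalition collapses and everyone pollutes;
    # in the high state the other n_l - 1 members keep abating.
    leave_l = alpha + 1.0 - gamma_l * N
    leave_h = alpha + 1.0 - gamma_h * (N - n_l + 1)
    gain_low = crra(stay_l, rho) - crra(leave_l, rho)    # >= 0 since n_l * gamma_l >= 1
    loss_high = crra(leave_h, rho) - crra(stay_h, rho)   # > 0 since gamma_h < 1
    # Staying is weakly better iff p * gain_low >= (1 - p) * loss_high.
    return loss_high / (gain_low + loss_high)


if __name__ == "__main__":
    for rho in (0.0, 2.0):  # illustrative degrees of risk aversion
        print(rho, critical_p(0.15, 0.5, 10, 9.0, rho))
```

Under risk neutrality the threshold reduces to (1 - gamma_h) / ((1 - gamma_h) + (n_l * gamma_l - 1)). For the illustrative parameters used here, the threshold is higher at rho = 2 than at rho = 0.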

Comparison Across the Three Scenarios of Learning
In this sub-section, we investigate what we can say about expected IEA membership, payoffs and expected utility across the four possible equilibria: FL, NL, PL1 and PL2. [...] is possible.

In terms of payoffs across the four equilibria, it is straightforward to see from Propositions 1, 2 and 3 that: [...]

For NL, in the low damage cost state of the world, the highest payoff to coalition members is when $x_c = 1$, which is less than or equal to the payoff to coalition members in PL2 since [...]; in the high damage cost state of the world, the highest payoff to coalition members is when $x_c = 0$, which is less than the payoff to members in PL2 since [...]

Although (9a) and (9b) allow us to rank many of the payoffs across the four possible equilibria for both members and fringe countries in the high and low damage cost states of the world, this is not sufficient to allow us to rank expected aggregate utility at an analytical and general level. The next section reports the simulations we have carried out to compare expected IEA membership and expected welfare across the different models of learning.

Results from Simulations
There are three sets of issues we wish to explore using numerical simulations. (i) What is the expected size of the IEA in the case of No Learning, $E(n^{NL})$, in relation to the theoretical limits [...] and, more importantly, to the key parameters of our model, [...]

To address these questions, assume first that each country has a CRRA utility function13 [...] where ρ ≥ 0 measures the degree of relative risk aversion and with A a constant which we will set equal to 1.14 Moreover, we set [...] = N - 1 in payoff function (1), the smallest value required to ensure non-negative payoffs. We use the risk neutral case (ρ = 0) as a benchmark [...] ensure that $n_l$ and $n_h$ satisfy Assumption 2 and are evenly distributed between 2 and N.15 Thus, our simulations basically capture a larger parameter range.

13 Meyer and Meyer (2006) note that the CRRA utility function is widely used in empirical studies of risk aversion, and that empirical estimates of ρ vary between 0 and 100. They note that such estimates depend on the variable that enters the utility function, and that for the three most commonly used variables (wealth, income and profits) the appropriate empirical estimate increases as one moves from wealth to profits. In our one-period model the relevant variable is income, though there is no distinction between wealth and income. Hence, we have chosen a range of values for ρ at the lower end of the range noted by Meyer and Meyer. See details below. Also note that qualitative conclusions would not change for higher degrees of risk aversion.
14 The constant A is a multiplicative factor which has no effect on the simulation results presented in this section.
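For readers who want to experiment with the degree of relative risk aversion ρ, the sketch below implements a standard CRRA specification, u(pi) = A * pi^(1 - ρ) / (1 - ρ) with u(pi) = A * ln(pi) at ρ = 1, together with the implied certainty equivalent of a payoff lottery. The functional form is the textbook one and is assumed here; the function names and the example lottery are hypothetical.

```python
import math
from typing import Sequence


def crra(pi: float, rho: float, A: float = 1.0) -> float:
    """Standard CRRA utility (assumed form); defined for strictly positive payoffs."""
    if pi <= 0.0:
        raise ValueError("CRRA utility requires a strictly positive payoff")
    if math.isclose(rho, 1.0):
        return A * math.log(pi)
    return A * pi ** (1.0 - rho) / (1.0 - rho)


def crra_inverse(u: float, rho: float, A: float = 1.0) -> float:
    """Payoff that yields utility u under the same CRRA specification."""
    if math.isclose(rho, 1.0):
        return math.exp(u / A)
    return ((1.0 - rho) * u / A) ** (1.0 / (1.0 - rho))


def certainty_equivalent(outcomes: Sequence[float], probs: Sequence[float],
                         rho: float) -> float:
    """Sure payoff giving the same utility as the lottery's expected utility."""
    expected_u = sum(p * crra(x, rho) for x, p in zip(outcomes, probs))
    return crra_inverse(expected_u, rho)


if __name__ == "__main__":
    lottery, probs = [4.0, 9.0], [0.5, 0.5]  # illustrative payoffs only
    for rho in (0.0, 1.0, 2.0):
        print(rho, certainty_equivalent(lottery, probs, rho))
```

With ρ = 0 the certainty equivalent equals the expected payoff; it falls as ρ rises, which is the mechanism through which risk aversion enters the comparison of learning scenarios.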

Results for Size of Stable IEA with No Learning
Recall that from Kolstad and Ulph (2008) the expected size of the stable IEA with No Learning when countries are risk neutral is $E(n^{NL}) = \bar{n}$. In Table 1, we present the results when ρ > 0.16 In terms of the key parameters of the model, $n_h$ and $\bar{n}$, from Lemma 1 we also know that $n_h \le E(n^{NL}) \le \bar{n}$, and rows 9-12 in Table 4 show, for each value of ρ, the percentage of subcases that can arise. When ρ = 0.01, for more than 99.7% of all parameter values, $E(n^{NL}) = \bar{n}$, which is very close to the result with risk neutrality in Kolstad and Ulph (2008).

Second stable IEA with Partial Learning
We showed in section 3.3 that with Partial Learning there always exists a stable IEA with $n_h$ members who abate in the high damage cost state and pollute in the low damage cost state, but there is a critical value of $p$, which we defined as $\bar{p}$, such that for $\bar{p} \le p < 1$ there exists a second stable IEA where members abate in both states of the world. We also said that we had been unable to prove analytically how $\bar{p}$ varies with the degree of risk aversion. In rows 2 and 3 of Table 2, we show, for each value of risk aversion between 0 and 20, the average value of $\bar{p}$ and the percentage of cases for which $\bar{p} \le p < 1$ occurs, respectively. From row 2 we see that as the degree of risk aversion increases, the average value of $\bar{p}$ rises from 0.9437 to 0.9685, while from row 3 the percentage of simulations for which $\bar{p} \le p < 1$ falls from 5.66% to 3.14%. So increasing risk aversion reduces the likelihood that with Partial Learning there exists a second stable IEA with higher membership.

Comparison Across the Three Scenarios of Learning
In the remaining part of [...] Learning is the preferred scenario is higher than No Learning. This is also true for the expected utility of an individual country from an ex-ante perspective, which, for symmetric players, is simply the aggregate utility divided by the total number of countries. Thus, if governments could choose the scenario of learning endogenously in a stage zero, they would prefer Full Learning for high levels of risk aversion. So countries would be better off delaying the decisions to form an IEA and to set their emissions until they have full information about the risks of climate change. The relatively small percentage of parameter values for which the preferred model of forming an IEA is Partial Learning, i.e. deciding whether to join an IEA before getting better information about the risks of climate change, but allowing emissions to be set after better information is available, also declines as countries become more risk averse. Thus, the pessimistic conclusion about the role of learning in a strategic context derived in previous papers is qualified, in particular if the degree of risk aversion is sufficiently large.

Summary and Conclusions
This paper bridges two strands of literature on the formation of IEAs under uncertainty by addressing the combined roles of learning and risk aversion. This approach allowed us to explore the impact of learning for any given level of risk aversion as well as the impact of changing risk aversion under various scenarios of learning.
We generalized the model of Kolstad and Ulph (2008), who showed that with risk neutrality the possibility of learning more information about environmental damage costs generally had rather pessimistic implications for the success of the formation of IEAs. Except for a relatively small set of parameter values for which partial learning would select a high IEA membership, learning resulted in lower or equal expected membership for partial learning and lower expected aggregate and individual payoff for partial and full learning, compared to no learning. Moreover, this parameter range required that the probability of low damage cost is very high, a rather uninteresting parameter constellation in the context of climate change.
Hence, in a strategic context, learning reduces expected aggregate and individual payoffs for a wide range of parameter values. Across the different models of learning, they showed that for a large set of parameter values the scenario of learning which yielded highest expected aggregate and individual payoff was No Learning, which would suggest that countries are better off forming an IEA rather than waiting for better information.
In this paper, we have allowed countries to be risk averse using an expected utility approach which maps payoffs into utility. We first derived the theoretical results for each of our three scenarios of learning with risk aversion, confirming the main findings of Boucher and Bramoullé (2010) [...] Learning are small. However, even with special functional forms for the underlying utility functions, there was limited scope for deriving analytical comparisons across our three scenarios of learning, primarily because welfare effects could differ for signatory and non-signatory countries. Our simulation results showed that, contrary to the finding with risk neutrality, when countries become significantly risk averse, the set of parameter values for which countries are better off with No Learning than with Full Learning shrinks significantly, and the set of cases for which this is reversed grows accordingly. This may explain why it has taken so long for a proper climate agreement to be reached: countries are risk averse and waited until they had much better information about the risks of climate change.
In terms of future research, it would be desirable to use a model with asymmetric countries, though it is unlikely to be possible to derive analytical results; so it may be more useful to introduce different models of learning into Integrated Assessment Models of climate change.
It would also be interesting to endogenise the process of learning by allowing countries to invest in research in order to obtain better information.