1 Introduction

The tension between self-interest and collective interest is a central challenge in all social relationships and understanding how societies can solve the challenge and achieve cooperation among members is important to economics and other disciplines. In many cooperation problems, societies themselves determine the rules that govern the interactions of their members. For instance, government representatives negotiate and establish rules on how to protect global security or prevent climate change. Users of common pool resources develop rules to ensure sustainable harvest of the resource. Work teams establish rules on how to reward positive contributions to the common goal and how to punish free riders. In all these cases, actors who are confronted with the cooperation problem are also the ones who establish the rules, restricting their own behavior, which is known as endogenous choice of institutions.

Although there are numerous institutions, formal and informal, regulating behavior in societies, they are not all equally successful. For example, anti-smoking legislation had very different effects in Norway and Greece (Nyborg et al. 2016). Some institutions thrive while others become extinct after a short period of time. For example, the Montreal Protocol on the protection of the ozone layer is widely seen as a success and the ozone layer is expected to be largely restored by the year 2050. By contrast, the Kyoto Protocol on climate change has been criticized heavily and, when the time came, negotiators decided not to extend the Kyoto Protocol but instead to design a new agreement. It is very difficult, if not impossible, to identify the reasons why some institutions are implemented successfully and solve the cooperation problem while others are not implemented, or are implemented but fail to change behavior. There are many potential reasons which may also influence each other. For instance, success may depend on the society trying to implement the institution, the type of the institution and the process of implementation, the type of the cooperation problem, or a combination of these.

Laboratory experiments have the advantage that they allow for observing behavior under controlled conditions, free of confounding factors. After many years of investigating cooperation under exogenously given institutional settings (for reviews, see Ledyard 1995; Zelmer 2003; Chaudhuri 2011), an important experimental literature has begun to study the endogenous choice of institutions in cooperation games. This literature started around the year 2000 (an early exception is the experiment by Ostrom et al. 1992) and has been growing ever since. The studies reveal institutional choices by individuals and groups facing different cooperation problems and different types of available institutions, and how they fare after having chosen the institution. Specifically, we can observe how many players choose the institution, how these players perform relative to the players who decide against the institution, how the players who choose the institution differ from those who decide against it, and how behavior changes over time as players learn from past experience. With careful design, we can observe how the type of institution, the process of implementation, and the type of the cooperation problem affect institutional choice and ultimately cooperation.

Experiments are also able to reveal the difference between endogenously chosen institutions and exogenously imposed institutions. This difference is highly relevant for policy whenever a regulator has the power to enforce regulations but may want to leave the decision to the constituency, to be taken for example in a referendum, if this promises a better outcome in the end. The comparison between exogenously imposed and endogenously chosen institutions also reveals important insights into the different effects driving cooperative behavior. When an institution is exogenously imposed (or not), the difference in behavior between the players who act under the institution and those who do not is driven by only one effect, namely the effect of the institution. In contrast, when an institution is endogenously chosen, the difference in behavior between the players who choose the institution and those who decide against it can be the result of four different effects: an institution effect, a selection effect, an information effect, and a democracy effect. The first effect, the institution effect, captures the change in cooperative behavior due to the introduction of the institution. This effect is the same as when the institution is introduced exogenously. The selection effect arises because of self-selection into the institution. There may be observable or unobservable factors that affect both subjects’ choice of the institution and their cooperation. The information effect is caused by the information that comes with the institutional choice. The process of institutional choice reveals information to the subjects about their partners’ preferences. This information may be used to draw conclusions about the partners’ future behavior and this, in turn, may change the subjects’ own behavior.
The fourth effect is a genuine democracy effect and is solely caused by the process of choosing the institution (Footnote 1). Collectively establishing an institution could, for example, strengthen feelings of group identity (e.g. Akerlof and Kranton 2000; Chen and Li 2009). Because of the effects of selection, information, and democracy, we would expect different outcomes with endogenous choice of institutions than with exogenously imposed institutions.

The review focuses on five main lessons, each presented in a separate section. In Sect. 2, we provide a framework that helps to categorize the experimental studies along two dimensions: the scale of the cooperation problem (local versus global) and the scale of the institution (exclusive versus inclusive). This framework is useful to understand the main differences between the studies’ basic designs and their results. Section 3 describes how many individuals and groups choose the institution and how they perform relative to those individuals and groups that decide against the institution. In those cases where players can choose not only whether to implement the institution but also the type of the institution, we describe which type of institution is predominantly selected. Section 4 describes which personal characteristics and attitudes influence the institutional choice. In Sect. 5, we describe the differences in behavior when the institution is endogenously implemented by the players themselves and when it is exogenously imposed and where these differences come from. Finally, in Sect. 6, we summarize our main findings and discuss promising avenues for future research in this field. The Supplementary Material contains tables with all 39 studies included in this review and detailed information about their experimental designs, sorted according to the framework presented in Sect. 2 (Footnote 2).

2 Framework

The baseline cooperation game in all studies covered by this review is a prisoners’ dilemma, a public goods game, or a common pool resource game which is played by two or more players who have a dominant strategy to behave non-cooperatively, assuming narrow selfish preferences (Footnote 3). We consider institutions that change the rules of the cooperation game in a significant way by either changing the payoffs or the available strategies. We can distinguish between two main types of institutions: formal institutions that, once implemented, are exogenously enforced and informal institutions that, if implemented, still need to be enforced by the players themselves. Formal institutions may modify the payoffs to playing certain strategies, for example, put a fine on free riding, or they may restrict the available strategies, for example, eliminate the possibility to free ride altogether. If a formal institution involves an institutional cost, it is typically borne by all players. Informal institutions offer the players an option to punish or reward other players which they may or may not use. The cost of the institution in this case is only borne by the players who use it.

Table 1 illustrates the two dimensions which we use to categorize the studies: the scale of the cooperation problem and the scale of the institution. The scale of the cooperation problem can be local, meaning that the population is divided into several subgroups and the benefits of cooperation are reaped only by the members of the same subgroup, or the scale can be global, meaning that the population may or may not be divided into subgroups but the benefits of cooperation are always reaped by the entire population. The institution can be exclusive, meaning that the institution only applies to those individuals who have voted in favor of the institution and excludes those who have voted against it, or the institution can be inclusive, meaning that the group as a whole makes the decision via voting and the institution is binding for everyone (or no one). Combining the two dimensions, the scale of the cooperation problem and the scale of the institution, results in four different situations which are depicted in Table 1.

Table 1 Framework

Panel (a) in Table 1 illustrates a situation where the cooperation problem is local and the institution is exclusive. All studies that fall into this category are summarized in Table S.1 in the Supplementary Material. Choosing an institution in this setting not only determines the institution under which one will act but also the interaction partners. The studies in this category are therefore closely related to the literature on partner selection and social networks (Footnote 4). Furthermore, there is no uncertainty about individuals’ institutional choice so that players have unambiguous information about their co-players’ institutional choices. Examples for this setting are voluntary work teams, clubs, neighborhoods, or more generally groups in which participation is voluntary and cooperation within the group does not have spill-over effects on people outside the group. Though there are no direct spill-over effects, players may still learn about the performance of other (neighboring) groups and change their own behavior. A well-cited study in this category is Gürerk et al. (2006) where individuals repeatedly choose between a standard public goods game and a public goods game with an informal sanctioning institution which allows players to punish or reward other players at a certain cost. After making the choice, players play the chosen game with all those players who have chosen the same game.

Panel (b) in Table 1 illustrates local cooperation and inclusive institutional choice. The corresponding studies are summarized in Table S.2 in the Supplementary Material. In this setting, the population collectively chooses the institution by voting and the population-wide institution then governs all local interactions among members of subgroups. Examples for this setting are institutions, such as liability rules or rules of good conduct, that are chosen at the societal level and govern all interactions between subsets of players within the society. In this case, players typically have only aggregate information about the choice of the institution but not about individual choices. A recent experiment in this category is Dal Bó et al. (2018) which studies pairwise interactions within populations of six subjects. Players first play a prisoners’ dilemma and then vote on the implementation of a fine that reduces the payoff to playing every strategy, but with the payoff to defection falling by more than the payoff to cooperation. In this modified game, mutual cooperation has a lower payoff than in the original prisoners’ dilemma but cooperation becomes the dominant strategy for both players. The fine is implemented for the entire population if, depending on treatment, a majority or a randomly selected player votes in favor of it.
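The logic of such a fine can be illustrated with a minimal sketch. The payoff numbers and the function names `apply_fine` and `dominant_action` below are hypothetical and chosen only for illustration; they are not the parameters used by Dal Bó et al. (2018). The sketch reduces every payoff, with defection payoffs falling by more than cooperation payoffs, and then checks which action is strictly dominant.

```python
# Hypothetical payoffs of a symmetric prisoners' dilemma (NOT taken from any study).
# Keys are (own action, partner's action); "C" = cooperate, "D" = defect.
ORIGINAL = {("C", "C"): 50, ("C", "D"): 10, ("D", "C"): 60, ("D", "D"): 20}


def apply_fine(payoffs, fine_cooperate, fine_defect):
    """Reduce every payoff; the size of the reduction depends on the own action."""
    fines = {"C": fine_cooperate, "D": fine_defect}
    return {(own, other): value - fines[own]
            for (own, other), value in payoffs.items()}


def dominant_action(payoffs):
    """Return the strictly dominant action, or None if neither action dominates."""
    for action, alternative in (("C", "D"), ("D", "C")):
        if all(payoffs[(action, other)] > payoffs[(alternative, other)]
               for other in ("C", "D")):
            return action
    return None


# Assumed fine: payoffs to defection fall by more than payoffs to cooperation.
MODIFIED = apply_fine(ORIGINAL, fine_cooperate=5, fine_defect=20)

print(dominant_action(ORIGINAL))  # defection dominates in the original game
print(dominant_action(MODIFIED))  # cooperation dominates in the modified game
```

With these assumed numbers, mutual cooperation pays less in the modified game than in the original game (45 versus 50) but more than mutual defection in the original game (20), which mirrors the trade-off that players face when they vote on the fine.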

Panel (c) represents a setting in which the institution is exclusive and the cooperation problem is global. Table S.3 in the Supplementary Material gives an overview of all studies falling into this category. A good example for this setting is an international treaty where some nation states participate and commit to provide a global public good, like the protection of endangered species or mitigation of climate change, while other states do not commit but still benefit from the provision of the public good. In this setting, there is unambiguous information about players’ choices and, by design, non-members are better off than members of the institution. Kosfeld et al. (2009) is a well-cited study in this category. Fixed groups of four individuals play a repeated public goods game with three stages. Players first announce whether or not they are willing to enter a coalition. After being informed about the number of willing participants, the potential members decide to implement the coalition or not, using a unanimity rule. Finally, players choose their contributions to the public good with members being bound to contribute their full endowment while non-members are free in their choice.

Panel (d) illustrates a situation where the cooperation problem is global and the institution is inclusive. Most of the studies fall into this category and they are presented in Table S.4 in the Supplementary Material. A society that chooses a tax scheme to fund its public infrastructure or a group of common pool resource users who establish rules for harvesting the resource can serve as examples. A well-cited study in this category is Sutter et al. (2010). In this experiment, fixed groups of four individuals choose to play a standard public goods game, a public goods game with an informal punishment option, or a public goods game with an informal reward option. Voting is voluntary, costly for the voters, and it is repeated until unanimity is reached. The voting outcome is binding for the entire group and for all ten rounds of play.

3 Choice of the institution and cooperation

In this section we describe how many players choose the institution, how many players decide against the institution, and how these two different groups perform after having made the institutional decision. We sort the studies according to the categories presented in the previous section. Within each category, we present similar studies consecutively and try to sort studies according to the theoretical strength of the institutions, starting with institutions which change the equilibrium and ending with institutions which do not change the equilibrium of the original cooperation game.

3.1 Local cooperation and exclusive institution

Grimm and Mengel (2011) let players sort themselves individually into a prisoners’ dilemma or a modified version of the game where the payoffs to defection are reduced and cooperation is the dominant strategy. The games are played for 100 rounds and each player out of eight is allowed to choose between the two games every fourth round. By design, half the players start to play the modified game and about 70% of them choose to cooperate. Almost all subjects select the modified game in the second half of the experiment and the cooperation rate is close to 100%. The original prisoners’ dilemma is played only in the first half of the experiment and the average cooperation rate is 16%. Average payoffs are significantly higher in the modified game.

In a similar experiment by Grimm and Mengel (2009), players sort themselves individually into a prisoners’ dilemma or a modified game with lower payoffs to defection. The modification does not change the unique defection equilibrium but it nevertheless leads to different behavior. By design, half the players start to play the modified game and more than 60% of them choose to cooperate. About two-thirds of players play the modified game in the second half of the experiment and around 60% of them choose to cooperate on average. The remaining players play the original prisoners’ dilemma and only 10% of them choose to cooperate on average. Average payoffs are also higher in the modified game. The proportion of players choosing the modified game increases slightly when information about the average payoffs in both games is provided.

Cobo-Reyes et al. (2019) first assign subjects randomly to one of two groups with one group playing a standard public goods game and the other group playing a modified version with a fixed cost where free riding is automatically punished and cooperation is the dominant strategy. After this initial assignment, individuals are allowed to switch between the two groups in every round. The parameters are chosen so that efficiency requires the players to stay in their initial group and contribute their full endowment to the public good. Cooperation is significantly higher with punishment institution than without (approximately 90 vs 40%) throughout the 30 rounds of play, and also payoffs are significantly higher with the institution despite the fixed cost. Players are informed about the performance in both groups and there is increasing migration to the punishment institution. At the end about 75% of players reside in the group with punishment institution. In another treatment, the two groups first decide whether they want to implement the punishment institution using majority voting, and then the individuals decide whether they want to move to the other group. There is much less migration in this treatment. As before, cooperation is higher with punishment institution than without (approximately 90 vs 50%) and also payoffs are significantly higher. The share of groups implementing the punishment institution increases moderately over time from 42 to 58%.

The experiments by Gürerk et al. (2006, 2014), Gürerk (2013), and Gürdal et al. (2019) have a similar design. They let participants sort themselves individually into a standard public goods game or a game with an informal institution that allows them to punish or reward other players at some cost. The institution in Gürerk et al. (2006) allows for both punishing and rewarding other players while the institutions in Gürerk et al. (2014), Gürerk (2013), and Gürdal et al. (2019) only allow for one of the two measures. The punishment institutions are initially unpopular in all experiments but over time more and more participants migrate to the game with punishment institution and use it to improve cooperation. Players receive detailed information about the performance in both games after each round of play which can help to explain the fast and almost complete migration to the game with punishment institution. In Gürerk et al. (2006), at the beginning of 30 rounds, only 37% of players choose the game with institution. These players contribute on average about 64% of the endowment to the public good and many of them are willing to punish low contributions. The remaining 63% of players choose the standard game at the beginning and contribute only 37% of the endowment to the public good. By the end of the game, the share of players in the game with institution has increased to 93% and the average contribution rate of those players has increased to 97%. As punishment is used primarily in the beginning of the game and much less in later rounds, payoffs are on average higher with the institution than without. Similar behavior is observed by Gürerk et al. (2014) when participants choose between the standard game and a game with informal punishment option. In a second treatment, participants choose between the standard game and a game with an informal reward option where players can pay to reward other players. 
Here, a relatively stable share of about 80% of players choose the game with reward option throughout all rounds and the average contribution rate for these players is 52% with a slight downward trend over time. The remaining players in the standard game have an average contribution rate of around 24%, also showing a slight downward trend over time. The reward option is thus less effective in increasing cooperation than the punishment option but it is still popular and leads to higher average payoffs than the standard game. Gürerk (2013) examines how providing ex-ante information about the choice and performance of previous players affects subjects’ behavior. He uses the same setup as the punishment treatment in Gürerk et al. (2014) and uses their results for the ex-ante information. The results show that providing ex-ante information affects individuals’ institutional choice and cooperation mostly at the beginning of the game. When information is provided, 54% of players join the punishment institution in the first round and they contribute 80% of the endowment on average while the players in the standard game contribute 20% on average. Without information, only 31% of players join the punishment institution in the first round and they contribute 65% on average. Over time more and more players join the punishment institution in both conditions and make high contributions. In both conditions, payoffs in the punishment game are initially lower than in the standard game but they are higher towards the end and on average. Gürdal et al. (2019) compare the popularity and the effects of an informal punishment institution among German students (taken from Gürerk et al. 2014) and Turkish students. They find that the average cooperation level is slightly lower among Turkish students but the support of the punishment institution and its effect on cooperation are very similar.
The punishment institution increases cooperation significantly in both samples and over time there is almost complete migration from the standard game to the game with punishment institution.

The experiment by Nicklisch et al. (2016) allows individuals to join one out of three games: (1) a standard public goods game without punishment, (2) a game with an informal punishment option where players can punish each other, and (3) a game with a central punishment scheme where a randomly selected subject from outside the group can punish group members. Individuals receive information about the outcomes in all three games and they choose between the games every four rounds over a period of 32 rounds. In three different treatments, the players’ contributions are displayed to the others always correctly (10 out of 10 times), almost always correctly (9 out of 10), or only half the time correctly (5 out of 10). Both punishment institutions are relatively unpopular at the beginning irrespective of the noise level. Over time the support increases in all three conditions. At the end of the game, more than 80% of players reside in one of the two punishment institutions when the noise level is zero or low, and more than 60% play with punishment when the noise level is high. Average contributions and payoffs are significantly higher with punishment than without, but the differences become smaller and partly insignificant when the noise level increases.

Fehr and Williams (2017) let individuals sort themselves repeatedly into one of four games: (1) a standard public goods game without punishment, (2) a game with an informal uncoordinated punishment option where players can punish each other after the contribution stage, (3) a game with an informal coordinated punishment option where players are first asked what they think should be contributed to the public good and subsequently informed about the average expectations, then they choose their contributions and afterwards may punish each other, and (4) a game with central punishment where players are first asked about and informed about each other’s expectations, then choose their contributions, and afterwards one elected player may punish other players. Players choose between the different games every round and before they choose, they are informed about the average earnings in each game. When individuals choose the first time, about one-third of them choose the standard game without punishment option and their average contribution rate is about 20%. The other two-thirds choose either coordinated peer punishment or central punishment and the average contribution rate for those players is about 90%. After a few rounds of playing, these two institutions exist almost exclusively with about half the subjects joining the institution that delegates punishment to a representative and the other half joining the institution that allows all players to punish. The standard game becomes extinct very quickly and the uncoordinated punishment option is almost never chosen. This shows that providing information about the players’ expectations before contributions and punishments are chosen serves as a very useful coordination device. The two punishment institutions that include this stage lead to very high contribution rates between 90 and 100% from the beginning. They also lead to higher payoffs than the standard game from the beginning because punishments are rarely needed.

Figure 1 provides an overview of these findings (Footnote 5). Each bar represents an experimental treatment with the lower part showing the share of individuals inside the institution and the upper part showing the share of individuals outside the institution. The colors indicate the cooperation rates inside and outside the institution. The upper panel shows institutional choices when subjects choose the first time and cooperation rates in the first round following the choice of the institution. The lower panel shows institutional choices when subjects choose the last time and cooperation rates in the last round after the final choice of the institution. The figure confirms that subjects who join the institution behave much more cooperatively than those who do not join the institution. Taking the experimental treatment as the unit of observation and using two-sided Wilcoxon signed-rank tests, we find that average cooperation rates are significantly higher inside the institution than outside the institution both in the first round (67 vs 31%, P = 0.0002) and in the last round (82 vs 10%, P = 0.0016). The figure also shows that subjects choose very differently in later rounds when they have gained experience than when they choose the first time. Significantly more subjects choose the institution in the last round than in the first round (84 vs 45%, P = 0.0017). Cooperation inside the institution increases from the first to the last round (67 vs 82%, P = 0.0231) while cooperation outside the institution decreases (31 vs 10%, P = 0.0025) (Footnote 6). The studies shown on the right in Fig. 1 indicate that not all institutions are successful in stimulating high cooperation and allow us to speculate about possible reasons. The last bar in the figure shows an institution which does not fully eliminate the free-riding incentives: the prisoners’ dilemma with lower defection payoffs in Grimm and Mengel (2009).
Under this institution, free-riders earn less than without the institution, but ceteris paribus they still earn more than if they had cooperated. These remaining free-riding incentives limit cooperation inside the institution and also the popularity of the institution. The punishment institution in Nicklisch et al. (2016) shown in the second-to-last bar suffers from noisy information about who is a cooperator and who is a free-rider. The reward option in Gürerk et al. (2014) (third-to-last bar) is very popular but it does not achieve a very high cooperation level, arguably because free-riders do not face an explicit risk of getting punished under this institution.

Fig. 1

Overview of results for local cooperation and exclusive institution. Note: Bars represent experimental treatments with the lower (upper) part of each bar showing the share of individuals inside (outside) the institution. The upper (lower) panel shows institutional choice when individuals choose the first (last) time. Colors indicate achieved cooperation rates

3.2 Local cooperation and inclusive institution

Dal Bó et al. (2018) let groups of six players choose between a prisoners’ dilemma and a modified game where payoffs are reduced in a way that cooperation becomes the dominant strategy. The reduction in payoffs can be interpreted as institutional cost. The costs imply that payoffs are potentially higher in the prisoners’ dilemma but only if players cooperate. In equilibrium, payoffs are higher in the modified game than in the prisoners’ dilemma because mutual cooperation in the modified game has higher payoffs than mutual defection in the prisoners’ dilemma. The decision which game to play is made by majority voting or by one randomly selected player. Once the game is chosen, it is played over five rounds in alternating pairwise interactions within the group so that every player interacts with each of the other players only once. The treatments vary in the decision rule and the game played before the players vote. Across all treatments, only 46% of players vote for the modified game despite the welfare increase in equilibrium. The proportion of groups that play the modified game ranges from 25 to 55%, depending on treatment. Cooperation rates in the modified game are very high throughout all rounds (on average 94–98%) compared to relatively low cooperation rates in the prisoners’ dilemma (on average 15–36%). Payoffs are also significantly higher in the modified game. In one treatment (Majority Repeated), groups do not choose once and for all but in every round. Here, the proportion of groups that play the modified game increases from 55 to 90%. The average cooperation rate is 98% in the modified game and 17% in the prisoners’ dilemma. This result shows that an institutional cost can make a strategically advantageous game look unattractive initially but players can learn to overcome the bias over time.

In a similar experiment by Dal Bó et al. (2010), groups of four players choose between a prisoners’ dilemma and a modified game in which a fine is imposed on unilateral defection, an off-equilibrium change that makes mutual cooperation another equilibrium of the game while leaving the payoffs to mutual cooperation unchanged. Once a game is chosen by simple majority voting, it is played for ten rounds in alternating pairwise interactions within the group. Only 53% of the players vote for the coordination game which leads to significantly higher cooperation rates. In the first round after choosing, 72% of the players in the coordination game cooperate compared to only 18% in the prisoners’ dilemma. Since there is no treatment with repeated choice we do not know how players’ behavior would change with experience.
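How a fine on unilateral defection turns a prisoners’ dilemma into a coordination game can be checked with a short sketch. The payoffs, the size of the fine, and the helper names `fine_unilateral_defection` and `pure_nash_equilibria` are assumptions made for illustration and are not the parameters of Dal Bó et al. (2010). The sketch enumerates the pure-strategy Nash equilibria of a symmetric two-player game before and after the fine.

```python
from itertools import product

ACTIONS = ("C", "D")  # "C" = cooperate, "D" = defect

# Hypothetical prisoners' dilemma payoffs, keyed by (own action, partner's action).
PD = {("C", "C"): 50, ("C", "D"): 10, ("D", "C"): 60, ("D", "D"): 20}


def fine_unilateral_defection(payoffs, fine):
    """Fine a player only when she defects against a cooperating partner."""
    modified = dict(payoffs)
    modified[("D", "C")] -= fine
    return modified


def pure_nash_equilibria(payoffs):
    """List all pure-strategy Nash equilibria of a symmetric 2x2 game."""
    equilibria = []
    for a1, a2 in product(ACTIONS, repeat=2):
        # Each player's action must be a best reply to the other's action.
        best_reply_1 = all(payoffs[(a1, a2)] >= payoffs[(d, a2)] for d in ACTIONS)
        best_reply_2 = all(payoffs[(a2, a1)] >= payoffs[(d, a1)] for d in ACTIONS)
        if best_reply_1 and best_reply_2:
            equilibria.append((a1, a2))
    return equilibria


COORDINATION = fine_unilateral_defection(PD, fine=20)  # assumed size of the fine

print(pure_nash_equilibria(PD))            # only mutual defection
print(pure_nash_equilibria(COORDINATION))  # mutual cooperation and mutual defection
```

In this sketch the fine is never paid in either equilibrium of the modified game, so the payoff to mutual cooperation is the same as in the original game; the institution changes behavior only through the off-equilibrium threat.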

Figure 2 summarizes the findings. As before, each bar represents one experimental treatment. The lower part is the share of groups inside the institution, the upper part is the share of groups outside the institution, and the colors indicate the cooperation rates inside and outside the institution. The upper panel shows behavior when choosing the first time and the lower panel shows behavior when choosing the last time. Using a two-sided Wilcoxon signed-rank test, we find that first round cooperation rates are significantly higher inside the institution than outside the institution (91 vs 30%, P = 0.0625). Unfortunately, there is only one treatment where subjects choose between games repeatedly (Dal Bó et al. 2018, Majority Repeated). This treatment shows that, with experience, many more subjects choose the institution than at the beginning. Due to the lack of observations, we are unable to provide statistical tests for the last round and for the changes over time.

Fig. 2

Overview of results for local cooperation and inclusive institution. Note: Bars represent experimental treatments with the lower (upper) part of each bar showing the share of groups inside (outside) the institution. The upper (lower) panel shows institutional choice when groups choose the first (last) time. Colors indicate achieved cooperation rates

3.3 Global cooperation and exclusive institution

Before we start describing the results of the studies in this category, it is important to note that many of these studies fix the cooperation decisions inside and outside the institution by design and only analyze the choice of the institution. For instance, players may decide whether they want to join the institution or not, knowing that the members of the institution will be bound to cooperate while the non-members will be bound to free-ride. Therefore, the difference in cooperation between members and non-members occurs merely by design and the more interesting question is how many players decide to join the institution.

Another important feature in many of these studies is the so-called minimum participation threshold. The minimum participation threshold gives the minimum number of members that are needed for the institution to enter into force. If fewer players than this minimum number are willing to enter, the institution will not enter into force and the players will play the cooperation game without the institution. The minimum participation threshold typically is the smallest number of members for which the institution is profitable compared to the situation without the institution. However, experimenters sometimes set the minimum participation threshold higher or let the players choose the threshold in order to test how the minimum threshold affects institutional choice. In this setting, theory predicts that an individual joins the institution if she is pivotal for the formation of the institution and does not join otherwise. Thus, the minimum participation threshold should be met exactly and the institution should always form. The reason for this is that, even though members by design earn less than non-members, they still earn more with the institution in place than without the institution. To coordinate who joins and who does not, the experiments typically include a coordination mechanism such as real-time information about participation decisions (McEvoy et al. 2011), sequential participation decisions (McEvoy et al. 2015), or an additional stage in which members, after being informed about the number of members, decide whether or not to implement the institution (Kosfeld et al. 2009).
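The pivotality argument can be checked in a stylized participation game. The sketch below assumes a linear public goods game in which members contribute their full endowment, non-members contribute nothing, and everyone simply keeps the endowment if the institution fails to form. The endowment, the marginal per capita return, and all function names are hypothetical and are not taken from any of the cited studies.

```python
from fractions import Fraction

ENDOWMENT = 10          # hypothetical endowment per player
MPCR = Fraction(2, 5)   # hypothetical marginal per capita return


def payoff(is_member, n_members, threshold):
    """Payoff of one player given the number of members and the threshold."""
    if n_members < threshold:
        # Institution does not form: assume everyone keeps the endowment.
        return Fraction(ENDOWMENT)
    public_good = MPCR * ENDOWMENT * n_members
    return public_good if is_member else ENDOWMENT + public_good


def min_profitable_threshold(n_players):
    """Smallest membership size at which members earn more than without the institution."""
    for k in range(1, n_players + 1):
        if payoff(True, k, k) > ENDOWMENT:
            return k
    return None


def is_strict_equilibrium(n_players, n_members, threshold):
    """True if every unilateral switch of membership strictly lowers the switcher's payoff."""
    if n_members > 0:
        if payoff(False, n_members - 1, threshold) >= payoff(True, n_members, threshold):
            return False  # a member would not lose by leaving
    if n_members < n_players:
        if payoff(True, n_members + 1, threshold) >= payoff(False, n_members, threshold):
            return False  # a non-member would not lose by joining
    return True


def strict_equilibria(n_players, threshold):
    return [m for m in range(n_players + 1)
            if is_strict_equilibrium(n_players, m, threshold)]


print(min_profitable_threshold(10))   # 3
print(strict_equilibria(10, 3))       # [3]
print(strict_equilibria(10, 6))       # [6]
print(strict_equilibria(10, 10))      # [10]
```

Under these assumptions, the only membership size at which every player strictly prefers her current choice is the threshold itself: each member is pivotal, and each non-member earns more by staying outside. Membership sizes well below the threshold are weak equilibria in this sketch, which is one way to see why the experiments add a coordination mechanism.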

McEvoy et al. (2011) consider groups of ten players who play a repeated public goods game with reshuffling of groups between rounds. In every round, players can choose whether they want to become a member of an institution or not, knowing that the members will be punished for not contributing fully to the public good (with some probability) while non-members will be bound to contribute zero and not be punished. The minimum participation threshold is either six or ten, depending on treatment. When six players are required for the institution to form, 73% of players join the institution when choosing the first time and 83% of them comply. In the last round, 65% of players join the institution and 62% of them comply. When all players are required for the institution to form, 98% of players join the institution when choosing the first time and 87% of them comply. In the last round, 93% of players join the institution and 63% of them comply. It is not clear why a significant share of the members of the institution chooses not to comply, especially towards the end of the game. One potential reason is that the punishment, while deterrent, is enforced only with 80% probability and thus some players may prefer to gamble.

Gerber et al. (2013) also study the effects of an exogenously given minimum participation threshold on institutional choice in a repeated public goods game. Depending on treatment, the formation of the institution requires either all four players or only three out of four players to join the institution. Members of the institution are forced to contribute the full amount to the public good while non-members are free in their choice. When all players are required to join the institution, the institution is implemented in 41% of cases in the first round. Members are bound to contribute the full amount to the public good while the groups without institution contribute 26% on average. Behavior in the last round is similar with participation in the institution being somewhat higher (59%). When only three players are required to join, the institution forms in 63% of cases in the first round. Most of these institutions have three members who are forced to contribute fully. The subjects who are outside the institution, either because the institution has not formed or because these subjects have not joined the institution, contribute on average 24%. In this treatment, participation in the institution even decreases over time to 33% in the last round.

Kosfeld et al. (2009) study institution formation in a repeated public goods game with either high or low marginal per capita return to the public good. Playing in groups of four, players first decide whether they are willing to join the institution or not. After being informed about the number of potential members, these potential members decide by unanimity rule whether they want to implement the institution or not. If the institution is implemented, members are forced to contribute the full amount to the public good while non-members are free in their choice. With low marginal per capita return to the public good, the institution is implemented 43% of the time on average and, in 83% of these cases, participation is full. With high marginal return, the institution is implemented 61% of the time on average and, in 69% of these cases, participation is full. Institutions with less than full participation are often rejected by the players, even when they are profitable. Importantly, the tendency to reject profitable institutions with less than full participation does not decrease but rather increases over time. The increase in the number of implemented institutions over time is exclusively driven by an increase in institutions with full participation.

In a similar study, McEvoy et al. (2015) investigate institutional choice in a repeated six-player public goods game in which the players themselves determine the minimum participation threshold via voting in a pre-stage. Members of the institution are forced to contribute the full amount to the public good while non-members are forced to contribute zero. The majority of players vote in favor of the full and efficient minimum participation threshold so that this minimum threshold is implemented 77% of the time across all rounds. In those cases, the institution forms 91% of the time. Institutions with less than full participation are rarely proposed and adopted.

Dannenberg et al. (2014) employ a repeated non-linear public goods game with interior solutions in which each of ten players can decide whether to join an institution or not. They compare an institution in which the members are forced to maximize their joint payoff and an institution in which the members can determine their contributions endogenously by a smallest-common-denominator rule. When the institution forces members to maximize their joint payoff, 40% of players are willing to join the institution initially and this percentage decreases to 29% by the final round. When members can determine their contributions endogenously, 64% of players are willing to join the institution at the beginning which decreases to 48% by the final round. In those cases, the members choose a lower than optimal contribution level which limits the payoff advantage of the non-members. In both treatments, non-members contribute significantly less on average than the members of the institution. Similar results have been found by Dannenberg (2012).

Taken together, even though theory predicts that the institution is adopted whenever it is profitable to do so, this does not always happen in the experiments. The institution is much more likely to be implemented if it applies to all players and not only a subset of players. If they have the possibility, players attempt to implement an institution that covers all players either by voting for the full minimum participation threshold or by rejecting smaller institutions.

Figure 3 presents an overview of these findings in the same manner as before. Each bar represents one experimental treatment, the lower part is the share of individuals inside the institution, the upper part is the share of individuals outside the institution, and the colors indicate the cooperation rates inside and outside the institution. The upper panel shows behavior when choosing the first time and the lower panel shows behavior when choosing the last time. The figure shows that behavior in the studies in this category is quite different from the behavior shown in Figs. 1 and 2. Cooperation rates are higher inside the institution than outside the institution both in the first round (85 vs. 18%, P = 0.0017, two-sided Wilcoxon signed-rank test) and in the last round (75 vs. 8%, P = 0.0017). Cooperation rates inside and outside the institution decrease slightly over time. However, most of this occurs by design and is thus not very interesting. More interesting is that, across all studies and treatments shown in Fig. 3, participation in the institution does not increase over time: 51% of players opt for the institution when choosing the first time and 49% opt for it when choosing the last time (P = 0.6659). The exception is the study by Kosfeld et al. (2009), and to a smaller degree Gerber et al. (2013), but even here participation in the final round is only about 50%.

Fig. 3 Overview of results for global cooperation and exclusive institution. Note: Bars represent experimental treatments with the lower (upper) part of each bar showing the share of individuals inside (outside) the institution. The upper (lower) panel shows institutional choice when individuals choose the first (last) time. Colors indicate achieved cooperation rates.

3.4 Global cooperation and inclusive institution

In this category, there are a number of public goods experiments that allow players to propose a group-wide contribution level and then vote on the proposals to determine a binding level for the entire group. In this case, players have a weakly dominant strategy to propose and implement the efficient amount as a binding level for the group. In the repeated public goods experiment by Kroll et al. (2007), where majority voting is used to select from the proposed contribution levels, almost all groups implement the efficient level towards the end of the game. Similar results are provided by Bernard et al. (2013) who study a repeated common pool resource game in which players can implement a binding group-wide extraction level. In the repeated public goods experiment by Dannenberg et al. (2014), all players make proposals for a minimum contribution level, and then the smallest proposal becomes the binding minimum contribution level for everyone. This mechanism is very sensitive to the player who makes the smallest proposal and so groups fare very differently. Sixty percent of groups have an increasing minimum level over time that approaches the social optimum at the end. The other 40%, however, implement a low minimum level throughout the game which also leads to very low contributions.
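The sensitivity of the smallest-proposal rule to a single low proposal can be seen in a few lines. This is a minimal sketch with invented proposals for a hypothetical group of five players with 20 tokens each; the function names are chosen for illustration only.

```python
def binding_minimum(proposals):
    """Smallest-proposal rule: the lowest proposal becomes binding for everyone."""
    return min(proposals)


def enforced_contributions(intended, proposals):
    """Raise each intended contribution to at least the binding minimum."""
    floor = binding_minimum(proposals)
    return [max(contribution, floor) for contribution in intended]


# Hypothetical proposals (tokens out of an endowment of 20).
ambitious_group = [20, 18, 20, 19, 20]
group_with_holdout = [20, 18, 20, 19, 2]  # one player proposes a low minimum

print(binding_minimum(ambitious_group))     # 18
print(binding_minimum(group_with_holdout))  # 2
print(enforced_contributions([0, 0, 10, 0, 0], group_with_holdout))  # [2, 2, 10, 2, 2]
```

In this sketch, one low proposal is enough to pull the binding minimum from 18 down to 2, regardless of what the other four players propose.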

In other public goods or common pool resource experiments, players do not propose a group-wide contribution or extraction level but instead a level for each player. In the common pool resource games by Walker et al. (2000) and Margreiter et al. (2005), players repeatedly propose individual extraction rates for every member of their group and then vote on the proposed allocation rules, using either majority or unanimity voting. If an allocation rule is adopted by the group, the allocations are automatically imposed on all members. If no rule is adopted, participants play the standard version of the game. In this setting, players have a weakly dominant strategy to propose an allocation rule that assigns strictly positive extraction levels to sufficiently many players, i.e. the majority under majority voting and all players under unanimity. Walker et al. (2000) find that a proposal is adopted about 50% of the time across all rounds. It is not clear why proposals are not adopted more frequently, but arguably the very high number of possible proposals makes coordination and agreement among group members difficult. Of the adopted proposals, 58% require symmetric extraction levels across all players and 89% of those are socially optimal. Thus, symmetric and efficient proposals are the most focal ones and have the best chances of being implemented. The number of these proposals and their implementation increases over time. As a result, the common pool resource is used more efficiently when a proposal is adopted than when no proposal is adopted. Margreiter et al. (2005) investigate the effect of asymmetric costs of appropriating the common pool resource. With asymmetric costs, high-cost players appropriate less than low-cost players, both in the Nash equilibrium and the social optimum. They show that in homogeneous groups a proposal is adopted in 61% of all cases. In heterogeneous groups, a proposal is only adopted in 32% of all cases. The percentage of socially optimal proposals among all adopted proposals is similarly high in both homogeneous groups (87%) and heterogeneous groups (93%).

The experiments described so far that study the adoption of group-wide or individual allocation rules are not included in Fig. 4 below because groups do not only differ in whether they adopt an allocation rule or not, but also in which allocation rule they adopt. For this reason, a simple comparison between groups with institution and groups without institution is not possible.

Fig. 4 Overview of results for global cooperation and inclusive institution. Note: Bars represent experimental treatments with the lower (upper) part of each bar showing the share of groups inside (outside) the institution. The upper (lower) panel shows institutional choice when groups choose the first (last) time. Colors indicate achieved cooperation rates.

In the experiments of Sutter and Weck-Hannemann (2003, 2004), groups decide repeatedly whether to implement a binding minimum contribution level in a non-linear public goods game with interior solutions. Unlike in the studies above, the binding minimum contribution level is pre-specified by design and players can only decide whether they want to implement it or not. In Sutter and Weck-Hannemann (2003), players decide whether or not to implement asymmetric minimum contribution levels that, if implemented, imply relatively high obligations for some players and relatively low obligations for others. Either way, the obligations are below the Nash equilibrium and do not affect players’ free-riding incentives. Across all rounds, about 80% of groups implement the institution, but there is no significant difference in average cooperation rates between groups that implement the institution (8%) and groups that do not implement the institution (−3%; see footnote 7). The proportion of groups implementing the institution and the cooperation rates are relatively stable over time. Sutter and Weck-Hannemann (2004) consider symmetric minimum contribution levels which are either slightly below or slightly above the Nash equilibrium. In both cases, the institution is relatively popular: it is implemented by 78% of groups when the level is below the Nash equilibrium and by 68% of groups when it is above the Nash equilibrium. In both cases, the groups that implement the minimum contribute more to the public good than the other groups. The proportion of groups that implement the institution increases slightly over time while the cooperation rates are relatively stable both inside and outside the institution.

Kocher et al. (2016) also study the adoption of pre-specified minimum contribution levels, using a linear one-shot public goods game. One treatment allows for adoption of a low minimum level (10% of endowment) and the other treatment allows for adoption of a high minimum level (35%). Both minimum levels are above the Nash equilibrium but below the social optimum, each compared to a standard voluntary contribution mechanism. Players first vote for or against the adoption of the minimum contribution level and then make a contribution decision for both cases, with and without the minimum level (strategy method). In the end, the vote of one randomly selected player determines whether the minimum contribution level is implemented or not. In this setting, players have a dominant strategy to vote in favor of the minimum contribution level and, if it is implemented, to contribute exactly at that level. The results show that 67% of players vote in favor of the low minimum level and a large majority of 88% of players vote in favor of the high minimum level. In the case of the low minimum, there is no difference in contributions when the minimum is implemented and when it is not implemented (33 vs 34%). In the case of the high minimum, contributions are higher when the minimum is implemented than when it is not implemented (51 vs 31%; see footnote 8). Similar results are provided by Martinsson and Persson (2019) who study the adoption of a pre-specified 25-percent minimum contribution level in a linear public goods game.
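The incentive to vote for a binding minimum can be sketched for the benchmark case in which all players are payoff-maximizing and therefore contribute exactly the lowest admissible amount. The endowment, group size, and marginal per capita return below are hypothetical and not taken from Kocher et al. (2016); only the 10% and 35% minimum levels follow the description above.

```python
from fractions import Fraction


def selfish_payoff(endowment, mpcr, group_size, minimum):
    """Payoff per player in a linear public goods game when everyone
    contributes exactly the binding minimum (0 means no minimum)."""
    total_contributions = group_size * minimum
    return endowment - minimum + mpcr * total_contributions


# Hypothetical parameters; mpcr * group_size > 1 makes contributing efficient.
endowment, mpcr, group_size = 20, Fraction(2, 5), 4

no_minimum = selfish_payoff(endowment, mpcr, group_size, minimum=0)
low_minimum = selfish_payoff(endowment, mpcr, group_size, minimum=2)   # 10% of endowment
high_minimum = selfish_payoff(endowment, mpcr, group_size, minimum=7)  # 35% of endowment

print(float(no_minimum), float(low_minimum), float(high_minimum))  # 20.0 21.2 24.2
```

In this benchmark, each token of binding minimum raises every player's payoff by mpcr × group_size − 1, so a payoff-maximizing player prefers the minimum to be implemented and prefers the higher minimum to the lower one.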

Andreoni and Gee (2012) let groups first play a standard public goods game and then provide them with the opportunity to implement a formal punishment institution that punishes the lowest contributor and makes full contributions the unique Nash equilibrium. In groups of four, players receive an additional endowment in a pre-stage and decide how much they want to invest in the punishment institution. If the aggregate investment reaches a certain threshold, the institution is implemented for the remaining rounds; otherwise investments are refunded and players play the standard public goods game. Across all rounds, 85% of groups implement the punishment institution. Both contributions and payoffs are significantly higher in the groups that implement the punishment institution than in the other groups, even when the institutional cost is taken into account.

In the experiment by Barrett and Dannenberg (2017), groups consisting of five players choose to play either a prisoners’ dilemma or a modified game where both full defection and full cooperation are Nash equilibria. In the first treatment, there is no institutional cost, meaning that payoffs to both full defection and full cooperation are the same as in the original prisoners’ dilemma. In the second treatment, playing the modified game comes at a fixed cost, meaning that full defection has the same payoffs as in the original prisoners’ dilemma but the payoffs to full cooperation are lower. The experiment shows that, in both treatments, cooperation and payoffs are significantly higher in the modified game but participation is very different. Without institutional cost, about half the groups choose the modified game in the first round. This proportion jumps to 100% when groups choose a second time and stays there for the remaining rounds. With institutional cost, all groups start by choosing the prisoners’ dilemma and only half the groups switch to the modified game over the course of the game.

Feld and Tyran (2002) let groups choose between a standard public goods game and a public goods game in which contributing less than the full amount is automatically punished. The punishment is non-deterrent and zero contribution remains the dominant strategy. Groups consist of three players and they use majority voting to determine the game they will play for one round after choosing. Using the strategy method, players are asked to make contribution decisions contingent on all possible distributions of votes. The authors find that half the players vote for the game with non-deterrent punishment and the other half vote against it. The average contribution rate with non-deterrent punishment is 71% compared to 24% in the standard game. Average payoffs are slightly higher in the game with non-deterrent punishment than in the standard game.

In a follow-up study, Tyran and Feld (2006) compare the adoption of a deterrent and a non-deterrent punishment institution, each relative to a standard public goods game without punishment. As before, punishment is automatically enforced if players contribute less than the full amount. With deterrent punishment, cooperation becomes the dominant strategy while with non-deterrent punishment free-riding remains the dominant strategy. They find that 75% of players vote for the game with punishment when punishment is deterrent and 50% when punishment is non-deterrent. Using the strategy method, they find that with deterrent punishment the average contribution rate is 96 vs 15% in the standard game. With non-deterrent punishment, players contribute on average 64% of their endowment compared to 22% in the standard game. Average payoffs are also higher in the game with deterrent or non-deterrent punishment compared to the standard game. Vollan et al. (2017) use a similar setup and let samples of students and workers in China choose between a public goods game with or without a formal non-deterrent punishment scheme and make contribution decisions for all possible outcomes (strategy method). They find that 42% of players vote for the game with punishment. With punishment, the average contribution rate is 59% compared to 38% in the standard game. Average payoffs, however, do not differ between the game with punishment and the standard game. For the same type of institution, but using German students and the direct-response method, Gallier (2017) finds that 73% of the players vote for the public goods game with non-deterrent punishment and 27% vote for the standard game. Average contributions are significantly higher in the groups with punishment than in the groups without punishment (80 vs 39%), and average payoffs are also significantly higher.
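The distinction between deterrent and non-deterrent punishment can be made concrete in a stylized linear public goods game with a fixed fine for contributing less than the full endowment. The following sketch checks whether full contribution strictly dominates every lower contribution. The endowment, marginal per capita return, and fine sizes are illustrative assumptions and are not claimed to be the parameters used in the cited studies.

```python
from fractions import Fraction


def payoff(own, others_total, endowment, mpcr, fine):
    """Linear public goods payoff with an automatic fine for contributing
    less than the full endowment."""
    sanction = fine if own < endowment else 0
    return endowment - own + mpcr * (own + others_total) - sanction


def is_deterrent(endowment, mpcr, fine):
    """True if contributing everything is strictly better than any lower
    contribution. Payoffs are additive in others' contributions, so the
    comparison does not depend on what the others do (set to 0 here)."""
    full = payoff(endowment, 0, endowment, mpcr, fine)
    return all(full > payoff(c, 0, endowment, mpcr, fine)
               for c in range(endowment))


endowment, mpcr = 20, Fraction(1, 2)  # hypothetical parameters

print(is_deterrent(endowment, mpcr, fine=4))   # False: free-riding still pays
print(is_deterrent(endowment, mpcr, fine=14))  # True: full contribution is dominant
```

Under these assumptions, the fine is deterrent exactly when it exceeds (1 − mpcr) × endowment, the largest private gain from withholding the whole endowment, which equals 10 in this example.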

Dannenberg et al. (2019) let groups repeatedly choose between a standard public goods game and a modified version in which players can vote to exclude another player. The treatments differ in whether the exclusion option involves a fixed cost or not. In both conditions, cooperation is higher with the exclusion institution than without. The differences in cooperation increase over time, except in the last round when exclusion is no longer possible and both games have a similarly low cooperation level. Payoffs are higher with the institution, too, but only when it does not involve a cost. In this case, the share of groups that implement the institution rises from 30 to 96%. With institutional cost, the share rises from 4 to 52%.

Sutter et al. (2010) let groups choose between three different games, namely (1) a standard public goods game, (2) a public goods game with an informal punishment option, and (3) a public goods game with an informal reward option. In one treatment, the leverage of the punishment and reward option is low, meaning that reducing or increasing another player’s payoff is relatively expensive, and in a second treatment, the leverage is high, meaning that it is relatively cheap to decrease or increase another player’s payoff. Groups vote until unanimity is reached and the selected game is played for ten rounds. With low leverage, 62% of groups choose the standard game, 25% choose the game with reward option, and 13% choose the game with punishment option. Average contribution rates are ordered the other way around: 27% of the endowment in the standard game, 43% in the reward game, and 81% in the punishment game. The same order holds for average payoffs. Thus, the game with the highest cooperation and efficiency is the least popular one. With high leverage, 85% of groups choose the game with reward option and 15% choose the standard game. Both average contributions and payoffs are higher in the reward game than in the standard game. Note that by design the maximally possible payoffs are higher in the reward game because players can increase the pie by rewarding others. No group chooses the punishment game, arguably because of the very attractive alternative.

In the experiment by Ostrom et al. (1992), groups choose once whether they want to add an informal punishment option to a common pool resource game. They find that 56% of groups decide to add the punishment option to the game. When groups adopt the punishment option, the average efficiency rate reaches 89%. This rate decreases only slightly to 84% when the costs of fees and fines are subtracted. In contrast, when groups do not adopt the punishment option, the average efficiency rate is 28% (see footnote 9).

Ertan et al. (2009) allow groups not only to choose between punishment option and no punishment option but also which contribution behavior, if any, can be punished (below average, average, above average). Groups use majority voting and have the opportunity to change the institution over the course of the game. The two treatments differ in the experience players have before they vote the first time and the number of voting rounds. At the beginning, many groups prohibit any punishment (50 and 65%) but over time they change the institutional setting to allow for punishment of low contributions. At the end, a large majority of 85 and 90% of groups implement a punishment scheme that allows for punishing below-average contributions. From the beginning, contributions and payoffs are higher with punishment option than without, and the difference becomes larger over time. Similar results for heterogeneous groups are provided by Noussair and Tan (2011).

The groups in the experiment by Markussen et al. (2014) choose repeatedly between a standard public goods game, a public goods game with formal punishment, and a public goods game with an informal punishment option. To avoid strategic voting, players only choose between two institutions at a time. When choosing between the standard game and an informal punishment option, only between 14 and 25% of groups choose the punishment option at the beginning but the proportion increases to 50 to 67% when groups choose a second time. Average contribution rates and payoffs are higher with punishment option than without and this difference becomes more pronounced over time. The popularity of the formal punishment scheme depends on the cost of its implementation and whether or not it is deterrent. Compared to the standard game, the cheap and deterrent punishment scheme is popular from the start: about 71% of groups choose it the first time and about 86% the second time, and it leads to higher contributions and payoffs than the standard game. When the punishment scheme is cheap and non-deterrent, about 58% of groups choose it the first time and about 43% the second time, and it also leads to higher contributions and payoffs than the standard game. The punishment scheme is less popular when it is expensive. When the punishment scheme is expensive and deterrent, 17% of groups choose it at the beginning and the proportion rises to 33% when groups choose a second time. Contributions are higher on average than in the standard game but payoffs are about the same on average because of the institutional cost. Very few groups choose the formal punishment scheme when it is expensive and non-deterrent. Similar results about the popularity of formal and informal punishment institutions are obtained in a follow-up study by Kamei et al. (2015) (see footnote 10).

Figure 4 provides an overview of these findings in the same form as before. The lower and upper parts of the bars show the groups inside and outside the institution and the colors indicate the cooperation rates. Only about half the studies include repeated choice and thus allow for a comparison of behavior at the beginning and at the end of the game. The comparison shows that, similar to the results shown in Figs. 1 and 2, average cooperation rates are significantly higher inside the institution than outside the institution both at the beginning (70 vs 41%, P < 0.0001, two-sided Wilcoxon signed-rank test) and at the end (64 vs 16%, P = 0.0004) of the game. Average cooperation inside the institution decreases slightly from the first to the last round (70 vs 64%, P = 0.4076), but cooperation outside the institution decreases significantly (41 vs 16%, P = 0.0010). Participation in the institution increases significantly over time (48 vs 66%, P = 0.0005). The increase in participation over time is particularly large for the institutions that do not involve a fixed cost and eliminate free-riding incentives (the first eight bars on the left in the lower panel). Participation in the study by Markussen et al. (2014) is still only about 50%, but players in this experiment choose between the same two institutions only twice. We could speculate that participation would have increased by more if players had chosen more often. All other institutions in the lower panel (the nine bars to the right) involve a fixed cost or do not eliminate free-riding incentives. In these cases, increases in participation and cooperation in the institution over time tend to be small.

3.5 Summary

The studies provide strong and robust evidence that individuals and groups who adopt the institution behave more cooperatively than those who do not adopt the institution. The difference exists in the first round already and often increases over time because cooperation outside the institution typically decreases over time while cooperation inside the institution is stable or even increases.

On institutional choice, a natural presumption is that subjects are more likely to adopt the institution when theory predicts that it is profitable to do so. However, this presumption is not confirmed by the experimental data. Even when the institution makes cooperation the dominant strategy, the number of subjects who vote in favor of the institution is surprisingly low when they vote for the first time. Across all studies and treatments, the proportion of individuals or groups that adopt the institution often lies in the range between one-third and two-thirds when they vote for the first time. In many studies, the proportion is close to 50% so that one could think that subjects ‘flip a coin’ at the beginning. The proportion, however, increases considerably when subjects are allowed to vote repeatedly; often to more than 80%. A higher number of voting rounds, experience with both games before the voting, and more detailed information about other groups tend to speed up the learning process. Learning is thus an important factor that can explain a big part of the different experimental results.

We can also identify two factors that hamper the choice of the institution, even with repeated voting. The first important factor is inequality. When the cooperation problem is global but the institution covers only a subset of players, subjects appear to be reluctant to join the institution even if it is profitable compared to the situation without the institution. Institutions also tend to be less effective and less popular if they do not completely eliminate free-riding incentives, i.e. if they allow players to benefit from others’ cooperation efforts without facing a risk of being punished. An example of this is a formal non-deterrent punishment scheme. Free-riders earn less under this institution than without the institution but, when they play with a cooperator, they still earn more than the cooperator. The second important factor is the cost of the institution. A significant share of subjects opposes the institution when it involves a fixed cost, even if this cost is often offset by a higher cooperation level in those groups that choose the institution.

4 Differences between supporters and opponents of the institution

In this section we examine whether certain individual characteristics and attitudes influence participants’ decision to vote for or against enacting the institution. A common finding is that subjects with a strong cooperative inclination vote in favor of the institution (Dal Bó et al. 2010; Ertan et al. 2009; Grimm and Mengel 2009, 2011; Sutter and Weck-Hannemann 2003, 2004; Kocher et al. 2016; Vollan et al. 2017; Gallier 2017; Fehr and Williams 2017). Furthermore, subjects who have experienced very low levels of cooperation in the past are more likely to vote in favor of the institution (Bohnet and Kübler 2005; Barrett and Dannenberg 2017; Dal Bó et al. 2010; Dannenberg et al. 2019). Subjects who have had bad experiences with the institution, for instance, by receiving punishment, sometimes oppose the institution later (Ostrom et al. 1992). In other cases, most notably when information about the performance in the different games is provided, formerly punished players change their behavior and come to support the institution later (Gürerk et al. 2006).

Several studies show that subjects’ beliefs about the behavior in the available games are very important for their institutional choice. Subjects are more likely to vote in favor of the institution if they have optimistic beliefs about the behavior under the institution and/or if they have pessimistic beliefs about behavior in the original cooperation game (Barrett and Dannenberg 2017; Dal Bó et al. 2018; Grimm and Mengel 2009; Kosfeld et al. 2009). Martinsson and Persson (2019) find that subjects who contribute more than what they expect from others are more likely to vote in favor of the institution.

Some studies test if measures of people’s cognitive and strategic abilities such as SAT scores, IQ tests, final high school grades, or decisions in a beauty contest game are correlated with the institutional choice. Dal Bó et al. (2010), for example, find that subjects with better SAT scores and better performance in a beauty contest game are more likely to vote for a coordination game rather than a prisoners’ dilemma. Kamei et al. (2015) find that subjects with a high IQ test score are more likely to vote for the institution that gives higher payoffs. However, Dal Bó et al. (2018) and Barrett and Dannenberg (2017) find no effects of similar measures on the voting decisions. Sutter et al. (2010) test if subjects’ social orientation can predict their voting decisions but they find no significant effect. Fehr and Williams (2017), on the other hand, find that prosocial subjects are more likely to join punishment institutions, especially at the beginning of the game when it is not yet common knowledge that these institutions lead to higher payoffs.

Gallier (2017) measures the extent to which participants believe that they have control over events that affect their personal lives and relates this internal locus of control to their institutional choice. He finds that participants who believe that they have control over events are more likely to vote for a formal non-deterrent punishment scheme. A plausible explanation is that these people are more likely to believe that they can change the group outcome by changing the institution. Martinsson and Persson (2019) find that women are more likely to vote in favor of a group-wide minimum contribution level than men. We are not sure what to make of these latter results, but they show that the influence of gender and other personal characteristics on institutional choice surely is an important area for future research. For example, reviewing gender differences in experimental games, Croson and Gneezy (2009) find that women are more risk averse than men and their social preferences are more sensitive to the context; a finding that may also be relevant for institutional choice.

5 Effects of endogenous and exogenous institutional choice

5.1 Difference between endogenous and exogenous institutions

A number of studies provide a comparison between a treatment in which players can implement the institution or not and treatments in which the institution is exogenously imposed or not. Based on these studies, three different comparisons can be made: (1) a comparison between players who have endogenously implemented the institution and players who act under the same but exogenously imposed institution, (2) a comparison between players who have decided against the institution and players who were exogenously put into the game without the institution, and (3) an aggregated comparison between players who decide endogenously in favor or against the institution and players who are forced to play under the institution. The latter comparison shows whether letting players choose has an overall positive effect on cooperation, given that some players implement the available institution and other players do not. In this section, we will look at all three comparisons in the above order. Afterwards we will describe the different effects that contribute to different behavior under endogenously chosen institutions and exogenously imposed institutions.

On the first comparison, a relatively robust finding is that an institution that is endogenously chosen by the players leads to a higher level of cooperation than an institution that is exogenously imposed upon players (see footnote 11). The difference tends to be small for institutions that make full cooperation the unique equilibrium of the game, simply because cooperation in these cases is high irrespective of how the institution is chosen. For example, in the two-player game by Dal Bó et al. (2018) where mutual cooperation is the unique equilibrium, 94 to 98% of players cooperate when the game is chosen by the players and 92 to 93% cooperate when the game is exogenously imposed. Similarly, Tyran and Feld (2006) find a contribution rate of 96% when a formal deterrent punishment institution is chosen by the players and 93% when it is exogenously imposed. Andreoni and Gee (2012) report an average contribution of 95% when a formal deterrent punishment institution is chosen by the players and 91% when the same institution is exogenously imposed. The minimum contribution levels studied by Kocher et al. (2016) and Martinsson and Persson (2019) are also relatively strong institutions that change the equilibrium from zero contributions to the minimum level. Similar to the studies above, they find only very small differences between the endogenous case and the exogenous case. Kocher et al. (2016) actually find a lower cooperation rate in the endogenous case which may also be explained by their subject pool (Chinese students) and the use of the strategy method.

Larger differences arise when the institution makes cooperation one equilibrium among others or when the institution does not change the equilibrium at all. These weaker institutions are not always effective and they tend to be more effective for players and groups that have decided to implement them. Dal Bó et al. (2010) let subjects choose between a prisoners’ dilemma and a coordination game with multiple equilibria. They find a cooperation rate of 72% when the coordination game is chosen by the players and 50% when the coordination game is exogenously imposed. In the public goods game with an informal punishment option by Gürerk et al. (2014), the contribution rate is 91% when the players have joined the game voluntarily and it is only 54% when the game has been imposed upon the players. In Feld and Tyran (2002), the contribution rate is 71% when a formal non-deterrent punishment institution is chosen by the players and 38% when it is exogenously imposed. Similar results are provided by Bohnet and Kübler (2005), Tyran and Feld (2006), Grimm and Mengel (2009), Sutter et al. (2010), Kamei et al. (2015), Markussen et al. (2014), and Fehr and Williams (2017) (see footnote 12).

However, there are also some studies that consider relatively weak institutions and still do not find a clear difference between the endogenous and the exogenous case. Vollan et al. (2017) let samples of students and workers in China play a public goods game with or without a formal non-deterrent punishment scheme. They find no difference between an endogenously implemented punishment scheme (59% contribution rate) and an exogenously imposed punishment scheme (60%). The authors explain this finding with the importance and long history of authoritarian norms in China. Gallier (2017) and Dannenberg et al. (2019) also report only small and insignificant differences in cooperation between endogenously and exogenously implemented institutions, even though their experiments were conducted with German students. A possible explanation for this result in Gallier (2017) may be that the subjects play ten rounds of a standard public goods game before they vote on the implementation of a formal non-deterrent punishment institution. During these rounds, players already accumulate information about the cooperativeness of the group which perhaps limits the value of the signal that is associated with voting in favor of the institution. Dannenberg et al. (2019) study the effect of an exclusion institution which can be used to exclude players from the group. While this institution does not change the zero contributions equilibrium, one could speculate that the institution is psychologically perceived as a strong institution that is similarly effective on groups that have chosen it and groups that are assigned to it.

In Sutter and Weck-Hannemann (2003) players can choose whether or not to implement asymmetric minimum contribution levels in a non-linear public goods game. The pre-specified minimum contribution levels are below the Nash equilibrium and do not affect the free-riding incentives in the game. There is only a small difference in average cooperation rates between endogenously (8%) and exogenously (12%) implemented minimum contribution levels. One special feature of this experiment is that the institution assigns asymmetric minimum contribution levels to the players. The players with relatively high minimum levels make lower contributions when the obligations were determined endogenously by the group (0%) than when they were determined exogenously (24%). For the players with relatively low minimum levels, cooperation is slightly higher in the endogenous case than the exogenous case (10 vs 5%). In a follow-up study, Sutter and Weck-Hannemann (2004) investigate symmetric minimum contribution levels. They find only a small difference in cooperation rates between endogenously (8%) and exogenously (6%) implemented minimum contribution levels that are below the Nash equilibrium.

The second comparison concerns the players who decide against the institution and the players who are exogenously assigned into the same situation. In these cases, cooperation often is lower when the decision is made endogenously by the players than when it is made exogenously, though the differences are not large. Feld and Tyran (2002) find an average contribution rate of 24% when players reject a formal non-deterrent punishment scheme and a contribution rate of 30% when players do not have a choice to implement the punishment institution. In the follow-up study by Tyran and Feld (2006), the average contribution rate is 15% when players reject a formal deterrent punishment scheme (22% in case of a non-deterrent punishment scheme) and the contribution rate is 30% when the players have no choice. Vollan et al. (2017) who consider a formal non-deterrent punishment scheme report an average contribution rate of 38% when groups fail to implement the institution and a contribution rate of 47% when the game is exogenously assigned. Similar results are reported by Sutter and Weck-Hannemann (2003, 2004), Kosfeld et al. (2009), Sutter et al. (2010), Kocher et al. (2016), and Gallier (2017). Other studies provide comparisons where the differences between the endogenous case and the exogenous case are positive but small (Grimm and Mengel 2009; Dal Bó et al. 2010; Sutter et al. 2010; Dal Bó et al. 2018; Dannenberg et al. 2019).

The third comparison concerns the question whether it is beneficial in the aggregate to let players choose the institution endogenously compared to an exogenously imposed institution. This question is highly relevant for policy whenever a regulator has the power to enforce regulations but may want to leave the decision to the constituency, to be taken for example in a referendum, if this promises a better outcome in the end. If all players chose the institution, then letting them choose would be better than forcing the institution upon them. However, given that some players decide against enacting the institution and then often fare poorly, the answer is a priori not obvious. Tyran and Feld (2006) find that cooperation rates are higher when players can decide whether or not to impose a formal non-deterrent punishment scheme (47%) than when the institution is exogenously imposed (33%). Similar results for a non-deterrent contributions rule are provided by Feld and Tyran (2002). All other studies find either a small positive effect or a negative effect. The experiments by Sutter et al. (2010), Kocher et al. (2016), Vollan et al. (2017), and Martinsson and Persson (2019) find a small positive effect (see footnote 13). The studies by Sutter and Weck-Hannemann (2003, 2004), Tyran and Feld (2006), Grimm and Mengel (2009), Sutter et al. (2010), Andreoni and Gee (2012), Gallier (2017), and Dannenberg et al. (2019) find a negative effect.
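Why the aggregate comparison is not obvious a priori can be illustrated with a simple weighted average. The sketch below assumes that aggregate cooperation under free choice is the adoption share times the cooperation rate of adopters plus the remaining share times the cooperation rate of rejecters. This weighting is a simplifying assumption made for illustration and is not necessarily how the surveyed studies compute their aggregate figures. The inputs are the Feld and Tyran (2002) contribution rates quoted in this section (71% when the institution is chosen, 24% when it is rejected, 38% when it is imposed); the function names are hypothetical.

```python
def aggregate_cooperation(share_adopting, coop_if_adopted, coop_if_rejected):
    """Average cooperation when a given share of groups adopts the institution."""
    return share_adopting * coop_if_adopted + (1 - share_adopting) * coop_if_rejected


def break_even_share(coop_if_adopted, coop_if_rejected, coop_if_imposed):
    """Adoption share at which free choice matches the exogenously imposed institution."""
    if coop_if_adopted == coop_if_rejected:
        raise ValueError("adopters and rejecters cooperate equally; no break-even share")
    return (coop_if_imposed - coop_if_rejected) / (coop_if_adopted - coop_if_rejected)


# Contribution rates (in %) quoted in the text for Feld and Tyran (2002).
ADOPTED, REJECTED, IMPOSED = 71, 24, 38

print(round(break_even_share(ADOPTED, REJECTED, IMPOSED), 3))  # 0.298
print(aggregate_cooperation(0.5, ADOPTED, REJECTED))           # 47.5
```

Under this assumption, letting players choose would outperform imposing the institution whenever more than roughly 30% of groups adopt it. With a weaker endogeneity premium or a larger penalty for rejecting groups, the required adoption share rises accordingly.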

Table 2 provides an overview of the three comparisons. Note that the table distinguishes not only between studies but also experimental treatments when there are several that are relevant for the comparisons. Note also that we refer to the size of the effects. The effect size often corresponds with statistical significance but not always. In some cases, we do not know if the difference is significant or not. In summary, the comparisons show that cooperation tends to be higher when the institution is endogenously chosen by the players than when it is exogenously imposed, cooperation tends to be lower when the institution is rejected by the players than when it is exogenously left out, and letting players choose is not necessarily better than forcing the institution upon them.

Table 2 Differences between endogenous and exogenous institutions

5.2 Separating between different effects of institutional choice

The results so far imply that the difference between players who play under the institution and the players who do not play under the institution is larger when players have made the institutional decision themselves than when the decision was made exogenously. The reason for this is that, when players are assigned exogenously, the difference in behavior is driven by only one effect, namely the effect of the institution. When players are allowed to choose, three additional effects are at play: a selection effect, an information effect, and a democracy effect. In principle, we would expect that all three of the additional effects reinforce the institution effect on cooperation, i.e. they affect cooperation positively for the players who decide in favor of the institution and they affect cooperation negatively for the players who decide against the institution. We would also expect that the relative magnitude of these effects depends on the theoretical strength of the institution. The institution effect should be relatively large for strong institutions that make cooperation the unique equilibrium of the game, while the effects of selection, information, and democracy should become relatively more important for weaker institutions.

In the literature, different approaches have been proposed to isolate and quantify the different effects. Dal Bó et al. (2010) quantify the different effects of choosing an institution endogenously with a randomization technique which implements the group’s voting outcome only with some probability. Using a formal institution that makes mutual cooperation another equilibrium of the game, they report that the institution effect accounts for 66% of the difference in cooperation between players who decide to implement the institution and those who decide against the institution. The selection effect explains 8% of the difference while the information effect is negligible. Consequently, 26% of the difference is explained by the democracy effect. Using the same randomization technique for a non-deterrent formal punishment scheme, Gallier (2017) finds that more than 80% of the difference in contribution rates between an endogenously chosen and rejected institution can be attributed to the institution effect. The information effect as well as the effect of self-selection are negligible and only 13% of the difference can be attributed to the democracy effect. The experiment by Vollan et al. (2017) relies on the strategy method, in which players make contingent cooperation decisions for all possible distributions of votes, in order to avoid the selection effect and control for the information effect. They find that the information effect is negligible in magnitude. The results suggest that the difference in cooperation between the groups that implement the non-deterrent formal punishment institution and the groups that decide against it is mainly explained by the institution effect (about 60%) and the democracy effect (about 40%). Notably, the democracy effect is not driven by high contributions of the groups that implement the punishment institution but rather by low contributions of the groups that fail to implement the punishment institution. Kocher et al. (2016) also use the strategy method to avoid the selection effect. Since players vote on whether or not to implement the binding minimum contribution level and then the decision of one randomly selected player is implemented without revealing the others’ voting decisions, the information effect also is virtually removed. Consequently, the difference in contributions when the minimum contribution level is endogenously chosen or rejected (51 vs 31% in the high level treatment) can be explained by the institution effect (65%) and the democracy effect (35%).
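To see what such shares mean in terms of contribution rates, the percentages can be converted into percentage points of the total gap. The sketch below does this for the Kocher et al. (2016) figures quoted above (51 vs 31% and a 65/35 split). It is plain arithmetic on the reported shares, not a reproduction of the authors' estimation procedure, and the function name is hypothetical.

```python
def split_gap(gap_pp, shares_pct):
    """Convert effect shares (in %) into percentage points of the total gap."""
    total = sum(shares_pct.values())
    if total != 100:
        raise ValueError(f"shares must sum to 100, got {total}")
    return {name: gap_pp * pct / 100 for name, pct in shares_pct.items()}


# Gap between endogenously chosen and rejected minimum contribution level,
# high level treatment, as quoted in the text: 51% vs 31%.
gap = 51 - 31

print(split_gap(gap, {"institution": 65, "democracy": 35}))
# {'institution': 13.0, 'democracy': 7.0}
```

Read this way, about 13 of the 20 percentage points would be attributed to the institution itself and about 7 to the fact that it was chosen democratically. The same function accepts the four-way split reported by Dal Bó et al. (2010), with shares of 66, 8, 0, and 26, once a gap in cooperation rates is supplied.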

Taken together, the findings suggest that the institution effect is the most important factor while the other three effects tend to be less important. We cannot confirm that the institution effect is stronger for theoretically strong institutions. However, the number of studies is still too small and the employed methods are too diverse to allow for a conclusive assessment. Clearly, more research is needed in this area for a better understanding of the effects in different contexts and cultures.

6 Discussion and conclusions

Do people make wise decisions when it comes to the choice of institutions to solve a cooperation problem? This question is difficult to answer because we only know how subjects behave after having chosen a certain institution and we do not know how they would have behaved if they had chosen differently. After all, if voters are pessimistic about the institution succeeding, failure of the institution will be a self-fulfilling prophecy. Nevertheless, we can compare the performance of individuals and groups that make different institutional choices and investigate if the institution that pays off more handsomely is spreading over time. An important and robust finding is that the individuals and groups who adopt the institution behave significantly more cooperatively than those who do not adopt the institution. This is true irrespective of whether the institution changes the theoretical properties of the game or not. However, the institutions are not always implemented. When subjects choose between the games for the first time, their voting decisions appear naïve and almost look like random decisions. When subjects are allowed to vote repeatedly, they learn and the proportion of individuals or groups that adopt the institution increases considerably. Learning is easier, of course, the more information is available. For instance, providing information about the performance of groups playing in different games accelerates learning. Similarly, forcing groups to play all available games before they choose also helps to accelerate learning and improve choices. However, there are two factors which hamper the institutional choice, even when repeated voting is possible. First, a significant share of subjects is reluctant to support the institution when it involves a fixed cost so that the overall first best outcome is no longer feasible. Second, subjects are reluctant to adopt an institution that does not eliminate free-riding incentives and allows for inequality among players.

Our review shows, furthermore, that there are systematic differences between supporters and opponents of the institution. Subjects with a strong cooperative inclination often support the implementation of the institution more than less cooperative subjects. Optimistic beliefs about cooperation under the institution and pessimistic beliefs about cooperation in the original cooperation game also make voting for the institution more likely. There is some evidence that subjects’ cognitive and strategic abilities as well as their internal locus of control increase the likelihood of supporting the institution. Gender may play a role, too. But the number of studies is still too low to make conclusive assessments and more research is needed to confirm these relationships.

Because of the importance of, and widespread interest in, democratically chosen institutions, we have devoted one section to the question whether and why individuals behave differently when they choose the institution themselves than when the institution is assigned exogenously. There is evidence that cooperation is higher when the institution is endogenously chosen than when it is exogenously imposed. On the other hand, cooperation often is lower when the institution is endogenously rejected than when it is exogenously left out. For this reason, letting people choose is not necessarily better than enforcing the institution from outside. Letting people choose therefore is only recommended when there is a high chance that most of them will actually choose to implement the institution, which may be difficult to know in advance. Of course, for certain problems, like global security or climate change, enforcement of the institution from outside is not possible and players will have to choose the institution endogenously.

Much of the experimental literature on the endogenous choice of institutions has developed only recently. These experiments improve our understanding of the effect of self-selected institutions on behavior and people’s ability to choose institutions. The standard economic model based on perfectly rational, knowledgeable, and selfish actors often is silent, ambiguous, or wrong about how people choose. Models that include learning (e.g. Andreoni 1988; Burton-Chellew et al. 2015; Camerer and Ho 1999) or social preferences (e.g. Rabin 1993; Fehr and Schmidt 1999) seem to be more suitable to predict or explain behavior. While some of the studies use social preference models to explain the experimental results (Kosfeld et al. 2009; Sutter et al. 2010; Markussen et al. 2014; Cobo-Reyes et al. 2019; Dannenberg et al. 2019) we have not seen the utilization of learning models yet.

Can we say what works and what does not work? We can be confident that, in a setting where individuals choose repeatedly and where the institution eliminates the free-riding incentives for all players and is not too expensive, the cooperation problem will be solved. If one of these three conditions is not met, we can expect difficulties. The most difficult problem arises when the institutional choice is hard to reverse and institutions are costly and unable to eliminate all free-riding incentives. In this case, a regulator with enforcement power may be in a better position to decide about and enforce the institution. Unfortunately, some of our most pressing problems, like global climate change, are of this kind and a regulator with enforcement power does not exist.

What is missing in the literature on endogenous institutions? Only one of the studies presented here allows players to communicate before they choose the institution (Ostrom et al. 1992). There are only very few studies using asymmetric players or asymmetric institutions. Many dimensions of heterogeneity are conceivable and relevant, such as unequal endowments, benefits or costs of cooperation, and benefits or costs of the institution. More research is needed to better understand the effects of asymmetric information and incomplete monitoring or enforcement on the choice and effectiveness of institutions. Elinor Ostrom (1990) has argued that letting people choose their own institutions is better for cooperation than enforcing institutions from outside because the outside regulator may not have the incentive or ability to establish and enforce effective institutions. This possibility is absent in most of the surveyed studies (exceptions are Nicklisch et al. 2016 and Fehr and Williams 2017) and deserves more attention. Many pressing cooperation problems involve trade-offs, so comparing second-best institutions with one another may offer valuable insights. For example, is it better to implement an institution that comes at a high fixed cost or an institution that is less costly but governs only a subset of players? In the surveyed experiments, players are allowed to choose between different rules before they play the game whereby the available rules and the voting mechanism are given. In the field, these things are also often endogenous which may be the next step to study (Rockenbach and Wolff 2016). Finally, to test the robustness of the reported results, it may be useful to conduct more experiments with non-student samples with diverse cultural backgrounds (Gürdal et al. 2019), with larger groups, or with teams rather than individual decision makers (Charness and Sutter 2012).