1 Introduction

Intergroup cooperation is a key factor for the functioning of both organizations and societies (Dovidio and Banfield 2015). Pooling resources and knowledge allows for risk sharing and for attaining synergy effects no group can obtain on its own; most importantly, increasing the group size may lead to greater contributions from collective action (Oliver and Marwell 1988; Rodrigues et al. 2023). However, establishing and maintaining intergroup cooperation is challenging. As Masuda (2012) demonstrates, individuals often exhibit ingroup favoritism (i.e., cooperating only or preferably within their own group) rather than full cooperation (i.e., cooperating within their own and with other groups), even when the latter is Pareto efficient and the former is not. Ingroup favoritism results from the difficulty of building trust between groups (as opposed to building trust between individuals), a difficulty that often leads to intergroup competition or even conflict rather than cooperation (Insko et al. 2001). As a case in point, experiments show that individuals tend to be more generous in allocating resources to ingroup members than to outgroup members (Chen and Li 2009; Hewstone 1990). Furthermore, individuals assume that ingroup members share similar attitudes while expecting outgroup members to hold different views (Robbins and Krueger 2005; Mullen et al. 1992).

In this paper, we discuss the allocation of resources to jointly created goods as well as the distribution of these goods’ proceeds. We frame our experiment as modelling the interplay between economic migrants on the one hand (i.e., people leaving their respective homes to improve their economic outlook, IOM 2011), and citizens of the country the migrants reside in on the other hand.

The situation we study is characterized by interdependent payoffs and heterogeneous levels of power to influence the distribution rules, establishing a hierarchy. Hierarchy is a basic characteristic of social life (Van Kleef and Lange 2020; Hilbe et al. 2016). Experiments show, for example, that cooperation is fragile in situations with asymmetric power relations (Cox et al. 2011; Ostrom 2009). Most recently, Cox et al. (2023) demonstrated the detrimental effects of asymmetric power on cooperation in a laboratory experiment. Varying the power type (asymmetric versus symmetric) of decision-makers in a repeated linear public goods game, they find that the asymmetric setting produces more free riding and lower contributions to the public good. The literature proposes several reasons for such cooperation fragility: the displacement of trust by power asymmetry (Farrell 2009; Johnston 2005), the temptation for dominant individuals to cheat subordinates (Phillips 2018), or the perception by individuals that equality of outcomes is unfair when inputs are unequal (Eek et al. 2001; Joireman et al. 1994). In contrast to these examples, hierarchy may also facilitate resource allocation and conflict resolution (Keltner et al. 2008; Magee and Galinsky 2008), thus promoting cooperation (Majeski 2004; Van Kleef et al. 2020), particularly in situations where punishment can be imposed (Bone et al. 2016). In our experiment, we introduce asymmetric power relations by giving one group (precisely, the citizens of the country that is the destination of migration) the power to implement redistributive policies which award this group a greater share of the jointly created good’s proceeds.

Redistributive policies allocate resources and opportunities across individuals or groups according to defined distribution rules (Kuhlmann and Blum 2021). Common instruments of redistribution are taxation (e.g., personal income taxes), social expenditures (e.g., unemployment or child benefits), and social services (e.g., access to health and education). While redistribution through taxes and social transfers is immediate, social services work in the longer term (Prasad 2008). Our research uses the framing of social expenditures and of social and welfare policies, which are, as Hills (2004) argues, primarily used for redistribution. As an example of such a use of welfare benefits, Austria, Germany, the Netherlands, and the UK petitioned the EU Commission in 2013 for the right to restrict welfare access for migrants, and now possess legal tools which they argue are intended to prevent welfare fraud (European Commission 2015).

In the paper at hand, we report on an experiment studying the dynamics of intergroup cooperation in the production and distribution of a jointly created good. We do so using a setting of repeated interactions with participants who choose from a variable set of redistribution rules in a situation of power asymmetry.Footnote 1 In particular, we employ students of a Western European University to model a society with two groups, framed as economic migrants and destination country citizens. Students in the role of economic migrants face the choice of contributing to a jointly created good, while students in the role of citizens have the option to implement discriminatory redistributive policies (i.e., citizens can change the degree to which migrants are entitled to benefit from payments from the jointly created good).Footnote 2,Footnote 3

Our research question is:

Research question: How do groups with asymmetric power interact under redistributive policies?

We find a central role for reciprocity in situations characterized by power asymmetry: The members of the rule-setting group (i.e., citizens) condition their voting behavior on the contribution behavior of the members of the disadvantaged group (i.e., migrants), rewarding cooperation with an egalitarian redistribution policy. However, the rule-setting group tends to ‘punish’ non-contributions by the disadvantaged group by implementing a discriminatory redistribution policy. In the same vein, the disadvantaged group rewards an egalitarian redistribution policy (set by the advantaged group) by contributing to the jointly created good, which results in a high payoff for members of both groups. However, the disadvantaged group tends not to contribute when the advantaged group chooses a disadvantageous (for the disadvantaged group) redistribution policy. Thus, unequal distribution policies crowd out cooperation—a poor basis for mutually beneficial cooperation in a society.

Our research contributes to the literature on intergroup cooperation in the provision of jointly created goods under power asymmetry, i.e., in asymmetric social dilemmata.Footnote 4 Most research in this field focuses on intergroup conflict (McDonald et al. 2012; Rusch 2014), leaving the question of intergroup cooperation in social dilemmata largely unexplored (Dovidio and Banfield 2015; Robinson and Barker 2017). Additionally, little research has examined factors that affect cooperation in the presence of asymmetry (Van Lange et al. 1992; Eek et al. 2001), even though many social dilemmata in real life are asymmetric (Kugler et al. 2010; Nikiforakis et al. 2010; Cox et al. 2023), such as indexed kindergarten fees, unequal healthcare benefits financed by tax contributions, or imbalanced participation in team projects. Yet ingroup favoritism is typically studied in symmetric settings, whereas we innovate by studying asymmetric decision spaces. We give one group the power to restrict the other group’s payoff from the jointly created good by implementing a sharing rule. We thus offer insights into the dynamics of asymmetric intergroup cooperation and draw conclusions about ways of fostering greater cooperation.

In the remainder of this paper, Sect. 2 presents the related literature, Sect. 3 lays out our method, Sect. 4 contains our hypotheses, Sect. 5 presents the results, and Sect. 6 concludes.

2 Related literature

Our study adds to the literature on cooperation in an asymmetric, in- and outgroup, social dilemma setting (see, e.g., Akerlof and Kranton 2000). Specifically, we study a setting where members of one group can vote unilaterally to determine the distribution of benefits from cooperation under different regimes of potential inequality between their own ingroup and an outgroup. This unilateral power to choose for both groups constitutes a potential source of conflict (Kuhlmann and Blum 2021).Footnote 5

With a view to our experimental design (outlined in Sect. 3), we consider the following findings from the literature to be the most relevant: First, the value assigned to payoffs for members of one’s own group is greater than the value assigned to payoffs for members of other groups (see, e.g., Mullen et al. 1992; Hertel and Kerr 2001; Chen and Li 2009). This phenomenon of ingroup favoritism has been identified as a major driver of redistribution, particularly for reducing or increasing inequality. Accordingly, Fischbacher et al. (2023) show that 85% of participants’ decisions are made in favor of the ingroup. Second, unequal payoffs from jointly created (public) goods reduce the willingness to cooperate (see, e.g., Marwell and Ames 1979; Brookshire et al. 1989; Fisher et al. 1995; Palfrey and Rosenthal 1991; Rapoport and Suleiman 1993), and cooperation in public good games generally decreases over time (Ledyard 1995, 2020). Third, punishment can promote cooperation (Ostrom et al. 1994; Fehr and Gächter 2000), with humans exhibiting a preference for centralized over peer punishment (Traulsen et al. 2012). Considering cooperation through the lens of labor supply, and punishment through the lens of redistribution rules and motives, Sausgruber et al. (2021) argue that externally imposed redistribution is more harmful to labor supply than democratically chosen redistribution. Fourth, variations in allocation rules for public goods provision across members (such as our redistribution policy) reveal members’ preferences for redistributing the benefits of the public good (Grimalda et al. 2018). Traditional explanations for individuals’ redistribution preferences involve selfish incentives and fairness views based on merit and equality (Tyran and Sausgruber 2006; Esarey et al. 2012; Durante et al. 2014). Recently, a new dimension, namely ingroup preferences, has been recognized (Fischbacher et al. 2023). Klor and Shayo (2010), for example, show that payoffs of other ingroup members affect redistribution preferences, in that group members vote for policies that benefit the average member of their group.

More technically, public goods provision (i.e., cooperation) is particularly difficult in asymmetric dilemma situations (Koessler et al. 2023). Incentive-compatible mechanisms, however, can foster efficient public goods provision (Healy 2006; Kozlovskaya and Nicolò 2019). Falkinger (1996), for example, proposes a tax mechanism to overcome the free-rider problem in the provision of public goods. Specifically, he suggests taxing the deviation of the average contribution by an appropriate factor. A more complex provision scheme is the pivot mechanism, which consists of two rules: a decision rule that produces the public good only if the sum of the agents’ valuations exceeds its cost, and a transfer rule that adjusts the agents’ incentives to the social welfare (Attiyeh et al. 2000; Kozlovskaya and Nicolò 2019). Lastly, addressing the topic of providing jointly created goods efficiently in (asymmetric) social dilemma situations requires understanding reciprocity. Reciprocity is a form of conditional behavior, i.e., of individuals monitoring past interactions and conditioning their behavior on their partners’ earlier behavior (see Bruni et al. 2008, for a special issue on reciprocity). In intergroup cooperation, individuals expect others, regardless of these others’ group affiliations, to directly reciprocate their own initial cooperative efforts within the same interaction (Rabbie et al. 1989; Balliet et al. 2014; Romano et al. 2017). Thus, when a person has the opportunity to cooperate with another, they may choose to cooperate in the hope that the other person will respond in kind. Such expectations of the direct benefits of reciprocity may encourage people to enter into a cooperative relationship with outgroup members, but only if this cooperation is reciprocated directly and by the same individual (Balliet et al. 2014).

Two studies that are relatively closely related to ours are Huber et al. (2022) and Böhm et al. (2018). Using university students in all roles, Huber et al. (2022) perform an experiment which models two societies that differ in their levels of welfare, and allow migration from the low to the high-welfare society if members of the richer society cast a vote in favor of allowing migration. The result relevant for our study is that citizens vote in favor of migration because they expect it to increase their individual economic outcomes. While Huber et al. (2022) studied allowing or restricting migration, we investigate a different mechanism related to migration: redistribution rules. Böhm et al. (2018), by contrast, focus on the possible exclusion of (students in the role of) refugees from the benefits of goods jointly created by (students in the role of) citizens. Most importantly for our own research question, and in agreement with survey results, they find that the willingness to allow refugees to profit from the citizens’ jointly created public good depends on the refugees’ level of participation in the creation of this good.Footnote 6 We extend these experimental studies by investigating the impact of redistributive policies framed as decisions about migrant labor supply and vice versa.

3 Method

Our experiment consists of 22 sessions. Each session has twelve students participating simultaneously, but separated into two matching groups of six participants each.Footnote 7 At the beginning of the experimental sessions, participants arrive and wait outside the laboratory. At the designated starting time, participants are welcomed by the experimenter, draw cards bearing their computer numbers, are led into the lab and sit down at the workstations corresponding to the numbers on their cards. There they find a copy of the laboratory rules, which the experimenter reads aloud, asking the participants to read along. The remainder of each session is then structured into two parts, as presented in Fig. 1.

Fig. 1 Experimental procedures in the study. Note: Ovals indicate the start and the end point of the procedure, diamonds indicate participant choices, and rectangles indicate progression without participant choice. Note that citizens and migrants make their decisions simultaneously. Each session comprises 20 periods.

Part A is the group formation task and manipulation check, designed to assign participants to the roles of migrants and citizens, which they subsequently assume in Part B. Part B is the main experimental task, in which the groups play multiple rounds in a stage game reminiscent of a linear public goods game: Migrants choose whether to contribute their endowment of points to a jointly created good; all citizens contribute their endowments automatically. While the migrants make their contribution decisions, the citizens decide by majority vote whether the proceeds of the jointly created good will be distributed equally or unequally (favoring citizens) over all citizens and migrants. The payoff to citizens and to contributing migrants consists solely of the points from the jointly created good, while non-contributing migrants also retain their respective endowments. After the decisions have been made, citizens and migrants observe each other’s contribution and voting behavior. (See online appendix Section E.A for the instructions for Parts A and B.)

3.1 Part A: Group formation

The experimenter begins Part A after having privately answered any remaining questions. The instructions for this part are shown on the participants’ screens.

We match participants randomly into groups of six and inform them that they interact only with members of their own matching group throughout the entire experiment (partner matching). Each matching group consists of two fixed subgroups of three participants each, formed using the over- and underestimator minimal group task of Tajfel (1970), as modified by Böhm et al. (2013): Participants see, for a duration of 0.5 s, a random number of between 5 and 30 ‘X’ symbols on their respective screens. They then estimate the number of ‘X’ symbols they just saw. After participants have repeated this task five times, we sum up the individual deviations from the true numbers of symbols per participant and split the cohort into two equally sized groups at the median score (breaking ties randomly). Böhm et al. (2013) show that this procedure yields groups that behave similarly to each other in experiments, yet in which members of a group feel ‘closer’ to other members of their own group than to members of the respective other group. Balliet et al. (2014) further show that studies using groups generated with the minimal group task yield results that are comparable to those of studies using natural groups.
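
The median split described above can be sketched in a few lines of Python. This is only an illustration and not the z-Tree code used in the experiment: the function name `assign_estimator_groups`, the participant labels, and the use of signed deviations (estimate minus true count) are assumptions made for this sketch.

```python
import random


def assign_estimator_groups(deviation_sums, rng=None):
    """Split participants into under- and overestimators at the median.

    deviation_sums maps a participant id to the sum, over the five trials,
    of (estimate - true count). Signed deviations are an assumption here.
    """
    rng = rng or random.Random()
    ids = list(deviation_sums)
    if len(ids) % 2 != 0:
        raise ValueError("need an even number of participants")
    # Shuffle first; the stable sort below then keeps tied scores in
    # random order, which breaks ties at the median randomly.
    rng.shuffle(ids)
    ids.sort(key=lambda pid: deviation_sums[pid])
    half = len(ids) // 2
    return {pid: ("under" if rank < half else "over")
            for rank, pid in enumerate(ids)}


# Hypothetical scores for one matching group of six participants.
scores = {"p1": -7, "p2": 3, "p3": 0, "p4": 12, "p5": -2, "p6": 5}
groups = assign_estimator_groups(scores, random.Random(1))
```

Shuffling before a stable sort is one simple way to obtain two subgroups of exactly three participants even when several scores coincide at the median.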

After finishing Part A, each participant is informed whether they are an over- or underestimator. The experimenter then distributes the instructions for Part B and again reads them aloud, asking the participants to read along.

3.2 Part B: Main experimental task

After answering any remaining questions individually, the experimenter starts Part B of the experiment, which begins with comprehension questions. After all participants have correctly answered these questions, one participant rolls a die to assign the over- and underestimators to the roles of migrants and citizens. We privately inform participants about their respective role assignments, i.e., subgroup memberships.Footnote 8 The experimenter then starts the main experimental task, consisting of 20 periods of the stage game, conducted in four experimental treatments that differ from each other in terms of the distribution scheme (for details see Sect. 3.3). After the 20 periods are over, one of them is randomly selected for payoff, again by one of the participants publicly throwing a die (see Sect. 3.4 for details regarding the payoff scheme).Footnote 9

The participants then answer a computerized questionnaire eliciting socio-demographic characteristics and asking how they view the impact of migrants on the labor market and on the social security and welfare systems. The questionnaire furthermore includes the question for our manipulation check of the group manipulation task (the questionnaire is reproduced in online appendix section E.B). We ask participants to wait until all participants have finished filling in the questionnaire before they are paid in private and leave.

3.3 Treatment variations and distribution policies

Independent of treatment, each participant receives an endowment of 20 points at the beginning of each period. Citizens automatically contribute their 20 points to the jointly created good. Migrants can decide whether to contribute or not. Their contribution decision is a binary, all-or-nothing decision. Therefore, the total contributions C increase in increments of 20 points for each migrant who chooses to contribute.Footnote 10 For experimental convenience (and in line with the largest part of the related literature) C are essentially monetary contributions, although conceptually, C could be thought of as representing contributions of labor and thus of citizens’ and migrants’ time and effort. C is multiplied by a growth factor of 2.5 to yield the total pot \(P = 2.5C\).

Table 1 gives an overview of the distribution policies used in our experiment. Figure 2 furthermore illustrates how the payoffs from the jointly produced good are calculated under the different policies. We denote our distribution policies \(\theta \in \{{\textbf {Equal distribution}},{\textbf {Moderate inequality}},{\textbf {Extreme inequality}},{\textbf {Inefficient distribution}}\}\). In treatment \(\textsc {Equal}\), the only policy available is \({\textbf {Equal distribution}}\), in which citizens and migrants share the benefits from the jointly created good equally, in that 50% of the benefits accrue to the subgroup of citizens and 50% accrue to migrants. If we denote the share of the total pot P that goes to citizens by \(s_{cit}^\theta\) and the share that goes to migrants by \(s_{mig}^\theta\), the shares under policy \(\theta ={\textbf {Equal distribution}}\) are \(s^\theta _{cit} =.5\) and \(s^\theta _{mig} =.5\). In treatment \(\textsc {Equal}\), we ask citizens to individually indicate which share (in steps of 10 percentage points) they would prefer the group of citizens to receive, with the remainder (i.e., 100 percent minus the share they choose) going to the group of migrants. This citizen choice in treatment \(\textsc {Equal}\), however, is entirely hypothetical and does not affect actual payoffs. Instead, all participants share the good’s proceeds equally. Treatment \(\textsc {Equal}\) thus allows us to observe migrants’ contribution decisions in the absence of the threat of discrimination. Participants in the role of citizens were clearly informed that their decisions were hypothetical and were solely intended to give them something to do while migrants made their decisions.

Fig. 2 Calculation of payoffs from the jointly produced good under different distribution policies. Note: Solid edges indicate progression without participant choice, dashed edges indicate migrant choices. The figure presents the distribution policies citizens can choose from across all treatments. Citizens in the 10 independent observations (i.e., matching groups) of treatment \(\textsc {Equal}\) have no choice; there, policy Equal distribution is implemented automatically, implying equal shares for citizens and migrants, i.e., \(s^\theta _{cit} =.5\) and \(s^\theta _{mig} =.5\). Citizens in the 11 observations of treatment \(\textsc {Moderate}\) can choose between Equal distribution and Moderate inequality, citizens in the 12 observations of treatment \(\textsc {Extreme}\) choose between Equal distribution and Extreme inequality, and citizens in the 10 observations of treatment \(\textsc {Inefficient}\) choose between Equal distribution and Inefficient distribution, with the shares listed in the headings of each block. Note that in \(\textsc {Equal}\), votes are cast for hypothetical policies and are thus not payoff-relevant.

Table 1 Distribution policies

In treatments \(\textsc {Moderate}\), \(\textsc {Extreme}\) and \(\textsc {Inefficient}\), citizens can vote either for applying the \({\textbf {Equal distribution}}\) policy or for applying an alternative policy where the payoffs from the jointly created good are distributed asymmetrically between citizens and migrants. We characterize the level of inequality as the percentage point difference between the payoff shares of citizens and migrants. Inequality thus equals 0 in policy Equal distribution. In treatment \(\textsc {Moderate}\), the alternative policy \(\theta ={\textbf {Moderate inequality}}\) implies inequality of .2, as it allocates 60% of the payoffs from the good to citizens and 40% to migrants. In treatment \(\textsc {Extreme}\), the alternative policy \(\theta ={\textbf {Extreme inequality}}\) maximizes inequality at .6, with shares of 80% and 20% for citizens and migrants, respectively. In treatment \(\textsc {Inefficient}\), the alternative policy \(\theta ={\textbf {Inefficient distribution}}\) entails shares of 50% and 40% for the subgroups of citizens and migrants, respectively, and thus inequality of .1.
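
To make the inequality measure concrete, the following Python sketch recomputes it from the shares stated in the text. The dictionary `POLICY_SHARES` and the helper names `inequality` and `efficiency_loss` are illustrative assumptions, not part of the experimental software; the efficiency loss is taken here to be the part of the pot that is paid out to neither group.

```python
# Shares of the total pot in percent, as stated in the text:
# (share of citizens, share of migrants).
POLICY_SHARES = {
    "Equal distribution": (50, 50),
    "Moderate inequality": (60, 40),
    "Extreme inequality": (80, 20),
    "Inefficient distribution": (50, 40),
}


def inequality(policy):
    """Difference between citizens' and migrants' shares, as a fraction."""
    cit, mig = POLICY_SHARES[policy]
    return (cit - mig) / 100


def efficiency_loss(policy):
    """Fraction of the pot that is paid out to neither subgroup."""
    cit, mig = POLICY_SHARES[policy]
    return (100 - cit - mig) / 100


for name in POLICY_SHARES:
    print(name, inequality(name), efficiency_loss(name))
```

Storing the shares as integer percentages keeps the arithmetic free of rounding artifacts that would appear when subtracting decimal fractions directly.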

3.4 Parametrization and sampling

We denote the number of citizens by \(n_{cit}=3\) and the number of migrants by \(n_{mig}=3\). The number of migrants who contribute is denoted by \(0\le n^+_{mig}\le n_{mig}\). C is a function of the number of citizens, of the number of migrants who contribute, and of each participant’s endowment of 20 points, such that \(C=20\cdot \left( n_{cit}+n^+_{mig}\right)\). If all migrants contribute, then each participant’s individual payoff within each of the two subgroups of citizens and migrants (\(\pi _{cit}\), \(\pi _{mig}\)) is equally large. Specifically, it amounts to \(\pi _{cit}=P\cdot \nicefrac {s_{cit}^\theta }{n_{cit}}\) for all citizens and to \(\pi _{mig}=P\cdot \nicefrac {s_{mig}^\theta }{n_{mig}}\) for all migrants.Footnote 11,Footnote 12 Following each period, we inform citizens and migrants about each other participant’s contribution and payoff.
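
The payoff rule can be illustrated with a short Python sketch based on the parameters given in the text (endowment of 20 points, growth factor of 2.5, three citizens and three migrants). The function name `payoffs` and its return format are assumptions made for this illustration; shares are passed as integer percentages.

```python
ENDOWMENT = 20        # points per participant and period
GROWTH_FACTOR = 2.5   # multiplier applied to total contributions C
N_CIT = 3             # citizens per matching group
N_MIG = 3             # migrants per matching group


def payoffs(n_contributing_migrants, share_cit_pct, share_mig_pct):
    """Return per-person payoffs in points for one period.

    The result is a tuple (citizen, contributing migrant,
    non-contributing migrant). Non-contributing migrants keep their
    endowment in addition to their share of the pot.
    """
    if not 0 <= n_contributing_migrants <= N_MIG:
        raise ValueError("number of contributing migrants out of range")
    total_contributions = ENDOWMENT * (N_CIT + n_contributing_migrants)  # C
    pot = GROWTH_FACTOR * total_contributions                            # P
    pi_cit = pot * share_cit_pct / (100 * N_CIT)
    pot_share_mig = pot * share_mig_pct / (100 * N_MIG)
    return pi_cit, pot_share_mig, pot_share_mig + ENDOWMENT


# All migrants contribute under Equal distribution: C = 120, P = 300.
print(payoffs(3, 50, 50))
# All migrants contribute under Extreme inequality.
print(payoffs(3, 80, 20))
```

The third tuple entry is only relevant when at least one migrant withholds the endowment; it is returned in all cases to keep the function signature simple.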

The first five periods of each experimental session serve as initialization periods and use the Equal distribution policy.Footnote 13 Nevertheless, all participants are already aware of the exact alternative policy (if any) that will become available for implementation from period 6 onwards. We chose \({\textbf {Equal distribution}}\) as the simplest of the available rules, since these initialization periods are intended to familiarize participants with the experimental task and to establish choice patterns which participants can condition their later choices on. They also allow for experiencing the causal effect of migrants’ initial behavior on citizens’ policy decisions. Starting with period 6, citizens at the beginning of each period participate in a majority vote on whether to implement the \({\textbf {Equal distribution}}\) policy or, alternatively, a discriminatory policy where migrants receive lower shares from the good’s proceeds than do citizens. In addition to unequal shares for migrants and citizens, treatment \(\textsc {Inefficient}\) models a situation characterized by efficiency losses. If citizens implement the Inefficient distribution policy, the distribution shares are \(s_{cit}^{\theta }=.5\) and \(s_{mig}^{\theta }=.4\), implying inefficiency of .1 times (or 10 percent of) the total benefit. This inefficiency can be interpreted as reflecting transaction costs for the implementation of the discriminatory policy. Citizens who vote for the discriminatory policy in this treatment thus exhibit a preference for disadvantaging migrants even in the absence of a concurrent monetary benefit to themselves.
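
The timing of the policy choice can be summarized in a small Python sketch. It is a simplified reading of the procedure described above, not the original implementation: the function name `policy_in_period` and the representation of votes as booleans are assumptions, and treatment \(\textsc {Equal}\) is modelled by passing no alternative policy.

```python
INITIALIZATION_PERIODS = 5  # periods 1 to 5 always use Equal distribution


def policy_in_period(period, votes_for_alternative, alternative_policy):
    """Return the policy in force in a given period (1 to 20).

    votes_for_alternative holds one boolean per citizen (True means a
    vote for the discriminatory alternative). alternative_policy is None
    in treatment Equal, where no alternative exists.
    """
    if period <= INITIALIZATION_PERIODS or alternative_policy is None:
        return "Equal distribution"
    # Simple majority among the citizens; with three voters no tie can occur.
    if sum(votes_for_alternative) > len(votes_for_alternative) / 2:
        return alternative_policy
    return "Equal distribution"


print(policy_in_period(3, [True, True, True], "Extreme inequality"))
print(policy_in_period(6, [True, True, False], "Extreme inequality"))
```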

We ran our experiment between November and December 2019 in the Max-Jung-Laboratory for Experimental Economic Research at the University of Graz, Austria, studying 258 students’ decisions in 43 groups of 6 students each. The average age was 24 years (\(SD=4.80\), range: 18 to 63 years), 58 percent were female, and 69 percent were Austrians. We recruited our participants using ORSEE (Greiner 2015). Sessions lasted 75 min on average. We paid participants in euros immediately after the session ended, using an exchange rate from experimental points (P) to euros of 37 P = 1 euro. Participants earned an average of 15.91 euros (\(SD=3.82\), range: 4.90 to 27.10).

We programmed the experiment in z-Tree (Fischbacher 2007). The experiment was not pre-registered, as pre-registration was not yet considered the norm at the time the experiment was planned and conducted. The research was approved by the Ethics Committee of the University of Graz.

3.5 Framing

Instead of the more common approach of providing experimental participants with an abstract decision-making situation, we chose to use the context-framed (i.e., meaningful) terms of ‘economic migrants’ and ‘destination country citizens’ for the two roles in our experiment. In doing so, we follow Alekseev et al. (2017), who find that such context helps participants better understand a setting and make more consistent decisions.

(Economic) migrants are often perceived as threatening local jobs and burdening the social security and welfare (SSW) systems (Ogunyemi 2012; Kershen 2005). (A policy motivated by such perceptions is Austria’s indexation of child benefits, under which workers with EU citizenship who work in Austria, yet whose children live outside of Austria, receive lower child benefits; see PwC 2021.) In fact, economic migrants’ contributions to their destination country are strongly determined by two factors: economic migrants’ willingness to take up work and integrate into society, and the willingness of destination country citizens and policy-makers to in turn allow and enable economic migrants to pursue formal work and integrate into society and its SSW systems (Spies and Schmidt-Catran 2016; UNDESA 2017; Phillimore et al. 2018; World Bank Group 2018). It is precisely this two-way and often reciprocal willingness to allow for, and to engage in, integration into the labor market and the SSW systems that we use as the context for our study of intergroup cooperation. More precisely, we frame our experiment as investigating how economic migrants react to (1) the possibility and subsequent implementation of redistributive policies which lead to a distribution of SSW benefits that treats destination country citizens and economic migrants unequally, as opposed to (2) the possibility but subsequent non-implementation of such policies, and (3) the situation where such policies are neither being discussed nor implemented. We also show how experimental participants in the role of destination country citizens react to the observed labor market choices of the participants in the role of economic migrants.

By framing our experiment similarly to Mitterbacher et al. (2024), we attempt to map the real-life issue of economic migrants entering a destination country’s labor market and SSW systems onto our experimental design.Footnote 14 Independent of treatment, migrants may choose whether or not they wish to enter the formal labor market. Translated to our framing, they can choose whether or not they wish to contribute to the jointly created good (the SSW systems). The citizens in the experiment are considered to be employed and thus get no choice about whether or not they contribute to the SSW systems, but rather contribute automatically.Footnote 15 Conceptually, the employer directly transfers part of the employees’ wages to the SSW systems (both for migrants and citizens).Footnote 16 Regardless of their employment statuses, all migrants and citizens receive benefits from the SSW systems. While the total benefits paid out by the SSW systems depend on migrants’ contribution decisions (as citizens contribute automatically), the share of this total that an individual participant receives also depends on the experimental treatment and on citizens’ policy decisions.Footnote 17

4 Hypotheses

Previous studies have typically discussed two polar outcomes regarding the contributions to public goods: the Nash equilibrium, i.e., zero contribution, and the social optimum, i.e., full contribution (Hichri 2005). Rational participants with no (binding) means of coordination will not contribute to the pool, therefore failing to increase their wealth beyond their initial endowments. The Nash equilibrium thus entails defection and yields total welfare that is low relative to the social optimum, in which everyone contributes fully. The social optimum is defined as the maximum possible welfare that migrants and citizens can collectively attain. In our study, the social optimum that all treatments have in common is characterized by contributing migrants. In the case of treatment \(\textsc {Inefficient}\), it further requires non-discriminating citizens (i.e., citizens voting for the Equal distribution policy). The inverse behavior, i.e., non-contributing migrants and discriminating citizens, yields the Nash equilibrium.Footnote 18
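
As a rough check on these two benchmarks, one can compare what a single migrant gains and loses by contributing, using only the parameters from Sect. 3 (endowment of 20 points, growth factor of 2.5, \(n_{mig}=3\)). The symbol \(\Delta \pi _{mig}\) is introduced here purely for illustration and is not part of the notation used elsewhere; the calculation is a sketch that abstracts from repeated-game considerations.

\[
\Delta \pi _{mig} = 2.5\cdot 20\cdot \frac{s_{mig}^\theta }{n_{mig}} - 20 = \frac{50\, s_{mig}^\theta }{3} - 20 \le \frac{50\cdot .5}{3} - 20 \approx -11.67 < 0 .
\]

Since \(s_{mig}^\theta \le .5\) under every policy, a migrant who contributes gives up 20 points and receives at most about 8.33 points in return, so withholding the endowment maximizes the individual stage-game payoff. At the same time, each contribution raises the total amount paid out from the pot by \(2.5\cdot 20\cdot (s_{cit}^\theta + s_{mig}^\theta )\), i.e., by 50 points when the shares sum to one and by 45 points under Inefficient distribution, which in either case exceeds the 20 points forgone. Full contribution therefore maximizes total welfare.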

We introduce six hypotheses: the first three concern migrants’ and the other three concern citizens’ behavior. To facilitate comprehension of how our hypotheses map into the experimental design and, ultimately, the results, we formulate our hypotheses and our results by referring to reciprocity between ‘migrants’ and ‘citizens’ instead of using the more abstract terms of intergroup cooperation between in- and outgroup members. Where we discuss the conceptual implications of our design and results (e.g., when deriving our hypotheses, or when presenting the conclusions of our study), however, we will use wording that aligns with the language of the literature on other-regarding preferences and cooperation in (asymmetric) social dilemmata.

The asymmetric payoff structure, i.e., the key feature of our experiment, forms the basis for Hypothesis 1. As the literature reviewed in Sect. 2 shows (e.g., Marwell and Ames 1979; Rapoport and Suleiman 1993; Fisher et al. 1995), inequality generally tends to weaken cooperation. We thus expect migrants’ average contribution rates to decline as potential inequality (i.e., the level of inequality that citizens can implement) increases. Consequently, we expect the average contribution rates to be highest in \(\textsc {Equal}\) and lowest in \(\textsc {Extreme}\).Footnote 19 Our first hypothesis thus is:

Hypothesis 1

A migrant’s probability of contributing decreases in the level of potential inequality available in the discriminatory policy.

Testing approach: We test Hypothesis 1 by conducting a non-parametric Kruskal–Wallis test and a Tukey post-hoc test (both in Sect. 5.1) and by estimating the multilevel probit regression Models (1) through (3) reported in Table 2.

Table 2 Multilevel probit regression for a migrant contributing

Hypotheses 2 and 3 focus on migrants’ contribution decisions and address the endogenous factors most relevant to our setting: pro-social preferences and past experience. We expect the participants in our experiment to tend towards two types of behavior: on the one hand, contribution behavior that is dominated by self-interest, and on the other hand, contribution behavior that is dominated by pro-sociality. Participants with pro-social preferences are willing to increase social welfare by contributing in a public good setting (Epper et al. 2020; Chambers 2012; Fleurbaey and Zuber 2013). Such cooperation, however, is susceptible to being exploited by free-riding (Ertan et al. 2009). Cooperation is rarely unconditional. Even if migrants initially contribute, they typically cease to do so as time passes and as they observe other migrants failing to contribute while reaping higher payoffs (Fischbacher et al. 2001; Keser and Van Winden 2000). We thus predict the following behavioral pattern to emerge after some initial fluctuations: if contributing migrants are faced with non-contributing migrants, the former will become less likely to cooperate in the future; if contributing migrants are faced with other contributing migrants, they will become more likely to cooperate in the future. Taking these factors into consideration, we state our second hypothesis as follows:

Hypothesis 2

A migrant’s probability of contributing increases in the number of other migrants who contributed in the preceding period.

Testing approach: We test Hypothesis 2 by estimating the multilevel probit regression Model (2), corroborated by Model (3), both reported in Table 2.

Migrants faced with discrimination by citizens can react by contributing (e.g., in an attempt to favorably affect citizens’ voting behavior in subsequent rounds) or by not contributing (e.g., out of anger, or a wish to ‘punish’ citizens in turn). Which of the two they choose is an open question and may be driven by traits of the individual decision-maker or situational factors. Nevertheless, Houser et al. (2008) show that the threat of punishment can lead to decreased cooperation and Houser et al. (2012) show that individuals who feel that they are treated unfairly in an interaction with another person are more likely to cheat going forward. Based on these findings, we predict that contributing migrants, faced with citizens who vote for discrimination, will become less likely to cooperate in the future. We thus state our third hypothesis as follows:

Hypothesis 3

A migrant’s probability of contributing decreases in the number of citizens who voted for discrimination in the preceding period.

Testing approach: We test Hypothesis 3 by estimating the multilevel probit regression Model (3) reported in Table 2, corroborated by the Models OA(12) and OA(13) reported in Table OA.4.

Our remaining three hypotheses cover citizens’ discrimination behavior. Hypothesis 4 relates to the treatment effect in levels of discrimination. In this first hypothesis addressing citizens’ behavior, we assume that citizens decide without regard to the effect of their choices on how migrants may respond in subsequent periods. Under this assumption, discrimination is likely to be driven by profit maximization (Koplin 1963).Footnote 20 Yet profit maximization is available as a motive for discrimination only in \(\textsc {Moderate}\) and \(\textsc {Extreme}\), since citizens in \(\textsc {Inefficient}\) reap no monetary benefits from discriminating. If citizens choose without conditioning their behavior on its potential effect on future migrant decisions, fully profit-maximizing citizens would always vote for discrimination in \(\textsc {Moderate}\) and \(\textsc {Extreme}\), while they would be indifferent between equal distribution and discrimination in \(\textsc {Inefficient}\). Citizens whose behavior is to some degree driven by anti-social preferences about the payoff to others would face greater incentives for discrimination in \(\textsc {Extreme}\) than in \(\textsc {Moderate}\), since both the intensity of, and the payoffs from, discrimination—given a fixed level of migrant contributions—are greater in the former than in the latter. Accordingly, we predict average voting rates for discrimination will be highest in \(\textsc {Extreme}\) and lowest in \(\textsc {Inefficient}\). In light of these arguments, our fourth hypothesis is:

Hypothesis 4

A citizen’s probability of voting for discrimination increases in the level of potential inequality available in the discriminatory policy.

Testing approach: We test Hypothesis 4 by conducting a non-parametric Kruskal–Wallis test supplemented by pairwise U-tests and a Tukey post-hoc test (all three tests in Sect. 5.2) and by estimating the multilevel probit regression Models (4) through (6) reported in Table 3.

Table 3 Multilevel probit regression for a citizen voting for discrimination against migrants

Our final two hypotheses relate to potential drivers of discrimination rooted in other-regarding preferences. For the fifth hypothesis, we assume that citizens condition their behavior on the past decisions of their peers, since the literature documents conditional interactions both in general (Kourtellos and Petrou 2020) and in public goods games in particular (Li et al. 2013; Battu and Srinivasan 2020). Conditional behavior can arise from, e.g., social norms (Bicchieri 2010) or peer pressure (Mittone and Ploner 2011).

Social norms are implicit behavioral rules based on shared expectations of how individual group members should typically behave in a given situation. In general, descriptive social norms, i.e., norms that refer to actually predominant behavior in groups, induce positive feedback loops between individual and group behavior: the more others someone observes following a social norm, the more strongly this someone is motivated to follow it themselves (Burke and Young 2011). In our study, voting for the discriminatory policy may be—or may become—the descriptive social norm.

Similarly, peer pressure can also affect individual behavior in decision-making (Mani et al. 2013). When peers observe choices (such as in our experiment, where the end-of-period payoff screen reveals how many citizens voted for policy implementation), individuals can be encouraged to emulate the decisions of others (Mittone and Ploner 2011; Cheng et al. 2020). In our case, citizens may vote for the discriminatory policy solely out of peer pressure. Mittone and Ploner (2011) find that the presence of group identity promotes peer pressure. This effect is also likely to be active in our experiment, since our group formation task was designed to induce precisely such feelings of group identity among our participants. We specify our fifth hypothesis as follows:

Hypothesis 5

A citizen’s probability of voting for discrimination increases in the number of other citizens who voted for discrimination in the preceding period.

Testing approach: We test Hypothesis 5 in the multilevel probit regression Models (5) and (6) reported in Table 3, corroborated by Model OA(15) reported in Table OA.5.

Besides the assumed conditional behavior between citizens, we expect a conditional cross-relationship between citizens and migrants. Citizens anticipate the potential effects of their own actions on migrants’ decisions. Through such a channel, discrimination can, for example, function as a form of punishment. Houser and Xiao (2010) show that many people punish after having received disadvantageous outcomes. We take this as evidence for reciprocal concerns and use it to formulate Hypothesis 6. We predict the following behavioral pattern to emerge after some initial fluctuations: citizens will reciprocate migrants’ behavior by discriminating when the latter do not contribute, and by refraining from discriminating when the latter do. In a sense, citizens may thus use discrimination as an educational device to ‘teach’ migrants to behave cooperatively. Note that in \(\textsc {Inefficient}\), this may in fact be the main driver of votes for discrimination. The reason is that citizens in \(\textsc {Inefficient}\) have no direct (i.e., unconditional) profit incentive for discriminating against migrants, because citizens reap no personal, monetary benefits from such discrimination. Our sixth hypothesis thus is:

Hypothesis 6

A citizen’s probability of voting for discrimination decreases in the number of migrants who contributed in the preceding period.

Testing approach: We test Hypothesis 6 in the multilevel probit regression Model (6) reported in Table 3, corroborated by Model OA(16) reported in Table OA.5.

5 Results

We first report our findings regarding the contribution behavior of (students in the role of) migrants and then those regarding the discrimination behavior of (students in the role of) citizens.Footnote 21 With one exception that we will mention later, all of our results are robust to controlling for participant characteristics.

5.1 Migrant behavior

We begin our analysis of migrant behavior by studying average contribution rates and their development over time. Figure 3 plots the average share of contributing migrants (which, due to the binary nature of contributions, equals average contribution rates) by period, both averaged over the four treatments and separately for each. We observe typical endgame effects (Stoecker 1983; Selten and Stoecker 1986). Contribution rates drop in period 19 in all treatments and again in period 20 in treatments \(\textsc {Equal}\) and \(\textsc {Moderate}\). We therefore exclude periods 19 and 20 from the remaining analyses.

Fig. 3

Time path of the average share of migrants contributing. Note: Solid lines show the average share of migrants contributing to the jointly created good under the \({\textbf {Equal distribution}}\) policy (\(\textsc {Equal}\)) and averaged for all policies/treatments (AVERAGE); dashed lines show shares under the discriminatory policies (\(\textsc {Moderate}\), \(\textsc {Extreme}\), and \(\textsc {Inefficient}\)). Average contribution rates in a given matching group and period are 0, 1/3, 2/3 or 1, depending on the number of migrants in the matching group who contribute

The all-treatment average contribution rate follows a downward trend, just as in most repeated linear public goods games (see, e.g., Ledyard 1995, 2020). Average contributions decline from 59.3 percent in period 1 to 48.5 percent in period 18. In the individual treatments, we observe a similar trend in \(\textsc {Moderate}\), \(\textsc {Extreme}\) and (to a lesser extent) \(\textsc {Inefficient}\).

Focusing on the five initialization periods, a Kruskal–Wallis test using average contribution rates at the group level as the independent observations does not detect a statistically significant treatment difference (\(\chi ^{2}(3)=4.6133, p=.2024\)).
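As a plausibility check, the p-value of a Kruskal–Wallis test can be recovered from its test statistic, assuming the statistic is referred to a chi-square distribution with the stated degrees of freedom (the usual large-sample approximation). The sketch below is not the authors' code; the helper name `chi2_sf` is hypothetical, and it relies only on closed-form expressions for integer degrees of freedom from the Python standard library.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^k / k!.
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: complementary error function plus a finite sum of gamma terms.
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += half ** (k - 0.5) / math.gamma(k + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Statistic reported for the five initialization periods: chi2(3) = 4.6133.
p_init = chi2_sf(4.6133, 3)
print(f"p = {p_init:.4f}")
```

Under this assumption the function should reproduce the reported \(p=.2024\) up to rounding. The same helper can be applied to the other chi-square statistics reported in this section, including the likelihood ratio tests.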

Turning to a first look at the treatment differences in contribution rates, we plot average contribution rates in Fig. 4. We observe the highest average contribution rates in \(\textsc {Equal}\) (60.2 percent), closely followed by \(\textsc {Inefficient}\) (59.1 percent). We find considerably lower rates in \(\textsc {Moderate}\) (43.1 percent), with \(\textsc {Extreme}\) in between (50.5 percent). These differences turn out not to be statistically significant when we use matching groups’ average contribution rates from periods 1–18 as the independent observations in a Kruskal–Wallis test (\(\chi ^{2}(3)=3.7493, p=.2898\)). In general, migrants’ lower average contribution rates in \(\textsc {Moderate}\) and \(\textsc {Extreme}\) compared to \(\textsc {Equal}\) and \(\textsc {Inefficient}\) would be consistent with discriminatory policies having a negative impact on average contribution rates. However, observing differences in average contribution rates in the different treatments would not yet allow us to pin down whether the differences arise from the mere availability of such policies or from their actual implementation. Yet, this interdependence between contribution decisions, policy treatment variations, and policy voting decisions of citizens is what we are interested in. To account for this interplay of factors, we continue our analysis at the participant\(\times\)period level using multilevel probit regressions. We present the results in Table 2.

Fig. 4

Share of contributing migrants in the different treatments. Note: The figure shows the average contribution rates to the jointly created good over periods 1 through 18, with 95% confidence intervals and separately for each treatment. See online appendix Figure OA.1 for a version of the figure including all periods (1 through 20)

Our model includes three hierarchical levels to account for the nested structure of our data: the level of the participant (level 1), the level of the period (level 2) and the level of the matching group (level 3). In Model (1) we regress a migrant’s binary contribution decision (1 = contribute) in the current period on four treatment dummy variables to be able to test for treatment differences in the probability of contributing. To account for treatment-specific time trends, we include interactions between the treatment dummies and the period number, P. We observe good model fit, as indicated by AIC decreasing from 2325.07 in the NULL-model (see online appendix section C for details) to 2267.23 in Model (1). A log-likelihood ratio test comparing Model (1) to the NULL-model also yields a significant, favorable result (\(\chi ^{2}(7)=71.8402, p<.0001\)). Since our models do not include a constant, we refrain from interpreting differences between the coefficients of the treatment dummies themselves.Footnote 22 We instead analyze these coefficients using a Tukey post-hoc test and find that the coefficients of \(\textsc {Extreme}\) (1.11, \(p=.0410\)) and \(\textsc {Inefficient}\) (1.16, \(p=.0459\)) are significantly greater than that of \(\textsc {Moderate}\) (\(-0.24\)), indicating a significantly higher probability to contribute in the former two treatments compared to \(\textsc {Moderate}\). The higher probability of contributing in \(\textsc {Extreme}\) and \(\textsc {Inefficient}\) is further documented by the fact that we find coefficients significantly greater than zero for \(\textsc {Extreme}\) (1.11, \(p=.0023\)) and \(\textsc {Inefficient}\) (1.16, \(p=.0039\)) only. All other pairwise comparisons yield no significant results (all \(p >.05\)). 
The significant, negative coefficients of the interaction terms of P with both \(\textsc {Extreme}\) (\(-0.10\), \(p<.0001\)) and \(\textsc {Inefficient}\) (\(-0.06\), \(p<.0001\)) confirm the time trend observed for these two treatments in Fig. 3. The interaction between P and the \(\textsc {Equal}\) treatment dummy yields no significant evidence of a time trend in this treatment (0.02, \(p=.0836\)).
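To give a feel for the size of these probit coefficients, the sketch below converts the rounded Model (1) estimates quoted above into implied contribution probabilities via the standard normal distribution function. It rests on several assumptions that are ours, not the authors': all random effects are set to zero, P enters as the raw period number, and the rounded coefficients are used as if exact. The resulting numbers are therefore illustrative only and need not match the raw averages in Fig. 3.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal distribution function, the probit link's inverse."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Rounded Model (1) estimates quoted in the text:
# (treatment dummy coefficient, coefficient of the dummy x P interaction).
COEFFS = {"Extreme": (1.11, -0.10), "Inefficient": (1.16, -0.06)}


def implied_probability(treatment: str, period: int) -> float:
    """Implied probability of contributing with all random effects at zero (assumption)."""
    level, slope = COEFFS[treatment]
    return std_normal_cdf(level + slope * period)


for period in (1, 9, 18):
    row = ", ".join(f"{t}: {implied_probability(t, period):.2f}" for t in COEFFS)
    print(f"P = {period:2d} -> {row}")
```

The negative interaction coefficients translate into implied probabilities that fall from period to period, which is the time trend described in the text.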

Model (2) extends Model (1) by including the contribution decision in the first period as a proxy of a participant’s ‘type’, interpreted as being either generally cooperative (ContribOwn\(_{t=1}\) = 1) or not (ContribOwn\(_{t=1}\) = 0).Footnote 23 We also include the participant’s once-lagged contribution decision and two once-lagged dummy variables for whether one or two of the other migrants contributed. The positive and significant coefficients for the contribution decision in the first period (1.07, \(p<.0001\)) and also for the lagged contribution decision (0.45, \(p<.0001\)) indicate behavioral stability of the contribution decision. We also find significantly positive coefficients for the other migrants’ lagged contribution decisions (0.26, \(p=.0038\) for one other migrant, and 0.92, \(p<.0001\) for two other migrants contributing). A decrease in AIC to 2122.05 from 2267.23 in Model (1) and a significant log-likelihood ratio test (\(\chi ^{2}(4)=153.1846\), \(p<.0001\)) document increased model fit.

Model (3) adds three dummy variables that reflect whether one, two, or three citizens (with zero serving as the base category) voted in favor of the discriminatory policy in the preceding period. This allows us to study how migrants react to an increasing number of citizens who vote for the discriminatory policy. Note that when we include this predictor, we need to exclude observations from treatment \(\textsc {Equal}\), as citizens did not have the option of implementing discriminatory policies in this treatment. The coefficients for all three dummy variables are negative and significantly different from zero, with the coefficient for only one citizen voting for the discriminatory policy in the previous period being the smallest of the three in absolute terms (\(-0.37\), \(p =.0488\)). Coefficients for two and three citizens voting for the discriminatory policy are larger in absolute terms (\(-1.15\) and \(-1.13\), respectively, with both \(p <.0001\)). Thus, when even one citizen votes for the discriminatory policy, as compared to none, migrants become less likely to contribute in the subsequent period. The effect is stronger, however, if a majority of two or three citizens votes in favor of the discriminatory policy.

A decrease in AIC to 1059.33 from 1115.26 in Model OA(11) (online appendix Table OA.4) and a significant log-likelihood ratio test (\(\chi ^{2}(3)=61.9243, p<.0001\)) document increased model fit.
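The reported AIC changes and likelihood ratio statistics can be cross-checked against each other. Assuming the standard definition \(\text{AIC}=2k-2\ln L\), that each likelihood ratio statistic compares a model with the reference model named alongside it, and that the test's degrees of freedom equal the number of added parameters \(\Delta k\), the two quantities are linked as follows:

\[
\begin{aligned}
\Delta \text{AIC} &= 2\,\Delta k - \text{LR}, \qquad \text{LR} = 2\left( \ln L_{1} - \ln L_{0}\right) \\
\text{Model (1) vs. NULL:}\quad & 2\cdot 7 - 71.8402 = -57.8402 \approx 2267.23 - 2325.07 \\
\text{Model (2) vs. Model (1):}\quad & 2\cdot 4 - 153.1846 = -145.1846 \approx 2122.05 - 2267.23 \\
\text{Model (3) vs. Model OA(11):}\quad & 2\cdot 3 - 61.9243 = -55.9243 \approx 1059.33 - 1115.26
\end{aligned}
\]

In each case the implied AIC change agrees with the reported values up to rounding.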

Hypothesis 1 expresses our expectation that migrants’ willingness to contribute decreases in the level of potential inequality available through the discriminatory policy. However, given the varying coefficients of the treatment dummies in Models (1) through (3) and our results from the earlier Kruskal–Wallis test, we conclude that there is no clear evidence of a significant treatment effect in overall contribution rates. We thus formulate our first result as follows:

Result Hypothesis 1

We do not identify a consistent effect of the level of potential inequality available in the discriminatory policy on migrants’ contribution probability.

The probability of contributing in Model (2) increases when one, and even more so when both, of the other migrants contributed in the preceding period. After controlling for the number of votes in favor of a discriminatory policy in the preceding period (Model (3)), only the effect of both other migrants contributing remains significant. This shows that migrants condition their contribution decision on the past contribution decisions of their fellow migrants, with both other migrants contributing having a stronger, more robust effect. We therefore formulate our next result in favor of Hypothesis 2:

Result Hypothesis 2

Migrants’ probability of contributing increases in the number of other migrants who contributed in the preceding period.

In our third hypothesis regarding migrant behavior, we study the effect of citizens’ votes for the discriminatory policy. Our results regarding this question are relatively clear. Model (3) documents that greater numbers of citizens voting in favor of the policy reduce migrant contributions in the subsequent period, with the effect being more pronounced when a majority of citizens votes in favor of the policy (and the discriminatory policy thus gets implemented).

Result Hypothesis 3

Migrants’ probability of contributing decreases significantly in the number of citizens who voted for discrimination in the preceding period.

5.2 Citizen behavior

We continue our analysis by studying citizens’ average rates of voting for discriminatory policies and these rates’ development over time, depicted both in average form and by individual treatment, in Fig. 5. Note that, since policy implementation requires a majority vote (i.e., two of the three citizens voting in favor), implementation rates differ slightly from vote rates (but they exhibit the same trends, as documented in online appendix Figure OA.3). We again observe an endgame effect in the form of an increasing share of citizens voting for the implementation of the discriminatory policy starting with period 19. Both for this reason and in the interest of consistency with the results regarding the migrants, we exclude the last two periods from our further analyses of citizen behavior.
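The gap between vote rates and implementation rates follows directly from the majority rule. The sketch below illustrates this with purely hypothetical vote counts for four matching groups in a single period; the data and function names are ours and do not come from the experiment.

```python
CITIZENS_PER_GROUP = 3
MAJORITY = 2  # two of the three citizens, as described in the text


def vote_rate(votes_per_group: list[int]) -> float:
    """Share of all citizens who vote for the discriminatory policy."""
    return sum(votes_per_group) / (CITIZENS_PER_GROUP * len(votes_per_group))


def implementation_rate(votes_per_group: list[int]) -> float:
    """Share of matching groups in which the policy reaches a majority."""
    implemented = sum(1 for v in votes_per_group if v >= MAJORITY)
    return implemented / len(votes_per_group)


# Hypothetical number of votes for the policy in each of four matching groups.
example_votes = [2, 2, 1, 0]
print(f"vote rate: {vote_rate(example_votes):.3f}")
print(f"implementation rate: {implementation_rate(example_votes):.3f}")
```

In this made-up example, five of twelve citizens vote for the policy, yet it is implemented in two of four groups, so the two rates differ even though they are driven by the same votes.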

Fig. 5

Time path of the average share of citizens who vote for discrimination against migrants. Note: The solid line shows the average share of citizens voting for the implementation of the discriminatory policy averaged over all treatments (AVERAGE); dashed lines show shares under the discriminatory policies (\(\textsc {Moderate}\), \(\textsc {Extreme}\), and \(\textsc {Inefficient}\)). Average voting rates are 0, 1/3, 2/3 or 1, depending on the number of citizens in a matching group who vote for the discriminatory policy

Unlike in migrants’ contribution behavior, we do not observe a clear time trend in citizens’ voting behavior. The overall share of citizens voting for the implementation of the discriminatory policy starts at 51.9 percent in period 6 and ends at 45.9 percent in period 18. However, we see clearly lower voting rates throughout the experiment in treatment \(\textsc {Inefficient}\), starting at 33.3 percent in period 6 and ending at 13.3 percent in period 18. We confirm this difference in overall average voting rates by the findings documented in Fig. 6. In \(\textsc {Inefficient}\), citizens on average vote for the implementation of the discriminatory policy 24.1 percent of the time. In contrast, citizens on average voted more often to implement the discriminatory policy in both treatment \(\textsc {Moderate}\) (50.8 percent) and treatment \(\textsc {Extreme}\) (56.4 percent). A Kruskal–Wallis test of group average voting rates over periods 6 through 18 finds a significant difference (\(\chi ^{2}(2)=11.024, p=.0040\)). Pairwise comparisons with U-tests show that average voting rates for the discriminatory policy in \(\textsc {Inefficient}\) are, after Holm correction for multiple testing, significantly lower than in both \(\textsc {Moderate}\) (\(p =.0164\)) and \(\textsc {Extreme}\) (\(p =.0122\)), while the latter two do not differ significantly (\(p =.2939\)). Note that both \(\textsc {Moderate}\) and \(\textsc {Extreme}\) provide incentives for participants with selfish (i.e., payoff maximizing) preferences to vote for discriminatory policy implementation that are absent in \(\textsc {Inefficient}\). 
Our results thus show that, when such incentives for selfish votes are absent, efficiency concerns and pro-social preferences attenuate the tendency to vote for the discriminatory policy.Footnote 24 Conversely, if efficiency is not an issue, or the selfish incentives outweigh other-regarding considerations, i.e., in treatments \(\textsc {Moderate}\) and \(\textsc {Extreme}\), the level of potential inequality that the policy offers does not affect voting rates. This suggests a type of binary behavior: When they can personally profit from its implementation, citizens with predominantly selfish preferences vote for the policy, irrespective of the precise level of potential inequality available (hence no significant difference between \(\textsc {Moderate}\) and \(\textsc {Extreme}\)). When citizens cannot personally profit from the policy’s implementation, they vote against its implementation. They thus do not exhibit behavior driven by such motivations as spite.
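The Holm correction applied to the pairwise U-tests above is a step-down procedure, sketched below. The raw p-values in the example are hypothetical placeholders chosen only to show the mechanics; the unadjusted values of the study are not reported in the text, and the function name `holm_adjust` is ours.

```python
def holm_adjust(p_values: list[float]) -> list[float]:
    """Return Holm-adjusted p-values in the original order of the inputs."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three pairwise comparisons.
raw = [0.010, 0.020, 0.400]
print([round(p, 3) for p in holm_adjust(raw)])
```

With three comparisons, the smallest raw p-value is multiplied by three, the second by two, and the largest is left unchanged, which is why Holm is less conservative than a plain Bonferroni correction.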

Fig. 6

Share of citizens’ votes for discrimination in the different treatments. Note: The figure shows the average voting rates in favor of a discriminatory policy over periods 6 through 18, with 95% confidence intervals and separately for each treatment. See online appendix Figure OA.2 for a version of the figure including all periods (1 through 20)

We analyze predictors of citizens’ voting behavior using multilevel probit regressions and present the results in Table 3. We again account for the nested data structure at the participant (level 1), period (level 2) and matching group (level 3) levels. Model (4) includes dummy variables for the three treatments with payoff-relevant citizen decisions, and interactions of these dummies with P. Using Tukey post-hoc tests, we find no significant treatment differences (all pairwise comparisons \(p>.05\)) for the treatment dummies when controlling for treatment-specific time trends. Overall, AIC decreases from 1179.54 in the NULL-Model (see online appendix section C for details) to 1169.04 in Model (4) and a likelihood ratio test yields a significant result (\(\chi ^{2}(5)=20.4935, p=.0010\)).

Model (5) adds the citizen’s lagged voting decision and dummy variables for the number of citizens voting for the discriminatory policy in the matching group in the preceding period (again with zero as the base category). We also add the citizen’s voting decision in period 6 as a proxy for the participant’s preference prior to observing the decisions of their fellow citizens. The latter receives a significantly positive coefficient (1.01, \(p<.0001\)): Citizens who voted for implementation of the discriminatory policy in period 6 are more likely to do so in subsequent periods. A citizen’s vote in the preceding period similarly has a positive, significant effect (0.66, \(p<.0001\)). Finally, we observe positive regression coefficients for the dummy variables for one (0.59, \(p =.0009\)), two (0.37, \(p =.0764\)) and three (0.59, \(p =.0300\)) citizens voting in favor of the discriminatory policy in the preceding period, although they are significant only for one and three citizens. This indicates that any number of citizens voting for the discriminatory policy in the preceding period, compared to the base category of no citizen voting for it, increases the likelihood of voting for the policy in the current period. For the model overall, AIC decreases by 74.69 compared to Model (4) and a likelihood-ratio test is significant (\(\chi ^{2}(5)=84.6899\), \(p<.0001\)).

To analyze the dependence of citizens’ voting behavior on migrants’ contribution behavior, we finally include dummy variables for the number of contributing migrants, lagged by one period and with none as the base category, in Model (6). All three dummies show negative coefficients, yet they are significant only for two or three migrants contributing (\(-0.17\), \(p=.2100\) for one, \(-0.60\), \(p =.0002\) for two, and \(-1.20\), \(p<.0001\) for three migrants contributing). Because the coefficients become more negative as the number of contributing migrants rises, the probability of a citizen voting for the discriminatory policy in the current period decreases in the number of migrants who contributed in the preceding period. AIC of Model (6) decreases by 31.95 compared to Model (5) and a likelihood-ratio test is significant (\(\chi ^{2}(3)=37.9496, p<.0001\)), together showing increased model fit.

Turning to Hypothesis 4, which states that the probability of voting for the discriminatory policy increases with the level of potential inequality, we find the treatment effects to be more stable for citizens’ voting decisions than they were for migrants’ contribution decisions. A Kruskal–Wallis test and pairwise U-tests suggest that citizens are significantly less likely to vote for the discriminatory policy in treatment \(\textsc {Inefficient}\). Nevertheless, once we control for development over time and past decisions in the regressions reported in Table 3, we no longer detect significant treatment differences. One reason for this finding may be that, as shown in Model (6), a citizen’s vote for the (non-)implementation of the different policies is highly dependent on the behavior of migrants.Footnote 25 We thus conservatively conclude:

Result Hypothesis 4

Citizens’ probability of voting for discrimination is not significantly affected by the policy’s level of potential inequality.

Hypothesis 5 postulates that citizens condition their voting behavior on the voting behavior of the other citizens. Model (5) shows that this is the case, with the caveat that while the coefficients of all dummy variables for the number of citizens voting in favor of the discriminatory policy are positive, only those for one and three citizens are significant. When we include the dummy variables for the number of migrants contributing in the preceding period in Model (6), the coefficients shrink and all become insignificant. Our finding thus is:

Result Hypothesis 5

We do not identify a consistent effect of the number of other citizens who voted for discrimination in the preceding period on citizens’ voting behavior.

Finally, regarding Hypothesis 6, we find significantly negative coefficients for the lagged number of contributing migrants, as shown in Model (6). We thus find support for Hypothesis 6:

Result Hypothesis 6

Citizens’ probability of voting for discrimination decreases in the number of migrants who contributed in the preceding period.

5.3 Limitations

In this section, we specifically want to address two objections that are frequently discussed in work like ours. The first is that we employ student participants and do not explicitly recruit a sample more representative of the groups we model in our experiment, i.e., economic migrants. While some of our participants may have migrated themselves, or may have family members who have done so, they are not representative of the group we model. Yet, we are of course not the first to use student participants to study economic behavior in general and social dilemma situations in particular, or even to study migrant and citizen behavior, as outlined in Sects. 2 and 3. One example of the latter is Rustagi et al. (2010), who find that students in a lab experiment and participants in a field experiment in the Bale region in Ethiopia exhibit similar behavior. Several other studies have investigated the comparability of students and other participants and come to favorable conclusions (see, e.g., Liyanarachchi 2007; Druckman and Kam 2011; Exadaktylos et al. 2013). Finally, we study the effect of participants’ nationality on our findings, using nationality to proxy for past migration experience. This of course is not a perfect proxy, but we are nevertheless heartened by the effect of nationality being small and not significant throughout. We report the detailed results in section B of the online appendix, where we also include other control variables, like political interest.

The second limitation that we wish to discuss concerns our use of a lab setting. Lab experiments are often confronted with the concern of limited external validity. While external validity is of course not an issue exclusive to experimental research (Postman 1955; Gigerenzer 1984), it is a concern there as well. In our specific case, we would expect migrants to also react to stimuli related to social norms, past experiences, personal relationships, etc., which we cannot re-create in the lab. Such factors could affect how the mechanisms identified in the lab transfer to the richer field situation. Nevertheless, there is prior research documenting that mechanisms identified in the lab often translate well. Camerer (2015) is one example of a study that investigates whether economic experiments designed to generalize from the lab to the field are externally valid. He concludes that generalizability between the two situations is ‘generally rather good’.

Overall, we believe that our results document relevant patterns, which should ideally be replicated using evidence from real migrants and citizens in the field.

6 Conclusion

Our paper studies intergroup cooperation in societies where groups are asymmetrically endowed with the power to set distribution rules. We operationalize this setting by modelling asymmetric intergroup cooperation in the lab using a fictitious migration environment populated by students.

Experimental participants in the roles of destination country citizens and economic migrants interact in a lab environment, allowing us to study their mutual willingness to cooperate to solve a social dilemma. Members of the first group (citizens) can vote to implement a redistribution policy for the proceeds from the jointly created good that disadvantages the members of the second group (migrants). The members of the second group, in turn, can decide whether or not they want to contribute to creating the good together with the members of the first group. Our results document a strong reciprocal relationship between the members of the two groups. When their counterparts decide to forgo redistribution and opt for sharing the proceeds from the jointly created good equitably, the participants in the group that faces the threat of disadvantageous redistribution increase their contributions. Conversely, when the threatened group contributes to jointly creating the good, the group with decision-making powers over the distribution rule reciprocates by opting for equal distribution of the good’s proceeds. Moreover, we observe the highest contribution rates (.60) when redistribution is not possible, indicating that the mere possibility of one group implementing unequal distribution rules decreases cooperation in this setting.

Even though this summary of our results paints a picture of the self-reinforcing power of mutually beneficial cooperation, the reverse interpretation is, alas, equally valid. Discrimination against the group lacking decision-making power reduces its members’ willingness to contribute. Similarly, a lack of contributions on the part of this latter group induces discriminatory voting in the group possessing the decision-making power. This unfortunate relationship can easily lead to a downward spiral and thus constitutes a fragile basis for mutually beneficial co-existence in a society.

A closer look at our data also reveals that participants’ decisions are in part driven by their own past decisions as well as by the past decisions of other participants in the same group. We interpret the former (autocorrelation of a participant’s own decisions) as evidence of potential inherent ‘behavioral types’ among our participants. Similarly, we interpret the latter (cross-correlation between a participant’s decisions and his or her ingroup members’ lagged decisions) as suggestive of the endogenous development of something akin to a social norm.
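To make the two lagged quantities concrete, the following minimal sketch shows how one might construct them from period-level decision data. It is an illustration only and not our estimation code: the record layout, the field names (`participant`, `group`, `period`, `decision`), the function `add_lagged_features`, and the toy data are all hypothetical.

```python
from collections import defaultdict


def add_lagged_features(records):
    """Attach two lagged regressors to each decision record.

    Hypothetical layout: each record is a dict with the keys
    'participant', 'group', 'period' and 'decision' (1 = cooperative, 0 = not).
    'own_lag' is the participant's own decision in the previous period;
    'ingroup_lag' is the mean previous-period decision of the other
    members of the same group.
    """
    by_key = {(r["participant"], r["period"]): r for r in records}
    members = defaultdict(set)
    for r in records:
        members[r["group"]].add(r["participant"])

    rows = []
    for r in records:
        prev = r["period"] - 1
        own = by_key.get((r["participant"], prev))
        if own is None:
            continue  # first period: no lag available
        others = [
            by_key[(p, prev)]["decision"]
            for p in sorted(members[r["group"]])
            if p != r["participant"] and (p, prev) in by_key
        ]
        if not others:
            continue  # no other group members observed in the previous period
        rows.append({
            **r,
            "own_lag": own["decision"],
            "ingroup_lag": sum(others) / len(others),
        })
    return rows


# Toy data (invented for illustration): one group, three participants, two periods.
decisions = {1: {1: 1, 2: 0, 3: 1}, 2: {1: 1, 2: 1, 3: 0}}
records = [
    {"participant": p, "group": "A", "period": t, "decision": d}
    for t, by_participant in decisions.items()
    for p, d in by_participant.items()
]
rows = add_lagged_features(records)
```

In a sketch like this, regressing `decision` on `own_lag` would speak to the autocorrelation we read as ‘behavioral types’, while the coefficient on `ingroup_lag` would capture the dependence on ingroup members’ lagged decisions that we read as an emerging norm.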

Apart from these conditional behavioral patterns, our results reveal a general trend for contributions to decline over time, a pattern found in many other experiments studying jointly created goods.

Our results suggest that group norms strongly influence cooperative behavior. Societies that promote cooperation as a valued norm, for instance through policies like the equal distribution rule in our experiment, tend to enjoy higher levels of intergroup cooperation. Furthermore, pro-social behavior is a key factor in promoting positive intergroup relations in asymmetric environments. More cooperative behavior by members of an outgroup renders ingroup members more willing to cooperate in turn. Our experiment thus clearly supports reciprocity in intergroup cooperation settings. Overall, however, we observe a significant trend towards reduced cooperative behavior, demonstrating the fragility of intergroup cooperation.

In light of our findings, we recommend providing equal opportunities and fair policies to all groups within societies, since we find fairness to promote trust and cooperation.

For future research, our experimental game provides a ready framework for testing various other policies in a setting of possible intergroup cooperation (or competition). As Crespo Cuaresma et al. (2021) note, scientific evaluation of public policy promotes better societal outcomes by helping policy-makers select the most effective policy instruments. Whereas our study thus is an example of how such policy studies can be conducted in an abstract or in a framed setting, we recommend replicating such research with members of the actual groups of interest whenever external validity is of prime concern for a particular research question. To take our framing as an example, we would be excited to see follow-up work inviting individuals with an actual background of migration to the lab to gauge whether their decisions and outcomes differ from those of our student participants. Interacting with real economic migrants instead of students in the role of economic migrants may offer additional insights, just as the behavior of our student participants in the role of migrants may differ from that of actual migrants. Similarly, the behavior of participants in the role of citizens towards student participants in the role of migrants may differ from their behavior towards actual migrants.

In summary, we hope that our results and our experimental approach will offer future researchers a pathway to develop measures that cultivate cooperation and foster a more harmonious future for diverse societies.