The instability of globalization: applying evolutionary game theory to global trade cooperation

A new wave of protectionism is threatening the open and cooperative international order. This paper applies evolutionary game theory to analyze the stability of international trade cooperation. Global trade liberalization is modeled as an iterated prisoner’s dilemma between all possible pairs of WTO member states. Empirical data are used to model the sizes and competitiveness of the respective markets, which then determine the resulting gains and costs of trade cooperation. Because of the large number of WTO member states and repeated rounds of their interactions, we use computer simulations to calculate the strategies that lead to the maximum ‘fitness’ of the respective member states and consequently diffuse through the population of countries. The results of our simulations show that international trade cooperation is not a stable equilibrium and that extreme levels of trade liberalization can be exploited successfully by protectionist trade policies.


Introduction
The world's largest economy, the United States of America, has turned towards protectionist trade policies since Donald Trump took office as president (Ikenberry 2017; Irwin 2017; Norrlof 2018). In 2017, the Trump administration withdrew the United States from the Trans-Pacific Partnership (TPP), put Transatlantic Trade and Investment Partnership (TTIP) negotiations on ice, and started to renegotiate the North American Free Trade Agreement (NAFTA), which subsequently became the United States-Mexico-Canada Agreement (USMCA). In 2018, the United States imposed tariffs of between 20 and 50% on solar panels and washing machines, while tariffs on aluminum and steel imports were raised to 10% and 25%, respectively. In addition, President Trump repeatedly has announced an imminent increase of import tariffs on cars to 25%. Besides those general measures, China and the United States have engaged in a trade war, in which they repeatedly have raised import tariffs against one another. In January 2020, the two parties signed a preliminary agreement to moderate their dispute (the so-called Phase One Trade Agreement), but currently (mid-February 2020), the different tariffs and counter-tariffs remain in place.
Much of the public choice literature concentrates on the domestic politics behind external trade policies and treats countries' policymaking largely as independent of international politics. Based on the well-known argument that trade liberalization generally is welfare increasing, many scholars propose that protectionist measures result from rent-seeking by concentrated and well-organized interests (Damania et al. 2004; Lake and Linask 2015; Pecorino 1997; Aidt 1997). Principally concentrating on the trade policies of the United States, the literature further explores how factors like the rules of electoral competition (Baldwin and Magee 2000; DeVault 2013; Wagner and Plouffe 2019), political ideologies (Hoffman 2009; Nollen and Iglarsh 1990) or economic sensitivities (Arce et al. 2008; Nollen and Iglarsh 1990) determine the chances of protectionist interests' success. In contrast, the diffusion literature has shown convincingly that countries do not choose their trade policies independently, but that they observe and influence each other. As a result, policies like trade liberalization can diffuse throughout the international system. Globalization and an open trading order have at least partly been a consequence of such a successful diffusion of liberal trade policies during the 1990s and 2000s (Meseguer 2009; Pitlik 2007; Simmons and Elkins 2004).
Nowadays, the important question for the global trading order is whether protectionist trade policies can gain momentum similar to that of trade liberalization during the 1990s and 2000s. If that is the case, we could witness a new wave of protectionism that might shatter globalization at its core. In the following, we develop an evolutionary game theory model, demonstrating that global trade liberalization is not a stable equilibrium, and that it may indeed be followed by a wave of defection and protectionism. Our model of global trade cooperation necessarily abstracts from the domestic politics behind trade measures; consequently, it cannot explain why a country chooses a specific trade policy at a certain point in time. However, our model shall explain how such trade policies can survive and influence the global trading order within an international system wherein countries influence each other at least to some degree. Thus, we do not intend to offer an alternative to existing public choice models of domestic policymaking, but we aim to complement the debate by focusing on the global dynamics of international trade liberalization and protectionism.
To analyze the stability of global trade cooperation against defectionist and protectionist trade policies, the article proceeds in five steps. First, we discuss the challenges of applying evolutionary game theory models from biology to issues of international politics. Second, we build an evolutionary game theory model of global trade cooperation that takes the economic asymmetries between countries into account and understands the evolutionary process as a diffusion of successful trade policies within the population of countries. Third, we discuss the strengths and weaknesses of different strategies in our model and present the results of computer simulations of it. Fourth, we discuss the limitations of our model and sketch issues for further research. Finally, the conclusion summarizes our findings. Within an additional online appendix, we present some robustness checks, which demonstrate that changes in the assumptions of our model do not change the results of the simulations in an unpredictable way.

Theory: applying evolutionary game theory to global trade cooperation
International trade liberalization usually is modelled as a prisoner's dilemma, wherein all countries have a common interest in the economic gains from free trade, but face incentives to protect sensitive domestic industries at the same time (see, for example, Krugman 1992; Melese et al. 1989; Thorbecke 1997). If trade liberalization were a one-shot game, mutual defection would be the only Nash equilibrium. However, trade liberalization is an iterated game wherein countries have the possibility of reacting to each other's previous moves. As the work of Axelrod (1984, 1997) has shown, iterations of the prisoner's dilemma allow countries to play tit-for-tat and thus to cooperate by opening their markets gradually and reciprocally (Axelrod and Keohane 1985; Keohane 1986; Rhodes 1989). International regimes, like the global trade regime established by the General Agreement on Tariffs and Trade (GATT) and the World Trade Organization (WTO), stabilize such cooperation by reducing the transaction costs of countries' interactions (Keohane 1984; Stein 1982). According to that logic, trade wars cannot be won because they lead to endless rounds of retaliation and generate losses for every country-player involved (Conybeare 1985; Ossa 2014). Protectionist trade policies can exploit the good-will of tit-for-tat opponents only in one round of the game, as defectors will be punished with retaliatory measures in the following rounds. From that point of view, aggressive trade policies seem to be 'irrational' provocations that will be short-lived and will not have long-lasting impacts on the cooperative global order. Whereas conventional and iterated games have been used widely for explaining economic and political issues, evolutionary game theory mainly has been brought to bear in the field of biology.
Biologists like Maynard Smith (1982) developed evolutionary game theory models to explain why cooperative behavior among genes, cells and animals (Axelrod and Hamilton 1981) emerges in an environment of reproductive competition, wherein purely egoistic behavior should lead to an evolutionary advantage (Dawkins 1976). In contrast to Axelrod (1984, 1997), the evolutionary biologist Martin Nowak and his associates argue that tit-for-tat-induced cooperation is not a stable equilibrium of an iterated prisoner's dilemma involving a large number of players (Imhof et al. 2005; Nowak 2006; Nowak and Sigmund 2004). Instead, they present a model wherein cooperation and defection follow each other in consecutive waves. Within their evolutionary game theory model, small groups of tit-for-tat players (conventional tit-for-tat and generous tit-for-tat) can establish cooperation within a population dominated by defectors. Once cooperation is widespread within the population, a neutral shift towards unconditional cooperation can occur because tit-for-tat strategies do not punish unconditional cooperation. However, once the whole population cooperates, unconditional defection again becomes an attractive strategy because it exploits the cooperative behavior of others. The result is an endless cycle in which the level of cooperation in a given population first increases and then declines. Such an evolutionary game theory model allows for dynamic perceptions of cooperation and defection; it demonstrates that cooperation may be less stable than envisaged by conventional game theory.
Even though evolutionary game theory mainly has been deployed in the field of biology, its application to questions of international political economy is promising (Friedman 1998). For purposes of analyzing global trade cooperation, evolutionary game theory has at least three advantages over conventional game theory. First, evolutionary game theory is based on games involving large numbers of players (Friedman and Sinervo 2016; Gintis 2009; Nowak and Sigmund 2004). Herein, we are interested in the consequences of defectionist trade policies for international cooperation within a population of 164 WTO member states. Second, evolutionary game theory models are less static and deterministic than conventional game theory models, which provides more room for analyzing possible changes in the international system. Finally, because evolutionary game theory has been developed to study the behavior of genes, cells and animals (Axelrod and Hamilton 1981), it does not rely on the assumption of rational and well-informed actors. It does not matter whether a country adopts a strategy for normative reasons or purely for its own self-interests. What matters is the extent to which the strategy contributes to the country's fitness and whether the strategy can survive the selectivity of an evolutionary process.
The evolutionary game theory models of biologists are based on two crucial assumptions, which become problematic when such models are applied in the area of international political economy. First, the biologists' evolutionary game theory models assume homogeneous populations and that no asymmetries exist in the resources and capabilities of the players within those populations. Most models of evolutionary game theory assign simplistic numerical payoffs to the games that are played by the members of a population. As long as all players can earn the same payoffs, such models need not distinguish between players and their strategies. It does not matter which player faces which opponent, only which strategy is superior. However, the international system is not composed of homogeneous actors: countries obviously differ in their sizes and resources. The consequences of such asymmetries for international cooperation were first addressed by so-called hegemonic stability theory in the 1970s (Krasner 1976), but they are not reflected in most (evolutionary) game theory models. In our model, we distinguish between the stable characteristics of players (like an economy's competitiveness and its market size) and the strategies they can play (like unconditional defection, tit-for-tat, generous tit-for-tat or unconditional cooperation). As a result, our model represents an asymmetric game wherein countries' payoffs result from their own characteristics and from those of other players.
Second, the evolutionary process in the biologists' models results from reproductive competition wherein the players of unsuccessful strategies die and successful players have better chances of producing offspring. But countries do not die or reproduce as a result of their trade policies, implying that the evolutionary process in international politics cannot work the same way as in biological settings (Gintis 2009). In contrast to biologists, we understand the evolutionary process as a diffusion of successful strategies among countries (see, for instance, Elkins and Simmons 2005; Gilardi 2010; Shipan and Volden 2008; Yukawa et al. 2014). A country adopting an inefficient strategy either gains relatively little market access for its exports abroad or allows imports that squeeze out its uncompetitive business enterprises. As a result, domestic opposition to the country's trade strategy rises, which in turn makes policy change more likely. Whenever a country chooses to adapt its trade strategy, it generally observes the strategies of other countries and follows the example of an economically successful one. Thus, the strategies of fitter countries tend to diffuse within the global population of countries.

Method: computer modelling of global trade cooperation
Models of evolutionary game theory necessarily consist of two different parts: a game that is played repeatedly between all possible pairs of players plus a model of an evolutionary process in which successful strategies are favored over unsuccessful ones (Friedman and Sinervo 2016; Gintis 2009). Because of the large number of players and repeated rounds of interactions, it is not possible to forecast the behavior of evolutionary game theory models a priori. Therefore, we need the help of computer simulations to study them. The following sections lay down our methodological decisions with respect to the game of global trade cooperation, the evolutionary competition between different trade policies, and the computer program used to analyze the behavior of our model.

The iterated prisoner's dilemma of trade liberalization
Open markets are beneficial for almost all countries under nearly all circumstances (Krugman 1987). International trade allows countries to exploit comparative cost advantages (which result from the different factor endowments of participating economies) and economies of scale (which result from access to larger markets). In our model (see Table 1), the gains from international trade are the product of the relative strength of the domestic export industry ($e_a$) and the market size of the other country-player ($M_b$). Neoclassical economics argues that opening domestic markets unilaterally is beneficial even if other countries close their markets because open economies can produce and consume more efficiently by importing goods from lower-cost producers. From that point of view, international trade liberalization should be a game of harmony and problems of cooperation between trading partners should not arise. Notwithstanding those claims, many countries have adopted restrictive trade policies time and again, and they usually drive hard bargains about mutual trade concessions.
Several reasons can be found for why economically rational governments may establish restrictive trade policies and why avoiding such measures can be costly for them. First, governments may impose tariffs to satisfy the demands of organized interests and so to gain financial resources for domestic political competition (Baldwin and Magee 2000; Krueger 1974; Magee et al. 1989). Second, even classical trade theory has acknowledged that large countries are able to improve their terms of trade if they establish optimal tariffs (Conybeare 1984; Hillman 1992; Johnson 1953). Third, tariffs can be parts of strategic trade policies that aim to support domestic industries in oligopolistic markets or protect infant industries at home (Brander 1986; Hillman 1992; Krueger and Tuncer 1982). Finally, countries may implement trade restrictions to protect cultural and institutional idiosyncrasies in their domestic markets, which otherwise may be endangered by international trade (Bala and Van Long 2005; Belloc and Bowles 2009). In our model (see Table 1), the costs of opening domestic markets result from the extent of a country's protectionism against imports ($i_a$), the relative strength of the trading partner's export industry ($e_b$) and the size of the trading partner's economy ($M_b$). As long as the gains from trade liberalization exceed the costs of import competition ($e_a M_b > i_a e_b M_b$), the two countries play a prisoner's dilemma game (see also Axelrod 1984; Conybeare 1984, 1985; Gawande and Hansen 1999; Krugman 1992; Melese et al. 1989; Milner and Yoffie 1989; Rhodes 1989; Thorbecke 1997). Cooperation is beneficial, but countries are at the same time tempted to defect and exploit cooperation by their counterparts. Countries try to maximize their access to external markets to generate export possibilities for their competitive industries, while simultaneously minimizing foreign access to their domestic market to protect uncompetitive industries.
According to the gravity model of trade, the amount of potential trade between two countries is proportional to their market sizes $M$ and inversely proportional to the distance $D$ between them (see, for example, Bergstrand 1985; Deardorff 1998). Thus, a given country gains less from access to the market of another country the farther away that country is in terms of geographical distance. In our model (see Table 1), we discount market access by the square root of the absolute distance ($\sqrt{D_{ab}}$) to account for the fact that transportation costs do not rise linearly. As a result, country A's gains from international trade are given by $e_a M_b / \sqrt{D_{ab}}$.
The prisoner's dilemma of trade liberalization is not a one-shot game: it is played repeatedly between all possible pairs of WTO member states. Countries do not decide once and for all to open or close their borders to trade. In fact, they can change their trade policies almost at any time. Moreover, even if the WTO member states belong to a multilateral institution, international trade flows constitute bilateral relationships between single exporters and importers. Thus, we can understand global trade liberalization within the WTO as an iterated prisoner's dilemma between all $n(n-1)/2$ pairs of member states. That setup allows countries to play conditional strategies like tit-for-tat against single opponents and to open their markets reciprocally (Axelrod and Keohane 1985; Keohane 1986; Rhodes 1989).
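The payoff structure described in the preceding paragraphs can be illustrated with a minimal Python sketch. It is not the authors' program: the class `Country`, the functions `payoff` and `all_pairs`, and all numbers in the usage note are hypothetical. It assumes that a country earns its gain whenever the partner opens its market and bears its cost whenever it opens its own market, and that the cost term is discounted by the same $\sqrt{D_{ab}}$ as the gain term (the text states the discount explicitly only for market access).

```python
from dataclasses import dataclass
from itertools import combinations
from math import sqrt


@dataclass(frozen=True)
class Country:
    """Stable characteristics of a player (hypothetical container)."""
    name: str
    market: float        # M: market size
    export_share: float  # e: relative strength of the export industry
    protection: float    # i: rate of protectionism against imports


def payoff(a, b, a_opens, b_opens, distance):
    """One-round payoff of country a against country b.

    Assumption: a gains e_a * M_b / sqrt(D_ab) if b opens its market and
    pays i_a * e_b * M_b / sqrt(D_ab) if a opens its own market.
    """
    discount = sqrt(distance)
    gain = a.export_share * b.market / discount if b_opens else 0.0
    cost = a.protection * b.export_share * b.market / discount if a_opens else 0.0
    return gain - cost


def all_pairs(n):
    """All n(n-1)/2 unordered pairs of players, as index tuples."""
    return list(combinations(range(n), 2))
```

With hypothetical values such as `Country("A", 100, 0.3, 0.05)` and `Country("B", 200, 0.2, 0.1)`, the four outcomes for A are ordered as in a prisoner's dilemma (temptation > reward > punishment > sucker's payoff) as long as the gain term exceeds the cost term.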
We also incorporate noise and surveillance costs into the iterated game of global trade liberalization to make our model more realistic. First, as a result of noise, countries act randomly with a probability of α = 0.1. Thus, even if their main strategy requires cooperation, they may defect from time to time, for example to accommodate domestic opposition to certain trade measures. Such erratic behavior is of no consequence if counterparts adopt unconditional strategies, i.e., if they always cooperate or defect. However, a deviation from the main strategy matters greatly if countries play conditional strategies like tit-for-tat. If both countries play tit-for-tat, unintentional defections initiate endless cycles of retaliation. Thus, tit-for-tat is not a good strategy in a noisy environment but generous tit-for-tat is more successful (Axelrod 1997; Nowak 2006; Nowak and Sigmund 2004). Generous tit-for-tat is 'friendlier' than simple tit-for-tat because a player cooperates with a probability of β = 0.3, even when it otherwise would retaliate against a trading partner's previous defection. The 'unmotivated' cooperation of generous tit-for-tat allows the countries to break out of cycles of retaliation and return to cooperation.
Second, conditional strategies like (generous) tit-for-tat add surveillance costs because such strategies require countries to establish bureaucracies that monitor the global market and engage in judicial disputes when needed (Bown and Hoekman 2005). We introduce a discount factor (γ = 0.05) to account for such surveillance costs. The discount factor applies to all gains produced by conditional strategies. Accordingly, countries receive only 95% of their payoffs when they play tit-for-tat or generous tit-for-tat. As a result, unconditional cooperation becomes more appealing in a very cooperative environment, while unconditional defection becomes more appealing in a very uncooperative environment because the two unconditional strategies avoid incurring the surveillance costs required in simple and generous tit-for-tat strategies.
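The interplay of noise, generosity and surveillance costs can be sketched in a few lines of Python. The sketch is illustrative only and rests on several assumptions that the text does not settle: the strategy labels (`ALLC`, `ALLD`, `TFT`, `GTFT`) and function names are hypothetical, a 'random' act under noise is modeled as a fair coin flip, conditional strategies cooperate in the first round, and the discount γ is applied to the whole payoff of a conditional strategy.

```python
import random

ALPHA = 0.1   # noise: probability of acting randomly
BETA = 0.3    # generosity of generous tit-for-tat
GAMMA = 0.05  # surveillance cost of conditional strategies
CONDITIONAL = {"TFT", "GTFT"}


def intended_move(strategy, opponent_last, rng, beta=BETA):
    """Return True (open market) or False (close market) without noise."""
    if strategy == "ALLC":
        return True
    if strategy == "ALLD":
        return False
    # Conditional strategies: cooperate first and after cooperation.
    if opponent_last is None or opponent_last:
        return True
    # The opponent defected: TFT retaliates, GTFT forgives with probability beta.
    if strategy == "GTFT":
        return rng.random() < beta
    return False


def noisy_move(strategy, opponent_last, rng, alpha=ALPHA):
    """Apply noise: with probability alpha the move is a coin flip (assumption)."""
    if rng.random() < alpha:
        return rng.random() < 0.5
    return intended_move(strategy, opponent_last, rng)


def after_surveillance(payoff, strategy, gamma=GAMMA):
    """Conditional strategies keep only a share (1 - gamma) of their payoff."""
    return payoff * (1.0 - gamma) if strategy in CONDITIONAL else payoff
```

For example, `after_surveillance(100.0, "TFT")` yields 95% of the payoff, whereas the same payoff under `"ALLC"` is left untouched, which is why unconditional strategies become attractive once the environment is uniformly cooperative or uniformly uncooperative.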

The modified Moran process as a revision protocol of an evolutionary process
In evolutionary game theory, the adjustment process is modelled by a revision protocol, which determines when players change their strategies and what rules they apply in choosing strategies (Sandholm 2009). In the models of biologists, the evolution of finite populations usually is modeled as a Moran process (Voelkl 2011). Accordingly, after every round of the game (i.e., after each player has played once against all other players), one of them is selected randomly to die. To replace that player within a population of a fixed size, another player needs to be chosen to reproduce itself. Here, the second player is selected with a probability equal to its relative fitness within the population. Thus, more successful players are more likely to reproduce their strategies than less successful players. Such selectivity constitutes reproductive competition within the evolutionary process.

Fig. 1 The modified Moran process as a revision protocol for an evolutionary process. Steps recoverable from the figure: A gets selected with a probability equivalent to its relative fitness in the population; B gets selected randomly; B adopts the strategy of A.
Because the population of WTO member states is finite and does not grow indefinitely, we apply the logic of the Moran process as well (see Fig. 1). However, we need to modify the selection process to distinguish between the stable characteristics of countries and the various strategies they play. Countries are not born and do not die as a result of strong or weak trade policies, but they may alter their strategies. Hence, strategies, not countries, compete within the evolutionary process. In our model, a country is selected after every round of the game to reproduce its strategy with a probability equal to its relative fitness ($p^c_a = f_a / \sum_{b=1}^{n} f_b$). Another country is chosen randomly ($p^d_b = 1/n$) and mimics the strategy of the successful country. In that way, the trade strategy of a relatively fit country is reproduced; the number of countries within the population remains constant and all countries retain their stable characteristics.
We also allow for mutations within the evolutionary process so that new strategies can re-enter a population, even if they already have been eliminated by the Moran process in previous rounds of the game. If we set the mutation rate at δ = 0.1, on average one out of ten selection processes produces a random result, meaning that a country adopts a strategy arbitrarily from the set of available strategies. As a result, the Moran process does not have a natural end. In other words, even if all WTO member states are generous tit-for-tat players and cooperate with one another, mutation makes it possible for a new strategy like unconditional defection to be adopted by one country. Whether the new strategy can survive within the population depends on the adopting country's fitness.
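One selection step of the modified Moran process with mutation might look as follows in Python. This is a hedged sketch rather than the authors' implementation: `moran_step` and the strategy labels are hypothetical names, fitness values are assumed to be non-negative with a positive sum, and the order in which the imitating country and the mutation event are drawn is an assumption.

```python
import random

STRATEGIES = ("ALLD", "TFT", "GTFT", "ALLC")


def moran_step(strategies, fitness, rng, mutation_rate=0.1):
    """Return a new strategy list after one selection step.

    strategies: current strategy label of each country
    fitness:    non-negative fitness of each country (same order)
    """
    n = len(strategies)
    # A country B is chosen uniformly at random (probability 1/n) ...
    imitator = rng.randrange(n)
    if rng.random() < mutation_rate:
        # ... and, under mutation, adopts an arbitrary available strategy.
        new_strategy = rng.choice(STRATEGIES)
    else:
        # Otherwise a model country A is drawn with probability
        # f_a / sum(f_b), and B copies its strategy.
        model = rng.choices(range(n), weights=fitness, k=1)[0]
        new_strategy = strategies[model]
    updated = list(strategies)  # countries keep their characteristics;
    updated[imitator] = new_strategy  # only the strategy changes
    return updated
```

Because only the strategy label of the imitating country is overwritten, the population size stays constant and the countries' stable characteristics (stored elsewhere) are unaffected.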
A country's fitness is determined by the accumulated payoffs it receives in interactions with every other country in every round of the game. Thus, fitness is modeled as accrued access to international markets minus the costs of opening the domestic market. On the one hand, the more market access a country gains in the prisoner's dilemma of trade liberalization, the larger are its export industry profits from comparative cost advantages and economies of scale. On the other hand, the more a country opens its domestic market to imports from other countries, the more its protected industry suffers from import competition. A successful strategy needs to maximize market access in all rounds of the iterated prisoner's dilemma game while simultaneously minimizing the costs of trade liberalization.
To adapt evolutionary game theory to the structures of international politics, we also need to account for the effects of a country's characteristics on its own fitness. The reason for doing so is that countries always have access to their own resources and, moreover, resource endowments differ between countries. The domestic economy always can exploit comparative cost advantages and economies of scale on the domestic market. Large domestic markets contribute to countries' fitness and make them less dependent on foreign market access. To reflect the size of the domestic market in the fitness function of countries, we assume that they get access to their own markets in every round of the game ( ). We do not discount the domestic market with the factor $e$ (the share of a country's export sector) because not only the export industry, but the whole domestic industry operates on the domestic market. We account for transportation costs on the domestic market with the square root of the countries' area ($D_a = \sqrt{\text{area}_a}$), which resembles the average distance within that market.
Because of the asymmetric payoffs in our iterated prisoner's dilemma of trade liberalization and the inclusion of market size into countries' fitness, countries are differently successful within our model and only some of their fitness depends on the strategies they play. First, countries gain more fitness from trade cooperation within the iterated prisoner's dilemma the larger their export shares are in relation to their rates of protectionism. The ratio $e_a / i_a$ can be interpreted as country A's competitiveness; it determines the relation between the benefits and costs of trade liberalization. Second, large countries gain more fitness from their domestic markets than small countries do, but that fitness is independent of their own strategies and the outcome of the prisoner's dilemma. Thus, large countries are relatively more successful within a defectionist population, whereas competitive countries profit more from a cooperative environment.

Computer simulations based on empirical data
Empirical data from the World Bank database provide the stable characteristics of countries within our iterated prisoner's dilemma of global trade cooperation. The World Bank's current USD estimates of GDP determine the countries' market sizes $M$. The share of countries' export industries $e$ is calculated by dividing their exports of goods and services by their GDPs. The rate of protectionism $i$ is measured by the countries' average tariffs weighted by current imports. The reference year for all economic indicators is 2017; where the relevant information was not available for 2017, we used the most recent available data (mostly from 2016).
The population in our simulation consists of 130 countries. Although the WTO currently has 164 member states, the number of countries in our computer simulations is somewhat less for two reasons. First, we treat the European Union (EU) as a single player because its external trade policy is quite integrated. Even though all 28 EU member states also are WTO member states, they cannot unilaterally determine their trade policies, as they have delegated far-reaching competencies in that policy area to the European Commission and especially to the Directorate General for Trade. Second, we lack the necessary economic data for the Fiji Islands, Hong Kong, Liechtenstein, Macao, Trinidad and Tobago, and Yemen. Thus, we had to exclude those six countries, although some of them are competitive and open economies.
To preserve the characteristics of an iterated prisoner's dilemma and prevent the domination of our computer simulations by a few outliers, we have restricted the range of possible values for the variables e and i. For instance, some free trading economies, such as Singapore, report exports exceeding their GDPs, which leads to an implausible value of e > 1. At the same time, as their rates of protectionism (i) already are extremely low, those economies hardly face any costs of trade liberalization. Furthermore, some uncompetitive countries like Afghanistan have export shares (e) that are extremely small, which implies that they gain little from trade liberalization within our model. To avoid the scenario in which unconditional cooperation and unconditional defection become de facto dominant strategies for highly competitive countries and uncompetitive countries, respectively, we restricted the values for e and i to 0.9 ≥ e ≥ 0.1 and i ≥ 0.01.
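The restriction of the two variables amounts to a simple clipping operation, sketched here in Python. The function name `clip_characteristics` and the input values in the usage note are hypothetical; only the bounds (0.1 ≤ e ≤ 0.9 and i ≥ 0.01) come from the text.

```python
def clip_characteristics(e, i, e_min=0.1, e_max=0.9, i_min=0.01):
    """Restrict the export share e to [e_min, e_max] and the rate of
    protectionism i to at least i_min; i has no upper bound in the text."""
    clipped_e = min(max(e, e_min), e_max)
    clipped_i = max(i, i_min)
    return clipped_e, clipped_i
```

A hypothetical economy with `e = 1.7` and `i = 0.0` would thus enter the simulation as `(0.9, 0.01)`, while one with `e = 0.05` would be raised to `e = 0.1`; values already inside the bounds pass through unchanged.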
Owing to the large number of countries within our population (n = 130) and the even larger number of rounds within the evolutionary process (here: R = 50,000), we need to rely on computer simulations to analyze the behavior of our model of global trade cooperation. Each of our computer simulations requires the calculation of 419.24 million prisoner's dilemmas. The programming language Python has become standard for the coding of computer simulations in the field of evolutionary game theory (Isaac 2008). The so-called Axelrod Python library (Knight 2015), which offers impressive possibilities for the analysis of evolutionary game theory models, already is available. However, the library cannot distinguish between players with stable characteristics and their variable strategies, meaning that payoffs in the iterated prisoner's dilemma are based on the conventional numerical values of zero, one, three and five. Consequently, we could not use the Axelrod library to analyze our model, in which the payoffs depend on stable country characteristics that are independent of their chosen strategies. We needed to code a new program package, which allows us to implement the payoffs of Table 1 for the iterated prisoner's dilemma of trade liberalization.

Analysis: computer simulations of global trade cooperation
To analyze the impact of protectionist trade policies on global trade cooperation, we first present the results of a round-robin tournament of conventional iterated games. The analysis shows that unilateral defection leads to absolute and relative losses for the respective economy if all other countries retaliate. Thus, in a direct comparison, tit-for-tat clearly 'wins' over unilateral defection. However, in the second step of our analysis, we demonstrate that tit-for-tat nevertheless is not evolutionarily stable and that unilateral defection temporarily can be successful within a dynamic evolutionary process.

A conventional Axelrod tournament of global trade cooperation
We follow the example of Axelrod (1984) and conduct two round-robin tournaments of ten rounds, without applying an evolutionary selection process or including noise, generosity or surveillance costs. The result of the simple simulation clearly confirms the strength of tit-for-tat within a multipolar setting. Figure 2 shows how much fitness the three largest economies (China, the EU and the United States) gain in this tournament. The left graph shows the result when the whole population of 130 countries plays tit-for-tat. While all three largest economies profit from trade liberalization and enjoy absolute gains, the economies of China and the EU are slightly smaller and more competitive than that of the United States, implying that China and the EU gain relatively more fitness from access to the US market than vice versa.
Fig. 2 The gains of trade liberalization (no noise, generosity or surveillance costs are assumed in this round-robin tournament of ten rounds. Countries' fitness does not include their domestic markets, but only captures the gains from trade cooperation.)
The United States can reduce the fitness gains for China and the EU considerably if it defects unconditionally instead of playing tit-for-tat. The right graph of Fig. 2 shows how much fitness China, the EU, and the United States gain if the United States defects and all 129 other countries play tit-for-tat. Even though only one country (the United States) defects, the losses in fitness for China and the EU are considerable. However, the United States pays a high price. The defector exploits the cooperativeness of all other countries only in the first round: thereafter the other countries retaliate and close their markets to US exports. As a result, the fitness of the United States stagnates after round one and it loses even more fitness than China and the EU. In fact, the relative decline of the United States in comparison to China and the EU is larger if the United States defects than if it plays tit-for-tat. That is because tit-for-tat allows all other countries to cooperate with each other while at the same time isolating and punishing the defector. Even a large economy like the United States cannot win with a strategy of unilateral defection if all other countries retaliate against it forcefully.
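The logic of this comparison can be sketched with a toy round-robin tournament. The sketch is illustrative only: the three market sizes, the `gain` and `cost` parameters and the payoff rule (a country gains a share of every partner market it can access and bears a cost proportional to its own market whenever it opens that market) are assumptions made for this example and are not the payoffs of Table 1.

```python
from itertools import combinations


def tit_for_tat(opponent_history):
    """Cooperate in the first round, then copy the opponent's last move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Keep the own market closed in every round."""
    return "D"


def round_robin(markets, strategies, rounds=10, gain=0.10, cost=0.02):
    """Play every pair of countries for a fixed number of rounds.

    Hypothetical payoff rule (an assumption, not the paper's Table 1):
    a country gains `gain * partner market` when the partner opens its
    market and pays `cost * own market` when it opens its own market.
    """
    fitness = {country: 0.0 for country in markets}
    for a, b in combinations(markets, 2):
        history_a, history_b = [], []
        for _ in range(rounds):
            move_a = strategies[a](history_b)
            move_b = strategies[b](history_a)
            if move_b == "C":
                fitness[a] += gain * markets[b]
            if move_a == "C":
                fitness[a] -= cost * markets[a]
            if move_a == "C":
                fitness[b] += gain * markets[a]
            if move_b == "C":
                fitness[b] -= cost * markets[b]
            history_a.append(move_a)
            history_b.append(move_b)
    return fitness


# Assumed market sizes, chosen only so that the largest country defects.
MARKETS = {"US": 10.0, "EU": 9.0, "China": 8.0}

all_tit_for_tat = round_robin(MARKETS, {c: tit_for_tat for c in MARKETS})
us_defects = round_robin(
    MARKETS,
    {"US": always_defect, "EU": tit_for_tat, "China": tit_for_tat},
)
```

Under these assumed numbers the defector collects gains only in the first round, and its fitness loss relative to the all-tit-for-tat tournament exceeds the losses of the two retaliating countries, which continue to cooperate with each other.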
Avoiding losses by defecting unilaterally may be a rational strategy in a bipolar setting, but it does not pay in a multipolar setting (Snidal 1991). If only two major countries play the game, as during the Cold War, a country like the United States can avoid losing to its opponent by defecting, leaving the other player (in that case the USSR) without any gains from cooperation. However, when more than two significant players participate (as in the current global economy), the United States cannot prevent other countries (like China and the EU) from cooperating with each other. If the gains from cooperation among other countries are significant, the United States loses out in relative terms by not cooperating. Thus, a strategy that avoids losses within a bilateral relationship creates exactly such losses in a multipolar setting. Under such conditions, trade wars cannot be won; protectionist trade strategies therefore seem 'irrational' as they lead to both absolute and relative losses. However, things change when we simulate an evolutionary process based on more realistic assumptions about noise, generosity and surveillance costs.

The ups-and-downs of trade liberalization within an evolutionary process
Figures 3 and 4 show the results of two simulations of our evolutionary game theory model of international trade cooperation. The level of noise is α = 0.1, the surveillance costs are γ = 0.05 and the mutation rate is δ = 0.1. Countries play one of the four strategies that are used in the model of Martin Nowak and his associates (Imhof et al. 2005; Nowak 2006; Nowak and Sigmund 2004), namely unconditional defection, tit-for-tat, generous tit-for-tat (with a generosity level of β = 0.3) and unconditional cooperation. In both simulations, the members of the population of countries enter the game by playing randomly assigned strategies.
The two simulations in Figs. 3 and 4 show clearly that when one strategy becomes dominant, it opens the door for another strategy to take over. When unconditional defection (black) dominates the population, tit-for-tat and generous tit-for-tat (grey) take over and establish cooperation. When tit-for-tat and generous tit-for-tat are the main strategies, unconditional cooperation (white) can enter the population and save on surveillance costs because it is not exploited by tit-for-tat or generous tit-for-tat players. However, once unconditional cooperation dominates the population, the gains from exploiting cooperation increase and defection again pays. Thus, the cycle starts anew and no stable equilibrium emerges. The two graphs illustrate how the level of cooperation rises and falls in accordance with the dominant strategies within the population. The cooperation ratio increases from tit-for-tat to generous tit-for-tat and it is highest when unconditional cooperation is strong within the population. Of course, the cooperation ratio declines thereafter as unconditional defection starts to exploit generosity and unconditional cooperation. The oscillation between high and low levels of cooperation closely resembles Nowak's endless cycles of cooperation and defection (Nowak 2006; Nowak and Sigmund 2004), even though our simulations are based not on a homogeneous but on a heterogeneous population, wherein the gains from domestic markets differ between countries and countries play asymmetric games against each other.
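How the four strategies respond to noise and generosity can be illustrated with a minimal sketch of a single repeated pairing. The function names are hypothetical, and the sketch assumes that noise turns an intended cooperative move into a defection with probability α and that a generous tit-for-tat player forgives a defection with probability β; the exact implementation in the simulations may differ, and surveillance costs are left out.

```python
import random


def intended_move(strategy, opponent_last, generosity, rng):
    """Return the move a strategy intends to play, before noise is applied."""
    if strategy == "ALLD":
        return "D"
    if strategy == "ALLC":
        return "C"
    # TFT and GTFT cooperate in the first round and after a cooperative move.
    if opponent_last in (None, "C"):
        return "C"
    # After a defection, GTFT forgives with probability `generosity`.
    if strategy == "GTFT" and rng.random() < generosity:
        return "C"
    return "D"


def cooperation_ratio(strategy_a, strategy_b, rounds=10_000,
                      noise=0.1, generosity=0.3, seed=0):
    """Share of cooperative moves actually played in one repeated pairing."""
    rng = random.Random(seed)
    last_a = last_b = None
    cooperative_moves = 0
    for _ in range(rounds):
        move_a = intended_move(strategy_a, last_b, generosity, rng)
        move_b = intended_move(strategy_b, last_a, generosity, rng)
        # Assumed noise model: intended cooperation fails with prob. `noise`.
        if move_a == "C" and rng.random() < noise:
            move_a = "D"
        if move_b == "C" and rng.random() < noise:
            move_b = "D"
        cooperative_moves += (move_a == "C") + (move_b == "C")
        last_a, last_b = move_a, move_b
    return cooperative_moves / (2 * rounds)


if __name__ == "__main__":
    for pair in (("TFT", "TFT"), ("GTFT", "GTFT"), ("ALLC", "ALLC")):
        print(pair, round(cooperation_ratio(*pair), 3))
```

Under these assumptions, two strict tit-for-tat players fall into mutual retaliation after the first unintended defections, whereas two generous tit-for-tat players keep returning to cooperation, which is why the cooperation ratio rises when generosity spreads.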
The two simulations of Figs. 3 and 4 show waves of cooperation and defection, but the concrete manifestations of those waves differ considerably. For example, the simulation of Fig. 3 starts with a noticeably more defectionist population than that of Fig. 4. The lengths of the waves likewise differ: the cooperative wave from rounds 20,000 to 30,000 in Fig. 3 is five times as long as the minor cooperative wave from rounds 3000 to 5000 within the same simulation. On average, the population of Fig. 3 achieves 62% cooperation, whereas the average cooperation level is only 58% within the simulation of Fig. 4. The differences between the simulations are explained by the role of probabilities within our stochastic model. The Moran process selects one strategy with a probability that is proportional to the fitness of the respective country and a randomly chosen country adopts that strategy. Moreover, the parameters for noise, generosity and mutation also are probabilities. Thus, no simulation produces the same results as any other one, pointing to an important limitation of our simulations. Our model cannot, and does not aim to, calculate or predict how global trade cooperation is going to develop within the next months or years. However, what the model illustrates is the potential for instability within international trade policies. Both simulations show long defectionist episodes at relatively late stages. For example, unconditional defection dominates from round 43,000 to 49,000 in Fig. 3 and from round 35,000 to 41,000 in Fig. 4.
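The selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the program package used for the simulations: the function name `moran_step`, the strategy labels and the treatment of mutation (with probability δ the imitating country picks one of the four strategies at random instead of copying) are hypothetical.

```python
import random

STRATEGIES = ("ALLD", "TFT", "GTFT", "ALLC")


def moran_step(strategies, fitness, mutation_rate, rng):
    """Perform one imitation step of a Moran-type process.

    strategies: dict mapping country -> current strategy label
    fitness:    dict mapping country -> non-negative fitness (at least one
                value must be positive, e.g. because domestic markets count)
    """
    countries = list(strategies)
    updated = dict(strategies)
    # A randomly chosen country reconsiders its strategy.
    imitator = rng.choice(countries)
    if rng.random() < mutation_rate:
        # Assumed mutation rule: adopt a strategy drawn uniformly at random.
        updated[imitator] = rng.choice(STRATEGIES)
    else:
        # The role model is drawn with probability proportional to fitness.
        weights = [fitness[country] for country in countries]
        model = rng.choices(countries, weights=weights, k=1)[0]
        updated[imitator] = strategies[model]
    return updated


if __name__ == "__main__":
    rng = random.Random(42)
    population = {"A": "TFT", "B": "ALLD", "C": "ALLC"}
    fitness = {"A": 3.0, "B": 1.0, "C": 2.0}
    for _ in range(5):
        population = moran_step(population, fitness, mutation_rate=0.1, rng=rng)
    print(population)
```

Because the imitator, the role model and the mutation are all drawn at random, two runs with different seeds follow different paths even when the countries and parameters are identical, which is the source of the differences between Figs. 3 and 4.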
Within our simulations, none of the four strategies was evolutionarily stable (see also Bendor and Swistak 1995; Boyd and Lorberbaum 1987). The only Nash equilibrium in the prisoner's dilemma of trade cooperation is unconditional defection. Countries playing simple tit-for-tat are exploited in the first round, and unconditional cooperators are exploited continually when they enter a population dominated by unconditional defection (see Table 2). Nevertheless, unconditional defection is not evolutionarily stable because (generous) tit-for-tat can enter the population successfully when such conditional strategies are deployed by several countries. Those countries then profit from cooperation among each other and consequently gain comparative advantages over unconditional defectors (Axelrod and Hamilton 1981; Imhof et al. 2005; Nowak 2006; Nowak and Sigmund 2004). Finally, in contrast to Axelrod and Hamilton (1981), (generous) tit-for-tat also is not evolutionarily stable because the neutral drift towards unconditional cooperation is not punished by a tit-for-tat playing population (see Table 2). In our simulations, the drift from tit-for-tat towards unconditional cooperation even is reinforced because unconditional strategies can avoid the surveillance costs of conditional strategies. Of course, unconditional cooperation is not an evolutionarily stable strategy either, because it can be exploited by unconditional defection, which subsequently gains the upper hand within the population.

Table 2
Strategy of opponents | Strategy of country A | Description
Always defect | Tit-for-tat | Country A has access to its own market in all rounds and opens its own market in the first round
Always defect | Always cooperate | Country A has access to its own market in all rounds and opens its own market in all rounds
Tit-for-tat | Always defect | Country A has access to its own market in all rounds and gets access to opponents' markets in the first round
Tit-for-tat | Tit-for-tat | Country A has access to its own market, gets access to opponents' markets and opens its own market in all rounds
Tit-for-tat | Always cooperate | Country A has access to its own market, gets access to opponents' markets and opens its own market in all rounds
Always cooperate | Always defect | Country A has access to its own market and gets access to opponents' markets in all rounds
Always cooperate | Tit-for-tat | Country A has access to its own market, gets access to opponents' markets and opens its own market in all rounds
Always cooperate | Always cooperate | Country A has access to its own market, gets access to opponents' markets and opens its own market in all rounds
R = number of rounds (i.e., games against each opponent)

The level of cooperation in the global trade order has been quite high in recent years, and we may have reached the 'Minsky-moment' of globalization. The new wave of economic nationalism (including current US protectionism) can be seen as an attempt to exploit generosity and unconditional cooperation within the population of countries. For example, the more concessions the EU offers to reduce its trade surplus with the United States, the more successful President Trump's strategy becomes. If the trade talks between China and the United States produce a favorable outcome for the latter, President Trump's strategy bears even more fruit. The dilemma of appeasement is that it rewards unilateral protectionism, which increases the appeal of that policy for other countries.
If other countries follow the US example, a diffusion of protectionist trade policies could begin. A historical example of that possibility is the wave of protectionism after the Smoot-Hawley Tariff Act of 1930, which ended a long period of open trade under British hegemony (James 2001). To avoid such a downswing today, the WTO member states would need to ensure that unilateral protectionism does not become successful. They cannot allow generosity and unconditional cooperation to be exploited and therefore need to retaliate forcefully against protectionism. Trade wars against large economies like the United States are expensive, and they lead to considerable welfare losses. However, global welfare will decline significantly more if countries find no answer to economic nationalism. Then, unilateral defection becomes a winning strategy and protectionist trade policies diffuse throughout the population of countries.

Limitations and issues for further research
The findings reported in this paper are not based on empirical tests, but on computer simulations, which necessarily rest on assumptions. Nevertheless, we are confident that our simulations capture the important features of global trade cooperation. Our model rests on two theoretical foundations. First, we model international trade as a prisoner's dilemma in which countries have common interests in trade liberalization, but still have an interest in protecting their own industries. Such an understanding of international trade cooperation is shared widely in the field of international political economy (Axelrod 1984; Conybeare 1984, 1985; Gawande and Hansen 1999; Krugman 1992; Melese et al. 1989; Milner and Yoffie 1989; Rhodes 1989; Thorbecke 1997). Second, we do not regard the trade policies of different countries as being chosen unilaterally, but instead assume that countries observe and influence each other. As a result, successful trade policies are more likely to diffuse through the international system (Elkins and Simmons 2005; Gilardi 2010; Meseguer 2009; Pitlik 2007; Shipan and Volden 2008; Simmons and Elkins 2004; Yukawa et al. 2014). Based on those two crucial assumptions, we have applied an evolutionary game theory model, which is inspired by the work of evolutionary biologists (Friedman and Sinervo 2016; Imhof et al. 2005; Nowak 2006; Nowak and Sigmund 2004). We argue that the strategic interactions within populations of players are general phenomena that are not restricted to the biological realm, but which also take place in different economic, political and social circumstances (Friedman 1998).
Nonetheless, our evolutionary game theory model of global trade cooperation involves simplifications, which rightly can be criticized and require further work. Most importantly, we treat the countries in our population as unitary actors and do not open the black box of domestic politics. The influence of domestic politics on countries' trade policies is discussed extensively in the literature (e.g., Arce et al. 2008; Baldwin and Magee 2000; DeVault 2013; Hoffman 2009; Mansfield et al. 2000, 2002; Nollen and Iglarsh 1990; Wagner and Plouffe 2019); most public choice models of protectionism concentrate on the domestic realm (e.g., Damania et al. 2004; Lake and Linask 2015; Pecorino 1997; Aidt 1997). For the moment, we chose to omit domestic politics and to concentrate on the global dynamics of trade cooperation because this level of analysis has been subject to far less theoretical development over the past two decades. Given the current challenges of the global trading order, we argue that the global politics of trade deserve more academic attention. The next step in theory building will be to combine the existing domestic politics models of protectionism with our model of global trade cooperation to arrive at a full-fledged multi-level model of trade liberalization and protectionism.
In addition to excluding domestic politics, our model does not consider the economic developments that likely would occur within national economies in reaction to trade liberalization or protectionism. If countries open their markets to profit from international trade liberalization, their formerly protected industries decline, their export industries flourish and their markets are likely to grow. Conversely, a protectionist environment leads to shrinking markets, a decline in export industries, and a rise in protectionist demands (Pecorino 1997). Nonetheless, we decided to fix the market size M, the export share e and the rate of protectionism i to the empirical values from 2017 in order to keep countries' characteristics and the heterogeneity of the game-playing population as close to today's reality as possible. If the values of those variables changed owing to countries' cooperation or defection strategies, our population of countries would deviate from empirical reality in the course of the 50,000 rounds in our simulations. An interesting challenge for future research will be to allow for changes in the country characteristics M, e and i, and to set the parameters of the model so that its dynamic behavior resembles the historical development of the global trading order as closely as possible.

Conclusion
Our evolutionary game theory model of global trade cooperation differs from the biologists' models in two important respects. First, the iterated prisoner's dilemma is played by a heterogeneous population of countries that differ in terms of their competitiveness and market sizes. Heterogeneity implies that the success of strategies within the evolutionary process is determined not only by the effects of the strategies themselves: the strategies played by competitive and large countries are more successful than the strategies played by uncompetitive and small countries. Second, countries do not die or reproduce as a result of their trade strategies. Instead, they can mimic the example of successful countries and change their strategies accordingly. As a result, we distinguish between countries (with stable economic characteristics) and strategies (which change during the evolutionary process). Despite those modifications, our model reconfirms the finding of Nowak and his associates that none of the strategies studied is evolutionarily stable. Waves of cooperation and defection emerge. Unconditional defection is not evolutionarily stable because groups of (generous) tit-for-tat playing countries can enter a defectionist population successfully. Once (generous) tit-for-tat dominates the population, unconditional cooperation may succeed because it is not exploited by (generous) tit-for-tat playing countries. However, unconditional cooperation easily can be exploited by defectionist countries and the cycle starts anew.
The waves of trade liberalization and protectionism suggest that globalization is not the 'end of history' (Fukuyama 1992). Like hegemonic stability theory (Krasner 1976), our evolutionary game theory model of global trade cooperation explains the long waves of economic openness and protectionism, which have distinguished the history of the global trading order since the beginning of industrialization. However, the reasons underlying those waves of international cooperation and defection differ. Although our model takes the diverse market sizes and economic competitiveness of countries into account, it does not depend on the dominance of a single hegemonic player. It is the distribution of strategies within the population of countries that determines the success of a new strategy. Once several countries have adopted the same new strategy, the distribution of strategies within the population of countries shifts, which opens the door for yet another strategy. Large, competitive countries obviously have more leverage in changing the distribution of strategies than small, uncompetitive countries, but the same fluctuations also could be observed in a population of equally large and competitive players.
The success of economic nationalism like that of the Trump administration depends on the reactions of others. If other countries can sustain cooperation among each other while simultaneously punishing unilateral protectionism, the defecting country loses more in relative terms and reinforces its relative decline. However, if other countries try to stabilize cooperation by being generous and by appeasing defecting countries, economic nationalism may indeed become a successful strategy. Countries that are interested in an open and cooperative trading order need to balance generosity and retaliation carefully. Tit-for-tat is a strong strategy for punishing unilateral protectionism, but it faces difficulties when confronted with noise. In an uncertain and unpredictable world, countries sometimes may be forced to defect 'unintentionally' for domestic reasons. If all countries strictly play simple tit-for-tat, such 'unintentional' defection leads to endless rounds of retaliation and the global trading order collapses. To avoid that possibility, countries need to be generous rather than retaliating against every single defection. However, the problem of generosity and a high level of cooperation within the population is that it can be exploited by unilateral defection. Countries need to distinguish between unintentional defection, to which they should react with some generosity, and exploitative defection, which requires forceful retaliation. In a noisy and uncertain environment, that distinction is crucial, but difficult.
