Maintenance of cooperation in a public goods game: A new decision-making criterion with incomplete information

Hardin’s “The Tragedy of the Commons” prophesies the inescapable collapse of many human enterprises. The emergence and abundance of cooperation in animal and human societies is a challenging puzzle to evolutionary theory. In this work, we introduce a new decision-making criterion into a voluntary public goods game with incomplete information and choose successful strategies according to previous payoffs for a certain strategy as well as the risk-averse benefit. We find that the interest rate of the common pool and the magnitude of memory have crucial effects on the average welfare of the population. The appropriate sense of individuals’ innovation also substantially influences the equilibrium strategies distribution in the long run.

Cooperative behaviors are a trademark of both insect and human societies [1]. These behaviors are explained by kinship in the former and by unknown mechanisms in the latter. Hardin's "The Tragedy of the Commons" indicated that the overuse of public resources leads to an inevitable collapse of cooperation among non-kin populations, for example in the drive to reduce pollution and sustain the global climate [2]. Moreover, the prisoner's dilemma (PD) has long been viewed as a typical example of the disintegration of cooperation through pair-wise interactions. However, conspicuous examples of cooperation (although almost never of ultimate self-sacrifice) also occur where relatedness is low or absent [3].
To study interactions between humans qualitatively, the public goods game (PGG) has been used by economists, social scientists and evolutionary biologists as a paradigm to explain how cooperation is maintained in a group of unrelated individuals [4][5][6][7]. In a typical PGG, an experimenter endows four players with 20 money units (MUs) each. The four players then have the opportunity to invest part or all of their money into a common pool. The experimenter doubles the total capital in the common pool and divides it equally among all players regardless of their investment. If everyone cooperates and invests all the money, each player ends up with 40 MUs. However, every player faces the temptation to defect and to free-ride on the other players' contributions by withholding the money, because each invested MU returns only half an MU to the investor. If all the players were perfectly rational, they would invest nothing, and each would end up with only the initial 20 MUs. Such behavior, attributed to Homo economicus, varies considerably from experimental evidence [4]. Note that for pair-wise players with a fixed investment amount, the PGG reduces to the PD.
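The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch of the four-player game described above; the function name `pgg_payoffs` and its parameter names are illustrative and do not come from the original work.

```python
def pgg_payoffs(contributions, endowment=20.0, multiplier=2.0):
    """Final holdings of each player after one round of the public goods game.

    contributions: amount each player invests in the common pool.
    The pool is multiplied and split equally, regardless of who invested.
    """
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]


# Everyone invests everything: each player ends up with 40 MUs.
all_in = pgg_payoffs([20, 20, 20, 20])

# One free-rider among three full investors: the defector does best.
one_defector = pgg_payoffs([0, 20, 20, 20])

# Perfectly "rational" players invest nothing and keep their endowment.
nobody = pgg_payoffs([0, 0, 0, 0])

print(all_in, one_defector, nobody)
```

With one defector, the pool holds 60 MUs, is doubled to 120 MUs, and pays 30 MUs to everyone: the defector ends with 50 MUs and each cooperator with 30 MUs, which is the temptation to free-ride described in the text.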
Recently, various mechanisms have been proposed to explain the persistence of cooperation in the absence of genetic relatedness, including repeated interactions and direct reciprocity [3,8], indirect reciprocity [8][9][10], punishment [11,12], reward [13], and structured populations [1,14]. Voluntary participation may provide a way to escape from the economic stalemate and maintain substantial levels of cooperation in a large population without secondary mechanisms [5][15][16][17][18][19]. This volunteering PGG considers three types of players: (1) cooperators C and (2) defectors D, both willing to join the PGG, where the former contribute a fixed amount to the common pool while the latter attempt to free-ride on the others' investment and contribute nothing; and (3) loners L, who are unwilling to participate in the PGG and obtain a small fixed payoff. In other words, loners are risk-averse investors.
Traditionally, most researchers concerned with cooperation in social dilemmas focus on imitation or learning rules, with the central argument that an individual can compare their payoff with that of others, randomly choose a role model, and adopt the strategy of the role model with a certain probability [7][20][21][22][23][24][25][26][27][28][29][30]. Game theory has been an interesting subject in recent years [31][32][33][34]. We position the decision-making procedure among the games of incomplete information. In fictitious play, player i updates a weight function by adding 1 to the weight of a strategy each time it is played [23]. Moreover, the experience-weighted attraction model considers the strength of hypothetical reinforcement of strategies that were not chosen, according to the payoffs they would have yielded. However, without any information about their von Neumann neighbors, how does a person make decisions if they only know their own payoff in each round under the rules of the PGG? In other words, what is the decision-making criterion for a person with incomplete information, who only obtains their own information?
In this work, we introduce a new decision-making criterion to show how cooperation is maintained. Numerical simulation of this evolutionary mechanism illustrates that historical experience is a major factor when a person chooses an appropriate strategy. Moreover, compared with random selection, a proper length of memory is more conducive to the welfare of the whole population. The simulation also indicates that cooperation stays at a substantial level irrespective of the initial conditions, which means that introducing this criterion curbs free-riding behavior.

Model
Here, we consider the voluntary PGG on a spatial lattice with periodic boundary conditions. The players are arranged on a rigid, regular two-dimensional square lattice with 10000 sites and interact only with their von Neumann neighborhood, i.e. the four nearest neighbors of each lattice site.
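The lattice geometry can be illustrated with a short Python sketch. It assumes a 100×100 lattice indexed by integer coordinates (x, y), with periodic boundaries implemented through modular arithmetic; the function name and coordinate convention are illustrative choices, not taken from the original work.

```python
def von_neumann_neighbors(x, y, size=100):
    """Return the four nearest neighbors of site (x, y) on a size x size lattice.

    Periodic boundary conditions wrap the lattice into a torus, so sites on
    an edge still have exactly four neighbors.
    """
    return [
        ((x - 1) % size, y),  # left
        ((x + 1) % size, y),  # right
        (x, (y - 1) % size),  # down
        (x, (y + 1) % size),  # up
    ]


corner = von_neumann_neighbors(0, 0)
interior = von_neumann_neighbors(50, 50)
print(corner)
print(interior)
```

The corner site (0, 0) wraps around to sites in row and column 99, so every player has the same number of interaction partners.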
Confined to a site x on the lattice with incomplete information, player x chooses a strategy according to their own historical records. The profit of player x then depends on their strategy as well as on the choices of their neighbors. Considering a single PGG involving the player and their four nearest neighbors, the payoffs for the different strategies are determined by the five strategies. Namely, if $n_c$, $n_d$, $n_l$ (with $n_c + n_d + n_l = N = 5$) denote the numbers of C, D and L players, then the net payoffs of cooperators $P_c$, defectors $P_d$, and loners $P_l$ are given by

$$P_c = \frac{r\,n_c}{n_c + n_d} - 1, \qquad P_d = \frac{r\,n_c}{n_c + n_d}, \qquad P_l = K.$$

The cooperative investments are set to unity, r denotes the interest rate of the common pool, and K is the fixed payoff of a loner. In particular, if only one player joins the PGG (i.e. $n_c + n_d = 1$), the solitary player is treated as a loner. For a voluntary PGG deserving its name, we must have $1 < 1 + K < r < N$. In the rigorous sense of the spatial PGG, the payoffs to each player are accumulated by summing the player's performance in the PGGs taking place on the player's site and on the neighboring sites. For convenience, we assume that the payoff for a player is the average value of the payoffs obtained over all the games they take part in. Note that this simplification does not change the system's dynamics.
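The payoff rule for a single five-player game can be sketched in Python. This is a minimal sketch that assumes the standard voluntary PGG payoff form: participants share r times the number of cooperators equally, cooperators pay a unit investment, and loners receive the fixed payoff K. The function name and the example values r = 3 and K = 1 are illustrative assumptions.

```python
def voluntary_pgg_payoffs(n_c, n_d, n_l, r, K):
    """Net payoff per strategy in one voluntary public goods game.

    n_c, n_d, n_l: numbers of cooperators, defectors and loners in the group.
    r: interest rate of the common pool; K: fixed payoff of a loner.
    Assumes the standard voluntary PGG payoffs with unit investment.
    """
    participants = n_c + n_d
    if participants <= 1:
        # A solitary participant cannot play the game and is treated as a loner.
        return {"C": K, "D": K, "L": K}
    share = r * n_c / participants  # equal share of the multiplied pool
    return {"C": share - 1.0, "D": share, "L": K}


# Example satisfying 1 < 1 + K < r < N with N = 5.
mixed_group = voluntary_pgg_payoffs(n_c=3, n_d=1, n_l=1, r=3.0, K=1.0)
solitary = voluntary_pgg_payoffs(n_c=1, n_d=0, n_l=4, r=3.0, K=1.0)
print(mixed_group)
print(solitary)
```

In the mixed group, the defector earns one unit more than each cooperator because the only difference between the two payoffs is the unit investment.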
Confined to a site x on a square lattice, player x uses a mixed strategy $s_x = (p_l, p_c, p_d)$, i.e. a probability distribution over the pure strategies. It satisfies the normalization condition $p_l + p_c + p_d = 1$, where $p_i$ represents the probability that player x chooses strategy i.
That is to say, at each decision stage, every player chooses one of their feasible actions according to their preassigned probabilities. From time to time, player x reassesses and updates their mixed strategy according to the payoffs obtained in the previous rounds, i.e. from their historical experience. They increase the probability of the strategy used in the last round if the payoff satisfied them, and decrease it otherwise. But when do they feel satisfied? In our model, all the players realize that they can obtain a small but fixed income in the long run when they refuse to join the PGG. They also remember their decisions, together with the losses and profits of a certain strategy, in each round from the beginning until now. It is then safe to say that each player remembers their average payoff for a certain choice i, denoted $\bar{P}_i$. In our model, we suppose that each player updating their mixed strategy compares their last round's payoff with their decision criterion: the larger of K and $\bar{P}_i$. Assuming that player x chose strategy i in the last round, the mixed strategy then evolves according to an update rule in which the parameter ω denotes the magnitude of memory, with i = c, d and j = d, c (i.e. the strategy opposite to i), respectively. Hence, the players in our model can choose the strategy which has been successful during the game's history. In addition, as former researchers have done [24], we postulate that the players have the possibility to explore the available strategies, and define µ as the probability that a player chooses a random strategy.
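The satisfaction criterion and the exploration step can be sketched in Python. This sketch covers only what the text states: the aspiration level is the larger of K and the remembered average payoff of the chosen strategy, and with probability µ a player picks a strategy uniformly at random. The class and function names, the use of K alone for a strategy that has never been played, and the "greater than or equal" comparison are assumptions made for illustration. The ω-dependent probability update itself is not reproduced here.

```python
import random


class StrategyMemory:
    """Running totals of the payoffs a player obtained with each strategy."""

    def __init__(self):
        self.total = {"L": 0.0, "C": 0.0, "D": 0.0}
        self.count = {"L": 0, "C": 0, "D": 0}

    def record(self, strategy, payoff):
        self.total[strategy] += payoff
        self.count[strategy] += 1

    def average(self, strategy):
        """Average past payoff of a strategy, or None if it was never played."""
        if self.count[strategy] == 0:
            return None
        return self.total[strategy] / self.count[strategy]


def aspiration(memory, strategy, K):
    """Decision criterion: the larger of K and the average payoff of the strategy."""
    avg = memory.average(strategy)
    return K if avg is None else max(K, avg)  # assumption: K alone if unplayed


def is_satisfied(payoff, memory, strategy, K):
    # Assumption: a payoff equal to the aspiration level counts as satisfying.
    return payoff >= aspiration(memory, strategy, K)


def choose_strategy(mixed, mu, rng=random):
    """Sample a pure strategy; with probability mu, explore uniformly at random."""
    strategies = list(mixed)
    if rng.random() < mu:
        return rng.choice(strategies)
    return rng.choices(strategies, weights=[mixed[s] for s in strategies])[0]


memory = StrategyMemory()
memory.record("C", 2.0)
memory.record("C", 1.0)
print(aspiration(memory, "C", K=1.0), is_satisfied(1.25, memory, "C", K=1.0))
```

Here a cooperator whose past payoffs average 1.5 is not satisfied by a payoff of 1.25, even though it exceeds the loner's income K = 1, because the aspiration level is set by the better of the two benchmarks.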

Numerical simulation
Simulations were carried out for a population of 100×100 individuals. In the quantitative analysis, numerical simulations were performed to investigate the frequencies of cooperators and defectors when this decision criterion is introduced into the PGG. Figure 1 shows that this mechanism can maintain cooperation and avoid the dilemma. When r→2, most people view quitting the game as their best choice. When r→5, cooperation yields the greatest benefit for all players. Moreover, if r = 5, each player knows that defection is a reasonable choice only if they alone make this decision. Every player then faces the maximum temptation to defect and to free-ride on the other players' contributions.
Thus, Figure 1 shows that when r gets close to 5, the average frequency of defection is higher than 40%. Figure 2 shows that the payoff of defection approaches 2.2 when r = 5, which means that a sufficiently smart player will definitely take part in the game at this value, regardless of the probability of random choice. As Figure 1 shows, most players participate in the game, and the frequency of loners is close to zero. Additionally, when r is small, the decision-making criterion for a given player is the fixed payoff K of the loner, and the frequencies of C, D and L are the same as in former research [17]. In Figure 2, the payoffs of cooperators C and defectors D increase as r moves from 2 towards 5. This result conforms to the intuition of players in real life who know the interest rate of the common pool. Compared with previous results [17], payoffs that increase monotonically with r are more reasonable.
In the remaining text, we discuss how the magnitude of memory ω and the mutation rate µ affect the welfare of all the players. When ω→∞, the behavior of individuals is influenced only by the most recent payoff of a certain strategy, which means the players are myopic. The rule in this case is similar in spirit to Win-Stay-Lose-Shift rules. When ω→0, the payoffs of the present round have only a minor effect on the evolution of the mixed strategies, i.e. the players have longer memories. More precisely, the magnitude of memory represents the individual's sensitivity to their neighborhood. The lower the ω value, the more slowly an agent reacts to their surroundings, and vice versa. Thus, for a low ω value, even if individuals cannot earn more than loners, they will keep their original action for a very long time. Consequently, the population may not obtain very efficient outcomes in the long run. Conversely, for a high ω value, the individuals are mainly affected by current payoffs, and the C clusters collapse immediately once they are invaded by defectors D. That is to say, the strategy is unable to correct mistakes and, as with tit-for-tat [3], an accidental defection leads to a breakdown in cooperation, which decreases the average outcome of the population. We conclude that an appropriate magnitude of memory helps to promote collective outcomes. In general, the population may reach a more efficient outcome when ω is approximately 10 (Figure 3). In fact, the result suggests that both marching in lockstep and focusing on short-term goals work against the development of the population in a real-life situation. This is also in close accord with the conclusion of a previous study [17].
The value of µ characterizes the probability that an individual makes their decision randomly. For µ→1, the individual is completely irrational and their decision is random. In this case, the only equilibrium is an equal frequency of all three strategies. For smaller µ, the mutation rate causes only minor modifications to the equilibrium of the system when r > 3, whereas it affects the equilibrium significantly when r is small (Figure 4). The single extreme value in Figure 4 is much more credible than Figure 4(a) of [17], where there are two extreme values.

Conclusion
In this work, we introduce a new yet simple evolution method to spatial voluntary public goods games with incomplete information, resulting in interesting dynamic properties.
In our model, players are considered to know only their own payoffs in each round. They adjust their mixed strategies continually according to their decision-making criterion: the larger of the risk-averse benefit K and the historical average payoff of a certain strategy. If they are satisfied with their payoff, the probability of the strategy used in the previous round increases in their mixed strategy, and vice versa. Thus, this rule is no longer myopic. By introducing this criterion, we found that both the interest rate of the common pool and the magnitude of individuals' memory have an important effect on the evolution of cooperation. To maintain the best welfare of the whole population, people should choose an appropriate magnitude of memory. Meanwhile, our simulation results suggest that in a real-life situation, people make decisions based on their own history of experience, which is itself shaped by their neighborhood. This implies that the heterogeneity of individuals is due to the different environments in which they live, and a person may make a different choice as their environment changes. With this evolution rule, the proportion of cooperation stays at a substantial level, which reduces free-riding behavior in enterprise decision-making.