# Sampling Dynamics of a Symmetric Ultimatum Game

## Abstract

We propose a dynamic three-strategy symmetric model of the Ultimatum Game with players using a sampling procedure. We allow an intermediate strategy, interpreted as a social norm, to evolve in time according to beliefs of players about an average offer. We show that a social norm converges to a self-consistent offer of about 15 % in the unique globally asymptotically stable equilibrium of our model.

## Keywords

Sampling dynamics · Ultimatum game · Social norm

## 1 Introduction

Cooperation between unrelated individuals in animal and human societies is an intriguing issue in biology and social sciences, cf. [4, 5, 11, 17]. Usually, it is addressed within game-theoretic models such as the Prisoner’s Dilemma and the Snowdrift game. In economics, one of the fundamental questions concerns bargaining problems. The essential features of this social dilemma are present in the Ultimatum Game. It is a nonsymmetric game where the goal is to divide a fixed prize, of unit worth, between two players. The first player, called the proposer, makes an offer, that is, a share of the prize. The second player, called the responder, either accepts or rejects the offer. If the offer is accepted, the responder gets the offer and the proposer gets the rest of the prize; otherwise both players get nothing. It is easy to see that any offer may be supported by a Nash equilibrium. Let the proposer make an offer *α*∈[0,1] and the responder reject any offer strictly lower than *α* and accept all offers not smaller than *α*. Any such pair of strategies constitutes a Nash equilibrium. However, these equilibria are based on empty threats. Suppose that the proposer deviates from the equilibrium and makes an offer *α*′<*α*. Then the responder rejects the offer even though she would be better off accepting it.

A stronger notion of the subgame perfect equilibrium is used to select one from many Nash equilibria. If the prize is perfectly divisible, that is the strategy space of the proposer is continuous, there is a single subgame perfect equilibrium in which the proposer offers *α*=0 and the responder accepts any offer. If the prize has a grid with a size *g*>0, that is *g* is the smallest nonzero offer, and consequently the strategy space of the proposer is discrete, there are two subgame perfect equilibria. The first one is the same as in the continuous case. In the second equilibrium, the proposer offers *α*=*g* and the responder accepts any positive offer.
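The two equilibria on a grid can be checked by backward induction. The following is a minimal sketch, not part of the original analysis: the function name `subgame_perfect_offers` is hypothetical, only pure strategies are considered, and 1/*g* is assumed to be an integer.

```python
from fractions import Fraction

def subgame_perfect_offers(g):
    """Offers made in pure-strategy subgame perfect equilibria on a grid of size g."""
    steps = int(1 / g)                      # assumes 1/g is an integer
    offers = [k * g for k in range(steps + 1)]
    equilibrium_offers = set()
    # Sequential rationality forces acceptance of every positive offer; only
    # the reply to the zero offer is free (the responder is indifferent).
    for accept_zero in (True, False):
        def proposer_payoff(a):
            accepted = a > 0 or accept_zero
            return 1 - a if accepted else Fraction(0)
        best = max(proposer_payoff(a) for a in offers)
        equilibrium_offers.update(a for a in offers if proposer_payoff(a) == best)
    return sorted(equilibrium_offers)

print(subgame_perfect_offers(Fraction(1, 10)))  # the offers 0 and g = 1/10
```

If the responder accepts the zero offer, the proposer offers nothing. If she rejects it, the proposer's best choice is the smallest positive offer *g*.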

There is a vast body of literature on an experimental treatment of the Ultimatum Game. The precise analysis of this literature is beyond the scope of this note. However, the single most important finding is that the offers observed during experiments are different from predictions based on the concept of the subgame perfect equilibrium and vary between 20 % and 40 %. An excellent survey is in [8].

Many explanations of this systematic violation of theoretical predictions have been proposed, including dependence of subjects’ preferences on the payoffs of other players and various models of adaptation and learning, among them our previous paper [10]. Here, we offer a different dynamic approach using the notion of sampling equilibrium introduced in [12] and further developed in [16].

The rest of the note is organized as follows. In Sect. 2, the main model is derived. It is studied and discussed in Sect. 3. We conclude in Sect. 4.

## 2 Model

### 2.1 Ultimatum Game

We propose a symmetric version of the Ultimatum Game. Game roles, the proposer or the responder, are assigned to players at random with equal probabilities. Therefore, a pure strategy must define an action in each role, and so we define a pure strategy to be a pair (*α*,*β*), where *α* is the offer made when a player is the proposer and *β* is the minimal accepted offer when a player is the responder. To keep things simple, we restrict possible offers to the following three possibilities. A player is egoistic if *α*=0, altruistic if *α*=1, or is of an intermediate type if *α*=*δ*, *δ*∈(0,1). Also, to keep the number of pure strategies conveniently small, we need to relate the acceptance levels *β* to the offer levels *α*. To do so, we assume that players are symmetric across their roles, *α*=*β*. That is, if a player offers a share *α* in the role of the proposer, then he expects the same offer when he is the responder. Obviously, a player accepts all higher offers and rejects all lower offers. In short, players of our symmetric Ultimatum Game have three pure strategies at their disposal: (0,0), (*δ*,*δ*), and (1,1). Payoffs are therefore given by two matrices *P* and *R* for the proposer and the responder, respectively,

\[ P = \begin{pmatrix} 1 & \mathbf{0} & \mathbf{0} \\ 1-\delta & 1-\delta & \mathbf{0} \\ 0 & 0 & 0 \end{pmatrix}, \qquad R = \begin{pmatrix} 0 & \mathbf{0} & \mathbf{0} \\ \delta & \delta & \mathbf{0} \\ 1 & 1 & 1 \end{pmatrix}, \tag{1} \]

where \(P_{ij}\) and \(R_{ij}\) are payoffs of the proposer (the row player) and the responder (the column player), respectively, if the proposer plays the *i*th strategy and the responder the *j*th one (payoffs in bold are rejections).
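The acceptance rule described above determines both matrices, which can be made concrete in a few lines. This is a sketch under that rule only; the function name `payoff_matrices` and the sample value *δ*=1/4 are illustrative assumptions.

```python
from fractions import Fraction

def payoff_matrices(delta):
    """Return (P, R): proposer and responder payoffs for (0,0), (delta,delta), (1,1)."""
    levels = [Fraction(0), delta, Fraction(1)]   # offer = acceptance threshold
    P = [[0] * 3 for _ in range(3)]
    R = [[0] * 3 for _ in range(3)]
    for i, offer in enumerate(levels):           # row: proposer's strategy
        for j, threshold in enumerate(levels):   # column: responder's strategy
            if offer >= threshold:               # the offer is accepted
                P[i][j] = 1 - offer
                R[i][j] = offer
            # otherwise the offer is rejected and both payoffs stay 0
    return P, R

P, R = payoff_matrices(Fraction(1, 4))           # hypothetical value of delta
for row_p, row_r in zip(P, R):
    print([str(p) for p in row_p], [str(r) for r in row_r])
```

Rejections occur exactly above the diagonal, where the proposer offers less than the responder demands.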

The game we consider is a very simplified version of the Ultimatum Game and resembles the cardinal Ultimatum Game introduced in [3]. The cardinal Ultimatum Game is an extensive form game where the first player (the proposer) has only two strategies. She can offer an equal share *α*=1/2 or offer a share close to the subgame perfect equilibrium of the game. The second player (the responder) can accept or reject any offer.

It is noted in [3] that the game is derived by “*abstracting the crucial features of the full ultimatum game* (“*crucial*” *relative to observed patterns of lab play*).” We make some changes to the cardinal Ultimatum Game. They are dictated by modeling considerations on the one hand and technical necessities on the other. We start with a theoretical model and try to say something about the possible values of the parameter *δ*. In fact, we treat the intermediate strategy, given by the value of the parameter *δ*, as a social norm, admittedly simplified, where *δ* plays the role of both the actual average offer and players’ beliefs about the average offer, which are equal at the equilibrium. Therefore, we do not want to use any data to formulate the prior of the model. This is why we add the altruistic strategy, so that the possible strategies span the whole spectrum of possibilities even if this strategy is not (or extremely rarely) observed during experiments.

Also, we make the game symmetric. This is because we consider the evolution of the social norm within the population in the very long run. It seems to us that it would be very questionable to assume that some part of the population in the long run consists of proposers and the other part of responders. Such an assumption is perfectly valid in biological scenarios where players are animals and so they may be of different species. It can also make sense in some economic scenarios, e.g., a population of universities and a population of prospective students, but we do not see any compelling reason why in the context of bargaining an asymmetric scenario should be of any interest to explain the evolution of the social norm.^{1} Consequently, we make the game symmetric. For technical reasons we need to narrow down the set of pure strategies, and so we remove asymmetric strategies. Finally, the model of the game is more general than the cardinal Ultimatum Game, but at the same time the set of strategies is restricted to the symmetric ones. However, our model still allows for the same game patterns as the original cardinal Ultimatum Game proposed in [3], hence it can reproduce typical patterns of play observed in laboratory experiments.

### 2.2 Sampling Dynamics of the Ultimatum Game

A mixed strategy is a probability distribution over the set of pure strategies. The set of probability distributions is denoted by *Δ*. A mixed strategy **x**∈*Δ* can be interpreted as a distribution of pure strategies in a large population of players that are randomly matched into pairs to play a symmetric normal form game, that is we assume a standard evolutionary type setting, cf. [6, 18].

Now we construct our dynamical model. We assume that players use a testing or sampling procedure introduced in [12]. We restrict ourselves to the 1-sampling procedure: players test each pure strategy once against randomly chosen opponents and adopt the strategy with the highest payoff. If, for every *i*, the probability that the *i*th strategy is the winning one is equal to the current fraction of that strategy in the population, then we say that the population is at the sampling equilibrium.

Let *w*(*i*,**x**) denote the probability that the *i*th strategy will provide the highest payoff in a population where the distribution of pure strategies is **x**. Each pure strategy *i* defines a random variable \(v_i(\mathbf{x})\) that takes the payoff \(a_{ij}\) with probability \(x_j\) (obviously, care must be taken if there are tied payoffs). These random variables are independent. We get

\[ w(i,\mathbf{x}) = \Pr\Bigl[ v_i(\mathbf{x}) = \max_{j} v_j(\mathbf{x}) \Bigr], \tag{2} \]

that is, *w*(*i*,**x**) is the probability that the random variable associated with the *i*th pure strategy yields the highest payoff when all random variables \(v_i\) are sampled once. In the case of a tie, the probability is split equally among the best alternatives. The vector of winning probabilities is denoted by *w*(**x**). A mixed strategy \(\hat{\mathbf {x}}\) is a sampling equilibrium if \(\hat{\mathbf {x}}=w(\hat{\mathbf {x}})\). It is not difficult to see that *w*(**x**) is a polynomial in **x**, hence continuous, and so for any game there exists a sampling equilibrium by Brouwer’s fixed-point theorem.
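The definition of the winning probabilities can be illustrated for a generic symmetric matrix game. The sketch below enumerates all realizations of the independent samples and splits ties equally; the function name `winning_probabilities` and the 2×2 payoff matrix `A` are hypothetical and chosen only so that the result is easy to check by hand.

```python
from fractions import Fraction
from itertools import product

def winning_probabilities(A, x):
    """1-sampling winning probabilities w(i, x) for a game with payoff matrix A."""
    n = len(A)
    w = [Fraction(0)] * n
    # Strategy i is tested against its own independently drawn opponent j_i ~ x.
    for opponents in product(range(n), repeat=n):
        prob = Fraction(1)
        for j in opponents:
            prob *= x[j]
        payoffs = [A[i][opponents[i]] for i in range(n)]
        best = max(payoffs)
        winners = [i for i in range(n) if payoffs[i] == best]
        for i in winners:                    # ties split the probability equally
            w[i] += prob / len(winners)
    return w

A = [[0, 3], [1, 1]]                         # hypothetical 2x2 example
x = [Fraction(1, 2), Fraction(1, 2)]
print(winning_probabilities(A, x) == x)      # True: x is a sampling equilibrium
```

In this example the first strategy wins exactly when it meets the second one, so *w*(**x**) swaps the two coordinates of **x** and the uniform distribution is the only fixed point. Only the order of the payoffs matters: replacing 3 by any number larger than 1 leaves the result unchanged.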

The sampling dynamics was introduced^{2} in [16], and some further properties were studied in [14]. It is a system of ordinary differential equations of the form

\[ \dot{\mathbf{x}} = w(\mathbf{x}) - \mathbf{x}. \tag{3} \]

The simplex *Δ* is forward invariant under the sampling dynamics. Also, the set of all critical points of the vector field *w*(**x**)−**x** is the set of all sampling equilibria. This dynamics is well behaved since the vector field is a polynomial function in **x**, and so for any initial condition **x** there exists a unique solution *ξ*(*t*,**x**), *t*≥0.

It is important to note a unique feature of the sampling equilibrium concept. In contrast to the notion of Nash equilibrium, a sampling equilibrium depends only on inequalities between payoffs, and so the notion of sampling equilibrium is an “ordinal” concept. A perturbation of payoffs does not change the sampling equilibria of the game as long as the inequalities between the payoffs, and consequently the order of the payoffs, are preserved. In other words, the concept of sampling equilibrium assumes only one thing about the players: that they prefer higher payoffs to lower payoffs. Unlike the Nash equilibrium, it does not take into account the differences between the payoffs.

Now we would like to justify the use of the sampling dynamics (3). One reason is the learning procedure behind it. It is based not on imitation like most dynamics applied to Ultimatum Game (mostly in the form of the replicator dynamics^{3}), but on the procedure of comparing private outcomes in a sequence of games. From the descriptions of the experiments, it seems that subjects cannot observe choices made by other participants and so this excludes any model based on imitation. The second reason is that the model of the game is severely simplified and so the use of a learning procedure that discerns the finest differences in payoffs seems inappropriate. The model of sampling procedure, and consequently the notion of sampling equilibrium that is somewhat grainy, seems more suitable.

To write down the sampling dynamics (3) for the symmetric Ultimatum Game, we need the random variables \(v_i(\mathbf{x})\). These random variables depend on the value of *δ* in general, but it is enough to write down only a single table. Table 1 presents these random variables. The construction details are discussed in Appendix A.

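Table 1 Random variables \(v_i(\mathbf{x})\) in the symmetric Ultimatum Game

| | 0 | *δ* | 1−*δ* | 1 |
|---|---|---|---|---|
| \(\Pr[v_1(\mathbf{x})=\cdot]\) | 1/2 | \(x_2/2\) | 0 | \((1-x_2)/2\) |
| \(\Pr[v_2(\mathbf{x})=\cdot]\) | \((1-x_2)/2\) | \(x_2/2\) | \((1-x_3)/2\) | \(x_3/2\) |
| \(\Pr[v_3(\mathbf{x})=\cdot]\) | \(1-x_3/2\) | 0 | 0 | \(x_3/2\) |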
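The distributions of the random variables can be rebuilt from the rules of the game: a player is the proposer or the responder with probability 1/2 and meets an opponent drawn from **x**. The sketch below does this with exact fractions; the function name `payoff_distribution` and the sample values of **x** and *δ* are illustrative assumptions.

```python
from fractions import Fraction

def payoff_distribution(i, x, delta):
    """Distribution of v_i(x) as a dict {payoff: probability}; i is 0, 1 or 2."""
    levels = [Fraction(0), delta, Fraction(1)]      # offer = acceptance threshold
    half = Fraction(1, 2)                           # each role has probability 1/2
    dist = {}
    for j, xj in enumerate(x):                      # opponent plays strategy j
        # As proposer: keep 1 - offer if the opponent accepts, else 0.
        as_proposer = 1 - levels[i] if levels[i] >= levels[j] else Fraction(0)
        # As responder: receive the opponent's offer if it meets the threshold.
        as_responder = levels[j] if levels[j] >= levels[i] else Fraction(0)
        for payoff in (as_proposer, as_responder):
            dist[payoff] = dist.get(payoff, 0) + half * xj
    return dist

x = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]   # hypothetical population
delta = Fraction(1, 4)                                 # hypothetical norm
for i in range(3):
    print(i + 1, sorted(payoff_distribution(i, x, delta).items()))
```

For this sample population the egoistic strategy, for instance, yields 0 with probability 1/2, *δ* with probability \(x_2/2=1/8\), and 1 with probability \((1-x_2)/2=3/8\).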

Depending on the value of *δ* there are three distinct cases. For *δ*<1/2, we have *δ*<1−*δ* and the values in Table 1 are in ascending order. For *δ*>1/2, the two middle columns should be interchanged, while for *δ*=1/2 the two middle columns should be added. The value of the parameter *δ* does change the winning probabilities *w*(*i*,**x**) required for the sampling dynamics (3). To get the winning probabilities, we have to consider all 24 possible realizations of the random vector \((v_1, v_2, v_3)\) and calculate probabilities according to (2). This results in three different dynamics: (4) for *δ*<1/2, (5) for *δ*=1/2, and (6) for *δ*>1/2.
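As a worked illustration of this enumeration, the winning probabilities for the case *δ*<1/2 can be derived directly from Table 1. The polynomials below are a sketch obtained by summing over the 24 realizations with ties split equally, using \(x_1 = 1 - x_2 - x_3\). They are not copied from the displayed dynamics and may differ from it in form, though not in value.

```latex
\begin{aligned}
24\,w(1,\mathbf{x}) &= 14 - 8x_2 - 7x_3 - 3x_2^2 + 4x_2x_3 + x_3^2 + \tfrac{3}{2}x_2^2x_3 - x_2x_3^2,\\
24\,w(2,\mathbf{x}) &= 8 + 10x_2 - x_3 + 3x_2^2 - 8x_2x_3 + x_3^2 - \tfrac{3}{2}x_2^2x_3 + 2x_2x_3^2,\\
24\,w(3,\mathbf{x}) &= 2 - 2x_2 + 8x_3 + 4x_2x_3 - 2x_3^2 - x_2x_3^2 .
\end{aligned}
```

The three right-hand sides add up to 24, so the winning probabilities sum to one. At the vertex **x**=(1,0,0) they give *w*=(7/12, 1/3, 1/12), which can be checked by hand from Table 1. Substituting these expressions into \(\dot{\mathbf{x}} = w(\mathbf{x}) - \mathbf{x}\) gives one possible form of the dynamics for *δ*<1/2.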

## 3 Results and Discussion

In the previous section, we constructed the sampling dynamics (4)–(6). Here, we discuss its qualitative behavior. In particular, we look at stationary points, that is, time-independent solutions of (4)–(6). Such points provide distributions of pure strategies in the population at the equilibrium. We would like to interpret the intermediate pure strategy (*δ*,*δ*) as the social norm, and hence the believed *δ* should be the mean offer in the population at the equilibrium. To make the concept of social norm sensible, we require the equilibrium to satisfy the following three conditions.

### Property 1

(Uniqueness & global asymptotic stability)

*For all values of* *δ*∈(0,1), \(\hat{\mathbf {x}}_{\delta}\) *is the unique equilibrium and for any initial condition* **x**, *solution of* (4)*–*(6) *converges to the equilibrium*, \(\xi(t, \mathbf {x})\rightarrow \hat{\mathbf {x}}_{\delta}\) *as* *t*→∞.

Property 1 is necessary to claim that an equilibrium supports the social norm. If there were two asymptotically stable equilibria and only one of them satisfied the remaining two conditions, we would run into trouble, as we would have to explain why the initial condition should belong to the basin of attraction of the correct equilibrium. This is not to say that different populations cannot have different social norms.

### Property 2

(Mean-consistency)

*At the equilibrium* \(\hat{\mathbf {x}}_{\delta}\) *we require that*

\[ \bar{\delta}(\delta) = \delta, \tag{7} \]

*where* \(\bar{\delta}(\delta)\) *is an average offer at the equilibrium*.

Property 2 is a consistency condition. If we want to interpret the parameter *δ* as an average value of an offer in a population at the equilibrium, then the value predicted by the model needs to be consistent with the assumed value, i.e., Eq. (7) has to be satisfied. We use Eq. (7) to find the exact value of *δ* predicted by the model. Also, one may interpret parameter *δ* as players’ beliefs about the actual average offer in the population. Then the consistency condition is an exact analog^{4} of the equilibrium condition in [13].

There is a different way of interpreting condition (7). In fact, we have defined a class of models indexed with the parameter *δ*. If we assume a particular value of this parameter then we arrive at the corresponding equilibrium given that Property 1 is satisfied. At the equilibrium, we can calculate the mean offer \(\bar{\delta}\) which gives rise to another model. We are looking for a fixed point of this process, that is a fixed point in the space of models.

### Property 3

(Mode-consistency)

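*At the equilibrium* \(\hat{\mathbf {x}}_{\delta}\), *we require that the intermediate strategy* (*δ*,*δ*) *has the largest share in the population*,

\[ \hat{x}_{\delta,2} = \max_{i} \hat{x}_{\delta,i}, \tag{8} \]

*where* \(\hat{x}_{\delta,i}\) *denotes the share of the* *i*th *strategy at* \(\hat{\mathbf {x}}_{\delta}\).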

Property 3 is required if we want to interpret the intermediate strategy as the social norm. If this condition is violated, then it amounts to a statement that we have an equilibrium at which the social norm is not used by the majority of the population. What kind of a norm would that be then?

There is an interplay between the last two properties. Property 2 can give a very sharp prediction, but it is based on a statistic that is not robust. On the other hand, Property 3 usually leads to a set of values of the parameter *δ* but is based on a robust statistic. We want to have an equilibrium satisfying both properties, but we are interested in the predicted set as well as in the particular predicted value of *δ*, understanding that the predicted sharp value of *δ* can be quite far away from some of the experimental data.

It is obvious that the crucial property required for further discussion is the uniqueness and the global stability of equilibrium. We prove in Appendix B that our sampling dynamics (4)–(6) has Property 1 for any value of the parameter *δ*.

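The equilibrium \(\hat{\mathbf {x}}_{\delta}\) depends on the parameter *δ* in a noncontinuous way, see (9). It is constant on each interval on which a change of *δ* does not lead to changes in inequalities defining winning probabilities. This results from the sampling procedure being “grainy” as we have noted before.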

A quick look at the formulae (9) shows that Property 3 is satisfied only for *δ*≤1/2. For *δ*>1/2, the egoistic strategy has the largest share of the population. Consequently, we conclude that our model predicts that the average offer should not be larger than 50 %. This result corresponds nicely with the experimental results.

Property 2 selects the value \(\delta^{*}\) satisfying the equation \(\bar{\delta}(\delta^{*}) = \delta^{*}\), namely \(\delta^{*}\approx 0.146\). Figure 1(d) shows the dependence of the average offer on the value of the parameter *δ*. This value fits nicely into the region *δ*≤1/2 where Property 3 is satisfied.

Concluding, all three Properties 1–3 are satisfied for *δ*≈0.146. Hence, the model’s predictions correspond nicely with the experimental results and are far better than the predictions of the fully rational game theory, i.e. a subgame perfect equilibrium. Figures 1(a)–(c) show the behavior of the sampling dynamics for all three distinct cases.

The model also allows us to describe the evolution of beliefs (and hence of the intermediate strategy (*δ*,*δ*) interpreted as a social norm) by iterating the function \(\bar{\delta}\). Suppose that the population starts with beliefs \(\delta_{0}\). After a while, the state of the system converges to the corresponding equilibrium \(\hat{\mathbf {x}}_{\delta_{0}} \). Once near the equilibrium, the society eventually learns that the mean offer is different from the beliefs \(\delta_{0}\) and adapts beliefs to \(\delta_{t_{1}} = \bar{\delta}(\delta_{0}) \). This leads to a new equilibrium \(\hat{\mathbf {x}}_{\delta_{t_{1}}} \) and the process is repeated. We can use the recurrence equation

\[ \delta_{t_{n+1}} = \bar{\delta}(\delta_{t_{n}}) \tag{10} \]

to describe this process. For any initial \(\delta_{0}\), after at most two steps we have \(\delta_{t_{n}} < 1/2\), and then we know that the population eventually converges to the equilibrium \(\hat{\mathbf {x}}_{\delta<1/2}\), which is the unique globally asymptotically stable equilibrium.^{5} Consequently, the beliefs \(\delta_{t_{n}}\) converge to the actual average offer and the intermediate strategy (*δ*,*δ*) converges to the mean-consistent and mode-consistent social norm (\(\delta^{*}\),\(\delta^{*}\)). The actual timing \(t_{n}\) of steps in the recurrence equation (10) is irrelevant as long as \(t_{n}\to\infty\). Thus we can think of two different time scales, where the time scale for the distribution of strategies within a population is fast and the time scale of the evolution of beliefs (or a social norm) is slow.

Our model predicts the convergence of beliefs and a social norm to some value corresponding well with the observed behavior in a process where at each step the population adjusts beliefs to the current equilibrium \(\hat{\mathbf {x}}_{\delta_{t_{n}}}\) given a current social norm \((\delta_{t_{n}}, \delta_{t_{n}})\). Figure 1(d) shows this behavior for initial values *δ* _{0}=0 and *δ* _{0}=1. For *δ* _{0} close to 1, two steps are required to have \(\delta_{t_{n}} < 1/2 \).

## 4 Conclusions

We presented a model of a symmetric Ultimatum Game and analyzed the sampling dynamics describing the evolution of a population of players using the sampling procedure. The model is parameterized by a parameter *δ* interpreted as an average offer at the equilibrium. The intermediate strategy (*δ*,*δ*) is interpreted as the “social norm” strategy.

The sampling procedure leads to dynamics with a unique globally asymptotically stable equilibrium \(\hat{\mathbf {x}}_{\delta}\) that depends on *δ* in a noncontinuous way but is constant over two separate intervals. We showed that the natural property of mode-consistency leads to the selection of *δ* not larger than 1/2. The property of mean-consistency results in a particular value of the parameter, *δ* ^{∗}≈0.146. Both results correspond nicely with the observed behavior in experiments.

We would like to stress that the reported experimental mean offers vary wildly and are sometimes quite far away from our mean-consistent value of *δ* ^{∗}. This may be for several reasons. Firstly, the model presented here is an extreme simplification and as such should not be construed as an exercise in fitting. Rather, we start with a model and want to derive some bounds on the predicted average offer in experiments based on certain “natural conditions” of mean-consistency and mode-consistency. Secondly, the mean is not a robust statistic and may be severely distorted by outliers, even more so in small samples. A more robust statistic, the mode, gives rise to predictions that are less precise but at the same time more consistent with the data. It is clear that, even taking into account the statistical properties of the mean, the derived mean-consistent value of *δ* ^{∗} is certainly too low in comparison to the observed modal offer of around 40 %, cf. [9]. It was argued that the focal point, cf. [15], of the Ultimatum Game is the social norm of equal split, cf. [7], which is a better explanation of the observed offers than our model. However, as noted in [2], “*social comparisons activate the norm of equity: responders expect to be treated like others in like circumstances*.” In our model, each player compares the received offer only with what he offers while being the proposer. There are neither comparisons between players nor any knowledge about the mean offer, and so the focal point of the equal split is never triggered. Our model takes into account only the *I want to be treated the way I treat others* norm but not the *I want to be treated like others are* norm. This is probably not enough to warrant a value of *δ* close to the even split social norm.

As mentioned before, our model is an extreme simplification. Although it captures the basic features of simple bargaining situations, more detailed models are needed. Further analysis of the Ultimatum Game within the sampling dynamics framework can proceed by introducing more intermediate strategies, with the limit case containing all intermediate strategies (*α*,*α*), *α*∈(0,1). This may lead to infinite-dimensional dynamical systems or partial differential equations describing the evolution of a density function over the set of strategies. Another possible extension is the inclusion of direct comparisons between players (perhaps through studying a game on a network) or the inclusion of information about the mean offer in the players’ payoff function. Construction and analysis of such systems are left for future work; it should be noted, however, that such analysis may be extremely difficult.

## Footnotes

- 1.
On the other hand, such an asymmetric scenario may be of great interest if one wants to study an asymmetric experiment, something that we do not do herein.

- 2.
These dynamics can be derived rigorously along the lines of [1].

- 3.
Replicator dynamics and other similar dynamics can be derived from different behavioral procedures like replication or learning with aspiration levels, but all of these procedures are either unsuitable like replication with its biological connotations or use some additional information that seems not available to players.

- 4.
Obviously, in the original model proposed in [13], the model of beliefs is more complicated and formal. We use only one parameter for both beliefs about an average and actual average. At the equilibrium, they are equal. We could distinguish these two outside of the equilibrium, however, as we show later, the dynamics describing the evolution of beliefs is decoupled from the dynamics describing the evolution of the distribution of strategies thus such a distinction in the case of the sampling dynamics is irrelevant.

- 5.
That is, the critical point is locally asymptotically stable and for any initial condition **x**, the solution converges to the critical point.

## Notes

### Acknowledgements

J. Miękisz would like to thank Polish Ministry of Science and Higher Education for a financial support under the grant N201 023 31/2069.

### Open Access

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

## References

- 1. Benaim M, Weibull JW (2003) Deterministic approximation of stochastic evolution in games. Econometrica 71:873–903
- 2. Bohnet I, Zeckhauser R (2004) Social comparisons in ultimatum bargaining. Scand J Econ 106:495–510
- 3. Bolton GE, Zwick R (1995) Anonymity versus punishment in ultimatum bargaining. Games Econ Behav 10:95–121
- 4. Hamilton WD (1963) The evolution of altruistic behaviour. Am Nat 97:354–356
- 5. Hammerstein P (ed) (2003) Genetic and cultural evolution of cooperation. MIT Press, Cambridge
- 6. Hofbauer J, Sigmund K (1998) Evolutionary games and population dynamics. Cambridge University Press, Cambridge
- 7. Janssen MCW (2006) On the strategic use of focal points in bargaining situations. J Econ Psychol 27:622–634
- 8. Kagel JH, Roth AE (eds) (1997) The handbook of experimental economics. Princeton University Press, Princeton
- 9. Levine D (1998) Modeling altruism and spitefulness in experiments. Rev Econ Dyn 1:593–622
- 10. Miękisz J, Ramsza M (2012) Replicator dynamics of symmetric ultimatum game. Dyn Games Appl 2:258–268
- 11. Nowak MA, Highfield R (2011) Super cooperators: why we need each other to succeed. Canongate Books, Edinburgh
- 12. Osborne M, Rubinstein A (1998) Games with procedurally rational players. Am Econ Rev 88:834–847
- 13. Rabin M (1993) Incorporating fairness into game theory and economics. Am Econ Rev 83:1281–1302
- 14. Ramsza M (2005) Stability of pure strategy sampling equilibria. Int J Game Theory 33:515–521
- 15. Schelling TC (1980) The strategy of conflict. Harvard University Press, Cambridge
- 16. Sethi R (2000) Stability of equilibria in games with procedurally rational players. Games Econ Behav 32:85–104
- 17. Sigmund K (2010) The calculus of selfishness. Princeton University Press, Princeton
- 18. Weibull JW (1995) Evolutionary game theory. MIT Press, Cambridge