1 Introduction

For centuries, philosophers, economists and social scientists have been intrigued with the issue of whether human notions of fairness are hard-wired in the human psyche, or whether behavior that appears to be fair is simply a useful tool in self-interested activity. This question resonates in the experimental economic literature, where many experimenters observe that subjects in bilateral or small-group settings tend to choose fair instead of self-interested plans of action, even when they are motivated by financial incentives and shielded by anonymous payment procedures.

In both laboratory and field experiments, the workhorse tool for investigating fairness is the ultimatum bargaining game in which one person makes a take-it-or-leave-it offer to another, who in turn can either accept the offer or reject and force both parties to earn nothing. The ultimatum bargaining model has served as a useful paradigm in the social sciences because of its simplicity and wide applicability to economic negotiations in which one party has the ability to make a final, non-negotiable proposal.Footnote 1 The model has also attracted attention because the minimum possible offer prediction (based on pure, calculated self-interest) is rarely observed in experiments (Binmore et al., 2002; Gale et al., 1995; Samuelson, 2005, 67; Sieberg et al., 2013).Footnote 2 Rather than offering the minimum possible amount to the other person, proposers tend to make ‘fair’ offers, often sharing roughly half of the pie. This tendency suggests that fairness has a bigger role in human behavior than pure self-interest would predict.

In Crooked Thinking or Straight Talk: Modernizing Epicurean Scientific Philosophy (2020), Kenneth Binmore explores philosophical perspectives on the roots of fairness. He addresses insights of prominent philosophers about the potential for rational people to adopt fair behavior (Kant) or to choose fair actions in a sort of minimax strategy—to ensure that the worst off have the best possible outcomes (Rawls). Although Binmore is intrigued by these ideas, he does not find them convincing. Instead, he concludes that the best explanation for why people engage in fair-minded behavior is that they do so when this behavior is in their own long-term best interest.

Despite robust findings of fairness in the general ultimatum game, seemingly slight modifications to the ultimatum game have been shown to produce demands that are more “gamesman-like” (Binmore et al., 1985). In particular, factors such as role reversal and the opportunity to learn from prior interactions can yield demands that come closer to standard predictions based on self-interested motivations. We draw from important insights and modifications in the experimental literature to explore Binmore’s concerns. We integrate these factors into our experiment by appending a conflict stage to a two-person ultimatum game. In our conflict bargaining game, a rejection of the take-it-or-leave-it offer results in a contest in which both parties bear a cost, but only one of them “wins” the whole potential earnings pie. The random conflict outcome is determined by a pre-announced probability of success for the proposer. This conflict option creates differential opportunities for the responders to punish offers they see as unfair. We also allow for learning with random re-matching of proposers and responders. Finally, to explore the role of asymmetric experience and learning, the initial pairings are followed by a second part in which all of the original proposers stay to play the same conflict bargaining game with a newly arrived group of responders.

In all cases, the win probabilities for each party to a conflict are common knowledge. For each of the conflict win probability treatments, the optimal initial demand of a self-interested proposer can be calculated by backward induction, under the assumption that responders also make accept-or-challenge decisions purely on the basis of their own self-interest. In particular, the proposer can demand the highest share of the pie that a self-interested responder would accept. Our results are consistent with Binmore’s argument and with those of some previous modifications to the ultimatum game experiment. With experience, proposers (and original responders, when roles are reversed) are increasingly likely to make theoretically predicted demands in this competitive context.

The main observed deviation from rational, self-interest-based theory, however, involves the incidence of conflict. In theory, the proposer should be able to use the win probability to calculate which initial demands will be accepted and which will be rejected. Thus, the optimal demand is at the “edge,” in the same manner that the predicted demand in a standard ultimatum game involves an offer that is just above the minimum that a rational, self-interested responder would accept. In this manner, there “should” be no conflict. In contrast, we observe persistent, non-negligible levels of rejection and conflict. Although conflict is costly, it plays a role in encouraging proposers to make more realistic and more profitable demands and in helping responders understand what they “should” accept. These results fit nicely with Binmore’s argument that behavioral strategies adopted are likely to be case-specific: Notions of fairness will be applied if they best advance prospects of meeting self-interest-based goals. If more selfish strategies can be more successful in that pursuit, they should be adopted (Binmore, 2020, p. 123). In our experiments, conflict teaches both proposers and responders to make and accept demands that are consistent with game-theoretic predictions, rather than with fairness concerns.

2 Literature Review

Standard game-theoretic predictions are based on an assumption that players are exclusively motivated by the future causal payoff consequences of their acts. In an ultimatum game, the implication is that proposers should offer an amount that is weakly better than the zero earnings amount that the responder can obtain from rejecting (some small \(\varepsilon >0\)), based on the belief that such an offer would be accepted in the final stage or subgame. In technical terms, such an outcome is required by Selten’s notion of subgame perfection as a restriction on the set of reasonable Nash equilibria. Experimental results, however, tend to yield offers that lie between 40 and 50% of the total pie. Proposers seem to have concerns both for their own payoffs and for the responder payoffs. In fact, the original ultimatum game was itself devised as a single-period task to examine some perplexing multi-round bargaining behavior that had been observed in early psychology experiments.Footnote 3

This stark contrast between minimal-offer predictions and observed pie-splitting tendencies has not gone unquestioned, however.Footnote 4 Instead, it has provoked several papers assessing behavioral explanations as well as attempts to alter the game in order to stimulate more strategic behavior. In the original ultimatum game experiment, Güth et al. (1982) suggest that some of the discrepancies between predictions and results arise from concerns over “inequity aversion.” In other words, if players value equitable distributions, then a responder may prefer zero earnings that are equal for both players to a positive earnings amount that is far below that obtained by the proposer. Others (Ochs & Roth, 1989) argue that results are driven by the psychology associated with the presence of first-mover advantage, constrained by potential punishment from the second player.

The responder’s ability to punish the proposer for non-equal distributions is not the only factor affecting tendencies for fairness. Research on the dictator game, in which the proposer determines the shares for both players, with no chance for a response, has shown that proposers frequently offer more than the predicted amount of zero. This result is so regular that the exceptions are of interest. Hoffman et al. (1994), however, argue that this behavior may indicate a desire to appear fair to the experimenter. In fact, behavior that is closer to being purely selfish is typically observed when experiments are conducted using a double-blind treatment, in which the experimenter could not link actions to a given subject. Similarly, Lönnqvist et al. (2014) show that if dictators can claim that their decisions were based on the flip of a coin, they are also more likely to behave selfishly.Footnote 5

2.1 Ultimatum Game Modifications

Several authors have devised subtle alterations to the traditional ultimatum game model in an effort to generate more gamesman-like behavior. One avenue is to let the responder express some emotion in terms of a message that is not directly payoff relevant, and hence, which does not alter the structure of the payoff functions for the game. Xiao and Houser (2005) demonstrate that punishment need not be costly if subjects have the option to express disapproval for unfair offers. They compare rejection rates for offers of 20% or less in one-shot ultimatum games under two different treatments—one which mirrored the traditional ultimatum game with no opportunity for expression of emotions, and another in which responders were able to send a message to the proposers along with the decision. The authors found roughly the same number of unfair offers in each treatment but saw significantly fewer rejections in the treatment with ex-post messages. The opportunity to express negative emotions seems to have provided responders with a less costly form of punishment than outright rejection (Xiao & Houser, 2005, 7400). Similarly, Grimm and Mengel (2011) find that a forced delay (of roughly 10 min between seeing an offer and deciding whether to accept or reject it) can significantly increase acceptance rates for low offers (of 1 or 2 out of 10) when compared to subjects who make an immediate decision.Footnote 6

2.1.1 Multi-Stage Games

Another possible modification is to add a second stage in which the original responder can reject and make a take-it-or-leave-it counteroffer. Binmore et al. (1985) conducted two-stage ultimatum bargaining game experiments and found results that echoed previous findings that proposers tended to make fair demands for the parameterization they implemented. However, when responders from this experiment were recruited to play as proposers in a new treatment, the initial demands made were much closer to subgame-perfect-Nash-equilibrium predictions (as determined by backward induction). Thus, the authors argued that a combination of a multi-stage game and experience with both roles can lead subjects to play “like a game theorist” (1985, 1179).Footnote 7 Neelin et al. (1988) offer a more cautious assessment, noting that subgame-perfect Nash-like behavior was not observed in games with three or more stages, or even with two stages if the parameters used cause the proposer to be disadvantaged. They conclude that the observed tendency for proposers to make optimal offers in some two-stage games was driven largely by the parameters chosen (1988, 824).

2.1.2 Expectations

Harrison and McCabe (1996) conducted ultimatum game experiments to test the role of common expectations on subject behavior. In their experiments, common expectations were created by having the players gain experience with both roles, by giving them varying levels of information about the demands and responses of other players, and by injecting automata into some of the subject pools. Players were told that their role as a proposer or responder would be selected at random. They were asked to submit both a range of demands, in the event that they were to play the role of a proposer, and a range of acceptable offers, in the event that they were to play the role of a responder. The authors also paired some subjects with automata that were programmed either to make offers as proposers or to accept offers as responders in ways that were consistent with game-theoretic predictions. Subjects were given varying levels of access, in different treatments, to information about offers and thresholds for acceptance. Access to information allowed the authors to detect the role of common expectations on offers and acceptances. The addition of automata enhanced this effect by giving subjects experience with other “subjects” who played like a game theorist. The authors found that the effect of these common expectations was to increase the amount that proposers kept for themselves, and to decrease the minimum threshold for acceptance. This tendency was even more pronounced in the sessions with automata. The authors concluded that common expectations can play a stronger role than fairness concerns. In particular, if subjects are aware that proposer demands larger than the standard 50% level can be made and accepted, then behavior can move closer to what game theory would predict.

2.1.3 Competition

Fehr and Schmidt (1999) note that situations involving competition can work against tendencies towards reciprocal fairness. Such competition is particularly relevant in settings that resemble markets with multiple buyers and sellers. Güth et al. (1997) test the effect of injecting competition into an ultimatum bargaining game. In their game, the proposer can make an offer to five potential responders, and only one of the five would be selected. The proposer can then choose the responder who is willing to accept the highest demand and take the lowest offer, and the four remaining responders earn nothing. As a result, the responders compete against each other. By creating this competition, the authors find results that are closer to subgame-perfect Nash predictions.

Competition alters the effect of fairness concerns on the overall outcome. Bolton and Ockenfels (2000) note that “[w]hereas the second mover can force a fair outcome in the UG, a trader in a competitive environment cannot force an equal split, and an effort to do so risks them falling behind traders less interested in fairness. In competitive markets, strategic considerations compel the fair minded to behave as if they are self-interested” (van Damme et al., 2014, 301). Similarly, Grosskopf (2003) examines a game in which players play a traditional ultimatum game and a modified version in which each proposer has three responders. Her results indicate that the presence of competition between the responders can motivate the proposer to make demands that are closer to theoretical predictions. Salmon and Wilson (2008) report similar competition effects in an experiment in which several responders first bid in an auction to purchase a unit from a seller, which is sold to the highest bidder. The seller can then post a take-it-or-leave-it price offer to sell a second unit to the second-highest bidder in the original auction, with rejection resulting in 0 earnings on that unit for both parties. Although the seller’s posted price in the final ultimatum stage was generally quite high, leaving little in the way of earnings for the targeted buyer/responder, the actual rejection rate was below 5 percent.Footnote 8

2.1.4 Uncertainty

Other variations of the ultimatum game include situations in which there are uncertainties regarding the size of the pie or “cake.” Güth et al. (1996) report an experiment in which the cake can either be large, 2/3 of the time, or small, 1/3 of the time. The probabilities are known, but only the proposer knows the true size of the cake. They find that proposers with large cakes tend to make offers that look like a fair distribution of a small cake, while proposers with small cakes make smaller offers. Thus, they note that seemingly ‘fair’ behavior entails the proposers ‘hiding behind the cake.’ Proposers will be selfish when they can conceal their selfishness. This result, of partial self-serving dishonesty, is reflected in Fischbacher and Föllmi-Heusi’s (2013) results regarding tendencies to lie in situations in which the lying could not be directly detected and in which there were no penalties. The authors had subjects roll a die and report the outcome. The number affected their earnings in a questionnaire that followed. Comparing the reported results to what could be expected in random throws of a die, the authors found that some subjects are obviously honest, some lie and some are apparently partially dishonest—they inflate the number, but not to the highest possible amount. Taken together, the results suggest that people are self-interested, but that even when they can get away with being fully selfish, they may not do so.

2.1.5 Rejection Outcomes

The impact of the mutual rejection earnings of zero has also been explored. Motivated by a previous literature using non-zero rejection outcomes, Hennig-Schmidt et al. (2018) ran an experiment to compare the effects of rejection outcomes that favored either proposers or responders. Their baseline for comparison was a setup in which both roles had the same rejection outcome. The authors report that the tendency for equal splits declined when the subjects had different rejection outcomes. Moreover, proposer offers decreased as their outside option payoffs increased. Similarly, responder demands tended to increase when their outside option payoffs increased. Overall, the rejection rate was increasing in the asymmetry between the proposer and responder outside option payoffs.

A similar modification of the ultimatum game model involves replacing the zero-earnings rejection outcome with a costly conflict that has a random outcome in which the proposer has a known probability of success. The winner of the conflict wins the entire “pie” or resource, and each player incurs a cost of conflict, irrespective of who wins. The model’s predictions diverge from those of the ultimatum game because initial proposer demands should take into account the conflict costs and win probability. Nevertheless, the conflict and ultimatum bargaining games are similar in that a strategic, selfish proposer should offer the responder just enough to induce acceptance. Despite the potential for conflict, it should not be observed in equilibrium, and proposer demands should be aggressive in the sense that there is no money “left on the table.” In other words, proposers should send an offer that makes the responder slightly better off by accepting than by rejecting.

This conflict bargaining model is motivated by situations in which corporations or nation-states negotiate “in the shadow of power.” In such cases, bargaining can be costly in terms of legal or military expenditures. It is typically the case that one party may have a greater likelihood of winning, regardless of whether that player is the first or second mover. The model is useful in analyzing labor disputes, court cases, and territorial negotiations between nation-states that can be backed with the threat of war (Fearon, 1995; Powell, 1999).

By introducing an element of competition into this variation of the ultimatum game, experimental tests of a two-stage bargaining model with conflict have found some weak support for the subgame perfect Nash predictions (Sieberg et al., 2013).Footnote 9 Observed patterns of proposer demands were qualitatively similar to theoretical predictions as a function of the proposer win probability, but the demand patterns were less responsive to proposer win probabilities (flatter) than predicted values, and conflict occurred more often than the theory suggests.Footnote 10 Conflict with a random outcome was also used by Pecorino and Van Boening (2010) in an experiment framed as tort litigation between a plaintiff, who makes a pre-trial settlement offer, and a defendant who may accept the proposed settlement or reject and incur a litigation cost with a random outcome. They report lower settlement offers and rejection rates than are observed in comparable ultimatum games with no trial phase appended.Footnote 11

2.2 Analysis of Rejection Rates

The presence of rejections in ultimatum games is important because in equilibrium optimal demands should be made and accepted with no conflict. Most of the literature, however, has tended to focus on proposer demands, and on the various alterations to the format of the ultimatum game that tend to produce more gamesman-like demands. Reactions to these demands, in terms of either acceptance or rejection, are often secondary in the analysis. Harrison and McCabe (1996) do examine reactions, because their design involves subjects deciding what demands they would make and what demands they would accept, before knowing which role they will play. With both considerations constantly in mind for each round, reactions also tend to match theoretical predictions. Other articles focus more closely on proposer demands. In their second game, in which original responders were invited to play the role of proposers, Binmore et al. (1985) did not have subjects play the role of responders. They had the proposers make demands and recorded them. As noted, their proposer demand results are intriguing, but the absence of responders means that potential acceptances or rejections are unknown. Others (Neelin et al., 1988) mention that some responders engage in rejections, but again their primary focus is on proposer demands. In contrast, Grosskopf (2003) highlights the role and importance of responder actions in both the traditional ultimatum game (one proposer and one responder) and in the version with one proposer and multiple responders. She notes that responders use rejections to punish unfair demands, particularly in the traditional form of the ultimatum game. Responders are less likely to reject offers when the game is modified to include multiple responders. The responders in her experiment seem to be aware of when they can use rejection as a punishment for inappropriate demands, and proposers tend to anticipate this behavior.

Grosskopf’s discussion of responder behavior is a major motivation for our analysis of the rejections. Instead of viewing conflict as irrational, perhaps it yields valuable experience in the game. We are interested in the role that punishment plays in the form of costly rejections and in how these rejections, combined with the differential probability of conflict success, provide educational signals: proposers learn what demands tend to yield the best results, and responders learn what proposer demands are relatively attractive and should be accepted. Because conflict can play such an important role in teaching both parties to the game what offers should be made and accepted, we should expect to see rates of conflict that are higher than the predicted rate of zero, at least in the first rounds of the experiment.

This paper explores a combination of modifications to the ultimatum game, drawn from the literature, to explore Binmore’s ideas of fairness. We use a simple one-stage bargaining model, with a conflict stage that follows a rejected initial proposal. We also incorporate opportunities to learn, role reversal and previous experience with the game. Our results support Binmore’s argument that fairness, if it occurs, is case-specific. We find that subjects’ demands are very close to the theoretical (subgame-perfect Nash) predictions, especially after experience and role reversal. However, in contrast to predictions, conflict is often observed. We examine the effect of conflict as a form of costly punishment and find that conflict plays an important role in moving demands away from fair divisions and in the direction of their predicted values.

3 The Ultimatum Conflict Bargaining Game

In a traditional ultimatum game, there is a fixed pie of size V. The proposer begins by making a demand d, or equivalently, an offer of (V − d) to the responder. An accepted offer is implemented, and a rejected offer results in payoffs of 0 for both. The subgame perfect equilibrium, assuming purely selfish preferences, is for the proposer to offer the responder the smallest positive amount possible. The responder will recognize that this amount is greater than nothing, and accept. Thus, if players have $10 to divide and the smallest unit that may be offered is $1, then the proposer should offer the responder $1 and demand $9. A $1 offer would be accepted by a responder who is motivated solely by own earnings. Thus, the proposer would end up worse off by offering anything more than the minimum positive amount of $1 in this case.
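The backward-induction argument for the traditional game can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ultimatum_spe` is hypothetical, offers are restricted to multiples of the smallest unit, and an indifferent responder (one offered exactly 0) is assumed to reject.

```python
def ultimatum_spe(pie, unit=1):
    """Backward induction for a purely selfish proposer and responder.

    Returns (demand, offer). Assumes a responder accepts only offers that
    are strictly better than the rejection payoff of 0.
    """
    best_payoff, best_offer = None, None
    for offer in range(0, pie + 1, unit):
        accepted = offer > 0                      # responder's last-stage choice
        proposer_payoff = pie - offer if accepted else 0
        if best_payoff is None or proposer_payoff > best_payoff:
            best_payoff, best_offer = proposer_payoff, offer
    return pie - best_offer, best_offer


print(ultimatum_spe(10, 1))  # (9, 1): demand $9, offer $1
```

With a $10 pie and a $1 unit, the sketch reproduces the demand of $9 and offer of $1 described in the text.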

The bargaining game with conflict has a similar format. The game starts with a status quo division of (q, 1 − q), where q is the initial allocation for the proposer.Footnote 12 The proposer begins by demanding an amount d and offering V − d to the other player. The responder can accept the offer of V − d or can reject. A rejection leads to conflict in which the proposer wins the entire pie with a known probability of success p. In the event of conflict, each party also incurs a cost. This cost could easily be modeled as being role-specific, or even outcome-specific, but for simplicity, we will work with a symmetric conflict cost denoted by c.

Assuming that both players are informed of the payoffs and the win probabilities, the game is solved via backward induction. Working backward from the final stage, consider the responder’s calculation: with probability p, the responder receives nothing and bears the cost of conflict, c, and with probability 1 − p, the responder wins the resources but also pays the cost of conflict. Thus the responder’s expected payoff of rejecting is (1 − p)V − c. If ε is the smallest payoff increment that will induce the responder to accept, the proposer will demand d and make an offer, V − d, that equals the responder’s expected payoff plus ε:Footnote 13

$$ V - d = \left( {1 -p} \right)V{-}c + \varepsilon \quad \left( {\text{proposer offer that incentivizes acceptance}} \right) $$
(1)

This equation can be solved for the equilibrium proposer demand d as an increasing linear function of the proposer win probability, p:

$$ d = c + pV - \varepsilon \quad \left( {\text{subgame perfect Nash proposer demand}} \right) $$
(2)

As noted previously, the responder should accept the offer when ε > 0, i.e., the rate of conflict should be zero in equilibrium.
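
The step from (1) to (2) is a one-line rearrangement, written out here for completeness using only the symbols already defined:

$$ d = V - \left[ {\left( {1 - p} \right)V - c + \varepsilon } \right] = pV + c - \varepsilon $$

Because the coefficient on p is V > 0, the equilibrium demand rises by 0.1V for each 0.1 increase in the proposer win probability.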

It is useful to rewrite the equilibrium proposer demand on the right side of (2) in a form that suggests an anchor-and-adjust procedure for the proposer demand:

$$ d = \left( {c + 0.5V - \varepsilon } \right) + \left( {p - 0.5} \right)V\alpha \quad \left( {\text{anchor and adjust at rate } \alpha \text{ from Nash}} \right) $$
(3)

where the anchor is the common conflict cost plus half of the pie value, minus the epsilon needed to ensure responder acceptance, and α is a behavioral adjustment parameter to the proposer win probability that is 1 in a Nash equilibrium. Note that the anchor of c + 0.5 V − ε is a “fair” amount, based on half of the “pie,” adjusted for the conflict cost.
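
A short numerical sketch may help in reading (3). The function names and the value α = 0.5 are hypothetical and chosen only for illustration, as are the values V = 10, c = 2 and ε = 1; exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def anchor_adjust_demand(p, V, c, eps, alpha=1):
    """Proposer demand from the anchor-and-adjust form, Eq. (3)."""
    anchor = c + Fraction(V, 2) - eps            # "fair" anchor: c + 0.5V - eps
    return anchor + (p - Fraction(1, 2)) * V * alpha


def nash_demand(p, V, c, eps):
    """Subgame perfect Nash proposer demand, Eq. (2)."""
    return c + p * V - eps


V, c, eps = 10, 2, 1                             # illustrative values
for tenths in (2, 4, 6, 8):
    p = Fraction(tenths, 10)
    full = anchor_adjust_demand(p, V, c, eps, alpha=1)
    flat = anchor_adjust_demand(p, V, c, eps, alpha=Fraction(1, 2))
    assert full == nash_demand(p, V, c, eps)     # alpha = 1 recovers Eq. (2)
    print(f"p={float(p):.1f}  alpha=1: {float(full):.1f}  alpha=0.5: {float(flat):.1f}")
```

With α = 1 the demands coincide with (2). With the hypothetical α = 0.5 they range only from 4.5 to 7.5 over the same win probabilities, which is the sense in which a smaller α produces a demand pattern that is less responsive to p.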

4 Experimental Design

The previous section described the general version of the ultimatum game with conflict. Our experiment involves a specific case, described here. Each negotiation interaction in the experiment involves an initial “pie” of size $10 which is given an original distribution of $3 to the proposer and $7 to the responder, which adds an element of realism that is held constant across all treatments.Footnote 14 The initial $3/$7 distribution was selected to create an asymmetry between the players in order to potentially provoke some conflict. If the initial share of the pie were equal, there might be more of a tendency towards equal divisions to preserve the status quo.

In each game, the proposer had the opportunity to make a proposed redistribution, an integer dollar amount, to the responder. If the responder agreed with the redistribution, each player received their agreed-on payoff amount. If the responder refused, then each player was penalized a conflict cost of $2, and the pie was divided based on the results of a random event, which was explained in the experiment as a comparison of the roll of a 10-sided die with the announced proposer win probability, p. The winner of the conflict would take the entire $10 pie, and the loser would receive nothing, although both would incur the $2 conflict cost. The players knew in advance the probability that the proposer would win the lottery. These probabilities were 0.2, 0.4, 0.6, and 0.8 and varied by treatment.

Subjects were constrained to make discrete integer dollar demands, rather than continuous amounts. In this case, the backward induction arguments still yield a pattern of initial demands that is an increasing function of proposer win probabilities, as determined by (2). With a probability of a proposer win of 0.6, for example, the responder would have an expected payoff of (1 − 0.6)(10) − 2 from rejecting an offer, which is 2. To make the responder prefer to accept the offer, the proposer should offer $3 and demand the amount $7, a proposal that should be accepted. Similarly, the predicted proposer demands in Eq. (2) with c = 2, V = 10, and ε = 1 will be 3, 5, 7, and 9 for p = 0.2, 0.4, 0.6, and 0.8.
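
The integer-demand predictions can be checked by brute force. The sketch below is an illustration rather than the software used in the experiment: the function name `best_integer_demand` is hypothetical, and the responder is assumed to accept only when the offer strictly exceeds the expected payoff from conflict.

```python
from fractions import Fraction


def best_integer_demand(p, V=10, c=2):
    """Backward induction over integer demands 0..V for win probability p."""
    responder_conflict = (1 - p) * V - c     # responder's expected payoff from rejecting
    proposer_conflict = p * V - c            # proposer's expected payoff if rejected

    def proposer_payoff(d):
        accepted = (V - d) > responder_conflict   # strict preference for accepting
        return d if accepted else proposer_conflict

    return max(range(V + 1), key=proposer_payoff)


predictions = {k / 10: best_integer_demand(Fraction(k, 10)) for k in (2, 4, 6, 8)}
print(predictions)  # {0.2: 3, 0.4: 5, 0.6: 7, 0.8: 9}
```

The enumeration returns the same demands of 3, 5, 7, and 9 that follow from Eq. (2) with ε = 1.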

For the parameters being used, the anchor point in (3) of c + V/2 − ε is 6 (= 2 + 10/2 − 1). Notice that this anchor point of 6 is based on a “fair” pie split (V/2). Another perspective on the anchor is that the initial endowment for the proposer is 3, so it is unlikely that a proposer would demand less, and a demand of the full pie of size 10 would presumably lead to conflict, so the remaining range of likely demands spans the range from 3 to 9, with a midpoint of 6. With these parameters, Eq. (3) specifies that the proposer demand deviates from an anchor of 6 in a manner that is linearly related to the proposer win probability p.

As noted above, the observed proposals for a two-stage bargaining game with conflict tended to be “too flat,” i.e. not as responsive to changes in proposer win probabilities as predicted by backward induction (Sieberg et al., 2013). To assess whether these results were affected by the perceived rationality of the opponent, we incorporated the element of experience to manipulate perceived rationality. In other words, players who have a long experience with a type of negotiation could behave differently than those who are new to the situation. The procedures to be described in the next section involve matching experienced players with new, inexperienced players, to see if there was any effect on their demands.

5 Experimental Procedures

There were 72 subjects, who were recruited with email announcements from the undergraduate student population at the University of Virginia, with 18 subjects in each of the four proposer-power treatments. Participants received $6 for showing up and were informed that they would be paid their subsequent earnings privately in cash, immediately after the experiment. The experiment consisted of multiple stages with various degrees of experience and role reversal. Subjects did not know the number of stages.

After subjects arrived and were seated in private booths, the instructions were read aloud. Half of the participants (six subjects in each experiment) were designated as “proposers,” and half (six) were “responders.” Each bargaining interaction involved an initial pie size of $10 (lab dollars), with an unbalanced initial allocation of $3 for the proposer and $7 for the responder. The experiment was conducted by selecting the Political Conflict option for the Veconlab Bargaining Game.Footnote 15 Each bargaining game interaction implemented the one-stage model described in the previous section. For each change between stages of the experiment (for example, between the initial one-shot game and the 6-round random matching game) the subjects were logged into new experiment “sessions” with new session code names, and instructions were again read aloud. To ensure that the subjects were assigned and maintained the designated roles (proposer or responder), the experimenters logged in each subject individually for each stage.

The proposers suggested an initial division of the $10, and the responders could either accept the suggested reallocation or reject it. If the responder rejected, then both players incurred a $2 conflict cost, and the $10 sum was awarded to the conflict winner as determined by a simulated toss of a 10-sided die with a known proposer win probability. Conflict is risky because the winner receives the full $10 and the loser receives nothing. After costs are subtracted, conflict payments are $8 for the winner and −$2 for the loser.
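The payoff rules above can be turned into a short calculation of the demands a responder should be willing to accept. The following is a minimal sketch, assuming a risk-neutral responder who accepts only when the offer strictly exceeds the expected conflict payoff, and assuming whole-dollar demands; the function names are hypothetical and are not part of the Veconlab software.

```python
from fractions import Fraction

PIE = 10            # lab dollars to be divided
CONFLICT_COST = 2   # paid by each player if the responder rejects


def responder_conflict_value(p_win):
    """Expected payoff to a risk-neutral responder from rejecting.

    p_win is the proposer's win probability, so the responder wins the
    whole pie with probability 1 - p_win and always pays the conflict cost.
    """
    p = Fraction(str(p_win))  # exact arithmetic avoids float rounding
    return (1 - p) * PIE - CONFLICT_COST


def largest_acceptable_demand(p_win, step=1):
    """Largest demand on the grid that leaves the responder strictly
    better off accepting than rejecting (an assumed acceptance rule)."""
    threshold = responder_conflict_value(p_win)
    best = None
    for demand in range(0, PIE + 1, step):
        if PIE - demand > threshold:
            best = demand
    return best


for p in (0.2, 0.4, 0.6, 0.8):
    print(p, largest_acceptable_demand(p))
```

Under these assumptions the calculation gives demands of 3, 5, 7, and 9 for the four win probabilities.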

Each session had 18 subjects and the same sequence of experience-role treatments, but each cohort used a different win probability of either 0.2, 0.4, 0.6, or 0.8, which stayed the same for all bargaining interactions for that cohort. One session was conducted for each proposer win probability.Footnote 16 Each session consisted of six stages in two Parts that are summarized in Table 1. Subjects received an initial role assignment in the first stage of Part 1, with 6 being proposers and 6 being responders in a one-shot game. The second stage kept the roles the same, and had the players play a series of six rounds in which proposers were randomly rematched with responders in each round, with the proposer win probability remaining unchanged. In the third stage, the roles of proposers and responders were reversed, and they were randomly rematched for six more rounds. After stage 3, the original responders were released (with payments).

Table 1 Treatment sequence for each cohort

In Part 2, the original proposers remained, and a new cohort of 6 inexperienced responders was brought into the lab for stages 4–6. All subjects were made aware of the asymmetry in experience. The three stages in part 2 matched those in part 1, beginning with a 1-shot game in Stage 4, followed by 6 rounds of random matching in stage 5 and six more rounds of random matching with reversed roles in the final stage 6. This setup enables us to compare the behavior of the original inexperienced proposers in the first (Part 1, stage one) 1-shot game with the same proposers (in Part 2, stage 4) after extensive experience, but with a new cohort of inexperienced responders.

Earnings from all parts were recorded and summed, and every lab dollar was converted into $0.50 in take-home pay. Total earnings (including the $6 show-up payment) were about $34 for the original proposers who stayed for about two hours, and about $20 for the original responders and for the substitute responders in Part 2, who only stayed for about an hour.

Combining the theoretical predictions with some of the results from modifications to the ultimatum game from the experimental literature, we make the following hypotheses:

H1. Proposer demands will be increasing in the proposer power (conflict win probability), as predicted in a subgame perfect Nash equilibrium.

H2. Deviations from Nash equilibrium proposer demands diminish with experience and role reversal.

H3. Conflict rates will be low, as predicted by the subgame perfect Nash equilibrium, and will diminish with experience and role reversal.

6 Experiment Results

6.1 Stage 1: Initial One-Shot Games

Proposers in the initial one-shot game tend to demand more as their conflict win probabilities increase, as indicated by the positive slope of the average proposer demand line in Fig. 1. For comparison, the subgame perfect Nash equilibrium (SPNE) predictions (3, 5, 7, and 9) are connected by the gray dashed line that is linear in the proposer win probability. There is, however, a systematic pattern in the deviations from these predictions. Average demands at the low proposer win probability of 0.2 are higher than the predicted value of 3. Average demands for the win probability of 0.4 are close to the predicted value of 5, which corresponds to the focal 50/50 split. Conversely, demands for higher proposer win probabilities of 0.6 and 0.8 tend to be lower than the corresponding Nash predictions of 7 and 9. This pattern of divergence from Nash predictions is not surprising: the first-stage results in Fig. 1 suggest a role for fairness, since a flat demand pattern tends to reduce proposed payoff inequalities.

Fig. 1 Average proposer demands for the initial one-shot game

In total, there was a separate 12-person session for each proposer-win treatment, so there are 6 proposer demands for each round in a given treatment. These proposer demands for the initial one-shot games are listed in Table 2. Proposers have the least “power” in the 0.2 win probability treatment, shown in the top row. In this case, the observed demands are all above the subgame perfect Nash demand of 3, with one exception. Rejections are indicated in bold, and it is apparent that all demands above 3 were rejected. In fact, half of the demands in the 0.2 treatment involved an even split. Proposers with higher probabilities of success, 0.6 and 0.8, however, tended to make demands that were too low relative to the Nash predictions of 7 and 9 respectively, although a significant proportion of those demands were still rejected, as indicated by the rejection rate column on the right side of the table.

Result 1: Proposer demands for the initial one-shot game are significantly driven by differences in proposer win probabilities, although this pattern is diminished by a tendency to propose more equal divisions, especially when the proposer is disadvantaged.

Table 2 Proposer demands by treatment: stage 1 single-period game with rejected demands in bold

Support: We use a Jonckheere test of the null hypothesis of no effect against the directional alternative hypothesis that proposer demands are higher for successively higher proposer win probabilities. The test statistic is J = 161, which is the number of “binary wins” in which a demand in a given category is higher than a demand in a category with a lower prediction (in a row above the particular demand being considered). This is a directional test of the hypothesis that proposer demands are higher for higher proposer win probabilities, and the result is significant at the 0.005 level (critical values shown in Holt, 2019, Table 13.7).
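The “binary wins” count behind the Jonckheere statistic can be sketched in a few lines. This is an illustration only: the demand values below are hypothetical placeholders rather than the Table 2 data, and the treatment of ties (counted as zero by default here) is an assumption, since the text describes only strictly higher demands as wins.

```python
from itertools import combinations


def jonckheere_j(groups, tie_weight=0.0):
    """Count "binary wins" across ordered groups.

    groups must be ordered from the lowest to the highest predicted
    demand. Each pair (x, y), with x from a lower group and y from a
    higher group, scores 1 if y > x and tie_weight if y == x.
    """
    j = 0.0
    for low, high in combinations(groups, 2):
        for x in low:
            for y in high:
                if y > x:
                    j += 1
                elif y == x:
                    j += tie_weight
    return j


# Hypothetical demands, six per treatment (NOT the data in Table 2).
hypothetical_demands = [
    [3, 4, 5, 5, 5, 6],  # p = 0.2
    [4, 5, 5, 5, 6, 6],  # p = 0.4
    [5, 6, 6, 7, 7, 7],  # p = 0.6
    [6, 7, 8, 8, 9, 9],  # p = 0.8
]
print(jonckheere_j(hypothetical_demands))
```

With four treatments of six demands each there are 6 × 36 = 216 cross-treatment comparisons, which puts the reported J = 161 in context as a share of the maximum possible count.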

6.2 Stage 2: Six Rounds with Random Matching

Some of the divergence from predictions in the one-shot game could be due to the lack of familiarity with the conflict consequences of power asymmetries involved. In this case, we expect results to come closer to the subgame perfect Nash equilibrium predictions in the random matching sequences that followed. In fact, proposer demands in the second stage with random matching are closer to predictions, as indicated by the dark dashed line in Fig. 2, which is closer to the gray dashed line predictions, especially for the low proposer power (p = 0.2) treatment on the left. However, we do see that average demands remain lower than predicted levels for the high proposer-power treatments with win probabilities of 0.6 and 0.8.

Fig. 2 Average proposer demands with random matching

Figure 3 shows the complete frequency distribution of demands with random matching for each treatment in stage 2. Notice that almost all of the demands are at predicted levels, which contrasts with the distribution in the prior one-shot game shown in Table 2, where only 9 of 36 demands are at predicted levels.

Fig. 3 Stage two: proposer demand frequencies by treatment, with random matching

To understand what prompted proposers to change their demands, it is helpful to look at the responder behavior. In theory, proposers make “minimal” demands that the responder would prefer to accept, so there should be no conflict in equilibrium, as indicated by the gray dashed line along the base of Fig. 4. The solid dark line shows the conflict rates for the initial one-shot game, and the solid gray line shows that some conflict persists for all power treatments with experience provided by random matching.

Fig. 4 Conflict rates and experience (stages 1, 2 and 3)

The high conflict rate for the first one-shot game with low 0.2 proposer win probability is unsurprising. These proposers demanded, on average, shares that were too high (4.67) relative to the SPNE demand of 3. Given the responder’s correspondingly high probability (0.8) of conflict success in this treatment, their expected value of conflict exceeds what they are being offered. The high conflict rate for p = 0.2 in Fig. 4 indicates that responders did not underestimate their own corresponding conflict success probabilities. These high conflict rates for p = 0.2 in the one-shot game (left side of the dark line in Fig. 4) may have resulted in lower demands for that treatment with the random matchings, and correspondingly lower rates of conflict. This reduction in conflict is consistent with the lower rates for the random matching treatment that are tracked by the dark dashed line in Fig. 4. These results echo the intuition provided by Grosskopf (2003), that rejections can be used as punishment for inappropriate demands. They also reflect Binmore’s assertion that behavior is motivated by self-interest, and that when fair demands conflict with bargaining power, self-interest will prevail (2020, 123).
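The comparison behind this argument can be written out as a brief worked calculation. It assumes a risk-neutral responder, and the symbol \(EV_{R}\) for the responder’s expected conflict payoff is introduced here only for illustration. With a responder win probability of 0.8, a pie of 10, and a conflict cost of 2, the expected conflict payoff exceeds the share left by the average stage 1 demand of 4.67:

$$ EV_{R} = 0.8 \times 10 - 2 = 6 \; > \; 10 - 4.67 = 5.33 $$

By contrast, the SPNE demand of 3 would leave the responder with 7, which is above the expected conflict payoff of 6.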

Table 3 shows average demands for the six proposers with random matchings. Notice the clear effect of proposer power; there is no overlap between rows. For example, the proposer with the highest average demand in the p = 0.2 row is below the lowest average for the p = 0.4 row.

Table 3 Average demands for each individual proposer (6 rounds of random matching)

The average proposal across all proposers with random matching is shown in the fourth column of Table 4, next to the analogous averages for the one-shot game in the third column. This comparison indicates that proposer demands with random matching are moving away from the anchor point of 6 and toward the Nash predictions, which corresponds to modal proposals at predicted levels in Fig. 3. These observations motivate our second result.

Result 2: Experience provided by random matching moves proposer demands closer to subgame perfect Nash predictions, especially for low proposer power treatments.

Table 4 Average proposer demands with original proposers and responders

6.3 Stage 3: Role Reversal

As noted in the introduction, prior research has indicated the importance of role reversal in terms of helping subjects understand the strategic landscape, which is conducive to more gamesman-like behavior. In our setup, the initial responders became proposers and vice versa in the third stage. The right-side column of Table 4 shows the average proposals for the six rounds of random matching following role reversal. Again, there is some flatness, but it is largely eliminated with role reversal and random matching: although demands are still slightly above the Nash equilibrium at p = 0.2 and slightly below it at p = 0.6 and p = 0.8, they are closer to predictions than previous demands. These average demands are shown as the dotted line in Fig. 2, which is much closer to the predicted values than was the case prior to role reversal.

Although the original responders appear to have learned what they can demand, experience does not seem to have taught the original proposers to avoid conflict in the face of an optimal demand. Relative to conflict in the original random matching rounds (gray line in Fig. 5), the conflict rates for the random matchings after role reversal (dark line) are no lower on average. It appears that having experienced conflict as a responder (and either winning or losing) provides subjects with an appreciation for what they can demand as proposers. These observations are summarized:

Result 3: Proposer demands are quite close to predicted levels with role reversal and random matching, although incidence of conflict does remain surprisingly high.

Fig. 5 One-shot proposer demands: inexperienced proposers in part 1 (dark line) versus experienced proposers in part 2 (gray line)

6.4 Stages 4–6 with Mixed Experience Effects in Part 2

To further assess the effect of experience and of role reversal on “gamesman” behavior (Binmore et al., 1985; Harrison & McCabe, 1996), we added three additional stages in which the original proposers were asked to stay, and six new responders were brought into the room, replacing the original responders. These subjects participated in a one-shot stage, followed by six rounds of random matching, and then by a role reversal with six more rounds of random matching. Thus, the key difference was that the original proposers had fresh experience with all roles in the game, whereas the responders in Part 2 had no prior experience, and this asymmetry was particularly apparent for the original proposers.

As expected, experience led proposers to make demands that were close to the SPNE predictions. Figure 5 shows that even in the initial one-shot game with new responders, the experienced proposers made demands (gray line) that are much closer to predictions on average than was the case in the stage 1 one-shot game with inexperienced proposers and responders.

Figure 6 shows how tightly these proposer demands in the one-shot game correspond to Nash predictions, even with inexperienced responders in stage 4.

Fig. 6 One-shot demands from experienced proposers in stage 4

Table 5 permits a comparison of average proposer demands after the introduction of new responders, in a one-shot game (Stage 4), with random matching (Stage 5) and with random matching and role reversal (Stage 6 on the right side). In all three treatments, these demands are quite close to theoretical predictions.

Table 5 Average proposer demands with experienced proposers and inexperienced responders

The conflict rate for the Part 2 (stage 4) one-shot game with an experienced proposer when the win probability was 0.8 is unsurprising. This situation saw a proposer who had already learned (from experience in both roles) about what the best demand should be given the high probability of success. The average demand made in this round was 8.67, as shown in the third column of Table 5, which is very close to the SPNE prediction of 9. For a new responder, however, this kind of high demand is likely to be unsettling, provoking some recourse to conflict. Figure 7 shows conflict rates for the one-shot games and the random matching rounds in the Part 2 mixed-experience setting (stages 5 and 6). As indicated by the lower conflict rate for the experienced random matching rounds, the new responders, given the opportunity to learn their likelihood of success in the event of a conflict, reduced their tendencies to reject.

Fig. 7 Conflict rates with experienced proposers (part 2)

As was the case in Part 1, the proposers and responders reversed roles in Part 2 after the initial set of random matching rounds. Although the initial responders in Part 2 were less experienced than the original proposers, they had clearly learned what kinds of demands were possible. The demand averages were close to the predicted values, but the conflict rates after this role reversal (dotted line) did not diminish, except for the high-power treatment.

Result 4: The extensive experience gained with playing both roles in Part 1 and Part 2 produced the most Nash-like behavior in terms of proposer demands, but this experience did little to reduce costly and inefficient conflict.

6.5 Anchor-and-Adjust Model Parameters and Experience Effects

The qualitative similarities in proposer power effects in Tables 4 and 5 are consistent with the responsiveness to proposer power implied by the anchor-and-adjust model specified above in (3). It follows from that equation that the parameters can be estimated with a linear regression of the form:

$$ d = A + \alpha x,\quad \left( {\text{anchor and adjust regression}} \right) $$
(4)

where the dependent variable d is the proposer demand, the constant A is the anchor, and the independent power variable x is the proposer power measure, 10(p − 0.5). Recall that the adjustment responsiveness parameter α is equal to 1 in a subgame perfect Nash equilibrium. For the pie size of 10, conflict cost of 2, and minimum demand increment of 1, the theoretical value of the anchor A is 6, calculated as equal division of the pie of 10, plus the conflict cost of 2, minus the minimum positive responder expected payoff incentive to accept, ε = 1.
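To illustrate how the anchor A and the responsiveness α in (4) could be recovered from demand data, the following minimal sketch fits the regression by ordinary least squares in plain Python. The function names are hypothetical, the “flat” demand vector is invented for illustration, and nothing here reproduces the authors’ estimation code or data.

```python
def power_measure(p_win):
    """Proposer power variable x = 10(p - 0.5) from the text."""
    return 10 * (p_win - 0.5)


def fit_anchor_and_adjust(p_wins, demands):
    """Ordinary least squares fit of d = A + alpha * x.

    Returns the pair (anchor, alpha).
    """
    xs = [power_measure(p) for p in p_wins]
    n = len(xs)
    x_bar = sum(xs) / n
    d_bar = sum(demands) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxd = sum((x - x_bar) * (d - d_bar) for x, d in zip(xs, demands))
    alpha = sxd / sxx
    anchor = d_bar - alpha * x_bar
    return anchor, alpha


p_wins = [0.2, 0.4, 0.6, 0.8]

# The SPNE demands lie exactly on the line with A = 6 and alpha = 1.
print(fit_anchor_and_adjust(p_wins, [3, 5, 7, 9]))

# A hypothetical "too flat" demand pattern yields a smaller alpha.
print(fit_anchor_and_adjust(p_wins, [5, 5, 6, 6]))
```

Because x is centered at p = 0.5, the fitted constant in this sketch is the demand predicted when neither side has a power advantage, which is why it can be read as an anchor.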

The anchor-and-adjust regressions for each stage are shown in Table 6. The anchor estimates in the top row are all close to the value of 6 that is predicted for ε = 1. The responsiveness estimates in the second row increase from a relatively low level of 0.46 in stage 1 to near-Nash levels that are close to 1 in the final stages. Finally, note that \(R^{2}\) values are lower in early stages when there is likely to be more “unexplained” variation due to experimentation and learning.

Table 6 Anchor-and-adjust proposer demand: d = A + α·10(p − 0.5) regression results by stage

The bottom row of Table 6 shows the number of rounds of experience, denoted by e, for the subjects with proposer roles in that stage. The decline in the responsiveness parameter (from 0.95 to 0.90) in stage 6 is not surprising since the proposers in that stage started the game in stage 4 and have the same experience as the “reverse-role” proposers in stage 3. The intuitive effect of proposer experience on responsiveness could be evaluated with stage-by-stage parametric tests, but there are 12 parameters being estimated and many tests that could be performed. Our focus, instead, is on the big picture, i.e. the overall effect of proposer experience. Therefore, we consider a combined regression for all stages, with responsiveness modeled as a function of the rounds of experience, denoted by e:

$$ \alpha = \overline{\alpha } + \beta /e,\quad \left( {\text{experience effects on adjustment responsiveness}} \right) $$
(5)

As experience increases, the second term in (5) diminishes, so the \(\overline{\alpha }\) estimate provides the limiting value of the responsiveness parameter, which can be compared with the full responsiveness of 1 that results in a Nash equilibrium. Using the reciprocal of e in (5) is a way to model the diminishing marginal effects of added experience. With this reciprocal specification, a negative estimate of β would indicate that responsiveness to proposer power increases with experience. A substitution of the experience process (5) into the linear regression specification in (4) results in an interaction term that is the product of the power variable x and the inverse experience measure (1/e). In this case, β is the estimated coefficient of this interaction term.
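Written out, the substitution of (5) into (4) described in this paragraph is:

$$ d = A + \left( \overline{\alpha } + \beta /e \right)x = A + \overline{\alpha }x + \beta \left( x/e \right) $$

so the combined specification has a constant A, a coefficient \(\overline{\alpha }\) on the power variable x, and a coefficient β on the interaction x/e.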

The anchor and adjust model that results from Eqs. (4) and (5) was estimated using data from all 6 stages combined (using 660 proposer demand observations) to obtain:

$$ d = \underset{(0.03)}{5.82} + \underset{(0.02)}{0.93x} - \underset{(0.06)}{0.54(x/e)} \quad R^{2}=0.88 $$
(6)

The low standard errors shown in parentheses indicate that all variables are highly significant. In particular, the negative and significant coefficient of the interaction term confirms the intuition from separate stage-by-stage regressions that experience tends to increase responsiveness to proposer power as stipulated in hypothesis H2. The coefficient of the proposer power variable x suggests that responsiveness would eventually converge to a level of 0.93, which is only slightly below the level of 1 that corresponds to Nash behavior. Finally, the constant term, 5.82, corresponds closely to the theoretical “fair” anchor of 6, calculated as half of the pie (5), plus the conflict cost (2), minus the minimum increment of about 1 needed to induce responder acceptance. In conclusion, the regression analysis of the combined model provides strong support for our second hypothesis, that deviations from Nash proposals will diminish with experience and role reversal.
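The estimated coefficients in (6) can be used to trace how the implied responsiveness changes with experience. The sketch below plugs the reported point estimates into (5) and (4); the function names are hypothetical, and it assumes e ≥ 1 (how experience was coded for proposers in the first stage is not specified here, and e = 0 would make the reciprocal undefined).

```python
ANCHOR = 5.82       # constant term reported in (6)
ALPHA_LIMIT = 0.93  # coefficient on x: limiting responsiveness
BETA = -0.54        # coefficient on the interaction term x/e


def responsiveness(e):
    """Implied responsiveness alpha = alpha_bar + beta / e, for e >= 1."""
    if e < 1:
        raise ValueError("this sketch assumes at least one round, e >= 1")
    return ALPHA_LIMIT + BETA / e


def fitted_demand(p_win, e):
    """Fitted demand d = A + alpha(e) * x with x = 10(p - 0.5)."""
    x = 10 * (p_win - 0.5)
    return ANCHOR + responsiveness(e) * x


for e in (1, 6, 12, 24):
    print(e, round(responsiveness(e), 3), round(fitted_demand(0.8, e), 2))
```

As e grows, the term β/e shrinks toward zero, so the implied responsiveness rises toward 0.93 and the fitted demand at p = 0.8 rises toward 5.82 + 0.93 × 3 = 8.61, somewhat below the Nash prediction of 9.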

7 Discussion of Learning and Conflict

Even though proposer demand behavior converges to subgame perfect Nash predictions in the presence of experience and role reversal, the results in various settings exhibit a common tendency for conflicts to occur in about a third of the bargaining interactions, which contradicts hypothesis H3. This section considers several factors that might contribute to these unpredicted conflicts.

7.1 One-Stage Versus Two-Stage Bargaining

Binmore et al. (1985) highlighted the difference between one- and two-stage games as critical in producing more gamesman-like behavior. In particular, they argue that a one-stage game should produce less strategic demands, because the responder may be more likely to reject irrationally. In contrast, in a two-stage game, the accept/reject decision is postponed to the second stage, making the decision-making process more fine-tuned.

“Our suspicion is that the one-stage ultimatum game is a rather special case, from which it is dangerous to draw general conclusions. In the ultimatum game, the first player might be dissuaded from making an opening demand at, or close to, the "optimum" level, because his opponent would then incur a negligible cost in making an "irrational" rejection. In the two-stage game, these considerations are postponed to the second stage, and so their impact is attenuated.” (Binmore et al., 1985, 1180).

Neelin et al. (1988) find results that contradict Binmore et al.’s claims that adding extra stages provides more experience and more gamesman-like behavior. They show that subjects are less likely to behave as theory predicts with more than two stages and with more extreme parameterizations that produce more sharply asymmetric predictions. Similarly, in a two-stage conflict bargaining game, Sieberg et al. (2013) report that proposer demands were flatter than predictions and that this flatness seemed to persist to some extent with experience. In contrast, the proposer demands in the one-stage game used in this paper approached the subgame perfect Nash predictions with experience and role reversal. This difference can be understood in the context of statements made by Binmore et al. (1988). They note that it is not surprising that their subjects’ demands in the two-stage game did not perfectly match predictions, because even they, the experimenters, had to work to deduce the predicted values. As such, subjects should be expected to use a rule of thumb. This intuition is also valid for the one- versus two-stage games with conflict. The subgame perfect Nash equilibrium for the one-stage game is much easier to compute, or learn through experience, than is the case for the two-stage game.

7.2 Overweighting/Underweighting

The addition of the random-conflict lottery to follow the rejection of an ultimatum can introduce a source of potential bias, since anomalous results for risky choices are quite common. It is possible that probabilities might be overweighted if they are small. In particular, if proposers overweight their low win probabilities and underweight their high win probabilities, at least initially, this might affect what they think responders would accept. This could cause them to demand “too much” when the actual win probability is low (0.2) and too little when it is high. Probability weighting effects should diminish over time as subjects learn from actual experience about what demands are likely to be rejected.

7.3 Rejections and Punishment

Sieberg et al. (2013) note that they cannot rule out the role that utility from engaging in conflict itself plays in explaining its high incidence. A utility of conflict, which could differ across individuals, would result in high rejection rates for some subjects and not others.Footnote 17 Aside from two subjects (both original proposers when playing the role of responder—one who rejected all but one offer in each part of the experiment and one who rejected all but one offer altogether), we find no “chronic rejecters” among the 72 subjects who participated.

Although rejections should not be observed in a subgame perfect Nash equilibrium with selfish preferences, it is not necessarily the case that the observed rejections are a sign of irrational (in terms of economic reasoning) behavior. Instead, rejections can be used as punishment for demands deemed unreasonable, even when the demands are technically “fair.” We provide two tables, similar to the analysis used in Binmore et al. (1985), to examine the average actions of the original proposers and to compare them to the average reactions those subjects had, while playing the role of responder, to various demand levels.

The top row of Table 7 shows a total count of those who “asked < NE” as a proposer in all treatments (9 in total out of 24 initial proposers). Those 9 subjects are categorized by column in terms of their subsequent behavior as a responder: “reject < NE,” “reject NE,” or “reject > NE”. This categorization is based on a determination of the most common response that the subject exhibited, which we refer to loosely as the “average” response. All but one of these subjects who tended to make below-Nash demands tended to reject demands that were at the NE (4 individuals) or above (4 individuals). These conflict choices seem to be consistent with the person’s own beliefs regarding the appropriate demand. Similarly, of the eight subjects who, on average, demanded the predicted level (the “asked NE” row), 5 rejected, on average, demands above Nash predictions, but tended to accept demands at or below that level. The three remaining subjects in that row demanded the Nash level but rejected demands at that level when acting as responders. The only subject who accepted a demand above the Nash level as a responder was in the third row, among those who “asked > NE” as proposers. That subject was in the p = 0.2 treatment, and many of the subjects in that treatment tended to demand too much, since demands in early rounds for that treatment tended to be 5 (a 50/50 split). Similarly, six of the seven subjects in the third row who made demands above the Nash predictions later rejected demands at or above the same levels. Altogether, nine of the subjects in Part 1 tended to reject demands that were less than or equal to what they themselves demanded on average.

Table 7 Actions taken in part 1, as responder, by original proposers

Table 8 provides a similar analysis for decisions made in Part 2 with experienced initial proposers and a new set of initially inexperienced responders. In this game, eight subjects tended to demand below the Nash prediction (six of these had done the same in Part 1). These subjects, on average, rejected demands at or above the Nash levels. Of the thirteen subjects who demanded the predicted level, ten rejected demands above that level but accepted demands at or below the Nash level. Interestingly, the three subjects who demanded above the Nash level also rejected demands above the Nash level. Altogether, in Part 2, only three of the subjects tended to reject demands that were lower than or equal to what they themselves demanded on average (recall that there were nine such subjects in Part 1).

Table 8 Actions taken in part 2, as responder, by original proposers

Conflict continued throughout the experiment, despite the equilibrium predictions of zero conflict, and despite the presence of subjects who were experienced with both roles. The data pattern, however, suggests that the experienced proposers tended, as responders, to reject offers that were higher than what they themselves deemed reasonable.

7.4 Directional Learning

Grosskopf (2003) highlights the role that rejections—as punishment for demands deemed unreasonable—can play in teaching subjects what demands they should make in what circumstances. Rather than viewing rejections as irrational, she notes, “Responders seem to reject when they know that their punishment will be effective.” (Grosskopf, 2003, 148). The idea is that proposers who experience unsatisfactory outcomes due to rejection will be more likely to adjust their proposals in an upward direction that might have produced a better result. Such directional adjustments are referred to as “Directional Learning.”

Andersen et al. (2011) also examine the role that rejections can play in punishing unfair offers in an ultimatum game. They use framing to elicit low offers in an experiment involving a range of stakes. They conclude that responders are more likely to reject offers that are generally deemed unfair. This tendency decreases, however, as the payoff stakes increase, even for unfair offers. Thus, rejections are less common when they are more costly (in high-stakes settings). They point out similarities between rejections in high-stakes ultimatum games and the use of costly punishment in other games (Andersen et al., 2011, 3436). In the other extreme, “cheap talk” expressions of feelings, which are a form of low-cost punishment, tend to substitute for actual rejections, which are more costly forms of punishment (Xiao & Houser, 2005).

Lönnqvist et al. (2015) further explore tactics that can be used to deter punishment in ultimatum games and in other circumstances in which there is uncertainty. They show that if proposers can blame unfair offers (2 out of 10) on the result of a coin flip that determines whether the offer is fair or unfair, then responders are less likely to punish, even in cases in which the outcome of the coin flip cannot be verified, and in which they suspect that the proposer lied about the outcome. Based on these results, the authors note the role that hypocrisy can play in avoiding punishment.

Chavez and Bicchieri (2013) approach the issue of punishment for violations of equal distributions by introducing the option for a third party to either punish or compensate either player in an ultimatum game. They find that third parties prefer to use compensation, and that they will reward proposers for equal payoff offers (and responders too, presumably to keep the payoffs equal) or will compensate responders for uneven distributions. They are less likely to use punishment unless the compensation option is unavailable.

We examine the role of rejections on subsequent behavior by the “punished” proposers, which is done with learning matrices that are similar to the transition matrices in Grosskopf (2003). There are two ways that a particular proposal can be considered successful ex post: either the proposal was accepted, or it was rejected by the responder but the proposer won the conflict, which we will label as an “unsuccessful rejection.” Figure 8, based on Part 1 data prior to role reversal, shows that a proposer who has experienced an acceptance or an unsuccessful rejection for a given demand is likely to make the same proposal in the next round or else demand slightly more. In contrast, successful rejections (that yield negative earnings for the proposer) have the effect of inducing proposers to decrease their demands in the next round. Both matrices suggest the presence of directional learning, highlighted by Grosskopf (2003). Interestingly, already in Part 1, there is evidence that proposers who make the predicted demands stay with them.
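The classification described above can be sketched as a small tabulation routine. This is an illustrative reconstruction rather than the procedure used to build the learning matrices: the record layout (demand, accepted, proposer_won) and the function names are hypothetical, and the sample history is invented.

```python
from collections import Counter


def classify_outcome(accepted, proposer_won):
    """Label a round from the proposer's point of view."""
    if accepted:
        return "accepted"
    # A rejection is "unsuccessful" when the proposer wins the conflict.
    return "unsuccessful rejection" if proposer_won else "successful rejection"


def direction(previous_demand, next_demand):
    """Direction of the demand change between consecutive rounds."""
    if next_demand > previous_demand:
        return "increase"
    if next_demand < previous_demand:
        return "decrease"
    return "same"


def learning_table(rounds):
    """Count (outcome, next-round adjustment) pairs for one proposer.

    rounds is a chronological list of (demand, accepted, proposer_won)
    tuples; proposer_won is ignored when the demand was accepted.
    """
    table = Counter()
    for (d0, accepted, won), (d1, _, _) in zip(rounds, rounds[1:]):
        table[(classify_outcome(accepted, won), direction(d0, d1))] += 1
    return table


# Hypothetical history for one proposer (not experimental data).
history = [(5, False, False), (4, True, None), (4, False, True), (5, True, None)]
for key, count in sorted(learning_table(history).items()):
    print(key, count)
```

Directional learning would show up in such a table as a concentration of counts in the “decrease” cells after successful rejections and in the “same” or “increase” cells after acceptances and unsuccessful rejections.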

Fig. 8 Analysis of direction learning, part 1, prior to role reversal

In Part 2, the results prior to role reversal indicate that the proposers have learned from having experienced both roles in the game. The majority of proposals are in the predicted range, and in the rare case of successful rejections (there were only 4), two of them stayed with the predicted demand and two, having demanded too much, dropped to the predicted demand. It is arguably the case that rejections, and the costly conflict that resulted, played a strong role in teaching proposers what demands were optimal in terms of maximizing expected payoffs.

The directional learning results shown in Figs. 8 and 9 were constructed on matchings that occurred prior to role reversal, e.g. stages 2 and 5. Role reversal (e.g. stage 3) provides even sharper results.Footnote 18 Experience of playing as a responder allowed the original responders, when roles were reversed, to move more swiftly towards the optimal demands and to stay there.

Fig. 9 Analysis of direction learning: part 2 prior to role reversal

8 Conclusion

Binmore argues that fairness, when it occurs, is a result of self-interested strategy rather than of some natural human instinct. Rather than valuing fairness in and of itself, Binmore argues that fair behavior should be expected to be case-specific. In particular, people engage in fair behavior when that strategy choice is expected to improve their payoffs. When superior payoffs can be obtained without being fair, Binmore expects people to refrain from fair behavior. Our results from an experiment using an ultimatum game with a conflict option support Binmore’s argument. Ultimatum game experiments frequently obtain results in which offers made appear to be fair. These results are so consistent that, as noted, a series of modifications have been made in experiments to see if less fair, more gamesman-like behavior can be provoked. Our experiment did just that. Despite similarities to a traditional ultimatum bargaining game, the conflict option awarded differential bargaining power. This power, combined with the frequently used choice to exercise it (engage in conflict), produced results that differed strongly from the mostly 50/50 splits generally seen in traditional ultimatum games. Indeed, the use of the conflict option taught subjects to make or accept offers that moved away from fair divisions toward the subgame perfect Nash equilibrium predictions. In line with Binmore’s argument, we find that subtle modifications to the ultimatum game allow people to abandon fair behavior in favor of self-serving action.