The one player guessing game: a diagnosis on the relationship between equilibrium play, beliefs, and best responses

Experiments involving games have two dimensions of difficulty for subjects in the laboratory. One is understanding the rules and structure of the game, and the other is forming beliefs about the behavior of other players. Typically, these two dimensions cannot be disentangled, as belief formation crucially depends on the understanding of the game. We present the one-player guessing game, a variation of the two-player guessing game (Grosskopf and Nagel 2008), which turns an otherwise strategic game into an individual decision-making task. The results show that a majority of subjects fail to understand the structure of the game. Moreover, subjects with a better understanding of the structure of the game form more accurate beliefs about other players' choices, and are also better at best-responding to these beliefs.


Introduction
Subjects in laboratory experiments consistently deviate from equilibrium behavior (Camerer 2003). Many models of bounded rationality try to explain these deviations through errors in belief formation (e.g., Nagel 1995; Ho et al. 1998; Weizsäcker […] guessing game we can disentangle the effects of beliefs from "game form recognition," and analyze to what extent subjects' understanding of the structure of the game determines their belief formation and their best responses. Our experimental results show that a majority of subjects fail to fully solve the one-player guessing game, and that subjects with a better understanding of the structure of the one-player guessing game play values closer to the Nash Equilibrium in the two-player guessing game. This implies that an important part of non-equilibrium play is likely due to the inability of subjects to fully understand the structure of the game. Additionally, we observe that subjects with a better understanding of the one-player guessing game form more accurate beliefs, are better at best-responding to their own beliefs, and tend to better adjust their beliefs according to the population they face. These results confirm the intuition that understanding the structure of the game is crucial for belief formation.

Experimental design
The experiment consists of four different parts. Subjects first play the one-player guessing game (1PG), followed by the two-player guessing game (2PG). After this, we elicit subjects' beliefs about other subjects' two-player guessing game choices. A subset of subjects then participates in an additional belief elicitation task ("What if" belief elicitation). At the end of the experiment, all subjects are asked to answer a battery of cognitive ability tests. In the following we describe each part of the experiment in more detail.

The one-player guessing game (1PG)
The one-player guessing game, first introduced in Bosch-Rosa et al. (2018), allows us to test whether subjects can solve the two-player guessing game introduced by Grosskopf and Nagel (2008) free of any strategic concerns. In essence, subjects play the role of both players in a two-player guessing game, i.e. they play the two-player guessing game "against themselves." Accordingly, each subject $i$ picks two numbers $x_i \in [0, 100]$ and $y_i \in [0, 100]$ and is paid depending on the absolute distance of each chosen number to the "target value," which is two thirds of the average of both numbers. The further away each chosen number is from this target value, the lower the payoff. Formally, the experimental payoff for choosing numbers $x_i$ and $y_i$ is: […] (1) Subjects are paid for both choices, so their combined payoff is: […]
The payoff function is maximized at $(x_i = 0, y_i = 0)$. This solution can be found through logical induction by starting with a random value $x_{0,i}$ and then calculating the "best response," which is $y^{*}_{1,i} = \frac{1}{2} x_{0,i}$, since a number equals the target value $\frac{2}{3} \cdot \frac{x + y}{2}$ exactly when $y = \frac{1}{2} x$. Following this, a "best response to the best response" can be calculated ($x^{*}_{1,i} = \frac{1}{2} y^{*}_{1,i}$), and so on until reaching the fixed point ($x^{*}_{\infty,i} = 0$, $y^{*}_{\infty,i} = 0$). By turning the two-player guessing game into an algebraic problem with no strategic uncertainty, we can separate those subjects who can solve the mathematical problem associated with the guessing game from those who cannot. 4
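The induction argument above can be illustrated with a short sketch. It works only with distances to the target value, because the exact payoff function is not reproduced here; the function names (`target`, `total_distance`, `iterate_best_responses`) are hypothetical and chosen for illustration, and the grid search over integers is an assumption made for simplicity.

```python
from fractions import Fraction

def target(x, y):
    """Target value: two thirds of the average of the two chosen numbers."""
    return Fraction(2, 3) * Fraction(x + y, 2)

def total_distance(x, y):
    """Sum of the absolute distances of both numbers to the target value."""
    t = target(x, y)
    return abs(x - t) + abs(y - t)

def iterate_best_responses(x0, rounds=10):
    """Alternate best responses; each best response is half of the other number."""
    x = Fraction(x0)
    path = []
    for _ in range(rounds):
        y = x / 2  # solves y = (2/3) * (x + y) / 2
        x = y / 2  # best response to the best response
        path.append((x, y))
    return path

# Illustrative assumption: search over integer pairs only.
grid_minimisers = [
    (x, y) for x in range(101) for y in range(101) if total_distance(x, y) == 0
]
```

Under these assumptions the only pair at zero distance from the target is (0, 0), and the alternating best responses shrink by a factor of four each round, so they approach zero without reaching it in finitely many steps.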

The two-player guessing game (2PG)
The two-player guessing game that we use is an adaptation of the one presented in Grosskopf and Nagel (2008) and Nagel et al. (2016). Subjects are matched in pairs and asked to simultaneously pick a number $z_i \in [0, 100]$. In Grosskopf and Nagel (2008) the winner is whoever picks the number closer to 2/3 of the average of both numbers, so unlike in games with $N > 2$ subjects, now $z_i = 0$ is a (unique) weakly dominant strategy. In our version of the 2PG, the payments are based on the (absolute) distance of each individual pick to 2/3 of the average of both numbers. Formally, the payment for player $i$ depends on the choices of player $j$ and her own in the following way: 5 […] (4)
This small change in payoffs dramatically changes the game, as now the equilibrium is reached through iterated deletion of strictly dominated strategies, and zero is no longer a weakly dominant strategy. Now the best response is to choose 1/2 of the number a player believes the other player chooses. 6 We opted for this modification of the original game for two reasons. First, it allows us to de facto ask subjects for a point estimate of their belief about the other subject's choice, and second, and more importantly, it makes the game comparable to the 1PG.
4 A penalty which increases in the distance to the target guess is crucial here, as with the more commonly utilized tournament incentives one of the two selves would always win, and the game would become trivial. Additionally, this payoff structure allows us to study the best response of subjects to their beliefs in the 2PG, as 0 is no longer a unique weakly dominant strategy. See Sects. 2.2 and 3.4 for more details.
5 Note that we limited the minimum payoff to zero in order to avoid potential losses for the subject.
6 See Nagel et al. (2016) for a lengthier discussion on the implications that the different payoff structures have in the 2PG.
Note that while certainly not standard, distance-based payoff structures are widely used in the literature. Güth et al. (2002) first utilized such a payoff structure, arguing that it more closely resembles the financial decision-making situations that beauty contests are often intended to emulate. Since then a number of experiments have used distance-based beauty contests. 7 Most relevant for our experiment is Nagel et al. (2016), who directly compare distance-based and tournament incentives in two player guessing games and find no significant differences across the choices of subjects.
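The best-response logic of the distance-based 2PG can be sketched as follows. The sketch assumes that the payoff is strictly decreasing in the distance to the target over the relevant range (the zero floor on payoffs is ignored) and that choices are integers between 0 and 100; the function names are hypothetical. It uses the identity $z_i - \frac{2}{3} \cdot \frac{z_i + z_j}{2} = \frac{2 z_i - z_j}{3}$, so minimizing the distance is the same as minimizing $|2 z_i - z_j|$.

```python
def scaled_distance(z_own, z_other):
    """Three times the absolute distance of z_own to 2/3 of the pair's average."""
    return abs(2 * z_own - z_other)

def best_responses(z_other, choices=range(101)):
    """All integer choices that minimise the distance to the target value."""
    best = min(scaled_distance(z, z_other) for z in choices)
    return [z for z in choices if scaled_distance(z, z_other) == best]

def pure_nash_equilibria(choices=range(101)):
    """Pairs of choices that are mutual best responses."""
    br = {z: set(best_responses(z, choices)) for z in choices}
    return [(a, b) for a in choices for b in choices if a in br[b] and b in br[a]]
```

Under these assumptions, the best response to an even number is exactly half of it, while an odd number has two integer best responses. This is why the integer grid admits the mutual best responses (0, 0) and (1, 1) and no others.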

Belief elicitation
After subjects had played the 1PG and the 2PG (with no feedback in both cases) we elicited their beliefs about the other players' decisions in the 2PG. Similar to Lahav (2015), subjects were asked to distribute a total of 19 "tokens" into 20 "bins".
Each token represented a subject in the session (each session consisted of 20 subjects), and each bin had a range of 4 integers that players could play in the 2PG (i.e. the first bin had the range [0,4], the second [5,9], and so on). See Fig. 11 in "Electronic supplementary material Appendix C" for a screen-shot of the experimental interface.
To incentivize subjects, we used a linear scoring rule that paid €0.10 for each token that overlapped with the choice of any other subject in the 2PG. For instance, if a subject put three tokens in the bin "5-9" and in her session only 2 subjects had actually played any value within this range, then she would receive a total of 20 cents for the tokens allocated in that bin. If, on the other hand, she placed 5 tokens in the bin "0-4" and 10 subjects had played a value in this range, then she would be paid 50 cents for the tokens allocated in that bin.
Formally, define $b_{ij}$ as the number of tokens that subject $i$ deposited in bin $j$, and $p_{-ij}$ as the number of subjects other than player $i$ that chose a value that falls within bin $j$ in the 2PG. Then the payoff for belief formation for subject $i$ is: […] (5)
The resulting distribution of beliefs provides an estimate of what subjects think about other subjects' choices, and allows us to analyze how subjects best-respond to their own beliefs. 8
7 See for instance Hommes et al. (2004), Costa-Gomes and Crawford (2006), Mauersberger and Nagel (2018), or Duffy (2016), who reviews some of these experiments among other types of macroeconomic experiments.
8 There is some discussion about how to best incentivize subjects to state their true beliefs. In particular, there is mixed evidence on whether incentive compatibility matters or not (Schotter and Trevino 2014). Methods to elicit beliefs beyond first moments, such as ours, are typically difficult for subjects to understand. Using a non-linear scoring rule would introduce an additional level of complexity. Hence, while not incentive compatible for risk neutral subjects, we opted for this approach because we believe it provides the best compromise between tractability for subjects and incentivization efficacy.
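A minimal sketch of the linear scoring rule, written to match the two worked examples in the text (three tokens against two subjects pays 20 cents; five tokens against ten subjects pays 50 cents). This amounts to paying 10 cents times $\sum_j \min(b_{ij}, p_{-ij})$, which is an inference from those examples rather than a formula quoted from the paper. The function names are hypothetical, and mapping a choice of 100 into the last bin is an assumption.

```python
def bin_index(choice, width=5, n_bins=20):
    """Bin of an integer choice; a choice of 100 is assumed to fall in the last bin."""
    return min(choice // width, n_bins - 1)

def belief_payoff_cents(tokens, others_choices, cents_per_token=10):
    """Pay cents_per_token for every token matched by another subject's choice."""
    counts = [0] * len(tokens)
    for choice in others_choices:
        counts[bin_index(choice, n_bins=len(tokens))] += 1
    # In each bin, only as many tokens are paid as there are subjects in that bin.
    return cents_per_token * sum(min(b, p) for b, p in zip(tokens, counts))
```

For example, with five tokens in bin "0-4", three in bin "5-9" and the remaining eleven in a bin nobody chose, a session in which ten others played in "0-4" and two in "5-9" would pay 50 + 20 = 70 cents under this sketch.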

"What if" belief elicitation
Since playing the 1PG could have an influence on the beliefs subjects form in the 2PG, we asked a subset of 40 subjects to additionally guess the choices of players in a 2PG who had not previously taken part in the 1PG. The incentives for this elicitation task are the same as the ones described above, and the data came from a random pick of 19 subjects from a sample of 80 subjects whom we had invited two weeks earlier to participate in a 2PG without previously taking part in the 1PG.

Cognitive ability
Gill and Prowse (2016) show that subjects who score higher in a Raven Test (Raven 1960) choose numbers closer to equilibrium, earn more, and converge quicker to equilibrium in a three-player guessing game. Since we are interested in studying the ability of subjects to solve the guessing game, we also tested the cognitive ability of our subjects. In particular, all subjects answered a Raven Test and played "Race-to-60," a variant of the Race game (see e.g. Gneezy et al. 2010; Levitt et al. 2011).
The Raven Test is a multiple choice test in which subjects must pick an element that best completes a missing element in a matrix of geometrical shapes (see an example in Fig. 12 of "Electronic supplementary material Appendix C"). The score of this test has been found to correlate with measures of strategic sophistication and the ability of subjects to solve novel problems (Carpenter et al. 1990). It is increasingly used in economic research due to its simplicity and the lack of required technical skills.
Since logical induction is a central element of the guessing games, we test this ability with the "Race-to-60" game. In this game, each participant and a computerized player sequentially choose numbers between 1 and 10, which are added up. Whoever is first to push the sum to or above 60 wins the game. The game is solvable by backward induction, and the first mover can always win by picking numbers such that the common pool adds up to the sequence [5; 16; 27; 38; 49; 60]. Subjects always move first and therefore, independent of the computer's backward induction ability, can always win the game.
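The backward-induction solution of "Race-to-60" can be checked with a short sketch. The function name `winning_targets` and its parameters are hypothetical; the rules encoded are the ones stated above (steps of 1 to 10, first to reach or exceed 60 wins).

```python
def winning_targets(goal=60, max_step=10):
    """Running sums a player should move to in order to force a win."""
    # mover_wins[s] is True when the player about to move at sum s can force a win.
    mover_wins = [False] * goal
    for s in range(goal - 1, -1, -1):
        mover_wins[s] = any(
            s + k >= goal or not mover_wins[s + k]
            for k in range(1, max_step + 1)
        )
    # Sums at which the player about to move loses are the sums worth moving to.
    targets = [s for s in range(goal) if not mover_wins[s]]
    return targets + [goal]
```

The targets are spaced eleven apart because whatever number the opponent adds (1 to 10), the next target can always be reached with a single move.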

Results
A total of 80 subjects participated in this experiment. All subjects were recruited through ORSEE (Greiner 2015) and were mostly undergraduate students with a variety of backgrounds, ranging from anthropology to electrical engineering or architecture. Sessions lasted one and a half hours and were run at the Experimental Economics Laboratory of the Technische Universität Berlin. Subjects who had previously participated in guessing game experiments were not invited. The experiment was programmed and conducted using z-Tree (Fischbacher 2007). For detailed results on the cognitive ability tests, see "Electronic supplementary material Appendix A".

The one player guessing game
In Fig. 1 we present the results of the 1PG in a scatter plot. Recall that in this case subjects have to pick two numbers, $(x_i, y_i)$; the first number is depicted on the horizontal axis, the second on the vertical axis. The diagonal dashed line marks the points where a subject picked the same number for $x_i$ and $y_i$. The solid circle indicates subjects who fully solved the game (0,0).
As can be seen, only a minority (≈ 31%) of subjects are able to fully solve the 1PG, i.e., pick zero for both numbers. In the remainder of this paper we will use this ability to fully solve the game as our primary measure of understanding of the structure of the guessing game.
Result 1 Only 31% of our subjects fully understand the one-player guessing game.
A further pattern in Fig. 1 is that subjects who play numbers closer together also play numbers closer to the origin. This is relevant, as in the 1PG there are two ways in which a subject (who has not fully solved the game) can improve her payoffs: by picking numbers closer to zero, and/or by picking numbers that are closer to each other. A Spearman test confirms the positive correlation between the average of both choices and the distance between them (Spearman's ρ = 0.83, p value < 0.001). As subjects with high payoffs played numbers that were close both to each other and to zero, one could interpret the payoffs of the 1PG as a measure of (partial) understanding of the structure of the guessing game. Therefore, we will use the payoffs of the 1PG as a secondary measure to complement our primary measure of understanding, "Solved 1PG"/"Not solved 1PG".

The two player guessing game
The left panel of Fig. 2 shows the distribution of choices in the 2PG, for subjects who have played the 1PG before. The distribution appears to be quite different from the typical distribution one sees with guessing game "first timers." The mass of the distribution is close to zero with 50% of subjects playing Nash Equilibrium. 13 The mean is 13.47 and the median choice is 2. As mentioned in Sect. 2.3.1 we also collected data on 80 subjects who played the 2PG without previously taking part in the 1PG. Choices of these subjects are presented in the right panel of Fig. 2. While a relatively large number of subjects with no prior 1PG experience also play Nash Equilibrium (28.75%), the overall distribution of 2PG choices for those subjects is significantly shifted to the right compared to that of subjects with prior 1PG experience (Kolmogorov Smirnov test, p value < 0.001 , see Fig. 14 in "Electronic supplementary material Appendix C" for cumulative density plots). This shift results in a mean and median choice without 1PG experience of 27.8 and 26 respectively.
The difference in behavior between both groups could be the result of two phenomena: introspective learning from having played the 1PG (Weber 2003), or a change in the beliefs of subjects that previously played the 1PG given that they are facing a more "experienced" pool of subjects (Agranov et al. 2012). 14 In Sect. 3.4.1 we show that shifts of beliefs are relatively small. Therefore, we attribute most of the difference in behavior to introspective learning. So, while most subjects are not able to fully solve the 1PG, there appears to be some learning that carries over to the 2PG.
Figure 3 shows the decisions of subjects in the 2PG on the vertical axis, and their payoffs for the 1PG on the horizontal axis. Subjects who fully solved the 1PG (solid circles) mostly chose zero in the 2PG (24/25, 96%), and picked significantly lower numbers in the 2PG than subjects who did not fully solve the 1PG (Mann-Whitney U test, p value < 0.001). In line with this, we also observe that subjects who earn higher payoffs in the 1PG play lower numbers in the 2PG (Spearman's ρ = −0.745, p value < 0.001).
Result 2 Subjects with a better understanding of the structure of the one-player guessing game play numbers closer to the Nash Equilibrium in the two-player guessing game.

Relationship between the 1PG and the 2PG
But is playing numbers near the Nash Equilibrium the best strategy in the 2PG? To answer this question we construct $\Pi^{2PG}_i$. This variable represents the […] This is because they play Nash Equilibrium, when payoffs would have been maximized by playing a number close to 9, as can be seen in the right plot. Overall, subjects who fully solved the game did not earn a significantly different payoff compared to subjects who did not fully solve the game (Mann-Whitney U test, p value = 0.465).
Analyzing our secondary measure of understanding of the structure of the game ($\Pi^{1PG}_i$) reveals a more nuanced pattern: it appears that there is a non-monotonic relationship between understanding of the structure of the game and expected payoffs in the 2PG (see Fig. 15 in "Electronic supplementary material Appendix C" for a close-up of Fig. 4). Regressing $\Pi^{2PG}$ […] understanding leads to increased expected payoffs, but that this relationship reverses for very high levels of understanding.
13 Note that in the 2PG there are two Pareto-ranked Nash Equilibria in pure strategies: (1) both players choose zero and (2) both players choose one. The latter is because the best response to the counterpart picking 1 would be picking 1/2. Since in our experiment subjects can only choose integers, both 0 and 1 are best responses. Such Pareto-ranked equilibria will always exist in laboratory experiments, as it is impossible to implement a truly continuous choice.
14 Arguably, some learning could be due to a switch to simple heuristics (e.g. "play zero") rather than real learning about the structure of the game. Using games that reverse the end of the strategy space containing the equilibrium, such as Ho et al. (1998) or Rick and Weber (2010), or games with interior equilibria as in Costa-Gomes and Crawford (2006), would allow one to distinguish between these different kinds of learning. We thank the editor for this comment.

Result 3
The relationship between understanding of the structure of the one-player guessing game and payoffs in the two-player guessing game follows a non-monotonic pattern.

Subjective beliefs
On the left panel of Fig. 5, we plot the number of tokens subjects have placed correctly in the belief elicitation task against the payoff in the 1PG. Subjects who fully solved the 1PG placed a larger number of tokens correctly (Mann-Whitney U test, p value = 0.001). 15 This result is confirmed by the strong correlation that we observe between the 1PG payoff and the number of tokens placed correctly (Spearman's ρ = 0.583, p value < 0.001). On the right panel of Fig. 5 we plot the distribution of tokens (horizontal axis) against the payoff in the 1PG (vertical axis). While subjects who did not fully solve the 1PG spread out their tokens across most of the strategy space, subjects who fully solved the 1PG expect their counterparts to play numbers closer to the Nash Equilibrium (Mann-Whitney U test, p value < 0.001). Again, the correlation between the distance of tokens to NE and payoffs in the 1PG confirms this result (Spearman's ρ = −0.359, p value = 0.001) (Table 1).
To test how the accuracy of beliefs relates to the understanding of the structure of the game, we plot the mean of the belief distribution of each subject against their payoff in the 1PG (vertical axis) on the left panel of Fig. 6. The vertical dotted line marks the mean choice across all subjects in the 2PG (13.63). The right panel of Fig. 6 plots the absolute distance of individual mean beliefs to mean 2PG play against earnings in the 1PG. Two things are clear from the graph. First, the mean beliefs of some subjects differ quite a bit from mean actual play in the 2PG. Second, subjects who fully solved the 1PG have a lower absolute difference between their mean beliefs and the mean choice of all subjects in the 2PG (Mann-Whitney U test, p value = 0.030). This result is supported using our secondary measure of understanding (Spearman's ρ = −0.473, p value < 0.001).
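The accuracy measure described above can be sketched as follows. How the mean of a binned belief distribution is computed is not spelled out in this text, so the sketch assumes that every token sits at the midpoint of its bin and that the last bin runs from 95 to 100; both are assumptions, and the function names are hypothetical. The value 13.63 is the mean 2PG choice reported above.

```python
def belief_mean(tokens, width=5, top=100):
    """Mean of a binned belief distribution, placing each token at its bin midpoint."""
    weighted = 0.0
    for j, b in enumerate(tokens):
        low = j * width
        # Assumption: the last bin absorbs the upper end of the choice space.
        high = top if j == len(tokens) - 1 else low + width - 1
        weighted += b * (low + high) / 2
    return weighted / sum(tokens)

def belief_error(tokens, mean_play=13.63):
    """Absolute distance between a subject's mean belief and mean 2PG play."""
    return abs(belief_mean(tokens) - mean_play)
```

A subject who places all 19 tokens in the first bin would have a mean belief of 2 under this sketch, and therefore a belief error of 11.63 relative to the reported mean play.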
Result 4 Subjects with a better understanding of the structure of the one-player guessing game form more accurate beliefs about their counterparts' choices in the two-player guessing game.
[…] negative correlation between 1PG payoffs and $\Delta z^{*}_i$ (Spearman's ρ = −0.531, p value < 0.001). These results imply that better understanding of the structure of the guessing game improves the ability to best respond to own beliefs.
Result 5 Subjects with a better understanding of the structure of the one-player guessing game choose numbers closer to the best response of their beliefs in the two-player guessing game.
15 Again, in order to avoid noise due to session specific outliers, we compute the number of correct tokens by comparing individual beliefs to the distribution of the 2PG choices we collected across all sessions. For more details see "Electronic supplementary material Appendix B".
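One way to compute a best response to an elicited belief distribution is by brute force, as in the sketch below. The exact definition of $\Delta z^{*}_i$ is not recoverable from this text, so the sketch is an illustration under stated assumptions rather than the paper's procedure: tokens are placed at bin midpoints, the payoff is treated as linear in the distance to the target (the zero floor is ignored), and choices are integers. All function names are hypothetical.

```python
def bin_midpoints(n_bins=20, width=5, top=100):
    """Midpoint of each bin; the last bin is assumed to run up to `top`."""
    mids = []
    for j in range(n_bins):
        low = j * width
        high = top if j == n_bins - 1 else low + width - 1
        mids.append((low + high) / 2)
    return mids

def expected_scaled_distance(z, tokens):
    """Token-weighted sum of |2z - m|, proportional to the expected distance to the target."""
    return sum(b * abs(2 * z - m) for b, m in zip(tokens, bin_midpoints(len(tokens))))

def best_responses_to_beliefs(tokens, choices=range(101)):
    """Integer choices that minimise the expected distance given the stated beliefs."""
    losses = {z: expected_scaled_distance(z, tokens) for z in choices}
    best = min(losses.values())
    return [z for z in choices if losses[z] == best]

def deviation_from_best_response(z, tokens):
    """Distance between an actual choice and the nearest best response to own beliefs."""
    return min(abs(z - br) for br in best_responses_to_beliefs(tokens))
```

For instance, a subject who puts all tokens in the bin "20-24" has a best response of 11 under these assumptions, so an actual choice of 20 would deviate from it by 9.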

"What-if" beliefs
As there could be some influence of having played the 1PG on the beliefs in the 2PG, we asked 40 subjects to use 19 tokens to guess the choices of 19 subjects that had played the 2PG "a couple of weeks ago, without having previously played the 1PG". We will refer to these distributions as "what-if" distributions, as opposed to the elicited distributions in the belief elicitation part of the experiment which we will refer to as "original" distributions.
We plot the resulting aggregated distributions in Fig. 8. At first glance, the differences between what-if and original distributions appear to be small. […] for the difference in means and variance respectively). This indicates that subjects adjust their beliefs depending on the population they face.
To get a better understanding of how subjects change their beliefs in the 2PG when faced with different populations, we plot the difference in means between the what-if and original distributions ($\Delta B_i$) against the individual payoff from the 1PG ($\Pi^{1PG}$) in Fig. 9. In this figure, any value above the horizontal dotted line indicates a shift of the what-if distribution, with respect to the original one, away from the NE.
As can be seen in Fig. 9, whenever subjects who fully solved the 1PG adjust their beliefs, they seem to do so in the right direction (i.e., away from the NE). However, we cannot reject the null hypothesis of no differences in the distribution means ($\Delta B_i$) between subjects who fully solved the 1PG and those who did not (Mann-Whitney U test, p value = 0.683). Using our secondary measure of understanding, we find a marginally significant correlation between payoffs in the 1PG and $\Delta B_i$ (Pearson's r = 0.308, p value = 0.053), but we cannot reject the hypothesis that this relationship is not monotonic (Spearman's ρ = 0.262, p value = 0.101). Therefore, if we interpret a higher payoff in the 1PG as a better understanding of the structure of the game, then it would appear that a better understanding is associated with better adjustment of beliefs in response to facing an inexperienced population.
Fig. 9 Payoff in the 1PG on the horizontal axis, and change in the mean between belief distributions ($\Delta B_i$) on the vertical axis. Any value of $\Delta B_i$ above zero is a change in the mean away from the Nash Equilibrium.
Result 6 There is weak evidence suggesting that subjects with a better understanding of the structure of the one-player guessing game are better at adjusting their beliefs in response to facing an inexperienced population in the two-player guessing game.

Conclusion
In laboratory experiments, subjects often deviate from equilibrium play. These deviations can result either from subjects not understanding the structure of the underlying game or from subjects not forming correct beliefs about the strategies of their counterparts. One strand of the literature has tried to explain these deviations as errors in belief formation (e.g., Costa-Gomes and Crawford 2006; Ho et al. 1998).
Yet, some recent research shows that subjects might not fully understand the experimental environment. In this paper we use an individual decision-making task that allows us to uncouple subjects' understanding of the game from their belief formation, and thus to establish to what extent understanding of the structure of the game contributes to non-equilibrium play in our experiment.
We find that a majority of subjects fail to fully understand the structure of the game. Moreover, subjects who understand the structure of the game play closer to the Nash Equilibrium, are better at best-responding to their own beliefs, and seem to modify their beliefs (correctly) depending on the population they are facing. This result is inconsistent with models of the Level-K type (e.g., Costa-Gomes and Crawford 2006), which assume that agents fully understand the structure of the game and only play out of equilibrium due to flaws in belief formation. Our findings suggest instead that out-of-equilibrium play is not only the result of a limited ability to form correct beliefs, but also results from the inability of subjects to fully understand the game's structure.
In light of these results, we believe the 1PG could be a useful "quick and easy" test for researchers interested in, or aiming to control for, understanding of the structure of guessing games. More generally, we believe that the reduction of strategic games into one-player forms could be a useful tool in the analysis of other games too. Such a transformation would allow researchers to study the degree of understanding that subjects have of the structure of the game, and to control for any deviations from Nash equilibrium play independent of errors in belief formation.
Finally, a potential extension of this experiment could be to vary the number of selves subjects play in the 1PG and compare play to standard guessing games populated by the same number of strategic players. On the one hand, increasing the number of selves and strategic players increases the complexity of the game, and may therefore make understanding the structure of the game and belief formation more difficult. On the other hand, increasing the number of selves and strategic players may lead subjects to better understand the unraveling mechanic of the game. We leave it to future extensions of this work to test whether the general findings in this paper would also hold under such conditions.