Relaxing the symmetry assumption in participation games: a specification test for cluster-heterogeneity

We propose a novel approach to check whether individual behaviour in binary-choice participation games is consistent with the restrictions imposed by symmetric models. This approach allows in particular an assessment of how much cluster-heterogeneity a symmetric model can tolerate to remain consistent with its behavioural restrictions. We assess our approach with data from market-entry experiments which we analyse through the lens of ‘Exploration versus Exploitation’ (EvE, which is equivalent to Logit-QRE) or of Impulse Balance Equilibrium (IBE). We find that when the symmetry assumption is imposed, both models are typically rejected when assuming pooled data and IBE yields more data-consistent estimates than EvE, i.e., IBE’s estimates of session and pooled data are more consistent than those of EvE. When relaxing symmetry, EvE (IBE) is rejected 17% (42%) of the time. Although both models support cluster-heterogeneity, IBE is much less likely to yield over-parametrised specifications and insignificant estimates so it outperforms EvE in accommodating a model-consistent cluster-heterogeneity. The use of regularisation procedures in the estimations partially addresses EvE’s shortcomings but leaves our overall conclusions unchanged.


Introduction
Over the last three decades, the literature on behavioural economics has proposed a number of models to explain various anomalies that can hardly be organised by the standard equilibrium approach. In the context of games, these models consider alternative preferences, traits and/or rationales whose relative explanatory powers have been assessed with laboratory experiments (see e.g., McKelvey & Palfrey, 1995; Selten & Chmura, 2008; Costa-Gomes et al., 2009; Crawford, 2013). While such a horseracing approach documents the models' relative goodness-of-fit performances and helps determine a 'best model', it leaves unanswered the question of whether the estimated models are indeed consistent with the restrictions they impose on individuals' behaviour. This article presents a novel approach and a specification test to address this question in the context of symmetric binary-choice participation games such as market-entry games, volunteer's dilemmas, discrete step-level public good and voter participation games. It contributes to the existing literature on this issue in two ways.
First, it provides a useful theory-based selection criterion for models whose explanatory powers can hardly be assessed otherwise than by their goodness-of-fit. This is the case with the Quantal Response Equilibrium model (QRE, McKelvey & Palfrey, 1995), a stochastic version of the Nash equilibrium that assumes players to best-respond to their own and to the others' payoff disturbances and whose predictions hinge upon the distributional properties of these errors (see Goeree et al., 2016, and the references therein). This model has proven remarkably successful in fitting the data of numerous experiments but its reliance on players' unobservable payoff disturbances has raised concerns about its falsifiability, see Haile et al. (2008). Goeree et al. (2005) addressed such concerns by determining restrictions on these disturbances to bracket QRE's falsifiability (see Goeree et al., 2016, for further discussion on this topic). Golman (2011) deals with this problem in the context of heterogeneous agents and provides conditions under which the behaviour of the representative agent of a pool of individuals may be rationalised by QRE. These conditions determine whether the aggregation of agents' payoff disturbances fulfils the i.i.d. assumption on which QRE builds, and they yield useful predictions for asymmetric binary-choice games by restricting the set of QRE-consistent choice frequencies. On the other hand, Melo et al. (2019) check whether players' behaviour in multiple games is consistent with the QRE hypothesis. Their procedure exploits a set of restrictions on agents' choices in different games and on these games' payoffs. It is also nonparametric in the sense that it does not require the distribution of payoff disturbances in a particular game to be specified.
Unlike these investigations which pertain to QRE settings, ours exploits the behavioural restrictions imposed by a symmetric model on individuals' participation rates only and therefore allows the comparison of different models, including QRE.
Second, it permits an assessment of a model's consistency with the assumption of 'cluster-heterogeneity', whereby individuals with common characteristics (e.g., their participation rates) are clustered together and share a common model-parameter to be estimated. It thus alleviates the problem of modelling heterogeneity, which typically raises questions about which relaxations of the "common knowledge" assumption(s) about what agents believe about others can be used while still allowing one to 'close' the model.¹ Rogers et al. (2009), for example, develop QRE models where heterogeneity is modelled either in terms of common knowledge beliefs about others' traits (as in Camerer et al., 2016) or of subjective beliefs, i.e., each player believes that the others' traits are i.i.d. from the same distribution as her/his own, which is assumed private information (as in Armantier & Treich, 2009).² Although making these modifications shows that assuming heterogeneity considerably improves the model's goodness-of-fit, it also heightens the question of the model's falsifiability since the presumed beliefs about others' behaviour remain difficult to assess. Our approach does not require additional behavioural assumptions about one's own or others' behaviour since it is based on observables, e.g., the players' participation rates; and it allows one to determine how much cluster-heterogeneity a symmetric model can tolerate to remain consistent with the restrictions it imposes on individual behaviour. While the symmetry assumption provides valuable normative predictions for policy recommendations such as the design of markets, contracts and/or bargaining legislations, it is rather unrealistic and thus restrictive. By considering an observable cluster-heterogeneity rather than a hypothetical heterogeneity in the players' beliefs, we can better assess a model's predictions and possibly broaden its range of applications.
We assess our approach with new data on market-entry games of complete information. These games suit our case well since they involve fairly straightforward incentives and may account for a relatively large number of players (which is needed for studying cluster-heterogeneity). These games have also been widely studied in the social sciences and laboratory experiments typically indicate that participants somehow manage to behave almost optimally since their participation rates often even out the expected profits from entry and from no entry (Ochs, 1990; Sundali et al., 1995; Zwick & Rapoport, 2002).³ This observation was first coined as 'magic'
We study these market-entry games through the lens of two stationary behavioural models: the 'Exploration versus Exploitation' dilemma (EvE) outlined in Nadal et al. (1998), Weisbuch et al. (2000), Kirman (2011) and Bouchaud (2013), which essentially entails a trade-off between maximising current and future profits, and Impulse Balance Equilibrium (IBE, Selten et al., 2005), which balances off the foregone expected payoffs associated with each possible choice. The details of these models are discussed in the next section and we highlight here two of their properties that motivate our experimental investigation. First, EvE is structurally equivalent to Logit-QRE and thus directly relates to the predictions of Goeree and Holt (2005); in brief, 'Exploration' in EvE corresponds to a 'purely random behaviour' in QRE, 'Exploitation' corresponds to a 'best-responding behaviour', and any mix of these two options corresponds to a 'stochastic best-responding behaviour'. Second, despite their different premises, EvE and IBE fit observed entry probabilities equally well in the range of EvE-consistent choice frequencies. And since the range of IBE-consistent choice frequencies is larger, the usual 'goodness-of-fit horseracing' would document nothing more than occurrences where IBE outperforms EvE and is therefore not pursued.⁴ Given these properties, we focus our analysis on the models' relative success with consistently organising behaviour in treatments that manipulate payoff levels (i.e., 'High' or 'Low') and payoff structures (i.e., with payoffs from entry depending on attendance in various ways). In addition, we document the sensitivity of our conclusions to the econometric procedures used, i.e., with(out) imposing symmetry, with(out) assuming homoscedastic errors, and with(out) regularisation of the errors' variance matrix.
We summarise our experimental findings in the following four points. First, imposing symmetry (as is usually done in the literature) yields significant IBE-estimates that are of similar magnitudes across aggregation levels (i.e., session or pooled data) and EvE-estimates that are either insignificant or that bear little consistency across aggregation levels. Second, relaxing symmetry and using OLS estimation methods leads the specification test to reject EvE with cluster heterogeneity less often than IBE no matter the payoff level or structure (17% vs 42% of all sessions). However, when considering the models' non-rejected specifications, EvE typically yields insignificant cluster-estimates, and most of its multi-clustered specifications are over-parametrised, i.e., their cluster-estimates are not significantly different from each other. This is not the case for IBE which, in addition, can rationalise the presence or absence of clusters of players with low participation rates. Third, these patterns hardly change when the estimations pertain to the second half of the experiments to account for participants' experience of play. Fourth, when estimating the models with more efficient econometric procedures with(out) regularisation, the EvE-specifications become more likely to be rejected (25% vs 17% of all sessions) and yield fewer insignificant cluster-estimates. Yet, most of the non-rejected EvE-specifications are still over-parametrised whereas our conclusions for IBE are hardly affected. In sum, our study indicates that IBE yields more consistent estimates than EvE when symmetry is imposed and that it accommodates cluster-heterogeneity better than EvE when it is relaxed.
⁴ The relative goodness-of-fit performances of these (and other) models have been investigated in the context of 2 × 2 constant- and non-constant sum games by Selten and Chmura (2008), Brunner et al. (2010), and Selten et al. (2010) who found no clear evidence of a 'best model'. Brunner et al. also extended the analysis to other binary-choice games to stress-test the particular parametrization of IBE used by Selten and Chmura (2008) but found no evidence of a 'best model'.
The next section presents the EvE and IBE models for market-entry games. Section 3 lays out the econometric procedures and our specification test for this class of binary-choice games. The experimental design and procedures are presented in Sect. 4. Section 5 reports the estimation results when symmetry in the players' choices is imposed and when it is relaxed. Section 6 concludes.

Two stationary models of market-entry games
Assume $n$ agents who independently decide whether to enter a market or not. Agent $i$'s decision is represented by a variable $d_i$ that takes the value 1 if she enters and 0 if not. The payoff from not entering is constant and equal to $H$, whereas the one from entering is a function $G(\cdot)$ of the number of entrants $A = \sum_i d_i$. A congestion problem typically arises if for some integer value $c < n$, we have $G(c) \ge H > G(c+1)$. With such a reward scheme, any vector of decisions $d$ such that exactly $c$ out of $n$ agents choose to enter constitutes a pure Nash equilibrium. There are exactly $\binom{n}{c}$ such equilibria, each yielding an aggregate payoff equal to $cG(c) + (n - c)H$.
There may also exist symmetric mixed-equilibrium strategies, i.e., strategies that equalize an agent's expected payoff from entering, $\pi_E$, to that from not entering, $\pi_{NE} = H$. That is, if $p$ stands for the common probability of entry, then an equilibrium probability $p^{Nash}$ solves:
$$\pi_E(p) = \sum_{k=0}^{n-1} \binom{n-1}{k} p^{k} (1-p)^{n-1-k}\, G(k+1) = H, \tag{1}$$
where $k$ is a realization of the random variable $K$ characterizing the number of entrants other than oneself. Note that (1) requires that the $n$ agents behave symmetrically in that they all choose to enter with the same probability $p$; clearly, one could also consider asymmetric mixed-equilibria in which some agents enter with commonly known probabilities. For reasons that will become clear in Sect. 3, it is convenient to rewrite this expression as being conditional on $p_{-i}$, the $(n-1)$-vector of entry probabilities for agents other than agent $i$:⁵
$$\pi_E(p_{-i}) = \sum_{k=0}^{n-1} \Pr(K = k \mid p_{-i})\, G(k+1) = H. \tag{2}$$
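To make the two expected-payoff expressions concrete, the following minimal Python sketch computes the expected payoff from entry under a common entry probability (binomial weights) and under a heterogeneous vector $p_{-i}$ (Poisson-binomial weights), and locates a symmetric mixed equilibrium by bisection. The payoff function `G`, the outside payoff `H` and the capacity `c` used here are hypothetical illustrations, not the experimental parameters, and the bisection assumes that the expected payoff from entry crosses `H` exactly once on (0, 1).

```python
from math import comb

def expected_entry_payoff(p, n, G):
    """Expected payoff from entry when each of the n-1 others enters with probability p."""
    return sum(comb(n - 1, k) * p**k * (1 - p)**(n - 1 - k) * G(k + 1)
               for k in range(n))

def others_entrant_distribution(p_others):
    """Distribution of K (number of other entrants) for heterogeneous entry probabilities."""
    dist = [1.0]
    for q in p_others:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1 - q)      # this agent stays out
            new[k + 1] += mass * q        # this agent enters
        dist = new
    return dist

def expected_entry_payoff_given(p_others, G):
    """Expected payoff from entry conditional on the vector p_{-i}."""
    return sum(mass * G(k + 1)
               for k, mass in enumerate(others_entrant_distribution(p_others)))

def symmetric_mixed_nash(n, G, H, tol=1e-12):
    """Bisection on pi_E(p) - H, assuming a single sign change on (0, 1)."""
    f = lambda p: expected_entry_payoff(p, n, G) - H
    lo, hi = 0.0, 1.0
    if not (f(lo) > 0 > f(hi)):
        raise ValueError("no sign change: bisection does not apply")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical two-step payoff: entry pays 2 while attendance is at most c, 0 otherwise.
n, c, H = 10, 6, 1.0
G = lambda A: 2.0 if A <= c else 0.0
p_nash = symmetric_mixed_nash(n, G, H)
```

When all components of $p_{-i}$ are equal, the two payoff functions coincide, which is a convenient sanity check before moving to heterogeneous entry probabilities.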

Exploration versus Exploitation: EvE
In this framework, agents aim at finding a compromise between maximizing their current payoff and keeping themselves informed about market conditions to maximize their future payoffs. In our context, we can think of changing market conditions driven by agents' irregular or stochastic entry behaviour. In this case, agents may find it worthwhile to sometimes explore the alternative option, i.e., entering or not entering the market. While the 'exploitation' part of the dilemma, i.e., the maximization of current payoffs, is straightforward, the 'exploration' part hinges upon the maximum entropy principle which captures the agent's information seeking behaviour (see Anderson et al., 1992).⁶ In brief, an agent seeking maximal information from her/his decisions would explore each alternative with equal probabilities so that entropy is maximized whereas an agent who does not seek information would clearly avoid exploring and would focus on maximizing current payoffs, so the weight on entropy is minimized. This framework was first used by Nadal et al. (1998) for the study of buyer-seller interactions and we adapt it here for the analysis of market-entry games. Denote agent $i$'s probability of entry by $p_i$ and that agent's expected payoff from entry in terms of the probabilities of entry of the $n - 1$ other agents by $\pi_E(p_{-i})$. Using Shannon's measure of entropy $S_i = -p_i \ln p_i - (1 - p_i) \ln(1 - p_i)$ with $p_i$ neither 0 nor 1, the agent's objective function to maximise is then given by:
$$p_i\, \pi_E(p_{-i}) + (1 - p_i) H + \mu S_i,$$
where $\mu \ge 0$ is a parameter capturing the weight that agent $i$ assigns to the preservation of information about market conditions for long term profits. Differentiating this expression with respect to $p_i$, we obtain the following first-order condition for maximisation:
$$\pi_E(p_{-i}) - H - \mu \ln\frac{p_i}{1 - p_i} = 0,$$
or equivalently (with $\lambda = 1/\mu$)
$$\ln\frac{p_i}{1 - p_i} = \lambda \left[\pi_E(p_{-i}) - H\right]. \tag{3}$$
This yields a system of $n$ equations if there are $n$ agents, and given the homogenous weighting parameter $\mu$, this should be solved for the vector $p^* = (p_1, p_2, \ldots, p_n)$. Under the assumption of symmetry, $p_{-i}$ has all its components equal to $p_i$, which we simply denote by $p$, and thus $p$ and $\lambda$ are related by:
$$\ln\frac{p}{1 - p} = \lambda \left[\pi_E(p) - H\right],$$
or equivalently
$$p = \frac{1}{1 + \exp\{-\lambda \left[\pi_E(p) - H\right]\}}. \tag{4}$$
Note that this exactly matches McKelvey and Palfrey's definition of a Logit-QRE (with $\lambda$ standing for the agents' homogenous 'best-responsiveness') so that the models are structurally equivalent if agents' payoff shocks in QRE are extreme-value i.i.d. and if EvE assumes Shannon's entropy measure.⁷ Thus, if rational agents behave symmetrically and do not explore, then $p$ is such that $\pi_E(p) = H$, i.e., $p = p^{Nash}$ and $\lambda \to \infty$. On the other hand, if they maximise exploration, then they choose $p$ such that $p = 1 - p = 0.5$, so that $\lambda \to 0$. If $p > 0.5$, $\lambda$ is positive if $\pi_E(p) > H$ and it is negative (theory-inconsistent) otherwise. The Maximum Likelihood estimate of $p$, assuming independent observations, is the relative frequency of entry, $\bar{d}_n$, and the Maximum Likelihood Estimator (MLE) of $\lambda$ follows from (4). Note that $\bar{d}_n$ remains a statistically consistent estimator for $E(d) = p$ for less restrictive covariance structures of the observations, by various flavours of the weak laws of large numbers.
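As an illustration of how an estimate of the EvE parameter can be backed out from an observed entry frequency under symmetry, the sketch below assumes the standard logit relation, i.e., that the log-odds of entry equal the parameter times the expected payoff difference between entering and not entering. The payoff function `G`, the values of `H` and `c`, and the function names are hypothetical and serve only to show the sign pattern discussed above.

```python
from math import comb, exp, log

def expected_entry_payoff(p, n, G):
    """Expected payoff from entry when each of the n-1 others enters with probability p."""
    return sum(comb(n - 1, k) * p**k * (1 - p)**(n - 1 - k) * G(k + 1)
               for k in range(n))

def eve_lambda(p_hat, n, G, H):
    """Plug-in parameter value: log-odds of entry divided by the payoff gap (assumed logit form)."""
    if not 0.0 < p_hat < 1.0:
        raise ValueError("the entry frequency must lie strictly between 0 and 1")
    gap = expected_entry_payoff(p_hat, n, G) - H
    if gap == 0.0:
        raise ValueError("zero payoff gap: the parameter is not identified at this frequency")
    return log(p_hat / (1 - p_hat)) / gap

# Hypothetical two-step payoff, not the experimental one.
n, c, H = 10, 6, 1.0
G = lambda A: 2.0 if A <= c else 0.0

lam_random = eve_lambda(0.50, n, G, H)   # maximal exploration
lam_under = eve_lambda(0.55, n, G, H)    # entry between 0.5 and the mixed equilibrium
lam_over = eve_lambda(0.90, n, G, H)     # over-entry with p > 0.5
```

With these hypothetical payoffs, an entry frequency of 0.5 returns a zero parameter, a frequency between 0.5 and the mixed equilibrium returns a positive one, and over-entry returns a negative (theory-inconsistent) one.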

Impulse Balance Equilibrium: IBE
IBE basically assumes that if at some stage an alternative option would have yielded a higher payoff, then the agent receives an impulse to use this alternative in the next stage, i.e., agents only take account of foregone payoffs, as in Learning Direction Theory (Selten & Buchta, 1999). It is defined as the long run outcome of such stage-to-stage behaviour. In the context of market-entry games, an agent receives an impulse for entry if the payoff received from not entering is smaller than that from entering. Denoting by $I$ the number of other entrants and by $p$ the common probability of entering the market, the expected magnitude of these impulses for entry is defined as: […] or equivalently in terms of $p_{-i}$ rather than $p$: […] Similarly, an agent receives an impulse for no entry if the payoff received from entering is not larger than that from not entering. The expected magnitude of these impulses for no entry is defined as: […] or equivalently […] Note that these impulses are defined relative to the game's maximin pure strategy of not entering the market which yields a sure payoff of $H$. Selten and Chmura (2008) further observe that receiving a payoff lower than this sure payoff should be perceived as a loss. To this extent, and in the light of empirical and experimental evidence of loss aversion in agents' preferences (Benartzi & Thaler, 1995; Tversky & Kahneman, 1991), we follow Ockenfels and Selten (2005) and define an IBE for this market-entry game such that agent $i$ is indifferent between 'receiving $IMP_E(p_{-i})$ and entering' and 'receiving $IMP_{NE}(p_{-i})$ and not entering', where $\kappa > 0$ stands for an impulse weight. That is, agent $i$ would choose to enter the market with probability $p_i$ that equalises her expected weighted impulses: […] This impulse balance equation characterizes a long-run IBE in which participants no longer react to the expected impulses they receive. We could of course consider a short-run IBE, i.e., one that would solve $IMP_E(p_{-i}) = IMP_{NE}(p_{-i})$, but the resulting IBE for agent $i$ would then be independent of $p_i$ and, as shown in the next section, this would considerably limit the scope of our study.
⁷ See Appendix 2 for the derivation of the Logit-QRE for this game. The i.i.d. assumption is central to QRE and reverts to assuming that agents take their decisions independently and do not interact with each other in EvE models. The modelling of dynamics in settings with time-correlated decisions and heterogeneous agents quickly becomes intractable and the determination of equilibria is confined to special cases, see Bouchaud (2013) and Goeree et al. (2016). EvE has also been formalised in terms of 'rational inattention' by Matějka and McKay (2015), see Gabaix (2018) for a review of this literature. See also Evans and Prokopenko (2021) for an application of EvE with a state-dependent variant of Shannon's measure to study housing market data.
Finally, unlike Selten and Chmura (2008) who assume $\kappa = 2$, we estimate the impulse weight $\kappa$ by Maximum Likelihood (as for EvE) so the estimator of $p$ is $\bar{d}_n$ and the MLE of $\kappa$ (assuming symmetry and $p \ne 1$) follows from […]

A specification test: the Σ-test
When we assume symmetry, the models we consider only propose a reparametrization $\lambda(p)$ for EvE and $\kappa(p)$ for IBE. Thus, under symmetry, there is no scope for discriminating between these models beyond commenting on implausible values of $\lambda(p)$ and $\kappa(p)$. If we do not impose symmetry, then (3) and (7) can be rewritten as systems of linear restrictions on parameters $\lambda$ and $\kappa$: […] and […] Both systems can thus be written in the form $y(p) - \theta x(p) = g(p, \theta) = 0$, with $\theta = \lambda$ or $\kappa$, and with $y$, $x$ and $g$ vector functions with values in $\mathbb{R}^n$. The proposed formulation of (9) and (10) in terms of $p_{-i}$ makes it possible to express the EvE or IBE model for homogenous players (in the sense that they share a common single parameter) while still allowing for possibly different individual entry probabilities, and to design a specification test. A further possibility we shall explore is to allow for cluster-heterogeneous players, i.e., players with similar characteristics (e.g., entry-probabilities) whom the model considers identical by assigning them the same parameter. In this case, $\theta$ is a vector instead of a scalar and the length of the vector directly affects the power of the Σ-test since a vector of length $n$ represents full heterogeneity and leads to never rejecting the null of consistency.
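Grouping players by their entry probabilities can be illustrated with a one-dimensional k-means. The estimations reported later use k-means with 20 random initial values; the sketch below is a plain standard-library implementation of Lloyd's algorithm under that setting, not the authors' code, and the entry frequencies in the example are made up.

```python
import random

def kmeans_1d(values, k, n_init=20, n_iter=100, seed=0):
    """Lloyd's algorithm on scalars with n_init random starts; returns (labels, centres)."""
    rng = random.Random(seed)
    nearest = lambda v, centres: min(range(k), key=lambda j: (v - centres[j]) ** 2)
    best = None
    for _ in range(n_init):
        centres = rng.sample(values, k)          # random initial centres drawn from the data
        for _ in range(n_iter):
            labels = [nearest(v, centres) for v in values]
            updated = []
            for j in range(k):
                members = [v for v, lab in zip(values, labels) if lab == j]
                # keep the old centre if a cluster happens to be empty
                updated.append(sum(members) / len(members) if members else centres[j])
            if updated == centres:
                break
            centres = updated
        labels = [nearest(v, centres) for v in values]
        sse = sum((v - centres[lab]) ** 2 for v, lab in zip(values, labels))
        if best is None or sse < best[0]:
            best = (sse, labels, centres)        # keep the start with the lowest within-cluster SSE
    return best[1], best[2]

# Hypothetical entry frequencies for a session of 10 players.
p_hat = [0.10, 0.12, 0.15, 0.80, 0.85, 0.90, 0.88, 0.11, 0.82, 0.14]
labels, centres = kmeans_1d(p_hat, k=2)
```

Each resulting label identifies a cluster of players who would share one model parameter; a single cluster corresponds to homogenous players, and as many clusters as players corresponds to full heterogeneity.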
Given the asymptotically normal estimator $\hat{p}_T$ of $p$, the vector of individual entry frequencies, with asymptotic variance $V$ of which we describe a consistent estimator $\hat{V}_T$ in Appendix III.A, an optimal asymptotic least squares estimator of $\theta$ is:⁸ […] $\hat{\theta}_T$ is thus the GLS estimator in the regression of $y(\hat{p}_T)$ on $x(\hat{p}_T)$, the variance of the error term being $\hat{S}_T$.
Given a preliminary estimate of $\theta$, say $\tilde{\theta}_T$ obtained by replacing $\hat{S}_T$ in (11) with the identity matrix, i.e., $\tilde{\theta}_T$ is the OLS estimator in the regression of $y(\hat{p}_T)$ on $x(\hat{p}_T)$, a consistent estimator of $S$ is: […] The asymptotic variance of $\hat{\theta}_T$ is given by $V_{asy}(\hat{\theta}_T) = \left[x'(p) S^{-1} x(p)\right]^{-1}$ and a consistent estimator of it is […]. Under the null that there exists $\theta$ such that $g(p, \theta) = 0$ for the true $p$, or in other words that the restrictions on entry probabilities embodied by the model are valid, […] and this over-identification test can be used to test the underlying theory. All we need for the implementation of this specification test, for short the Σ-test, are thus $\hat{V}_T$ and the derivatives $\partial g_i(p, \theta)/\partial p_i$. The technical details for the determination of these expressions are given in Appendix III.B. The number of degrees of freedom is $n - 1$ when assuming homogeneity [i.e., the length of vector $\theta$ is 1, cf. (12)] and it is at most $n - K$ when assuming heterogeneous players sorted in $K$ clusters (i.e., the length of vector $\theta$ is $K$), as discussed in Appendix III.C.
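The preliminary (identity-weighted) step can be sketched as a no-intercept OLS regression of y on x with one slope per cluster. The sketch covers only this first step, not the weighted estimator or the Σ-test itself. For EvE it assumes that the regression variables take the standard logit form, with y as the log-odds of entry and x as the expected payoff difference given the other players' entry probabilities; these definitions, the payoff function `G`, the value of `H`, the entry frequencies and the cluster labels are all assumptions made for the illustration.

```python
from math import log

def others_entrant_distribution(p_others):
    """Distribution of the number of other entrants for heterogeneous entry probabilities."""
    dist = [1.0]
    for q in p_others:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1 - q)
            new[k + 1] += mass * q
        dist = new
    return dist

def eve_regression_data(p, G, H):
    """Assumed EvE variables: y_i = log-odds of entry, x_i = expected payoff gap given p_{-i}."""
    y, x = [], []
    for i, p_i in enumerate(p):
        others = p[:i] + p[i + 1:]
        payoff = sum(mass * G(k + 1)
                     for k, mass in enumerate(others_entrant_distribution(others)))
        y.append(log(p_i / (1 - p_i)))
        x.append(payoff - H)
    return y, x

def ols_by_cluster(y, x, labels):
    """No-intercept OLS of y on x, with a separate slope for each cluster label."""
    theta = {}
    for lab in sorted(set(labels)):
        idx = [i for i, l in enumerate(labels) if l == lab]
        theta[lab] = sum(x[i] * y[i] for i in idx) / sum(x[i] ** 2 for i in idx)
    return theta

# Hypothetical session: 10 entry frequencies, two-step payoff, two clusters.
G = lambda A: 2.0 if A <= 6 else 0.0
p_hat = [0.35, 0.40, 0.45, 0.60, 0.65, 0.70, 0.55, 0.50, 0.62, 0.58]
labels = [0, 0, 0, 1, 1, 1, 1, 0, 1, 1]
y, x = eve_regression_data(p_hat, G, H=1.0)
theta_tilde = ols_by_cluster(y, x, labels)
```

Because each player belongs to exactly one cluster, the regressors of different clusters never overlap and the identity-weighted estimate separates into one ratio per cluster.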
Note finally that since this test exploits the game's probabilistic structure by rewriting agents' probabilities of entry as a function of $p_{-i}$, it can be tailored for the assessment of behaviour in other binary-choice participation games like the volunteer's game, the (discrete) step-level public good game and voter participation games. This, of course, remains conditional on having well-defined predictions to test, as is the case for EvE and QRE in general but not necessarily for IBE since its long-run equilibrium may not always be defined.⁹

Experimental design and procedures
The experiments involve groups of 10 participants and a 2 × 3 factorial design which assumes two payoff levels, High and Low, and three payoff structures: one two-step payoff function (DISC) yielding a positive payoff $G$ from entering if attendance $A < c$ and 0 otherwise, and two non-monotone ones (NOM1 and NOM2) in which payoffs first increase and then decrease with $A$. The binary payoff structure of DISC implies that the players' choices are strict substitutes whereas the non-monotone structures introduce both strategic complementarity and strategic substitutability in the players' actions that have been theoretically studied in the context of global congestion games (see e.g., Karp et al., 2007) but whose effects in complete information settings have not yet been investigated experimentally.¹⁰ These payoff structures are displayed in Fig. 1, and the models' equilibrium relationships between $p$ and $\lambda > 0$ or $\kappa > 0$ for the treatments considered are shown in Fig. 2. For each payoff level, both DISC and NOM1 yield $\binom{10}{6} = 210$ Nash equilibria in pure strategies, unique mixed-equilibrium strategies and unique IBE strategies whereas NOM2 has one more equilibrium in pure strategies (where all agents choose not to enter), two mixed-equilibrium strategies and two IBE strategies (one with a low entry-probability and one with a high entry-probability).¹¹
We are interested in checking if and how behaviour is affected by these payoff structures and to what extent it is consistent with EvE and/or IBE when allowing for cluster-heterogeneity. In this regard, since the ranges of probabilities for which EvE and IBE yield model-consistent estimates in DISC and NOM1 lie between 0.5 and $p^{Nash}$ for EvE and in $[0, 1]$ for IBE, the models' cluster-estimates should lie within these ranges and be significantly different from each other for cluster-heterogeneity to be model-consistent and significant. It thus follows that the scope for IBE to accommodate the latter in these treatments is considerably larger than that for EvE.¹² A similar argument holds for NOM2 since the mixed-equilibria have different loci of consistent choice frequencies (defined either between 0.5 and $p^{Nash}_1$ or between 0 and $p^{Nash}_2$, with $p^{Nash}_2 < 0.5$) whereas the IBE equilibria have a unique locus (because both equilibria depend on a common $\kappa$), so the identification of model-consistent clusters of participants playing in such different (Nash or IBE) equilibria can be achieved with IBE but not with EvE.
⁹ In the Volunteer's dilemma game, for example, players receive a gain $G$ if at least one of them incurs the cost $C < G$ of volunteering. Assuming symmetric agents, the expected impulses from 'volunteering' and 'not volunteering' are then defined as […], respectively, so the long-run IBE would solve the equation $p\,IMP_V(p) = (1 - p)\,IMP_{NV}(p)$ whose solution $p \in (0, 1]$ may not exist for some values of $\kappa \ge 0$.
Our motivation to consider different payoff levels is to check whether the payoffs' magnitude affects the presence of model-consistent clusters of players, and thus to possibly complement the findings of McKelvey et al. (2000) who report no significant payoff-magnitude effect on the participants' QRE best-responsiveness in 2 × 2 games and evidence of a heterogeneous play.

The experiments were conducted at the Laboratory for Experimental Economics of the University of Jaume I (Spain). Participants were undergraduate students in Business Administration, Law or Engineering and were recruited by public advertisement on campus. We conducted eight sessions per payoff structure (DISC, NOM1, NOM2) with 10 participants per session, totalling 240 individuals. For each payoff structure, we conducted four sessions with Low payoffs and four sessions with High payoffs. The experiments were conducted with a between-subject matching protocol and participants could play in only one session. Upon arriving in the laboratory, they were randomly assigned to cubicles equipped with computer terminals and were given instructions that were read aloud.¹³ To avoid framing effects, we presented the game in neutral language by asking participants to choose between actions A and B. Each session involved 150 rounds of play, and at the end of each round, participants were only informed about the total number of players in their group who chose B ("No entry"), their own payoff in that round and their cumulated payoff. This information was appended to a "History" window that could be seen at any time during the experiment. Although participants played in fixed groups of 10, we believe that the provision of sparse end-of-round information feedback, combined with the relatively large number of players (10) and a relatively large 'market size-to-capacity' ratio (60%), renders entry-coordination very difficult to achieve. Each session lasted a maximum of 1 h, including the time needed to read the instructions. Participants were rewarded for each round of play at the rate of 0.02 € per 100 points and individual average earnings were €12.77 (i.e., €11.94 in the Low payoff sessions and €13.60 in the High payoff ones).

Results
We start with an overview of the data by displaying the evolution of averaged entry probabilities and their polynomial fits in Fig. 3. The plots suggest an under-entry ($p < p^{Nash}$) in all High payoff treatments, and that the 'magic' $p \approx p^{Nash}$ is more likely to hold when payoffs are Low, especially in NOM1 and NOM2. These entry patterns are also present in the session data (cf. Appendix V) and in line with the session and treatment average entry rates of Table 1.
The treatment (pooled) figures of Table 1 show no support for the predicted ranking of entry rates $p^{Nash}_{DISC} < p^{Nash}_{NOM1} < p^{Nash}_{1,NOM2}$. Pairwise comparisons indicate a substantially higher entry rate in NOM1 than in DISC and NOM2 when payoffs are High and similar entry rates when they are Low. Entry rates also significantly increase with the payoff level, as predicted in equilibrium and as reported by Zwick and Rapoport (2002) who study the effect of 'low' and 'high' entry costs in treatments with a similar 'market size-to-capacity' ratio (50%). We summarise this overview of the pooled data as follows:
Observation 0: (A) There is under-entry when payoffs are High. When payoffs are Low, there is (1) over-entry in DISC, (2) a weak support for the Nash mixed-equilibrium play in NOM1, and (3) under-entry (with respect to the high probability equilibrium) in NOM2. (B) The effect of the payoff structure is most salient when payoffs are High and yields a substantially higher average entry rate in NOM1. Average entry rates also significantly increase with the payoff level, as expected in equilibrium.
Fig. 3 Evolution of average probabilities of entry. Horizontal lines stand for the symmetric mixed-equilibrium predictions (we only consider the high-probability equilibrium of NOM2). Bold lines represent polynomial fits of degree 10
Table 1 Average entry probabilities. Each 'session' ('pooled') estimate refers to 1500 (6000) observations; Nash mixed-equilibrium predictions in italics; bold cells characterize instances where the symmetric mixed-equilibrium strategy cannot be rejected at the 5% level; 95% Confidence Intervals (based on Newey-West variance estimates) in brackets. (a) Significant over-entry, i.e., when $p^{Nash}$ is smaller than the lower bound of the 95% CI
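The confidence intervals of Table 1 rely on Newey-West variance estimates. As a rough illustration of the idea, the sketch below computes a Bartlett-kernel (Newey-West) variance for a sample mean of a series of entry decisions or per-round entry frequencies, and the corresponding 95% interval. The choice of series, the lag truncation `lags` and the function names are assumptions; the source does not state the bandwidth used.

```python
from math import sqrt

def newey_west_mean_variance(d, lags):
    """Newey-West (Bartlett kernel) estimate of the variance of the sample mean of d."""
    T = len(d)
    mean = sum(d) / T
    e = [v - mean for v in d]

    def gamma(l):
        # sample autocovariance at lag l (divisor T)
        return sum(e[t] * e[t - l] for t in range(l, T)) / T

    long_run = gamma(0) + 2 * sum((1 - l / (lags + 1)) * gamma(l)
                                  for l in range(1, lags + 1))
    return long_run / T

def entry_rate_ci(d, lags, z=1.96):
    """Mean entry rate with a normal-approximation confidence interval."""
    mean = sum(d) / len(d)
    half = z * sqrt(newey_west_mean_variance(d, lags))
    return mean, (mean - half, mean + half)

# Hypothetical series of per-round entry frequencies for one session.
rates = [0.6, 0.5, 0.7, 0.6, 0.4, 0.6, 0.7, 0.5, 0.6, 0.6]
mean_rate, ci = entry_rate_ci(rates, lags=2)
```

A symmetric mixed-equilibrium prediction would then be flagged as over-entry when it falls below the lower bound of the interval, which is how the table note defines it.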
Before estimating the models, we briefly assess the symmetry of individuals' entry probabilities. The bar-charts in Fig. 4 reveal minor differences in average entry probabilities between the sessions of a treatment, and large within-session disparities with clusters of participants displaying a similar entry behaviour.¹⁴ The data also show no support for the 'low probability' mixed-equilibrium of NOM2 so we will always refer to the 'high probability' equilibrium of this treatment when discussing our estimation results.

Structural estimations when imposing symmetry
Table 2 reports the (pseudo-)Maximum Likelihood estimation outcomes of EvE and IBE when assuming symmetric players and unknown forms of autocorrelation and heteroskedasticity in the errors. As the log-likelihood values contain no information about the model's goodness-of-fit beyond the estimated probability of entry $p$, we focus on the estimates' overall consistency with Observation 0, and on their data-consistency, i.e., that a treatment's session estimates are of similar magnitude and significance as the estimate for the pooled data.¹⁵
Looking first at the outcomes for EvE, it appears that except for NOM2/High, all sessions report insignificant or inconsistent (negative) estimates no matter if $p^{Nash}$ is rejected or not (cf. shaded cells) or if their average entry rates indicate under-entry (cf. Table 1 and Fig. 4). Such insignificant estimates support maximal exploration whereas inconsistent ones result from EvE's inability to rationalize over-entry when $p^{Nash} > 0.5$, as shown in Fig. 2. In the case of NOM2/High, they are all significantly positive and support a contained exploitation that is in line with the observed under-entry.
The pooled EvE-estimates indicate a contained exploitation in all High payoff structures and in NOM2/Low, and they are otherwise inconsistent (or almost so) as a result of over-entry.Thus, besides a significant under-entry in NOM2/High, the EvE-estimates provide no evidence of a data-consistent behaviour when the estimations impose a symmetric play.
This sharply contrasts with the outcomes for IBE since the session estimates are all significantly positive, typically larger when payoffs are High in DISC and NOM2, and similar across payoff levels in NOM1. This is confirmed by the treatments' estimates, whose pairwise comparisons further indicate that κDISC > κNOM2 > κNOM1 when payoffs are High and κDISC > κNOM1 ≈ κNOM2 otherwise. We summarise the above in the following observation:

Observation 1: When assuming symmetric players and estimating the models with pseudo-Maximum Likelihood methods:
(A) The EvE-estimates are data-consistent in NOM2/High and indicate a contained exploitation that is in keeping with the observed under-entry. Otherwise, they are data-inconsistent: session estimates mostly indicate maximal exploration whereas pooled estimates are either negative (thus inconsistent) or support a contained exploitation.
(B) The IBE-estimates are data-consistent and in keeping with Observation 0. They indicate: (1) κHigh > κLow in DISC and NOM2, and κHigh ≈ κLow in NOM1; (2) κDISC > κNOM2 > κNOM1 when payoffs are High and κDISC > κNOM2 ≈ κNOM1 when they are Low.

Structural estimations when relaxing symmetry
We now estimate the models without imposing symmetry and run our specification test to assess the consistency of the estimates with the restrictions that either model imposes on individual behaviour. Note that the Σ-test only suits the analysis of session data, i.e., games with n players.
For each session, we cluster the entry probabilities p i using the k-means procedure (with 20 random initial values) and estimate each model and its inverse form with K = {1, 2, 3, 4} clusters, each cluster having its own parameter (λ for EvE or κ for IBE).¹⁶ This generates eight specifications for each model and treatment, which we estimate with OLS procedures. For each session, model (IBE and EvE) and value of K, we select the 'best' specification in terms of the estimates' theoretical consistency and the credibility of their confidence intervals.

16 We use the inverse regressions x(p) on y(p) (cf. Sect. 3) as a convergence check, since both the direct and inverse approaches must yield the same optimum when we optimise the test statistic (12). Also, we do not consider specifications with more than four clusters because the high volatility and imprecision of parameter estimates with four clusters do not encourage going further. The technical details of how the agents' clusters are determined from their entry probabilities are provided in Appendix III.C. The alternative of clustering the (x, y) vectors defined below Eq. (10) would have led to different groupings for IBE and QRE and was therefore not pursued.
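The clustering step can be illustrated with a minimal sketch, which is not the procedure of Appendix III.C. It assumes a plain one-dimensional k-means (Lloyd's algorithm) with 20 random starts; the function names `assign` and `kmeans_1d` and the entry probabilities are hypothetical.

```python
import random

def assign(values, centres):
    """Label each value with the index of its nearest centre."""
    return [min(range(len(centres)), key=lambda j: (v - centres[j]) ** 2) for v in values]

def kmeans_1d(values, k, n_init=20, max_iter=100, seed=0):
    """1-D k-means (Lloyd's algorithm) keeping the best of n_init random starts."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_init):
        centres = rng.sample(values, k)  # random initial centres drawn from the data
        for _ in range(max_iter):
            labels = assign(values, centres)
            updated = []
            for j in range(k):
                members = [v for v, lab in zip(values, labels) if lab == j]
                # an empty cluster keeps its previous centre
                updated.append(sum(members) / len(members) if members else centres[j])
            if updated == centres:
                break
            centres = updated
        labels = assign(values, centres)
        inertia = sum((v - centres[lab]) ** 2 for v, lab in zip(values, labels))
        if best is None or inertia < best[0]:
            best = (inertia, labels, centres)
    return best[1], best[2]

# Hypothetical entry probabilities for one session of n = 10 participants.
p = [0.05, 0.10, 0.08, 0.50, 0.55, 0.52, 0.48, 0.90, 0.95, 0.92]
for K in (1, 2, 3, 4):
    labels, centres = kmeans_1d(p, K)
    print(K, labels, [round(c, 3) for c in centres])
```

Each value of K yields one partition of the session's participants, and each cluster of that partition would then receive its own parameter in the estimation.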
Next, for each session and model, we select the estimated specification with the smallest number of clusters, K Min, needed to not reject the Σ-test at α = 5%. Thus, the reported estimation results document the models' non-rejections of the Σ-test when K Min < 4, and their rejections or non-rejections when K Min = 4. Noting that a rejection with K Min = 4 can reasonably be seen as disqualifying the model when n = 10, we focus the discussion on specifications that do not reject the Σ-test.
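The selection rule for K Min can be sketched as follows. The function `select_k_min` and the p-values are hypothetical; the sketch only assumes that the Σ-test delivers one p-value per value of K and that the null is rejected when this p-value falls below α.

```python
def select_k_min(p_values, alpha=0.05):
    """Return (K_min, rejected) from a dict mapping K to the test's p-value.

    K_min is the smallest K whose specification is not rejected at level alpha.
    If every K is rejected, the largest K is returned and flagged as rejected.
    """
    for k in sorted(p_values):
        if p_values[k] >= alpha:
            return k, False
    return max(p_values), True

# Hypothetical p-values of the specification test for K = 1, ..., 4.
print(select_k_min({1: 0.01, 2: 0.03, 3: 0.20, 4: 0.40}))
print(select_k_min({1: 0.001, 2: 0.002, 3: 0.01, 4: 0.04}))
```

In the first call the model is retained with three clusters; in the second it is rejected even with four clusters, the case treated in the text as disqualifying the model.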
The estimation outcomes are relegated to Tables VII.A.1-4 in Appendix VII.A and, since they display no obvious pattern in terms of payoff structure, we start by summarising their main characteristics for each payoff level in the upper panel of Table 3. The first three columns tally the models' rejections and non-rejections of the Σ-test when K Min = 1 (i.e., homogeneity is not rejected) or when 1 < K Min ≤ 4 (i.e., homogeneity is rejected in favour of cluster-heterogeneity).
EvE is not rejected for a total of 20 sessions (out of 24, 83%) whereas IBE is not rejected for a total of 14 sessions (58%). Of these non-rejected specifications, EvE supports cluster-heterogeneity in 12 sessions (60%) whereas all non-rejected IBE-specifications do so. The summary tables in Appendix VII further reveal that both models are rejected for 4 sessions and that both are not rejected for 14 others. Since the remaining 6 sessions (25%) reject only IBE, it appears that EvE organises the observed behaviour best.

Table 3 Summary of specification test outcomes: OLS procedures. There is a total of 12 sessions per payoff level; K Min = 1 characterises homogenous players and 1 < K Min ≤ 4 cluster-heterogeneity. Detailed statistics refer to non-rejected specifications. Column headers: #(Rejections), #(Non-rejections). a % of over-parametrised specifications with 1 < K Min ≤ 4. b % of insignificant/inconsistent estimates. c % of individuals with insignificant estimates. * Including/relating to two inconsistent EvE estimates.

We proceed by checking whether the cluster-estimates of a specification (session) are heterogeneous with pairwise χ²-tests of equality. When all pairwise tests are rejected, the estimates are considered heterogeneous if all pairwise tests are also rejected when assuming K + 1 clusters and the clusters are nested; the pairwise test outcomes are summarised in the last columns of Tables VII.A.1-4 in Appendix VII.A. On the other hand, a single non-rejection of equality implies that the specification is over-parametrised, so the estimated cluster parameters are unreliable and one can only conclude that it has at most K Min − 1 clusters.
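The over-parametrisation rule can be illustrated with a short sketch. It assumes, purely for illustration, a Wald-type chi-square test with one degree of freedom between two independent cluster-estimates; the exact test statistic used in the paper is not reproduced here, and the additional check with K + 1 nested clusters is omitted. All names and numbers are hypothetical.

```python
import math
from itertools import combinations

def wald_equality_pvalue(est_a, se_a, est_b, se_b):
    """p-value of a Wald chi-square(1) test of est_a == est_b (independent estimates)."""
    w = (est_a - est_b) ** 2 / (se_a ** 2 + se_b ** 2)
    # Survival function of a chi-square variable with one degree of freedom.
    return math.erfc(math.sqrt(w / 2.0))

def is_over_parametrised(estimates, std_errors, alpha=0.05):
    """True if at least one pairwise equality test is NOT rejected at level alpha."""
    for i, j in combinations(range(len(estimates)), 2):
        p = wald_equality_pvalue(estimates[i], std_errors[i], estimates[j], std_errors[j])
        if p >= alpha:
            return True
    return False

# Hypothetical cluster-estimates and standard errors for K = 3.
print(is_over_parametrised([1.0, 5.0, 10.0], [0.5, 0.5, 0.5]))  # well-separated estimates
print(is_over_parametrised([1.0, 1.2, 10.0], [0.5, 0.5, 0.5]))  # two near-identical estimates
```

Under this rule, a specification whose clusters include two statistically indistinguishable estimates is flagged, which mirrors the statement that such a specification has at most K Min − 1 clusters.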
The last three columns of Table 3 refer to the non-rejected specifications of a treatment and report the percentages of (1) over-parametrised multi-clustered specifications, (2) insignificant or inconsistent estimates and (3) individuals affected by such estimates. The models sharply differ according to these criteria as EvE's specifications are far more likely to be over-parametrised than the IBE ones no matter the payoff level, i.e., a five-fold (three-fold) percentage difference when payoffs are High (Low). Most estimates of non-over-parametrised EvE-specifications are insignificant and none of these specifications yields estimates that fulfil the conditions to be considered heterogeneous. As for IBE, all estimates of non-over-parametrised specifications comply with these conditions when 1 < K Min ≤ 3, which leads us to conclude that, as expected, IBE accommodates cluster-heterogeneity better than EvE (cf. Sect. 4). Finally, about 50% of EvE's estimates are insignificant and affect some 37% of individuals no matter the payoff level, whereas for IBE the figures drop by at least half, especially when payoffs are Low.
We highlight treatment differences by assigning to each participant the estimate of the cluster s/he belongs to and by comparing the resulting cumulative distributions of estimates for High and Low payoffs in each payoff structure. These distributions are displayed in Fig. 5 (with the samples' median estimates); insignificant estimates were set equal to 0. To document the effect of the Σ-test on inference, the plots assume either (1) all estimates regardless of the sessions' Σ-test outcomes (cf. dashed lines), or (2) estimates of non-rejected specifications only (cf. plain lines). In this regard, the distributions pertaining to (1) and (2) reveal important differences only when non-rejected specifications are rare, as for IBE in NOM2/High.
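The construction of these distributions can be sketched as follows, under the assumption that each cluster comes with an estimate and a p-value for its significance. The helper names and all numbers are hypothetical.

```python
from statistics import median

def individual_estimates(labels, cluster_estimates, cluster_pvalues, alpha=0.05):
    """Give each participant the estimate of their cluster; insignificant ones become 0."""
    return [cluster_estimates[c] if cluster_pvalues[c] < alpha else 0.0 for c in labels]

def empirical_cdf(values):
    """Return (x, share of observations <= x) for each distinct x, in ascending order."""
    n = len(values)
    return [(x, sum(v <= x for v in values) / n) for x in sorted(set(values))]

# Hypothetical session: 10 participants in 3 clusters, cluster 0 being insignificant.
labels = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
estimates = {0: 0.4, 1: 2.0, 2: 12.0}
pvalues = {0: 0.30, 1: 0.01, 2: 0.001}

theta = individual_estimates(labels, estimates, pvalues)
print(empirical_cdf(theta))
print(median(theta))
```

The large steps of such a distribution correspond to the sizes of the clusters, which is why prominent clusters show up as large jumps in the plots.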
The distributions' large steps witness the presence of prominent clusters. In the case of EvE, the most prominent clusters consist of insignificant estimates and are found in DISC and NOM1 no matter the payoff level. There are also noticeable clusters of relatively large estimates supporting a more intense exploitation when payoffs are Low in DISC (with λi ≥ 5 for over 20% of participants) and in NOM2 (with λi ≥ 3.5 for about 40%). Such larger estimates counter-intuitively suggest that, for these participants, exploitation is more intense when payoffs are Low. This contrasts with NOM1 where the distributions are more alike across payoff levels and support the prediction that exploitation intensifies with payoffs, as the median estimates also qualitatively suggest.
As for IBE, the distributions look similar in NOM1 and suggest no particular 'payoff magnitude' effect. They also display no prominent clusters of large estimates and thus contrast with the distributions of DISC and NOM2, which both display such clusters when payoffs are High (κi > 10 for about 30% of participants in these treatments). The presence of such clusters in those treatments identifies participants with low entry-rates, and their absence in NOM1 is in line with Observation 1(B): (1) the distributions and median estimates suggest that κHigh > κLow in DISC and NOM2, and κHigh ≈ κLow in NOM1, and (2) the median estimates support κNOM1 < κDISC ≈ κNOM2 when payoffs are High.
We attribute the absence of such clusters in NOM1 and the higher participation in NOM1/High to the relatively lower risk of regretting entry that this structure entails when compared to DISC (which yields zero payoffs in case of over-entry) or to NOM2 (which bears an incentive to enter to avoid the risk of under-entry but whose highest payoffs obtain only when A = {3, 4, 5}, cf. Appendix IV.D).
All in all, allowing for cluster-heterogeneity in the estimations reveals important differences in the models' explanatory powers and indicates that IBE outperforms EvE in this regard. We summarise this as follows:

Observation 2: When relaxing symmetry and estimating the models with OLS procedures, the null of the Σ-test is less likely to be rejected for EvE than for IBE (17% vs 42% of all sessions, respectively). However, when compared to IBE, the non-rejected EvE-specifications are: (1) less likely to reject homogeneity, (2) more likely to be over-parametrised, (3) more likely to generate insignificant or inconsistent estimates that affect a larger proportion of participants, and (4) unable to rationalise the presence of clusters of players with low entry-probabilities. Thus IBE accommodates cluster-heterogeneity better than EvE.
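The rejection rates quoted in Observation 2 follow directly from the session counts reported earlier in this section (4 sessions rejecting both models, 14 rejecting neither, 6 rejecting only IBE). The short check below reproduces that arithmetic; the variable names are ours.

```python
sessions = 24
both_rejected, both_not_rejected, only_ibe_rejected = 4, 14, 6

# The three groups partition the 24 sessions.
assert both_rejected + both_not_rejected + only_ibe_rejected == sessions

def pct(count, total):
    """Percentage rounded to the nearest integer."""
    return round(100 * count / total)

eve_rejected = both_rejected                      # EvE is never rejected alone
ibe_rejected = both_rejected + only_ibe_rejected  # IBE is rejected in both groups

print("EvE rejected:", pct(eve_rejected, sessions), "%")
print("IBE rejected:", pct(ibe_rejected, sessions), "%")
print("EvE not rejected:", pct(sessions - eve_rejected, sessions), "%")
print("IBE not rejected:", pct(sessions - ibe_rejected, sessions), "%")
```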
We conduct the same analysis for the last 75 rounds to check for a possible experience effect in the observed behaviour. The tests' outcomes are summarised in the lower panel of Table 3; see Tables VII.B.1-4 in Appendix VII.B for detailed results.¹⁷ Now EvE is not rejected in any session whereas IBE is not rejected for 18 of them (75%, instead of 58% when accounting for all rounds), mostly with Low payoffs. Homogeneity (K Min = 1) is again rejected for IBE in all sessions, and it is not rejected for EvE in 12 sessions, so that 50% of EvE-specifications are multi-clustered (instead of 60%). These specifications also display fewer clusters only when assuming EvE in DISC and NOM1/High, so behaviour in these treatments would become more homogenous in the long run according to EvE. Overall, since both models are not rejected for 18 (75%) sessions and the remaining 6 reject IBE but not EvE (cf. Appendix VII.B), EvE would again appear to organise the observed behaviour best.
Looking into the specifications' details, we find that the models yield more non-rejected over-parametrised specifications: over 83% for EvE, and 50% for IBE no matter the payoff level. There is also no evidence of heterogeneous estimates in the unique non-over-fitted EvE-specification (cf. NOM1/Low/Session 1) whereas all IBE-specifications with 1 < K Min ≤ 3 are heterogeneous. The models' differences remain in terms of insignificant estimates, with 55% of EvE-estimates indicating maximal exploration and affecting about 50% of participants, whilst only 27% of the IBE ones are insignificant and concern 17% of participants no matter the payoff level.
The distributions of estimates in Fig. 6 tend to confirm the patterns found when assuming all data and they are moderately affected by the data-attrition resulting from the Σ-test rejections. Insignificant EvE-estimates are frequent in all treatments but NOM2/Low, where the estimates support a contained exploitation and the null of homogeneity (K Min = 1) in all sessions. Otherwise, the distributions pertaining to DISC and NOM2 still counter-intuitively suggest that exploitation increases when payoffs are Low whereas those of NOM1 comply with the alternative that exploitation increases with payoff levels.
For IBE, the distributions and median estimates of DISC look like those in Fig. 5 whilst those of NOM1 and NOM2 reveal (1) a drop in the median estimate of NOM1/Low and the presence of a cluster with large κi-estimates in NOM1/High, and (2) the absence of such a cluster in NOM2/High. However, the evolution of play in session data suggests that such higher (lower) participation for some participants in NOM1/Low and NOM2/High (NOM1/High) is actually due to an 'end-game effect' in the last 10-20 rounds of these treatments, cf. Appendix V. This leads to the following observation:

Observation 3: When relaxing symmetry and estimating the models with OLS procedures and the data of the last 75 rounds: (A) The null of the Σ-test is less likely to be rejected for EvE than for IBE (0% of all sessions vs 25% for IBE).

Fig. 6 Cumulative distributions of individuals' OLS estimates (last 75 rounds). Thick (Thin) lines stand for High (Low) payoff levels; dashed lines refer to the 4 × 10 estimates of a treatment regardless of the Σ-test outcomes. Insignificant estimates (at α = 5%) are set equal to 0. The plots report the estimates' medians and numbers of non-rejected specifications (in brackets). The CDFs assume maximum λi- and κi-estimates of 5 and 15, respectively.

We proceed with a second robustness check of Observation 2 by estimating the models with more efficient procedures that possibly call for ('naïve' or Tikhonov) regularisation of the error variance matrix to address the unstable results we obtained when estimating the models with GLS methods. We thus consider five minimum-distance estimators in addition to the OLS and GLS ones, and we allow for regularisation whenever it is deemed necessary to give the models their best shot at organising the data.¹⁸ That is, we estimated the models and their inverse forms for each session with seven estimators and with K = {1, 2, 3, 4} clusters, generating over 80 specifications per model and treatment. For each model and value of K, we selected the specification that best addresses a set of criteria regarding the theoretical consistency of parameter estimates and the credibility of their confidence intervals, but also the condition number of the variance matrix of the error terms (not too large) and the magnitude of the efficiency gains relative to OLS (not too large but not negligible).
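The role of the condition number can be illustrated with a small sketch. It assumes that Tikhonov regularisation takes the common ridge-type form of adding τ times the identity to the error variance matrix; the form actually used in the estimations, and the 'naïve' variant, are not specified in this section and are not reproduced. The example is restricted to a symmetric 2 × 2 matrix so that eigenvalues are available in closed form, and all names and numbers are hypothetical.

```python
import math

def eigenvalues_sym2(a, b, d):
    """Eigenvalues (smallest, largest) of the symmetric 2x2 matrix [[a, b], [b, d]]."""
    mid = (a + d) / 2.0
    rad = math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)
    return mid - rad, mid + rad

def condition_number(a, b, d, tau=0.0):
    """Condition number of [[a, b], [b, d]] + tau * I, assumed positive definite."""
    low, high = eigenvalues_sym2(a, b, d)
    # Adding tau * I shifts every eigenvalue by tau.
    return (high + tau) / (low + tau)

# Hypothetical, nearly singular error variance matrix (correlation of 0.999).
print(condition_number(1.0, 0.999, 1.0))           # ill-conditioned
print(condition_number(1.0, 0.999, 1.0, tau=0.1))  # after a ridge-type shift
```

A larger τ lowers the condition number and stabilises the inverse used by GLS-type estimators, at the cost of moving the weighting matrix away from the estimated one, which is consistent with the criterion that efficiency gains relative to OLS be neither too large nor negligible.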
The selected K Min-specifications are reported in Tables I to IV of Appendix VII.C and indicate that some form of regularisation is needed for 19 sessions (79%) when estimating EvE and for only 2 (8%) when estimating IBE.¹⁹ The Σ-test outcomes and the main characteristics of the models' non-rejected specifications are summarised in Table 4. They first indicate that the use of regularisation marginally affects the Σ-test outcomes for EvE and leaves those for IBE unchanged. Also, both models are not rejected for 13 sessions (instead of 14 when using OLS methods), EvE alone is not rejected for 6 (25%) and IBE alone for only 1 session (instead of 0).
The effect of regularisation is more salient on the estimates since the models become comparable in terms of rejecting homogeneity (i.e., 22 sessions for EvE vs 24 sessions for IBE), the proportion of insignificant/inconsistent estimates and, to a lesser extent, the proportion of individuals with such estimates. Yet, over 50% of EvE's non-rejected specifications are still over-parametrised whereas fewer than 20% of the IBE ones are.
The plots in Fig. 7 refer to heterogeneous samples of estimators and appear again to be affected by the Σ-test results only when the available data is sparse, as for EvE in NOM2/High. Insignificant EvE-estimates are mostly found in DISC/Low and NOM2/High, and they are about equally frequent no matter the payoff level in NOM1. Otherwise, the distributions of IBE-estimates, like those of EvE-estimates in DISC, display similar patterns as those referring to OLS estimates, cf. Fig. 5. The most noticeable changes occur for the EvE-estimates of NOM1 and NOM2: they are now most similar across payoff levels in NOM1 and suggest no particular payoff magnitude effect (like the IBE-distributions of this treatment) whereas they are mostly different in NOM2, with stochastically larger (and mostly homogenous) cluster-estimates when payoffs are Low.
Overall, this robustness analysis confirms the models' respective (in)sensitivity to the symmetry assumption (Observation 1) and IBE's superior ability to diagnose a model-consistent cluster-heterogeneity in the observed behaviour (Observation 2). We summarise the above in the following final observation:

Observation 4: When relaxing symmetry and using (naïve or Tikhonov) regularisation procedures when estimating the models with GLS or distance-based estimators (instead of OLS estimators): (A) EvE is still less likely to reject the null of the Σ-test (25% of all sessions vs 42% for IBE). (B) Features (1) to (4) of EvE's non-rejected specifications outlined in Observation 2 hold and confirm IBE's superior ability in organising the observed behaviour. (C) Our conclusions for IBE are hardly affected by the use of regularisation procedures.

Conclusion
In this paper we propose a novel approach to the analysis of symmetric participation games that checks the consistency of a model's estimates with the restrictions it imposes on individual behaviour. This approach relaxes the model's assumption of symmetry by allowing for the existence of clusters of players with similar observable characteristics, and it assesses, by means of a specification test, how much cluster-heterogeneity a model can tolerate and still be consistent with its behavioural restrictions. Thus, besides offering an alternative to the usual assessment of a model in terms of its goodness-of-fit, this approach allows for individual differences to be accounted for in a model-consistent way and therefore contributes to the literature on modelling heterogeneity in static games, see e.g., Rogers et al. (2009) and Golman (2011).²⁰

We assessed this approach with data on market-entry experiments which we analyse in terms of two stationary models: Exploitation versus Exploration (EvE, which is equivalent to Logit-QRE) and Impulse Balance Equilibrium (IBE). Our empirical analysis sheds new light on the models' sensitivities to the assumption of symmetric players or of cluster-heterogeneity and to the econometric procedures used. We summarise our findings in the following four points.
First, estimating EvE with the usual assumption of symmetric and homogenous players provides limited insight into the analysis of behaviour in these games because (1) the session estimates are largely invariant to treatment conditions and mostly support a maximal exploration (or purely random behaviour), and (2) the estimates for the pooled data are seldom consistent with session estimates. In this regard, IBE outperforms EvE.
Second, when allowing for cluster-heterogeneity and estimating the models with OLS methods, the null of the specification test is less likely to be rejected for EvE, and EvE is more likely to support homogeneity than IBE.However, the estimated specifications have considerably more insignificant cluster-estimates and are typically over-parametrised, so IBE also outperforms EvE in terms of accommodating cluster-heterogeneity.This holds when the estimations pertain to the second half of the experiments to account for participants' experience of play.
Third, our approach can unveil behavioural patterns such as the presence of clusters of players with low entry rates in some treatments and may explain them: such clusters are absent in treatments where payoffs remain positive when participation is over-capacity (as in NOM1) and they are present in treatments where the risk of experiencing regret from entering is more salient (as in DISC and NOM2).
Fourth, when estimating the models with more efficient procedures (i.e., GLS or distance-based estimators that possibly allow for regularisation), our conclusions for IBE are hardly affected whereas those for EvE change considerably: homogeneity is then always rejected (as for IBE when assuming OLS methods) and insignificant or inconsistent cluster-estimates are less frequent. Yet, IBE still accommodates cluster-heterogeneity better than EvE.

Finally, the proposed approach is flexible enough to also allow an assessment of which type of heterogeneity is most consistent with some behavioural model, e.g., gender, socio-demographics, or any relevant mixture of observable characteristics. For example, it can be used to reveal a gender and/or a socio-demographic effect in the players' participation, and the specification test could determine whether this effect (or which of these effects) is consistent with the symmetric model considered.²¹ It can also be applied to test predictions regarding the sorting of players into clusters of individuals who either always or never participate as a result of reinforcement learning, as Duffy and Hopkins (2005) predict and find. This, however, would raise the more challenging question of the formation of such clusters over time and its consistency with the type of learning considered. In this regard, our approach provides some first insights which we hope will be further explored.

Fig. 1 Payoff levels and structures of market-entry games.No filling stands for 'No Entry', light gray (dark gray) stands for 'Entry' when payoffs are Low (High).Payoffs expressed in Experimental Currency Units-see Appendix IV.D for exact figures

Fig. 2 Relationship between p and EvE's λ or IBE's κ. Thick (Thin) lines stand for High (Low) payoff levels. For EvE, the plots report the p Nash predictions for each payoff structure and level (cf. coloured horizontal lines). For IBE, the plots display the p Nash predictions (cf. dots) for each payoff structure and level. As κ → ∞, p → 0 in DISC and NOM1

Fig. 4 Bar-charts of individual probabilities of entry.Each vertical bar represents an individual.Horizontal thin (thick) lines stand for the symmetric mixed-equilibrium predictions (average probabilities of entry)

Fig. 5 Cumulative distributions of individuals' OLS estimates. Thick (Thin) lines stand for High (Low) payoff levels; dashed lines refer to the 4 × 10 estimates of a treatment regardless of the Σ-test outcomes. Insignificant estimates are set equal to 0. The plots report the estimates' medians and numbers of non-rejected specifications (in brackets). The CDFs assume maximum λi- and κi-estimates of 5 and 15, respectively.

Observation 3 (continued): (B) Non-rejected EvE-specifications in DISC/High and especially NOM1/High have fewer clusters, so behaviour becomes more homogenous in the long run according to EvE. (C) Features (1) to (4) of EvE's non-rejected specifications outlined in Observation 2 hold and confirm IBE's superior ability in organising the observed behaviour.

Fig. 7 Cumulative distributions of individuals' (regularised) estimates. Thick (Thin) lines stand for High (Low) payoff levels; dashed lines refer to the 4 × 10 estimates of a treatment regardless of the Σ-test outcomes. Insignificant estimates are set equal to 0. The plots report the estimates' medians and numbers of non-rejected specifications (in brackets). The CDFs assume maximum λi- and κi-estimates of 5 and 15, respectively.

Significant under-entry, i.e., when p Nash is greater than the upper bound of the 95% CI

Table 4 Summary of specification test outcomes with(out) regularisation. There is a total of 12 sessions per payoff level; K Min = 1 characterises homogenous players and 1 < K Min ≤ 4 cluster-heterogeneity. Detailed statistics refer to non-rejected specifications. a % of over-parametrised specifications with 1 < K Min ≤ 4