Experienced vs. inexperienced participants in the lab: do they behave differently?


We analyze whether subjects with extensive laboratory experience and first-time participants, who voluntarily registered for the experiment, differ in their behavior. Subjects play four one-shot, two-player games: a trust game, a beauty contest, an ultimatum game, and a traveler’s dilemma. In addition, we conduct a single-player lying task and elicit risk preferences. We find few significant differences. In the trust game, experienced subjects are less trustworthy and also trust less. Furthermore, experienced subjects submit fewer non-monotonic strategies in the risk elicitation task. We find no differences whatsoever in the other decisions. Nevertheless, the minor differences observed between experienced and inexperienced subjects may be relevant because we document a potential recruitment bias: the share of inexperienced subjects may be lower in the early recruitment waves.

    Employing the strategy method in the trust game may affect decisions; see Burks et al. (2003), Brandts and Charness (2011) or Casari and Cason (2009). The fact that participants played both roles in the trust game might also affect fairness considerations: playing both roles may reduce the degree of trust and trustworthiness; see, for example, Burks et al. (2003) or Johnson and Mislin (2011).

    As the UG was effectively conducted as a simultaneous-move game, it has multiple Nash equilibria, but selection criteria such as trembling-hand perfection or weak dominance would yield the equilibrium reported above.

    Using the strategy method might also influence behavior in the UG; see Oxoby and McLeish (2004), Oosterbeek et al. (2004) or Brandts and Charness (2011).

    After the payoff-relevant throw, we encouraged the subjects to keep on throwing the die in order to test whether the die was fair. In the end, subjects had to report the result of their first throw. Participants were seated in individual cubicles and were not monitored by anyone during this task, so lying can only be detected at the aggregate level and not at the individual level.
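To illustrate what detection at the aggregate level can look like, the following is a minimal sketch of a chi-square goodness-of-fit test of reported die throws against the uniform distribution. It is not necessarily the test used in the paper. The function name and the report counts are hypothetical and chosen only for illustration, and the sketch makes no assumption about which faces are payoff-attractive.

```python
# Hypothetical illustration: the counts below are invented, not data from the study.

def chi_square_uniform(counts):
    """Pearson chi-square statistic for H0: every face is equally likely."""
    n = sum(counts)
    expected = n / len(counts)  # same expected count for each face under H0
    return sum((c - expected) ** 2 / expected for c in counts)

# 5% critical value of the chi-square distribution with 5 degrees of freedom
# (six faces minus one).
CRITICAL_5PCT_DF5 = 11.07

# Hypothetical numbers of subjects reporting faces 1, 2, ..., 6.
hypothetical_reports = [10, 12, 14, 18, 22, 44]

stat = chi_square_uniform(hypothetical_reports)
reject_uniform = stat > CRITICAL_5PCT_DF5

print(f"chi-square statistic: {stat:.1f}, reject uniform reporting: {reject_uniform}")
```

A rejection indicates only that the distribution of reports as a whole is unlikely under truthful reporting of a fair die; it does not identify which individual subjects misreported.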

    An English translation of the instructions can be found in the online appendix.

    Our power calculations are (with the exception of BC) based on treatment effects reported in the literature, where subjects are randomly assigned to treatments, whereas our comparison of experienced and inexperienced subjects does not rest on random assignment.

    We note that we have two hypotheses here regarding UG1 which might, and in our data actually do, give rise to different predictions. Hypothesis 1 (2) suggests lower offers made by experienced subjects but, since the payoff-maximizing offer turns out to be above the average, Hypothesis 2 (3) suggests that experienced subjects make higher offers. We do not maintain hypotheses regarding RE. If experienced subjects earn higher payoffs, this might translate into less risk-averse attitudes. This, however, would be an indirect conclusion about preferences, whereas direct evidence on risk preferences (Cleave et al., 2013) is not in favor of such a hypothesis.

    Since we conduct eight different tests here, we may encounter false positives due to multiple testing. If we Bonferroni-correct our p-values, we obtain a critical p-value of \(0.05/8=0.00625\), and so only TG1 would remain significant. The Bonferroni method controls for the family-wise error rate and is known to be rather conservative. Following Benjamini and Hochberg (1995), we can alternatively control for the false discovery rate. This suggests a critical p-value of \(0.056\cdot2/8 = 0.014\), on the basis of a significance level of 0.056, and we can maintain significance of our results for TG1 and TG2.
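The two critical values above can be reproduced with a short sketch. The function names are ours, and the p-values passed to the step-up procedure at the end are hypothetical placeholders, not the p-values of the eight tests in the paper.

```python
def bonferroni_threshold(alpha, m):
    """Per-test critical p-value that keeps the family-wise error rate at alpha."""
    return alpha / m

def bh_critical_values(alpha, m):
    """Benjamini-Hochberg critical values alpha * k / m for ranks k = 1..m."""
    return [alpha * k / m for k in range(1, m + 1)]

def bh_rejections(p_values, alpha):
    """Indices of hypotheses rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            largest_k = rank  # keep the largest rank that passes its threshold
    return sorted(order[:largest_k])

print(bonferroni_threshold(0.05, 8))        # critical value 0.05/8
print(bh_critical_values(0.056, 8)[1])      # rank-2 critical value 0.056*2/8

# Hypothetical p-values for eight tests, for illustration only.
hypothetical_p = [0.001, 0.012, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
print(bh_rejections(hypothetical_p, 0.056))
```

The step-up rule rejects every hypothesis up to the largest rank whose p-value lies below its critical value, which is why the rank-2 threshold is the relevant one when two results are to remain significant.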

    This result does not change when we include maos of one (which are also payoff maximizing) in the analysis.

    Choices in TG1 and TG2 are strongly correlated (\(\rho =0.51\), \(p<0.01\)). Blanco et al. (2014) suggest that a consensus effect may be driving this correlation: subjects who choose exploit in TG2 overestimate the share of people who exploit. In other words, the belief when making the TG1 choice is biased toward players’ own TG2 decision and the two choices are positively correlated. This would explain why, in TG1, experienced subjects choose optimally less often than inexperienced subjects.

    Following Holt and Laury (2002), we simply count the number of safe choices, also for subjects with non-monotone decisions.
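The counting rule can be sketched as follows. The encoding is an assumption made for illustration: each subject's decisions are a list over the rows of the choice table, with "A" denoting the safe lottery and "B" the risky one, and a strategy counts as monotone if the subject never returns to the safe option after choosing the risky one. The function names and example sequences are hypothetical.

```python
def count_safe_choices(choices):
    """Number of rows in which the safe option 'A' was chosen."""
    return sum(1 for c in choices if c == "A")

def is_monotone(choices):
    """True if no safe choice 'A' follows a risky choice 'B'."""
    seen_risky = False
    for c in choices:
        if c == "B":
            seen_risky = True
        elif seen_risky:
            return False  # switched back to the safe option
    return True

# Hypothetical decision sequences over ten rows.
single_switch = list("AAAABBBBBB")
switches_back = list("AAABABBBBB")

for label, seq in [("single switch", single_switch), ("switches back", switches_back)]:
    print(label, count_safe_choices(seq), is_monotone(seq))
```

Both sequences yield four safe choices, so the simple count assigns them the same risk measure even though only the first is monotone.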

    The direction of causality is not obvious here: inexperienced subjects may be slow because they are inexperienced, or inexperienced subjects may be (or remain) inexperienced because they respond slowly.

    While we do not study treatments, the recruitment bias we find has implications when (in-)experienced subjects are unevenly distributed across treatments.


We are grateful to our editor, Bob Slonim, and two anonymous referees for helpful comments. Comments from Tim Cason and Hannah Schildberg-Hörisch also helped improve the paper. Thanks also to Brit Grosskopf and Rosemarie Nagel for sharing their data with us.

