Individual cheating in the lab: a new measure and external validity

This paper investigates to what extent laboratory measures of cheating generalise to the field. To this purpose, we develop a lab measure that allows for individual-level observations of cheating whilst reducing the likelihood that participants feel observed. Decisions made in this laboratory task are then compared to individual choices taken in the field, where subjects can lie by misreporting their experimental earnings. We use two field variations that differ in the degree of anonymity of the field decision. According to our measure, no correlation of behaviour between the laboratory and the field is found. We then perform the same analysis using a lab measure that can only detect cheating at the aggregate level. In this case, we do find a weak correlation between the two environments. We discuss the significance and interpretation of these results.


Introduction
Cheating permeates many social and economic interactions of daily life (DePaulo et al. 1996;Ariely 2012). Examples range from corporate scandals (e.g., Dieselgate, Facebook-Cambridge Analytica), tax evasion (Slemrod 2007) and consumer misbehaviour (Mazar and Ariely 2006). To make things worse, endeavours to study cheating in natural contexts are hindered by its secretive nature. Therefore, controlled experiments represent an attractive instrument to study individual attitudes towards cheating.
The die-roll paradigm (Fischbacher and Föllmi-Heusi 2013) represents the most popular measure of cheating used in the laboratory. Participants are asked to roll a die in private and to report the result to the experimenter. As the true outcome is observed by subjects only, there is a monetary incentive to lie by reporting those outcomes associated with higher rewards. Despite its simplicity, this type of task presents a considerable limitation: cheating can only be inferred at the aggregate level by comparing the empirical distribution of actual reports with its theoretical prediction. Hence, it is not possible to know, by design, if a particular subject lied or not. 1 Whether or not laboratory measures of cheating extend to non-controlled environments is still under investigation. For instance, the experimenter scrutiny or the artificiality of the lab environment might trigger different ethical norms. If this is the case, then laboratory results on cheating might not generalise to the field (Levitt and List 2007). Our paper aims to address these two limitations.
First, we design a novel task that, in contrast to the existing literature, allows us to observe cheating at the individual level. In our task, subjects have five seconds to choose, in their mind, one out of 60 colours (e.g. Yellow) from a list displayed on their screen. Once this list disappears, three new lists containing four colours each (e.g. White, Beige, Milk, Plum) are displayed. Every new list is associated with a different positive payoff. If subjects claim their chosen colour to be in one of the three new lists, they receive the payoff associated with that list; otherwise, they receive nothing. We know that the participants have cheated if they pick a list of colours on the second screen that does not contain any colour that was already present in the first larger list.
Second, we use the fact that in our task cheating is observable at the individual level and ask to what extent cheating in the lab predicts cheating in the field within the same population. Participants are not paid immediately after the experiment. Instead, after a few days, they have the opportunity to cheat in the field by selfreporting their earnings. Subjects are paid according to the amount of money they claim to have earned in the laboratory. We use two field variations that differ in the degree of anonymity of the field decision. In the first, the self-reporting procedure is completely anonymous, while the second field variation requires participants to meet in person with the experimenter.
The main contribution of this paper is twofold: (i) it develops a laboratory task that allows for individual level observations of cheating, and (ii) it allows for a comparison of both the extensive and intensive margins of cheating between the laboratory and a non-controlled environment. 2 In line with previous findings on individual dishonesty, we find that a considerable fraction of subjects cheat in our laboratory task but, for some, not to the fullest extent. However, no significant correlation of dishonest behaviour between the lab and the field is observed. Although more than half of the subjects cheat to some extent in our task, most of them refrain from over-reporting their experimental earnings. Moreover, for those who do so, we find no difference in the extent of cheating between subjects that are honest in the laboratory and those who are not. Interestingly, when using a variation of the die-roll task that only allows to infer cheating at the aggregate level, we do find a weak correlation between lab and field behaviour.
To the best of our knowledge, only few other studies examine the correlation between dishonest behaviour in the lab and cheating in the field within the same population. 3 Dai et al. (2018) perform an artefactual field experiment where passengers of public transportation are asked to play a modified version of the dieroll task. As a main result, the study finds that fare dodgers, on average, are more likely to report the most profitable outcome than ticket holders. Similar to our study, Potters and Stoop (2016) use a student subject pool to correlate self-reported performance in a mind game implemented in the lab with a field measure of cheating. After the experiment, payments are issued via bank transfer and some subjects are deliberately overpaid by an amount of €5. A significant correlation of 0.31 between performance in the mind game and not reporting the overpayment is found. In contrast to Potters and Stoop (2016), our study allows for the observance of cheating at the individual level, measures cheating at both the extensive and intensive margins, provides full anonymity in the lab and in one of the field tasks and requires active misreporting in both environments. These new features allow for a deeper understanding of whether lab measures of cheating are reliable predictors of dishonesty in other environments.
The extent to which laboratory results on cheating can be generalised to other settings remains unclear. 4 Laboratory evidence shows persistent patterns on dishonesty across subjects. Some individuals are completely honest, while others either lie to the maximum extent possible, or forfeit part of the monetary gains when they do cheat Abeler et al. 2019;Gerlach et al. 2019). Instead, 2 The extensive margin corresponds to the fraction of people who lie, whereas the intensive margin corresponds to the extent of cheating for people who choose to do so. 3 Other papers focus on the correlation between a lab measure of cheating with the broader concept of rule violation in the field: in-prison offences (Cohn et al. 2015), school misconduct (Cohn and Maréchal 2018) and work absenteeism (Hanna and Wang 2017). 4 For a broad discussion on the generalisability of experimental results in economics, see Levitt and List (2007), Al-Ubaydli and List (2013), Heckman (2009), Camerer (2015), Kessler and Vesterlund (2015), Al-Ubaydli et al. (2017). studies that focus on dishonesty in the field provide mixed results. While some find substantial cheating among subjects (e.g., Drupp et al. 2019;Bucciol and Piovesan 2011), other studies report different findings. For example, Abeler et al. (2014) report no evidence of lying in a randomised field experiment where subjects are called at home and have a monetary incentive to misreport the outcome of a privately tossed coin. Similarly, Cohn et al. (2014) show that bankers cheat in a coin-flip task when they are reminded about their professional identity. However, when such cue is not emphasised, reported outcomes do not differ from their truthful distribution.
The remainder of this paper is organised as follows: Section 2 describes the experimental design, Sect. 3 presents the main results of the paper, Sect. 4 discusses the main findings and Sect. 5 concludes.

Experimental procedure
The experiment was conducted between November 2017 and July 2019 at EssexLab at the University of Essex. In total, 249 participants were recruited using hroot (Bock et al. 2014). Laboratory sessions (12 in total) lasted about 43 mins, and average total earnings (inclusive of a £4 show-up fee) were £12.62 (s.d. £4.60). The experiment was programmed using z-Tree (Fischbacher 2007).
Before the laboratory session, participants acknowledged that the experimental proceedings were paid after a few days (see Fig. 6 in appendix). Any further detail about the payment procedure was omitted. Subjects entered the lab anonymously and were randomly allocated to their terminals so that it was impossible to link their identity to a particular workstation. At the beginning of the experiment, each subject was informed that the session consisted of five parts and a short final questionnaire. Detailed instructions about each part were displayed on subjects' screens only upon completion of the previous part (all instructions are reproduced in Appendix A). Where needed, control questions where elicited before the actual choices were made. Participants were informed that, at the end of the experiment, two of the five parts were randomly selected for payment.

Laboratory experiment
The laboratory experiment consisted of five different parts, whose order was randomised at the session level.
Part 1. In the first part of the experiment, subjects faced our so-called mind game (hereinafter list game). 5 The list game consists of a simple decision problem. First, a random list of 60 colour names (e.g., Yellow) appears on the computer screen and is displayed for only five seconds. This ensures that no subject can read all of the colours in the given time. Before the timer expires, each participant must choose, in their mind, a colour from the list. After five seconds, the list disappears and three new random lists containing four colours each are displayed on the screen-e.g., one of the three lists might be White, Beige, Milk, Plum. Subjects are then asked whether the colour they have in mind appears in one of the three new lists, each of which is associated with a specific payoff: £1, £3 and £5, respectively. If yes, then they must select the list that contains the colour they thought of, otherwise they must select the alternative option (''Not in the lists''). Participants who claim to have found their colour by selecting one of the three lists earn the corresponding payoff. However, subjects who choose the alternative option earn £0. By design, the colours displayed in the three new lists are never present in the list where subjects actually choose from. Hence, every positive payoff reported by participants can be classified as a lie. 6 As the colour choice is made in the subject's mind, individual cheating appears to be undetectable. 7 This is verified via a control question. After the decision on whether to cheat or not is made, participants answer the following question: ''Out of 100 participants, how many do you think successfully choose a colour in the first list that is also present in one of the three lists?'' Subjects earn an additional £1 if their answer is within five points from the true value-i.e., zero. As a consequence, any answer below or equal to five indicates that subjects believe the colours in the three lists are not present in the first one. Thus, they realise that cheating could be detected with certainty. 8 Part 2. This part consists of a computerised variation of the mind game used in Kajackaite and Gneezy's (2017). Subjects have to roll a virtual five-sided die where each side is associated with a colour. First, participants must choose one of the five colours in their mind. Then, the outcome of the die roll is revealed and subjects must report whether the colour they have in mind corresponds to the actual outcome of the die roll. If the answer is yes, they earn £5, otherwise they earn £0. This task resembles the list game because the decision is made in the subject's mind, with the difference being that cheating cannot be detected at the individual level. The purpose of this mind game is twofold. First, it can be used to corroborate the list game as a valid measure of cheating. Second, it is possible to correlate the reports of the die-roll to behaviour in the field. However, correlation can only be measured at the aggregate level.
Part 3. In this part, subjects are randomly paired and play a dictator game. Each member of the pair is endowed with £6 and decides how much money to transfer, in steps of £1, to the other group member. After both subjects have made their decisions, one of the two choices is implemented with equal probability. The dictator game is used as a measure of greed and is elicited as a proxy for pro-social behaviour.
Part 4. Part four consists of a trust game similar to Burks et al. (2003), where each participant knows in advance that they will play both the role of a sender and a receiver. Subjects are randomly paired and after being endowed with £3, they choose whether to send £0, £1, £2 or £3 to their counterpart. Any amount sent is tripled. Without knowing the decision of the other player, both subjects decide how much to return for any possible transfer they could receive. After all decisions are made, the computer assigns the roles with equal probabilities and the corresponding decisions are implemented. We measure trust as a control for social preferences. This measure allows us to investigate whether subjects that put more trust in others or are more trustworthy, are also less likely to lie.
Part 5. In the last part, risk preferences are elicited using a slightly modified version of the lottery choice task implemented in Eckel and Grossman's 2008 study. Participants must choose one out of five virtual boxes. Every box contains two payoffs that are realised with equal probability (see Table 4 in Appendix C). Starting from a risk free lottery that yields £2, the expected payoffs of the subsequent lotteries increase so as their variance. Hence, the higher the expected payoff, the higher the risk. The main advantage of this task resides in its simplicity and thus, can be easily understood by participants. Nonetheless, it can identify enough heterogeneity in risk attitudes. It is important to elicit risk attitudes as the decision to cheat also depends on the risk of being caught lying. Understanding the relation between individual preferences for honesty and risk attitudes might unveil important insights on one's decision to cheat.
Upon completion of the five parts, subjects answer an incentivised questionnaire collecting socio-demographic information and a 20-item measure of Big five (Donnellan et al. 2006). Once participants complete the questionnaire, their own experimental earnings are calculated and displayed on their screen. Subjects are then asked to note their earnings on a piece of paper ('reminder card'), to fold this into an envelope, and to conceal their rewards by clicking a button on their screen. 9 At this point, participants are the only ones who know the amount of money they have earned. 10 At the very end of the session, each subject is provided with a paper sheet entitled 'Payment form', which contains detailed instructions about the payment procedure. 11 Note that every form contains a hidden code that allows it to be associated with the corresponding workstation. 12 Hence, it is possible to uniquely identify behaviour in the lab-but not individuals' identity-with subsequent choices in the field.
Subjects are then asked to leave the lab without filling in the payment form.

Field experiment
The field experiment is designed to resemble a variation of the standard payment procedure. Participants are not paid immediately after the laboratory session. Instead, after a few days, they can self-report their earnings using the payment form they were provided with. Payments are provided, in cash, upon provision of this paper sheet. Subjects are free to self-report any integer number between the minimum and the maximum possible payoff, £5 and £26, respectively. 13 Thus, there is a monetary incentive to cheat by claiming a higher payment than the amount of money actually earned in the lab. Note that, at this stage, detection of lies is not possible. Cheating in the field can only be inferred after decoding each payment form and then by comparing the self-reported payment with the actual experimental earnings. Moreover, apart from self-reported earnings and the payment date, no other personal information is contained on the forms. Hence, it is not possible to link the payment forms to individuals' identities. We employ two treatment variations so as to investigate possible factors that might influence cheating outside the laboratory. The first treatment involves no face-to-face (NoFtF) interaction with the experimenter, resembling the full anonymity condition present in the lab. In more detail, at the end of the experiment each participant is randomly assigned to a locker located in a university campus building and is endowed with the corresponding key. Subjects must leave the payment forms, containing their self-reported earnings, in their assigned locker. The sheets are then collected by the experimenter and replaced with cash corresponding to the money claimed by subjects. After all payments have been provided, participants can then collect their cash earnings. 14 In contrast, the second treatment requires participants to meet face-to-face (FtF) with the experimenter in an office room. Instead of leaving the payment form in a locker, subjects hand the paper sheet to the experimenter and are paid immediately. 15 Besides the personal interaction, a degree of anonymity is also assured in this phase of the experiment, as no personal information is collected.

Design considerations
The main contribution of this experiment is to allow for individual level observation of cheating. Moreover, the list game makes it possible to correlate both the extensive and intensive margins of cheating between the lab and the field.
Despite the fact that the laboratory and the field tasks differ in their intrinsic nature, the experimental design still allows for a comparison of the behaviour between two similar decision problems. It is true that the field experiment differs in many aspects from the list game and the die-roll game. The aim of this exercise, however, is to relate a laboratory measure of cheating to dishonesty in a task that might reflect a real-life situation and thus, is not too artificial.
First, it must be noted that in the lab as well as in the field, participants can only cheat by commission. This is in contrast with Potters and Stoop (2016)-the study closest to our design-where subjects can cheat by not reporting the payment error to the experimenter. The difference between cheating by commission and omission might lead to differences in behaviour. As one might expect, lying by commission is less tempting when compared to a situation where cheating requires no active choice (Pittarello et al. 2016).
Another important variable that is kept constant between the two environments is anonymity. As Gneezy et al. (2018) suggests, the probability of being caught lying highly affects dishonesty. In this experiment, despite the fact that cheating can be detected at the individual level, subjects' identities can never be linked to their choices. This feature allows for the generation of conditions similar to those real-life situations where dishonest actions cannot be associated to one's identity (e.g., not returning a lost wallet). 16 Finally, the design allows for the control of possibly confounding variables caused by social preferences. The consequence that lying might have on other people is known to affect dishonesty (Gneezy 2005;Erat and Gneezy 2012). For this reason, in the lab as well as in the field, the victim of the lie is always the experimenter.

Laboratory results
The main results presented in this section focus on choices made in the list game and on how they correlate with the die-roll task. Appendix B provides additional results using choices from other tasks elicited in the laboratory. Due to the fact that the treatment variation pertains only to the field, laboratory observations are pooled to increase the power of the analysis. Figure 1 shows the choices made in the list game, where each bar represents one of the options that subjects could choose. The three rightmost bars (£1, £3 and £5) represent the fractions of participants that dishonestly reported to have found the colour they had in mind in one of the three subsequent lists. Instead, the first column (£0) corresponds to the percentage of subjects that have been honest in the list game. The figure highlights significant heterogeneity in lying preferences. In contrast to standard economic predictions, 41% of the subjects choose to not cheat at all by selecting the option that pays nothing. Interestingly, although 40% of participants cheat to the maximum extent possible (£5), a substantial proportion of them (4% and 15%) forfeit the maximal gains from lying by choosing the lists associated with either the £1 or £3 payoff, respectively. Hence, dishonest behaviour seems to be driven by heterogeneity in lying preferences. Some participants are either always honests or unconditional liars, whilst the remaining subjects fall in between these two categories depending on the relative gains from lying.
Result 1: The highest fraction of cheaters in the list game report the payoffmaximising lie. A significant proportion of liars do not cheat to the maximum extent possible. Statistical support: When restricting the data to two options, one-sided binomial tests reject the null hypothesis that these two options occur with probability equal to 0.5. For the pairs (£1,£3), (£3,£5) and (£1,£5), the conditional probability for the option with a higher payoff is significantly above 0.5, at 1% level for all pairs. 17 Looking at participants' beliefs, Fig. 2 presents the answers to the control question elicited after the list game. This question allows us to verify whether participants think their lies cannot be detected. As the figure shows, only about 6% of the subjects reported a belief lower or equal to five. 18 Thus, almost all of the participants made their decisions as if it was not possible to detect cheating at the individual level.
One might question whether the new task herein introduced can be related to other laboratory measures of cheating that do not allow for individual level observations. To corroborate our new measure, we look at how choices in the list game are correlated with choices made in the die-roll game (Part 2). In the latter task, the fraction of positive claims amounts to about 60%, which is very distant  5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 Belief Fig. 2 Beliefs elicited in the control question. Participants earned £1 if their answer was within 5 points from the correct value (zero). Hence, the vertical dashed line represents the upper bound for which a subject is thought to believe the colours in the three lists were not present in the first one. Notably, the highest fraction of answers corresponds to 20. This is consistent with the belief that the 12 colours in the three lists were randomly drawn, with equal probability, from the first list containing 60 colours ðN ¼ 249Þ 17 In detail, N ¼ 48; p\0:001 for pair (£1,£3), N ¼ 136, p\0:001 for pair (£3,£5) and N ¼ 110, p\0:001 for pair (£1,£5). 18 We acknowledge that some subjects might have misunderstood the question and reported their belief of how many participants actually cheated. As we did not want to emphasise cheating as the matter of the study, such question was not elicited. from its expected value (20%). Hence, about 40% of participants cheated in the dieroll task by reporting a ''Yes'' answer. If the two measures are related, then we should expect participants that are dishonest in the list game to be more likely to answer ''Yes'' after rolling the die. As can be seen in Fig. 3, this seems to be the case. Participants who cheat in the list game (right panel) are more likely to obtain a positive payoff in the other mind game (two-sided Fisher's exact test: p ¼ 0:027, N ¼ 225). 19 This result is also confirmed by Table 3 in Appendix B.

Field results
In this section, we present results for both the field treatments and their correlations with behaviour in the two cheating tasks measured in the laboratory. 20

List game
We start by analysing the correlation between cheating in the list game and overreporting in the field. As the maximum amount of money a subject can claim depends on their actual experimental earnings, cheating in the field is standardised as follows: Cheat field ¼ self-reported earnings À actual earnings 26 À actual earnings Hence, such a variable can take values in the interval of [0, 1]. 21 In other words, it measures how many pounds (£) are over-reported relative to the maximum amount of money a subject could claim. Figure 4 presents the results for both field treatments and their relation with choices made in the list game. The vertical axis measures cheating outside the laboratory as defined in the previous equation. Thus, any observation above zero represents the extent of cheating in the field for a particular subject. The horizontal axis summarises the choices made by participants in the laboratory. Thus, from this graph it is possible to relate both the extensive and intensive margins of cheating between the list game and over-reporting of experimental earnings.
As the figure shows, the data do not support the generalisability of laboratory results on cheating in either of the two field variations. First, in both cases, most of the participants refrain from over-reporting their experimental earnings. The percentage of cheaters drops from about 66% (54%) in the lab to slightly below 19% (5%) in the field in the NoFtF (FtF) treatment. 22 As expected, in the field variation with a weaker degree of anonymity (FtF), the fraction of participants that do cheat is significantly lower (two-sided Fisher's exact test: N ¼ 226, p ¼ 0:002). The faceto-face interaction appears to trigger higher costs associated with lying with the consequence of reducing dishonest behaviour. A similar result is also found in Conrads and Lotz (2015).
Moreover, it appears there is no significant difference on the extent of cheating in the field between who cheated in the list game and those who did not. The mean value of the Cheat field variable is 0.59 for both honest and dishonest participants in the NoFtF treatment. In the FtF variation, this value is 0.37 and 0.69 for those who were honest and those who lied, respectively. However, the low number of observations does not allow us to make any reliable inference for this treatment.
Result 2: There is no significant correlation of cheating between choices in the list game and in the field. Statistical support: Cheaters in the lab are not more likely to cheat in the field in both treatment variations (two-sided Fisher's exact test: N ¼ 123, p ¼ 0:809 21 Actual lab earnings range between £5 and £19. Thus, the variable Cheat field is always defined. Further, no subject under-reported their earnings. On average, subjects actually earned £11.84 (SD 3.35) and £11.86 (SD 3.47) in the NoFtF and FtF treatments, respectively. A Mann-Whitney U test does not reject the hypothesis of equality (p ¼ 0:897; N ¼ 226). 22 Similar to this result, Gerlach et al. (2019) show in a meta-analysis that dishonesty is significantly more prevalent in lab experiments than in field studies. (NoFtF); N ¼ 103, p ¼ 1:000 (FtF)). The Spearman correlation between choices in the list game and Cheat field is 0.04 and 0.08 in NoFtF and FtF, respectively, and not statistically significant in either of the two field variations (two-sided test: Thus, Table 1 confirms the results. 23

Die-roll game
In this section, we correlate behaviour between the lab and the field using choices made in the die-roll game. Thus, we can replicate the exercise above with the main difference being that we cannot detect cheating at the individual level. For this reason, we label as cheaters all subjects who answered ''Yes'' in the mind game. Figure 5 depicts the correlation between choices in this task and the Cheat field variable for both the NoFtF and FtF treatments. With this measure, we do find a weak correlation of cheating in the treatment with no face-to-face interaction.
Result 3: There is a significant correlation between choices in the die-roll game and cheating in the field in the NoFtF treatment. Statistical support: Participants who answer ''Yes'' are more likely to cheat in the field (two-sided Fisher's exact test: N ¼ 123, p ¼ 0:088). The Spearman correlation coefficient between Cheat field and Yes is 0.16 and significant at the 10% level (two-sided test: N ¼ 123, p ¼ 0:070). 24 These results are also confirmed by  Fig. 4 Comparison of cheating between the lab and the field for the NoFtF (left panel, n ¼ 123) and FtF (right panel, n ¼ 103) treatments with weighted markers. The smallest circles represent one single participant. The y-axis indicates the extent of cheating in the field. The x-axis represents the choices made in the list game 23 A similar analysis is carried out in Table 8 (Appendix C). The OLS estimates are generated using the amount of over-reported money as a dependent variable and draw the same conclusions. 24 In the FtF treatment, these two variables do not correlate. The Fisher's exact test delivers p ¼ 0:649 while the Spearman correlation coefficient is 0.08 with p ¼ 0:417, N ¼ 103.
The evidence from Fig. 5 and Table 2 contrasts with Result 2. Interestingly, this might actually lead to the opposite conclusion.
This section correlated cheating between the lab and the field using two different measures. The results show that individual level observations of cheating appear to be of paramount importance in understanding such secretive and subtle behaviour. These type of data might then provide new insights that cannot be inferred using aggregate statistics.  (2) and (5) further control for actual laboratory earnings and specifications (3) and (6) additionally control for gender effects. Robust standard errors in parentheses. Ãp\0:1, ÃÃp\0:05, Ã Ã Ãp\0:01

Discussion
Dishonesty can be very sensitive to personal factors (Rosenbaum et al. 2014;Jacobsen et al. 2018), and this in turn translates into heterogeneity in lying preferences (Gibson et al. 2013 Fig. 5 Comparison of choices in die-roll mind game and cheating in the field for the NoFtF (left panel, n ¼ 123) and FtF (right panel, n ¼ 103) treatments with weighted markers. The smallest circles represent one single participant. The y-axis indicates the extent of cheating in the field relative to the maximum payoff a subject could claim. The x-axis represents the choices made in the mind game involving the die roll Variable Yes is a dummy which is equal to one if the subject reported a positive payoff in the mind game with the die. Specifications 1-3 represent least square estimations on the Cheat field variable. Specifications 4-6 represent marginal effects on a dummy variable indicating whether a subject lied in the field or not. Specifications (1) and (4) include a dummy for the NoFtF treatment, regressions (2) and (5) further control for actual laboratory earnings and specifications (3) and (6) additionally control for gender effects. Robust standard errors in parentheses. Ãp\0:1, ÃÃp\0:05, Ã Ã Ãp\0:01 two environments is sensitive to individual preferences. Moreover, while in the list game both risk and social preferences are correlated with individual dishonesty (Table 3 Appendix B), this seems not to be the case for cheating in the field. Tables 1 and 2 show that only choices in the trust game are significantly correlated with dishonesty in the payment procedure. Apart from heterogeneity in preferences, differences in dishonest behaviour might also hinge on the experimental paradigm (Gerlach et al. 2019). For example, while Gächter and Schulz (2016) find a positive correlation between the corruption index on the country level and reports in die-roll tasks, such effect is not found using coin-flip tasks (Pascual-Ezama et al. 2015). Hence, another possible source of variability in cheating can be generated by differences between the laboratory and the field tasks.
First, it should be noted that, although in the lab all decisions are computerised, the self-reporting procedure adopted in the field requires participants to lie to the experimenter. Cohn et al. (forthcoming) indeed find that interacting with a human  (1) and (4) control whether the list game was played after the other cheating task involving the virtual die. Specifications (2) and (5) include a dummy for the NoFtF treatment and regressions (3) and (6)  induces significantly less cheating when compared to interacting with a machine. Hence, this difference in the communication channel might concur with the explanation of the results presented in Sect. 3. However, the data cannot explain why subjects that have been either honest or dishonest in the list game cheat to the same extent (on average) in the FtF treatment. Hence, the communication channel, per se, does not seem to fully explain the main findings. Another difference between the lab and field tasks might rest on the moral costs associated with cheating. While participants can lie about a random event in the list game, the self-reporting procedure forces them to cheat in the field by claiming a higher payment, i.e., by 'stealing' money. In the latter case, it is possible that cheating triggers higher moral costs compared to lying about an artificial outcome, and this would result in more honest reports. Hermann and Mußhoff (2019) find that individuals are less willing to steal than lie in a die-roll experiment. Therefore, the higher moral costs implied by stealing would partially explain the low number of subjects that over-reported their experimental earnings. However, this effect alone cannot fully explain the lack of correlation between the lab and the field presented in Result 2.
It is also possible that differences in dishonest behaviour depend on the time available to make a decision. While in the laboratory choices are made within a few minutes, in the field this is not the case. Subjects can spend a few days to think on whether to claim a higher payment or not. If reflecting more time on the possibility to lie reduces dishonest behaviour, this might explain why only a few subjects lied in the field. To the best of our knowledge, only Andersen et al. (2018) studied the effect of time on cheating within the die-roll paradigm; they found no difference in dishonesty when participants are given an extra day to decide. In light of this finding, it seems unlikely for the results to be driven by the difference in the time available to make the decision.
Apart from individual preferences for honesty or differences between experimental paradigms, another explanation for Result 2 might rest on the experimental design as a whole.
As the reader might have noted, one's willingness to claim a higher payment could depend on their actual laboratory choices. Subjects who cheat in the list game are more likely to obtain higher earnings, and, in turn, they might refrain from self-reporting a higher payment because of an income effect. By a similar argument, participants that remain honest in the lab might be more tempted to cheat in the field due to the higher stakes involved. Thus, we should expect a negative relation between laboratory earnings and over-reporting in the payment procedure. 25 Although only two randomly drawn parts were used to determine each subject's payment, if the argument above is true, it could explain why no correlation is found between the two environments.
However, field behaviour seems to not depend on actual laboratory earnings. The coefficient of actual laboratory earnings in Tables 1 and 2 is not statistically significant at any conventional level. Although cheaters in the list game actually earned, on average, £2.8 (£1.3) more than honest participants in the NoFtF (FtF) treatment, these differences are relatively small. Therefore, the relative difference in potential gains from over-reporting between those who are honest and those who lie is little. Moreover, two recent meta-analyses find a weak (if none) effect of rewards on dishonesty (Abeler et al. 2019;Gerlach et al. 2019). Thus, although the lab and field tasks are not perfectly independent, income effect and stake size do not seem to explain the results shown in the previous section.
This section examined some factors that might have determined the results presented in this paper. Although some of them can partially account for the main findings, alone, none of them can fully explain the evidence presented in Sect. 3.2.

Conclusions
Even though laboratory experiments on cheating abound in the economic literature, only few studies explore their generalisability to the field. This paper aims to relate a laboratory measure of cheating with dishonesty in a non-controlled environment within the same population. To this purpose, we develop a laboratory task that allows for the observance of cheating at the individual level. Behaviour in the lab is then compared to choices in the field, where subjects have the possibility to cheat by over-reporting their experimental earnings. Payments are not issued immediately after the laboratory experiment. Instead, after a few days participants are allowed to self-report their earnings to the experimenter. Subjects are paid the amount of money they claim to have earned. As shown by the laboratory data, established results as lying aversion and non-payoff-maximising lies are replicated. However, according to our measure, no correlation of cheating between the lab and the field is observed. We then perform the same analysis using a laboratory task that measures cheating at the aggregate level. Using this measure, we do find a weak correlation between the two environments. However, it is not possible to pinpoint the drivers of these results. Yet it appears that only an interaction between individual preferences and contextual factors can account for the differences in cheating between the lab and the field.
Taken together, these findings underline the importance of being cautious when extending laboratory results regarding dishonesty outside a controlled environment.

Welcome!
You are about to take part in a decision-making experiment. It is important that you do not talk to any of the other participants during the experiment. If you have a question at any time, raise your hand and an assistant will come to your desk to answer it. This experiment consists of five different parts and you will play each of them only once. You will receive detailed instructions for each part on your computer screen as the experiment progresses. In each part you will be asked to make one or more decisions. Decisions made in one part of the experiment will bear no consequences for the other parts of the experiment. During the experiment your earnings will be calculated in pounds and you will have the chance to earn an amount of money that can range from £5 to £26. At the end of the main experiment you will have to complete a brief questionnaire. At the end of the experiment the computer will randomly select two parts for each participant. The sum of the earnings in these two selected parts will constitute your payment for this experiment. In addition to this money we will pay you £4 for showing up today and 1£for completing the questionnaire. Your cash earnings will not be immediately paid. Instead, payments will be issued within few days (from 23rd to 30th of November). You will receive further instructions about the payment procedure at the end of the experiment. If you have a question now, please raise your hand and a lab assistant will come to your workstation.

Instructions for the list game
You are about to play an easy game. In the next screen you will see a list of 60 colour names (e.g. tamarind). Once the list appears, a countdown of 5 seconds will start. This list will be displayed until the countdown reaches zero. Before the list disappears, you will have to choose one of the colour names in the list and keep it in your mind. Then, three random lists containing 4 colour names each (for a total of 12 colours names) will appear. If the colour you have in mind is in one of the lists, you will win the amount of money associated to that list, otherwise £0. After you click the OK button the first list containing 60 colour names will be shown and the 5 seconds timer will start. Choose a colour in your mind before the timer reaches zero. Click the OK button to start.

Instructions for the dice game
In this part you will have to roll a fair die with 5 sides. Every side corresponds to a colour. Hence, every colour has probability of 1/5 to come up. This means that, in expectation, out of 100 rolls every colour will come up 20 times. Before rolling the die, you have to choose a colour in your mind from the ones displayed below. If the outcome of the roll is the same colour you though of, you will earn 5£, otherwise 0£.

Instructions for dictator game
In this part, the computer will randomly pair you with another participant. You will remain paired with this person for the whole duration of this part. Once the decisions are made, the pair will be dissolved. You, as well as the person you are paired with, will never learn the identity of each other. In this part, both you and the participant you are paired with, will have to split the same amount of money among you. Each of you simultaneously decides the amount to transfer to the other participant. Hence, the decision of one subject is not observable by the other participant. The computer will then choose with equal probability which one of the two actions will be implemented. Your earnings from this part correspond to the money that you keep for yourself (in case your choice is implemented) or to the money the other participant decides to transfer to you (if his/her choice is implemented).

Instructions for the lottery choice
In this part, you will have to choose between five options. You will be paid based on which option you choose. Each option involves a simple lottery with two possible outcomes that are equally likely to occur. Hence, every lottery will return each of the two numbers with 50% probability.

Instructions for the trust game
In this part, the computer will randomly pair you with another participant. You will remain paired with this person for the whole duration of this part. Once the decisions are made, the pair will be dissolved. You, as well as the person you are paired with, will never learn the identity of each other. There are two types of player in this part, a sender and a receiver. You will play both roles: at first as a sender and then as a receiver. Each person will be allocated with the same amount of £X. Firstly, each of you will simultaneously decides as if you were the sender. As a sender you will have the opportunity to send some of the £X to the other person (receiver). Each pound sent to the receiver will be tripled. Thus, if the sender sends £x, the other player will receive £3x. Then, without observing the choice of the other sender, you will be asked to choose as if you were the receiver. You will have to decide how much money to send back to the sender for any possible amount of money that you can receive. Once the decisions are made, the computer will choose with equal probability which member of the pair is the sender and who is the receiver, implementing the corresponding choices. The earnings of the sender from this part will correspond to the amount of the endowment of £X he/she keeps for his/ herself plus the money returned by the receiver. The earnings of the receiver from this part will correspond to the endowment of £X, plus three times the transfer from the sender, minus the money returned to the sender.

Payment forms
See Fig. 7.  6 Consent forms for the NoFtF (left) and FtF (right) treatments. Subjects completed and handed the forms before entering the lab. The change in the format is due to a change in EssexLab policy happened between the two field treatments

B Additional results
This section presents additional results and focuses on the relation between behaviour in the list game and the other laboratory tasks. As described in Sect. 2, other individual attitudes as risk preferences, individual greed, and trust were further elicited during the laboratory sessions. Table 3 shows how behaviour in these tasks correlates with cheating. The first three columns represent linear average effects on choices in the list game, while specifications 4-6 show marginal effects on a dichotomous variable that takes value one if a subject lied in the same task. The variable Yes represents the report made in the die-roll game. The variable Risk corresponds to the lottery chosen in Part 5 and can take integer values starting from one, which corresponds to the safe option, to five, where higher numbers are associated with higher risk. The last two variables, Transfer dictator and Transfer trust, correspond to the money sent to the receiver in the dictator and trust game, respectively. 26 Similar to what is found in Hübler et al. (2018), it seems that participants who are more willing to choose risky lotteries are also more likely to lie. As dishonesty highly depends on the perceived risk of being exposed as a liar , it is reasonable to assume that individuals who are more prone to cheat, are also more willing to bear the risk associated with it. Fig. 7 Payment forms for the NoFtF (left) and FtF (right) treatments. The unique and hidden code that characterizes each form is given by the combination of the number of dots in the ''Payment date'' and ''Total earnings'' fields (subject id), and the lenght of the line below the email address (session id). To prevent copies, an university logo was stamped in the bottom right corner of the paper sheets Focusing on the variables Transfer dictator and Transfer trust, it is possible to note that both of them are inversely related with cheating. However, the coefficient representing the amount of money sent in the dictator game is significant in none of the probability models. This correlation translates into the relation between social preferences and dishonesty. Participants that are more generous or more trusting, cheat, on average, by a lower amount and less frequently. This suggests that individuals who value social preferences the most are also those who attribute high value to social norms or, in this particular case, honesty.
On what concerns how cheating relates to demographic co-variates elicited in the final questionnaire, no particular effect is found. Tables 6 and 7 (Appendix C) show no robust and significant pattern for any of the individual demographics or personality traits.

C Additional tables
See Tables 4, 5, 6, 7, and 8. Participants did not receive information regarding lottery's expected value and standard deviations. Specifications 1-3 represent least square estimations on choices made in the list game. Specifications 4-6 represent marginal effects on a dummy variable indicating whether a subject lied in the list game or not.