1 Introduction

Cash is king? According to standard economic theory, a monetary incentive is always better—or at least not worse—than a non-monetary incentive of equal market value due to the option value of cash (Jeffrey 2009; Waldfogel 1993). It is often difficult for companies to determine the preferences of individual employees and choose the most suitable non-monetary incentives. It is thus reasonable to assume that companies may occasionally choose inappropriate material incentives that do not match employees’ preferences. As a result, employees would be better off receiving cash incentives, which enable them to purchase benefits that maximize their individual utilities (Jeffrey 2009). However, although the use of non-monetary benefits is not reasonable from a neoclassical viewpoint, it is a widespread phenomenon within companies (Kauflin 2017; Zepelin 2017). Besides monetary incentives such as profit-sharing or bonus payments, non-monetary benefits such as restaurant coupons for meals (Condly et al. 2003), incentive travel, merchandise (i.e., electronics, luggage, or watches), and gift cards are often used by companies to reward top performing employees (Incentive Research Foundation 2016, 2017). A famous example is the cosmetics company Mary Kay, which rewards its top salespersons with luxury goods such as exclusive pink Cadillacs, diamond bracelets, and first-class trips to cities in foreign countries (Howell and Wanasika 2019).

This study investigates the effects of monetary, non-monetary, and a combination of monetary and non-monetary (mixed) incentives on performance, where non-monetary incentives are defined as tangible incentives with market value. To this end, we conducted a laboratory experiment with four different treatments (i.e., monetary, nonmonetary, mix, and control) and implemented a tournament, in which participants could earn a prize in addition to their fixed wage, according to their performance rank. The additional prize depended upon the treatment group: subjects in the monetary treatment group received cash prizes, those in the nonmonetary treatment group received non-monetary prizes (Lindt chocolates), and those in the mix treatment group a combination of non-monetary and monetary prizes (cash and Lindt chocolates). The task consisted of solving simple mathematical problems with the number of correctly solved problems serving as a performance measure.

Our experimental data indicate that, overall, there is no significant difference in performance between the treatment groups monetary, nonmonetary, or mix. However, when considering gender separately, a different picture is revealed: men’s performance in response to monetary incentives is significantly higher than in response to non-monetary incentives, while women’s performance is significantly higher in response to non-monetary incentives. Furthermore, our results suggest that these gender differences regarding the impact of monetary and non-monetary incentives on performance do not seem to be evoked by the perceived prize attractiveness.

To date, the economic literature has focused mainly on monetary incentives. Monetary incentives are considered powerful incentives suitable for enhancing employees’ performance (Condly et al. 2003; Jenkins et al. 1998; Prendergast 1999). However, several empirical studies show that monetary incentives do not always enhance performance and can even have detrimental effects. This negative effect on performance is often ascribed to the motivation crowding-out effect (Deci and Ryan 2002; Frey 1997; Frey and Jegen 2001; Gneezy and Rustichini 2000; Mellström and Johannesson 2008; Titmuss 1970) or to the existence of reference-dependent preferences (Camerer et al. 1997; Fehr and Goette 2007; Pokorny 2008). Regarding non-monetary incentives, the existing empirical research focuses on their impact on performance either within the context of gift-exchange games, where incentives are given independently of performance (Kube et al. 2012; Mahmood and Zaman 2010), or within settings in which incentives are directly related to performance. Examples of the latter are tournaments (Hammermann and Mohnen 2014a; Jeffrey 2009; Kelly et al. 2017; Shaffer and Arkes 2009) and incentive bonus schemes, in which subjects receive a bonus after exceeding a pre-specified productivity threshold (Bareket-Bojmel et al. 2017).

Moreover, existing research has demonstrated a positive effect of performance-contingent non-monetary incentives (Jeffrey 2009; Kelly et al. 2017; Presslee et al. 2013). However, empirical research regarding the effectiveness and underlying psychological mechanisms of non-monetary incentives is still in its early stages and there is hitherto no clear evidence on whether monetary or non-monetary incentives are more effective. Furthermore, to the best of our knowledge, no study exists on the effects of a combination of monetary and non-monetary incentives, although this topic is of great importance as many companies use both monetary and non-monetary incentives to reward their employees. Moreover, there is only limited research dealing with gender differences in different incentive schemes (Gneezy et al. 2003; Jalava et al. 2015; Levitt et al. 2016; Masclet et al. 2015; Niederle and Vesterlund 2007). While previous literature has identified gender differences in tournament and competition settings, showing that women—in contrast to men—work reluctantly in competitive environments and shy away from competition (Datta Gupta et al. 2013; Dohmen and Falk 2011; Niederle and Vesterlund 2007), our experiment analyzes which incentives—monetary, non-monetary or mixed—are more effective in a tournament setting in relation to gender differences.

The contribution of this study to the literature is threefold. First, the effects of monetary and non-monetary incentives in a tournament setting are analyzed to obtain a clearer perspective and to provide an explanation for the equivocal results in the experimental literature regarding the question of which kind of incentive—monetary or non-monetary—is more effective. Second, the study endeavors to extend the literature by analyzing the effects of a combination of non-monetary and monetary incentives on performance in a tournament setting. Finally, it analyzes gender differences concerning the impact of non-monetary, monetary, and mixed incentives. To the best of our knowledge, neither mixed incentives nor gender differences regarding the impact of different kinds of incentives on performance have been analyzed before.

The remainder of this paper is organized as follows. Section 2 presents a review of the existing literature, followed by the hypotheses in Sect. 3 and a description of the experimental design in Sect. 4. Section 5 presents the results. Finally, Sect. 6 provides an in-depth discussion and Sect. 7 a concluding summary, containing the management implications of our findings as well as their limitations and directions for future research.

2 Literature review

As previously mentioned, according to standard economic theory, monetary incentives are always better, or at least not worse, than non-monetary incentives are (Jeffrey 2009; Waldfogel 1993). Nevertheless, several empirical studies show, to the contrary, that non-monetary incentives can have a stronger positive effect on performance than do monetary incentives of equivalent value.

In a controlled field experiment, in which workers had to catalog books at a university library, Kube et al. (2012) analyze the impact of gifts, that is, performance-unrelated incentives, on performance. Their study reveals that workers’ performance is 25% higher when they receive a non-monetary gift (a thermos bottle), whereas a cash gift of equivalent value has no significant impact on their productivity. Kube et al. (2012) suggest that individuals might perceive the non-monetary gift as an act of generosity from the employer, who evidently invested time and effort into the gift, which thus elicits positive reciprocal behavior. Furthermore, Lacetera and Macis (2010) show in their experimental study that while cash has a detrimental effect on the willingness to donate blood, non-cash incentives such as vouchers do not have adverse effects on pro-social activities. According to Heyman and Ariely (2004), the type of market, whether monetary or social, determines the relationship between payment and effort. In the former case, effort seems to stem from reciprocal motives and subjects determine their effort based on a simple cost–benefit analysis; in the latter case, where non-monetary incentives are used, effort seems to stem from altruistic motives.

Jeffrey and Shaffer (2007) identify four key psychological concepts that explain the motivational power and effectiveness of non-monetary tangible incentives: justifiability, social reinforcement, separability (based on Thaler 1999), and evaluability. The justifiability concept refers to the need people feel to justify spending money on luxurious goods. However, if people earn these items as a reward for good performance, any guilt is relieved and there is no need for the employees to justify consuming such items. According to the social reinforcement argument, non-monetary incentives have a trophy value, as they are highly visible in the recipient’s social environment, which brings indirect attention to the employee’s performance. The separability argument is based on the mental accounting theory of Thaler (1999), which states that individuals have different mental accounts for different earning types and do not consider their income collectively. Non-monetary incentives are evaluated independently of other income sources and, therefore, may have a higher impact than monetary incentives. Furthermore, non-monetary incentives allow subjects to mentally adjust the value of the benefit in both directions: upwards, if the benefit seems to be attainable, and downwards, if the benefit seems to be out of reach (the evaluability argument). These findings indicate that non-monetary benefits are perceived differently from cash gifts and thus elicit different behaviors.

In addition to the impact of performance-unrelated non-monetary gifts, existing literature analyzes the effects of performance-related incentives, which is also the focus of our study. By means of a laboratory experiment with the staff members of a university, Jeffrey (2009) investigates the motivational power of tangible non-cash incentives. His results show that non-monetary incentives are more efficient in enhancing performance in comparison to monetary incentives of the same value, although individuals stated their preference for monetary incentives. He explains this result in terms of justification concerns, as people might have to justify the purchase of hedonic luxury goods. However, Hammermann and Mohnen’s (2014a) experimental study does not support these findings. The authors analyze work performance in competitions and the effects of non-monetary and monetary prizes. In contrast to Jeffrey (2009), they do not focus on justification concerns, but rather on the higher visibility of non-monetary incentives. Their results show that monetary incentives are more efficient in enhancing performance in comparison to non-monetary ones. This is also in line with the results of Condly et al. (2003), who show by means of a meta-analytic review that money has a higher impact on performance than non-monetary tangible incentives do. However, Condly et al. (2003) also remark that the generalizability of their findings is limited, as they are based on a small number of studies considering non-monetary incentives and the actual market values of the non-monetary incentives used in their meta-analysis could not be determined. Moreover, there is empirical evidence that people think more often about non-monetary tangible incentives than monetary incentives; this higher thought frequency positively affects performance (Jeffrey and Adomdza 2010).

In contrast to the aforementioned studies, Bareket-Bojmel et al. (2017), who analyze in a field study short-term bonus payments that subjects receive after exceeding a predefined productivity goal, do not find a significant difference between the impact of non-monetary (family pizza meal voucher) and monetary incentives on productivity; nevertheless, both types of incentives increased productivity significantly. This is in line with the results of Shaffer and Arkes (2009), who also do not find a significant difference in the effects of cash and non-cash incentives in a tournament setting. However, the incentive effect was weak as a single-winner tournament was used in their setting (Harbring and Irlenbusch 2008; Kelly et al. 2017).

Kelly et al. (2017) show that the positive effect of non-monetary incentives evolves only over time. In a repeated tournament setting, they find that cash and non-cash incentives did not evoke different performance levels during the first tournament, but that losers of the first tournament performed better in the second tournament in the non-cash condition than in the cash condition. Therefore, non-cash incentives had a higher performance effect than cash incentives in the second tournament. Kelly et al. (2017) suggest that, in the first tournament, the fungibility of cash has a greater effect than the suggested higher attractiveness of the non-monetary incentive resulting from the categorization of cash and non-cash incentives to different mental accounts. Nevertheless, in the second tournament, losers in the non-cash condition overweighted the possibility of winning an attractive non-cash incentive and thus increased their efforts more than those in the cash condition.

Regarding gender differences in the effectiveness of performance-related incentives, existing studies find no significant gender differences in performance under simple piece-rate schemes (Gneezy et al. 2003; Niederle and Vesterlund 2007). Furthermore, existing research discusses the effect of gender differences on the effectiveness of non-monetary tangible and intangible incentives in schools (Jalava et al. 2015; Levitt et al. 2016; Riener and Wagner 2019). Levitt et al. (2016) show that, under low financial incentives, boys perform significantly better than girls, whereas in the non-financial treatment, where they can earn a trophy, there are no differences in performance.

However, to the best of our knowledge, the literature has not yet analyzed gender differences regarding the effectiveness of different types of incentives of equal value in a tournament setting. Therefore, our study contributes to the literature on the effectiveness of monetary, non-monetary, and mixed incentives in tournaments, with particular regard to gender differences.

3 Hypotheses

First, we discuss the overall effects of monetary, non-monetary, and mixed incentives on performance and, second, the possible gender differences regarding the effectiveness of these incentives.

3.1 Effectiveness of monetary, non-monetary, and mixed incentives

Based on the findings of extant empirical studies (Condly et al. 2003; Jeffrey 2009) we assume that, overall, performance-related incentives have a positive impact on performance in a tournament setting. Furthermore, following Jeffrey and Shaffer (2007), we argue that—in contrast to monetary incentives—non-monetary incentives have motivational properties in themselves (in addition to their market value), as they are highly visible (the social reinforcement argument) and can be evaluated independently of other income (the separability argument). Employees not only obtain the utility of the incentive per se, but also enjoy the recognition and acknowledgement of their performance within their social environment. While cash bonuses are typically invisible to others and people usually avoid discussing monetary rewards, non-monetary rewards have a trophy value and are highly visible, which fosters social communication of an employee’s strong performance (Jeffrey and Shaffer 2007). As individuals strive for social esteem and recognition (Bandura 1986; Ellingsen and Johannesson 2007; Stajkovic and Luthans 2003), the value of earning a tangible incentive is enhanced (Jeffrey and Shaffer 2007). Rewarding employees for good performance and showing respect and appreciation by means of non-monetary incentives may thus have a positive effect on employees’ effort choices (Ellingsen and Johannesson 2007; Hammermann and Mohnen 2014b; Kube et al. 2012). Therefore, we suggest that non-monetary incentives may lead to higher performance compared to monetary incentives.

Regarding the separability argument, Thaler’s (1999) mental accounting theory suggests that people have different mental accounts and do not consider their incomes collectively; that is, they cognitively divide different components of their incomes and value them separately in different mental accounts (Jeffrey and Shaffer 2007; Kelly et al. 2017). Jeffrey and Shaffer (2007) emphasize that any additional earnings might have a diminishing marginal utility for the employee, as he or she will mentally combine these earnings with the base salary and evaluate them relative to this salary. In contrast, employees will evaluate non-monetary incentives separately from the base salary. Choi and Presslee’s (2016) experimental results support this argument and show that subjects perform better when they categorize performance-related pay separately from salary. The allocation of cash and non-cash incentives to different mental accounts further influences people’s intentions of how to spend them. This in turn affects their attractiveness; cash incentives are mostly spent on necessities and utilitarian products, while non-monetary incentives often have hedonic attributes and are more attractive, thus leading to better performance (Kelly et al. 2017).

Based on the results of previous studies and the outlined psychological concepts, we posit:

Hypothesis 1

Monetary, non-monetary, and mixed incentives have a positive impact on performance.

Hypothesis 2

Non-monetary incentives have a higher positive impact on performance compared to monetary incentives.

Furthermore, we assume that by combining monetary and non-monetary incentives, the employer can combine the benefits of the former, namely the option value of cash (Jeffrey 2009; Waldfogel 1993), with the benefits of the latter, such as the motivational properties and attractiveness evoked by the psychological concepts of social reinforcement and separability (Jeffrey and Shaffer 2007). Mixed incentives may suit both the subjects who prefer monetary incentives and those who prefer non-monetary incentives. For example, whereas women seem to appreciate non-monetary incentives, men seem to place more value on monetary incentives (Clark 1997; Elizur 1994), which we will discuss in more detail in Sect. 3.2. Assuming that preferences for monetary and non-monetary incentives are equally distributed, we argue that mixed incentives should lead to a higher overall performance than either pure non-monetary or pure monetary incentives will. Following this reasoning leads to our third hypothesis:

Hypothesis 3

Mixed incentives have a higher positive impact on performance than either pure monetary incentives or pure non-monetary incentives.

3.2 Gender differences in the effectiveness of monetary, non-monetary, and mixed incentives

Moreover, this study addresses the possible role that gender differences play in the effectiveness of monetary, non-monetary, and mixed incentives in a tournament setting. To date, literature has focused on gender differences in tournament schemes and competitions. As such, there is considerable evidence that women are reluctant to work in competitive environments and shy away from competition, while men embrace competitive environments (Buser et al. 2014; Datta Gupta et al. 2013; Dohmen and Falk 2011; Masclet et al. 2015; Niederle and Vesterlund 2007). Furthermore, while women falter under performance pressure, men do well (Azmat et al. 2016; Shurchkov 2012). Following the literature, we assume that individuals are more focused on output when monetary incentives are at stake; this might lead to higher competitiveness, as individuals strive for monetary prizes (Hammermann and Mohnen 2014a; Vohs et al. 2008). Moreover, according to Heyman and Ariely (2004), money affects subjects’ perceptions and results in a shift from a social to a money market. In contrast, non-monetary prizes might reframe a competitive market as a more social market, thereby weakening the competitiveness of a tournament. Therefore, women might feel more comfortable and perform better in a competition where non-monetary incentives are at stake. In contrast, men seek competition and thus perform better when monetary prizes are at stake, being more persistent in pursuing them.

These possible gender differences in the effectiveness of monetary and non-monetary incentives might be due not only to different reactions to the perceived competitiveness and performance pressure, but also to feelings of appreciation. As outlined in Sect. 3.1, non-monetary incentives can address employees’ need for acknowledgement (Ellingsen and Johannesson 2007; Hammermann and Mohnen 2014b; Kube et al. 2012). However, the most appropriate type of incentive to reward and acknowledge employees might differ between men and women. Several studies show that extrinsic job dimensions such as pay and promotion prospects are of high importance for men, while women place more value on social aspects such as a positive relationship with the manager (Clark 1997; Elizur 1994). When the employer invests time in seeking and buying a prize, female employees may perceive the prize as being more personal than a pure monetary prize; thus, the non-monetary prize may signal more appreciation and evoke a higher degree of positive reciprocity and performance than a monetary one (Jalava et al. 2015; Kube et al. 2012; Prendergast and Stole 2001).

Assuming that individuals have standard preferences, that is, monotonic preferences (“more is always better”), we argue that non-monetary incentives are superior to mixed incentives, and mixed incentives are superior to monetary incentives for women. Conversely, as men are more concerned with pay, we assume the reverse will hold true for them. We thus posit the following hypotheses regarding gender differences:

Hypothesis 4

For men, monetary incentives have a higher positive impact on performance than mixed incentives, and mixed incentives have a higher impact than non-monetary incentives.

Hypothesis 5

For women, non-monetary incentives have a higher positive impact on performance than mixed incentives, and mixed incentives have a higher impact than monetary incentives.

4 Experimental design and data

To analyze the effects of monetary, non-monetary, and mixed incentives, we conducted a real-effort experiment using z-Tree (Fischbacher 2007). Participants were recruited using the online recruitment system ORSEE (Greiner 2004) and were randomly assigned to one of four treatment groups. The experiment consisted of one working period and the task was to solve simple mathematical problems. Each mathematical problem contained two equations, each consisting of three one-digit numbers, which had to be added or subtracted. To calculate the final solution, subjects had to subtract the lower from the higher result of the single equations.Footnote 1 To ensure that participants understood the task, the working period was preceded by a test period, in which subjects had to solve five mathematical problems. Participants were permitted to continue only if they answered all five mathematical problems correctly. The time subjects needed to solve the test equations correctly (testtime) served as an ability check. Before the participants started the working period, they were informed that they were to work for 15 min and that they could decide to either solve mathematical problems or read articles of different genres (i.e., society, culture, travel, economics, science, and technology). Subjects were allowed to switch between the two options at any time. Reading articles served only as an outside option and was not relevant for the performance ranking. However, according to Corgnet et al. (2015), it is important to offer an outside option to avoid performance triggered by boredom and people working only because of a lack of desirable alternatives in the laboratory.
Furthermore, participants were informed that they would receive a fixed wage of 10 euros and that they could earn an additional prize depending on their performance rank.Footnote 2 Following Hammermann and Mohnen (2014a) and Jeffrey (2009), we implemented a tournament with four performance ranking groups to avoid a middle group. Each session included 22 subjects. The first ranking group consisted of the subject who performed best, the second group included ranks 2–8, the third group ranks 9–18, and the worst group ranks 19–22. Performance was measured by the number of correctly solved mathematical problems (score). If two participants had the same score, the ratio of correctly solved problems to overall completed problems decided the ranking. If this indicator was still equal, the rank was decided by chance.Footnote 3
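The ranking rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `(correct, attempted)` data layout, and the use of a random number as the final tie-breaker are hypothetical choices and not part of the z-Tree implementation used in the experiment.

```python
import random

# Rank-group boundaries taken from the text: rank 1, ranks 2-8, 9-18, 19-22.
# Each pair is (highest rank in the group, group number).
GROUP_UPPER_RANKS = [(1, 1), (8, 2), (18, 3), (22, 4)]


def rank_group(rank):
    """Map a 1-based rank within a 22-person session to its ranking group."""
    if not 1 <= rank <= 22:
        raise ValueError("rank must be between 1 and 22")
    for upper, group in GROUP_UPPER_RANKS:
        if rank <= upper:
            return group


def rank_session(results, rng=None):
    """Order subjects by score, then by accuracy, then by a random draw.

    `results` maps a subject id to (correct, attempted); this layout is a
    hypothetical choice. Returns {subject_id: (rank, group)}.
    """
    rng = rng or random.Random()

    def sort_key(item):
        _, (correct, attempted) = item
        accuracy = correct / attempted if attempted else 0.0
        # Negate so that higher score and higher accuracy sort first;
        # the random number only matters when both are tied.
        return (-correct, -accuracy, rng.random())

    ordered = sorted(results.items(), key=sort_key)
    return {sid: (pos, rank_group(pos))
            for pos, (sid, _) in enumerate(ordered, start=1)}
```

With 22 subjects per session, calling `rank_session` on the session's results yields each subject's rank and the ranking group that determines the prize.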

Altogether, we conducted four treatments, which differed in the prizes subjects could earn according to their relative performance ranks. In our benchmark treatment control subjects only received a fixed wage of 10 euros without any additional prize, but were informed of their relative position afterwards. In treatments monetary, nonmonetary, and mix, participants were able to earn an additional prize according to their rank in addition to their fixed wage of 10 euros. The value of prizes increased with performance and rank. In the monetary treatment, the best performing subject received 10 euros, subjects of the second best performance ranking group 5 euros, those of the third group 2.50 euros, and the worst group received no prize. The prize for the best subject in treatment nonmonetary was a large box of Lindt chocolates (Lindt Pralinés Hochfein) worth 10 euros, subjects in the second group received a medium-sized box of Lindt chocolates (Lindt Mini Pralinés 100 g) worth 5 euros, and subjects in the third group received a small box of Lindt chocolates (Lindt Mini Pralinés 44 g) worth 2.50 euros. As in the monetary treatment, subjects in the fourth group received no prize. Finally, to analyze the effects of a combination of non-monetary and monetary incentives, we implemented the mix treatment, where subjects received a combination of non-monetary and monetary incentives based on their performance ranking: 5 euros + a medium-sized box of Lindt chocolates (Lindt Mini Pralinés 100 g), worth 5 euros for the best performance; 2.50 euros and a small box of Lindt chocolates (Lindt Mini Pralinés 44 g), worth 2.50 euros for the second best performance group; 1.25 euros + two Lindt chocolates (Fioretto), worth 1.25 euros for the third group; and 0 euros for the worst group (see Fig. 1 for an overview of treatments and prizes).
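The prize schedule can be written down as a small lookup table, which makes it easy to confirm that the three incentive treatments offer prizes of equal market value within each ranking group. The sketch below is illustrative only; the dictionary layout and function names are assumptions, while the amounts are transcribed from the description above.

```python
# Prize schedule by treatment and ranking group, stored as
# (cash in euros, market value of chocolates in euros).
PRIZES = {
    "control":     {1: (0.0, 0.0),  2: (0.0, 0.0), 3: (0.0, 0.0),   4: (0.0, 0.0)},
    "monetary":    {1: (10.0, 0.0), 2: (5.0, 0.0), 3: (2.5, 0.0),   4: (0.0, 0.0)},
    "nonmonetary": {1: (0.0, 10.0), 2: (0.0, 5.0), 3: (0.0, 2.5),   4: (0.0, 0.0)},
    "mix":         {1: (5.0, 5.0),  2: (2.5, 2.5), 3: (1.25, 1.25), 4: (0.0, 0.0)},
}
FIXED_WAGE = 10.0  # euros, paid in every treatment


def prize_value(treatment, group):
    """Total market value (cash plus chocolates) of the prize in euros."""
    cash, goods = PRIZES[treatment][group]
    return cash + goods


def total_compensation(treatment, group):
    """Fixed wage plus the market value of the prize, in euros."""
    return FIXED_WAGE + prize_value(treatment, group)


def values_match_across_treatments():
    """Check that each ranking group has the same prize value in all incentive treatments."""
    incentive_treatments = ("monetary", "nonmonetary", "mix")
    return all(
        len({prize_value(t, g) for t in incentive_treatments}) == 1
        for g in (1, 2, 3, 4)
    )
```

The self-assessment bonus of 2 euros is deliberately left out of `total_compensation`, since it does not depend on the treatment.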

Fig. 1 Overview of prizes

Prior to the working period, subjects in treatments nonmonetary and mix were shown a picture of the prizes and were told the market price of the Lindt chocolates to avoid them over- or underestimating the value of the non-monetary incentives (see Fig. 8 in “Appendix” for a screenshot example). Additionally, based on Hammermann and Mohnen (2014a), subjects were asked to rate the attractiveness of Lindt chocolates (“I consider Lindt chocolates attractive”) on a five-point Likert scale ranging from 0 for no attractiveness, to 4 for full attractiveness, and to rank the prizes according to their desirability to control for possible different preferences (regarding chocolate) between individuals, especially because of the stereotypes between men and women (Wiseman 2010).

After the working period subjects were asked to self-assess their performance by selecting which performance ranking group (1–4) they believed they belonged to. The self-assessment was not announced in advance and came as a surprise for the subjects after the working period (for more detailed information see the experimental instructions in “Appendix”). To elicit subjects’ accurate guesses about their own relative performance, we incentivized correct assumptions with 2 euros. Subjects then received feedback about their performance ranking, the number of mathematical problems they had solved and the proportion of correctly solved problems. Subsequently, participants had to answer a questionnaire on appreciation, motivation, other personal traits, and demographics.Footnote 4 Finally, they received their payment. All treatments were randomly distributed across different sessions and times to ensure that treatment effects were not mixed up with, for example, general performance shocks, arising at different times of the day (Kube et al. 2012). For an overview of the experimental timeline, see Fig. 2.

Fig. 2 Experimental timeline

Our experiment took place from December 2016 to February 2017 at a large German university. In total, 264 studentsFootnote 5 from various faculties, mainly economics (42%), engineering (24%), and industrial engineering (7%), participated in our experiment; 63% were male, and the average age was 23 years (SD = 4.0). According to a Kruskal–Wallis test, there were no significant differences in age (p = 0.505) or gender (p = 0.787) between treatments. The experiment lasted around 40 min and subjects received a fixed payment of 10 euros and an additional prize, depending on the treatment they were assigned to. Furthermore, subjects could earn an additional 2 euros if they self-assessed their performance correctly. We had a total of 66 subjects in control, 65 in monetary, 66 in nonmonetary, and 65 in mix.

5 Results

5.1 Effectiveness of monetary, non-monetary, and mixed incentives

As outlined above, the aim of our experimental study is to analyze the effects of performance-related monetary, non-monetary, and mixed incentives on workers’ performance. The number of correctly solved mathematical problems (score) served as a measure of performance and was considered our main outcome variable. Figure 4 in “Appendix” shows the mean work performance in the different treatments. To ensure that possible differences in performance were not the result of ability, but of effort, we compared ability [measured by the time needed to solve the mathematical problems correctly in the test period (testtime)] between treatments. Overall, there were no differences in ability between treatments (Kruskal–Wallis test: p = 0.310). Looking at the pairwise treatment tests, there were no significant differences in ability between control and monetary (Wilcoxon rank-sum test: p = 0.281; t-test: p = 0.228) or control and mix (Wilcoxon rank-sum test: p = 0.109; t-test: p = 0.237). Ability in nonmonetary differs from the control treatment at the 10% significance level (Wilcoxon rank-sum test: p = 0.097; t-test: p = 0.099). However, looking at a second ability measure, the number of false answers given in the test period (testerror), there were no significant differences between control and nonmonetary (Wilcoxon rank-sum test: p = 0.126; t-test: p = 0.229).Footnote 6 Overall, the ability to solve simple mathematical problems did not significantly differ between men and women (Wilcoxon rank-sum test: p = 0.132; t-test: p = 0.281). This result remains robust when conducting treatment-wise tests for gender differences in ability (control: Wilcoxon rank-sum test: p = 0.684, t-test: p = 0.822; monetary: Wilcoxon rank-sum test: p = 0.124, t-test: p = 0.240; nonmonetary: Wilcoxon rank-sum test: p = 0.577, t-test: p = 0.959; mix: Wilcoxon rank-sum test: p = 0.556, t-test: p = 0.277). In addition, we separately conducted pairwise treatment tests for ability for women and men.
There were no significant differences in ability between treatments for either women or men.Footnote 7 Table 1 presents a statistical summary of the main variables included in the analyses, divided by treatment and gender.

Table 1 Descriptive statistics

Our results indicate a positive relationship between monetary incentives and performance: performance was 8.28% higher in monetary than in control (Wilcoxon rank-sum test: p = 0.175; t-test: p = 0.089). Non-monetary and mixed incentives also had a positive impact on performance (nonmonetary versus control: Wilcoxon rank-sum test, p = 0.081; t-test, p = 0.117; mix versus control: Wilcoxon rank-sum test, p = 0.068; t-test, p = 0.043). The average number of correctly solved mathematical problems was approximately 8.27% higher in nonmonetary than in control, and 9.53% higher in mix than in control. However, there were no significant differences in performance between monetary and nonmonetary (Wilcoxon rank-sum test: p = 0.708; t-test: p = 0.998). On average, subjects correctly solved roughly the same number of mathematical problems in these two treatments. The participants in mix exerted slightly more effort than those in monetary or nonmonetary, but this difference was not statistically significant (mix versus monetary: Wilcoxon rank-sum test, p = 0.767; t-test, p = 0.784; mix versus nonmonetary: Wilcoxon rank-sum test, p = 0.961; t-test, p = 0.801).
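The pairwise comparisons reported above combine a relative difference in mean scores with a Wilcoxon rank-sum test. The following is a minimal sketch of both calculations, not the authors' analysis code. The function names and the score lists are hypothetical and invented for illustration, and the p-value relies on the large-sample normal approximation without a tie correction, so a statistical package may report slightly different values.

```python
import math
from statistics import mean


def percent_difference(treated, control):
    """Difference in mean score, expressed in percent of the control mean."""
    return 100.0 * (mean(treated) - mean(control)) / mean(control)


def midranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Wilcoxon rank-sum z statistic and two-sided p-value.

    Assumption: normal approximation, no tie correction of the variance.
    """
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])                              # rank sum of sample x
    mu = n1 * (n1 + n2 + 1) / 2                      # expected rank sum under H0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # std. dev. under H0
    z = (w - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))             # two-sided normal p-value
    return z, p


# Hypothetical scores, invented for illustration only (not the study's data).
control = [20, 22, 19, 24, 21, 23]
monetary = [23, 25, 21, 27, 24, 22]

z, p = rank_sum_test(monetary, control)
print(f"relative difference: {percent_difference(monetary, control):.2f}%")
print(f"rank-sum z = {z:.3f}, two-sided p = {p:.3f}")
```

With samples as small as the ones in this sketch, an exact test would be preferable to the normal approximation; the approximation is used here only to keep the example dependency-free.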

Additionally, we analyzed work quality, measured by the number of incorrectly solved mathematical problems (error). According to our data, prizes have a slight negative impact on work quality in treatments mix, monetary, and nonmonetary. Nevertheless, according to the Kruskal–Wallis test, work quality did not differ significantly between treatments monetary, nonmonetary, mix, and control (p = 0.840).

We further conducted ordinary least squares (OLS) regressions to examine if the treatment effects found in the non-parametric/parametric tests were robust and to control for potential differences in abilities and other variables. Models (1)–(3) in Table 2 show the results for treatments monetary, nonmonetary, mix, and control, including all 262 subjects, while Models (4) and (5) consider only treatments monetary, nonmonetary, and mix, and thus contain 196 subjects. We included dummy variables for treatments monetary, nonmonetary, and mix, with control as a reference group (in Model (4) monetary served as reference group, and in Model (5), mix was the reference group). Furthermore, the variable intrinsicmot was included to control for intrinsic motivation. This was based on (dis)agreement with the statement “I had fun solving the task” measured on a seven-point Likert scale. The variable belief served as an indicator of which performance-ranking group subjects believed they belonged to, since a subject’s payoff in a tournament does not depend only on his or her own performance, but also on that of the other subjects. The variable riskaversion indicates the willingness to take risks in general, being measured on an 11-point Likert scale (from 0 being completely risk averse to 10 being completely risk seeking), and was included as subjects’ remuneration is uncertain in the tournament setting. Finally, participants’ personal characteristics, namely age, gender, and ability (testtime), were also inserted in the model as control variables.

Table 2 OLS regressions on work performance (score)

The results of the OLS regressions, estimated with robust standard errors, confirm the previous findings. Model (3) shows a positive effect of monetary, non-monetary, and mixed incentives on performance that is significant at the 5% level. Moreover, the positive effect of mixed prizes remains significant regardless of whether explanatory variables are included [Models (1)–(3)]. To conclude, these results support hypothesis 1.
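The structure of these regressions (treatment dummies with a reference group, estimated with robust standard errors) can be sketched as follows. This is an illustration under stated assumptions rather than the estimation code behind Table 2: the data are hypothetical, the control variables are omitted for brevity, the helper names `design_matrix` and `ols_robust` are invented, and the HC1 form of the robust covariance is an assumption, since the text does not state which robust estimator was used.

```python
import numpy as np

TREATMENTS = ("control", "monetary", "nonmonetary", "mix")


def design_matrix(treatments, reference="control"):
    """Constant plus one dummy per treatment other than the reference group."""
    levels = [t for t in TREATMENTS if t != reference]
    columns = [np.ones(len(treatments))]
    columns += [(treatments == level).astype(float) for level in levels]
    return np.column_stack(columns), ["const"] + levels


def ols_robust(X, y):
    """OLS coefficients with HC1 heteroskedasticity-robust standard errors."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich estimator: (X'X)^-1 X' diag(e^2) X (X'X)^-1, scaled by n/(n-k).
    meat = X.T @ (X * (resid ** 2)[:, None])
    cov = xtx_inv @ meat @ xtx_inv * n / (n - k)
    return beta, np.sqrt(np.diag(cov))


# Hypothetical scores, invented for illustration only (not the study's data).
treatments = np.array(
    ["control"] * 4 + ["monetary"] * 4 + ["nonmonetary"] * 4 + ["mix"] * 4
)
score = np.array(
    [20, 22, 19, 23, 24, 23, 25, 22, 23, 25, 22, 24, 26, 23, 24, 25],
    dtype=float,
)

X, names = design_matrix(treatments)  # control is the reference group
beta, se = ols_robust(X, score)
for name, b, s in zip(names, beta, se):
    print(f"{name:12s} coef = {b:6.3f}  robust se = {s:.3f}")
```

Without further controls, each dummy coefficient equals the difference between that treatment's mean score and the mean of the reference group. Passing `reference="monetary"` to `design_matrix` mirrors the change of reference group described for Model (4).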

Some of the other explanatory variables also show a significant effect on performance. First, the more intrinsically motivated the subjects were to execute the task (intrinsicmot), the better their performance. There were no differences in stated intrinsic motivation between treatments (Kruskal–Wallis test: p = 0.934); thus, intrinsic motivation is not influenced by the nature of the incentive. Second, age had a significant positive impact on performance. Third, a lower self-assessment of performance relative to the other participants (belief) was correlated with lower actual performance. Finally, the lower the ability (i.e., the longer the testtime), the worse the performance, although the coefficient is very small. Our main findings can be summarized as follows:

Result 1

Compared to the benchmark treatment control, monetary, non-monetary, and mixed incentives have a significant positive impact on performance.

Model (4) in Table 2 sheds light on whether monetary or non-monetary incentives are superior in terms of their effects on performance. The data revealed no significant differences in performance between treatments nonmonetary and monetary. In addition, performance in treatments mix, monetary, and nonmonetary did not differ significantly [see Model (5)]. Hence, we note the following results:

Result 2

There are no significant differences in the average incentive effects between treatments nonmonetary and monetary.

Result 3

There are no significant differences in performance between treatments monetary, nonmonetary, and mix.

5.2 Gender differences in the effectiveness of monetary, non-monetary, and mixed incentives

The initial results indicate that monetary, non-monetary, and mixed incentives are equally suitable to enhance employees’ performance. However, considering men and women separately reveals a different picture (see Fig. 3).

Fig. 3 Gender differences in work performance over different treatments

In order to identify possible gender differences, we first conducted non-parametric tests (see Table 3).

Table 3 Non-parametric tests: Gender differences in performance

Monetary and mixed incentives had a significant positive impact on men’s performance. Compared to the control group, the average number of correctly solved mathematical problems was 17.83% higher in monetary and 12.28% higher in mix. Moreover, men’s performance was 12.86% higher in monetary than in nonmonetary, which was significant at the 10% level. In contrast, considering the female sample, monetary or mixed incentives had no significant impact on women’s performance, whereas non-monetary incentives evoked a significant performance increase of 15.95%. Moreover, women’s performance was 22.79% lower in the monetary than in the nonmonetary group, significant at the 1% level. Additionally, in the mix group, women’s performance was significantly higher compared with the monetary group.

To analyze the gender differences in more detail, we conducted OLS regressions and inserted interaction terms of treatments and gender (monetary × gender, nonmonetary × gender, mix × gender) in the regression models (see Table 4).

Table 4 OLS regressions on work performance (score) with regard to gender differences

Men’s and women’s performance did not differ significantly in treatment control [Model (2), p = 0.590]. Furthermore, there were no significant differences in the incentive effect of mixed prizes between men and women [Model (4), p = 0.714]. However, in response to monetary incentives, men’s performance significantly exceeded women’s [Model (3), p = 0.028]. Conversely, in the presence of non-monetary incentives, women performed better than men, although this difference was not significant [Model (3), p = 0.188].

As a second step, we investigated the impact of monetary, non-monetary, and mixed incentives on men’s performance in more detail. Compared to our benchmark treatment control, where no incentives were implemented, Model (2) in Table 4 reveals that monetary and mixed prizes had a highly significant positive effect on men’s performance (monetary: p = 0.000; mix: p = 0.048). Although men’s performance in nonmonetary was slightly better than in control, the difference was not significant (p = 0.436). Furthermore, Model (3) indicates that men’s performance was significantly higher in monetary than in nonmonetary, as predicted by hypothesis 4 (p = 0.020). In Model (4), there was no significant difference between men’s performance in mix and monetary or between their performance in mix and nonmonetary (mix versus monetary: p = 0.123; mix versus nonmonetary: p = 0.320); thus, hypothesis 4 is only partly supported. These results are in line with the non-parametric tests. Therefore, for men, pure monetary incentives are always better, or at least not worse, than non-monetary or mixed incentives of equal market value. There were no differences in men’s stated intrinsic motivation or the belief about one’s performance ranking between treatments according to a Kruskal–Wallis test (intrinsicmot: p = 0.772; belief: p = 0.699).

Result 4

Men’s performance in the monetary treatment is significantly higher than in the nonmonetary treatment. There is no significant difference in men’s performance between the mix and monetary treatments, or between the mix and nonmonetary treatments.

In contrast, the pattern of women’s performance in the treatment groups monetary, nonmonetary, and mix showed a completely different picture, as suggested by the non-parametric tests. To analyze the effect of treatments monetary, nonmonetary, and mix on women’s performance, we conducted linear post-estimation tests after the OLS regressions (e.g., H0: monetary + monetary × gender = 0) and report the relevant t-statistics and p-values below. The results of Model (2) in Table 4 and the linear post-estimation tests show that whereas non-monetary incentives had a highly significant positive impact on women’s performance [t(248) = 3.13, p = 0.002], monetary incentives had a negative impact on their performance compared to control, although this difference was not significant [t(248) = −1.25, p = 0.214]. There was no significant difference between women’s performance in treatments mix and control [Model (2), t(248) = 0.89, p = 0.373]. In addition, Model (3) shows that women’s performance was significantly higher in treatment nonmonetary than in monetary [t(184) = 4.10, p = 0.000]. Furthermore, women performed significantly better in treatment nonmonetary than in mix [Model (4), t(184) = 1.86, p = 0.065], whereas their performance in mix was significantly higher than in monetary [Model (4), t(184) = 2.00, p = 0.047]. Therefore, our results are in line with hypothesis 5. These observed performance differences were not driven by intrinsic motivation, as a comparison of women’s statements concerning fun at work did not show differences between treatments (Kruskal–Wallis test: p = 0.613). Furthermore, there were no differences in women’s beliefs about their performance ranking between treatments (Kruskal–Wallis test: p = 0.350). The findings are summarized as follows:

Result 5

Women’s performance is significantly higher in the nonmonetary treatment than in the monetary or mix treatment. Furthermore, women’s performance is significantly higher in the mix treatment than in the monetary one.
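The linear post-estimation tests behind these results for women, such as H0: monetary + monetary × gender = 0, can be sketched as follows. This is a minimal illustration, not the authors' code. The data are hypothetical and restricted to the control and monetary treatments, the helper names are invented, the coding gender = 1 for women is an assumption inferred from the form of the hypothesis, and the HC1 robust covariance is likewise an assumption.

```python
import numpy as np


def ols_hc1(X, y):
    """OLS coefficients and HC1 robust covariance matrix (assumed variant)."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    meat = X.T @ (X * (resid ** 2)[:, None])
    cov = xtx_inv @ meat @ xtx_inv * n / (n - k)
    return beta, cov


def linear_combination_test(beta, cov, weights):
    """Estimate, standard error, and t statistic for H0: weights' beta = 0."""
    c = np.asarray(weights, dtype=float)
    estimate = float(c @ beta)
    se = float(np.sqrt(c @ cov @ c))
    return estimate, se, estimate / se


# Hypothetical data, invented for illustration only (not the study's data).
# Assumed coding: female = 1 for women, 0 for men.
monetary = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1], dtype=float)
female = np.array([0] * 6 + [1] * 6, dtype=float)
score = np.array([20, 22, 21, 25, 24, 26, 21, 23, 22, 20, 22, 21], dtype=float)

# Columns: const, monetary, gender, monetary x gender.
X = np.column_stack([np.ones_like(score), monetary, female, monetary * female])
beta, cov = ols_hc1(X, score)

# Effect of monetary incentives for women: beta[monetary] + beta[monetary x gender].
est, se, t = linear_combination_test(beta, cov, [0, 1, 0, 1])
print(f"effect for women = {est:.3f}, se = {se:.3f}, t = {t:.3f}")
```

In this sketch, the coefficient on the treatment dummy alone is the effect for men, and adding the interaction coefficient gives the effect for women. A p-value would follow from comparing the t statistic with a t distribution with n − k degrees of freedom, which corresponds to the t(248) and t(184) statistics reported in the text.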

6 Discussion

In this section, we analyze subjects’ motivation behind their effort decisions and shed light on possible explanations of our results, especially the impact of gender differences on the incentive effect of non-monetary and monetary prizes on performance.

When considering the entire sample, there were no significant differences in performance between the responses to monetary and non-monetary incentives in our experimental data. This result contradicts both sides of the debate in the literature. While Hammermann and Mohnen (2014a) show that monetary incentives have a higher impact on performance than non-monetary ones, Jeffrey (2009) concludes that non-monetary incentives outperform monetary incentives. These conflicting results may be driven by gender effects, as the proportions of men to women in these studies differ: whereas in Hammermann and Mohnen’s (2014a) study, the proportion of males was 64%, in Jeffrey’s (2009) research, it was only 38%. These results are thus in line with our experimental results showing that men’s performance is higher when competing for monetary prizes and women’s performance is higher when competing for non-monetary prizes. Although gender differences are not discussed by Hammermann and Mohnen (2014a) and Jeffrey (2009), they might be a possible explanation for the mixed evidence in the literature pertaining to the effectiveness and superiority of monetary and non-monetary incentives.

Moreover, it might be argued that our results are triggered by the perceived attractiveness of the non-monetary incentives: Lindt chocolates may be less attractive for some subjects than, for example, the massage vouchers used by Jeffrey (2009), and may therefore have a smaller incentive effect. The perceived attractiveness of the non-monetary incentive may further shape gender differences in performance, as boxes of chocolates might be, in line with stereotypes, less attractive to men than women. Nevertheless, our examination does not support these arguments since only 21% of the subjects in nonmonetary and 14% of the subjects in mix stated that Lindt chocolates are not attractive to them.Footnote 8 Moreover, there were no significant differences in the stated attractiveness of chocolates between men and women in treatment nonmonetary (Wilcoxon rank-sum test: p = 0.967) or in treatment mix (Wilcoxon rank-sum test: p = 0.658). Unexpectedly, in nonmonetary only 18% of men indicated that chocolates are not attractive to them, compared to 27% of women. In treatment mix, 13% of men and 15% of women stated no attractiveness. Therefore, these results strengthened our assumption that gender differences in the effectiveness of non-monetary incentives on performance do not seem to be the result of perceived prize attractiveness.Footnote 9

In addition, subjects were asked after the experiment what impact the prizes had on their effort decisions. Both men and women stated that monetary prizes had a significantly greater impact on performance than non-monetary prizes (Wilcoxon rank-sum test, women: p = 0.039; men: p = 0.000), even though women performed significantly better in treatment nonmonetary than in monetary. Moreover, 85% of subjects in treatment nonmonetary (86% of women, 84% of men) and 95% in monetary (95% of women, 95% of men) stated their preference for cash over non-cash prizes.Footnote 10 Of the 65 participants in treatment mix, 71% (62% of women, 77% of men) stated that they preferred pure monetary prizes to a mix of monetary and non-monetary prizes, whereas 91% (81% of women, 97% of men) preferred mixed prizes to pure non-monetary prizes. These results confirm those of previous research, which suggests that individuals state their preferences according to rational considerations, as money is the more rational choice owing to its option value. Nevertheless, our results for the female sample, as well as other experimental studies, show that the most preferred item is often not the item that leads to the best performance (Jeffrey 2009; Kube et al. 2012; Shaffer and Arkes 2009).

Another possible explanation for the observed gender differences in performance is the feeling of appreciation. According to Ellingsen and Johannesson (2007), appreciation and recognition are important drivers of employee performance. Comparing subjects’ stated feelings of appreciation revealed that men felt much more appreciated by monetary than by non-monetary prizes, and this difference was significant at the 5% level (Wilcoxon rank-sum test: p = 0.041). Moreover, men stated a higher feeling of appreciation in treatment mix than in nonmonetary, although this difference was not significant (Wilcoxon rank-sum test: p = 0.526); further, there were no significant differences between treatments mix and monetary (Wilcoxon rank-sum test: p = 0.142). In contrast, women felt significantly more appreciated by non-monetary than by monetary prizes (Wilcoxon rank-sum test: p = 0.018). Women’s feeling of appreciation was also significantly higher in treatment mix than in monetary (Wilcoxon rank-sum test: p = 0.014). Furthermore, there were no significant differences in women’s stated feeling of appreciation between treatments mix and nonmonetary (Wilcoxon rank-sum test: p = 0.892). To conclude, the gender differences regarding the impact of monetary and non-monetary incentives on performance are reflected in the answers to the question of how appreciated subjects felt by the prizes (for an overview of the distributions of answers, see Fig. 6 in “Appendix”). In addition to the feelings of appreciation, we asked subjects after the experiment whether they were satisfied with their prize,Footnote 11 as research has shown that satisfaction might have an influence on subjects’ performance (Judge et al. 2001). The answers were in line with those for the feelings of appreciation and with the performance pattern of men and women: whereas women stated a higher satisfaction with their prize in treatment nonmonetary than in monetary (Wilcoxon rank-sum test: p = 0.128), men stated that they were more satisfied with monetary than with non-monetary prizes, with a difference significant at the 5% level.

However, our results on the feelings of appreciation and satisfaction can explain only part of the experimental results, as they do not explain the negative impact of monetary incentives on women’s performance. While this may initially seem somewhat puzzling, existing research on competitions and performance pressure can help explain our results, as it shows that women work reluctantly in competitive environments (Niederle and Vesterlund 2007) and falter under performance pressure (Azmat et al. 2016). By implementing a tournament, we created a competitive environment. This competition and the related performance pressure may well have been intensified when monetary prizes were at stake: women stated that they felt significantly more performance pressure in treatment monetary than in control (Wilcoxon rank-sum test: p = 0.097).Footnote 12 This higher perceived pressure in monetary may have led to the observed negative effect on women’s performance. In contrast, women’s stated performance pressure in treatment nonmonetary was lower than in control, although this difference was not statistically significant (Wilcoxon rank-sum test: p = 0.163). Building on the findings of Heyman and Ariely (2004), we argue that non-monetary prizes may have reframed the competitive market into a more social market, thereby weakening the competitiveness of the tournament. Women may therefore have felt more comfortable performing in a tournament with non-monetary incentives, in a more social market associated with lower competition, and thus exerted more effort than when pursuing monetary incentives, which, in turn, may have intensified competition and performance pressure in the tournament. However, this perception of a more social market in treatment nonmonetary might be triggered by the type of non-monetary incentive used in the experiment, that is, Lindt chocolates.Footnote 13

7 Conclusion

In a real-effort experiment, we analyzed the impact of performance-related non-monetary, monetary, and mixed incentives on employees’ performance. Our data reveal three key findings. First, the experimental data suggest that monetary, non-monetary, and mixed incentives all have a significant positive impact on performance. Second, there are overall no significant differences between treatments monetary, nonmonetary, and mix. Third, however, upon dividing the subject pool into men and women, we see a different picture: whereas men’s performance is highest in treatment monetary, women’s performance is higher in treatment nonmonetary than in monetary or mix.

However, there are some limitations to the dataset and experimental setting. The impact of monetary, non-monetary, and mixed incentives in our experimental setting was considered only over a short period. Therefore, future research should analyze the effects of these incentives over a longer period to discover any long-term effects, particularly whether the impact of non-monetary incentives diminishes when the same incentives are used repeatedly. Company data or longer-term field studies would be suitable and useful to this end. In addition, we acknowledge that our sample consisted of more men than women, which might have influenced our statistical analysis. Thus, it would be beneficial to analyze gender differences with a more balanced sample in future research. Furthermore, future research should analyze the effects of incentives that appeal more to male stereotypes to determine whether our results, in particular with regard to gender differences, remain robust.Footnote 14 Additionally, the composition of mixed incentives should be addressed in greater depth, as their impact may vary according to the proportion of monetary to non-monetary incentives.

Despite the constraints of the experimental setting, the study makes several contributions to the literature on monetary and non-monetary incentives. First, the results provide suggestive evidence that gender differences may clarify the mixed results regarding the impact of monetary and non-monetary incentives in the literature. Additionally, to the best of our knowledge, this is the first study to consider gender differences when investigating the impact of monetary and non-monetary incentives on performance in a tournament setting. Finally, we extend the literature and provide evidence concerning the effectiveness of mixed incentives.

The comprehensive results of our experiment indicate that it is beneficial for companies to use non-monetary, monetary, and mixed incentives within competitive environments. Nevertheless, they have to be aware that gender differences may play an important role in the effectiveness of these incentives. Understanding how these incentives enhance employee performance is crucial in implementing them effectively, since their underlying mechanisms may determine the amount of effort exerted by individuals in response to a specific incentive. Employers can express their recognition and appreciation of employees’ performance by means of incentives; however, employers have to be aware that monetary, non-monetary, and mixed incentives affect men and women and their feelings of acknowledgement differently. For instance, our findings suggest that men feel most valued when monetary rewards are given, while women feel much more appreciated by non-monetary incentives, which is reflected in employee performance. However, to generalize the results and recommend an optimum incentive plan for companies—monetary, non-monetary, or mixed—future research should endeavor to obtain a deeper understanding of the motivational properties of non-monetary, monetary, and mixed incentives and their underlying psychological mechanisms, comprehensively and by differentiating between genders.