What determines how much effort and energy we invest to attain our goals? Brehm (Brehm and Self 1989) suggested in his motivational intensity theory that we follow an energy conservation principle. Given that energy is a crucial resource for survival, we aim at investing only the energy that is required for task success and not more. Energy investment should thus be a function of task demand: The higher the difficulty level of a task, the more energy is invested. There are, however, two limits of this proportional relationship between task difficulty and energy investment. If task success is impossible, any energy investment would be a waste of resources. Correspondingly, motivational intensity theory postulates that individuals do not invest energy in impossible tasks. The same should hold for tasks where the costs exceed the benefits. If the required energy exceeds the potential benefits, energy investment would waste resources and, correspondingly, individuals should not invest energy if the importance of task success does not justify the required energy. In sum, motivational intensity theory predicts that task difficulty determines energy investment if success is possible and if the required energy is justified by the importance of success. If success is not possible or if success is not important enough to justify the required energy investment, individuals should disengage and not invest any energy.Footnote 1

Most of the research on motivational intensity theory has drawn on these predictions to examine the determinants of mental effort (see Gendolla and Wright 2005; Gendolla et al. 2012, 2019; Richter et al. 2006, 2016; Wright and Kirby 2001, for recent reviews). Employing cardiovascular measures to assess mental effort, this research has consistently supported the predicted joint impact of task difficulty and success importance (e.g., Brinkmann and Gendolla 2008; Freydefont et al. 2012; Gendolla and Krüsken 2002; Gendolla and Richter 2006; Richter 2016a; Richter et al. 2012; Wright et al. 1990, 1992, 1997). However, research using cardiovascular measures provides little information as to whether task difficulty and success importance have the predicted impact on energy investment, that is, the investment of resources that enable us to perform physical and mental activity. Cardiovascular measures may reflect energy-related processes, but they do not necessarily do so. For instance, Obrist (1981) suggested in his cardiac–somatic uncoupling hypothesis that cardiovascular responses and metabolic demands are dissociated in active coping tasks, that is, tasks where the individual’s performance is instrumental. Sherwood et al. (1986) provided empirical evidence for Obrist’s hypothesis by showing that increases in cardiac output, heart rate, and pre-ejection period during a reaction time task could not be explained by the observed increase in oxygen consumption, an indicator of total energy consumption. Another example is the research by Carroll et al. (2009) on exaggerated cardiac responses. They found that changes in cardiac output and heart rate in mental stress tasks were dissociated from changes in oxygen consumption.

The conclusions that can be drawn from the preceding research on motivational intensity theory are also limited because of its strong focus on sympathetic-driven cardiovascular measures. Sympathetic nervous system activity enhances cardiac output during heavy physical exercise to satisfy the increased oxygen demand of the working muscles. During low-intensity exercise, however, the increase in cardiac activity is driven by changes in parasympathetic activity (e.g., Fagraeus and Linnarsson 1976; Victor et al. 1987). Low-intensity physical exercise certainly requires energy, but this increase in energy demand is not paralleled by an increase in sympathetic activity. In sum, there is evidence that (sympathetic-driven) cardiovascular measures are not perfect indicators of energy investment. Preceding studies that employed cardiovascular measures to test predictions derived from motivational intensity theory therefore did not employ the most sensitive measures to address the theory’s energy-related predictions.

Cardiovascular measures are also ill-suited for testing one specific aspect of motivational intensity theory’s predictions: the hypothesis that individuals invest exactly what is required. Even if cardiovascular measures were perfect indicators of energy investment, they would not enable the comparison of the invested energy with the minimally required energy, given that there is no means to know what the minimally required cardiovascular response would have been. For instance, if one observes that a participant’s heart rate increases by five beats per minute during the performance of a mental arithmetic task, one would not know whether this increase was required or whether an increase of only three beats per minute would have been sufficient. Interestingly, this also applies to measures that are adequate indicators of energy investment, like oxygen consumption. To our knowledge, there is no means to establish how much is required independent of the measure itself.

Richter and Stanek (Richter 2013, 2015; Stanek and Richter 2016) recently acknowledged the lack of empirical research on motivational intensity theory's energy-related predictions and started to address this issue by using exerted hand grip force as an indicator of energy investment. Muscle contraction is caused by the binding of two proteins, myosin and actin, and a resulting pivoting of myosin heads that shortens the muscle (e.g., Sherwood 2010). The force created by a muscle is proportional to the number of myosin–actin complexes that bind and bend. Given that each myosin head consumes one molecule of adenosine triphosphate (ATP)—the body’s basic energy compound—to bind and bend, muscle force, the number of myosin–actin interactions, and energy consumption are directly related. Empirical research demonstrated that the proportional relationship between consumed energy and exerted force is particularly reliable in isometric tasks—tasks where the muscle contracts without shortening (e.g., Boska 1994; Russ et al. 2002). Drawing on this physiological evidence, Richter and Stanek (Richter 2015; Stanek and Richter 2016) tested motivational intensity theory’s prediction that task difficulty determines energy investment using isometric hand grip tasks. They manipulated the difficulty of the hand grip tasks across several levels and assessed exerted force as an indicator of energy investment.Footnote 2 Supporting motivational intensity theory’s hypothesis, exerted force was a direct function of task difficulty if task success was possible and dropped if task success was impossible.

The five experiments presented in this paper aimed at extending Richter and Stanek’s findings by demonstrating that the proportional relationship between task difficulty and energy investment is limited by success importance. As noted, motivational intensity theory postulates that increases in task difficulty only lead to increases in energy investment if the required energy is justified by success importance. If success importance is not high enough, increases in task difficulty should not lead to increased energy investment but to disengagement. To test these predictions, we conducted five experiments. In each of these experiments, we manipulated the difficulty of an isometric hand grip task across two levels (easy vs. difficult). To manipulate success importance, we varied the reward that participants could earn by successfully performing the task (low reward vs. high reward). All experiments used within-person designs, that is, participants performed all four possible combinations of task difficulty and reward value. Drawing on motivational intensity theory, we expected exerted force to increase with increasing task difficulty under conditions of high reward. Under conditions of low reward, we expected low force in both the easy task (because the task does not require more energy investment) and the difficult task (because of disengagement due to the required energy exceeding the justified energy).

An additional aim of the presented studies was to assess the relative explanatory power of the interactive model predicted by motivational intensity theory in comparison with alternative models on the impact of task demand and success importance. For this purpose, we compared the performance of the interactive model with an additive model, a difficulty-main-effect model, and a reward-main-effect model. The additive model constitutes the straightforward integration of the main effects for task demand (e.g., Ach 1935; Hull 1943; Kukla 1972; Zipf 1949) and reward value (Aarts et al. 2008; Fowles et al. 1982; Gray 1982; Pessiglione et al. 2007) on effort that have been reported in the literature. The difficulty-main-effect model reflects an alternative prediction of motivational intensity theory for the case that even the low reward is high enough to justify energy investment in the difficult task. The reward-main-effect model completed the comparisons. Figure 1 shows the four competing models.

Fig. 1 Theoretical predictions for the impact of task difficulty and reward value on exerted force

Study 1

Method

Participants and design

Nineteen women and one man (mean age = 24.05 years, SD = 10.40) participated voluntarily and anonymously for course credit. Each participant performed all four conditions of an isometric hand grip task in a 2 (task difficulty: easy vs. difficult) × 2 (reward value: low vs. high) within-persons design. Sample sizes for all studies were determined in an a priori power analysis (5% type I error probability, 95% power, correlation between within-person measures = .50) using G*Power (Faul et al. 2007). We estimated the effect size using the results of our five preceding hand grip studies (Richter 2015; Stanek and Richter 2016). Given that this effect size (Cohen’s d = 1.45) was considerably higher than the effect sizes found in most psychological studies, we decided to use a more conservative approach and to assume an effect size of d = 0.80. According to the power analysis, a minimal sample size of 15 was required to detect the effect. We decided to round up and to aim for at least 20 participants in each study. The gender imbalance in our samples results from recruiting mainly among psychology students; in Switzerland, most psychology students are female.Footnote 3

Apparatus and measurement

The experiment was programmed using LabVIEW 2009 software (National Instruments, Austin, TX, USA). The software controlled the presentation of instructions, the randomization of the conditions, and assessed the force that participants exerted on a dynamometer (HD-BTA by Vernier Software and Technology, Beaverton, OR, USA). Exerted force (in Newton [N]) was sampled at 10 Hz. The dynamometer was fixed at the computer desk at the side of the participant’s dominant hand and at the level of the chair armrests.

Procedure

The experiment was run in individual sessions each lasting 25 min. Participants read a short description of the study, provided informed consent, and answered a short demographic questionnaire (gender, native language, age, dominant hand). Participants could then familiarize themselves with the dynamometer device. They could squeeze the dynamometer at will during a period of 30 s and the exerted force was displayed in real time to participants on a computer monitor. Participants then learned that they would perform two different hand grip tasks using their dominant hand.

The structure of the trials in the two tasks was identical. Each trial started with a countdown of 6 s followed by a squeezing period of 2 s. After the squeezing period, feedback was presented for 4 s. The force that participants exerted was only measured during the squeezing period. In the first of the two hand grip tasks, participants were asked to exert as precisely as possible one of two force standards. Furthermore, participants were informed that the computer would randomly choose one of the two force standards at the beginning of each trial and that the respective force standard would be presented on top of the screen during the trial. They also read that the difference between the trial’s force standard and their maximally exerted force would be displayed (e.g., “You exerted 20 N more than required”) during the feedback period to enable them to adjust their force in the next trial. Participants were also informed that they should feel free to refrain from squeezing if they felt like it. The two force standards that were presented to participants were 50 N (easy condition) and 150 N (difficult condition)—the lower standard corresponding to a weak handshake and the higher standard to a strong handshake (Knoop et al. 2017). The first task included ten trials of each force standard presented in a randomized order and served as a practice period that allowed participants to learn about the difficulty of exerting 50 N and 150 N.

After performing the first task, participants received the instructions for the second task. They were asked to imagine that the dynamometer represented a clogged Ketchup bottle that they could unclog by squeezing the dynamometer hard enough. To support this cover story, a black and white drawing of a hand holding a reversed Ketchup bottle was presented on the screen during the task. Participants also learned that they would earn a monetary reward in each trial where they were able to free the clogged Ketchup bottle by exerting a force at least equal to a force standard. Furthermore, participants read that the force standards required to unclog the bottle would be the same as in the first task (i.e., 50 N and 150 N) and that the required force would be presented on the screen as in the first task. Participants also learned that the rewards that they could earn would be either CHF 0.01 or CHF 0.25. The respective reward would be randomly chosen by the computer at the beginning of each trial and presented on the screen beside the required force standard. Participants then performed 40 trials of the Ketchup task—10 trials of each difficulty-reward combination presented in a randomized order. If the maximum force that they exerted during the squeezing period equaled or exceeded the required force standard, a black and white drawing showing a Ketchup bottle ejecting Ketchup was displayed during the feedback period. After having performed the Ketchup task, participants were debriefed and received their remuneration.
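The trial logic of the Ketchup task described above can be sketched in a few lines. This is only an illustration under stated assumptions: the experiment itself was programmed in LabVIEW, and the names `build_schedule` and `trial_outcome` are hypothetical. Only the force standards, reward values, and the 10 trials per difficulty-reward combination are taken from the Study 1 description.

```python
import random
from collections import Counter

# Study 1 values as described in the text.
FORCE_STANDARDS_N = {"easy": 50.0, "difficult": 150.0}
REWARDS_CHF = {"low": 0.01, "high": 0.25}
TRIALS_PER_CELL = 10


def build_schedule(seed=None):
    """Return 40 (difficulty, reward) trials in a randomized order."""
    rng = random.Random(seed)
    cells = [(d, r) for d in FORCE_STANDARDS_N for r in REWARDS_CHF]
    schedule = cells * TRIALS_PER_CELL  # 4 cells x 10 trials
    rng.shuffle(schedule)
    return schedule


def trial_outcome(peak_force_n, difficulty, reward):
    """A trial succeeds if peak force equals or exceeds the force standard."""
    success = peak_force_n >= FORCE_STANDARDS_N[difficulty]
    earned_chf = REWARDS_CHF[reward] if success else 0.0
    return success, earned_chf


if __name__ == "__main__":
    schedule = build_schedule(seed=1)
    print(Counter(schedule))  # each of the four cells appears 10 times
    print(trial_outcome(155.0, "difficult", "high"))
```

The success rule uses "greater than or equal to" because the text states that the exerted maximum force had to equal or exceed the required force standard.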

In all studies, we refrained from collecting manipulation checks for our task difficulty and reward value manipulations given that preceding empirical work has already demonstrated the potency of similar manipulations. Studies on the effects of monetary rewards revealed that differences as small as $0.01 can result in differences in behavior and subjective experience (e.g., Bijleveld et al. 2012; Goldstein et al. 2006; Wright et al. 2002). Moreover, hand grip studies provided evidence for the close association of exerted grip force and ratings of perceived demand and exertion, and also showed that differences as small as 20 N can lead to reliable differences in subjective experience (e.g., Dahalan and Fernandez 1993; Hartmann et al. 2013; McGorry et al. 2010; Wright and Penacerrada 2002).

Data analysis

We determined the exerted peak force of each trial of the Ketchup task (i.e., the maximum value of the 20 data points collected during the 2-s squeezing period) and calculated the arithmetic mean of these scores for each condition. The mean peak force scores were then analyzed with a planned contrast (Rosenthal and Rosnow 1985; Rosenthal et al. 2000) that modeled our predictions about the impact of task difficulty and success importance on energy investment. Contrast weights were −1 in the easy-low-reward condition, −1 in the easy-high-reward condition, −1 in the difficult-low-reward condition, and +3 in the difficult-high-reward condition. We also computed force–time integrals (FTIs) as an indicator of total energy investment (Filion et al. 1970) by summing up the 20 data points of each trial and calculating arithmetic means for each condition. The mean FTI scores were analyzed using the same planned contrast. It is noteworthy that exerted peak force constituted our primary dependent variable given that it was instrumental for task success (i.e., we compared exerted maximum force to the force standard to determine success in a given trial).
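The scoring and contrast analysis described in this section can be illustrated with a short sketch. It assumes that the contrast was tested against the pooled within-person error term (the participant × condition residual, with (n − 1)(k − 1) degrees of freedom), which is consistent with the degrees of freedom reported in the Results but is not stated explicitly. All function names are hypothetical.

```python
import math


def peak_force(samples):
    """Peak force of one trial: maximum of the sampled force values."""
    return max(samples)


def force_time_integral(samples):
    """FTI of one trial, computed as described: sum of the sampled values."""
    return sum(samples)


def planned_contrast(data, weights):
    """Within-person planned contrast.

    data: one row per participant, one condition mean per column.
    weights: contrast weights, e.g. (-1, -1, -1, 3).
    Returns (t, df, mse), using the participant x condition residual
    as error term (an assumption, see the lead-in).
    """
    n, k = len(data), len(weights)
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]
    ss_error = sum(
        (data[i][j] - cond_means[j] - subj_means[i] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    df = (n - 1) * (k - 1)
    mse = ss_error / df
    estimate = sum(w * m for w, m in zip(weights, cond_means))
    se = math.sqrt(mse * sum(w * w for w in weights) / n)
    return estimate / se, df, mse
```

With the weights (−1, −1, −1, +3), the contrast estimate is three times the difficult-high-reward mean minus the sum of the other three cell means.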

We additionally conducted Bayesian t-tests (Rouder et al. 2009, 2012) to test motivational intensity theory’s predictions that individuals invest only the required force and that they disengage if the required energy is not justified. The Bayesian t-tests compared mean peak force in the easy-low-reward and the easy-high-reward conditions with the easy force standard of 50 N as well as mean peak force in the difficult-high-reward condition with the difficult force standard of 150 N. To test for disengagement, a Bayesian t-test compared mean peak force in the difficult-low-reward condition with 0 N. Bayesian t-tests compare the likelihood of the data under a model that predicts a difference with the likelihood of the data under a model that does not predict a difference. In contrast to p-value based hypothesis testing, these tests can quantify evidence for the absence of a difference (Johansson 2011) and thus for our hypotheses about no difference between exerted force and force standard.
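To make the logic of such a test concrete, the following sketch computes a one-sample JZS Bayes factor in the form given by Rouder et al. (2009), using plain numerical integration. It is a minimal illustration and not the authors' analysis code: the prior scale (here the unit-scale Cauchy prior on effect size) and the software actually used are not stated in the text, and the function name and integration settings are assumptions.

```python
import math


def jzs_bf01(t, n, steps=4000, x_min=-20.0, x_max=20.0):
    """Bayes factor for the null over the alternative (one-sample JZS test).

    t: one-sample t statistic, n: number of participants.
    The integral over g is evaluated on a log scale (g = exp(x))
    with the trapezoidal rule.
    """
    nu = n - 1
    null_like = (1 + t * t / nu) ** (-(nu + 1) / 2)

    def integrand(x):
        g = math.exp(x)
        a = 1 + n * g
        like = a ** -0.5 * (1 + t * t / (a * nu)) ** (-(nu + 1) / 2)
        prior = (2 * math.pi) ** -0.5 * g ** -1.5 * math.exp(-1 / (2 * g))
        return like * prior * g  # extra g is the Jacobian of g = exp(x)

    h = (x_max - x_min) / steps
    total = 0.5 * (integrand(x_min) + integrand(x_max))
    for i in range(1, steps):
        total += integrand(x_min + i * h)
    return null_like / (total * h)


if __name__ == "__main__":
    # t statistic for a cell mean M with standard error SE against a standard:
    # t = (M - standard) / SE
    t_value = (159.87 - 150.0) / 7.63  # Study 1, difficult-high-reward cell
    print(jzs_bf01(t_value, n=20))
```

Values above 1 favor the model without a difference; values below 1 favor the model with a difference. Because the prior scale is an assumption, the output need not match the Bayes factors reported in the Results.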

Finally, we conducted model comparisons to compare motivational intensity theory’s predictions with the alternative models. Following the approach suggested by Glover and Dixon (2004), Wagenmakers (2007), and Masson (2011) (see also Richter 2016b, for a recent discussion), we computed Bayes Factors (BF) that compared the likelihood of the peak force data under one model with the likelihood of the peak force data under a second model. In the first comparison, we compared the predicted model with the model predicting an additive effect of task difficulty and reward value. In the other comparisons, we compared the predicted model and the additive model with the difficulty-main-effect and reward-main-effect models. Figure 1 displays the four alternative models. The sizes of the Bayes factors were interpreted according to Raftery (1995).
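The approach of Wagenmakers (2007) and Masson (2011) approximates a Bayes factor from the difference in the Bayesian information criterion (BIC) of two models. The sketch below shows this approximation for models fitted by least squares. The number of observations and the parameter counts that entered the reported comparisons are not given in this section, so the arguments are placeholders, and the function names are hypothetical.

```python
import math


def bic(sse, n_obs, n_params):
    """BIC of a least-squares model from its residual sum of squares."""
    return n_obs * math.log(sse / n_obs) + n_params * math.log(n_obs)


def bayes_factor(sse_a, k_a, sse_b, k_b, n_obs):
    """Approximate Bayes factor for model A over model B.

    BF_AB = exp((BIC_B - BIC_A) / 2); values above 1 favor model A.
    """
    return math.exp((bic(sse_b, n_obs, k_b) - bic(sse_a, n_obs, k_a)) / 2)


if __name__ == "__main__":
    # Two models with identical fit: the one with fewer parameters is favored.
    print(bayes_factor(sse_a=100.0, k_a=1, sse_b=100.0, k_b=2, n_obs=80))
```

When both models fit equally well, the approximation reduces to a pure complexity penalty, so the simpler model is favored by a factor that grows with the number of observations.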

Results

Practice trials

Mean peak force was significantly lower in the easy practice trials (M = 94.78, SE = 9.32) than in the difficult practice trials (M = 138.10, SE = 5.23), t(19) = 7.16, p < .001, suggesting that participants successfully learned to distinguish the task difficulty levels.

Task trials

The planned contrast was significant, t(57) = 5.73, p < .001, MSE = 403.42, d = 1.52. Mean peak force was higher in the difficult-high-reward cell (M = 159.87, SE = 7.63) than in all other conditions (M = 118.33 and SE = 7.25 in the easy-low-reward cell, M = 121.10 and SE = 7.69 in the easy-high-reward cell, and M = 151.04 and SE = 7.68 in the difficult-low-reward cell). Figure 2 displays the pattern of exerted peak force and also shows, as dashed lines, the force standards that participants had to attain to earn the monetary reward. The contrast was also significant for FTIs, t(57) = 6.17, p < .001, MSE = 1.11 × 10^5, d = 1.63, indicating that participants invested overall more energy in the difficult-high-reward trials than in all other trials. Table 1 displays FTI cell means and standard errors.
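As a worked check, the peak force contrast can be recomputed from the reported cell means and MSE. The sketch assumes that the contrast was tested against the pooled error term and that the effect size was obtained as d = 2t/√df; neither is stated in the text, but both reproduce the reported values. The function name is hypothetical.

```python
import math


def contrast_from_summary(means, weights, mse, n):
    """Contrast t, df, and d from cell means, MSE, and sample size."""
    estimate = sum(w * m for w, m in zip(weights, means))
    se = math.sqrt(mse * sum(w * w for w in weights) / n)
    t = estimate / se
    df = (n - 1) * (len(means) - 1)
    d = 2 * t / math.sqrt(df)  # assumed conversion, see lead-in
    return t, df, d


# Study 1 peak force: easy-low, easy-high, difficult-low, difficult-high
t, df, d = contrast_from_summary(
    means=[118.33, 121.10, 151.04, 159.87],
    weights=[-1, -1, -1, 3],
    mse=403.42,
    n=20,
)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")  # t(57) = 5.73, d = 1.52
```

The contrast estimate is 3 × 159.87 − (118.33 + 121.10 + 151.04) = 89.14 N, and its standard error is √(403.42 × 12 / 20) ≈ 15.56 N.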

Fig. 2 Means of exerted force (in Newton) during the Ketchup task in Study 1. Error bars represent standard errors. The dashed line indicates the force standards of the difficulty conditions

Table 1 Cell means and standard errors of force–time-integrals

The Bayesian t-test comparing exerted peak force in the difficult-high-reward condition against 150 N provided weak evidence in favor of motivational intensity theory’s prediction that individuals invest only the required energy and not more, BF = 2.08. The Bayesian t-tests comparing exerted peak force in the easy-low-reward and the easy-high-reward conditions against 50 N provided, however, very strong evidence against this hypothesis, BFs > 70.99 × 10^4. The comparison of the peak force exerted in the difficult-low-reward condition with 0 N provided very strong evidence against the disengagement predicted by motivational intensity theory, BF = 13.44 × 10^10. The comparison of the predicted model with the additive model resulted in a BF of 1.61 × 10^−6, providing very strong support in favor of an additive model and against the predicted joint impact of task difficulty and reward value. However, the comparison of the additive model with the difficulty-main-effect model slightly favored the difficulty model, BF = 0.30. The comparison of the additive model with the reward-main-effect model strongly favored the additive model, BF = 6.18 × 10^8.

Discussion

Study 1 found some support for the predicted impact of task difficulty and success importance on exerted force. Participants exerted a higher force in the difficult trials where they could earn a high reward than in the other three conditions. Moreover, there was evidence that individuals in the difficult-high-reward condition avoided wasting energy and invested only the required energy. However, some of the findings conflicted with motivational intensity theory’s predictions. In the easy conditions, participants invested considerably more energy to squeeze the dynamometer than required. There was also no evidence for the predicted disengagement in the difficult-low-reward condition. Participants did not disengage but continued to invest considerable energy to exert a high force. These conflicting findings are consistent with the findings of Richter (2015) and Stanek and Richter (2016), who also found that participants invested more than required and that they did not completely disengage.

Even though the contrast modeling the predicted impact of task difficulty and reward value was significant, the predicted model performed poorly compared to a model that postulates an additive effect of task difficulty and reward value. The additive model also performed better than a reward-main-effect model but was inferior to the difficulty-main-effect model that could be predicted drawing on motivational intensity theory. Given the numerous publications on motivational intensity theory that reported evidence for disengagement under the combination of high difficulty with low success importance (e.g., Brinkmann and Gendolla 2008; Freydefont et al. 2012; Gendolla and Krüsken 2002; Gendolla and Richter 2006; Richter et al. 2012; Wright et al. 1990, 1992, 1997), the poor performance of the predicted model was unexpected. We therefore decided to conduct additional studies that aimed at finding evidence for the predicted disengagement. In total, we conducted four additional studies varying force standards, reward values, and the position of the practice trials. All four studies differed only slightly from the first study’s procedure and had 2 (difficulty: easy vs. difficult) × 2 (reward: low vs. high) within-persons designs. The main changes aimed to increase the likelihood of finding evidence for disengagement by increasing the clarity of the manipulations of difficulty and reward level, by increasing the difficulty of the high-difficulty condition (180 N in Study 4), and by decreasing the reward that participants could earn per trial in the low-reward condition from CHF 0.01 in Studies 1 and 2 to CHF 0.005 in Study 3 and CHF 0.0005 in Study 5.

Study 2

Method

Participants and design

Eighteen women and eight men (mean age = 24.00 years, SD = 8.99) participated voluntarily and anonymously and received course credit for their participation.

Procedure

The procedure was similar to Study 1 with the following exceptions. First, the easy force standard was changed to 100 N. Second, the task trials were separated into blocks as a function of their force standard. In the first task, the practice task, all participants first performed 10 trials with a force standard of 100 N and then 10 trials with a standard of 150 N. In the Ketchup task, the order of the two difficulty blocks was randomized between participants (14 performed the easy trials first). Within each difficulty block, 10 trials of each reward level were presented in random order. By presenting the difficulty levels in separate blocks, we expected to increase the likelihood that participants had accurate expectations regarding the difficulty level of the current trial. Due to the fully randomized presentation of the conditions in Study 1, participants might not always have had the correct expectation regarding the difficulty level of the current trial.

Results

Practice

Exerted force in the practice trials differed as a function of task difficulty, t(25) = 4.19, p < .001. Participants exerted a higher force in the difficult trials (M = 163.45, SE = 9.43) than in the easy trials (M = 142.05, SE = 12.61), demonstrating that participants could differentiate the difficulty conditions.

Task

The contrast was significant, t(75) = 5.52, p < .001, MSE = 400.51, d = 1.27. Exerted peak force was higher in the difficult-high-reward condition (M = 177.58, SE = 7.52) than in all other conditions (M = 140.08 and SE = 5.80 in the easy-low-reward condition, M = 149.13 and SE = 6.98 in the easy-high-reward condition, and M = 168.53 and SE = 8.49 in the difficult-low-reward condition). Figure 3 displays exerted peak force values as well as the force standards that participants had to attain to earn the reward. The FTI scores replicated the peak force pattern, t(75) = 6.27, p < .001, MSE = 1.26 × 10^5, d = 1.45. FTI cell means and standard errors can be found in Table 1. The Bayesian t-tests found strong to very strong evidence against the hypothesis that exerted force equals required force, BFs > 30.22, and very strong evidence against disengagement in the difficult-low-reward condition, BF = 6.15 × 10^13. The comparison of the predicted model with the additive model resulted in BF = 5.25 × 10^−6, providing very strong evidence in favor of the additive model. The additive model was also favored when compared to the difficulty-main-effect model, BF = 1.64, or the reward-main-effect model, BF = 1.11 × 10^8.

Fig. 3 Means of exerted force (in Newton) during the Ketchup task in Study 2. Error bars represent standard errors. The dashed line indicates the force standards of the difficulty conditions

Study 3

Method

Participants and design

Eighteen women and six men (mean age = 21.21 years, SD = 2.19) participated in Study 3 for course credit.

Procedure

Study 3 differed from Study 2 in the following aspects. First, the force standards were set to 50 N (easy condition) and 100 N (difficult condition). Second, the ten trials of each condition were presented together as a block in the Ketchup task. The order of the four blocks (i.e., conditions) was randomized across participants. By grouping together all trials with the same difficulty-reward combination, we aimed to further increase the likelihood that participants had a correct representation of the difficulty of the upcoming trial as well as of the reward that they could earn. Third, the monetary reward was no longer earned on a trial basis. In each block of the Ketchup task, participants could either earn CHF 0.05 (low reward) or CHF 0.50 (high reward) by successfully performing at least 8 of the 10 trials. In the low-reward condition, this reduced the reward that participants could earn per trial from CHF 0.01 in the first two studies to CHF 0.005 in this study.

Results

Practice

Exerted peak force differed between the two difficulty levels, t(23) = 16.30, p < .001. Participants exerted a higher force in the difficult conditions (M = 107.35, SE = 3.61) than in the easy conditions (M = 60.82, SE = 5.13) suggesting that they learned to differentiate between the easy and difficult trials during the practice period.

Task

The contrast attained statistical significance, t(69) = 3.35, p < .001, MSE = 741.88, d = 0.81. Exerted peak force was higher in the difficult-high-reward condition (M = 139.57, SE = 6.04) than in all other conditions (M = 103.18 and SE = 7.34 in the easy-low-reward condition, M = 118.60 and SE = 9.51 in the easy-high-reward condition, and M = 132.33 and SE = 8.66 in the difficult-low-reward condition). Figure 4 displays condition means and standard errors as well as force standards. The contrast was also significant for FTI scores, t(69) = 5.02, p < .001, MSE = 1.13 × 10^5, d = 1.21. FTI cell means and standard errors can be found in Table 1. The Bayesian t-tests found very strong evidence against the hypothesis that exerted force equals required force, BFs > 16,270.53, and very strong evidence against disengagement in the difficult-low-reward condition, BF = 4.39 × 10^10. A BF of 0.002 provided very strong evidence in favor of the additive model and against the predicted model. The additive model was also favored in comparison with the reward-main-effect model, BF = 1.20 × 10^3, but not in comparison with the difficulty-main-effect model, BF = 0.95.

Fig. 4 Means of exerted force (in Newton) during the Ketchup task in Study 3. Error bars represent standard errors. The dashed line indicates the force standards of the difficulty conditions

Study 4

Method

Participants and design

Thirteen women and seven men (mean age = 21.65 years, SD = 3.83) participated in Study 4. Fifteen participants were paid CHF 10 for their participation; the others received course credit.

Procedure

In contrast to the preceding studies, Study 4 did not include a separate practice task. Participants performed only the Ketchup task. The task included five blocks, each one with 20 trials (five trials per condition presented in a randomized order). The first four blocks served as practice trials that allowed participants to acquire information about task difficulty (that is, participants had 80 trials to learn about the difficulty of exerting the requested forces compared to the 20 practice trials that they had in the first three studies). Exerted force during the last block constituted the critical dependent variable that was used for the statistical analyses. In all trials, the value of the exerted peak force was presented during the feedback period to enable participants to compare exerted force with required force. The force standards were 130 N (easy condition) and 180 N (difficult condition). The rewards for successful trials were CHF 0.01 (low reward) and CHF 0.10 (high reward).

Results

The contrast attained statistical significance, t(57) = 1.79, p = .04, MSE = 1146.65, d = 0.47. Exerted peak force was higher in the difficult-high-reward condition (M = 181.96, SE = 16.78) than in the easy-low-reward condition (M = 160.04 and SE = 15.69), the easy-high-reward condition (M = 176.00 and SE = 9.15), and the difficult-low-reward condition (M = 163.01 and SE = 19.19). Figure 5 displays these results and also indicates as dashed lines the force standards that participants had to attain to earn the monetary reward. The contrast was not significant for FTI scores, t(57) = 1.45, p = .08, MSE = 3.68 × 10^5, d = 0.38. Table 1 shows FTI cell means and standard errors. The Bayesian t-tests found very strong evidence against the hypothesis that exerted force equals required force in the easy-high-reward condition, BF = 362.14. The test in the easy-low-reward condition was inconclusive, BF = 1.06, and the test in the difficult-high-reward condition found positive evidence for no difference between exerted and required force, BF = 4.28. There was strong evidence against disengagement in the difficult-low-reward condition, BF = 2.13 × 10^5. The comparison of the predicted model with the additive model provided evidence in favor of the additive model, BF = 0.28. The additive model performed better than the difficulty-main-effect model, BF = 1.87, but worse than the reward-main-effect model, BF = 0.15.

Fig. 5 Means of exerted force (in newtons) during the Ketchup task in Study 4. Error bars represent standard errors. The dashed lines indicate the force standards of the difficulty conditions

Study 5

Method

Participants and design

Nineteen women and five men (mean age = 22.33 years, SD = 8.43) participated in Study 5 for course credit.

Procedure

The procedure of Study 5 was similar to the procedure of Study 4 with the following exceptions. First, the current force standard and reward value were only displayed during the countdown and the squeezing period. Second, we did not provide information about exerted peak force during the feedback period. Third, force standards were set to 80 N (easy condition) and 130 N (difficult condition). Fourth, the rewards offered were 1 point or 100 points. Participants were informed in the task instructions that the earned points would be converted to money at the end of the experiment and that 100 points would be worth CHF 0.05. This resulted in the lowest reward per trial value of all five studies (CHF 0.0005 per trial in the low-reward condition).

Results

The contrast was significant, t(69) = 5.24, p < .001, MSE = 452.74, d = 1.26. Exerted peak force was higher in the difficult-high-reward condition (M = 157.56, SE = 8.09) than in all other conditions (M = 124.29 and SE = 5.24 in the easy-low-reward condition, M = 133.87 and SE = 6.30 in the easy-high-reward condition, and M = 135.69 and SE = 7.95 in the difficult-low-reward condition). These cell means and standard errors are also displayed in Fig. 6 together with the associated force standards. The contrast was significant for FTI scores, t(69) = 5.78, p < .001, MSE = 93,982.46, d = 1.39. FTI cell means and standard errors can be found in Table 1. The Bayesian t-tests found positive to very strong evidence against the hypothesis that exerted force equals required force, BFs > 16.29, and there was strong evidence against disengagement, BF = 4.06 × 10^11. The comparison of the predicted model with the additive model resulted in BF = 0.37, providing positive evidence in favor of the additive model. The comparisons with the difficulty-main-effect model, BF = 52.68, and the reward-main-effect model, BF = 201.93, also favored the additive model.
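As a plausibility check, the reported effect sizes of Studies 4 and 5 can be recomputed from the contrast t values. The sketch below assumes the common conversion d = 2t/√df. The text does not state which formula was used, so this is an assumption, although the reported values agree with it to two decimals. The function name `d_from_t` is hypothetical.

```python
import math


def d_from_t(t, df):
    """Convert a t statistic to Cohen's d via d = 2t / sqrt(df).

    Assumption: this is only one of several t-to-d conversions.
    """
    return 2.0 * t / math.sqrt(df)


# (label, t, df, d as reported in the text)
reported = [
    ("Study 4, peak force", 1.79, 57, 0.47),
    ("Study 4, FTI", 1.45, 57, 0.38),
    ("Study 5, peak force", 5.24, 69, 1.26),
    ("Study 5, FTI", 5.78, 69, 1.39),
]

recomputed = {label: d_from_t(t, df) for label, t, df, _ in reported}
for label, t, df, d_reported in reported:
    print(f"{label}: d = {recomputed[label]:.2f} (reported {d_reported:.2f})")
```

The 69 degrees of freedom in Study 5 also match (24 − 1) × (4 − 1) for 24 participants and four within-person conditions.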

Fig. 6 Means of exerted force (in newtons) during the Ketchup task in Study 5. Error bars represent standard errors. The dashed lines indicate the force standards of the difficulty conditions

Model comparisons aggregated across studies

To provide an overall evaluation of the evidence for the four discussed models (a summary of the individual comparisons can be found in Table 2), we aggregated the data across studies as suggested by Masson (2011) and computed summarizing Bayes factors. The predicted model performed better than the reward-main-effect model, BF = 1.00 × 10^8, but worse than the additive model, BF = 4.62 × 10^-9, and the difficulty-main-effect model, BF = 5.02 × 10^-5. The additive model outperformed all other models (BF = 2.16 × 10^8 for the comparison with the predicted model, BF = 1.09 × 10^4 for the comparison with the difficulty-main-effect model, and BF = 2.17 × 10^16 for the comparison with the reward-main-effect model).
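The aggregated Bayes factors can be checked for internal consistency. Bayes factors computed on the same data are reciprocal when the comparison is reversed and multiply along a chain of models. The sketch below applies these two properties to the values reported above. It does not reproduce the aggregation procedure itself, and the variable names are hypothetical.

```python
import math

# Aggregated Bayes factors for the predicted model, as reported in the text.
bf_pred_vs_additive = 4.62e-9
bf_pred_vs_difficulty = 5.02e-5
bf_pred_vs_reward = 1.00e8

# Reversing a comparison inverts the Bayes factor.
bf_additive_vs_pred = 1.0 / bf_pred_vs_additive

# Chaining: BF(additive vs X) = BF(additive vs predicted) * BF(predicted vs X).
bf_additive_vs_difficulty = bf_additive_vs_pred * bf_pred_vs_difficulty
bf_additive_vs_reward = bf_additive_vs_pred * bf_pred_vs_reward

print(f"additive vs predicted:  {bf_additive_vs_pred:.3g}")        # reported 2.16e8
print(f"additive vs difficulty: {bf_additive_vs_difficulty:.3g}")  # reported 1.09e4
print(f"additive vs reward:     {bf_additive_vs_reward:.3g}")      # reported 2.17e16
```

The derived values agree with the reported ones up to rounding of the inputs, which suggests that the six aggregated comparisons are mutually coherent.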

Table 2 Bayes factors for the comparisons of the predicted model and the additive model with the difficulty-main-effect and reward-main-effect models

Discussion

In all five studies, the planned contrast that modeled the predicted impact of task difficulty and reward value was significant, demonstrating that participants exerted the highest force when the task was difficult and when it allowed them to earn a high reward. The contrast analysis thus supported motivational intensity theory's prediction that the difficulty-energy-investment relationship is limited by success importance. However, the comparison of the predicted model with an alternative additive model provided different results. In all five studies, the data were more likely under a model that predicts an additive effect of task difficulty and reward value than under the predicted model. Our results thus provided some evidence for the predicted impact of task difficulty and reward value but also showed that the predicted model does not offer the best explanation of the data.

It is noteworthy that the alternative prediction that one could formulate drawing on motivational intensity theory (the difficulty-main-effect model) performed better than the predicted model but was, overall, also less favored than the additive model. In Study 1, the data slightly favored the difficulty-main-effect model, but three of the other studies provided positive to very strong evidence in favor of the additive model and against the difficulty-main-effect model. Moreover, when the data from all five studies were aggregated, the additive model provided a better explanation of the data than the difficulty-main-effect model. The hypothesis that reward value was not low enough (or that the difficult task was not difficult enough) to result in disengagement thus does not offer a good explanation for the observed results. If success is viewed as possible and worth the required energy, motivational intensity theory postulates that energy investment is a direct function of task difficulty and that reward value does not play a role. Consequently, motivational intensity theory does not offer an explanation for the observed reward effect. The only conditions under which motivational intensity theory would predict a main effect of reward value are tasks where the demand is unclear or unfixed (see Brehm and Self 1989; Harper et al. 2018; Richter 2013; Richter and Gendolla 2009; Wright 2008, for discussions of tasks with unclear and unfixed difficulty). However, under these conditions, motivational intensity theory would predict no difficulty effect.

The presented findings thus challenge motivational intensity theory and preceding empirical findings based on the model. Despite the positive results of the contrast analyses, the data favored a model that predicts an additive effect of task difficulty and reward value. As noted, such an additive model cannot be derived from motivational intensity theory. To our knowledge, there is also no publication that introduced a theoretical model that predicts an additive effect of task demand and success importance and that can additionally explain the other predictions of motivational intensity theory that have been empirically supported (the direct impact of reward under conditions of unclear task difficulty, for instance). Given that there are approaches that predict either a main effect of difficulty (Ach 1935; Hull 1943; Kukla 1972; Zipf 1949) or a main effect of success importance (Aarts et al. 2008; Fowles et al. 1982; Gray 1982; Pessiglione et al. 2007), some researchers might implicitly assume that these effects add up, but this obviously does not constitute an explicit theoretical model. There are models (e.g., Shenhav et al. 2013; Westbrook and Braver 2015) that would allow one to predict effects similar to those of an additive model, but these models do not specifically address energy investment and lack the explanatory power of motivational intensity theory. For instance, Shenhav et al.’s (2013) expected value of control model suggests that both increases in task difficulty and reward value can lead to an increase in the amount of exerted cognitive control by increasing payoffs. However, the model cannot account for some of the empirical observations that motivational intensity theory can account for. For instance, it cannot explain why increases in task difficulty sometimes lead to disengagement or a reduction in effort investment (e.g., Freydefont et al. 2012; Richter et al. 2012) or why success importance has no impact if task difficulty is low (e.g., Mazeres et al. 2019; Richter 2016a).

Our findings also question the primacy of energy conservation predicted by motivational intensity theory. In many conditions, participants exerted more force than required. Instead of aiming to avoid wasting energy, participants invested more energy than necessary. One might wonder whether this could be explained by participants being unable to exert the required force with a high precision (either because of a lack of information about what was required or because of a poor inner sense of how much force they were exerting). If this was true, participants would not have consistently invested more than required. A lack of precision in exerting the required force should have resulted in participants investing too much force in some trials and not enough force in other trials. They should not have consistently invested more than required, as indicated by our data. Moreover, preceding work (Richter 2015) suggested that a few Ketchup task practice trials are sufficient to learn to exert the required force with a high level of precision.

Participants did not only exert a higher force than required, they also did not disengage when high task difficulty was combined with low reward. Even if the total amount of energy required to squeeze the dynamometer is relatively low compared to other types of physical activity, it is unlikely that the low rewards offered—CHF 0.0005 for a successful trial in Study 5—constituted a benefit that was larger than the involved costs. Drawing on motivational intensity theory, one would expect participants to disengage in this situation and to refrain from squeezing the dynamometer. This was clearly not the case. In all five studies, participants invested a considerable force in the difficult-low-reward conditions. It is possible that participants did not consider the task instrumental to earn the offered reward in these conditions but adopted a different goal, like demonstrating that they are engaged participants. In this case, the difficulty and reward levels would not have been determined by the hand grip task and the offered reward but by the difficulty of attaining the alternative goal and the importance of attaining it. It is important to note that this post-hoc explanation has limited value given that it can always be applied to save motivational intensity theory’s energy conservation prediction. Any empirical observation that suggests that participants did not disengage or invested more than required could be explained by the fact that they adopted a different goal for which more effort was required or for which the required effort was justified by the importance of attaining the goal.

It is not surprising that the preceding research on motivational intensity theory did not find evidence that questioned the primacy of the energy conservation principle. As noted, most of the research on the theory used cardiovascular measures. Cardiovascular measures do not enable a comparison of invested energy with required energy. Even if cardiovascular measures constituted good indicators of energy investment, one would not have a standard against which to compare the invested energy. Imagine that one observes a heart rate increase of 10 beats per minute during a task. Does this imply that the increase of 10 beats per minute was required for success? Would the participant have failed with an increase of 5 beats per minute, or could she have succeeded with it? Given that it is impossible to know which increase in cardiovascular activity is minimally required to succeed in a task, it is impossible to test motivational intensity theory’s prediction that individuals invest only the required energy using cardiovascular measures.

It is, however, obvious that our results differ from the studies employing cardiovascular measures to examine the interaction of task difficulty and success importance (e.g., Brinkmann and Gendolla 2008; Freydefont et al. 2012; Gendolla and Krüsken 2002; Gendolla and Richter 2006; Richter et al. 2012; Wright et al. 1990, 1992, 1997). These studies consistently found the sawtooth pattern predicted by motivational intensity theory: If success importance was low, cardiovascular response increased from easy to moderate task difficulty and was low under high task difficulty. Two observations might help to reconcile our findings with the results of these cardiovascular studies. First, the main dependent variables (pre-ejection period and systolic blood pressure) had a low time resolution compared to the measure of exerted force used in our studies. The impedance signal required for the scoring of pre-ejection period was averaged across intervals of 1 min to reduce noise and, consequently, there was only one pre-ejection period value per minute. Depending on the employed blood pressure monitor, one systolic blood pressure value was collected every 15 s, every minute, or every 2 min. The cardiovascular measures assessed in the preceding studies on motivational intensity theory thus did not reflect the effect of single events but the combined effect of all processes and events during intervals of 15 s, 1 min, or 2 min. The observed reduced cardiovascular responses might therefore have been the result of participants oscillating between engagement and disengagement. If participants invest energy and effort in some task trials but disengage in others, one would obtain the decreased responses that have been observed.
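The averaging argument can be made concrete with a toy calculation. All numbers in the sketch below are hypothetical and chosen only for illustration: 12 trials per measurement window, a response of 10 arbitrary units on engaged trials, and no response on disengaged trials. The names `ENGAGED`, `DISENGAGED`, and `window_average` are likewise hypothetical.

```python
from statistics import mean

# Hypothetical trial-level responses in arbitrary units (not from the studies).
ENGAGED, DISENGAGED = 10.0, 0.0


def window_average(trial_engaged):
    """Average the trial-level responses within one measurement window."""
    return mean(ENGAGED if engaged else DISENGAGED for engaged in trial_engaged)


# Participant who is engaged on every trial of the window.
always_engaged = [True] * 12
# Participant who alternates between engagement and disengagement.
alternating = [i % 2 == 0 for i in range(12)]

avg_always = window_average(always_engaged)
avg_alternating = window_average(alternating)
print(avg_always, avg_alternating)  # 10.0 5.0
```

In this toy case, the window average of the alternating participant is half that of the fully engaged participant, although the response on every engaged trial is identical. A measure with one value per window cannot distinguish this pattern from a uniformly reduced response.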

Second, in many studies, the cardiovascular response was reduced in the difficult condition under low success importance but there was nevertheless an increase compared to baseline (e.g., Brinkmann and Gendolla 2008; Freydefont et al. 2012; Gendolla and Krüsken 2002; Gendolla and Richter 2006; Wright et al. 1992, 1997). If one follows the preceding research on motivational intensity theory and interprets increases from baseline to task performance as energy or effort mobilization, the reduced cardiovascular response in the high-difficulty-low-success-importance conditions would also have to be interpreted as effort investment. It would not indicate disengagement. Many of the preceding studies on the joint impact of task demand and success importance thus also failed to provide evidence for complete disengagement (see Stanek and Richter 2016, for a meta-analysis of the disengagement studies).

In sum, the presented findings challenge motivational intensity theory. We found some statistical support for the joint impact of task difficulty and success importance predicted by motivational intensity theory. However, we also observed that an additive model provides a better explanation of the data, that participants invested more energy than required, and that participants did not disengage. Our findings do not question that the motivation to avoid wasting energy is an important motivation in goal pursuit—participants adapted their force to the demand of the hand grip task—but they suggest that energy conservation is not the sole motivation underlying energy investment in instrumental tasks. Future research and theorizing will have to consider additional motivations to build models of energy investment and effort that explain the additive effect that we have found as well as the preceding empirical findings on motivational intensity theory.