The influence of associative reward learning on motor inhibition

Stimuli that predict a rewarding outcome can make it difficult to inhibit unfavourable behaviour. Research suggests that this is also the case for stimuli with a history of reward, extending these effects on action control to situations where reward is no longer accessible. We expand this line of research by investigating whether previously reward-predictive stimuli promote behavioural activation and impair motor inhibition in a second, unrelated task. In two experiments, participants were trained to associate colours with a monetary reward or neutral feedback. Afterwards, participants performed a cued go/no-go task in which cues appeared in the colours previously associated with feedback during training. In both experiments, training resulted in faster responses in rewarded trials, providing evidence of a value-driven response bias as long as reward was accessible. However, stimuli with a history of reward did not interfere with goal-directed action and inhibition in a subsequent task after removal of the reward incentives. While the first experiment was not conclusive regarding an impact of reward-associated cues on response inhibition, the second experiment, validated by Bayesian statistics, clearly questioned an effect of reward history on inhibitory control. This stands in contrast to earlier findings, suggesting that the effect of reward history on subsequent action control is not as consistent as previously assumed. Our results show that participants are able to overcome influences from Pavlovian learning in a simple inhibition task. We discuss our findings with respect to features of the experimental design that may help or complicate overcoming behavioural biases induced by reward history.

Supplementary Information: The online version contains supplementary material available at 10.1007/s00426-021-01485-7.

The same repeated-measures ANOVAs were calculated for response accuracy. A significant effect of the factor block showed that performance improved overall over the course of the training phase (F(1.68, 126.00) = 28.20, p < .001, ηp² = .27; Greenhouse-Geisser corrected). Post-hoc pairwise comparisons showed that this improvement levelled off after the second block (block1 vs. block2: p < .001; block1 vs. block3: p < .001; block2 vs. block3: p > .1). Accuracy was numerically higher in rewarded than in unrewarded trials, but this difference was not significant (F(1, 75) = 2.31, p > .1). The interaction between block and feedback type likewise revealed no differences between the conditions over time (F(2, 150) = 0.57, p > .1).
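As a plausibility check on the reported effect sizes, partial eta squared can be recovered from an F value and its degrees of freedom using the standard identity ηp² = (F · df_effect) / (F · df_effect + df_error). The following is a minimal sketch of that arithmetic; the function name `partial_eta_squared` is a hypothetical helper introduced here for illustration and is not part of the original analysis, which was run in SPSS.

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Recover partial eta squared from a reported F statistic and its dfs.

    Hypothetical helper for illustration only; it applies the identity
    eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    scaled_effect = f_value * df_effect  # equals SS_effect / MS_error
    return scaled_effect / (scaled_effect + df_error)


# Main effect of block on training accuracy, using the values reported above
# (Greenhouse-Geisser corrected degrees of freedom).
eta_block = partial_eta_squared(28.20, 1.68, 126.00)
print(f"{eta_block:.2f}")  # prints 0.27
```

The result agrees with the reported ηp² = .27. The same check can be applied to any other F test in this section for which the F value and both degrees of freedom are given.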

Experiment 1, test phase:
Accuracy in no-go trials following (invalid) go-cues: Analysis of variance using SPSS revealed that accuracy was significantly lower in the group that associated the cue with reward (main factor experimental group: F(1, 74) = 4.01, p < .05, ηp² = .05). The factor SOA reached significance (F(1, 74) = 8.85, p < .01, ηp² = .11), suggesting that overall fewer inhibitory failures were made when responses were given after 100 ms compared to 300 ms. As experience with the probabilities accumulated over blocks, participants committed more impulsive errors (block: F(4.76, 352.24) =, p < .1, ηp² = .03, Greenhouse-Geisser corrected), a trend that suggests the frequent response became increasingly automatic. Block did not interact with other factors (p > .1).
All other interactions were also not significant (p > .1).
In addition to these two conditions of interest we also analysed the following data: Accuracy in no-go trials following (valid) no-go cues: When participants had to remain inactive after cues that were mostly followed by a no-go signal, an influence of value on accuracy should manifest as higher accuracy in trials in which the cue was associated with neutral feedback rather than monetary reward. We observed no differences between the experimental groups (F(1, 74) = .13, p > .1) and no interactions with the within-subject factors (p > .1). The Bayesian analysis indicated that the data were about six times more likely under the null hypothesis than under the hypothesis of a value-driven group difference in this variable (BF10 = 0.159). As for response times in go-trials following no-go cues, the results suggested that the learned associations with value had no influence when cues were associated with inaction. Responses became more accurate overall over the course of the test phase (block: F(5.05, 373.41) = 5.8, p < .001, ηp² = .07, Greenhouse-Geisser corrected; BF10 = 720.520). For the factor SOA the data analysis was overall not informative (SOA: F(1, 74) = 3.32, p < .1, ηp² = .04; BF10 = 1.015). No other factor reached statistical significance, and the factors did not interact with each other in the frequentist analysis (p > .1). The Bayes factors for the interaction between experimental group and block (BF-inclusion = 0.006) and for the three-way interaction (BF-inclusion = 0.025) were in favour of the null hypothesis.
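A Bayes factor reported as BF10 expresses how much more likely the data are under the alternative than under the null hypothesis; values below 1 therefore favour the null, and the reciprocal BF01 = 1 / BF10 gives the strength of that support directly. The sketch below illustrates this conversion for the group effect reported above (BF10 = 0.159). The function name `bf01_from_bf10` is a hypothetical helper used only for illustration and is not taken from the original analysis.

```python
def bf01_from_bf10(bf10: float) -> float:
    """Convert a Bayes factor for H1 over H0 into one for H0 over H1.

    Hypothetical helper for illustration only.
    """
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be strictly positive.")
    return 1.0 / bf10


# Group effect on accuracy in no-go trials after valid no-go cues.
bf01_group = bf01_from_bf10(0.159)
print(f"{bf01_group:.2f}")  # prints 6.29
```

A BF01 of about 6.3 corresponds to the statement that the data are roughly six times more likely under the null hypothesis than under the alternative.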
For the interaction between experimental group and SOA we did not find informative results (BF-inclusion = 0.253).
Reaction times in go-trials following (invalid) no-go cues: When participants had to give a response after cues that were frequently followed by a no-go signal, we would expect to observe faster responses in the reward than in the neutral experimental condition, because the association with reward is expected to activate a response and thus should help to overcome the conflict between the cue and the response signal. However, we observed no difference between the experimental groups in response times after no-go cues (F(1, 74) = .00, p > .1). An influence of

Accuracy in go-trials following (valid) go-cues: Analysis of variance using SPSS revealed that participants responded more accurately in go-trials in the group that associated the cue with reward, but this difference did not reach significance (main factor experimental group:
Response times in no-go trials: Neither the model for trials with valid cues nor the model for trials with invalid cues could be estimated, because there were too few cases.
Accuracy in go trials: The Bayesian approach did not yield informative results concerning an effect of value in go trials (value: BF10 = 1.976), but there was moderate to extreme evidence for the null hypothesis for the interaction terms (value*block: BF-inclusion = 0.127; fractal*value: BF10 = 0.058; fractal*value*block: BF-inclusion = 1.716e-4).
The analysis of variance using SPSS suggested there was no difference in accuracy depending on whether the cue predicted a go or no-go response (fractal: F(1, 75) = 3.37, p < .1, ηp² = .04; fractal*block: F(5.88, 6.160e-5) = 0.54, p > .1, Greenhouse-Geisser corrected). Accuracy tended to be higher in the reward condition (value: F(1, 75) = 3.95, p = .051, ηp² = .05). When participants had to give an active response by pressing a button and the cue promoted this action, they responded more accurately when the cue's outline colour was additionally linked to reward. However, this did not reach significance, and these results rest on only very few behavioural failures because accuracy in go trials was overall very high (valid: M =