Numerous neurophysiological and neuroimaging studies implicate the striatum, the input unit of the basal ganglia, as a structure important for reward processing (Delgado, 2007; Hikosaka, Nakamura, & Nakahara, 2006; McClure, York, & Montague, 2004). The striatum is a major target of midbrain dopamine neurons, which code for prediction errors, with a phasic increase in activity in response to unexpected rewards and a decrease below baseline during the omission of expected rewards (Schultz, 2002, 2010). The blood oxygen level dependent (BOLD) signal in the striatum also shows greater response to rewards than to punishments, including positive versus negative performance-related feedback (Bischoff-Grethe, Hazeltine, Bergren, Ivry, & Grafton, 2009; Elliott, Frith, & Dolan, 1997; Seger, 2008; Seger, Peterson, Cincotta, Lopez-Paniagua, & Anderson, 2010; Tricomi, Delgado, McCandliss, McClelland, & Fiez, 2006; Tricomi & Fiez, 2008). In most of these studies, the outcome is presented immediately after the response of the participant or after a short delay of a few seconds (Delgado, 2007; Delgado, Nystrom, Fissell, Noll, & Fiez, 2000; Foerde & Shohamy, 2011; Haber & Knutson, 2010; Kobayashi & Schultz, 2008; Maddox, Ashby, & Bohil, 2003; Schultz, 2002, 2010; Tricomi & Fiez, 2008). This suggests that the striatum is involved in learning when an action and its outcome are in close temporal proximity.

Previous studies emphasize the importance of close temporal proximity of the response and the corresponding feedback during dopamine-dependent learning, due to a rapid degradation of the dopamine signal, which is thought to strengthen the link between the original action and its outcome (Foerde & Shohamy, 2011; Maddox et al., 2003). In addition, if there are many intervening events between the action and the delayed outcome, the dopamine signal might not strengthen the association between the specific action and the appropriate outcome (Cardinal, 2006; Cheung & Cardinal, 2005; Kurniawan, Guitart-Masip, & Dolan, 2011). Delays in feedback presentation of even a few seconds have thus been posited to disrupt activity in striatal targets of dopaminergic innervation, especially in certain subregions of the striatum, such as the caudate body (Foerde & Shohamy, 2011; Maddox et al., 2003; Nomura & Reber, 2008).

Yet the delay between an action and an outcome can be much longer in daily life, and humans and animals are still able to learn the association between specific outcomes and the actions that produced them (Cardinal, 2006; Foerde & Shohamy, 2011). For example, in academic testing situations, there is usually a time period between test submission and feedback about test performance. However, it is still unclear whether learning from such substantially delayed feedback engages similar brain structures and leads to similar performance as compared with learning from immediate feedback.

In the present study, we investigated whether the neural substrates underlying learning from delayed feedback are similar to those underlying learning from immediate feedback. Specifically, we examined the effect of performance-related feedback presentation after a substantial delay of 25 min, with many intervening events, on activity in the striatum and other brain structures. As a model, we used an academic testing situation and a feedback-based word association task, similar to one used previously to examine striatal activity following immediate feedback (Tricomi & Fiez, 2008). Participants engaged in a study phase performed outside the scanner, followed by fMRI data acquisition as they performed a multiple-choice test of their memory. For some trials, participants received immediate feedback, but for others, they did not receive feedback until the trials were presented during a second, review phase. As with an academic test, they were shown their previous responses and feedback about whether they were correct. This paradigm allowed us to test several alternative hypotheses. First, delay could have no effect on the pattern of neural activity observed following feedback. In this case, positive feedback should produce stronger activation than should negative feedback, regardless of whether the feedback is presented immediately or after a substantial delay. Alternatively, a substantial delay could alter neural processing of feedback. For example, unlike immediate feedback, positive and negative feedback might not produce differential activation in the striatum after a delay. Finally, unlike learning from immediate feedback, learning from delayed feedback might be dependent on neural structures other than the striatum. For example, recent evidence has suggested that the medial temporal lobe (MTL) may be particularly important for learning from delayed feedback (Foerde & Shohamy, 2011).

Thus, the aim of the present study was to investigate whether participants would be able to learn from delayed feedback presentation to the same degree as from immediate feedback, and to map brain regions responsible for the processing of the feedback presentation after a substantial delay.

Method

Participants

Twenty-four right-handed individuals consented to participate in the experiment for a payment of $50. Four participants were excluded from the main analysis due to technical problems, and one participant was excluded due to excessive motion. Therefore, data from 19 participants were analyzed (11 females; mean age 23.89 years, SD 3.14). Sample size was determined on the basis of pilot behavioral testing, which indicated that approximately 20 participants were sufficient to show significant behavioral effects, and published statistical power analyses of fMRI studies, which suggest that approximately 15–20 participants are necessary for 80% power (Desmond & Glover, 2002). All of the participants were fluent in English and had no MRI contraindications. The research was approved by the Institutional Review Boards of Rutgers University and the University of Medicine and Dentistry of New Jersey.

Materials

A 3-Tesla Siemens Allegra scanner was used to acquire all fMRI data. Behavioral data acquisition and stimulus presentation were administered using the “E-Prime” software (Schneider, Eschman, & Zuccolotto, 2002).

Procedure

Scan session

A T1-weighted pulse sequence was used to collect structural images in 43 contiguous slices (3 × 3 × 3 mm voxels) tilted 30° from the AC-PC line (Deichmann, Gottfried, Hutton, & Turner, 2003). Similarly, 43 functional slices were collected using a single-shot echo-planar imaging (EPI) sequence, with 172 acquisitions per run (TR = 2,500 ms, TE = 25 ms, FOV = 192 mm, flip angle = 80°).

Behavioral paradigm

In this study, participants had to perform a paired-associate word learning task (Tricomi & Fiez, 2008). The words used in the experiment contained 4–8 letters and 1–2 syllables, had Kučera–Francis frequencies of 20–650 words per million, and had imageability ratings over 400 according to the MRC database (Coltheart, 1981). The words were matched for word length and frequency at the trial level. To ensure that words presented on the same trial were not semantically related, we used the Web implementation of the latent semantic analysis (LSA) similarity matrix (lsa.colorado.edu), using the corpus, “General Reading up to 1st Year College (300 factors).” The maximum factors available were used, and the words within each trial had a score of less than 0.2 on the LSA matrix (Landauer, Foltz, & Laham, 1998). Additionally, they did not rhyme or begin with the same letter.
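The selection criteria above can be expressed as a simple filter. The following is a minimal sketch only: the `Word` record, its field names, and the two helper functions are hypothetical and are not taken from the study's materials. The rhyme criterion is left out because it would require phonological data that this sketch does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    # Hypothetical record; field names are illustrative assumptions.
    text: str
    syllables: int
    kf_frequency: int   # Kucera-Francis frequency (per million)
    imageability: int   # MRC database rating


def meets_word_criteria(w: Word) -> bool:
    """Per-word criteria as stated in the text."""
    return (
        4 <= len(w.text) <= 8
        and 1 <= w.syllables <= 2
        and 20 <= w.kf_frequency <= 650
        and w.imageability > 400
    )


def compatible_on_trial(a: Word, b: Word, lsa_similarity: float) -> bool:
    """Pairwise criteria: LSA score below 0.2 and different first letters.

    The rhyme check described in the text is intentionally omitted.
    """
    return lsa_similarity < 0.2 and a.text[0].lower() != b.text[0].lower()
```

In this sketch the LSA score is passed in as a number; how it is obtained from the Web tool is outside the scope of the example.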

There were four phases of the experiment: study phase, scanning phase 1, scanning phase 2, and test phase (see Fig. 1a). The study phase occurred outside the scanner and involved initial learning of the word pairs. Scanning phases 1 and 2 were performed during acquisition of fMRI data. Scanning phase 1 involved an initial “multiple-choice” test of memory for the word pairs, with feedback provided only in the immediate feedback condition. The trials were repeated during scanning phase 2 (with the original response highlighted), and feedback was provided for those trials in the delayed feedback condition. Finally, the test phase occurred outside the scanner and served to test the influence of the different feedback conditions on subsequent performance.

Fig. 1

a. Chart of experimental events demonstrating their progression through time. b. Depiction of trials for each condition during Scanning Phases 1 and 2. During Scanning Phase 2, the blue highlight appears for all conditions and indicates the participant’s choice made at Scanning Phase 1

During the study phase, performed outside of the scanner, participants attempted to learn the word associations (180 trials). This was done in order for participants to acquire initial learning that would be further augmented via feedback presentation. In addition, it has been shown that the striatum is differentially activated when feedback is informative of one’s performance, but not when feedback is only arbitrarily related to one’s responses, prior to learning (Tricomi & Fiez, 2008).

The format of experimental trials resembled multiple-choice test questions. That is, on each trial of the study phase, participants were presented with three words, where the top word was the main word with two word options underneath. One of the options was highlighted in green, indicating that this option was the correct match for the main word. Participants were instructed to memorize the main word and the associated highlighted option. Trials were presented in random order for the duration of 4 s, separated by a fixation point lasting for 3 s.

The words learned during the study phase were then randomly assigned to the three feedback presentation conditions (immediate-, delayed-, and no-feedback conditions) and were presented during six scanning sessions, each lasting 7 min (Fig. 1a). The conditions were presented randomly in blocks of 10 trials (six blocks of immediate- and delayed-feedback conditions; three blocks of no-feedback condition). Each block was separated from the next block by a jittered fixation point (1–5 s). Each trial lasted approximately 8 s and started with a jittered fixation point (1–5 s), with a label informing participants of the upcoming condition. Although the order of the blocks was randomly interspersed during scanning phase 1, the trial order was then maintained when the trials were repeated during scanning phase 2, so that the length of time between the presentation of the initial trial and the corresponding delayed-feedback event would not vary.

During scanning phase 1, participants had to press a button to select one of the options as a match for the main word, on the basis of what they remembered from the study phase. The order of the two options was randomized. Feedback, which reflected whether participants selected a correct match for the main word (green √; red X), was presented for trials in the immediate-feedback condition only. For trials in the delayed-feedback and no-feedback conditions, participants were not informed as to whether they selected the correct option. Instead of the feedback, they were presented with a control screen that showed a black pound key (#). Both feedback and control screens were presented for the duration of 1 s.

During scanning phase 2, the trials were repeated so that participants were reminded of their choice made during scanning phase 1. This was done by presenting a blue highlight around the option that they had selected during scanning phase 1, for all conditions (Fig. 1b). The order of the two word options was again randomized. In order to control for the motor response, when presented with the stimulus, participants were required to press a third button, unrelated to any word option. During scanning phase 2, feedback was presented for trials in the delayed-feedback condition that reflected whether participants had selected the correct option during scanning phase 1. The resulting delay between the action and the outcome was approximately 25 min. For trials in the immediate-feedback and no-feedback conditions, participants were presented with a control screen that showed a black pound key (#) instead of the feedback (green √; red X). Although our question of interest necessitated that the delayed feedback occur later in the experiment than the immediate feedback, the control screens ensured that the only difference between conditions was the stage at which the feedback was delivered.

To test the effect of feedback presentation on subsequent accuracy, the words from all of the feedback conditions were presented in random order during the test phase, which occurred outside of the scanner. Feedback was not presented during the test phase. Each trial lasted 4 s and was followed by a confidence-rating question, for which participants were given unlimited time to indicate how certain they were about their response on a scale from 1 to 7 (1 = complete guess; 7 = completely sure). A fixation point followed and lasted for 3 s.

Data analysis

Behavioral data

Behavioral analyses were performed on the data from participants who were included in the fMRI analyses. Accuracy data from scanning phase 1 and from the test phase were analyzed with an ANOVA and post hoc paired t-tests.
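Because every participant contributes accuracy scores to each condition and phase, the post hoc comparisons reported in the Results are paired. A minimal sketch of such a comparison follows, using only the Python standard library. The function name is hypothetical, and the effect size computed here (mean difference divided by the standard deviation of the differences) is an assumption, since the text does not state which definition of d was used.

```python
import math
import statistics


def paired_t_and_d(before, after):
    """Return (t, df, d_z) for a paired comparison of two equal-length lists.

    d_z = mean(diff) / sd(diff) is one common paired effect size; it is an
    assumption here, not necessarily the definition used in the study.
    """
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1, mean_diff / sd_diff
```

With 19 participants this yields 18 degrees of freedom, matching the t(18) values reported below. Under this definition, t equals d_z multiplied by the square root of n.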

fMRI data

Preprocessing of the functional data for each session was performed using the BrainVoyager QX software (Version 2.1.2; Brain Innovation, Maastricht, The Netherlands). Preprocessing included three-dimensional correction for motion using six parameters. Images were spatially smoothed (8 mm, FWHM), voxel-wise linearly detrended, and passed through a high-pass temporal filter (3 cycles per time course, equivalent to approximately 0.007 Hz). The resulting data were normalized to the Talairach stereotaxic space (Talairach & Tournoux, 1988). A random-effects general linear model (GLM) analysis was performed on regressors corresponding to the 1-s time period of feedback presentation, which were convolved with a canonical hemodynamic response function.
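Two of the steps above lend themselves to a short illustration: the conversion of the filter cutoff from cycles per time course to Hz, and the construction of a feedback regressor. The sketch below is not the BrainVoyager implementation. The two-gamma response shape (gamma shape parameters 6 and 16, undershoot ratio 1/6) is a commonly used form that is assumed here, and all function names are hypothetical. The run length and TR come from the scan parameters given earlier.

```python
import math

TR_S = 2.5        # repetition time (s), from the scan parameters
N_VOLUMES = 172   # acquisitions per run, from the scan parameters


def highpass_cutoff_hz(cycles, n_volumes=N_VOLUMES, tr_s=TR_S):
    """Convert a cutoff in cycles per time course to Hz."""
    return cycles / (n_volumes * tr_s)


def double_gamma_hrf(t, shape_peak=6.0, shape_under=16.0, ratio=1.0 / 6.0):
    """Two-gamma response at time t (s); shape values are assumed defaults."""
    if t < 0:
        return 0.0
    g1 = t ** (shape_peak - 1) * math.exp(-t) / math.gamma(shape_peak)
    g2 = t ** (shape_under - 1) * math.exp(-t) / math.gamma(shape_under)
    return g1 - ratio * g2


def feedback_regressor(onsets_s, duration_s=1.0, dt=0.1):
    """Boxcar per feedback event, convolved with the HRF, sampled every TR."""
    n_fine = int(round(N_VOLUMES * TR_S / dt))
    boxcar = [0.0] * n_fine
    for onset in onsets_s:
        start = int(round(onset / dt))
        stop = int(round((onset + duration_s) / dt))
        for i in range(start, min(stop, n_fine)):
            boxcar[i] = 1.0
    # Sample the response function over 32 s on the fine time grid.
    kernel = [double_gamma_hrf(k * dt) * dt for k in range(int(32 / dt))]
    fine = [0.0] * n_fine
    for i, b in enumerate(boxcar):
        if b:
            for k, h in enumerate(kernel):
                if i + k < n_fine:
                    fine[i + k] += b * h
    step = int(round(TR_S / dt))
    return fine[::step][:N_VOLUMES]
```

For a run of 172 volumes at a TR of 2.5 s (430 s), `highpass_cutoff_hz(3)` gives 3 / 430, or roughly 0.007 Hz, which agrees with the value stated above.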

To gain as complete an understanding as possible of our results, we performed three types of analyses on our data. First, we performed a whole-brain, voxel-wise ANOVA with valence (positive, negative, and no feedback) and delay (phase 1 vs. phase 2) as within-subjects factors. For this analysis, the phase 2 presentation of immediate-feedback trials and phase 1 presentation of delayed-feedback trials (for which feedback was not presented) were included in the model as predictors of no interest. For illustration purposes, we show event-related average time courses of the evoked responses in the basal ganglia regions identified by this analysis. These plots show percentage of signal change from baseline, with the baseline calculated as the average signal for the 1-s period before the start of each trial.
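The percentage of signal change used for the time-course plots can be sketched as follows. This is an illustrative assumption about the computation rather than the software's actual routine, and the function names are hypothetical. Each trial is represented by its baseline value and its list of samples.

```python
def percent_signal_change(timecourse, baseline):
    """Express each sample as percent change from a baseline signal value."""
    if baseline == 0:
        raise ValueError("baseline must be nonzero")
    return [100.0 * (s - baseline) / baseline for s in timecourse]


def event_related_average(trials):
    """Average percent-change time courses across trials, point by point.

    Each trial is a (baseline_value, samples) pair; all sample lists must
    have the same length.
    """
    converted = [percent_signal_change(samples, base) for base, samples in trials]
    n = len(converted)
    return [sum(col) / n for col in zip(*converted)]
```

Converting each trial to percent change before averaging keeps trials with different raw signal levels on a common scale.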

Second, additional whole-brain contrasts were conducted to directly compare immediate- and delayed-feedback presentation and to compare valence conditions for immediate and delayed feedback, individually. For the delay contrast, the no-feedback trials of scanning phase 1 versus scanning phase 2 were used to control for non-feedback-related effects of time, since the immediate-feedback trials necessarily occurred during scanning phase 1, while the delayed-feedback trials occurred during scanning phase 2. The valence contrasts were aimed at detecting differences between positive and negative feedback associated with immediate- and delayed-feedback presentation and between feedback presentation and no-feedback presentation of a corresponding delay. Thus, for these analyses, the predictors of interest were immediate feedback (positive and negative), delayed feedback (positive and negative), and no feedback (phases 1 and 2). For all analyses, the no-response trials and the six motion parameters were included in the model as regressors of no interest. We identified regions of interest (ROIs) thresholded at p < .005 with a contiguity threshold of six contiguous voxels (3 × 3 × 3 mm³ each), determined by using the cluster-level statistical threshold estimator in BrainVoyager (Version 2.1; Brain Innovation, Maastricht, The Netherlands). This method corrects for multiple comparisons and produces a cluster-level false positive rate (alpha) of .05.

Finally, we performed a targeted ROI analysis on a priori regions of interest, using spheres with a diameter of 5 mm, centered on coordinates in the dorsal striatum (15, 15, 15) and hippocampus (−30, −12, −18) reported by Foerde and Shohamy (2011) to show effects of a 6-s delay on feedback processing. This allowed us to make more straightforward comparisons between the results from the two studies.
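A spherical ROI of this kind can be sketched as the set of grid points whose centers fall within the sphere. The sketch below assumes a 1-mm isotropic grid, which is an assumption and is not stated in the text, and the function name is hypothetical. The two center coordinates are the ones given above.

```python
DORSAL_STRIATUM = (15, 15, 15)
HIPPOCAMPUS = (-30, -12, -18)


def sphere_voxels(center, diameter_mm=5.0, voxel_mm=1.0):
    """List grid coordinates whose centers lie within the sphere.

    The 1-mm default grid spacing is an assumption for illustration.
    """
    radius = diameter_mm / 2.0
    reach = int(radius // voxel_mm)
    cx, cy, cz = center
    voxels = []
    for dx in range(-reach, reach + 1):
        for dy in range(-reach, reach + 1):
            for dz in range(-reach, reach + 1):
                dist_sq = (dx * voxel_mm) ** 2 + (dy * voxel_mm) ** 2 + (dz * voxel_mm) ** 2
                if dist_sq <= radius ** 2:
                    voxels.append(
                        (cx + dx * voxel_mm, cy + dy * voxel_mm, cz + dz * voxel_mm)
                    )
    return voxels
```

On a 1-mm grid a 5-mm sphere contains 81 grid points. On a 3-mm grid it would contain only the center voxel, so the ROI size depends strongly on the resolution at which the sphere is defined.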

Results

Behavioral results

Accuracy

Figure 2a displays accuracy results for the three feedback conditions of scanning phase 1 and for the words from the three conditions presented during the test phase. We conducted an ANOVA with the factors of feedback type (immediate-, delayed-, and no-feedback type) and experimental phase (scanning phase 1 and test phase) in order to see the effect feedback had on learning. The ANOVA revealed a significant interaction of feedback type and experimental phase, F(2, 18) = 7.39, p = .002, and a main effect of experimental phase, F(1, 18) = 19.97, p < .0001.

Fig. 2

a. Accuracy for the three feedback conditions during the scanning phase and at test phase. Significant differences between the scanning phase and the test phase accuracy were detected for immediate and delayed feedback conditions (p < 0.001). At test phase, the immediate feedback condition and delayed feedback conditions differ significantly from the no feedback condition (p < 0.05 for both comparisons). b. Accuracy at the test phase for positive and negative feedback trials from Scanning Phase 1. A significant difference was observed for negative feedback trials between immediate and delayed feedback (p < 0.01). *** indicates p-values less than 0.001; ** indicates p-values less than 0.01; * indicates p-values less than 0.05

Post hoc two-tailed paired t-tests showed that participants’ accuracy improved significantly in the immediate- and delayed-feedback conditions. This was indicated by a significant difference between scanning phase 1 and the test phase, t(18) = 4.04, p = .001, d = 0.86 (immediate-feedback condition), and t(18) = 5.39, p < .0001, d = 1.24 (delayed-feedback condition). No significant accuracy increase was observed between scanning phase 1 and the test phase in the no-feedback condition, t(18) = 1.34, p = .19, d = 0.2.

In addition, two-tailed paired t-tests revealed significant differences at the test phase between the immediate- and no-feedback conditions and between the delayed- and no-feedback conditions, t(18) = 2.53, p = .02, d = 0.54 (immediate- vs. no-feedback condition), and t(18) = 2.7, p = .01, d = 0.60 (delayed- vs. no-feedback condition), showing that both immediate and delayed feedback were effective in improving performance, relative to no feedback. Consistent with this, participants’ accuracy in the immediate-feedback condition was not significantly different from the accuracy in the delayed-feedback condition, t(18) = 0.64, p = .53, d = 0.70.

Influence of feedback valence and delay on subsequent performance

Two-tailed paired t-tests were performed on the test phase accuracy data in order to see whether learning from immediate versus delayed feedback produced subsequent differences in performance. No significant differences in test phase accuracy were found on the basis of delay condition for positive feedback trials. However, for trials on which participants received negative feedback during the scan, there was a significant difference at the test phase between immediate and delayed feedback, t(18) = 2.90, p = .0095, d = 0.71. That is, significantly more negative feedback trials were answered correctly at the test phase in the delayed-feedback condition than in the immediate-feedback condition (Fig. 2b).

fMRI results

ANOVA results

We performed a whole-brain, voxel-wise ANOVA with delay (phase 1 vs. phase 2) and valence (positive, negative, and no feedback) as within-subjects factors. The resulting clusters of activation are listed in Supplemental Table 1 (p < .05, corrected). The largest and most significant region identified as showing a main effect of delay was the lentiform nucleus (the putamen and globus pallidus [GP]), bilaterally. A similar region also showed a main effect of valence. An overlap map of the main effects of delay and valence in the lentiform nucleus is presented in Fig. 3a, and the activation time courses in the voxels that show both a main effect of delay and a main effect of valence are presented in Fig. 3b and c. Two-tailed t-tests show that negative feedback produces a greater response than does positive feedback [t(18) = 3.58, p = .002, d = 0.87 for immediate feedback; t(18) = 2.47, p = .02, d = 0.57 for delayed feedback] and also a greater response than does no feedback [t(18) = 3.1, p = .006, d = 0.78 for immediate negative vs. no feedback; t(18) = 2.88, p = .01, d = 0.76 for delayed negative vs. no feedback]. Additionally, the temporal pattern of activation differs between immediate and delayed feedback, with a rise in activation at the beginning of the trial for the immediate-feedback condition that falls quickly back to baseline after feedback delivery and more sustained activation following delayed negative feedback [t(18) = 4.17, p < .001, d = 1.03 for delayed negative vs. immediate negative feedback presentation].

Fig. 3

a. An overlap map of regions displaying the main effect of valence (in orange; positive vs. negative vs. no feedback) and the main effect of delay (in blue; Phase 1 vs. Phase 2). The anterior caudate shows sensitivity to the valence of feedback (positive vs. negative feedback), while the lentiform nucleus (putamen and globus pallidus) shows sensitivity both to delay (immediate vs. delayed feedback presentation) and valence. b. Time course of activation to immediate feedback in the right lentiform nucleus. c. Time course of activation to delayed feedback in the right lentiform nucleus. d. Time course of activation to immediate feedback in the right anterior caudate nucleus showing main effect of valence. e. Time course of activation to delayed feedback in the right anterior caudate nucleus. Feedback onset is at 0 seconds

A main effect of valence was also found more anteriorly, in the caudate head, bilaterally (Fig. 3a), as well as in the caudate body. As Fig. 3d and e show, regardless of delay, positive feedback and no-feedback trials produce increases in caudate activation, whereas negative feedback produces a decrease in activation. Two-tailed paired t-tests indicate that parameter estimates were higher for positive feedback than for negative feedback [t(18) = 4.15, p = .001, d = 0.92 for immediate feedback; t(18) = 2.69, p = .01, d = 0.66 for delayed feedback]. The parameter estimates for phase 1 no feedback versus immediate negative feedback showed a trend toward significance, t(18) = 1.87, p = .08, d = 0.38, and there was no significant difference between phase 2 no feedback and delayed negative feedback, t(18) = 0.81, p = .43. Finally, several cortical regions showed an interaction of delay and feedback valence (Supplemental Table 1); for all of these regions, the difference between positive and negative feedback-related activity was greater for immediate than for delayed feedback. However, no striatal areas were identified as showing a significant effect for this contrast.

Effects of delay

To further investigate the brain responses to feedback presented immediately versus after a delay, a whole-brain GLM analysis was conducted to directly compare immediate- and delayed-feedback presentation, while controlling for scanning phase. That is, collapsing across valence, immediate-feedback presentation was compared with delayed-feedback presentation, while including the no-feedback presentation trials of the corresponding phase as a control [i.e., (delayed feedback − phase 2 no feedback) vs. (immediate feedback − phase 1 no feedback)]. This contrast revealed activity in the lentiform nucleus and the posterior caudate nucleus (Fig. 4; Supplemental Table 2).
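The logic of this phase-controlled contrast can be made concrete with a set of contrast weights. The sketch below is illustrative only: the condition names and the function are hypothetical, and the four predictors are shown already collapsed across valence, whereas the actual model contained separate positive and negative feedback predictors.

```python
# (delayed - phase 2 no feedback) - (immediate - phase 1 no feedback)
DELAY_CONTRAST = {
    "delayed_feedback": 1.0,
    "no_feedback_phase2": -1.0,
    "immediate_feedback": -1.0,
    "no_feedback_phase1": 1.0,
}


def contrast_value(betas, weights=DELAY_CONTRAST):
    """Weighted sum of condition parameter estimates for one voxel."""
    return sum(weights[name] * betas[name] for name in weights)
```

Because the weights sum to zero and each feedback condition is paired with the no-feedback trials of its own phase, a signal change that affects all phase 2 trials equally cancels out of the contrast. Only a delay effect that is specific to feedback trials produces a nonzero value.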

Fig. 4

Effect of delay. The lentiform nucleus (circled) shows increased activation to feedback presentation after a delay (p < 0.05, corrected): (Delayed feedback − Phase 2 no feedback) versus (Immediate feedback − Phase 1 no feedback). The no feedback presentation trials serve as a control for potential order effects

Effects of valence for immediate-feedback presentation

To gain a better understanding of the effects of valence in our data set, we also performed contrasts between the different valence conditions for immediate- and delayed-feedback presentation, individually. For the immediate-feedback condition, the contrast between positive feedback presentation versus negative feedback presentation revealed significant differences in activity in the right and left caudate nuclei (Fig. 5; Supplemental Table 3a). The cluster of activity in the right caudate overlapped with the cluster identified by our ANOVA as showing a main effect of valence. No striatal activity was detected for the contrast of immediate positive versus no feedback (Supplemental Table 3b). A cluster of activity in the caudate tail was detected for the contrast of no-feedback (phase 1) versus immediate negative feedback presentation.

Fig. 5

Brain activity resulting from immediate feedback presentation. The anterior caudate nuclei show increased activation to immediate positive feedback presentation versus immediate negative feedback presentation (p < 0.05, corrected)

A cluster of activity was also detected in the right and left dorsal anterior cingulate cortex (dACC), a region often implicated in error processing and conflict resolution, showing greater activation during the presentation of the negative feedback than during presentation of positive feedback. These clusters were similar to areas from our ANOVA that showed a main effect of delay (right dACC) and a main effect of valence (bilaterally).

Effects of valence for delayed-feedback presentation

Differential activity in the lentiform nucleus was observed for the contrast of delayed negative feedback versus delayed positive feedback and for the contrast of delayed negative versus no feedback, suggesting a role in error processing (Fig. 6; Supplemental Table 4a, b, c). This region overlapped with the ANOVA regions showing a main effect of valence and of delay.

Fig. 6

Brain activity resulting from delayed feedback presentation. An overlap of activity in the lentiform nucleus associated with increased activation to delayed negative feedback versus delayed positive feedback presentation and delayed negative feedback versus no feedback presentation (p < 0.05, corrected)

Consistent with our findings for immediate feedback, the anterior insula and ventral anterior cingulate were activated bilaterally for the presentation of delayed negative feedback versus no feedback and for delayed negative feedback versus delayed positive feedback. These regions overlapped with the clusters of activity identified by our ANOVA as showing a main effect of valence.

A priori ROI results

We also conducted a targeted ROI analysis, with spheres centered on coordinates in the dorsal caudate (x = 15, y = 15, z = 15) and in the hippocampus (x = −30, y = −12, z = −18) from a recent paper investigating feedback over a delay of 6 s (Foerde & Shohamy, 2011). In the dorsal caudate, Foerde and Shohamy found more prediction-error-related activation for immediate feedback than for delayed feedback. Consistent with this, we found a significant difference in activation in this ROI between positive and negative feedback for immediate feedback, t(18) = 2.5, p = .023, but not for delayed feedback, t(18) = −0.16, p = .87. The difference between positive and negative feedback was not significantly greater for immediate than for delayed feedback, although there was a trend toward significance, t(18) = 1.8, p = .09. These results differ from our finding that the somewhat more ventral caudate region identified from our ANOVA showed a significant effect of valence for both immediate and delayed feedback. The discrepancy suggests that there may be differences within the striatum in sensitivity to feedback delay.

In the hippocampus, Foerde and Shohamy (2011) found more prediction-error-related activation for delayed feedback than for immediate feedback. We did not find significant differences between conditions in the hippocampal ROI, although both immediate and delayed positive feedback produced increases in signal, whereas immediate and delayed negative feedback produced decreases in the hippocampal signal.

Discussion

Feedback processing after a delay

In this experiment, performance-related feedback was presented either immediately or after a substantial delay of approximately 25 min. As in studies with a shorter delay (Foerde & Shohamy, 2011), our behavioral findings revealed equivalent accuracy at the test phase, indicating that participants were able to learn from both immediate and delayed feedback. To support learning in the delayed condition, either the striatum must be similarly recruited during both immediate- and delayed-feedback processing, or delayed-feedback processing must be accomplished through a separate neural mechanism. We observed activation in the caudate and the lentiform nucleus during the presentation of both types of feedback in our study. These subregions appear to be playing different roles in feedback-based learning, however.

Our whole-brain ANOVA identified the head of the caudate nucleus as showing a main effect of valence, in line with previous work showing an increase in caudate activation following positive feedback and a decrease in activity following negative feedback presentation (e.g., Tricomi et al., 2006; Tricomi & Fiez, 2008). This region showed a similar response profile for both immediate- and delayed-feedback presentation in our study, with positive feedback eliciting greater activation than negative feedback in both conditions. This suggests that striatal processing of the affective component of feedback is not limited to feedback presented immediately after a response but that a similar mechanism is involved in feedback processing even after a substantial delay. It should be noted, however, that this effect was not quite as robust for the delayed-feedback condition, and therefore, we did not identify the caudate at our significance threshold in our whole-brain contrast of positive versus negative feedback for the delayed-feedback condition. This leaves open the possibility that the caudate may be more sensitive to immediate feedback than to delayed feedback, although we did not observe either a main effect of delay or an interaction between delay and valence in this region.

We reminded our participants of their choice in the delayed-feedback condition, because we wanted to use an academic testing situation as a model for learning from delayed feedback. In addition to having more real-world significance, this reminder also avoided confounds related to the added memory demands the delayed feedback condition would have required without such a reminder. That is, we did not wish to test participants’ memory for their original choice across the 25-min delay but, rather, whether a delay from the original action would be enough to alter striatal responses to feedback. Future research could investigate the degree to which a reminder is critically important, both for performance and for patterns of brain activity in the striatum and in memory regions. Experiments using different delays both with and without intervening events might be expected to produce different results in this regard, since a failure of working memory and a failure of long-term memory for the original response would likely depend on different brain structures.

Relation to previous research

The COVIS (competition between verbal and implicit systems) theory posits that the prefrontal cortex and head of the caudate are involved in explicit rule-learning tasks, whereas the tail of the caudate and inferotemporal visual form processing areas are involved in implicit “information integration” learning (Ashby, Alfonso-Reese, Turken, & Waldron, 1998). The COVIS theory predicts that delays will have less of an effect on rule-based learning than on information integration learning (Maddox et al., 2003). Although this theory is usually applied to category-learning tasks, our task with individual word pairs to be memorized on each trial bears more resemblance to a rule-learning task than to an information integration task. Therefore, our finding of differential activation to positive and negative feedback in the head of the caudate, whether the feedback is immediate or delayed, is in line with this theory. Neuroimaging studies investigating the COVIS theory have found that both rule-based and information integration tasks activate the caudate similarly (Nomura et al., 2007; Nomura & Reber, 2008) but that there are functional distinctions between the caudate head, which processes feedback valence, and the caudate body/tail, which plays a more general role in learning from feedback (Seger & Cincotta, 2005). These results may help explain why our results in the head of the caudate differed from those in the more dorsal caudate body region examined in our targeted ROI analysis, which showed more sensitivity to delay.

A recent study found that a delay of only 6 s between a response and feedback resulted in diminished feedback-related activity in the striatum (Foerde & Shohamy, 2011). This stands in contrast to our finding that the caudate head continues to show differentiation of positive versus negative feedback even after a substantial delay of 25 min. There are several possible reasons for this discrepancy. As was noted above, there may be differences within the striatum in sensitivity to feedback delay. Indeed, our targeted ROI analysis of the caudate body region from the Foerde and Shohamy results was consistent with that study, in that we did not find a valence effect for delayed feedback. Additionally, there are important differences in the experimental design between these two studies. As was mentioned above, participants in our study were briefly reminded of their original answer prior to receiving feedback. This may have been enough to reactivate a representation of the response and allow the striatal reward-processing system to link the feedback to the original response, producing the typical “reward” and “punishment” responses for positive and negative feedback, respectively. Although the stimuli chosen by the participants in Foerde and Shohamy’s study remained on the screen across the 6-s interval between the response and the outcome, it is possible that this encouraged the formation of stimulus–outcome (or stimulus–stimulus) associations, rather than response–outcome associations.

Furthermore, whereas we did not find significant effects in the hippocampus, Foerde and Shohamy (2011) found the hippocampus to be engaged in outcome processing when it was temporally separated from the cue. Again, if the longer presentation of the chosen stimulus encouraged the formation of stimulus–stimulus associations, this might have caused increased hippocampal engagement, relative to our task, in which participants were reminded of their response just prior to the presentation of the delayed feedback. Thus, even though our task involved declarative memory acquisition, tasks that are more specifically aimed at modulating MTL activity across conditions may be necessary to identify dopaminergic or reward-related influences on activity in this region (Foerde & Shohamy, 2011; Sadeh, Shohamy, Levy, Reggev, & Maril, 2011; Shohamy, Myers, Kalanithi, & Gluck, 2008; Wittmann et al., 2005).

Lentiform nucleus activation

We identified a second, more posterior region in the basal ganglia as showing not only a main effect of valence, but also a main effect of delay. A similar lentiform nucleus region was identified in our follow-up analyses as showing greater activity for delayed versus immediate feedback, delayed negative versus positive feedback, and delayed feedback versus no feedback. This region did not overlap with the more anterior region in the caudate identified in processing immediate-feedback valence. These results suggest that the lentiform nucleus may become more involved in feedback processing after a delay, especially in processing negative feedback. Even though both regions lie within the basal ganglia, the role of this region appears to be distinct from the role of the more anterior caudate region, due to its increased activation after a delay.

The posterior-dorsal basal ganglia (caudate body and tail and the putamen and GP) is typically thought to be part of the “motor loop” (Grahn, Parkinson, & Owen, 2008; Middleton & Strick, 2000; Seger, 2008); therefore, it is possible that the motor demands of the task differed upon receipt of delayed negative feedback, relative to the other conditions of our experiment. However, the order of the response options was randomized in each phase of the experiment, so the associations learned were not specific to a particular motor response. Additionally, in phase 2, across all conditions, participants pressed a third button, unrelated to the two buttons used in phase 1. It is also possible that the lentiform nucleus plays a cognitive role in our task, since the GP projects to prefrontal cortex structures involved in processing of cognitive information, such as the dorsolateral prefrontal cortex and the dACC (Boettiger & D’Esposito, 2005; Haber & Knutson, 2010; Han, Huettel, Raposo, Adcock, & Dobbins, 2010; Longe, Senior, & Rippon, 2009; Mohanty et al., 2007). Recent studies also suggest that in addition to its motor functions, the GP plays an important role in memory processing and learning (Baier, Karnath, & Dieterich, 2010; McNab & Klingberg, 2008; Scimeca & Badre, 2012).

Anterior cingulate cortex–basal ganglia network

Anterior cingulate activity was observed for the presentation of delayed negative feedback when contrasting it to no-feedback presentation and for the presentation of immediate negative feedback when contrasting it to positive feedback and no-feedback presentation. Previous studies also report increased ACC activation in response to feedback, suggesting that the ACC plays an important role in decision making (Rushworth, Behrens, Rudebeck, & Walton, 2007). The ACC is proposed to detect conflict related to improper responding during the task (Daniel & Pollmann, 2010; Holroyd & Coles, 2002; Holroyd et al., 2004; Yeung et al., 2004). Specifically, the prediction error signal from the midbrain, which reflects how well the expectation of the outcome matches the actual outcome, is used by the ACC (Kennerley, Walton, Behrens, Buckley, & Rushworth, 2006; Milham & Banich, 2005) to signal to other cognitive areas to increase cognitive control and correct future performance (Hong & Hikosaka, 2008).

Activation of the lentiform nucleus in our study was especially pronounced for trials with delayed negative feedback. This activity, in conjunction with the observed activation for negative feedback trials in the dACC, may reflect a role of this network in error processing. Indeed, neurons in the internal segment of the GP, which influence dopamine neurons via their projections to the lateral habenula, are sensitive to prediction errors and increase their firing rate when a target signals the absence of a reward (McNab & Klingberg, 2008; Walsh & Phillips, 2010). Even though it is not possible to distinguish the specific part of the GP activated during our task using fMRI tools, this is one potential mechanism by which the basal ganglia region activated in our study may contribute to learning from feedback after a delay. Further research will be needed, however, to determine why this region might be especially sensitive to errors after a substantial delay.

One possibility is that, after time elapses, negative feedback may be perceived as providing an opportunity to learn so that one’s performance can be subsequently corrected. In line with this interpretation, we found that errors were more likely to be corrected during the test phase if the negative feedback was received after a delay, rather than immediately. Since the test phase occurred closer in time to the presentation of the delayed feedback than to the presentation of immediate feedback, it is also possible that the greater improvement in accuracy on the test phase after receiving delayed negative feedback could be due to recency effects. We would expect, however, that recency effects would apply to both positive and negative feedback trials, but this was not the case: immediate positive feedback and delayed positive feedback did not result in differential performance during the test phase. This finding, in conjunction with the enhanced activation of the lentiform nucleus following delayed negative feedback, suggests that temporal delays may have an especially important influence on processing error-related information.

Limitations

Our study shows a sensitivity of the caudate and the lentiform nucleus to feedback after a substantial delay of 25 min, which adds to our understanding of the neural processing of performance-related feedback. One limitation of this study, however, is that our results do not allow us to explicitly link activation in striatal regions with learning. We included a study phase in our experiment because previous research using the word-learning task showed that the striatum was responsive to feedback only when it was linked to performance, and not when it was arbitrarily related to responses during initial learning (Tricomi & Fiez, 2008). The study phase limits our ability to test the relationship between striatal activity and learning, however, since answers that are well-learned in the study phase are the most likely to remain correct during the test phase, irrespective of the BOLD response to feedback during the scanning phase. Our behavioral finding of greater learning from delayed feedback than from no feedback indicates either that the striatum supports learning from delayed feedback or that a different neural mechanism enables learning in the delayed condition to be similar to learning in the immediate condition. Future research will be necessary to firmly dissociate these two possibilities, although our results suggest that the caudate and lentiform nucleus may be candidate regions for supporting learning after a delay.

Conclusion

This study suggests that the neural mechanisms involved in feedback processing are affected by the temporal proximity of the feedback to an action. That distinct basal ganglia subregions are involved in immediate- and delayed-feedback processing suggests that feedback might be interpreted differently if it is presented after a delay rather than immediately.

This study replicates previous findings related to the caudate’s role in processing immediate feedback and suggests that this role is preserved even when feedback is received after a substantial delay. In addition, our results shed light on a potential role of other basal ganglia nuclei, such as the lentiform nucleus, in delayed-feedback processing. Taken together, our results underscore the importance of the basal ganglia in the performance of cognitive tasks and point to a functional heterogeneity within the basal ganglia in supporting learning under different time frames.