It is generally true that some of what a person encounters is important to remember, whereas other things are less important. One critical operation is to selectively remember important information, often at the expense of less important information. For instance, when studying for an exam, some students might maximize efficiency, focusing exclusively on the most important material. Other students might not be as selective; even though they know that some items are more important than others, they may still try to remember as much as possible, a strategy that often leads to poorer results. In the present work, we used functional magnetic resonance imaging (fMRI) to better understand what people do differently, on both cognitive and neural levels, when remembering items deemed important.

In order to address these questions, we used a variant of the value-directed remembering (VDR) paradigm (Castel, 2008; Castel, Benjamin, Craik, & Watkins, 2002). The VDR paradigm involves having participants study a list of words paired with point values, with the participants’ goal being to maximize the total score, which is the sum of the values associated with recalled words. A number of behavioral studies (e.g., Ariel & Castel, 2014; Castel et al., 2002; Castel, Farb, & Craik, 2007; Castel, Murayama, Friedman, McGillivray, & Link, 2013; Hanten et al., 2007; Loftus & Wickens, 1970; Soderstrom & McCabe, 2011; Watkins & Bloom, 1999) have shown that words that are arbitrarily determined to be valuable (via high point values) tend to be recalled better than words that are arbitrarily assigned lower values. However, prior studies with this paradigm have not fully explained the effect at a mechanistic level, with proposed explanations ranging from differential forms of rehearsal and use of imagery to strategic encoding and retrieval operations.

There is reason to believe that people make an explicit effort to prioritize encoding of high-value items in the VDR paradigm. Specifically, the degree to which people optimize their point score, as measured by the selectivity index (Castel et al., 2002), increases from earlier lists to later lists (Castel, 2008; Castel et al., 2011). The VDR paradigm is structured such that people learn multiple distinct word lists, with a free recall test after each list and immediate feedback on the number of points earned after each test. The improvement in selectivity across lists suggests that people may be learning about how many words they can remember and about which encoding strategies will lead to the highest point total. This pattern of results would be consistent with the use of explicit cognitive strategies to enhance the encoding of high-value items.

A number of functional neuroimaging studies have examined the brain mechanisms that might mediate the enhancement of memory for high-value items. Adcock, Thangavel, Whitfield-Gabrieli, Knutson, and Gabrieli (2006) were the first to do so in the context of an intentional encoding paradigm. They found that increased activity in regions of the dopaminergic reward system, specifically the ventral tegmental area (VTA) in the midbrain and nucleus accumbens (NAcc) in the ventral striatum, elicited in response to a value cue that preceded presentation of the actual stimulus, predicted successful encoding of high-value items. A similar pattern was observed in the hippocampus, and moreover the functional connectivity between VTA and hippocampus was strongest during cues preceding high-value items that were subsequently remembered. These findings suggest that input from the midbrain reward system might serve to prepare the hippocampus to better encode the important information that is about to be encountered, in this case a photograph of a landscape scene. Such connections between dopaminergic midbrain systems and the hippocampus had previously been shown to be important in rodents (Huang & Kandel, 1995; Jay, 2003; Lisman & Grace, 2005), but this was the first direct evidence for such a mechanism in humans. Although the study by Adcock et al. and subsequent work by others (e.g., Murty, LaBar, & Adcock, 2012; Wolosin, Zeithamova, & Preston, 2012) have contributed valuable insights into the neural mechanisms that can underlie reward-based learning, it is likely that people selectively engage additional, strategic mechanisms during encoding of high-value items, as a means of optimizing limited memory. We focus primarily on those mechanisms in the present article.

One difference between selective strategic enhancement of memory for valuable items and midbrain reward-motivated learning mechanisms is seen in the time courses of these effects. For example, Adcock et al. (2006) tested memory at a delay of 24 h, following evidence from rodent work (e.g., Frey, Matthies, Reymann, & Matthies, 1991; Frey, Schroeder, & Matthies, 1990; O’Carroll, Martin, Sandin, Frenguelli, & Morris, 2006) suggesting that the enhancement of encoding for valuable items via dopamine-driven increases in hippocampal plasticity is likely to emerge only after a delay. Although Adcock et al.’s study did not include an immediate memory test for comparison, Spaniol, Schain, and Bowen (2014) tested young and older adults on a very similar task and found that on an immediate test, value did not reliably enhance memory in either age group. With a test given 24 h after encoding, however, they replicated the finding of a significant enhancement of memory for valuable items. Similarly, Murayama and Kuhbandner (2011) found that after a 1-week delay, monetary rewards increased memory for trivia questions that were not inherently interesting, an effect believed to be dopamine-driven. No effect of reward on memory was observed on an immediate test, however, again suggesting that effects of the putative dopaminergic reward-motivated learning mechanism that Adcock et al. and others have examined only emerge after a delay. Reward-related activity in the VTA–hippocampal circuit thus appears to engage a consolidation process that makes memory for valuable items less vulnerable to forgetting after a delay, but this process does not seem to affect retrievability in the shorter term. However, under different circumstances, people can improve their memory for valuable items in a way that is apparent in tests administered immediately following learning (e.g., Castel, 2008; Castel et al., 2002). It thus seems likely that an additional mechanism that is capable of enhancing the encoding of valuable items is engaged by the VDR paradigm, and most likely by certain real-world situations as well.

As we noted above, there is reason to believe that participants in the VDR paradigm gradually learn to employ effective mnemonic strategies that allow them to strengthen their encoding of high-value items; this is apparent both from the pattern of recall data across lists and from postexperiment self-reports (e.g., Castel, 2008; Castel, McGillivray, & Friedman, 2012). In contrast, participants in most studies of reward-motivated learning (e.g., Adcock et al., 2006) are presented with a long list of stimuli, and memory is only tested after all encoding is complete, with no opportunity to modify their encoding strategy on the basis of feedback. Additionally, performance in the VDR paradigm is typically assessed via free recall, whereas memory in reward-based learning tasks is usually assessed by a yes/no recognition task. Thus, the VDR task is more likely to tap into strategic enhancement of encoding for high-value items than are the paradigms that are typically used to assess reward-based learning. Additional neural mechanisms may be engaged during encoding of high-value items in the VDR paradigm that may reflect real-life situations in which people are able to preferentially remember valuable information.

To our knowledge, no prior neuroimaging studies have examined the effects of value on neural mechanisms of strategy use during the encoding of items with different values. However, a number of studies have examined which brain areas are preferentially recruited when people engage in deep encoding of study materials versus shallower encoding. In one of the first such studies, Kapur et al. (1994) examined how tasks structured to promote different levels of processing (Craik & Lockhart, 1972; Craik & Tulving, 1975) differentially affected cerebral blood flow. They found that a task that engaged deep encoding by evoking semantic representation of words was associated with greater activity in the left inferior prefrontal cortex (PFC), relative to a task that required only surface-level encoding. Thompson-Schill, D’Esposito, Aguirre, and Farah (1997) provided a more precise account of left inferior PFC function, suggesting that this region has a role specifically in the selection of the most relevant semantic representation(s) for a given task, rather than in the retrieval of semantic knowledge more generally. Subsequent studies (e.g., Badre, Poldrack, Paré-Blagoev, Insler, & Wagner, 2005; Wagner, Paré-Blagoev, Clark, & Poldrack, 2001) have further clarified how left inferior PFC contributes to controlled semantic processing; see also the reviews by Bookheimer (2002), Costafreda et al. (2006), and Badre and Wagner (2007).

Other work has more directly implicated left PFC in the use of verbal or semantic strategies at encoding. When participants are instructed to use a semantic clustering strategy, they tend to show increased activity in areas including left dorsolateral and left ventrolateral PFC at encoding, relative to earlier blocks in which such a strategy is possible but has not been explicitly instructed (Miotto et al., 2006; Savage et al., 2001). Similarly, Kirchhoff and Buckner (2006) showed that individual differences in encoding-related activity in left inferior PFC are associated with the degree to which people report having used a verbal elaboration strategy during encoding. Use of these elaborative strategies was associated with better memory performance, suggesting that the often-observed association between left ventrolateral prefrontal activity at encoding and successful subsequent memory (e.g., Kim, 2011; Wagner et al., 1998) is mediated by increased use of semantic strategies at encoding. One possible neural mechanism underlying enhanced memory for high-value items in the VDR paradigm may be the differential engagement of regions associated with the use of semantic strategies at encoding. Such a finding would be particularly interesting, given that prior work has largely ignored the ways in which intentional strategic processing can mediate the effects of value on memory.

Method

Participants

A group of 22 young adults participated in the study. The data from two participants were excluded, one for being a nonnative English speaker, and a second for only completing three lists, due to discomfort in the scanner. The remaining 20 participants (mean age = 21.65 years, SD = 3.66, age range = 18–30; 11 female, nine male) were all right-handed, native English speakers who reported no current psychoactive medications or severe psychiatric or neurological disorders. All participants had either normal or corrected-to-normal vision. Written consent was obtained from each participant, and all procedures were approved by UCLA’s Medical Institutional Review Board. Participants were recruited via flyers posted on the UCLA campus and were paid $10/h, plus additional earnings from the Monetary Incentive Delay (MID) task (typically $10–$12), and also had the chance to win up to an additional $25 in a delay-discounting task (Kirby, Petry, & Bickel, 1999) that we ran after the scan. For one participant, we were unable to finish data collection on one run of the VDR task due to discomfort, but the remaining four VDR runs for that participant are included in our analyses.

Task stimuli and behavioral procedures

Our VDR task paradigm was based on that used by Castel et al. (2002), but was altered to make it more amenable to neuroimaging (see Fig. 1). Each trial of our task began with a cue for point value, either high (10, 11, or 12 points) or low (1, 2, or 3 points), presented as a number inside of a gold “coin” on the screen for 2 s. This was followed by a fixation cross of jittered duration (equal proportions of 3, 4.25, 5.5, and 6.75 s). Next, a word was presented for 3.5 s, followed by 1.5 s of fixation and then an active baseline task (Stark & Squire, 2001) of jittered duration (50 % 4 s, 25 % 6.5 s, and 25 % 8 s). The vowel–consonant baseline task involved the presentation of a pseudorandom series of letters, with approximately equal ratios of vowels and consonants. Each letter was presented for 1 s, with a 0.25-s fixation between letters, and a 1.5-s blank screen at the end of the trial. Participants were instructed to respond to each letter while it was still on the screen. Button mappings were fixed across participants, such that all individuals used their index finger if the letter was a consonant and their middle finger if the letter was a vowel. Letters in the vowel–consonant task were arranged such that they did not spell any words. We used a vowel–consonant task in order to continually engage verbal processing resources throughout the intertrial intervals, thereby reducing our participants’ ability to simultaneously engage in verbal rehearsal of the words during this time.

Fig. 1 Value-directed remembering task design. On each trial, participants were first presented with the value cue, then with a to-be-remembered word, and finally with two to six trials of an active baseline task (vowel–consonant judgment) to be performed during the intertrial interval (ITI)

Each list included 24 different words, of which, arbitrarily, 12 were defined as high-value and 12 as low-value (with four words at each specific value level). Five lists of the VDR task were presented in the scanner. Items were drawn from clusters 6 and 7 of the Toglia and Battig (1978) “Colorado” word norms. All stimuli were four- to eight-letter, one- or two-syllable nouns, rated as being highly familiar (range 5.5 to 7 on a 1–7 scale), moderate to high on concreteness and imagery (range 4 to 6.5 on a 1–7 scale), and moderate in pleasantness (range 2.5 to 5.5 on a 1–7 scale). Values were pseudorandomly assigned to words, with the assignments of particular words to value groups (high or low) counterbalanced across participants. The order in which the different lists were presented in the scanner was also counterbalanced. Each list began with 12.5 s of fixation and ended with an extra 15 s of the vowel–consonant task. Within about 10–20 s after the end of each scan, the recall test began, and the participant was given 90 s to recall as many words as possible from the preceding list. Immediately after recall was complete, the experimenter scored the test and gave feedback on the point score earned for that list.
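The list structure described above (24 words, four at each of six point values) can be sketched as follows. This is a minimal illustration rather than the stimulus code actually used: the function name `assign_values`, the seeded shuffle, and the placeholder words are all assumptions, and the counterbalancing of words across value groups between participants is not shown.

```python
import random

HIGH_VALUES = (10, 11, 12)
LOW_VALUES = (1, 2, 3)
WORDS_PER_VALUE = 4  # four words at each specific value level


def assign_values(words, seed):
    """Pair each of 24 words with a point value, four words per value.

    Hypothetical helper: the seeded shuffle stands in for whatever
    pseudorandomization scheme was actually used.
    """
    values = [v for v in LOW_VALUES + HIGH_VALUES for _ in range(WORDS_PER_VALUE)]
    if len(words) != len(values):
        raise ValueError("expected exactly 24 words per list")
    rng = random.Random(seed)
    rng.shuffle(values)
    return list(zip(words, values))


# Placeholder words; the real stimuli came from the Toglia and Battig norms.
example_list = assign_values([f"word{i:02d}" for i in range(24)], seed=1)
```

Fixing the seed makes a given assignment reproducible, which would be one simple way to keep a participant's lists stable across sessions.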

Prior to scanning, participants were given detailed instructions about the VDR task, and then completed six practice items, followed by two full practice lists. Each of the two full practice lists included recall tests with feedback. Prior work has shown that selectivity is typically stronger on the third and subsequent lists than on the first two lists (Ariel & Castel, 2014; Castel, 2008; McGillivray & Castel, 2011). Thus, we assumed that by presenting two full lists prior to scanning, strategy use would be relatively well established and consistent in the scanner.

After completion of the VDR task, participants remained in the scanner to perform one run of the MID task (Knutson, Adams, Fong, & Hommer, 2001), which was intended to serve as a functional localizer task for the VTA and NAcc. This task included a total of 48 trials, divided equally between high reward (+$1.00), low reward (+$0.10), and no reward (+$0.00). Loss/punishment trials were not included, since these were not relevant for our purposes. In addition, our version of the task includes feedback in word form, unlike the symbolic cues used in the classic MID paradigm, but consistent with the version used by Samanez-Larkin et al. (2007). This version was intended to be more amenable for use with older adult participants, whom we expect to test in follow-up studies. Although the number of trials of each type may appear low, a recent study by Wu, Samanez-Larkin, Katovich, and Knutson (2014) used a similar number of trials of each type and reported robust and consistent changes in BOLD signal as a function of value.

Each trial began with a text cue indicating the potential value of that trial (e.g., “Win $1.00”). To earn this reward, the participant was required to make a buttonpress during the brief window of time that a square stimulus appeared on the screen. As in prior studies with this paradigm, we used an adaptive algorithm, which adjusted the response period to keep the overall win percentage at approximately 66 %. The initial response period for the practice run outside of the scanner was 300 ms, and the initial response period in the scanner was determined on the basis of the average response time (RT) for successful responses during practice. If the participant’s win percentage exceeded 66 %, the response period would tend to be made shorter (i.e., more difficult) on the next trial, down to a minimum possible response period of 140 ms. If the participant’s win percentage was less than 66 %, the response period would tend to be made longer (i.e., easier) on the next trial. Overall, the mean accuracy results across the 18 participants for whom we have behavioral data were 60.4 % (SD = 8.3 %) for $0.00 trials, 60.4 % (SD = 7.1 %) for $0.10 trials, and 60.1 % (SD = 6.5 %) for $1.00 trials. The mean RTs for correct trials were 195.8 ms (SD = 27.3 ms) for $0.00 trials, 178.6 ms (SD = 29.3 ms) for $0.10 trials, and 169.9 ms (SD = 26.9 ms) for $1.00 trials.
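The adaptive adjustment of the response period can be sketched roughly as below. This is a simplified, deterministic illustration under stated assumptions: the 66 % target and the 140-ms value come from the text (treated here as a floor on the window), whereas the 20-ms step size and the function name `next_window` are hypothetical, and the actual rule may have been probabilistic, since the window only "would tend to" change.

```python
TARGET_WIN_RATE = 0.66   # target reported in the text
MIN_WINDOW_MS = 140      # minimum response period reported in the text
STEP_MS = 20             # assumed step size; not reported in the text


def next_window(window_ms, wins, trials_completed):
    """Return the response window (ms) to use on the next trial.

    The window shrinks when the running win rate is above target and
    grows when it is below, never dropping under MIN_WINDOW_MS.
    """
    if trials_completed == 0:
        return window_ms
    win_rate = wins / trials_completed
    if win_rate > TARGET_WIN_RATE:
        return max(MIN_WINDOW_MS, window_ms - STEP_MS)
    if win_rate < TARGET_WIN_RATE:
        return window_ms + STEP_MS
    return window_ms
```

For example, `next_window(300, 3, 3)` returns 280 (win rate above target), whereas `next_window(200, 1, 3)` returns 220 (win rate below target).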

The experimental session also included several supplementary behavioral measures collected before and after scanning. Prior to scanning, we ran reading span and counting span tests (Kane et al., 2004) to measure working memory capacity. We used a partial-credit load-weighted scoring procedure, such that each unit that was correctly recalled, in the correct serial position, was scored as 1 point (Conway et al., 2005). Following guidance by Conway et al., we generated a composite measure of working memory from scores on these two tests. Because we did not have enough data to do a true latent-variable analysis, we computed z scores for each measure and averaged the z scores to yield a composite measure of working memory.
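The composite working memory measure described above amounts to standardizing each span score across participants and averaging the two z scores. The sketch below assumes the sample standard deviation (the text does not say which was used) and uses made-up scores; the function names are hypothetical.

```python
from statistics import mean, stdev


def z_scores(scores):
    """Standardize scores across participants (sample SD; an assumption)."""
    m, sd = mean(scores), stdev(scores)
    return [(s - m) / sd for s in scores]


def working_memory_composite(reading_span, counting_span):
    """Average the reading span and counting span z scores per participant."""
    pairs = zip(z_scores(reading_span), z_scores(counting_span))
    return [(r + c) / 2 for r, c in pairs]


# Made-up scores for four hypothetical participants, for illustration only.
composite = working_memory_composite([30, 42, 36, 48], [50, 58, 54, 62])
```

Because each measure is standardized before averaging, the two tests contribute equally to the composite regardless of their raw score ranges.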

At the end of the session, we administered a debriefing questionnaire that included questions about what strategies participants had used to encode the words, what (if anything) they had done differently during encoding of the high-value words, and questions about what (if anything) they were rehearsing during the fixation and vowel–consonant periods. Self-reported encoding strategies were categorized as either relying on semantic aspects of the words or relying more on surface features of the words. We also classified each participant into one of three categories: only attempting to encode high-value items (ignoring low-value items), trying harder on high-value items, or ignoring value entirely.

Scanning procedure

T2*-weighted echoplanar (EPI) images sensitive to blood oxygenation level dependent (BOLD) contrast were collected using a 3-T Siemens Tim Trio MRI scanner at the UCLA Staglin IMHRO Center for Cognitive Neuroscience. For the VDR task, each 179-volume functional run lasted approximately 7.5 min; five such runs were acquired for each participant. Each functional volume consisted of 45 interleaved axial slices, TR = 2,500 ms, TE = 25 ms, flip angle = 75°, slice thickness = 3.0 mm, in-plane resolution = 3.0 × 3.0 mm, matrix = 64 × 64, FOV = 192 mm, and no gap between slices. For the MID task, similar scan parameters were used, except that the TR was shortened to 2 s, only 36 slices were acquired per volume, and only one 246-volume run was collected. In addition, we collected matched-bandwidth T2-weighted coplanar structural scans to use as an intermediate step in spatial registration. We also collected a high-resolution structural scan (MPRAGE), using the following parameters: TR = 1,900 ms, TE = 3.26 ms, flip angle = 9°, 176 slices, 1-mm³ voxels, 18.2 % slice oversampling, FOV = 250 mm, with GRAPPA acceleration. To minimize head movement during scanning, we placed extra cushions between the participant’s head and the coil. Stimuli were presented using E-Prime 2.0 software (Psychology Software Tools, Pittsburgh, PA), and images were shown via either a custom-built MR-compatible rear projection system, or via MR-compatible goggles (Resonance Technology, Inc.).
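As a consistency check, the trial timings given in the task description can be combined with the TR to recover the reported run length. The sketch below assumes that the stated jitter proportions hold exactly within each 24-trial list; the variable names are illustrative only.

```python
CUE_S = 2.0
CUE_WORD_FIXATION_S = (3.0, 4.25, 5.5, 6.75)          # equal proportions
WORD_S = 3.5
POST_WORD_FIXATION_S = 1.5
BASELINE_S = ((0.50, 4.0), (0.25, 6.5), (0.25, 8.0))  # (proportion, duration)
TRIALS_PER_LIST = 24
LEAD_IN_FIXATION_S = 12.5
FINAL_BASELINE_S = 15.0
TR_S = 2.5

# Mean durations of the two jittered periods.
mean_jitter = sum(CUE_WORD_FIXATION_S) / len(CUE_WORD_FIXATION_S)
mean_baseline = sum(p * d for p, d in BASELINE_S)
mean_trial = CUE_S + mean_jitter + WORD_S + POST_WORD_FIXATION_S + mean_baseline

run_s = LEAD_IN_FIXATION_S + TRIALS_PER_LIST * mean_trial + FINAL_BASELINE_S
volumes = run_s / TR_S

print(f"mean trial = {mean_trial} s, run = {run_s} s, volumes = {volumes}")
# prints "mean trial = 17.5 s, run = 447.5 s, volumes = 179.0"
```

Under these assumptions the run lasts 447.5 s (about 7.5 min) and spans exactly 179 volumes, matching the values reported for the VDR runs.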

fMRI data analysis

Preprocessing

Analyses of the EPI data were carried out using FEAT v5.98 (fMRI Expert Analysis Tool), as implemented in FSL v4.1.9 (www.fmrib.ox.ac.uk/fsl). We corrected for head motion using MCFLIRT (FMRIB’s motion correction linear image registration tool; Jenkinson, Bannister, Brady, & Smith, 2002), and also used the fsl_motion_outliers script to detect and censor any volumes with excessive head motion. We then removed non-brain tissue using BET (Brain Extraction Tool; Smith, 2002). Grand-mean intensity normalization was applied to the 4-D data set from each run on the basis of a multiplicative scaling factor. We applied a Gaussian kernel of 5 mm full width at half maximum for spatial smoothing, and for temporal filtering, a high-pass filter to remove low-frequency noise using a Gaussian-weighted least-squares straight-line fitting with a sigma of 100 s. Temporal autocorrelation was corrected for using prewhitening as implemented by FILM (FMRIB’s Improved Linear Model; Woolrich, Ripley, Brady, & Smith, 2001). Functional images were registered to a coplanar structural scan and then to a high-resolution structural scan using FLIRT (FMRIB’s Linear Image Registration Tool) linear registration. Registration from the high-resolution structural scan to standard Montreal Neurological Institute (MNI) space was further refined using FNIRT (FMRIB’s Non-linear Image Registration Tool).

Analysis of value-directed remembering task

We included four different event types in the statistical model: high-value cue period, high-value word-encoding period, low-value cue period, and low-value word-encoding period. The cue period was defined on the basis of the time period in which each value cue was on screen, was 2 s in duration, and was convolved with a double-gamma hemodynamic response function (HRF). The word-encoding period was defined as a separate event, on the basis of the time period in which the to-be-learned word was on screen, was 3.5 s in duration, and was also convolved with a double-gamma HRF. Temporal derivatives were included in the model for all four event types. Motion regressors generated by MCFLIRT and regressors coding for any motion outlier TRs, as defined by the fsl_motion_outliers script included with FSL, were also included in the model as covariates of no interest.
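To illustrate how an event regressor of this kind is built, the sketch below convolves a boxcar (1 while the stimulus is on screen, 0 otherwise) with a double-gamma HRF. It is a pure-Python illustration, not FSL's implementation: the HRF shape parameters are common textbook defaults assumed for the example and may differ from those FSL uses, the onsets are hypothetical, and temporal derivatives and downsampling to the TR are omitted.

```python
import math


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def double_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, ratio=1 / 6):
    """Positive response minus a scaled, later undershoot.

    Parameter values are assumed defaults, not taken from the text.
    """
    return gamma_pdf(t, peak_shape) - ratio * gamma_pdf(t, undershoot_shape)


def convolved_regressor(onsets_s, duration_s, run_length_s, dt=0.5, hrf_length_s=32.0):
    """Convolve an event boxcar with the HRF on a grid of step dt seconds."""
    n = int(round(run_length_s / dt))
    boxcar = [0.0] * n
    for onset in onsets_s:
        start = int(round(onset / dt))
        stop = int(round((onset + duration_s) / dt))
        for i in range(start, min(stop, n)):
            boxcar[i] = 1.0
    hrf = [double_gamma_hrf(k * dt) for k in range(int(hrf_length_s / dt))]
    regressor = [0.0] * n
    for i, x in enumerate(boxcar):
        if x:
            for k, h in enumerate(hrf):
                if i + k < n:
                    regressor[i + k] += x * h * dt
    return regressor


# Hypothetical onsets: two 2-s cue events within a 60-s stretch of a run.
cue_regressor = convolved_regressor([10.0, 35.0], duration_s=2.0, run_length_s=60.0)
```

The same function would serve for the word-encoding events by passing `duration_s=3.5`; keeping the cue and word events as separate regressors is what allows their effects to be estimated separately despite their proximity in time.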

A first-level general linear model (GLM) analysis was carried out separately for each run. Then, in a second-level fixed-effects analysis, we combined the parameter estimates across all five runs of the VDR task and created a set of linear contrasts. Our primary contrasts of interest compared the BOLD signal during high-value and low-value items, looking separately at the cue period data and the word-encoding period data. For whole-brain analyses across participants, we used the FLAME Stage 1 and 2 mixed-effect model in FSL, with automatic outlier detection. Clusters were determined using a voxel-level threshold of z > 2.3, with a cluster-corrected significance level of p < .05 (see Footnote 1). Cortical surface renderings were created using Caret v5.65 (http://brainvis.wustl.edu; Van Essen et al., 2001) on the inflated Conte69 atlas in FNIRT space (Van Essen, Glasser, Dierker, Harwell, & Coalson, 2012), with FSL activation maps that were transformed from volume to surface space using Caret’s interpolated voxel algorithm. Activation peaks noted in the tables were a subset of the local maxima generated for each contrast by FSL’s “cluster” command, with a minimum distance of 10 mm between peaks. Labels were determined using the Harvard–Oxford structural atlas and other relevant brain maps (e.g., Brodmann, 1909; Talairach & Tournoux, 1988), and redundant peaks were eliminated.

We computed each participant’s selectivity index for each list using the formula [(actual score – chance score) / (ideal score – chance score)], as has been described in prior literature (Castel et al., 2002; Watkins & Bloom, 1999). We then averaged the selectivity indices across all scanned lists to yield a single score. To search the whole brain for correlations between behavioral measures (e.g., selectivity index) and changes in BOLD signal, we included the behavioral measure as an EV in an FSL group-level model, in addition to the group mean. For region-of-interest (ROI) analyses, we computed Pearson correlation coefficients across participants using each individual’s mean selectivity index and the mean parameter estimates for a given contrast in a given ROI for each participant. We applied a Bonferroni–Holm correction (Holm, 1979) to correct for multiple comparisons across each set of related ROIs; unless otherwise indicated, all effects survived this correction for the particular cohort of ROIs tested.
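The selectivity index formula can be made concrete with a short sketch. The definitions of the ideal and chance scores below are assumptions based on the usual description in the VDR literature and are not spelled out in this section: with n words recalled, the ideal score is taken as the sum of the n highest values on the list, and the chance score as n times the mean value on the list. The recall outcome shown is hypothetical.

```python
def selectivity_index(recalled_values, list_values):
    """Compute (actual - chance) / (ideal - chance) for one list.

    Assumed definitions: ideal = sum of the n highest list values,
    chance = n * mean list value, where n is the number of words recalled.
    """
    n = len(recalled_values)
    if n == 0:
        raise ValueError("selectivity is undefined when nothing is recalled")
    actual = sum(recalled_values)
    ideal = sum(sorted(list_values, reverse=True)[:n])
    chance = n * sum(list_values) / len(list_values)
    if ideal == chance:
        raise ValueError("selectivity is undefined when ideal equals chance")
    return (actual - chance) / (ideal - chance)


# List structure from the text: four words at each of six point values.
LIST_VALUES = [v for v in (1, 2, 3, 10, 11, 12) for _ in range(4)]

# Hypothetical recall outcome: two 12-point words, one 11, and one 10.
si = selectivity_index([12, 12, 11, 10], LIST_VALUES)
```

In this example the actual score is 45, the ideal score is 48, and the chance score is 26, so the index is 19/22, or about .86. Under these definitions the index is 1 when only the highest-value words are recalled and negative when recall favors low-value words.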

Analysis of monetary incentive delay task

The analysis workflow applied to the MID task data was generally similar to that described for the VDR task. We modeled the cue period and the feedback period as separate event types, each convolved with a double-gamma HRF. The cue period was defined as an event of 2-s duration during which the value cue was onscreen. The feedback period was defined as an event of 1.92-s duration during which feedback (i.e., whether or not the participant had “won” on a given trial) was on screen. High-value, low-value, and no-reward trials were defined as separate event types. Our primary analysis of interest compared activity during the cue period on high-value trials with activity during the cue period on no-reward trials. Group-level analyses followed the same procedure described for the VDR task.

Results

Behavioral data

We first examined the behavioral data to confirm that high-value words were consistently recalled better than low-value words (Fig. 2). Using paired-samples t tests (two-tailed), we found that high-value words were remembered better than low-value words even on the first practice list, t(19) = 4.13, p = .001, and the second practice list, t(19) = 7.02, p < .001. A paired-samples t test confirmed that the mean number of items recalled across all scanned lists was significantly greater for high-value words, t(19) = 9.58, p < .001. A 2 × 5 (Value Group × List) repeated measures analysis of variance (ANOVA) on the proportions of items recalled (performed on the data from the 19 participants who completed all five lists) additionally showed an interaction between list and value group, F(4, 72) = 3.15, MSE = 1.96, p = .019, ηp² = .149, but no main effect of list, F(4, 72) = 1.79, ηp² = .090. The significant interaction suggests that point values had a reliably stronger effect on recall on later lists. Separate paired-samples t tests for each list confirmed that a highly reliable effect of value still emerged on all five scanned lists, with all ts > 6.00 and all ps < .001.

Fig. 2 Mean numbers of high-value and low-value items recalled on each list (including on the two practice lists shown prior to scanning). Significantly more high-value items were recalled than low-value items on all lists. Error bars represent ±1 SE

In addition, we examined the data for the individual value levels, in part to confirm that the binarization into high versus low value that we use throughout the remainder of this article was justified by the data (Table 1). For low-value items, a one-way within-subjects ANOVA did not reveal a difference in the number of items recalled across the three low-value conditions, F(2, 38) < 1, ηp² = .004. Within high-value conditions, a one-way repeated measures ANOVA showed a trend toward an effect of point value, F(2, 38) = 2.97, MSE = 1.16, p = .063, ηp² = .135. Significantly more 12-point items (M = 9.17) were recalled than 11-point items (M = 8.50), t(19) = 2.83, p = .028, but the difference between 11- and 10-point items (M = 8.42) was not significant, t(19) < 1.

Table 1 Mean (with SE) number of items recalled across the five scanned lists, by specific point values

Main effects of value

Cue period

We first examined how brain activity differed during high-value trials as compared to low-value trials across individuals, during the cue period. A whole-brain analysis revealed several frontoparietal regions that showed greater BOLD signal in response to high- than to low-value cues (Fig. 3a; Supplemental Table 1). In addition, as predicted, we observed significant effects in mesolimbic reward structures, including clusters in left nucleus accumbens (NAcc; peak voxel MNI coordinates: –6, 8, –4), and right NAcc (peak voxel: 8, 10, –6). The whole-brain analysis also revealed a cluster in right pregenual cingulate cortex (peak voxel: 4, 44, 24), an area that was associated with reward processing in a recent meta-analysis (Liu, Hairston, Schrier, & Fan, 2011). This cluster is immediately dorsal to the ventromedial prefrontal cortex, which is widely considered to be important in reward processing (e.g., O’Doherty, 2013).

Fig. 3 Group activation contrast showing main effects of value on the BOLD signal (a) during the cue period and (b) during the word-encoding period. Warm colors indicate regions showing greater activity on high-value trials, and cool colors indicate regions showing greater activity on low-value trials. Note that scales were chosen separately for each contrast, and for positive and negative activations within each contrast, in order to maximize dynamic range, but the actual thresholds were constant across this and all other figures

In addition, we conducted ROI analyses to probe for differential levels of activity in those specific reward-sensitive regions for which we had a priori hypotheses—specifically, VTA/midbrain and NAcc/ventral striatum. Our primary method of localizing these reward-sensitive regions was to locate the peak coordinates in midbrain and ventral striatum from a group-level analysis of the MID task. To localize the reward-sensitive midbrain, we placed a sphere of radius 4 mm around the peak midbrain coordinate obtained from the MID functional localizer task (left hemisphere: –6, –24, –6; right hemisphere: 6, –26, –6). We also functionally defined a NAcc/ventral striatal reward-sensitive region by placing a sphere of radius 4 mm around the peak coordinates in the vicinity of NAcc from the MID task (left hemisphere: –8, 12, –2; right hemisphere: 8, 10, –2). Because effects of value were reliably correlated across corresponding regions in the two hemispheres, and in order to increase statistical power, we combined the left- and right-hemisphere spheres to create bilateral functionally defined ROIs for the NAcc and the midbrain. We found a significant effect of value in the bilateral NAcc/ventral striatum, t(19) = 3.73, p = .001. We also found greater activity during high-value cues in the reward-sensitive midbrain, t(19) = 2.48, p = .022.
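The construction of the spherical ROIs can be illustrated with a short sketch that lists the voxel centers lying within 4 mm of a peak coordinate and then forms a bilateral ROI as the union of the two spheres. The 2-mm isotropic grid aligned with the peak is an assumption (the text does not state the resolution of the standard-space images), and the function name `sphere_voxels` is hypothetical.

```python
def sphere_voxels(center_mm, radius_mm=4.0, voxel_mm=2.0):
    """List voxel-center coordinates (mm) within radius_mm of center_mm.

    Assumes an isotropic grid aligned with the peak coordinate; the 2-mm
    voxel size is an assumption, not a value given in the text.
    """
    reach = int(radius_mm // voxel_mm)
    cx, cy, cz = center_mm
    voxels = []
    for i in range(-reach, reach + 1):
        for j in range(-reach, reach + 1):
            for k in range(-reach, reach + 1):
                dx, dy, dz = i * voxel_mm, j * voxel_mm, k * voxel_mm
                if dx * dx + dy * dy + dz * dz <= radius_mm ** 2:
                    voxels.append((cx + dx, cy + dy, cz + dz))
    return voxels


# Peak coordinates reported in the text for the NAcc/ventral striatum.
left_nacc = sphere_voxels((-8, 12, -2))
right_nacc = sphere_voxels((8, 10, -2))

# Bilateral ROI: union of the two spheres (they do not overlap here).
bilateral_nacc = set(left_nacc) | set(right_nacc)
```

On this assumed grid each sphere contains 33 voxels, so the bilateral ROI contains 66; a finer or coarser grid would change these counts but not the logic.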

Because our functionally defined midbrain ROI was somewhat lateral, posterior, and superior to the typical anatomical definition of the VTA, possibly due to imperfect registration of the midbrain BOLD signal to the anatomical template brain (e.g., Limbrick-Oldfield et al., 2012), we elected to also interrogate our data using an alternative VTA ROI, defined on the basis of a probabilistic anatomical MRI atlas (Murty & Adcock, 2014; Shermohammed et al., 2012); we included all voxels that had nonzero probability values, resulting in a cluster of 698 voxels. Note that unlike our functionally defined ROIs, which we defined separately in each hemisphere, this VTA ROI consists of a single midline region. Within the anatomically defined VTA ROI, activity tended to be greater during high-value cues than during low-value cues; this difference approached, but did not reach, significance, t(19) = 1.84, p = .08. Thus, it seems that more activity generally occurred in reward-sensitive brain regions during high-value cues, relative to low-value cues.
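The Bonferroni–Holm procedure mentioned in the Method section can be sketched as a step-down adjustment of sorted p values. The example below is an illustration only: it treats the three cue-period ROI tests reported above as a single family, which is an assumption about how the ROIs were grouped, and the function name `holm_adjust` is hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p values, in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[i] = running_max
    return adjusted


# Illustration only: NAcc/ventral striatum, functional midbrain, anatomical VTA,
# grouped as one family (an assumed grouping).
cue_period_p = [0.001, 0.022, 0.08]
adjusted = holm_adjust(cue_period_p)
significant = [p < 0.05 for p in adjusted]
```

Under this assumed grouping, the adjusted values are approximately .003, .044, and .08, so the first two effects would remain significant at the .05 level and the third would not.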

Word-encoding period

We also examined differences in brain activity as a function of value during the word-encoding period (Fig. 3b; Supplemental Table 2). A whole-brain analysis revealed greater BOLD signal during high-value encoding in a large cluster that included almost the entirety of the left inferior frontal gyrus (IFG), including both the pars triangularis (peak voxel: –44, 32, 6) and the pars opercularis (peak voxel: –42, 8, 18). The whole-brain analysis also showed greater activity during high-value encoding in the left superior temporal gyrus and throughout the posterior portion of the left lateral temporal cortex (peak voxel: –46, –52, –12). Similar patterns of brain activity were apparent in homologous right-hemisphere regions, but these effects were weaker and less extensive than their left-hemisphere counterparts. In addition, during the encoding of high-value words, less activity was apparent in bilateral posterior cingulate cortex and right angular gyrus, suggesting greater deactivation of the default mode network during encoding of these items, relative to low-value words.

We also observed increased activity in dopaminergic striatal and midbrain regions during the word-encoding period for high-value words. Whole-brain analysis revealed clusters centered in the caudate/putamen bilaterally (left peak voxel: –16, 10, 10; right peak voxel: 22, 6, –8). In addition, we examined how value affected activity in NAcc/ventral striatal and midbrain reward-sensitive regions during word encoding using the same ROIs described above. We found significantly greater activity during high-value encoding in bilateral NAcc/ventral striatum, t(19) = 4.23, p < .001. We also found a significant effect of value in our reward-sensitive midbrain ROI, t(19) = 3.02, p = .007. Finally, we found a significant effect of value in our anatomically defined VTA ROI, t(19) = 2.26, p = .036. Overall, we can conclude that these reward-sensitive brain regions were generally more active on high-value items, during the word-encoding period as well as during the cue period.

Correlation with selectivity index

Our primary question of interest concerned how value contributes to subsequent recall. Because many of the participants remembered few low-value words or forgot few high-value words, it was not possible to construct a viable contrast representing the interaction between value and recall. We instead used an individual differences approach to examine the relationship between item value and memory success. Specifically, we correlated each individual’s mean selectivity index with effects of value in the brain (i.e., the difference between activity on high- and low-value trials in each voxel). The selectivity index reflects how close participants were to achieving an optimal point score, independent of the actual number of items recalled. We can thus infer that participants who were more selective in the words that they remembered on the recall test were engaging more strongly the processes that yield relatively better memory for high-value items in this task.
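
For readers unfamiliar with the measure, the sketch below shows one way to compute a selectivity index for a single list. It assumes the formulation usually attributed to Castel et al. (2002), in which the obtained score is compared with the ideal and chance scores for the number of words actually recalled; that formula is not spelled out in this section, and the list values used here are hypothetical.

```python
def selectivity_index(list_values, recalled_values):
    """Selectivity index for one list: (actual - chance) / (ideal - chance).

    Assumed definitions (not given explicitly in the text):
      actual = sum of the point values of the recalled words
      ideal  = sum of the n highest values on the list, n = number recalled
      chance = n times the mean value on the list
    Returns None when the index is undefined (nothing recalled, or ideal == chance).
    """
    n = len(recalled_values)
    if n == 0:
        return None
    actual = sum(recalled_values)
    ideal = sum(sorted(list_values, reverse=True)[:n])
    chance = n * sum(list_values) / len(list_values)
    if ideal == chance:
        return None
    return (actual - chance) / (ideal - chance)

# Hypothetical 12-word list with point values 1 through 12.
values = list(range(1, 13))

si_perfect = selectivity_index(values, [12, 11, 10])  # recalled the three best words
si_partial = selectivity_index(values, [12, 11, 3])   # one low-value word recalled
```

Under these assumptions, recalling only the highest-value words gives an index of 1 regardless of how many words are recalled, which is the sense in which the measure is independent of recall quantity.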

We first looked for regions in which the effect of value on the BOLD signal during the cue period correlated with selectivity index. A whole-brain analysis yielded no significant correlations with selectivity index during the cue period. When using a whole-brain analysis to examine brain activity during the word-encoding period, however, a number of significant clusters emerged (Fig. 4, Table 2). Most notably, we found a correlation between selectivity index and value-related activity in a cluster that included the anterior portion of the left IFG and ventral portions of the left middle frontal gyrus (peak voxel: –46, 20, –6), as well as in a second cluster that included the left posterior IFG (peak voxel: –38, 6, 28). Another notable cluster was apparent in the posterior portion of the middle and inferior temporal gyri (peak voxel: –52, –64, –2).

Fig. 4

Map depicting regions demonstrating a significant positive correlation between selectivity index and the effects of value on BOLD signal during the word-encoding period. No regions demonstrated a significant negative correlation between these variables

Table 2 Activation peaks for regions showing significant correlations between selectivity index and value effects (high > low contrast) during the word-encoding period

We also examined how selectivity index correlated with value-related changes in activity in the mesolimbic dopamine system. During the cue period, none of the three reward-sensitive ROIs described above showed significant correlations with selectivity index (NAcc, r = –.01; functionally defined midbrain, r = –.27; anatomical VTA, r = –.11). During the word-encoding period, however, we found a positive correlation between selectivity index and value-related activity in NAcc/ventral striatum (r = .495, p = .026). After applying a Bonferroni–Holm correction, the corrected p value for this correlation was .052, narrowly missing our cutoff for significance; we nonetheless believe that this trend is noteworthy. We did not find a correlation in our functionally defined midbrain ROI (r = .12) during the word-encoding period, but we did find a positive correlation between selectivity index and value-related activity in the anatomically defined VTA ROI (r = .534, p = .015), and this correlation did survive a Bonferroni–Holm correction. Thus, although it seems clear that the effects of value on activity in dopaminergic reward regions during the cue period do not correlate positively with memory selectivity, the activation of dopaminergic reward regions during word encoding may make some contribution to greater selectivity.
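
The Bonferroni–Holm step-down procedure behind these corrected values can be sketched as follows. The function name `holm_adjust` is ours, and we assume the correction was applied across the three ROIs. The p value for the functionally defined midbrain ROI is not reported in the text, so the value below is a placeholder; it affects neither of the other two adjusted values as long as it is the largest of the three.

```python
def holm_adjust(pvals):
    """Return Holm-adjusted p values in the same order as the input."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices from smallest to largest p
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multiply the k-th smallest p value by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * pvals[idx])
        # Enforce monotonicity: an adjusted p can never be smaller than an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Word-encoding period correlations with selectivity index.
rois = ["NAcc", "midbrain", "VTA"]
p_uncorrected = [0.026, 0.61, 0.015]  # midbrain value is a placeholder, not a reported result
p_holm = dict(zip(rois, holm_adjust(p_uncorrected)))
```

With these inputs the adjusted values are .045 for the anatomical VTA (3 × .015) and .052 for the NAcc (2 × .026), matching the pattern described above.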

To provide additional evidence for inferences about the use of cognitive strategies at encoding, we also examined value effects in three different a priori regions from a prior fMRI study of strategy use during encoding (Kirchhoff & Buckner, 2006). The three relevant peaks were in left anterior IFG (BA 45/47), left posterior IFG (BA 44/6), and extrastriate cortex (BA 19/37). Kirchhoff and Buckner found that activity in both IFG clusters correlated positively with the use of verbal elaboration strategies during encoding. Activity in the extrastriate cortex correlated instead with the use of a visual inspection strategy, which would likely not be useful for our verbal materials. Thus, if participants were using elaborative verbal encoding strategies to selectively remember the high-value words in our study, we would expect to find correlations between selectivity index and the effects of value on the BOLD signal in the two left IFG ROIs, but not in the extrastriate ROI.

To test this hypothesis, we converted the activation peaks reported by Kirchhoff and Buckner (2006) from Talairach to MNI space (Lancaster et al., 2007) and drew a sphere with an 8-mm radius around each of those peaks. During the cue period, no significant main effects of value emerged in any of the three ROIs, all ts < 1.76, and value effects did not correlate with selectivity index in any of the three ROIs, all rs < .13. During the word-encoding period, we found a main effect of value in both the anterior left IFG ROI, t(19) = 5.65, p < .001, and the posterior left IFG ROI, t(19) = 3.96, p = .001, but not in the extrastriate ROI, t(19) = 1.33. In addition, during word encoding, selectivity index correlated significantly with value effects in the anterior left IFG ROI, r = .56, p = .010, and with value effects in the posterior left IFG ROI, r = .61, p = .005, but not with value effects in the extrastriate ROI, r = –.05 (Fig. 5). Thus, our results are consistent with the idea that participants who exhibit more memory selectivity may be preferentially engaging prefrontally mediated verbal elaboration strategies during the encoding of high- versus low-value words.
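
The correlations above can be related to t statistics through the standard conversion for a Pearson correlation, which has n − 2 degrees of freedom. The sketch below assumes Pearson correlations and a sample of 20 participants (inferred from the t(19) values reported for the one-sample tests); the function name `r_to_t` is ours.

```python
import math

def r_to_t(r, n):
    """Convert a Pearson correlation to its t statistic and degrees of freedom."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r ** 2)
    return t, df

# Anterior left IFG: r = .56 with an assumed n of 20.
t_anterior, df_anterior = r_to_t(0.56, 20)

# Extrastriate ROI: r = -.05 gives a t statistic close to zero.
t_extrastriate, _ = r_to_t(-0.05, 20)
```

Because the reported correlations are rounded to two decimals, t statistics recovered this way are approximate.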

Fig. 5

Correlations between selectivity index and differences in BOLD signal parameter estimates for High minus Low Value items (in arbitrary units) in three regions of interest from Kirchhoff and Buckner (2006): a anterior left IFG (BA 45/47), b posterior left IFG (BA 44/6), and c extrastriate visual cortex (BA 19/37). Regions a and b, which have been associated with verbal strategy use, show significant correlations between selectivity index and value effects on the BOLD signal. Region c, a visual association area that has been associated with nonverbal encoding strategies, does not show a significant correlation

Individual differences in self-reported strategies, selectivity, and working memory

To further enhance our understanding of how people tend to strengthen the encoding of high-value items, we examined responses to the poststudy questionnaires. We first examined and categorized self-reported strategy use at encoding. All participants reported using some type of verbal strategy to try to remember the words. Of these, 14 participants described strategies that would seem to rely on the meaning of the words (e.g., generating stories or images that combined multiple words). The remaining six participants described strategies that did not rely on meaning (e.g., rote rehearsal or alphabetizing). Selectivity index did not reliably vary between the groups using these two different strategy types, t(18) < 1. Individuals who used meaning-based strategies did recall more high-value words (M = 9.21, SD = 1.57) than did those who used other verbal strategies (M = 7.48, SD = 1.95), t(18) = 2.10, p = .050, but the strategy groups did not differ on the numbers of low-value words recalled, t(18) < 1. In addition, individuals who used meaning-based strategies tended to have higher working memory (WM) composite span scores (M = .26, SD = .70) than did those who used nonsemantic verbal strategies (M = –.61, SD = .85), t(18) = 2.40, p = .027.

Another result that speaks to strategy use is based on whether individuals reported limiting rehearsal exclusively (or nearly so) to high-value items. These reports largely came from people’s descriptions of what they were doing during the fixation and vowel–consonant periods. We assume that the distinction between those who exclusively rehearsed high-value words and those who merely preferred rehearsing high-value words during these periods of “down time” reflected similarly divergent strategy use during the word-encoding period. Twelve participants reported largely or entirely ignoring the low-value items, whereas seven participants reported trying harder on high-value items, but did not appear to ignore low-value items. Finally, one participant reported ignoring value completely. An independent-samples t test comparing the first two groups (excluding the one person who reported being indifferent to value) showed that selectivity index was significantly higher for individuals who reported that they ignored low-value items (M = .74, SD = .19) than for those who just tried to focus more on high-value items (M = .47, SD = .22), t(17) = 2.80, p = .012. Perhaps unsurprisingly, individuals who reported ignoring low-value items recalled significantly fewer of these items per list (M = 1.87, SD = 1.85), as compared to those who did not report ignoring low-value items (M = 4.34, SD = 2.96), t(17) = 2.25, p = .038. The two groups did not differ reliably on numbers of high-value items recalled, however, t(17) = 0.91.
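
The group comparisons above can be approximately reconstructed from the reported summary statistics. The sketch below assumes a pooled-variance (Student) independent-samples t test, which is consistent with the reported 17 degrees of freedom; the function name `pooled_t` is ours, and because the means and standard deviations are rounded, the recovered t values differ slightly from those reported.

```python
import math

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student's independent-samples t from group means, SDs, and sizes."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (m1 - m2) / se, df

# Selectivity index: ignored low-value items (n = 12) vs. focused more on high-value items (n = 7).
t_selectivity, df_selectivity = pooled_t(0.74, 0.19, 12, 0.47, 0.22, 7)

# Low-value words recalled per list: did not ignore (n = 7) vs. ignored (n = 12).
t_low_value, df_low_value = pooled_t(4.34, 2.96, 7, 1.87, 1.85, 12)
```

Both recovered values fall close to the reported statistics of 2.80 and 2.25, respectively.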

These findings led us to further examine individual differences in high-value versus low-value recall. We found that selectivity index showed a highly significant negative correlation with low-value recall (r = –.72, p < .001), whereas the expected positive correlation between selectivity index and high-value recall did not reach significance (r = .26). We compared the absolute values of these r coefficients via a test of dependent correlation coefficients (Steiger, 1980) and found that the correlation between selectivity index and low-value recall was significantly stronger than the correlation with high-value recall, t(18) = 2.40, p = .03. Thus, our selectivity index measure was more strongly driven by the number of low-value items recalled than by the number of high-value items recalled.

We also examined more closely the relationship between selectivity and WM span. We found that WM span score did not significantly correlate with selectivity index (r = .25), which was similar to the null effect shown by Castel, Balota, and McCabe (2009). We also observed dissociations in how selectivity and WM affected memory as a function of value. We used linear regression analyses to determine the degree to which selectivity and WM jointly predicted high-value recall and, separately, low-value recall. We found that WM span was a strong positive predictor of high-value recall (β = .66, p = .002), but that selectivity index was not (β = .09, p = .61). In contrast, WM span was a positive predictor of low-value recall (β = .33, p = .048), whereas selectivity index was a strong negative predictor (β = –.81, p < .001). Thus, it seems that higher WM span is generally associated with better recall, consistent with prior work (e.g., Rosen & Engle, 1997; Unsworth, Brewer, & Spillers, 2013). At the same time, selectivity seems to be primarily associated with the degree to which people avoid encoding low-value items. These findings suggest that selectivity relies on strategic control processes that are, at least to some extent, separable from WM.

Discussion

Prior neuroimaging studies have demonstrated the functional contributions of left ventrolateral prefrontal cortex to deep semantic processing and to the use of verbal elaboration strategies during memory encoding. Here, we demonstrate that activity in this region (specifically in left inferior frontal gyrus and ventral portions of the left middle frontal gyrus) is greater during encoding of high-value words. We also demonstrate a correlation between neural effects of value in this region and a behavioral expression of memory selectivity.

An association between effects of value on BOLD signal and memory selectivity is specifically apparent in regions of left IFG for which individual differences in activity have previously been associated with individual differences in the use of verbal encoding strategies (Kirchhoff & Buckner, 2006). Others have additionally shown that left IFG is specifically involved in control processes related to semantic retrieval (e.g., Badre et al., 2005; Thompson-Schill et al., 1997; see Badre & Wagner, 2007, for a review). Our findings thus provide suggestive evidence that people who selectively encode the most valuable items tend to do so by engaging semantic encoding strategies to a greater degree for high-value items than for low-value items. Participants with high selectivity frequently reported that they tried to ignore low-value items, and this was reflected in greater differences in brain activity in these left hemisphere regions during encoding of high- versus low-value words.

The effect of value on activity in posterior portions of the middle temporal gyrus (pMTG) also correlated with individual differences in memory selectivity. Prior evidence has also linked this region to controlled retrieval of semantic knowledge. For instance, Wagner, Paré-Blagoev, Clark, and Poldrack (2001) found that searching for a weak semantic associate led to increased activity in pMTG, as well as increased activity in both anterior and posterior portions of left IFG, as compared to searching for a strong semantic associate. Badre et al. (2005) observed a similar effect of semantic relatedness on both MTG and left IFG, but also found evidence suggesting that MTG activity reflects retrieval of semantic knowledge, whereas only activity in left IFG mediates semantic control processes per se. More recent work has supported a somewhat different viewpoint: that both regions play a necessary role in control processes related to retrieval of semantic knowledge, rather than pMTG activity only reflecting retrieval of semantic knowledge itself. For instance, Whitney, Kirk, O’Sullivan, Lambon Ralph, and Jefferies (2011) found that virtual lesions temporarily induced by transcranial magnetic stimulation (TMS) in either left IFG or left pMTG led to similar impairments in performance when judging weak semantic associates, but did not impair performance in judging strong semantic associates. The fact that the degrees of increased activity during high-value encoding in both left IFG and pMTG were associated with memory selectivity in the present task, then, provides additional evidence that successfully enhancing memory for high-value items in our value-directed remembering task depends on strategic engagement of semantic processing.

An automated meta-analysis using Neurosynth (http://neurosynth.org; Yarkoni, Poldrack, Nichols, Van Essen, & Wager, 2011) provides further evidence suggesting that the regions in which activity was modulated by value in the present study are typically involved in semantic processing. Specifically, a “reverse inference” statistical map was generated from the peak coordinates of the 670 neuroimaging studies in the Neurosynth database that most heavily utilized the term “semantic”; this map formally quantifies the probability that the term “semantic” would be associated with activation in a given region (Fig. 6). The map looks strikingly similar to the regions associated with the encoding of high-value words in the present study (Fig. 3b). The meta-analytic map also reflects many of the same regions in which the degree of increased activity during encoding of high-value words correlated with memory selectivity. Thus, the automated meta-analysis supports the view that memory selectivity arises from differential semantic processing of valuable items.

Fig. 6

Automated Neurosynth meta-analysis of semantic processing. Voxel intensity values reflect the statistical likelihood that any given study reporting an effect in that voxel would be a study that heavily utilized the term “semantic.” Note the correspondence between the left PFC and posterior lateral temporal regions that emerged in this meta-analysis and the regions of our effects reported in Figs. 3b and 4

Selective encoding could potentially be mediated via a selective increase in the use of verbal strategies during encoding of high-value words, or via a selective reduction in verbal strategy use during encoding of low-value words. We observed that selectivity was reliably associated with the degree to which people self-reported ignoring low-value words, and that selectivity index was more strongly associated with reduced memory for low-value words than with increased memory for high-value words. Thus, it seems likely that, at least in young adults, selectivity is primarily modulated by the degree to which people disengage semantic processing during the encoding of low-value items, rather than by how effectively they encode high-value items.

At the same time, we observed that memory for high-value words and self-reported engagement of semantic encoding strategies were reliably associated with WM span, but that these measures were not reliably associated with selectivity. One possible reason for the association between high-value recall and WM capacity is that individuals with high WM span may be better able to implement deep encoding of new high-value items while simultaneously maintaining previous items. High-WM-span individuals may also have been better able to maintain important items in memory while simultaneously performing the vowel–consonant task that occurred between successive word encoding trials. Although these WM mechanisms do seem to be related to higher point totals, they do not seem to be a major factor in selective encoding.

It is also worth noting that we did tend to find greater activity in reward-sensitive regions (specifically, functionally defined NAcc/ventral striatum and midbrain regions, and an anatomically defined VTA region) on high-value trials than on low-value trials across participants. The VDR paradigm differs from most studies of reward-motivated learning in that we incentivize high-value items with higher point values, rather than using rewards that have external value (e.g., money). The observation that high point values still lead to increased activity in dopaminergic reward regions, similar to that observed with monetary rewards, supports our assumption that points are sufficiently rewarding to motivate changes in behavior. Indeed, memory performance was very sensitive to point value. This finding is similar to what is observed in a number of real-world contexts (e.g., video games, sports), in which people are motivated by the prospect of a high score. Our findings do, however, differ from past work in that the strength of dopamine-driven reward effects during the anticipatory cue period did not correlate significantly with individual differences in memory selectivity. Rather, in our data, this relationship was only apparent during the phase of the task during which participants actually encountered the words. Previous work has also suggested that the effects of activity in mesolimbic dopamine regions on subsequent memory are most apparent after a delay (e.g., Murayama & Kuhbandner, 2011; Spaniol et al., 2014), perhaps due to their dependence on offline consolidation mechanisms. Such findings imply that the role of mesolimbic dopamine regions in value-induced memory enhancement should not be apparent in the immediate free recall measure used in the VDR. We believe that our findings of strategic enhancement of encoding and free recall relate to a second mechanism for value-related memory enhancement. This additional mechanism may be complementary to the dopaminergic enhancement of memory consolidation that has been demonstrated by others (e.g., Adcock et al., 2006; Murty et al., 2012; Wolosin et al., 2012), but the two different mechanisms appear to make varying contributions to memory performance on the basis of the time scale and the type of information to be remembered.

Finally, our results suggest important potential implications for research on cognitive aging. Castel and colleagues (Castel et al., 2002, 2007, 2009, 2013) found that healthy older adults generally show an excellent ability to be selective in the VDR task. Indeed, older adults often have memory equivalent to that of young adults for the most valuable items, despite recalling fewer items overall. This pattern of data often yields a higher selectivity index for older adults than is shown by young adults for tests of short-term memory. Thus, whatever older adults do to selectively encode high-value items in the VDR paradigm, they clearly seem to be relying on processes that are not substantially degraded by healthy aging. It may be that older adults retain the ability to be selective in their engagement of the semantic encoding strategies mediated by left PFC, which would provide important evidence about the type of processing that older adults are typically able to engage successfully. Thus, an important direction for future research will be to examine age-related differences and similarities in the neural mechanisms of value-directed remembering.

Although dopaminergic reward systems play an important role in memory formation, it is also important to consider how the strategic control of frontally mediated encoding processes serves to selectively enhance memory for valuable items. Particularly in situations in which the delay between study and recall is relatively short, and when the items that need to be memorized are amenable to a selective use of verbal encoding strategies, we might expect differential strategy use to be a more important contributor to memory performance than dopaminergic modulation of hippocampal activity. We anticipate that future work will help to determine the specific situations that preferentially engage these respective mechanisms, and whether they independently or interactively contribute to memory performance.