Adult age differences in frontostriatal representation of prediction error but not reward outcome

Emerging evidence from decision neuroscience suggests that although younger and older adults show similar frontostriatal representations of reward magnitude, older adults often show deficits in feedback-driven reinforcement learning. In the present study, healthy adults completed reward-based tasks that did or did not depend on probabilistic learning, while undergoing functional neuroimaging. We observed reductions in the frontostriatal representation of prediction errors during probabilistic learning in older adults. In contrast, we found evidence for stability across adulthood in the representation of reward outcome in a task that did not require learning. Together, the results identify changes across adulthood in the dynamic coding of relational representations of feedback, in spite of preserved reward sensitivity in old age. Overall, the results suggest that the neural representation of prediction error, but not reward outcome, is reduced in old age. These findings reveal a potential dissociation between cognition and motivation with age and identify a potential mechanism for explaining changes in learning-dependent decision making in old adulthood.

Electronic supplementary material: The online version of this article (doi:10.3758/s13415-014-0297-4) contains supplementary material, which is available to authorized users.


S2: fMRI Task Schematics
In the Monetary Incentive Learning (MIL) task, subjects view each cue (2 s each), select one when presented with the word "Choose" (max 3.5 s), view their highlighted choice (4 s minus choice reaction time), view the monetary outcome of their choice (2 s), and then fixate centrally for a variable inter-trial interval (2-6 s). In the Monetary Incentive Delay (MID) task, subjects view an explicit cue that indicates the trial type and reward magnitude (2 s), fixate centrally for a variable delay (2-2.5 s), respond as quickly as possible to a target, fixate centrally (4 s minus the sum of the pre-target delay and reaction time), view performance feedback (Hit!, Miss!) and the monetary outcome of their choice (2 s), and then fixate centrally for a variable inter-trial interval (2-6 s).
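The timings above imply that the interval between cue onset and outcome offset is fixed regardless of reaction time, because the post-response fixation absorbs whatever time the response did not use. The following is a minimal sketch of that arithmetic, not the authors' presentation code. The function names are hypothetical, the default of two cues per MIL trial is an assumption (the text gives only "2 s each"), and the sketch assumes the MID target response falls inside the 4 s window.

```python
def mil_trial_duration(choice_rt, iti, n_cues=2):
    """Total MIL trial length in seconds.

    n_cues=2 is an assumption; the text only states 2 s per cue.
    """
    if not 0.0 <= choice_rt <= 3.5:
        raise ValueError("choice RT must be within the 3.5 s response window")
    if not 2.0 <= iti <= 6.0:
        raise ValueError("inter-trial interval must be 2-6 s")
    cues = 2.0 * n_cues
    highlight = 4.0 - choice_rt      # highlighted choice fills the rest of 4 s
    outcome = 2.0
    return cues + choice_rt + highlight + outcome + iti


def mid_trial_duration(pre_target_delay, rt, iti):
    """Total MID trial length in seconds.

    Assumes the target response is counted inside the 4 s window.
    """
    if not 2.0 <= pre_target_delay <= 2.5:
        raise ValueError("pre-target delay must be 2-2.5 s")
    if not 2.0 <= iti <= 6.0:
        raise ValueError("inter-trial interval must be 2-6 s")
    cue = 2.0
    post_target_fixation = 4.0 - (pre_target_delay + rt)
    if post_target_fixation < 0:
        raise ValueError("delay plus RT exceeds the 4 s window")
    feedback = 2.0
    return cue + pre_target_delay + rt + post_target_fixation + feedback + iti
```

Under these assumptions, every MID trial lasts 8 s plus the inter-trial interval, and every two-cue MIL trial lasts 10 s plus the inter-trial interval, whatever the reaction time.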

S3: MIL Task PE Whole-Brain Regression Model
The prediction error regression model used computationally derived estimates of value produced by a standard reinforcement learning model fit to individual subject choices (O'Doherty et al., 2003; Sutton & Barto, 1998). Expected values (V) were computed to represent the reward expected by selecting each cue. The values of V for each cue were initialized at 0 and updated for the chosen cue on each trial according to

\[ V(s_i) \leftarrow V(s_i) + \alpha \, [\, r_i - V(s_i) \,] \tag{1} \]

where s_i is the cue encountered on trial i and α is the learning rate. Learning is mediated by a prediction error (the bracketed portion of Equation 1) between the reward received (r_i) and the expected value of the chosen option, V(s_i). The prediction error is positive if the reward received is larger than expected and negative if the reward received is smaller than expected. Prediction errors were modeled at outcome in the MIL task (but reward predictions, V, were not included in the model). Learning is modulated by a learning rate, or recency parameter (α; 0 < α < 1), that determines the degree to which subjects update their reward expectations based on the most recently received rewards. As this rate approaches 1, greater weight is given to the most recent reward.

To fit behavior, the probability of selecting each action was assumed to follow a softmax rule, which gives the probability of selecting a cue s as a function of the expected values:

\[ P(s) = \frac{\exp[\beta V(s)]}{\sum_{s'} \exp[\beta V(s')]} \]

Here β is a decision slope parameter that determines the degree to which the option with the highest EV is selected. The constants α (learning rate) and β (decision slope) were adjusted to maximize the probability of the observed choices under the model. Best-fitting values for gain learning were α = 0.34 (95% CI: 0.25-0.43) and β = 0.12 (95% CI: 0.06-0.17), and for loss learning were α = 0.39 (95% CI: 0.27-0.51) and β = 0.22 (95% CI: 0.16-0.28).

In the following associations with task performance, β denotes a regression coefficient rather than the decision slope. Learning rates were not correlated with task performance in either the gain condition, β = .21, p = .20, or the loss condition, β = .25, p = .12. Decision slopes were positively correlated with task performance in the gain condition, β = .50, p < .01, but not the loss condition, β = .23, p = .14. Despite these inconsistent associations between model parameters and overall task performance, the predicted probabilities of choosing the higher probability cue on each trial over time
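The model described in this section can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' fitting code. It assumes the conventional forms of the value update (Equation 1) and of the softmax rule with β multiplying the expected values. The source does not state which optimizer was used, so a coarse grid search stands in for it here. All function names and the toy choice and reward sequences are hypothetical.

```python
import math


def prediction_errors(alpha, choices, rewards, n_cues=2):
    """Trial-by-trial prediction errors r_i - V(s_i), with V initialized at 0."""
    values = [0.0] * n_cues
    errors = []
    for choice, reward in zip(choices, rewards):
        pe = reward - values[choice]
        errors.append(pe)
        values[choice] += alpha * pe   # update only the chosen cue
    return errors


def negative_log_likelihood(alpha, beta, choices, rewards, n_cues=2):
    """Negative log probability of the observed choices under softmax."""
    values = [0.0] * n_cues
    nll = 0.0
    for choice, reward in zip(choices, rewards):
        scaled = [beta * v for v in values]
        peak = max(scaled)             # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scaled]
        nll -= math.log(exps[choice] / sum(exps))
        values[choice] += alpha * (reward - values[choice])
    return nll


def fit_grid(choices, rewards, alphas, betas):
    """Return (nll, alpha, beta) minimizing the negative log likelihood.

    Grid search is an assumption; the source does not name an optimizer.
    """
    return min(
        (negative_log_likelihood(a, b, choices, rewards), a, b)
        for a in alphas
        for b in betas
    )


# Hypothetical toy data: cue index chosen and reward received on each trial.
toy_choices = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
toy_rewards = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1]
alpha_grid = [i / 20 for i in range(1, 20)]   # 0.05 ... 0.95
beta_grid = [j / 2 for j in range(0, 21)]     # 0.0 ... 10.0
best_nll, best_alpha, best_beta = fit_grid(
    toy_choices, toy_rewards, alpha_grid, beta_grid
)
print(best_alpha, best_beta, best_nll)
```

The per-trial output of `prediction_errors`, computed with the fitted learning rate, is the kind of quantity that would enter a parametric regressor at outcome. For the loss condition, the same functions apply with rewards coded as -1 and 0.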

S4: MIL Task Performance in Larger Behavioral Sample
Seventy-seven adults (age mean = 55, SD = 17, range 20-85) played 12 trials per condition (gain, loss, neutral). Half of the sample (39 adults) were the same individuals who completed the fMRI version of the task. For these analyses, only the first half of their trials (12 trials per condition) were included in the measures of learning performance (to equate trial numbers and learning phase). The additional thirty-eight subjects in the sample were recruited using the same market research firm, but did not complete the 24 trial version of the MIL task while undergoing fMRI. They only played the first half of the learning task (12 trials per condition). Separate analyses of this dataset are reported in a recent publication that focuses on long-term financial outcomes in life (i.e., asset and debt accumulation) rather than age differences in learning ability (Knutson et al., 2011). All subjects in this sample of 77 were recruited from the community in exactly the same way. They were all initially invited to participate in both neuroimaging and behavioral phases of experiments in the lab, although some of them were either ineligible for imaging or opted to only complete a behavioral session.
Analysis in this larger behavioral dataset revealed a main effect of age, F(1, 75) = 6.57, p < .05, such that learning performance was higher in younger compared to older adults. There was a non-significant main effect of task condition (gain, loss), F(1, 75) = 0.03, p = .86, and a non-significant interaction of age and task condition (gain, loss), F(1, 75) = 0.21, p = .65. In a separate model that included both a linear and quadratic effect of age, the main effect of age was still significant, p = .01, but the quadratic effect of age was non-significant, p = .45. In summary, older adults performed more poorly across both conditions of the learning task in this larger behavioral sample focused on the early stages of learning.
Learning performance scores by age and task condition are plotted below in Supplementary

S5: Additional MIL Task Whole-Brain fMRI Results
In a second and separate model (that is more analogous to the MID task reward outcome model), we

Regions of the brain that showed age differences in activation between PPE and NPE at outcome. Blue corresponds to negative z-scores, which indicate a reduced difference between PPE and NPE as age increased. R = right. R/L, A/P, or S/I value listed in upper corner of each statistical map. Anatomical underlay is an average of all subjects' spatially normalized structural scans.

S6: Loss Condition Results
MIL task. For the task that required learning, there were no regions that showed a significant difference in activation between loss outcomes (PPE vs NPE) across subjects at the cluster-corrected threshold (see Supplementary Table 2). Relaxing the cluster correction revealed a five-voxel subthreshold cluster in the right anterior cingulate (RAS = 4, 15, 23; Z = 3.69) with significantly greater signal for loss avoidance (-$0) than for actual losses (-$1) across subjects.
A linear effect of age was observed in a dorsomedial frontopolar region at the cluster-corrected whole-brain threshold (see Supplementary Table 2 Figure 6). However, there were no regions that showed significant age differences for this contrast.

S7: MID Task Low Magnitude Timecourses
One important difference between the MID and MIL tasks was that different magnitudes of rewards were at stake. The MIL task included two reward levels ($0, $1), and the MID task included three reward levels ($0, $0.50, $5). An alternative account of the presence of age differences in the MIL task and the absence of age differences in the MID task is that older adults are less sensitive to lower magnitude rewards. To test this possibility, we plotted timecourses for the low magnitude ($0.50) condition from the MID task for each region of interest. The same pattern of results emerges, ruling out this alternative account: older adults show signal change differences between $0.50 gains and $0 nongains in the MID task. Thus, the differences between tasks are not simply due to differences in reward magnitudes.
Supplementary Figure 4. Black lines are +$0.50 outcomes (successfully hit target) and grey lines are +$0.00 outcomes (missed the opportunity to win $0.50).
Additional model fitting explored whether adults in either study were better fit by a Rescorla-Wagner reinforcement learning (RL) or win-stay/lose-shift (WSLS) model. Additionally, we independently examined win-stay and lose-shift across age groups and tasks. For all analyses and discussion below both gains (+$1) in the gain condition and non-losses (-$0) in the loss condition will be categorized as "wins" for the estimation of win-stay. Likewise, both losses (-$1) in the loss condition and non-gains (+$0) in the gain condition will be categorized as "losses" for the estimation of lose-shift.
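As an illustration of the win and loss categorization described above, win-stay and lose-shift proportions can be computed from a sequence of choices and outcomes within a single condition. This is a minimal sketch under stated assumptions, not the authors' analysis code. The function names are hypothetical, outcomes are assumed to be coded as +1 or 0 in the gain condition and 0 or -1 in the loss condition, and each sequence is assumed to contain only the consecutive trials of one cue pair.

```python
def is_win(outcome, condition):
    """Gains (+$1) and non-losses (-$0) count as wins; all else as losses."""
    if condition == "gain":
        return outcome > 0
    if condition == "loss":
        return outcome == 0
    raise ValueError("condition must be 'gain' or 'loss'")


def wsls_rates(choices, outcomes, condition):
    """Return (win_stay, lose_shift) proportions for one condition.

    A proportion is None when no trial of that type occurred.
    """
    wins = stays_after_win = 0
    losses = shifts_after_loss = 0
    # The last trial has no following choice, so it is not scored.
    for t in range(len(choices) - 1):
        stayed = choices[t + 1] == choices[t]
        if is_win(outcomes[t], condition):
            wins += 1
            stays_after_win += int(stayed)
        else:
            losses += 1
            shifts_after_loss += int(not stayed)
    win_stay = stays_after_win / wins if wins else None
    lose_shift = shifts_after_loss / losses if losses else None
    return win_stay, lose_shift
```

A subject who always repeats a choice after a win and always switches after a loss would score 1.0 on both measures.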
In Study 1, a marginally significant main effect of task condition, F(1, 36) = 3.98, p = .05, suggested that subjects were slightly better fit by RL than WSLS in the gain relative to the loss condition. Within the two valence conditions, subjects across age were better fit by RL than WSLS in the gain condition, suggested that younger adults were more likely to win-stay than lose-shift compared to the older adults.
Follow-up tests revealed non-significant effects of age for win-stay in the short condition, p = .16, and win-stay in the long condition, p = .25, but significant age differences in lose-shift in the short condition, p < .001, and lose-shift in the long condition, p < .0001. The condition (short, long) by WSLS (win-stay, lose-shift) by age group interaction was not significant, F(1, 48) = 0.82, p = .37. See Supplementary Figure 5D.