On a simple level, the functional architecture of the human brain can be modeled as a network of interconnected systems, each with specific functions that transform incoming signals to an output (Arbib 2003). Like any physical input-output system, these systems are characterized by inherent load constraints and optimal load capacity ranges, and are designed to function in an adaptive manner within these constraints. As a corollary to this, exceeding capacity constraints presents one possible means by which a system may suffer performance degradations.

The human working memory system provides a useful experimental model for studying transient neural overload effects (Callicott et al. 1999). Neuroimaging studies employing variable-load repetitive working memory tasks such as the N-back task have demonstrated adaptive increases in brain activation in dorsolateral prefrontal cortex (DLPFC) in response to increasing working memory load (Braver et al. 1997; Cohen et al. 1994). At high loads associated with poor task performance, left DLPFC activation has been shown to decrease, forming an inverted U-shaped load-response curve (Callicott et al. 1999). Although working memory has been extensively studied in laboratory settings, an important area that has not been explored adequately is the impact of exposure to capacity-exceeding loads on subsequent working memory function and its underlying neurocircuitry.

In addition to the DLPFC, another brain region that may play an important role in influencing the effects of capacity-exceeding working memory loads is the amygdala, a brain region centrally involved in emotional responses (LeDoux 2003). The amygdala has been implicated in mediating the deleterious effects of emotional interference on cognitive processing (Dolcos and McCarthy 2006; Gray 2001; Northoff et al. 2004; Simpson et al. 2000). Amygdala activity has also been linked with improvements in reaction times during high load N-back task performance without loss of accuracy and in a manner not contingent upon mood state (Schaefer et al. 2006), a finding that suggests a possible attention or vigilance role at high cognitive loads. Since amygdala and DLPFC are not thought to be strongly connected anatomically (Porrino et al. 1981), functional connectivity between these regions is likely indirect and could be mediated through the ventromedial prefrontal cortex (VMPFC), a region connecting amygdala with DLPFC (Ghashghaei and Barbas 2002). Inverse correlations of activity between VMPFC and DLPFC have been observed during working memory tasks (Longe et al. 2008) as well as during the induction of sadness and recovery from depression (Mayberg et al. 1999).

The present study was designed to test the effect of overloading the working memory system on subsequent functioning at a moderate load. We implemented the N-back task with a high-load 4-back condition to overload working memory capacity, using an easy 1-back task as the control condition. Although the N-back task is primarily used to test working memory and not emotion, at capacity-exceeding loads an element of uncontrollable failure is introduced, which we anticipated would cause a negative emotional response that could interfere with cognitive performance through activation of the amygdala. Our primary hypothesis was that 4-back exposure would cause transient declines in DLPFC activity and task accuracy during subsequent 2-back performance. We also hypothesized that such declines would be associated with greater amygdala activity and, through connections traversing VMPFC, greater inverse functional connectivity between amygdala and DLPFC.

Materials and methods


All procedures were approved by the Institutional Review Boards of Yale University and the VA Connecticut Healthcare System. Informed consent was obtained from 20 right-handed healthy control subjects (9 males and 11 females, mean age ± SD of 26.6 ± 8.0 years, mean years of education ± SD of 17.0 ± 2.9). Mean IQ ± SD estimated from performance on the Wechsler Test of Adult Reading (WTAR) (Ginsberg 2003) was 112.7 ± 11.6. Each subject was assessed with the Structured Clinical Interview for DSM-IV to rule out presence of psychiatric illness (First et al. 2002). Subjects with significant medical or neurological illnesses were also excluded.

N-back task (Fig. 1)

Each subject underwent a functional magnetic resonance imaging (fMRI) session composed of four separate runs, with each run consisting of 11 blocks lasting 32–40 s each, with 10 s of rest between blocks. Blocks were counterbalanced between runs to minimize potential confounds of fatigue and block order. Each block contained 16–20 trials. Each trial consisted of a visual presentation of a letter for 500 msec followed by an interstimulus interval of 1500 msec. Subjects were instructed to press a target button as soon as possible for each trial in which the letter shown was the same as the letter shown N trials previously, and to press a non-target button otherwise. For the 0-back, targets were each occurrence of the letter “a.” Target responses were set to occur with 50% frequency, with accuracy computed as the number of correct responses divided by the number of trials. Initial letters in a sequence with no comparison letters were not counted as trials. Subjects were given blocks of 0-back, 1-back, 2-back, and 4-back, with half of the 2-back blocks occurring immediately after a 4-back block (“2-back/4”), and the remaining 2-back blocks occurring immediately after a control-condition 1-back block (“2-back/1”). The 4-back blocks were used intentionally to exceed subjects’ working memory capacity, with the expectation of subsequent deficits in 2-back performance and associated involvement of limbic circuitry. Prior to the fMRI testing session, each subject was given written and verbal instructions on the task and underwent a practice session composed of blocks of 0-back, 1-back, and 2-back. In order to maximize the subjective failure experience of the 4-back condition, subjects were told they might be given 3-back or 4-back blocks, but were not given practice trials at this difficulty level. Subjects were instructed to perform the tasks to the best of their ability, emphasizing accuracy over reaction time when possible.
They were informed that this experiment would assist in our understanding of cognitive processes but were otherwise not told about the specific purpose of the experiment until after the fMRI session. Upon completing the fMRI runs, subjects were given a post-test questionnaire that asked them to rate the different N-back conditions on separate 5-point scales of difficulty and frustration (1 = not at all difficult/frustrating, 5 = very difficult/frustrating). As subjects were not made aware of the distinction between 2-back/1 and 2-back/4 blocks until after the testing session, their ratings did not distinguish between these two conditions.
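The scoring rule described above can be illustrated with a minimal Python sketch. The function names and the boolean response encoding (True = target button pressed) are assumptions made for illustration and are not taken from the study's software; missed responses are not modeled.

```python
def nback_targets(letters, n):
    """Return one boolean per scorable trial: True if the trial is a target.

    For n == 0 every letter is a trial and the target is the letter "a".
    For n >= 1 the first n letters have no comparison letter and are
    therefore not counted as trials.
    """
    if n == 0:
        return [letter == "a" for letter in letters]
    return [letters[i] == letters[i - n] for i in range(n, len(letters))]


def nback_accuracy(letters, pressed_target, n):
    """Accuracy = number of correct responses / number of scorable trials.

    pressed_target holds one boolean per presented letter
    (True = target button, False = non-target button).
    """
    targets = nback_targets(letters, n)
    skipped = len(letters) - len(targets)   # unscored initial letters
    scored = pressed_target[skipped:]
    correct = sum(1 for press, target in zip(scored, targets) if press == target)
    return correct / len(targets)


# Hypothetical 2-back sequence: trials start at the third letter.
letters = list("abaca")
responses = [False, False, True, False, False]
print(nback_targets(letters, 2))
print(nback_accuracy(letters, responses, 2))
```

In this hypothetical sequence the third and fifth letters are 2-back targets, and the subject responds correctly on two of the three scorable trials.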

Fig. 1

Experimental design. a Sample N-back task for N = 2. b Block design. Subjects underwent runs in Sequence A and B in counterbalanced fashion

FMRI acquisition and preprocessing

Blood oxygen level-dependent (BOLD) fMRI data were collected on a Siemens Trio 3-Tesla scanner using an EPI pulse sequence (TR = 2 s, TE = 30 ms, FA = 85°, FOV = 20 cm) and whole brain coverage (33 axial oblique slices, slice thickness = 4.0 mm, gap = 0.5 mm, voxel size = 3.1 × 3.1 × 3.5 mm³). Preprocessing of images was performed using SPM2. Slice timing correction was performed using the middle slice as the reference, and the slices were reoriented, setting the origin at the anterior commissure and the horizontal axis along the anterior commissure-posterior commissure line. Motion correction was then performed with INRIalign (Freire et al. 2002) using the first scan of each run as the reference. For each subject, the motion parameters of each run were visually inspected. One subject exhibited greater than 2.5 mm of translational motion during the experiment and was excluded from further fMRI analysis. Only two of the six motion parameters (movement in the y and z directions) exhibited greater than 0.5 mm or 0.5° deviations for any subject during scanning. These two parameters were entered as covariates of non-interest during model specification. Following motion correction, mean functional scans were normalized to the EPI average brain template from the Montreal Neurological Institute (MNI; supplied in SPM2) using affine and nonlinear warping algorithms, and the resulting transformation matrix was used to normalize all the individual scans in the time series. Functional scans were then re-sampled to a 4 × 4 × 4 mm voxel size. Spatial smoothing was performed using a 10 mm full-width half-maximum Gaussian kernel, and data were subjected to a high-pass temporal filter (.008 Hz).

Analysis of behavioral data

Mean accuracy and median reaction times for each condition were determined for each subject, and subsequently used to generate overall group means. Percent change in 2-back accuracy was calculated as (mean accuracy on 2-back/4 minus mean accuracy on 2-back/1) divided by mean accuracy on 2-back/1. Subjective responses, accuracy scores, and reaction times were analyzed using repeated measures multivariate analysis of variance (MANOVA) (F tests based on Wilks’ Lambda), with task condition (0-back, 1-back, 2-back, and 4-back) as a within-subjects factor. Paired t-tests (two-tailed) were used to examine accuracy and reaction time differences between the 2-back/1 and 2-back/4 conditions. Otherwise, post-hoc comparisons of task conditions were made only when the main analysis was significant at the p < .05 threshold.
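As a worked illustration of the percent-change measure defined above, the following Python sketch computes it for a single subject. The function name, the scaling by 100, and the accuracy values are hypothetical.

```python
def percent_change_2back(acc_2back_after_1, acc_2back_after_4):
    """(2-back/4 accuracy minus 2-back/1 accuracy) / 2-back/1 accuracy, in percent.

    Negative values indicate a decline in 2-back accuracy after 4-back exposure.
    """
    return 100.0 * (acc_2back_after_4 - acc_2back_after_1) / acc_2back_after_1


# Hypothetical subject: 90% correct after 1-back blocks, 81% after 4-back blocks.
change = percent_change_2back(0.90, 0.81)
print(round(change, 2))
```

With these hypothetical values the measure is approximately -10, that is, a 10% relative decline.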

Analysis of fMRI data

fMRI data were analyzed using the general linear model as implemented in SPM2. For first-level individual subject analyses, fixed effects multiple linear regression time series analysis was implemented to model task condition effects, generating images of the parameter estimates (beta images) for each condition, as well as contrast images generated by subtracting beta images for specific conditions. For second-level group analyses, a random-effects model was applied to the individual subject beta or contrast images derived from the first-level analyses to determine the location and extent of brain activations. Exploratory voxel-wise one-sample t-tests of whole brain contrast images were conducted using a p < .05 voxel-level probability threshold and an extent threshold of 3 voxels, with correction for multiple comparisons based on the false discovery rate (FDR) (Genovese et al. 2002). Cluster-level analysis was based on these same parameters. Within each significant (corrected p < .05) cluster, the percent of each region activated and the corresponding number of activated voxels were determined using the MSU utility (∼pet_lab/MSU/MSUMain.html). All voxel coordinates are given in MNI space.
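The FDR correction cited above (Genovese et al. 2002) is based on the Benjamini-Hochberg step-up procedure. A minimal Python sketch of that procedure is given below for illustration only; it is not the SPM2 implementation, it assumes the basic form of the procedure, and it omits the 3-voxel extent threshold. The p-values are hypothetical.

```python
def fdr_threshold(p_values, q=0.05):
    """Benjamini-Hochberg step-up rule.

    Sort the m p-values in ascending order and find the largest p_(i)
    satisfying p_(i) <= (i / m) * q. Voxels with p <= that value are
    declared significant. Returns None if no p-value passes.
    """
    m = len(p_values)
    threshold = None
    for i, p in enumerate(sorted(p_values), start=1):
        if p <= q * i / m:
            threshold = p
    return threshold


# Hypothetical voxel-wise p-values.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
cutoff = fdr_threshold(p_values, q=0.05)
significant = [p for p in p_values if cutoff is not None and p <= cutoff]
print(cutoff, significant)
```

Note that the cutoff adapts to the data: the more small p-values a map contains, the more lenient the resulting voxel-level threshold becomes.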

Region of interest analysis

Mean activation of voxels for each task condition was determined for a priori regions of interest (ROIs) including left and right DLPFC (BA46), amygdala, and VMPFC (BA25). Analyses were conducted separately for each hemisphere in consideration of the verbal nature of the working memory task and studies reporting lateralized cognitive load effects or emotional-cognitive interactions (Altamura et al. 2007; Erk et al. 2007; Low et al. 2009; Sandrini et al. 2008; Siegle et al. 2006). Because of the relatively large area encompassed by these anatomical regions, “functional” ROI masks were generated by identifying voxels within each anatomical region that exhibited supra-threshold activation or deactivation in any of the task conditions. This procedure prevented attenuation of task-related signals associated with inclusion of inactive voxels in the calculation of ROI means. The first step of this ROI procedure involved conducting second-level group analyses for the contrasts of each task condition with the 0-back condition (1-back minus 0-back, 2-back/1 minus 0-back, 2-back/4 minus 0-back, 4-back minus 0-back, and the reverse contrast for each of the above). The resulting activation maps were masked with anatomical ROIs generated by the WFU Pickatlas (Maldjian et al. 2003) (dilated x1), based on the Talairach Daemon database (Lancaster et al. 1997). For each of the above contrasts, voxel values within the masks were thresholded at p < .001, uncorrected for multiple comparisons, with an extent of 3 voxels. For each ROI, the supra-threshold voxels generated from each task condition contrast were combined, forming a single functional ROI mask that represented the union of the activated voxels from each contrast. This ROI mask thus included only voxels that were activated or deactivated in at least one of the task conditions, but was otherwise unbiased in terms of direction of activation.
The resulting six functional ROI masks (right and left BA46, amygdala, and BA25) were overlaid on the individual subject contrast images for each condition vs. the 0-back control condition, and the mean activation of all voxels within each of these ROIs was tabulated for each subject. Values from 2-back/1 and 2-back/4 were initially pooled to form a single 2-back condition. Two-way (condition x hemisphere) repeated measures MANOVA (F tests based on Wilks’ Lambda) was used to analyze the subjects’ mean activations in the 1-back, 2-back, and 4-back conditions for each ROI, and to test for hypothesized differences in regional brain activity between the 2-back/1 and 2-back/4 conditions. Where appropriate, follow-up paired t-tests (two-tailed) were used to further parse the results.
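The construction of the functional ROI masks can be sketched as a set operation. In the Python sketch below, voxels are represented as coordinate tuples and images as dictionaries; these data structures, the function names, and all values are assumptions for illustration, and the 3-voxel extent threshold is not modeled.

```python
def functional_roi(anatomical_mask, suprathreshold_maps):
    """Union of supra-threshold voxels across contrasts, restricted to the anatomical ROI.

    anatomical_mask: set of voxel coordinates in the anatomical region.
    suprathreshold_maps: list of sets, one per contrast (activations and
    deactivations alike), each holding voxels that passed the threshold.
    """
    union = set().union(*suprathreshold_maps)
    return anatomical_mask & union


def roi_mean(contrast_image, roi):
    """Mean contrast value over all voxels in the ROI for one subject."""
    return sum(contrast_image[voxel] for voxel in roi) / len(roi)


# Hypothetical anatomical region of four voxels and two thresholded contrasts.
anatomical = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)}
two_back_vs_zero = {(0, 0, 0), (1, 0, 0), (9, 9, 9)}   # last voxel lies outside the region
zero_vs_four_back = {(1, 0, 0), (2, 0, 0)}             # reverse contrast (deactivation)
roi = functional_roi(anatomical, [two_back_vs_zero, zero_vs_four_back])

subject_image = {(0, 0, 0): 1.0, (1, 0, 0): 2.0, (2, 0, 0): -0.6, (3, 0, 0): 5.0}
print(sorted(roi), roi_mean(subject_image, roi))
```

Because the mask is a union over both contrast directions, a voxel enters the ROI regardless of whether it was activated or deactivated, which is what keeps the mask neutral with respect to direction of effect.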

Across subjects, mean activity values derived from contrast images in bilateral amygdala for the 4-back and 2-back (pooled 2-back/1 and 2-back/4) conditions (minus 0-back) were tested for correlation with percent change in accuracy from 2-back/1 to 2-back/4. Similar voxel-wise correlational analyses were also performed within SPM2 for Talairach-based bilateral amygdala ROI masks using small-volume correction, an extent threshold of 3 voxels, and a significance threshold of p < .05, corrected for multiple comparisons based on FDR. All correlations in this study were performed using two-tailed assumptions.

Functional connectivity

For each run of each subject’s testing session, a representative amygdala time series was extracted using the first eigenvariate of the voxels in bilateral Talairach-defined amygdala. For each subject, a voxel-wise multiple regression analysis was performed, regressing each voxel’s time series on the amygdala time series as well as several covariates of non-interest, including the time series of global mean brain activity and y- and z- motion parameters. Resulting beta images for each subject, reflecting voxel-wise correlations (i.e. functional connectivity) with amygdala activity, were then passed forward to second level random effects group analysis. Exploratory voxel-wise t-maps were generated for positive and negative linear regression slopes using a probability threshold of p < .05 (FDR-corrected) and an extent threshold of 3 voxels. To test for hypothesized functional connectivity between the amygdala and specific ROIs, small volume corrections were applied using ROI masks for DLPFC (Talairach-defined BA 9 and 46) and VMPFC (Talairach-defined BA25) generated using WFU Pickatlas (dilated x1). A similar procedure was utilized to explore functional connectivity between VMPFC and DLPFC, using VMPFC as the seed region. Talairach-defined regions were used here as opposed to the “active voxel” approach used in the ROI analysis due to the emphasis of functional connectivity on relationships over time as opposed to mean activity.

Psychophysiologic interaction (PPI) analysis

In order to explore whether the functional relationship between amygdala and DLPFC varied between the difficult 4-back and control 1-back conditions, PPI analysis was conducted using SPM2 based on the method of Gitelman et al. (2003). Specifically, we tested whether the slope of the regression line relating amygdala activity to DLPFC activity in the individual subjects’ time-series data differed between the 4-back and 1-back conditions, and whether this slope difference correlated with the percent change in 2-back accuracy across subjects. PPI analysis was conducted for each individual subject at the first level, with PPI beta images then being passed forward to second level random effects group analysis. The left and right Talairach-defined amygdalae were each used as source regions in separate analyses. In the first level PPI analysis, the time series for the first eigenvariate of the voxels in the amygdala was extracted, representing the “physiological variable.” This hemodynamic time series was deconvolved to estimate the underlying neural time series used in the calculation of the PPI interaction term (Gitelman et al. 2003). The “psychological variable” was a two-condition contrast vector representing the timing of the 1-back and 4-back conditions. The psychophysiologic interaction (PPI) term was then defined as the product of the amygdala’s neural time series and the two-condition contrast vector, reconvolved with the default hemodynamic response function. To generate whole brain PPI maps, the time series for each voxel was regressed on the amygdala’s hemodynamic time series, the contrast vector, and the PPI product term in a multiple regression model. Beta images for the PPI term from each subject, reflecting the degree to which the slope of the regression line describing the relationship of each voxel’s time series with the amygdala’s time series differed between task conditions, were then carried forward to second level analyses.
A whole-brain t-map was generated using voxel-wise one-sample t-tests of these PPI slope differences and thresholded using an FDR-corrected probability threshold of p < .05 and an extent threshold of 3 voxels. Because of our interest in the relationship between amygdala and DLPFC, the PPI t-map was interrogated with the same left and right DLPFC ROI masks used in the functional connectivity analysis described above. In order to test the relationship between amygdala-DLPFC interaction and behavioral effects of 4-back exposure, voxel-wise correlational analyses were performed between the PPI terms and percent decline in 2-back accuracy within the left and right DLPFC ROI masks, again using a voxel-wise probability threshold of p < .05 with FDR and small volume corrections.
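The core of the first-level PPI model, regressing a voxel's time series on the seed time series, the contrast vector, and their product, can be sketched in plain Python as follows. This is a simplified illustration and not the SPM2 procedure: the hemodynamic deconvolution and reconvolution steps are omitted, an intercept is added, the +1/−1 coding of the 4-back and 1-back conditions is an assumption, and all names and values are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(columns, y):
    """Least-squares coefficients for y regressed on the given regressor columns."""
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    return solve(xtx, xty)


def ppi_betas(seed, psych, target):
    """Return [intercept, seed beta, psychological beta, PPI beta] for one voxel.

    seed:   seed-region time series (amygdala in the text)
    psych:  contrast vector, assumed +1 during 4-back and -1 during 1-back
    target: time series of the voxel being modeled
    """
    ppi = [s * p for s, p in zip(seed, psych)]   # interaction (product) term
    const = [1.0] * len(seed)
    return ols([const, seed, psych, ppi], target)


# Hypothetical data: the seed-target slope is 0.5 - 0.8 in 4-back and 0.5 + 0.8 in 1-back.
seed = [0.2, -1.0, 0.7, 1.5, -0.3, 0.9, -1.2, 0.4]
psych = [1, 1, 1, 1, -1, -1, -1, -1]
target = [2.0 + 0.5 * s - 0.8 * s * p for s, p in zip(seed, psych)]
print([round(b, 6) for b in ppi_betas(seed, psych, target)])
```

Under the assumed +1/−1 coding, the seed-target slope equals the seed beta plus the PPI beta in the 4-back condition and the seed beta minus the PPI beta in the 1-back condition, so the slope difference between conditions is twice the PPI beta.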

In order to derive the separate slope terms for the 4-back and 1-back conditions, separate task condition vectors for the 1-back and 4-back (i.e. vectors consisting of ones and zeros rather than ones and negative ones) were generated in addition to the single two-condition contrast vector. A time series was also extracted using the first eigenvariate of the voxels for DLPFC regions showing a significant PPI with amygdala. Linear regressions were then performed for each task condition vector separately, regressing the DLPFC time series on the amygdala time series, the single task condition vector, and the PPI product. Modeled in this way, the PPI product represents the slope of the DLPFC-amygdala correlation for the specified task condition, deriving separate slopes for the 1-back and 4-back conditions at the first level for each subject. These slopes were then compared across conditions in a second level random effects paired t-test analysis. A similar procedure was used with the original two-condition contrast to extract the amygdala-DLPFC PPI values for the 4-back minus 1-back contrast. To assist in conceptualizing the above analysis, the six subjects showing the greatest percent decline in 2-back accuracy were compared to the six subjects showing the least decline using the Mann-Whitney U statistic.
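For the tertile comparison, the Mann-Whitney U statistic can be computed by direct pair counting, as in the Python sketch below. The function name and group values are hypothetical, and the sketch returns only the U statistics; obtaining a p-value would additionally require the exact null distribution or a normal approximation.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) by counting pairwise comparisons.

    U_a counts the pairs in which the value from group_a exceeds the value
    from group_b, with ties counted as 0.5. U_a + U_b = len(a) * len(b).
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical PPI values for six "vulnerable" and six "resilient" subjects.
vulnerable = [-0.9, -0.6, -0.4, -0.3, 0.1, -0.2]
resilient = [0.3, 0.5, -0.1, 0.8, 0.2, 0.6]
print(mann_whitney_u(vulnerable, resilient))
```

A rank-based test of this kind is a natural choice for groups of six, where normality of the PPI values cannot be assessed reliably.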

As a final step, 4-back accuracy, amygdala activity, and amygdala-DLPFC PPI product were entered into a multiple regression model predicting change in 2-back accuracy.


Results

Subjective responses

As expected, subjective reports of difficulty and frustration increased with increasing working memory load. Mean reported difficulty ± SD was 1.13 ± 0.52 for the 0-back, 1.47 ± 0.64 for the 1-back, 2.73 ± 0.80 for the 2-back, and 4.60 ± 0.63 for the 4-back condition. Mean reported frustration ± SD was 1.33 ± 0.72 for the 0-back, 1.47 ± 0.64 for the 1-back, 2.53 ± 0.83 for the 2-back, and 4.13 ± 0.92 for the 4-back condition. Both difficulty (Pearson’s r = −.71, p < .000001) and frustration (Pearson’s r = −.64, p < .000001) were strongly and inversely correlated with task accuracy. Repeated measures MANOVA revealed a significant main effect of task condition on both difficulty (F(3,12) = 124.6, p < .001) and frustration (F(3,12) = 49.0, p < .001). Post-hoc comparisons indicated that the 4-back was more difficult (F(1,14) = 46.5, p < .001) and frustrating (F(1,14) = 34.5, p < .001) than the 2-back, which in turn was more frustrating (F(1,14) = 26.7, p = .001) and difficult (F(1,14) = 37.7, p < .001) than the 1-back. The 1-back was more difficult (F(1,14) = 7.00, p = .02) than the 0-back, but did not differ significantly in frustration.

Task accuracy

For analysis of accuracy and reaction times, data from the 2-back/1 and 2-back/4 conditions were initially pooled together. Mean accuracy ± SEM at each test condition is depicted in Fig. 2a. Repeated measures MANOVA revealed a significant main effect of task condition on accuracy (F(3,16) = 34.03, p < .001). Accuracy was worse on 1-back compared to 0-back (F(1,18) = 7.34, p = .01), on 2-back compared to 1-back (F(1,18) = 22.76, p < .001), and on 4-back compared to 2-back (F(1,18) = 58.53, p < .001). Individual comparisons of 2-back/1 and 2-back/4 accuracy are shown in Fig. 2b. A paired t-test confirmed the central hypothesis that 2-back performance would decline immediately following 4-back exposure (mean accuracy ± SEM for 2-back/1 was 85.5 ± 2.8% and for 2-back/4 was 82.0 ± 3.5%, t(19) = 3.1, p = .006). Overall, seven subjects showed declines ranging from 5% to 24%, with the remaining 13 subjects ranging within 5% of no change.

Fig. 2

Behavioral measures. a Accuracy on the N-back task (n = 20, mean ± SEM). b Individual data comparing accuracy on the 2-back/1 and 2-back/4 conditions, with mean accuracy represented by the thick solid bars (* p =  .006). c Reaction time on the N-back task (n = 20, mean ± SEM). d Correlation between percent change in 2-back accuracy and 4-back accuracy

Task reaction time

Mean reaction times are plotted in Fig. 2c. Repeated measures MANOVA of individual subjects’ median reaction times revealed a significant effect of task condition (F(3,17) = 24.42, p < .001). Reaction times were longer on 1-back compared to 0-back (F(1,19) = 61.01, p < .001) and on 2-back compared to 1-back (F(1,19) = 31.72, p < .001). Reaction times did not differ significantly between the 2-back and 4-back conditions or between the 2-back/1 and 2-back/4 conditions.

Correlation of 4-back accuracy with decline in 2-back accuracy

Although the 4-back was intended to induce overload effects associated with poor accuracy, there was notable variability in 4-back accuracy, with five out of 20 subjects achieving accuracies of 70% or greater. Based on our hypothesis that cognitive overload during the 4-back condition impairs subsequent 2-back performance, we examined the correlation between 4-back accuracy and subsequent percent change in 2-back accuracy (Fig. 2d). Accuracy on the 4-back task correlated positively with percent change in 2-back accuracy (Pearson’s r = .46, p = .03; Spearman’s rho = .44, p = .06), indicating that poor 4-back performance was associated with a greater decline in subsequent 2-back accuracy.
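The two correlation coefficients reported above can be computed as in the following Python sketch, in which Spearman's rho is obtained as the Pearson correlation of tie-averaged ranks. The function names and data values are hypothetical, and significance testing is not included.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical 4-back accuracies and percent changes in 2-back accuracy.
four_back_accuracy = [0.45, 0.52, 0.58, 0.61, 0.66, 0.72, 0.80]
pct_change_2back = [-20.0, -6.0, -9.0, -2.0, 1.0, -3.0, 2.0]
print(round(pearson(four_back_accuracy, pct_change_2back), 3))
print(round(spearman(four_back_accuracy, pct_change_2back), 3))
```

Reporting both coefficients, as in the text, indicates whether an association depends on a few extreme subjects: the rank-based coefficient is less sensitive to outliers than Pearson's r.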

The N-back task activates a fronto-parietal network

As expected, task performance at the 1-back, 2-back/1, 2-back/4, and 4-back conditions relative to 0-back activated a network that included DLPFC as well as regions in medial prefrontal and parietal cortex (Fig. 3a–c).

Fig. 3

N-back activation maps at different loads. a–d Brain activation during the 1-back, 2-back/1, 4-back, and 2-back/4 conditions. Maps show areas activated for each condition relative to the 0-back condition. e Brain activation map depicting areas of significantly decreased activity during the 2-back/4 condition relative to the 2-back/1 condition. Statistical maps were thresholded at p < .05 (FDR-corrected) with an extent threshold of 3 voxels

Effect of 4-back exposure on subsequent brain activity during 2-back

The 2-back/4 condition activated the same fronto-parietal network seen in the other conditions, but to a lesser degree and extent (Fig. 3d). To test the hypothesis that brain activity was impaired following 4-back exposure, the 2-back/4 condition was directly contrasted with the 2-back/1 condition. The 2-back/1 minus 2-back/4 contrast revealed widespread hypoactivity following 4-back exposure, with significant clusters in prefrontal cortex (including DLPFC), cingulate cortex, parietal cortex, cerebellum, and thalamus (Fig. 3e, Table 1). No significant activation differences were observed in the reverse 2-back/4 minus 2-back/1 contrast.

Table 1 Areas showing reduced activity during 2-back performance following exposure to the 4-back condition (2-back/1 minus 2-back/4). Thresholded voxel-wise at p < .05, FDR-corrected. Only clusters with corrected p < .05 are shown. Voxel-level statistics are for the maximally significant voxel within the cluster. Regions listed were those with 10 or more active voxels within the cluster

Region of interest analysis of load-dependent activity in DLPFC, amygdala, and ventromedial PFC


Repeated measures MANOVA indicated a trend toward a main effect of task condition (F(2,17) = 3.33, p = .06) on DLPFC activity, as well as a significant condition x hemisphere interaction (F(2,17) = 14.25, p < .001). Left DLPFC exhibited an inverted-U shaped load-response curve, while right DLPFC showed monotonically increasing activity with increasing load (Fig. 4a). Orthogonal polynomial contrasts confirmed that overall, the quadratic trend exhibited a significant condition x hemisphere interaction (F(1,18) = 21.80, p < .001). To parse these interactions, further analyses were conducted separately for left and right DLPFC. Repeated measures MANOVA in left DLPFC indicated a significant main effect of load (F(2,17) = 4.17, p = .03), with polynomial contrasts indicating a significant quadratic (F(2,17) = 6.71, p = .02) effect. While activity in the 2-back and 4-back conditions did not differ from each other, both showed greater activity than the 1-back (2-back vs. 1-back: F(1,18) = 8.75, p = .008; 4-back vs. 1-back: F(1,18) = 4.56, p = .05). Repeated measures MANOVA in right DLPFC indicated a trend toward a main effect of condition (F(2,17) = 3.03, p = .075), with polynomial contrasts showing a positive linear trend (F(1,18) = 6.37, p = .02).

Fig. 4

Load-dependent activation in regions-of-interest. a Load-dependent activation of left and right DLPFC (mean ± SEM). b Comparisons between 2-back/1 and 2-back/4 in DLPFC (*p = .009, **p = .001). c Load-dependent suppression of left and right amygdala activity (mean ± SEM). d Load-dependent suppression of left and right VMPFC activity (mean ± SEM)

For comparisons between 2-back/1 and 2-back/4, the two-way repeated measures MANOVA revealed significant main effects of condition (F(1,18) = 13.16, p = .002) and hemisphere (F(1,18) = 7.36, p = .01), and a significant condition x hemisphere interaction (F(1,18) = 7.27, p = .02). Follow-up paired t-tests revealed lower activity in 2-back/4 relative to 2-back/1 in both left (t(18) = 2.93, p = .009) and right (t(18) = 4.03, p = .001) DLPFC (Fig. 4b). Right DLPFC activity was lower than left DLPFC activity for the 2-back/4 (t(18) = 3.37, p = .003) condition and trended lower for the 2-back/1 condition (t(18) = 1.94, p = .068).


Both left and right amygdala exhibited load-response curves opposite in direction to those of the DLPFC (Fig. 4c), with decreased activity at higher loads relative to the 0-back condition. Two-way repeated measures MANOVA indicated a significant main effect of load (F(2,17) = 14.75, p < .001), but no significant hemisphere or load x hemisphere effect. Post-hoc contrasts indicated significantly reduced activation in the 2-back (F(1,18) = 30.93, p < .001) and 4-back (F(1,18) = 23.75, p < .001) conditions relative to 1-back.

For comparisons of amygdala activity between 2-back/1 and 2-back/4, two-way repeated measures MANOVA did not reveal significant effects of condition, hemisphere, or their interaction.


Load response curves for both left and right VMPFC were similar to the amygdala response curves, exhibiting load-related reduction of activity (Fig. 4d). Two-way repeated measures MANOVA revealed significant main effects of load (F(2,17) = 9.87, p = .001) and hemisphere (F(1,18) = 5.28, p = .03), but no significant interaction. The hemisphere effect reflected greater reduction of activity in left VMPFC than in right VMPFC. The condition effect reflected a monotonic reduction of VMPFC activity across loads, as indicated by a negative linear effect (F(1,18) = 19.83, p < .001). Moreover, post-hoc contrasts showed reduction of VMPFC activity to be greater in the 2-back (F(1,18) = 8.75, p = .008) and 4-back (F(1,18) = 19.83, p < .001) conditions relative to the 1-back condition, and in the 4-back condition relative to the 2-back condition (F(1,18) = 9.51, p = .006).

For comparisons of VMPFC activity between 2-back/1 and 2-back/4, two-way repeated measures MANOVA did not show a significant main effect of condition or hemisphere, but did show a significant condition x hemisphere interaction (F(1,18) = 7.02, p = .02). Activity in the 2-back/4 condition appeared to be less suppressed than in the 2-back/1 condition in both left and right VMPFC, with this effect being greater in the left than the right (data not shown). However, paired t-tests for differences between the 2-back/1 and 2-back/4 conditions did not reach significance in either left or right VMPFC, nor was the left-right difference significant in the 2-back/1 or 2-back/4 conditions.

Correlation of percent change in 2-back accuracy with amygdala activity

As predicted, percent change in 2-back accuracy exhibited a negative correlation with mean 4-back activity in bilateral amygdala (Pearson’s r = −.56, p = .01; Spearman’s rho = −.45, p = .05) (Fig. 5a). These correlations indicated that decreased suppression of amygdala activity during the 4-back condition was associated with greater subsequent decline in 2-back accuracy. Comparable results were found using a voxel-based approach (maximum left amygdala voxel Z = 3.04, p = .013; maximum right amygdala voxel Z = 3.02, p = .013) (Fig. 5b).

Fig. 5

Correlation of amygdala activity in the 4-back minus 0-back condition with percent change in 2-back accuracy. a Analyses based on mean activity of voxels within bilateral amygdala. b Voxel-based correlational analysis within Talairach-defined bilateral amygdala mask (probability threshold of p < .05, FDR- and small volume corrected). Image shown is at y = −4 mm

Functional connectivity between amygdala, VMPFC and DLPFC

Amygdala activity positively correlated with activity in VMPFC (maximum voxel Z = 4.69, p < .001), as well as areas in medial temporal lobe and lateral temporal cortex (Fig. 6a). Amygdala activity negatively correlated with activity in DLPFC (maximum voxel Z = 6.20, p < .001), as well as areas of bilateral ventrolateral cortex, dorsomedial cortex, parietal cortex, and cerebellum (Fig. 6b). DLPFC activity was also negatively correlated with VMPFC activity (maximum voxel Z = 5.51, p < .001, not shown).

Fig. 6

Functional connectivity maps for the amygdala across experimental conditions (FDR-corrected threshold of p < .05). a Brain areas showing positive correlation with amygdala activity. b Brain areas showing negative correlation with amygdala activity

Psychophysiologic interaction (PPI) analyses

Second-level random effects analyses did not reveal any significant differences in the amygdala-DLPFC functional relationship (i.e. regression line slope) between the 1-back and 4-back conditions. However, across subjects, percent change in 2-back performance correlated with the PPI between left amygdala and a 14-voxel region in left DLPFC (Pearson’s r = .70, p = .001; Spearman’s rho = .55, p = .015) (Fig. 7a). Subjects with a more negative amygdala-DLPFC regression line slope during 4-back as compared to 1-back exhibited a more negative percent change in accuracy from 2-back/1 to 2-back/4, corresponding to greater vulnerability to the deleterious effects of 4-back exposure. In contrast, subjects with a more positive (or less negative) amygdala-DLPFC regression line slope during 4-back as compared to 1-back exhibited a smaller decline in 2-back accuracy, corresponding to decreased vulnerability. To aid in conceptualizing this result, subjects were divided into tertiles based on percent change in 2-back accuracy. The six subjects with the greatest decline in 2-back accuracy (“vulnerable group”) were compared to the six subjects with the least decline (“resilient group”). The resilient group showed a positive PPI between left amygdala and left DLPFC for the 4-back minus 1-back condition contrast. Inspection of the individual condition slopes for this group indicated that the activities of the amygdala and DLPFC were inversely related (negative slope) in the 1-back condition but directly related (positive slope) in the 4-back condition (Fig. 7b). In contrast, the vulnerable group showed a negative PPI between left amygdala and left DLPFC for the 4-back minus 1-back contrast. These subjects exhibited an inverse relationship in the 4-back condition. The 1-back vs. 4-back PPI between left amygdala and left DLPFC significantly differed between the vulnerable and resilient groups (p = .02, Mann-Whitney U-test). 
For right amygdala-DLPFC, the PPI correlational analysis did not yield any statistically significant activations, and no further analysis was performed.
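The quantity at the centre of the PPI result, the change in the amygdala-DLPFC regression line slope between conditions, can be sketched as follows. This is a deliberately simplified illustration: it fits a separate straight line per condition and subtracts the slopes, whereas a full PPI analysis estimates the interaction term within a general linear model. The time series and function names are hypothetical.

```python
def ols_slope(x, y):
    """Slope of the least-squares line predicting y from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def slope_difference(seed_a, target_a, seed_b, target_b):
    """Slope of target on seed in condition B minus the slope in condition A."""
    return ols_slope(seed_b, target_b) - ols_slope(seed_a, target_a)

# Hypothetical signals for one subject (NOT study data).
amygdala_1back = [0.1, 0.2, 0.3, 0.4]
dlpfc_1back = [1.0, 0.8, 0.6, 0.4]   # inverse relationship in 1-back
amygdala_4back = [0.1, 0.2, 0.3, 0.4]
dlpfc_4back = [0.2, 0.3, 0.4, 0.5]   # direct relationship in 4-back

ppi_like = slope_difference(amygdala_1back, dlpfc_1back,
                            amygdala_4back, dlpfc_4back)
print(f"4-back minus 1-back slope difference: {ppi_like:.2f}")
```

In this toy case the slope moves from negative in 1-back to positive in 4-back, the pattern described for the resilient group; a subject whose slope became more negative during 4-back would yield a negative difference, as described for the vulnerable group.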

Fig. 7

Psychophysiologic interaction between left amygdala and left DLPFC. a Voxel-based analysis showing left DLPFC region where difference in functional connectivity with left amygdala between 1-back and 4-back conditions correlated with percent change in 2-back accuracy (maximum voxel at −40, 40, 36; Z = 4.23, p = .002 FDR-corrected). b Slopes of the amygdala-DLPFC interaction during 1-back and 4-back for the 6 subjects showing the least decline in 2-back accuracy (resilient group) and the 6 subjects with the greatest decline (vulnerable group). * p = .02, Mann-Whitney U-test
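For groups as small as six subjects each, the Mann-Whitney U-test can be evaluated exactly by enumerating every way of splitting the pooled values into two groups. The sketch below shows one way to do this; the group values are hypothetical, and the enumeration approach is an illustrative assumption rather than a description of the software used in the study.

```python
from itertools import combinations

def mann_whitney_u(a, b):
    """U for sample a: count of pairs with a_i > b_j, with ties counted as 0.5."""
    return sum((x > y) + 0.5 * (x == y) for x in a for y in b)

def exact_two_sided_p(a, b):
    """Exact two-sided p-value by enumerating all relabelings of the pooled data."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    center = n_a * n_b / 2  # expected U under the null hypothesis
    observed = abs(mann_whitney_u(a, b) - center)
    hits = total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        grp_a = [pooled[i] for i in idx]
        grp_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count relabelings at least as extreme as the observed split.
        if abs(mann_whitney_u(grp_a, grp_b) - center) >= observed - 1e-12:
            hits += 1
    return hits / total

# Hypothetical slope differences for two groups of six (NOT study data).
vulnerable = [-1.2, -0.9, -0.7, -0.4, -0.3, 0.1]
resilient = [-0.5, 0.0, 0.2, 0.4, 0.6, 0.9]

p = exact_two_sided_p(vulnerable, resilient)
print(f"U = {mann_whitney_u(vulnerable, resilient)}, exact two-sided p = {p:.3f}")
```

With six subjects per group there are 924 possible splits, so the smallest attainable two-sided p-value is 2/924, reached only when the groups do not overlap at all.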

Multiple linear regression model

Using a multiple linear regression model, percent change in 2-back accuracy was regressed on the three separate predictive factors identified above: 1) 4-back accuracy, 2) mean activity of voxels in bilateral amygdala during the 4-back condition, and 3) beta coefficient (i.e. slope difference) from the 4-back vs. 1-back PPI of left amygdala and left DLPFC activity. The multiple R for the full model was highly significant (R = .90, F(3,15) = 20.27, p < .001), accounting for 80% of the variance in 2-back performance decline. Each of the three standardized beta coefficients for the predictor variables was statistically significant in the full model (4-back accuracy: beta = .52, R² change = .22, p = .001; amygdala deactivation: beta = −.51, R² change = .20, p = .001; PPI: beta = .44, R² change = .16, p = .003), and tolerance exceeded .78 for each of the predictor variables, indicating that each made relatively independent contributions to the prediction of 2-back accuracy decline.
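Tolerance, the collinearity measure cited above, is conventionally defined for each predictor as 1 − R², where R² comes from regressing that predictor on the remaining predictors; values near 1 indicate little overlap. The sketch below computes it with a small ordinary least-squares routine. The predictor values and function names are hypothetical and serve only to illustrate the calculation.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def r_squared(y, predictors):
    """R^2 from least-squares regression of y on the predictors plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + [list(p) for p in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * col[i] for b, col in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def tolerances(predictors):
    """Tolerance of each predictor: 1 - R^2 from regressing it on the others."""
    return [1.0 - r_squared(p, predictors[:j] + predictors[j + 1:])
            for j, p in enumerate(predictors)]

# Hypothetical predictor values for eight subjects (NOT study data).
accuracy_4back = [55.0, 60.0, 62.0, 70.0, 71.0, 75.0, 80.0, 88.0]
amygdala_act = [0.3, -0.1, 0.2, -0.4, 0.1, -0.2, -0.3, 0.0]
ppi_beta = [-0.5, 0.4, -0.2, 0.1, 0.6, -0.3, 0.2, 0.5]

for name, tol in zip(["4-back accuracy", "amygdala", "PPI"],
                     tolerances([accuracy_4back, amygdala_act, ppi_beta])):
    print(f"{name}: tolerance = {tol:.2f}")
```

Solving the normal equations directly is adequate for a three-predictor illustration; for larger or nearly collinear designs a numerically stabler decomposition would be preferable.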

Discussion

In this study, overloading the working memory system with a high-load 4-back task induced a significant decline in subsequent working memory function and robust decreases in brain activity in various regions including relatively large areas of the prefrontal cortex. Although the mean decline in 2-back accuracy across subjects was modest, there was notable individual variability in the degree of decline, with approximately a third of the subjects showing declines of greater than 5%, and no subjects showing improvements of greater than 5%. We demonstrated three separate factors predicting vulnerability to the overload effects: 1) poor accuracy on the 4-back condition, indicative of cognitive overload, 2) a relative lack of amygdala suppression in response to higher working memory loads, and 3) a more negatively sloped regression line relating amygdala activity to DLPFC activity in the 4-back condition relative to the 1-back control condition. The multiple regression analysis indicated that these factors acted independently and, taken together, accounted for approximately 80% of the variance in the decline in 2-back accuracy. These results are consistent with a model in which vulnerability to working memory performance declines subsequent to excessive load is dependent upon 1) the degree to which capacity constraints are exceeded, 2) the degree of amygdala response to the overload, and 3) the degree of inverse amygdala-DLPFC coupling during the overload.

The hypoactivity observed in the 2-back/4 condition relative to the 2-back/1 condition occurred in regions including DLPFC and dorsomedial PFC that were active during N-back performance, suggesting involvement of task-relevant functional circuitry. As the DLPFC has been shown to have a specific role in working memory, and dorsomedial PFC is involved in cognitive control and performance monitoring (Botvinick et al. 2004; Ridderinkhof et al. 2004), it is not surprising that decreased activity in these regions was associated with impaired accuracy. However, the magnitude and extent of the clusters seen in the 2-back/1 minus 2-back/4 contrast, particularly in the right lateral prefrontal cortex (Fig. 3e), suggest that other mechanisms besides task-specific hypoactivity may be operating as well. One possibility is that reduced activity in the 2-back/4 condition was in part due to decreased functioning of brain circuitry involved in limbic suppression. The right lateral prefrontal cortex has been shown to be particularly important in the cognitive modulation of emotional responses (Hariri et al. 2003; Levesque et al. 2003). Voluntary regulation of negative affect has also been associated with activations in right dorsomedial PFC, right DLPFC, bilateral lateral orbitofrontal cortex, right ventrolateral PFC, and bilateral dorsal anterior cingulate (Phan et al. 2005). All of these regions exhibited decreased activation during the 2-back/4 condition relative to the 2-back/1 condition in our study.

Our hypothesis regarding the role of the amygdala was confirmed in that amygdala activity during the 4-back condition correlated positively with decline in subsequent 2-back performance. The physiologic mechanisms underlying this correlation may be related to mechanisms involved in emotional interference of verbal working memory (Gray 2001). As the task itself was emotionally neutral, the variability in amygdala response across subjects may have been related to the subjects’ differential emotional response to the experience of task failure. To the extent that this is so, amygdala involvement in the face of cognitive overload may depend on psychological factors such as attributional style (Peterson and Seligman 1984) and be a marker for more general vulnerability to stress-related cognitive impairments. Interestingly, even though the subjects in our experiment reported greater frustration at higher loads, there was an overall tendency for amygdala activity to be lower at higher loads compared to lower loads, a signature notably distinct from most paradigms designed to elicit an emotional response. This could have been due to an increased shift of brain resources toward prefrontal circuitry in response to greater cognitive demands, which is consistent with our finding that relative failure to suppress amygdala activity was associated with performance declines. Furthermore, amygdala activity did not differ significantly between the 2-back/1 and 2-back/4 conditions. One possible explanation is that amygdala involvement during the 4-back condition led to prefrontal effects that carried over and affected the subsequent 2-back condition. For example, prefrontal resources may have been diverted toward amygdala suppression, resulting in similar amygdala activity between the 2-back/1 and 2-back/4 conditions.
This notion is further supported by our PPI analysis, which indicated that changes in the amygdala-DLPFC relationship between the 1-back and 4-back conditions appear to affect behavioral performance above and beyond the effects of amygdala activation alone. More specifically, subjects who were less able to recover normal function following 4-back exposure exhibited greater inverse amygdala-DLPFC connectivity during the difficult 4-back condition relative to the control 1-back condition. The presence of a strongly coupled inverse linkage may have facilitated DLPFC interference by limbic activity (Dolcos and McCarthy 2006) and/or diversion of prefrontal resources to suppressing amygdala activity via top-down inhibition (Pessoa et al. 2005). The VMPFC-amygdala correlation and VMPFC-DLPFC anti-correlation we observed are consistent with previous work (Ghashghaei and Barbas 2002) and support the possibility that indirect connections traversing the VMPFC could mediate the DLPFC-amygdala connectivity observed in our study. Rather than passively transmitting limbic information to the DLPFC, the VMPFC may play a critical role in signaling cognitive appraisals to the dorsal raphe nuclei (Amat et al. 2005), which together with the ventral tegmental area and locus coeruleus (Krystal et al. 1989; van der Kolk et al. 1985), may influence the expression of cognitive impairment through ascending modulation of limbic and cortical activity.

One possible confound we considered when designing this study was the possibility that some effects could be due to mental fatigue associated with performing a more demanding 4-back task compared to an easier 1-back task. We minimized the potential of this occurring by using an alternating block pattern, a rest period between blocks, and a 4-back block duration of 40 s that is considerably shorter than durations of other cognitively demanding tasks used to experimentally induce mental fatigue (Cook et al. 2007; Tartaglia et al. 2008). We also did not observe significant reaction time differences between the 2-back/4 and 2-back/1 blocks that would suggest differences in overall task attention or motivation. In the future, a more careful study of subjective cognitive and emotional responses to the high-load condition may be informative. A more inherent limitation of this study is the difficulty in completely isolating initial input effects (i.e. varying working memory load) from recurrent feedback effects such as the emotional impact of failure experienced during the 4-back condition. As emotional feedback is an intrinsic component of human experience and difficult to control experimentally in an overload condition, we intentionally chose to not inform subjects of their performance, thus allowing for natural emotional responses and their impact to be part of the overall response. We therefore conceptualized “overload” in a functional sense as initial task conditions that exceed the brain’s ability to perform accurately, with the understanding that feedback responses will come into play. In doing so, we were able to show in our regression model that amygdala activity and amygdala-DLPFC interactions, as intermediate responses, accounted for a portion, but not all, of the overall behavioral effect. One other limitation of our study is the lack of quantitative temporal information characterizing the observed neural and behavioral effects.
The time course of the impact on accuracy was presumed in our study to be less than two blocks long (the time between the 2-back/1 and 2-back/4 conditions). Quantitative measurement of effect duration would require more data than was collected for this study. Also, it would be interesting to examine how the duration of the overload condition affects both the magnitude and duration of accuracy decline, particularly given that adaptive stress responses in animals appear to vary depending on whether the stressor is acute or chronic in duration (McEwen 2004). More generally, greater understanding of the mechanisms and dynamics related to overload-induced effects in neural systems may provide further insight into the physiologic processes by which capacity overload can lead to adaptive or maladaptive behavioral changes.