Introduction

Despite more than 25 years of intensive research there is still a vigorous and ongoing debate concerning the specific aspects of complex and adaptive behaviors that are mediated by brain dopamine (DA) function (Ikemoto and Panksepp 1999; Redgrave et al. 1999; Salamone and Correa 2002; Schultz 2002; Berridge and Robinson 2003; Joseph et al. 2003; Wise 2004; Kelley et al. 2005; Salamone et al. 2005; Young et al. 2005). Much of the attention is focused on the nucleus accumbens (NAc) and a consensus has formed around the theory that DA innervation of this structure plays a key role in incentive motivation, a Pavlovian conditioned appetitive state that can influence the vigor of approach behavior (Fibiger and Phillips 1986; Robbins et al. 1989; Berridge and Robinson 1998; Cardinal et al. 2002; Parkinson et al. 2002). Consistent with the importance of conditional stimuli (CS+) in the initiation of approach behavior, visual and olfactory stimuli associated with natural rewards such as food or sexually receptive conspecifics can evoke significant increases in DA efflux in the NAc that precede similar changes observed during consummatory behavior (Fiorino et al. 1997; Ahn and Phillips 1999, 2002). Depletion or antagonism of DA function in the NAc eliminates or diminishes the influence of Pavlovian CSs on the level of instrumental responding, i.e., Pavlovian to instrumental transfer (Cardinal et al. 2002; Parkinson et al. 2002), and additionally, shifts response choice from that with higher work demand and a food reward of greater value to that with lower work cost and lower reward value (Salamone et al. 1994; Salamone and Correa 2002).

Contemporary views of instrumental conditioning propose that performance of an instrumental task is controlled by two distinct processes. Based on carefully designed experiments, Dickinson et al. (1995) have demonstrated that during initial learning trials (i.e., after exposure to 120 outcomes), an animal’s performance is based on the knowledge (or expectation) that an instrumental action will lead to a specific biologically significant outcome. After many more trials (i.e., after exposure to 360 outcomes), responses gradually shift from being outcome-dependent to habit-based. It is only in the early action–outcome controlled stage that instrumental performance is sensitive to manipulations that alter the incentive value of the outcome, underscoring the remarkable ability of rats to acquire and encode information relating current incentive value to an action–outcome contingency in as few as 120 outcome trials. In studies using outcome devaluation and Pavlovian to instrumental transfer tests, lesions of the basolateral amygdala have been reported to impair the capacity of rats to encode the relation between a specific action and the value of an outcome (Corbit and Balleine 2005).

Given the generally accepted role of DA in the NAc in Pavlovian-based incentive motivation, the question arises as to whether DA activity in this region is involved in the action–outcome and/or habit-based stages of instrumental responding. A role for either the core or shell region of the NAc in the formation of action–outcome associations has not been confirmed (Balleine and Killcross 1994; Corbit et al. 2001; de Borchgrave et al. 2002), but deficits in the acquisition of instrumental responding have been reported after blockade of the NMDA class of glutamate receptors in the NAc core (Kelley et al. 1997). In a recent study, Yin et al. (2006) examined DA function during early action–outcome as distinct from later habit-based stages of instrumental responding. Mice with a knockdown of the DA transporter and chronically elevated levels of DA showed no deficits in acquisition of an instrumental task, and responding early in training was still sensitive to tests of outcome devaluation (Yin et al. 2006). However, in rats with bilateral 6-OHDA lesions of the nigrostriatal DA system, responding on instrumental tasks remained sensitive to reward devaluation despite extensive training sessions (Faure et al. 2005). Thus, the nigrostriatal DA system, and more specifically, its innervation of the posterior lateral striatum, appears to be necessary for transition of instrumental conditioning from an action–outcome stage to a habit-based stage.

Important insights into the role of DA transmission at different stages of instrumental behavior may be gained by examining in vivo changes in DA levels in major areas of DA innervation. Therefore, we conducted microdialysis experiments in the NAc and mediodorsal (MD) striatum at early (5th day) and later (16th day) stages of instrumental learning employing a protocol used successfully to demonstrate that responding after 120 but not 360 reinforced responses is sensitive to outcome devaluation and is therefore action–outcome as distinct from habit-based (Dickinson et al. 1995). Given the finding that quinolinic acid or NMDA-induced cytotoxic lesions of the NAc failed to alter suppression of instrumental responding after outcome devaluation (de Borchgrave et al. 2002), it is unlikely that activity in the NAc is involved in instrumental incentive learning. Rather, the finding that lesions restricted only to the NAc completely abolished Pavlovian to instrumental transfer is consistent with the involvement of DA function in this nucleus in conditioned appetitive states. Therefore, we hypothesize that DA efflux in the NAc may be comparable during both early action–outcome and later habit-based stages of instrumental responding. We also examined changes in DA levels in the MD striatum, as a control site for generalized activity; hence, we predicted no significant changes in DA efflux associated with instrumental responding. The design of this study also incorporated a “within-trial” extinction phase to test the hypothesis that DA efflux in the NAc but not MD striatum would be increased significantly by presentation of a Pavlovian CS+ paired previously with food pellets during instrumental responding in extinction. The relationship between response rate and magnitude of DA efflux in either NAc or MD striatum was also of interest.

Materials and methods

Surgery

Long–Evans male rats (Charles River, Canada) weighing 280–310 g were implanted bilaterally with stainless steel guide cannulae (19 gauge, 15 mm) under anesthesia induced by xylazine (7 mg/kg) and ketamine hydrochloride (100 mg/kg) delivered intraperitoneally. Cannulae were implanted 1 mm below dura immediately above the NAc [in millimeter: +1.7 anteroposteriorly (AP) and ±1.1 mediolaterally (ML) from bregma] or MD striatum (+1.2 AP, 2.5 ML) and secured with dental acrylic and four stainless steel screws. Stylets maintained patency of the cannulae until probe implantation. An additional sham-cannula was embedded within the back half of the acrylic head cap for purposes of habituation (see Behavioral apparatus and training).

Immediately after surgery, rats were moved to a reverse light-cycle (lights on 7 a.m. to 7 p.m.) colony room maintained at 20°C and housed individually in plastic bins lined with corncob bedding. Novel objects (e.g., egg cartons and paper rolls) were placed weekly in the cages to promote exploratory and play behavior. Rats were handled and weighed by the experimenter on a daily basis. Four days after surgery, rats were placed on a food restriction schedule for the duration of the experiment, maintaining their body weight at ∼85% of their free-feeding weight. The daily ration of food (20–25 g Purina Rat Chow) was given to the rats in their home cages following each day’s operant responding session. Water was available at all times except during the operant training and testing sessions.

Behavioral apparatus and training

Training and experiment sessions were conducted in a Plexiglas chamber (30×23×23 cm) with wire-mesh flooring. It was equipped with a retractable lever, a dispenser that delivered 45 mg Noyes pellets, and a light source (1 W, 12 V) located on the wall opposite the location of the lever. Pellets were dispensed into a photocell-equipped magazine located just left of the lever. The chamber was enclosed in a sound-attenuated ventilated box (Coulbourn Instruments; Allentown, PA, USA) which had a small hole in the ceiling to allow passage of dialysis lines.

Training began 8 days into the food-deprivation schedule. Before a rat was placed in a testing chamber, a stainless steel coil was attached to the sham-cannula. This allowed the rat to habituate to the weight and feel of being tethered to the stainless steel coil which normally sheathes microdialysis lines during experiments. Each session began with rats being placed in operant chambers under baseline conditions (coil attached, lever retracted, and light off), the length of which was varied from day to day. Illumination of the light and insertion of the lever signaled the availability of pellets contingent on lever presses. A computer was used to control the equipment and record the number of lever presses and nose poke entries into the magazine.

At the start of the study, rats were trained to approach the magazine to retrieve 30 pellets delivered on a random-time 60-s schedule. Instrumental training sessions began the next day (day 1) on a RI 2-s schedule of pellet delivery. The schedule was changed to RI 15-s on day 2 and then to RI 30-s on day 3, and remained so for the remainder of the training and microdialysis sessions. Each instrumental training session was terminated when rats had responded for 30 pellets, except on days 4 and 15 when rats were allowed to press for 60 pellets (see Fig. 2). To minimize the effect of stress before microdialysis tests, rats remained overnight in the chambers under baseline conditions with water, after the training session on day 2.
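The logic of a random-interval (RI) schedule can be illustrated with a short simulation. The sketch below is not the control program used in this study; it assumes, for illustration only, that a pellet becomes available after an exponentially distributed delay with a mean of 30 s and that the first lever press after that moment is reinforced. The function name, the seed, and the one-press-per-second rat are hypothetical.

```python
import random


def simulate_ri_session(press_times, mean_interval_s=30.0, max_pellets=30, seed=0):
    """Return the times (s) of reinforced lever presses under an RI schedule.

    Assumption (not from the paper): a pellet is "armed" after an
    exponentially distributed delay with the given mean, and the first
    press at or after the arming time is reinforced.
    """
    rng = random.Random(seed)
    rate = 1.0 / mean_interval_s
    armed_at = rng.expovariate(rate)  # when the first pellet becomes available
    reinforced = []
    for t in sorted(press_times):
        if t >= armed_at:
            reinforced.append(t)
            if len(reinforced) == max_pellets:
                break  # session ends once the pellet quota has been earned
            armed_at = t + rng.expovariate(rate)  # arm the next pellet
    return reinforced


# Hypothetical rat pressing once per second for two hours.
presses = [float(s) for s in range(1, 7201)]
rewarded = simulate_ri_session(presses)
```

Under these assumptions, pressing faster earns pellets only marginally sooner, because availability is governed by elapsed time rather than by the number of responses; this is why response rate and pellets obtained can dissociate on interval schedules.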

Microdialysis probes and high pressure liquid chromatography system

Microdialysis probes were constructed as previously described (Ahn and Phillips 1999; Fiorino et al. 1997). They were concentric in design with a 2-mm semipermeable membrane (340 μm outer diameter, 65 kD MW cut-off, Filtral 12, Hospal; Nürnberg, Germany) with PE50 inlet and silica outlet tubing. A probe assembled in this manner typically had, at 21°C, in vitro recoveries of ∼22% DA. Once a probe was implanted in the brain through the guide cannula (see “Microdialysis experiments” section below for implant coordinates), a cylindrical brass collar secured the probe in place. The inlet tubing was connected to a liquid swivel (Instech 375s; Plymouth Meeting, PA, USA) which was mounted on top of the Coulbourn box. Both inlet and outlet tubing were encased in a protective stainless steel coil which extended from the liquid swivel to the brass collar. A 2.5-ml gas-tight syringe (Unimetrics) and syringe pump (Model 22, Harvard Apparatus; South Natick, MA, USA) were used to perfuse a modified Ringer’s solution (in millimolar: 10 sodium phosphate buffer, 1.3 CaCl2, 3.0 KCl, 1.0 MgCl2, 147 NaCl, pH 7.4) through the probe at 1 μl/min.

DA in microdialysis samples was separated by high pressure liquid chromatography (HPLC) and quantified by electrochemical detection. The HPLC system consisted of, in sequence of flow, a Bio-Rad pump (Hercules, CA, USA), an SSI pulse damper (State College, PA, USA), a Valco Instruments two-position auto-injector (EC10W; Houston, TX, USA), a Beckman reverse-phase column (Ultrasphere, ODS 5 μm, 15 cm, 4.6 mm internal diameter; Fullerton, CA, USA), an ESA guard cell (Model 5020; Chelmsford, MA, USA), an ESA analytical cell (Model 5011), an ESA Coulochem II Electrochemical detector, and a dual channel chart recorder. The electrochemical detection parameters were: +450 mV for the oxidation channel, −300 mV for the reduction channel, and −450 mV for the guard cell output. The mobile phase (in millimolar: 83 sodium acetate buffer, 27 EDTA, and 1.30 sodium octyl sulfate at pH 3.5, 10% methanol) flowed through the system at 1 ml/min.

Microdialysis experiments

In all experiments, each rat was tested with microdialysis on days 5 and 16. The side of probe implantation (i.e., left or right brain structure) was counterbalanced for each experiment day. The two microdialysis sessions were conducted using an identical protocol. The day before an experiment, rats were implanted unilaterally with probes in the NAc (exposed membrane spanned −6.0 to −8.0 mm DV from dura) or MD striatum (−4.0 to −6.0 mm DV from dura) after the day’s training session and kept overnight (∼16–18 h) in the test chamber with water under baseline conditions. The next morning, microdialysis samples were collected at 10-min intervals (10 μl volume) and analyzed immediately with HPLC. Baseline conditions were continued until four consecutive samples showed stable DA levels (i.e., <10% fluctuation between samples; times 1–4).
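The baseline stability criterion can be expressed as a simple check. The sketch below assumes one possible reading of "<10% fluctuation between samples", namely that each of the last four samples differs from the sample before it by less than 10% of that earlier value; the function name and the example concentrations are hypothetical and are not data from this study.

```python
def is_stable_baseline(samples_nM, tolerance=0.10, window=4):
    """Return True if the last `window` samples are stable.

    Assumption: "stable" means every consecutive pair within the window
    differs by less than `tolerance` relative to the earlier sample.
    """
    if len(samples_nM) < window:
        return False  # not enough samples collected yet
    recent = samples_nM[-window:]
    if any(value <= 0 for value in recent):
        return False  # relative change is undefined for non-positive values
    return all(
        abs(curr - prev) / prev < tolerance
        for prev, curr in zip(recent, recent[1:])
    )


# Hypothetical DA concentrations (nM) from successive 10-min samples.
stable = is_stable_baseline([2.5, 2.6, 2.55, 2.7])
unstable = is_stable_baseline([2.0, 2.5, 2.6, 2.55])
```

Other readings of the criterion (for example, every sample within 10% of the window mean) would require only a change to the comparison inside `all(...)`.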

In “Experiment 1”, the baseline period was followed by the concurrent illumination of the chamber and insertion of the lever. Rats were then allowed a 30-min period (times 5–7) during which they could lever press for food on a RI 30-s schedule. The session concluded with the retraction of the lever and a 40-min time-out period with the lights off (times 8–11). In “Experiment 2”, a 30-min extinction phase was incorporated (times 5–7) before a food reward phase. All aspects of the extinction phase were identical to previous rewarded training sessions except that lever presses did not lead to food delivery. A 20-min time-out period under baseline conditions (times 8–9) preceded a 10-min priming period when the light was on and five pellets were delivered randomly and noncontingently (time 10). The start of a food-rewarded phase (times 11–13) was marked by the insertion of the lever into the already illuminated chamber. During this period, rats were allowed to lever press for pellets on a RI 30-s schedule for 30 min. An experiment concluded with an additional 40-min postsession baseline period (times 14–17).
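For reference, the Experiment 2 session described above can be written as a lookup from 10-min sample index to phase. The phase boundaries are taken from the text; the variable and function names are illustrative only.

```python
SAMPLE_MINUTES = 10  # each dialysate sample spans 10 min

# (phase name, first sample index, last sample index), per the protocol text.
EXPERIMENT_2_PHASES = [
    ("baseline", 1, 4),
    ("extinction", 5, 7),
    ("time-out", 8, 9),
    ("priming", 10, 10),
    ("reward", 11, 13),
    ("post-session baseline", 14, 17),
]


def phase_of(time_index, phases=EXPERIMENT_2_PHASES):
    """Return the phase name for a given sample index (time 1, 2, ...)."""
    for name, first, last in phases:
        if first <= time_index <= last:
            return name
    raise ValueError(f"time index {time_index} is outside the session")


def phase_minutes(name, phases=EXPERIMENT_2_PHASES):
    """Return the duration of a phase in minutes."""
    for phase_name, first, last in phases:
        if phase_name == name:
            return (last - first + 1) * SAMPLE_MINUTES
    raise KeyError(name)
```

Here the "baseline" entry covers only the four stable samples (times 1–4) used in the analysis; the actual pre-session baseline period varied in length until the stability criterion was met.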

Histology

Rats were deeply anesthetized with chloral hydrate and intracardially perfused first with 0.9% NaCl and then phosphate-buffered formalin (3.7% formaldehyde). The removed brains were stored in 15% (w/v) sucrose in formalin for at least 24 h before being prepared as 50 μm coronal sections on 2% gel-coated slides. Cresyl violet staining was used to help verify placement of probe tracts. Only data from those rats with tracts in the shell/core region of the NAc (16 of 18 rats) and MD striatum (six of seven rats) of both hemispheres were included in the statistical analyses.

Data analyses

Neurochemical data were transformed into percentage of change from baseline (i.e., 0% representing the average concentration of the three samples preceding the final 4th baseline sample). Neurochemical data were analyzed using either a one-way (time) or two-way (day × time) repeated measures ANOVA followed by the Dunnett method of multiple comparisons, using the final baseline sample (time 4) as the control sample. The Huynh–Feldt correction for nonsphericity was applied to the degrees of freedom for all within-subject analyses. Comparisons between two means were assessed using paired t tests. A coefficient of determination (R²) was computed based on a linear regression of a scatter plot between lever presses/10 min (y-axis) and corresponding percent changes in DA efflux (x-axis), for data obtained on test days 5 and 16. Statistical analyses were performed using Systat or SPSS statistical packages.
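The baseline transformation can be sketched as follows. The example assumes that the input list holds one rat's DA concentrations in sample order starting at time 1, so that the first three entries are the samples that define 0%; the function name and the concentrations are hypothetical.

```python
def percent_change_from_baseline(concentrations_nM, n_baseline=3):
    """Express each sample as percent change from the baseline mean.

    Assumption: the first `n_baseline` samples (times 1-3) define 0%;
    the 4th baseline sample (time 4) is transformed like any other
    sample and serves as the control value in later comparisons.
    """
    baseline = sum(concentrations_nM[:n_baseline]) / n_baseline
    return [100.0 * (c - baseline) / baseline for c in concentrations_nM]


# Hypothetical concentrations (nM) for times 1-5 in one rat.
changes = percent_change_from_baseline([2.0, 2.0, 2.0, 2.1, 3.0])
```

With these made-up values the baseline mean is 2.0 nM, so a sample of 3.0 nM corresponds to a 50% increase and the time 4 control sample to a 5% increase.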

Results

Experiment 1

Changes in extracellular DA levels in the NAc were compared after limited and extended training sessions, while rats lever pressed for food reward on a RI 30-s schedule. As shown in Fig. 1, the rates of responding approximately doubled from days 5 to 16, even though rats obtained and consumed a similar number of rewards during the two sessions [43.3 pellets (∼2.0 g) on day 5 and ∼46.2 pellets (∼2.1 g) on day 16]. The number of magazine entries also did not differ between test days. Separate one-way ANOVAs revealed a significant main effect of time on DA efflux on day 5 (F(7,49)=12.038, p<0.001) and day 16 (F(7,49)=5.325, p<0.044). Further analyses indicated that on both days, there was a significant increase in DA efflux above their respective baselines that remained elevated for the 30-min duration of instrumental responding for food pellets (Dunnett’s, p<0.05). Despite the doubling of response rates from days 5 to 16, the pattern and magnitudes of DA efflux on the two test days were not statistically different (maximal increase of 69±16% on day 5 and 71±28% on day 16), as a two-way repeated measures ANOVA failed to show a significant interaction of day × time on DA efflux (F(7,98)=0.598, p=0.528). However, a paired t test showed that during the first 10 min (time 5), DA efflux was significantly higher on day 16 than day 5 (68±24 vs 30±12%, respectively; p<0.05). Basal values of DA in the NAc (uncorrected for recovery) were 2.72±0.61 and 2.19±0.29 nM for days 5 and 16, respectively, and were not statistically different from each other.

Fig. 1

Change in DA efflux in the NAc (line graph) during instrumental conditioning on day 5 (left panel) and day 16 (right panel) in the same group of rats (n=8). Dotted lines highlight the Reward phase in which food pellets were delivered on a RI 30-s schedule (times 5–7). Lever presses (gray bars) and magazine entries (black bars) are shown per 10 min bin. Data are represented as mean±SEM. * indicates significant difference from the final baseline value (time 4) according to Dunnett’s method of multiple comparisons, p<0.05. † indicates significant difference from the corresponding data point on day 5 according to a paired t test, p<0.05

Experiment 2

Instrumental behavior

As shown in Fig. 2, all rats in the NAc and MD striatal microdialysis groups learned to lever press for food pellets on a RI 30-s schedule and to retrieve the pellets, as indicated by the number of magazine entries. Training data for one rat in the MD striatal group were lost and not included in the following analyses. Instrumental behavior became more efficient through the initial days of training, and during the third and fourth training sessions, rats made significantly more lever presses than magazine entries. This increase in ratio of lever presses to magazine entries, from limited to extended training experience, may be explained by proposing that rats learned to use the auditory click made by the dispenser with the delivery of a pellet. Thus, rats learned to enter the magazine only when they heard the click. On the days before microdialysis tests (days 4 and 15), rats were allowed to press for 60 pellets, rather than the normal 30 pellets available on other days, and accordingly increased the rate of lever presses on these days. The mean number of lever press responses per training session across the RI 30-s schedule tended to be higher in the NAc group (from 327 to 835) than in the MD striatal group (from 367 to 436). However, an ANOVA revealed no main effect of group (F(1,11)=2.414, p=0.149). ANOVA of magazine entry data similarly indicated that counts were comparable between the NAc and MD striatal groups (F(1,11)=0.366, p=0.557).

Fig. 2

Instrumental performance of rats in the NAc or MD striatal group over 16 training sessions. Rats were trained to lever press for food pellets on a RI 2-s schedule the first day, RI 5-s the next day, and then RI 30-s for the remainder of the study. Each training session terminated upon delivery of 30 pellets (exception: *60 pellets delivered on days before test). Test days (days 5 and 16) were composed of two phases, Extinction and Reward, which lasted 30 min each. Lever presses (gray and striped bars) and magazine entries (black bars) are shown per training session. Data are represented as mean + SEM

On test days, rats were tested for instrumental responding in extinction and then with food reward. During the extinction phase, the pellet dispenser was disconnected from the magazine, but still produced an auditory click according to the RI 30-s schedule. During the rewarded phase, all rats consumed every pellet delivered during the lever press sessions. During the rewarded component of the experiment, rats consumed ∼45.7 pellets (∼2.1 g) on day 5 and ∼46.2 pellets (∼2.1 g) on day 16. The similarity in food consumption and the difference in magnitude of DA efflux (see below) across days 5 and 16 again support the view that the DA response is not a function of food reward. In both the NAc and MD striatal groups (Figs. 3 and 4), rate of instrumental performance approximately doubled from day 5 to day 16, during both the extinction and rewarded phases of the sessions. Over the 30-min extinction phase, rats displayed a typical decline in rate of lever presses, whereas over the 30-min reward phase, there was a slight increase in rate of responding. In the NAc group, the number of lever presses was significantly higher on day 16 than on day 5 during the first 10 min of the extinction (t test, p=0.003) and rewarded (t test, p=0.002) phases. In the MD striatal group, the lever press rates during the first 10 min of the extinction phase were comparable on days 5 and 16, but differed significantly between the 2 days during the first 10 min of the rewarded phase (p=0.038). In both the NAc and MD striatal groups, an ANOVA indicated that the number of magazine entries did not differ significantly between days 5 and 16.

Fig. 3

Change in DA efflux in the NAc (line graph) during instrumental conditioning on day 5 (left panel) and day 16 (right panel) in the same group of rats (n=8). Dotted lines highlight the Extinction phase during which an auditory CS+ was activated on an RI 30-s schedule in the absence of food pellet rewards (times 5–7) and a Reward phase in which food pellets were delivered on a RI 30-s schedule (times 11–13). P represents the period during which five priming pellets were delivered noncontingently. Lever presses (gray bars) and magazine entries (black bars) are shown per 10 min bin. Data are represented as mean±SEM. * Indicates significant difference from the final baseline value (time 4) according to Dunnett’s method of multiple comparisons, p<0.05. † Indicates significant difference from the corresponding data point on day 5 according to a paired t test, p<0.05. § Indicates significant difference from the final baseline value (time 4; t test, p<0.05)

Fig. 4

Change in DA efflux in the MD striatum (line graph) during instrumental conditioning on day 5 (left panel) and day 16 (right panel) in the same group of rats (n=6). See Fig. 3 for explanation of symbols. No statistically significant results were observed for the neurochemical data set. † Indicates significant difference from the corresponding behavioral score on day 5 according to a paired t test, p<0.05

DA efflux

Basal values of DA in the NAc (uncorrected for recovery) were 2.58±0.32 and 2.16±0.16 nM for days 5 and 16, respectively, and were not statistically different from each other. Basal values of DA in the MD striatum (uncorrected for recovery) were 2.71±0.23 and 2.95±0.34 nM for days 5 and 16, respectively, and also did not differ statistically.

In the NAc, the overall pattern of DA efflux across the different phases of the microdialysis test on day 5 appeared comparable to that observed on day 16 (Fig. 3), but statistical analyses revealed several key differences. Accordingly, an ANOVA identified a significant day × time interaction on DA efflux (F(13,182)=2.760, p=0.022), with a significant simple main effect of Time on day 5 (F(13,91)=8.690, p<0.001) and day 16 (F(13,91)=14.649, p<0.001). On both days, there was an increase in NAc DA levels during the initial 10 min of responding in extinction (10±7% above baseline on day 5 and 19±6% on day 16), but this increase was only significant after extended training on day 16 (paired samples t test, p=0.005). DA efflux then returned to baseline for the remainder of the extinction phase and time-out period. During the 10-min period preceding the rewarded responding phase, five pellets were noncontingently dispensed into the magazine; the purpose of this was to prime the rats to lever press again for food reward. During this period, DA levels did not differ from baseline values on day 5 (4±9%) but by day 16, were increased significantly above baseline (27±7%; Dunnett’s test, p<0.05). The reward phase of the session was accompanied by significant elevation of DA efflux on both days (maximal increase of 37±12% on day 5 and 83±15% on day 16; Dunnett’s test, p<0.05) that remained elevated after retraction of the lever and cessation of instrumental responding. DA levels then gradually declined towards baseline values over the remaining 60 min of the test session. During the initial 10 min of the reward phase (time 11), the magnitude of change on day 16 (83±15%) was significantly greater than the change in efflux observed on day 5 (37±12%; t test, p=0.013).

In the MD striatum, there were no statistically significant changes in DA levels during the entire instrumental responding session on both days 5 and 16 (Fig. 4). Despite performing lever presses and magazine entries at rates comparable to the NAc group, DA levels in this group showed only slight fluctuations around baseline.

Correlation between response rate and DA efflux

Based on a linear regression of a scatter plot between lever presses/10 min and percent change in DA efflux, R² values of 0.0014 for data from day 5 and 0.0065 on day 16 indicated that <1% of the variation in lever presses could be explained by a linear correlation between lever presses and DA levels, on both test days (Fig. 5).
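For a simple linear regression with one predictor, the coefficient of determination equals the squared Pearson correlation, which makes the "<1% of the variation" reading straightforward to check. The sketch below is a minimal pure-Python illustration and not the Systat or SPSS procedure used for the reported values; the function name and the example data are hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination for a least-squares line y = a + b*x.

    For a single predictor this equals the squared Pearson correlation.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return (sxy ** 2) / (sxx * syy)


# Made-up illustrations: a perfect linear relation and no linear relation.
perfect = r_squared([1, 2, 3, 4], [2, 4, 6, 8])
none = r_squared([1, 2, 3, 4], [1, -1, -1, 1])

# Converting the reported R² values to percent of variance explained.
percent_day5 = 100 * 0.0014
percent_day16 = 100 * 0.0065
```

On this scale the reported values correspond to roughly 0.14% and 0.65% of the variance in lever presses on days 5 and 16, respectively.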

Fig. 5

Correlation between instrumental response rates and NAc DA efflux. Scatter plot of lever presses/10 min (y-axis) and corresponding percent change in DA efflux (x-axis) for data obtained on day 5 (left panel) and day 16 (right panel). Shown on each graph is the best fit linear regression line and R² value

Histology

The locations of all microdialysis probes are presented in Fig. 6. The 2-mm lengths of the dialysis membrane were in the NAc (shell-core boundary) or medial aspect of the MD striatum (just dorsal and lateral of the anterior commissure).

Fig. 6

Location of microdialysis probes in the NAc and MD striatum. Black bars represent 2 mm length of dialysis membranes. Numbers beside each plate correspond to millimeter from bregma. Coronal drawings were modified from Paxinos and Watson (1997)

Discussion

The present study examined the role of DA activity in the NAc in behaviors maintained by instrumental and Pavlovian incentive learning. DA efflux in the NAc was increased significantly during both early and later training stages of an instrumental response for food on a RI 30-s schedule of reinforcement (Figs. 1 and 3). It is important to note that this pattern of results was observed whether or not a period of extinction preceded a 30-min period of instrumental responding reinforced by food pellets. As such, the present findings confirmed previous reports of increased DA release in the NAc during lever pressing for food, employing fixed interval or ratio schedules of reinforcement (Salamone et al. 1994; Richardson and Gratton 1996; Cousins et al. 1999). Response actions during the early phase of training on interval schedules (i.e., in rats having received as few as 120 outcomes) have been characterized as goal- or outcome-directed (Adams and Dickinson 1981; Balleine and Dickinson 1992; Dickinson et al. 1995) and accordingly may represent instrumental incentives, as distinct from Pavlovian incentive processes related to incentive motivation (Parkinson et al. 2002). Continued performance of these instrumental actions leads to habitual responding which, unlike action–outcome learning, is impervious to outcome devaluation or contingency degradation (Dickinson et al. 1995). We failed to observe a selective increase in DA efflux when rats had limited as compared to extended training experience, as might be expected if dopaminergic activity in the NAc is related to instrumental incentive learning. Indeed, the magnitude of DA efflux was significantly greater after extended training when behavior is said to be no longer controlled by incentive learning and is based instead on habit. These increases in DA efflux were site-specific, as no significant changes in medial MD striatal DA efflux were observed throughout the different phases of this experiment (Fig. 4).

Performance on instrumental tasks is often conducted under nonrewarded or extinction conditions to evaluate the control of behavior by Pavlovian incentive stimuli, unconfounded by unconditioned reward stimuli. In the present study, a significant increase in DA efflux in the NAc was observed only during the initial 10-min sample of responding in extinction on training day 16, but not on day 5 (Fig. 3). The specific CS+ present in this experiment was a distinct auditory “click” of the pellet dispenser which occurred on a RI 30-s schedule, by itself during extinction or accompanied by delivery of food pellets into the magazine during rewarded responding. As such, these data are consistent with previous reports of increased DA efflux in the NAc elicited by a CS+ (Phillips et al. 1993; Datla et al. 2002). In a Pavlovian to instrumental transfer protocol, a CS+ previously paired noncontingently to food reward can facilitate the acquisition of an instrumental response (Parkinson et al. 2002). Systemic administration of DA receptor antagonists during Pavlovian pairings of a CS+ with food reward blocks Pavlovian to instrumental transfer (Beninger and Phillips 1981; Dickinson et al. 2000). Damage to the shell or core of the NAc spares the acquisition of Pavlovian to instrumental transfer, but disrupts the potentiation by intra-NAc amphetamine on responding for the CS+ (Parkinson et al. 2002). Together, these findings suggest that a phasic increase of DA in the NAc, shown to occur after treatment with amphetamine (Taepavarapruk and Phillips 2003; Brebner et al. 2005), may mediate the facilitatory effects of a Pavlovian CS+ on instrumental responding.

It is also of interest to note that the inclusion of an extinction session before a reinforced phase of instrumental responding attenuated the magnitude of DA efflux in the NAc observed when food reward was available on day 5, but not on day 16. This finding may be attributed to attenuation in the secondary reinforcement property of the CS+ associated with the delivery of food reward or possibly, an influence of frustrative nonreward engendered by extinction. In either case, it is apparent that these effects of extinction are restricted to the early phase of instrumental training.

In earlier studies, Salamone et al. (1994) proposed that an important aspect of dopaminergic activity in the NAc is related to behavioral activation, exertion of effort, and possibly cost benefit analyses relating effort to value of reward stimuli (Salamone et al. 2003, 2005). In support of this hypothesis, consumption of large quantities of freely available food pellets or lab chow was not accompanied by increased DA efflux. It must be noted in passing that consumption to satiety of a large meal of a palatable food such as fruit loops, onion rings (Ahn and Phillips 1999), or sucrose (Hajnal and Norgren 2002), is accompanied by a significant increase in DA efflux in both the NAc and medial prefrontal cortex. Salamone et al. (1994) also observed a significant relationship between response rates of individual rats and the magnitude of DA efflux in the NAc. Specifically, an increase in DA release was only observed in rats that responded at medium to high rates of responding, whereas in rats with low response output, this measure did not differ from controls.

Our data are also relevant to the relationship between response output and DA activity. In both Experiments 1 and 2, there was no evidence of a simple relationship between response rates and magnitude of DA efflux. As shown in Fig. 1, although the rate of lever presses was twice as high on day 16 compared to day 5, the magnitude of DA efflux did not differ significantly between the 2 days. In Fig. 3, on day 5, initial response rates during the extinction and reward phases of the test were comparable during the first 10 min (times 5 and 11), yet the corresponding magnitude of DA efflux during the reward phase was three times greater than during the extinction phase. A similar pattern was observed on day 16. Thus, different rates of responding were associated with similar magnitudes of DA efflux, and similar rates of responding were associated with different magnitudes of DA activity. Accordingly, no positive correlations between rate of lever press responding and magnitude of DA efflux in the NAc were observed after limited and extended training (Fig. 5). Finally, with respect to the appealing hypothesis that dopaminergic activity in the NAc is related to behavioral activation (Salamone et al. 2003, 2005), it must be emphasized that although the present data challenge this hypothesis, it cannot be refuted simply on the basis of the lack of a correlation between magnitude of DA efflux and intensity or degree of behavioral activation.

The failure to observe a significant increase in DA efflux in the MD striatum during either action–outcome or habit-based instrumental responding provides a clear indication that DA transmission in this region of the striatum is not involved in instrumental conditioning or stimulus–response habit formation. These data are consistent with the finding that neither excitotoxic lesions nor reversible inactivation of the anterior MD striatum had any effect on acquisition or expression of action–outcome associations in instrumental conditioning (Yin et al. 2005). In contrast, lesions or inactivation of the MD striatum posterior to the probe placements in the present study impaired instrumental performance based on outcome–expectancy (Yin et al. 2004). Blockade of NMDA receptors in the dorsomedial striatum also disrupted action–outcome learning, consistent with a role for glutamate-mediated synaptic plasticity in the encoding of action–outcome associations (Yin et al. 2005). The MD (“associative”) striatum receives inputs from association cortices (e.g., prelimbic region of the prefrontal cortex and premotor areas), as well as the basolateral amygdala, which appears to mediate the assignment of incentive value to the consequences of instrumental actions (Corbit and Balleine 2005).

Integrity of the dorsolateral striatum has been shown to be required for habit formation in instrumental learning, and furthermore, rats with damage to this region of the striatum reverted to a state in which instrumental actions were goal-directed (Yin et al. 2005). This finding implies that the system involving the dorsolateral striatum responsible for habit formation can inhibit the circuit that mediates action–outcome or goal-directed instrumental actions. This in turn raises the possibility that the increase in NAc DA efflux observed in the present study after extended training provides a representation of instrumental incentive learning that is held in check by activity in the dorsolateral striatum.

In conclusion, the present findings provide neurochemical evidence in support of previous data questioning the role of DA in the NAc in coupling incentive value to representations of instrumental outcomes (de Borchgrave et al. 2002). The data showing elevated DA efflux in the NAc during extinction in the presence of a Pavlovian CS+, in turn, are consistent with a role for the NAc in incentive motivation (Fibiger and Phillips 1986; Robbins et al. 1989; Phillips et al. 1993; Balleine and Killcross 1994; Berridge and Robinson 1998). Rats received an apportionment of ∼60 food pellets during the microdialysis tests on days 5 and 16, yet the magnitude of DA efflux was significantly greater after extended training sessions. These data refute the hypothesis that dopaminergic activity in the NAc is a reflection of either reward value or reinforcement of instrumental responses (Wise 2004). The present “within-subject” design revealed a hitherto unappreciated effect of extended training of instrumental responding with an interval schedule on the magnitude of DA efflux in the NAc. For reasons discussed above, this does not appear to be related to motor responding per se. Rather, we speculate that this effect may reflect the specific condition of a random or variable interval schedule of outcome presentation, in which extended training is necessary to appreciate that the probability of receiving a beneficial outcome at any particular time in the 30-min test session is always unpredictable. This degree of uncertainty may be highly compatible with the optimal conditions for activating midbrain DA neurons (Fiorillo et al. 2003), which in turn would result in a sustained increase in DA release in the NAc throughout a period of random reinforcement. This pattern of DA release could play an important role in maintaining a high level of motivation at the service of a variety of response strategies available to ensure access to objects essential for survival.