Introduction

Drugs are initially used because of their pleasurable subjective effects. However, casual drug use can progress into inappropriate and excessive use, finally culminating in drug addiction, characterized by loss of control over drug intake. One of the mechanisms underlying the development of addiction is that drugs of abuse usurp neurobehavioral processes that mediate the motivational properties of rewards (Robinson and Berridge 1993; Vanderschuren and Everitt 2005; Koob and Volkow 2010). Increasing our understanding of the motivational processes that govern drug use and drug addiction may aid the development of a successful pharmacotherapy for addiction. To accomplish this, it is essential to have an animal model that emulates drug seeking and taking as it occurs in humans and in which we can reliably measure the motivation for drugs.

Progressive ratio (PR) schedules of reinforcement are the most widely used self-administration schedules to study the motivational properties of food and drug rewards (Hodos 1961; Richardson and Roberts 1996; Arnold and Roberts 1997; Stafford et al. 1998). In PR schedules, the response requirement is systematically increased after every obtained reward. The maximum number of responses a subject performs in order to receive a single reward is referred to as the breakpoint and this breakpoint is thought to reflect the motivational value of the reward. PR schedules can be used to study addiction-like behavior since rats with prolonged cocaine or heroin self-administration experience show increased breakpoints under a PR schedule of reinforcement (Paterson and Markou 2003; Deroche-Gamonet et al. 2004; Wee et al. 2008; Lenoir and Ahmed 2008; Orio et al. 2009; but see Liu et al. 2005).

Although they have contributed enormously to our understanding of the neurobehavioral processes underlying drug use and drug addiction, traditional PR schedules have a drawback: once the response criterion is met, the reward is delivered without requiring further action from the animal. When this reward is administered intravenously, it has no gustatory or consummatory component. Therefore, these schedules do not take into account that, in real life, procuring drugs requires a different set of actions than actually consuming them. In contrast, when an experimental schedule is used in which reward seeking and reward taking are separate actions, the animal has explicit control over its intake. Such a schedule also allows the neurobehavioral underpinnings of reward seeking and reward taking to be studied separately. This feature is relevant because it has been shown that food seeking and food taking can be differently affected by manipulations of the motivational state of an animal (Balleine et al. 1995; Corbit and Balleine 2003). Olmstead et al. (2000) were the first to evaluate a heterogeneous seeking–taking (ST) chain schedule of reinforcement in which seeking and taking drugs were studied as two separate actions. In their setup, animals had to seek a cocaine or sucrose reward by pressing a “seeking” lever under a random interval (RI) schedule of reinforcement in order to get access to a second “taking” lever; pressing this taking lever under a fixed-ratio 1 (FR-1) schedule of reinforcement resulted in delivery of the reward. They observed that, under certain conditions, the response rates on the seeking lever were proportional to the reward size for both cocaine and sucrose. Although the ST(RI) schedule has since been used to study different aspects of appetitive and addictive behavior (Olmstead et al. 2001; Johnston et al. 2001; Hellemans et al. 2002; Vanderschuren and Everitt 2004; Pelloux et al. 2007; Zapata et al. 2010), the neural and behavioral characteristics of heterogeneous ST chain schedules remain relatively unexplored.

Here, we evaluated two variants of the heterogeneous ST chain schedule for sucrose and cocaine self-administration, using either an RI or a PR requirement on the seeking link of the response chain. We hypothesized that changing the value of the cocaine and sucrose reward affects seeking and taking under the ST(RI) and ST(PR) schedules of reinforcement. We manipulated the value of cocaine and sucrose in three different ways.

First, we varied the magnitude of the cocaine and sucrose reward by changing either the unit dose of cocaine or the number of sucrose pellets that could be earned per reinforcement. We expected that larger rewards would have a higher value and would therefore evoke more seeking responses (cf. Olmstead et al. 2000).

Second, we systemically administered the dopamine receptor antagonist α-flupenthixol before self-administration sessions. There is an ongoing debate about the exact role of dopaminergic neurotransmission in appetitive and hedonic processes (Wise 2004; Schultz 2007; Berridge 2007; Robbins and Everitt 2007). Suppression of dopaminergic neurotransmission decreases the motivation to respond for both cocaine and food, as measured under traditional PR schedules of reinforcement (Roberts et al. 1989; Hubner and Moreton 1991; Depoortere et al. 1993; Richardson et al. 1994; Cheeta et al. 1995; Smith et al. 1995; Ward et al. 1996; Aberman et al. 1998; Aberman and Salamone 1999; Reilly 1999; Caul and Brindle 2001; Bari and Pierce 2005; Zhang et al. 2005). However, this decrease in motivation could be a consequence of alterations in the rewarding properties of food and cocaine and/or the willingness to exert effort for a reward. It has been shown that suppression of dopamine neurotransmission usually attenuates responding for food under demanding effort schedules (Reilly 1999; Cheeta et al. 1995; Aberman and Salamone 1999; Caul and Brindle 2001; Salamone et al. 2001), but does not affect food intake when food is readily available or response costs are low (Salamone et al. 1991; Ikemoto and Panksepp 1996; Aberman and Salamone 1999; Baldo et al. 2002; Barbano and Cador 2006). Therefore, it is thought that suppressing dopamine neurotransmission decreases the willingness to respond for food rather than its positive subjective properties. On the other hand, it has been widely shown that animals increase their cocaine intake after administration of a dopamine receptor antagonist when responding under low ratio schedules. It is then assumed that animals compensate for the reduced positive subjective effects by increasing their intake (De Wit and Wise 1977; Ettenberg et al. 1982; Roberts and Vickers 1984; Koob et al. 1987; Bergman et al. 1990; Corrigall and Coen 1991; Hubner and Moreton 1991; Caine and Koob 1994). Therefore, dopamine is thought to be important for the positive subjective properties of cocaine. However, this role for dopamine in cocaine reward value does not exclude a role for dopamine in the willingness to work for cocaine under high effort schedules as well. We hypothesized that dopamine mediates both the positive subjective and motivational properties of cocaine. Therefore, we expected that α-flupenthixol would decrease both sucrose and cocaine seeking, but that its effects on cocaine seeking would be stronger. In addition, we expected that treatment with α-flupenthixol would affect cocaine taking, but not sucrose taking. To determine whether the effects on seeking were a consequence of a decrease in the positive subjective properties and/or a suppression of the willingness to exert effort, we also performed an experiment in which we treated animals responding for sucrose or cocaine under an FR-1 schedule of reinforcement with α-flupenthixol. Here, we expected an increase in cocaine intake and no effect on sucrose intake after treatment with α-flupenthixol, indicating a role for dopamine in cocaine but not sucrose reward.

Third, we measured responding after reward omission. Extinction of cocaine and sucrose taking, using ST chain schedules, has been shown to reduce seeking rates (Johnston et al. 2001; Olmstead et al. 2001). However, in these previous studies, extinction of the taking response was performed in the absence of the seeking lever. Here, we evaluated whether the absence of the primary reinforcer would also reduce seeking within the same sessions, which we expected to happen. Conversely, if seeking rates were not reduced when the primary reinforcer was no longer presented, this would indicate that responding had progressed from a goal-directed action–outcome structure to a habitual stimulus–response associative structure (Dickinson 1985).

Materials and methods

Subjects

Male Wistar rats (Charles River, Sulzfeld, Germany) weighing 250 ± 15 g at the time of arrival in our facility were used for all experiments. Animals were singly housed in Macrolon cages (40 × 26 × 20 cm) in climate-controlled rooms (temperature 21 ± 2°C, 60–65% relative humidity) under a reversed 12-h day/night cycle with lights on at 7 p.m. Animals were allowed to habituate to the housing conditions for at least 9 days before use. Prior to the start of the self-administration sessions, rats were subjected to scheduled feeding. To avoid association between the self-administration sessions and feeding, rats were fed with 20 g chow (SDS, England) at least 1 h after the sessions. This amount was sufficient to maintain body weight and growth. Throughout the experiment, water was available ad libitum, except during self-administration sessions. Self-administration sessions were carried out between 09:00 and 18:00, for 5 days a week. Experiments were approved by the Animal Ethics Committee of Utrecht University, The Netherlands and were conducted in agreement with Dutch laws (Wet op de dierproeven, 1996) and European regulations (Guideline 86/609/EEC).

Surgery

Rats were anesthetized with ketamine hydrochloride (75 mg/kg, i.m.) and xylazine (10 mg/kg, i.m.) or medetomidine (0.4 mg/kg, s.c.), and a catheter was placed in the right jugular vein. Catheters consisted of a silastic tube connected to a guide cannula and a mesh on the base of the cannula (CamCaths, Cambridge, UK). The cannulas were secured by placing the mesh below the skin on the back of the animals. All objects and instruments used during surgery were thoroughly sterilized. Carprofen (5 mg/kg, s.c.) was administered once before and twice after surgery for postsurgical pain relief. To prevent infection, rats were treated with gentamicin (5 mg/kg, s.c.) before surgery and for five consecutive days after surgery. In two subjects, a defective catheter was replaced with a new catheter in the left jugular vein during the course of the experiment.

Apparatus

All subjects were trained and tested in operant conditioning chambers (29.5 cm L × 24 cm W × 25 cm H; Med Associates, Georgia, VT, USA). The chambers were placed in light- and sound-attenuating cubicles equipped with a ventilation fan. Each chamber was equipped with two 4.8-cm-wide retractable levers, placed 11.7 cm apart and 6.0 cm from the grid floor. The assignment of the left and right lever as seeking and taking lever was counterbalanced across rats. A cue light was present above each lever (28 V, 100 mA) and a house light (28 V, 100 mA) was located on the opposite wall. Sucrose pellets (45 mg, formula F, Research Diets, New Brunswick, NJ, USA) could be delivered at the wall opposite to the levers via a dispenser. Cocaine infusions were controlled by a syringe pump placed on top of the cubicles. During the cocaine self-administration sessions, polyethylene tubing ran from the syringe placed in the syringe pump via a swivel to the cannula on the subject’s back; in the operant chamber, tubing was shielded with a metal spring. Priming infusions of cocaine were never given. After each session, intravenous catheters were flushed with 0.15 ml heparinized saline. Experimental events and data recording were controlled by procedures written in MedState Notation using MED-PC for Windows (WMPC).

Procedure

ST(RI) schedule with cocaine reinforcement

Rats were trained to lever press for cocaine under a heterogeneous ST chain schedule of reinforcement with an RI of 120 s on the seeking link (ST(RI-120)). Self-administration training started with the acquisition of the taking response under an FR-1 schedule of reinforcement. During acquisition sessions, only the taking lever was present and pressing this lever immediately resulted in the infusion of 0.25 mg cocaine in 0.1 ml saline delivered over 5.6 s, the illumination of the cue light above the taking lever for 5.6 s, the retraction of the lever, and the switching off of the house light. After a 20 s time-out period, the taking lever was reintroduced and the house light illuminated, signaling the start of a new cycle. Once animals had acquired cocaine self-administration, they were gradually introduced to the seeking–taking chain schedule, starting with a schedule with an RI requirement of 2 s on the seeking link. ST(RI) sessions started with the introduction of the “seeking lever” and the illumination of the house light. The first press on the seeking lever initiated the RI and pressing this lever was without consequences until the RI had elapsed. When the RI had elapsed, pressing the seeking lever resulted in retraction of the seeking lever and insertion of the taking lever. Next, responding on the taking lever (FR-1) resulted in an infusion with cocaine, illumination of the cue light, retraction of the taking lever, and the switching off of the house light. This was followed by a 10 min time-out period to minimize the influence of cocaine-induced locomotor effects on responding for the next reward. After the time-out period, a new cycle started by the reintroduction of the seeking lever and the illumination of the house light. When the rats had acquired the task under an RI of 2 s, the RI was progressively increased between sessions until animals had acquired the task under an RI of 120 s. 
The program automatically ended after 2 h or if the animals had obtained 10 rewards, whichever occurred first.
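The seeking link described above can be sketched as a small function. This is only an illustration, not the MedState Notation procedure used in the study: the function name `seeking_link_outcome` and its arguments are hypothetical, and the interval is passed in as an already drawn value because the text does not state the distribution from which the random intervals were sampled.

```python
def seeking_link_outcome(press_times, interval_s):
    """Find the seeking-lever press that completes the seeking link.

    press_times: sorted times (s) of seeking-lever presses within one cycle.
    interval_s:  the random interval drawn for this cycle (hypothetical input;
                 the sampling distribution is not given in the text).

    The first press starts the interval. Presses made before the interval
    has elapsed have no consequence. The first press made once the interval
    has elapsed retracts the seeking lever and inserts the taking lever.
    Returns (press_index, press_time), or None if the link is not completed.
    """
    if not press_times:
        return None
    interval_start = press_times[0]  # first press initiates the RI
    for index, time in enumerate(press_times):
        if time - interval_start >= interval_s:
            return index, time
    return None


# Example: with a drawn interval of 120 s, the press at 130 s is the first
# one made after the interval (started at 0 s) has elapsed.
print(seeking_link_outcome([0, 30, 90, 130, 140], 120))  # (3, 130)
```

Under this reading, the number of seeking presses per cycle is free to vary, which is why seeking under the ST(RI) schedule is expressed as a response rate rather than a count.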

After rats had acquired stable seeking response rates under the training dose of cocaine (0.25 mg/infusion), we performed a between-session dose–response curve for different unit doses of cocaine (0.063, 0.125, 0.25, or 0.5 mg cocaine per infusion). Stable seeking was defined as less than 15% variation in presses per min over three consecutive sessions and no up- or downward trend. Animals were allowed to self-administer each unit dose until they reached stable seeking rates for this dose. Each animal self-administered all unit doses according to a Latin square design. In a second group of animals, we examined the effect of the dopamine receptor antagonist α-flupenthixol on performance under the ST(RI) cocaine schedule of reinforcement. Rats were trained to self-administer 0.25 mg of cocaine per infusion under the ST(RI-120) schedule, and after acquiring stable response rates, the rats received α-flupenthixol (0, 0.05, 0.25, or 0.5 mg/kg, i.p.) 30 min before the start of the session. α-Flupenthixol injections were administered according to a Latin square design, and test sessions were separated by at least two sessions without treatment. The omission and reacquisition experiment was performed with 15 rats; 10 of these rats had previously been tested in the dose–response curve experiment and 5 had previously been tested for the effects of α-flupenthixol. Preceding reward omission sessions, rats were trained to self-administer 0.25 mg of cocaine per infusion. Next, rats received 10 reward omission sessions. Procedures during reward omission sessions were similar to those during cocaine sessions, except that animals received an infusion of saline instead of cocaine after pressing the taking lever. After 10 reward omission sessions, cocaine (0.25 mg/infusion) was reintroduced to assess reacquisition of responding for cocaine.
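The stability criterion for seeking (less than 15% variation over three consecutive sessions and no up- or downward trend) can be made concrete as follows. The text does not define how variation or trend were computed, so this sketch makes two assumptions: variation is taken as each session's deviation from the three-session mean, and a trend is taken as a strictly monotonic run across the three sessions. The function name `is_stable` is hypothetical.

```python
def is_stable(rates, max_variation=0.15):
    """Check a three-session stability criterion for seeking rates.

    rates: presses per minute for three consecutive sessions.
    Assumption 1: "variation" is the relative deviation of each session
    from the three-session mean.
    Assumption 2: a "trend" is a strictly increasing or strictly
    decreasing sequence over the three sessions.
    """
    if len(rates) != 3:
        raise ValueError("expected exactly three consecutive sessions")
    mean = sum(rates) / 3
    if mean == 0:
        return False  # no responding, so relative variation is undefined
    within_band = all(abs(r - mean) / mean < max_variation for r in rates)
    increasing = rates[0] < rates[1] < rates[2]
    decreasing = rates[0] > rates[1] > rates[2]
    return within_band and not (increasing or decreasing)


print(is_stable([10.0, 11.0, 10.5]))  # True: small variation, no trend
print(is_stable([10.0, 10.5, 11.0]))  # False: upward trend
print(is_stable([10.0, 14.0, 10.0]))  # False: variation too large
```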

ST(PR) schedule with cocaine reinforcement

Rats were trained to lever press for cocaine under a heterogeneous ST chain schedule of reinforcement with a PR on the seeking link (ST(PR)). Experimental procedures were similar to those for cocaine under the ST(RI) schedule, with the following exceptions. After acquisition of cocaine taking under an FR-1 schedule, an ST(PR) schedule was introduced. Under the ST(PR) schedule, animals had to meet a response requirement on the seeking lever that progressively increased after every earned reward (1, 2, 4, 6, 9, 12, 15, 20, 25, etc.; Richardson and Roberts 1996) in order to get access to the taking lever. A session continued until the animal failed to meet the response requirement on the seeking lever within 1 h.
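The progression of response requirements can be generated with the exponential formula commonly associated with Richardson and Roberts (1996), requirement = 5·e^(0.2·n) − 5, rounded to the nearest integer, where n is the reward number. That this exact formula produced the sequence used here is an assumption; the function names `pr_requirement` and `breakpoint_for` are hypothetical, and the breakpoint is taken as the last completed requirement, following the definition given in the Introduction.

```python
import math


def pr_requirement(reward_number):
    """Seeking-lever presses required to earn the n-th reward (n starts at 1).

    Assumes the exponential progression 5 * e^(0.2 * n) - 5, rounded to the
    nearest integer.
    """
    if reward_number < 1:
        raise ValueError("reward_number must be 1 or greater")
    return round(5 * math.exp(0.2 * reward_number) - 5)


def breakpoint_for(rewards_earned):
    """Breakpoint: the largest requirement the animal completed in a session."""
    if rewards_earned == 0:
        return 0
    return pr_requirement(rewards_earned)


print([pr_requirement(n) for n in range(1, 10)])
# [1, 2, 4, 6, 9, 12, 15, 20, 25]
```

Because the requirement grows exponentially, a difference of one earned reward corresponds to an increasingly large difference in breakpoint at higher ratios.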

After acquiring a stable seeking response rate for 0.25 mg/infusion of cocaine, rats were allowed to self-administer either 0.125 or 0.5 mg of cocaine per infusion. Under this schedule, stable levels of responding were defined as less than two reward variations over three consecutive sessions and no up- or downward trend. When individual rats showed stable response rates under this dose, they were switched to the other dose. In an additional group of animals, we tested the effects of α-flupenthixol (0, 0.05, 0.25, or 0.5 mg/kg i.p., 30 min before the start of the session according to a Latin square design). After completion of testing the effects of α-flupenthixol, these rats received at least two more ST(PR) training sessions before they went through a reward omission period of three sessions. After three reward omission sessions, cocaine was reintroduced, to assess reacquisition of responding for cocaine.

ST(RI) schedule with sucrose reinforcement

Rats were trained to lever press for sucrose under a heterogeneous ST chain schedule of reinforcement with an RI requirement of 120 s on the seeking link (ST(RI-120)). This procedure was similar to the ST(RI-120) procedure with cocaine as the reward, with the following exceptions. Training started with a “shaping” session, in which the rats received two sucrose pellets every minute for 1 h; throughout the experiment, animals were rewarded with sucrose pellets, the time-out period after receiving the reward was 1 min, and sessions lasted for a maximum of 30 min. When rats showed a stable seeking response rate for two sucrose pellets per reward (defined as less than 15% variation in presses per minute on the seeking lever over three consecutive sessions and no up- or downward trend), we performed a between-session dose–response curve for the following reward magnitudes according to a Latin square design: one, two, or four sucrose pellets per reward. Next, the effect of α-flupenthixol (0, 0.05, 0.25, or 0.5 mg/kg i.p., 30 min before the start of the session according to a Latin square design) was tested. This was followed by a reward omission phase. Procedures during reward omission sessions were similar to those during sucrose sessions, except that animals did not receive a reward after pressing the taking lever. After 10 reward omission sessions, reacquisition of responding for sucrose (two pellets/reward) was assessed.

ST(PR) schedule with sucrose reinforcement

Rats were trained to lever press for sucrose under a heterogeneous ST chain schedule of reinforcement with a PR requirement on the seeking link (ST(PR)). This procedure was similar to the ST(PR) cocaine procedure, with the following exceptions. Training started with a “shaping” session, in which the rats received two sucrose pellets every minute for 1 h, and the time-out period after receiving the reward was 30 s. When rats showed a stable seeking response rate for two sucrose pellets per reward (defined as no up- or downward trend and no more than four reward variations for three consecutive days), rats were allowed to press for either one or four sucrose pellets. When individual animals showed a stable response rate for this reward size, they were switched to the size they had not yet received. Next, the effect of α-flupenthixol (0, 0.05, 0.25, or 0.5 mg/kg i.p., 30 min before the start of the session according to a Latin square design) was tested. A separate group of animals was trained to respond for two sucrose pellets per reward under the ST(PR) schedule. After showing stable breakpoints, these rats underwent reward omission. After 20 reward omission sessions, sucrose was reintroduced to assess reacquisition of responding.

FR-1 schedule with cocaine or sucrose reinforcement

Rats were trained to lever press for cocaine or sucrose under an FR-1 schedule of reinforcement. During these 2 h sessions, two levers were present. In the sessions in which animals were trained to self-administer cocaine, pressing the active lever immediately resulted in the infusion of 0.25 mg cocaine in 0.1 ml saline delivered over 5.6 s, the illumination of the cue light above the active lever for 5.6 s, the retraction of both levers, and the switching off of the house light. After a 20 s time-out period, the levers were reintroduced and the house light illuminated, signaling the start of a new cycle. Pressing the inactive lever had no programmed consequences. Once animals had acquired stable self-administration under the FR-1 schedule, the effect of α-flupenthixol (0, 0.05, 0.125, 0.25, or 0.5 mg/kg i.p., 30 min before the start of the session) was tested. α-Flupenthixol injections were administered according to a Latin square design and were separated by at least two sessions without treatment. A second group of rats was trained to lever press for sucrose. This procedure was similar to the one in which the animals were responding for cocaine, except that animals were rewarded with two sucrose pellets instead of a cocaine infusion and that sessions lasted for 30 min.
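A Latin square ordering of the α-flupenthixol doses ensures that every dose occurs once in each test position across a set of orderings. The sketch below uses a simple cyclic construction for the five doses of the FR-1 experiment; the text does not state which construction was used or how orderings were assigned to rats, so both are assumptions, and `cyclic_latin_square` is a hypothetical helper.

```python
def cyclic_latin_square(treatments):
    """Build a cyclic Latin square: row i is the treatment list rotated by i.

    Each treatment appears exactly once in every row (one rat's test order)
    and exactly once in every column (one test position).
    """
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)]
            for row in range(n)]


doses_mg_per_kg = [0, 0.05, 0.125, 0.25, 0.5]
for order in cyclic_latin_square(doses_mg_per_kg):
    print(order)
```

A cyclic square balances test position but not which dose precedes which; whether carry-over was additionally balanced in the study is not stated.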

Drugs

Cocaine-HCl (Bufa BV, Uitgeest, The Netherlands) and cis-(Z)-α-flupenthixol dihydrochloride (Sigma, Zwijndrecht, The Netherlands) were dissolved in sterile physiological saline (0.9% NaCl).

Data analysis and statistics

Approximately 8% of the animals trained under the ST(RI) schedules and approximately 3% of the rats trained under the ST(PR) schedules did not reach stable baseline seeking rates and were therefore excluded from the analysis. Seeking under the ST(RI) schedule was expressed as presses per minute and the seeking under the ST(PR) schedule was expressed as breakpoints. Taking responses under the ST(RI), ST(PR), and FR-1 schedules were measured using total number of responses in a session. In the experiments where reward magnitude was varied, the value used for analyses was the average of the last three sessions in which the rats showed stable responding. Experiments in which the animals were responding under an ST(RI) or FR-1 schedule in which the reward magnitude was varied or the effects of α-flupenthixol were tested were analyzed using a repeated measures ANOVA, and where appropriate, post hoc comparisons were made using a paired t test. We also tested if there was an effect of α-flupenthixol on responding in the following self-administration session via repeated measures ANOVA. Reward omission experiments under the ST(RI) schedule were analyzed by planned comparisons between the baseline and the first reward omission session, between baseline and reacquisition sessions, and between the last reward omission session and the first reacquisition session using paired t tests. The reward omission periods and reacquisition periods were separately analyzed using repeated measures ANOVA, and where appropriate, post hoc comparisons were made using paired t tests. Since breakpoints in the ST(PR) experiments are derived from an escalating curve, samples violate the homogeneity of variance. Therefore, we analyzed breakpoints via nonparametric testing. ST(PR) experiments where we varied the reward magnitude or tested the effects of α-flupenthixol were analyzed using a Friedman test, followed by a Wilcoxon signed ranks test to identify differences between groups. 
Reward omission experiments under the ST(PR) schedule were analyzed by planned comparisons between the baseline and the first reward omission session and between the last reward omission session and the first reacquisition session using a Wilcoxon signed ranks test. We also compared α-flupenthixol treatment with the first reward omission session using an independent-samples t test or a Mann–Whitney U test for the ST(RI) and the ST(PR) experiments, respectively. The criterion for statistically significant differences was set at p < 0.05. All data were analyzed using SPSS 15.0.
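To illustrate the nonparametric analysis of breakpoints, the Friedman statistic can be computed by ranking each animal's breakpoints across conditions and comparing the rank sums. The analyses reported here were run in SPSS; the following is only a sketch of the standard statistic, it assumes no tied values within an animal (no tie correction is applied), and the data in the example are invented for illustration.

```python
def friedman_chi_square(data):
    """Friedman chi-square for repeated measures.

    data: one list per subject, holding one value per condition.
    Assumes no ties within a subject; raises ValueError otherwise.
    Statistic: 12 / (n * k * (k + 1)) * sum(R_j ** 2) - 3 * n * (k + 1),
    where R_j is the rank sum of condition j, n the number of subjects,
    and k the number of conditions (df = k - 1).
    """
    n = len(data)
    k = len(data[0])
    rank_sums = [0] * k
    for row in data:
        if len(row) != k or len(set(row)) != k:
            raise ValueError("rows must have k distinct values")
        # Rank this subject's values from 1 (lowest) to k (highest).
        order = sorted(range(k), key=lambda j: row[j])
        for rank, condition in enumerate(order, start=1):
            rank_sums[condition] += rank
    return 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)


# Invented breakpoints for three animals at three unit doses.
example = [[6, 12, 20], [9, 15, 25], [4, 9, 12]]
print(friedman_chi_square(example))
```

When every animal orders the conditions identically, as in this example, the statistic reaches its maximum of n(k − 1).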

Results

ST(RI) schedule with cocaine reinforcement

For the ST(RI) experiments with cocaine reinforcement, animals reached baseline taking levels with an average of 9.8 ± 0.1 rewards/session (range 8.6–10 rewards/session). Varying the reward magnitude by changing the unit dose of cocaine affected response rates on the seeking lever under an ST(RI) schedule (F(3, 33) = 11.231, p < 0.001; Fig. 1a). Post hoc analysis showed that the seeking rate for 0.063 mg/infusion of cocaine differed from that for all other unit doses of cocaine. In addition, there was a difference between 0.125 and 0.25 mg/infusion of cocaine, but 0.125 mg/infusion did not differ from 0.5 mg/infusion of cocaine. Taking responses were not altered by changing the reward magnitude (F(3, 33) = 1.951, NS; Fig. 1b). Figure 1c shows that α-flupenthixol decreased cocaine seeking in a dose-dependent manner (F(3, 27) = 23.829, p < 0.001). Post hoc analyses showed that 0.25 and 0.5 mg/kg α-flupenthixol differed from each other and from all other treatments. α-Flupenthixol treatment also reduced taking responses (F(3, 27) = 93.429, p < 0.001; Fig. 1d): 0.25 and 0.5 mg/kg α-flupenthixol differed significantly from all other treatments. As shown in Table 1, there was no effect of α-flupenthixol on responding for cocaine during the following (drug-free) self-administration session (seeking, F(3, 27) = 0.968, NS; taking, F(3, 27) = 0.355, NS). Furthermore, 0.5 mg/kg α-flupenthixol suppressed seeking to a greater extent than reward omission did in the first reward omission session (t(23) = 6.824, p < 0.001). There was a small but significant difference in taking between the 0 mg/kg α-flupenthixol group and the last baseline session preceding the first reward omission session (t(23) = 3.432, p < 0.05), indicating a small difference in baseline taking levels between these groups. Nevertheless, both 0.25 and 0.5 mg/kg α-flupenthixol suppressed taking to a greater extent than the first reward omission session did (0.25 mg/kg α-flupenthixol vs. first extinction session, t(23) = 3.895, p = 0.001; 0.5 mg/kg α-flupenthixol vs. first extinction session, t(23) = 16.469, p < 0.001).

Fig. 1

Effect of altering reward size, α-flupenthixol, and reward omission on responding for cocaine under an ST(RI-120) schedule in rats. a Seeking response rates for different unit doses of cocaine (n = 12). b Number of taking responses for different unit doses of cocaine (n = 12). c Effect of α-flupenthixol on cocaine seeking (n = 10). d Effect of α-flupenthixol on cocaine taking (n = 10). e Effects of reward omission and reacquisition on cocaine seeking: baseline response (session 0; closed symbol) followed by 10 reward omission sessions (sessions 1–10; open symbols), followed by reacquisition (sessions 11–15; closed symbols) (n = 15). f Effects of reward omission and reacquisition on cocaine taking: baseline response (session 0; closed symbol) followed by 10 reward omission sessions (sessions 1–10; open symbols), followed by reacquisition (sessions 11–15; closed symbols) (n = 15). Data are presented as mean presses/min ± SEM (seeking) or mean ± SEM taking responses. *p < 0.05 for difference between groups (a–d) or sessions (e, f) (paired t test)

Table 1 The results on seeking and taking in the self-administration sessions following sessions in which animals were treated with α-flupenthixol

Both seeking and taking were affected when animals were exposed to reward omission (Fig. 1e, f). Seeking already decreased during the first reward omission session compared to the last baseline session (t(14) = 5.152, p < 0.001), and in the subsequent reward omission sessions, seeking gradually declined further (F(9, 126) = 12.408, p < 0.001; Fig. 1e). When cocaine was reintroduced, seeking increased compared to the last reward omission session (t(14) = −8.094, p < 0.001), but responding did not return to pre-reward omission levels until the second reacquisition session (baseline vs. first reacquisition session, t(14) = 3.820, p < 0.005; baseline vs. second reacquisition session, t(14) = 1.767, NS; baseline vs. third reacquisition session, t(14) = 1.556, NS; baseline vs. fourth reacquisition session, t(14) = 1.948, NS; baseline vs. fifth reacquisition session, t(14) = 2.365, p = 0.033). Compared to the last baseline session, taking did not change during the first reward omission session (baseline vs. first reward omission session, t(14) = 1.871, p = 0.082), but in the subsequent reward omission sessions, taking gradually decreased (F(9, 126) = 6.765, p < 0.001; Fig. 1f). When cocaine was reintroduced, the taking response increased compared to the last reward omission session (t(14) = 3.969, p = 0.001), and responding on the taking lever returned to pre-reward omission levels (baseline vs. first reacquisition session, t(14) = 1.293, NS; baseline vs. second reacquisition session, t(14) = 1.468, NS; baseline vs. third reacquisition session, t(14) = 1.000, NS; baseline vs. fourth reacquisition session, t(14) = 1.871, NS; baseline vs. fifth reacquisition session, t(14) = 1.740, NS).

ST(PR) schedule with cocaine reinforcement

For the ST(PR) experiments with cocaine, animals reached stable levels of responding with an average of 16.3 ± 0.5 rewards/session (range 13.0–19.6 rewards/session). The average baseline session time was 274.4 ± 8.8 min. Thus, sessions took approximately 2.5 h longer than the ST(RI) schedule sessions with cocaine reinforcement. Figure 2a shows that varying the magnitude of the cocaine reward resulted in a dose-dependent increase in breakpoints under an ST(PR) schedule of reinforcement (χ²(2) = 12.286, p < 0.01). Post hoc analyses showed that all unit doses of cocaine tested differed from each other (0.125 vs. 0.25 mg/infusion, Z(6) = −2.366, p < 0.05; 0.25 vs. 0.5 mg/infusion, Z(6) = −2.197, p < 0.05; 0.125 vs. 0.5 mg/infusion, Z(6) = −2.366, p < 0.05). α-Flupenthixol decreased the breakpoints for cocaine in a dose-dependent manner (χ²(3) = 22.898, p < 0.001; Fig. 2b). Post hoc analysis showed that 0.25 and 0.5 mg/kg α-flupenthixol significantly differed from saline pretreatment and from each other (0 vs. 0.25 mg/kg, Z(9) = −2.397, p < 0.05; 0 vs. 0.5 mg/kg, Z(9) = −2.805, p < 0.01; 0.25 vs. 0.5 mg/kg, Z(9) = −2.090, p < 0.05). There was no effect of α-flupenthixol on responding during the next, drug-free self-administration session (χ²(3) = 2.224, NS; Table 1). Furthermore, 0.5 mg/kg α-flupenthixol suppressed seeking to a greater extent than reward omission did during the first reward omission session (Z(17) = −2.417, p < 0.05). Figure 2c shows that the breakpoint was reduced during the first reward omission session compared to the last baseline session (Z(7) = −2.521, p < 0.05). Over the next two sessions, the breakpoint did not change further (χ²(2) = 2.400, NS). When cocaine was reintroduced, the breakpoint returned to pre-reward omission levels (Z(7) = −2.521, p < 0.05). There was no difference between the last baseline session before the omission sessions and the first session in which cocaine was reintroduced (Z(7) = −0.169, NS).

Fig. 2

Effect of altering reward size, α-flupenthixol, and reward omission on responding for cocaine under an ST(PR) schedule in rats. a Breakpoints for different unit doses of cocaine (n = 7). b Effect of α-flupenthixol on breakpoints for cocaine (n = 10). c Effects of reward omission and reacquisition on breakpoints; baseline response (session 0; closed symbol) followed by three reward omission sessions (sessions 1–3; open symbols), after which reacquisition took place (session 4; closed symbols) (n = 8). Data are presented as mean breakpoints ± SEM. *p < 0.05 for difference between groups (a, b) or sessions (c) (Wilcoxon signed ranks test)

ST(RI) schedule with sucrose reinforcement

For the ST(RI) experiments with sucrose reinforcement, baseline responding averaged 9.1 ± 0.2 rewards/session (range 7.6–10 rewards/session). Varying the reward magnitude by changing the number of sucrose pellets per reward did not affect seeking or taking responses when rats were lever pressing under an ST(RI) schedule of reinforcement (seeking, F 2, 18 = 2.432, NS; taking, F 2, 18 = 0.172, NS; Fig. 3a, b). Panels c and d of Fig. 3 show that α-flupenthixol decreased sucrose seeking and taking (seeking, F 3, 27 = 6.939, p = 0.01; taking, F 1, 9 = 5.949, p < 0.05). Post hoc analyses showed no significant between-group effects on taking, but seeking under 0.5 mg/kg α-flupenthixol differed from all other treatments (0 vs. 0.5 mg/kg, t 9 = 3.743, p < 0.01; 0.05 vs. 0.5 mg/kg, t 9 = 4.541, p = 0.001; 0.25 vs. 0.5 mg/kg, t 9 = 3.255, p = 0.01). There was no effect of α-flupenthixol on responding during the consecutive self-administration session (seeking, F 3, 27 = 2.140, NS; taking, F 3, 27 = 1.898, NS; Table 1). Furthermore, α-flupenthixol treatment suppressed seeking and taking to an extent comparable to the first reward omission session (seeking: 0 mg/kg vs. first extinction session, t 17 = −0.651, NS; 0.05 mg/kg vs. first extinction session, t 17 = −0.123, NS; 0.25 mg/kg vs. first extinction session, t 17 = 0.121, NS; 0.5 mg/kg vs. first extinction session, t 17 = 1.498, NS; taking: 0 mg/kg vs. first extinction session, t 17 = −0.415, NS; 0.05 mg/kg vs. first extinction session, t 17 = −1.102, NS; 0.25 mg/kg vs. first extinction session, t 17 = 0.344, NS; 0.5 mg/kg vs. first extinction session, t 17 = 1.492, NS). Both the seeking and taking rates were reduced when animals were exposed to reward omission (Fig. 3e, f). Seeking decreased as soon as sucrose was no longer delivered (t 8 = 2.555, p < 0.05; Fig. 3e), and seeking and taking continued to decline gradually during further reward omission sessions (seeking, F 9, 72 = 13.117, p < 0.001; taking, F 9, 72 = 5.836, p < 0.001). When sucrose was reintroduced, seeking and taking rates increased (seeking, t 8 = −2.373, p < 0.05; taking, t 8 = −2.646, p < 0.05). From the second reacquisition session onward, seeking levels were again comparable to pre-reward omission levels (baseline vs. first reacquisition session, t 8 = 4.071, p < 0.05; baseline vs. second reacquisition session, t 8 = 1.305, NS; baseline vs. third reacquisition session, t 8 = 1.996, NS; baseline vs. fourth reacquisition session, t 8 = 0.537, NS), but taking responses did not return to pre-reward omission levels during the four reacquisition sessions (baseline vs. first reacquisition session, t 8 = 2.475, p < 0.05; baseline vs. second reacquisition session, t 8 = 2.626, p < 0.05; baseline vs. third reacquisition session, t 8 = 2.530, p < 0.05; baseline vs. fourth reacquisition session, t 8 = 2.443, p < 0.05).

Fig. 3

Effect of altering reward size, α-flupenthixol, and reward omission on responding for sucrose under an ST(RI-120) schedule in rats. a Seeking response rates for different amounts of sucrose pellets (n = 10). b Number of taking responses for different amounts of sucrose pellets (n = 10). c Effect of α-flupenthixol on sucrose seeking (n = 10). d Effect of α-flupenthixol on sucrose taking (n = 10). e Effects of reward omission and reacquisition on sucrose seeking: baseline response (session 0; closed symbol) followed by 10 reward omission sessions (sessions 1–10; open symbols), after which reacquisition took place (sessions 11–14; closed symbols) (n = 9). f Effects of reward omission and reacquisition on sucrose taking: baseline response (session 0; closed symbol) followed by 10 reward omission sessions (sessions 1–10; open symbols), after which reacquisition took place (sessions 11–14; closed symbols) (n = 9). Data are presented as mean presses/min ± SEM (seeking) or mean ± SEM taking responses. *p < 0.05 for difference between groups (a–d) or sessions (e, f) (paired t test)

ST(PR) schedule with sucrose reinforcement

For the ST(PR) experiments with sucrose reinforcement, baseline responding averaged 15.0 ± 0.4 rewards/session (range 10.3–19 rewards/session). Baseline sessions lasted on average 136.7 ± 8.3 min, which is nearly 2 h longer than the ST(RI) schedule sessions with sucrose reinforcement. Varying the reward magnitude by changing the number of sucrose pellets per reward did not alter seeking rates under the ST(PR) schedule (χ²(2) = 6.000, NS; Fig. 4a). Figure 4b shows that α-flupenthixol treatment dose-dependently reduced sucrose seeking (χ²(3) = 10.800, p < 0.05). Post hoc analysis showed that saline pretreatment significantly differed from pretreatment with 0.25 and 0.5 mg/kg α-flupenthixol (0 vs. 0.25 mg/kg, Z 9 = −2.075, p < 0.05; 0 vs. 0.5 mg/kg, Z 9 = −1.990, p < 0.05). There was no effect of α-flupenthixol on responding for sucrose during the consecutive, drug-free self-administration session (χ²(3) = 3.809, NS; Table 1). Furthermore, 0.5 mg/kg α-flupenthixol suppressed seeking to a greater extent than the first reward omission session did (Z 19 = −2.670, p < 0.01). Figure 4c shows that the breakpoint did not change during the first reward omission session compared to the last baseline session (Z 9 = −0.931, NS), but in the subsequent reward omission sessions, seeking gradually declined (χ²(19) = 116.890, p < 0.001). When sucrose was reintroduced, the breakpoint increased (Z 9 = −2.091, p < 0.05), but seeking levels were comparable to pre-reward omission levels only from the second reacquisition session onward (baseline vs. first reacquisition session, Z 9 = −2.091, p < 0.05; baseline vs. second reacquisition session, Z 9 = −1.244, NS; baseline vs. third reacquisition session, Z 9 = −0.923, NS; baseline vs. fourth reacquisition session, Z 9 = −1.540, NS; baseline vs. fifth reacquisition session, Z 9 = −1.887, NS).

Fig. 4

Effect of altering reward size, α-flupenthixol, and reward omission on responding for sucrose under an ST(PR) schedule in rats. a Breakpoints for different amounts of sucrose pellets (n = 10). b Effect of α-flupenthixol on breakpoints for sucrose (n = 10). c Effects of reward omission and reacquisition on breakpoints; baseline response (session 0; closed symbol) followed by 20 reward omission sessions (sessions 1–20; open symbols), after which reacquisition took place (sessions 21–25; closed symbols) (n = 10). Data are presented as mean breakpoints ± SEM. *p < 0.05 for difference between groups (a, b) or sessions (c) (Wilcoxon signed ranks test)

FR-1 schedule with cocaine or sucrose reinforcement

Figure 5a shows that α-flupenthixol altered cocaine self-administration in animals responding under an FR-1 schedule of reinforcement (F 4, 24 = 6.282, p = 0.01). Doses of 0.05 and 0.25 mg/kg α-flupenthixol increased cocaine self-administration compared to vehicle treatment (0 vs. 0.05 mg/kg, t 6 = −3.094, p < 0.05; 0 vs. 0.25 mg/kg, t 6 = −2.647, p < 0.05), while 0.5 mg/kg α-flupenthixol decreased responding for cocaine compared to vehicle treatment (0 vs. 0.5 mg/kg, t 6 = 2.601, p < 0.05). α-Flupenthixol did not affect inactive lever presses (F 4, 24 = 1.541, NS). Figure 5b shows that α-flupenthixol also affected sucrose self-administration in animals responding under an FR-1 schedule of reinforcement (F 4, 28 = 3.463, p < 0.05). The dose of 0.5 mg/kg α-flupenthixol decreased responding for sucrose compared to vehicle treatment (0 vs. 0.5 mg/kg, t 7 = 2.376, p < 0.05). α-Flupenthixol did not affect inactive lever presses (F 4, 28 = 0.605, NS).

Fig. 5

Effect of α-flupenthixol on responding for cocaine (a) and sucrose (b) under an FR-1 schedule of reinforcement. Data are presented as mean number of lever presses ± SEM. *p < 0.05 for difference compared to vehicle treatment (paired t test)

Discussion

In the present study, we evaluated two variants of heterogeneous ST chain schedules of cocaine and sucrose self-administration. The main advantage of these schedules is that they explicitly separate the actions of seeking and taking a reward, thereby modeling the fact that different sets of actions are required to procure and consume drugs and natural rewards.

We altered the value of the reinforcer by varying the reward size or by omitting the reward. As expected, cocaine seeking increased with increasing reward size under both schedules. The seeking data obtained under the ST(RI) schedule are consistent with Olmstead et al. (2000), and the seeking data obtained under the ST(PR) schedule are in agreement with cocaine dose–response studies using traditional PR schedules (Hodos 1961; Winger and Woods 1985; French et al. 1995; Ward et al. 1996, 2005; Brebner et al. 2000). Under the ST(PR) schedule, increasing the unit dose of cocaine from 0.125 to 0.25 to 0.5 mg/infusion resulted in an orderly increase in breakpoints. Seeking rates under the ST(RI) schedule did differ between the cocaine unit doses of 0.125 and 0.25 mg/infusion, but there was no difference between 0.125 and 0.5 mg/infusion, or between 0.25 and 0.5 mg/infusion of cocaine. This difference between the schedules is perhaps caused by the fact that reinforcement rate is proportional to seeking rate under the ST(PR) but not under the ST(RI) schedule. Taking responses under the ST(RI) schedule were not affected by varying the reward size of cocaine. This was most likely a consequence of a ceiling effect, since animals in our experiment almost always took the maximum number of rewards under all reward size conditions. Under comparable experimental conditions regarding time-out duration and unit doses of cocaine, Olmstead et al. (2000) did observe a decrease in the number of rewards taken for the lowest unit dose of cocaine used. However, in their experiments, animals could earn more rewards in a session, and the number of rewards earned always exceeded 10 infusions, which was the maximum number of infusions that could be earned in our setup.

In the reward omission sessions, the absolute value of the reinforcer was reduced to zero. Omission of the cocaine reward decreased cocaine seeking under both ST schedules already in the first session. This indicates that cocaine seeking was goal-directed (cf. Olmstead et al. 2001) and had not become stimulus-driven (Dickinson 1985). Interestingly, a recent study has shown that, after an amount of cocaine self-administration experience comparable to that of our animals at the time of reward omission, devaluation of the taking link by omitting the cocaine reward did not affect cocaine seeking during a 5 min extinction session (Zapata et al. 2010). In contrast, in our study, animals did receive presentations of the taking lever after they had fulfilled the response requirement on the seeking lever, providing them with explicit feedback of the consequences of their actions (i.e., pressing the taking lever results in infusion with saline instead of cocaine). This may have precluded the expression of any habitual behavior in our animals, perhaps because distal (seeking) actions gain habitual properties more readily with extended training, or under interval schedules, than actions proximal to reward consumption (Dickinson 1985; Dickinson et al. 2002; Miles et al. 2003). Over sessions, animals adapted their expectancy of the reward since seeking levels stabilized at a low level of responding after a few sessions. Thus, omission of the cocaine reward reduced the motivation to seek the drug. Taking responses under the ST(RI) schedule also declined over sessions, but the effect was more gradual and less pronounced than for seeking responses. There are two possible explanations for this initial absence of a reduction in taking under the ST(RI) schedule. First, the low effort requirement on the taking link and the limited number of opportunities to respond on the taking lever in an ST(RI) session do not facilitate fast extinction of responding. Second, in a reinforced ST(RI) session, responding on the taking link resulted in the presentation of the reward and reward-associated stimuli, i.e., the presentation of the cue light and the retraction of the taking lever. Thus, the association between responding and reward-associated stimuli is much stronger for the taking link than for the seeking link. These reward-associated stimuli, which were still presented in the reward omission sessions, had likely acquired motivational properties that maintained responding on the taking link (Stewart et al. 1984; Everitt and Robbins 2000). In fact, reward-associated conditioned stimuli can support the acquisition of new instrumental responding that is only reinforced with presentation of these conditioned stimuli (Di Ciano and Everitt 2004; Parkinson et al. 2005).

In our hands, varying the number of sucrose pellets per reward had no effect on seeking or taking under either the ST(PR) or the ST(RI) schedule. This is surprising since it has been shown that both the traditional PR and the ST(RI) schedules are sensitive to changes in the amount of sucrose delivered (Cheeta et al. 1995; Cleary et al. 1996; Olmstead et al. 2000; Sclafani and Ackroff 2003; Rickard et al. 2009). Interestingly, Rickard et al. (2009) showed that the dose–response curve under a PR schedule flattens when sucrose solutions are used that yield more than approximately 20.5 mg sucrose per reward. Since the sucrose pellets we used contained 42.5 mg sucrose per pellet, it may be that the lowest amount of sucrose used in the present study already elicited a maximal response rate. When different amounts of sucrose are offered simultaneously within the same session, such as in a delayed reward task when the delay to the large reward is zero, rats can clearly distinguish between one and four sucrose pellets (e.g., Evenden and Ryan 1996; van Gaalen et al. 2006). Apparently, when the absolute amount of sucrose was varied between sessions, the ST schedules tested in the present study were not sensitive enough to measure the changes in motivation for the sucrose reward. In contrast to varying the reward value from one to four sucrose pellets, the ST schedules did identify a decrease in motivation when the absolute value of the sucrose reinforcer was reduced to zero. Omission of the sucrose reward attenuated sucrose seeking under the ST(RI) schedule already during the first reward omission session, and in consecutive omission sessions, sucrose seeking was suppressed further. Taking responses for sucrose under the ST(RI) schedule showed a pattern similar to that seen when the cocaine reward was omitted. Initially, taking was unaffected but decreased over repeated testing, most likely for the same reasons as described above for the effects of reward omission on cocaine taking. Sucrose seeking under the ST(PR) schedule did not decline until the third reward omission session, indicating a transient resistance to extinction under these circumstances.

We also systematically evaluated the role of dopamine in the motivation for cocaine and sucrose under ST chain schedules. Johnston et al. (2001) have previously tested the effect of pimozide, a dopamine D2 receptor antagonist, in animals that were trained under an ST(RI) schedule with sucrose reinforcement. Treatment with pimozide reduced reinforced sucrose taking over sessions. Pimozide treatment also resulted in a decreased sucrose seeking rate in a later session in which animals got access to the seeking lever only under drug-free conditions. In contrast to the present study, Johnston et al. (2001) tested the effects of pimozide on seeking and taking sucrose in isolation and not during responding under the ST(RI) schedule itself. Our study therefore extends these findings by showing that α-flupenthixol treatment directly decreases cocaine and sucrose seeking under both ST chain schedules as well as cocaine taking under the ST(RI) schedule. This is consistent with previous studies measuring motivation for food rewards (Reilly 1999; Cheeta et al. 1995; Aberman and Salamone 1999; Caul and Brindle 2001; Salamone et al. 2001). As mentioned in the Introduction, it is generally thought that dopamine mediates the willingness to exert effort for food, rather than its positive subjective properties (Baldo and Kelley 2007; Barbano and Cador 2007; Berridge 2007; Salamone et al. 2009), whereas for cocaine it probably mediates both (Koob et al. 1998; Wise 2004; Pierce and Kumaresan 2006). The absence of a compensatory increase in sucrose self-administration after α-flupenthixol treatment under the FR-1 schedule of reinforcement indicates that the dopamine receptor antagonist did not reduce the positive subjective, rewarding properties of sucrose. Therefore, the effects of α-flupenthixol on sucrose seeking under the ST chain schedules were most likely the result of a reduced motivation to work for sucrose. The highest dose of α-flupenthixol tested (0.5 mg/kg, i.p.) attenuated sucrose self-administration under an FR-1 schedule of reinforcement. Since this dose of α-flupenthixol does not affect locomotor activity (Veeneman et al. 2011a), this result suggests that at a high dose, a dopamine receptor antagonist also influences the willingness to respond for sucrose under low ratio schedules.

The decrease in cocaine seeking and taking under the ST chain schedules induced by α-flupenthixol can be explained by a decrease in the positive subjective, rewarding properties of cocaine. Under the FR-1 schedule, α-flupenthixol, in doses of 0.05 and 0.25 mg/kg i.p., increased cocaine self-administration. This suggests that animals increased their intake to compensate for a lower reward value of cocaine, akin to the increase in responding observed under an FR-1 schedule when the unit dose of cocaine is lowered (Gerber and Wise 1989; Zittel-Lazarini et al. 2007). These findings are in agreement with many previous studies showing that systemic treatment with dopamine receptor antagonists increases responding for cocaine under low ratio schedules and that high doses of dopamine receptor antagonists reduce lever pressing for cocaine, likely reflecting extinction of responding (De Wit and Wise 1977; Ettenberg et al. 1982; Roberts and Vickers 1984; Koob et al. 1987; Bergman et al. 1990; Corrigall and Coen 1991; Hubner and Moreton 1991; Caine and Koob 1994). However, we hypothesized that dopamine mediates both the positive subjective, rewarding properties of cocaine and the willingness to exert effort for rewards, including cocaine. Indeed, we have several reasons to think that α-flupenthixol affected cocaine seeking under the ST schedules by reducing both the rewarding properties of cocaine and the motivation to work for the drug. First, suppressing dopamine neurotransmission affected cocaine seeking and taking to a larger extent than sucrose seeking and taking. Under the ST(RI) schedules, both cocaine seeking and taking were reduced by lower doses of α-flupenthixol, whereas only the highest dose of α-flupenthixol affected sucrose seeking. Sucrose taking was not affected at all by α-flupenthixol, whereas the dopamine receptor antagonist did reduce cocaine taking. In addition, under the ST(PR) schedule, similar doses of α-flupenthixol affected breakpoints for cocaine and sucrose, but the magnitude of the effect was larger for cocaine than for sucrose. Second, as with sucrose, the highest dose of α-flupenthixol (0.5 mg/kg, i.p.) also reduced cocaine intake under the FR-1 schedule. Since this unit dose of cocaine (0.25 mg/infusion) is on the descending limb of the dose–response curve (Veeneman et al. 2011b), a reduction in responding under an FR-1 schedule of reinforcement is suggestive of a leftward (indicating an increase in the reward value of cocaine) or a downward shift in the dose–response curve (indicating reduced motivation for cocaine). Clearly, the latter explanation is more plausible. Third, if the role of dopamine was restricted to mediating the rewarding properties of cocaine, then the maximum effect of a dopamine receptor antagonist would be to nullify the reward value of cocaine. Hence, the largest effect of α-flupenthixol should resemble the effect of the first reward omission. However, 0.5 mg/kg α-flupenthixol attenuated cocaine seeking under the ST(RI) and the ST(PR) schedules to a significantly larger extent than the first reward omission session did. Together, these observations support the notion that dopamine mediates cocaine reward as well as the willingness to exert effort for cocaine.
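The dose–response reasoning in the second argument can be made concrete with a toy curve. The sketch below is purely illustrative and is not fitted to any data from this study: the inverted-U function, the `peak_dose` of 0.06 mg/infusion, and the `potency` and `max_rate` parameters are hypothetical assumptions chosen only so that a unit dose of 0.25 mg/infusion falls on the descending limb.

```python
import math


def responding(unit_dose, peak_dose=0.06, max_rate=1.0, potency=1.0):
    """Toy inverted-U dose-response curve (hypothetical, not fitted to data).

    Responding peaks at unit_dose == peak_dose / potency.
    potency > 1 moves the curve to the left; max_rate < 1 moves it down.
    """
    x = potency * unit_dose / peak_dose
    return max_rate * x * math.exp(1.0 - x)


dose = 0.25  # mg/infusion, on the descending limb of this toy curve

baseline = responding(dose)
leftward = responding(dose, potency=2.0)   # assumed leftward shift
downward = responding(dose, max_rate=0.5)  # assumed downward shift

# Both kinds of shift lower responding at a descending-limb dose.
print(baseline > leftward, baseline > downward)
```

Under these assumptions, both a leftward and a downward shift reduce responding at a descending-limb dose, so the FR-1 data alone cannot separate them; the choice between the two rests on the other observations discussed in this paragraph.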

We evaluated two different variants of ST chain schedules of reinforcement with either an RI or a PR requirement on the seeking link. One important difference between these schedules is that under the ST(PR), but not under the ST(RI) schedule, reinforcement rate is proportional to seeking rate. Under the ST(PR) schedule, the number of taking opportunities directly depends on the number of seeking responses performed in a session. In our experiments, once rats got access to the taking link, they always responded on this lever and received the reward. Thus, breakpoints on the seeking lever directly reflected the number of taking responses and the number of rewards obtained. However, unlike the ST(PR) schedule, under the ST(RI) schedule, seeking rates do not automatically reflect taking rates. For example, Olmstead et al. (2000) showed that changes in cocaine reward magnitude can result in opposite effects on seeking and taking rates under an ST(RI) schedule. They showed that in the absence of a time-out period, the cocaine seeking rate increases with reward size but cocaine taking decreases with reward size. Thus, in contrast to the ST(PR) schedule, the ST(RI) schedule can be used to measure alterations in instrumental responding on the seeking and the taking links independently. Consistent with this notion, our present data show that α-flupenthixol reduced sucrose seeking but not taking under the ST(RI) schedule and that the effects of reward omission emerged faster and were of larger magnitude for seeking than for taking, for both sucrose and cocaine.
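The contrast between the two seeking-link requirements can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the procedure used in this study: the perfectly regular press rate, the fixed 120 s interval standing in for a random interval, the 2 h session, the reward cap, and the linear ratio progression in `linear_requirements` are all hypothetical simplifications, and the taking lever is assumed to be pressed whenever it becomes available.

```python
def rewards_under_interval(press_rate_per_min, interval_s=120.0,
                           session_s=7200.0, max_rewards=10):
    """Interval-type seeking link (simplified, fixed interval).

    The first seeking press after the interval has elapsed gives access to
    the taking lever, which is assumed to be pressed immediately.
    """
    if press_rate_per_min <= 0:
        return 0
    gap = 60.0 / press_rate_per_min  # seconds between evenly spaced presses
    t = 0.0
    rewards = 0
    next_available = interval_s
    while rewards < max_rewards:
        t += gap
        if t > session_s:
            break
        if t >= next_available:
            rewards += 1
            next_available = t + interval_s
    return rewards


def linear_requirements(n_steps, step=2):
    """Hypothetical ratio progression: 2, 4, 6, ... presses per reward."""
    return [step * (i + 1) for i in range(n_steps)]


def rewards_under_progressive_ratio(total_presses, requirements):
    """Ratio-type seeking link: each reward costs the next requirement."""
    rewards = 0
    remaining = total_presses
    for required in requirements:
        if remaining < required:
            break
        remaining -= required
        rewards += 1
    return rewards


if __name__ == "__main__":
    reqs = linear_requirements(50)
    for rate in (5, 20):  # seeking presses per minute
        interval_rewards = rewards_under_interval(rate, max_rewards=1000)
        ratio_rewards = rewards_under_progressive_ratio(rate * 20, reqs)
        print(rate, interval_rewards, ratio_rewards)
```

In this sketch, a fourfold increase in seeking rate leaves the number of rewards earned under the interval requirement unchanged, whereas under the ratio requirement more presses buy more rewards. This is the sense in which seeking and taking are coupled under the ST(PR) schedule but can vary independently under the ST(RI) schedule.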

In summary, we extended the characterization of ST chain schedules of self-administration using either an RI or a PR requirement on the seeking link, and we showed that changes in reward magnitude, reward omission, or α-flupenthixol treatment affected responding, depending on the type of reinforcer and the schedule used. Therefore, ST chain schedules provide a useful addition to existing self-administration paradigms. The explicit separation of the acts of seeking and taking rewards endows these schedules with naturalistic validity, because in the real world, seeking out rewards and actually consuming them require different sets of actions.