Introduction

In their landmark study concerned with animal models of addiction, Ahmed and Koob (1998) found that long access (LgA) cocaine self-administration sessions lasting 6 h produced an escalation in rats’ cocaine intake over sessions, whereas short access (ShA) sessions lasting 1 h did not. Additionally, they found that rats experiencing LgA sessions displayed an upward shift in the cocaine self-administration dose-response curve (i.e., rats self-administered more infusions across the range of self-administered doses), attributed to an increased hedonic set point after LgA cocaine. Since the publication of their 1998 paper, Ahmed and Koob’s LgA versus ShA paradigm has been used by many labs to study the self-administration of various drugs (for review, see Edwards & Koob, 2013). In addition to escalation of intake over sessions and an upward shift of the dose-response curve, LgA drug self-administration produces other addiction-related behaviors, such as heightened motivation for the drug, most commonly measured with progressive ratio schedules where the number of responses required to obtain the drug increases with each infusion (Orio et al., 2009; Paterson & Markou, 2003; Verheij et al., 2016; Wee et al., 2008, 2009; Whitfield et al., 2015).

Recently, intermittent access (IntA) to drug self-administration has been suggested to better model addiction than the LgA procedure (Allain et al., 2015; Allain et al., 2018; Allain & Samaha, 2019; Kawa et al., 2019a, b; Samaha et al., 2021). In a commonly used version of the IntA procedure, 5-min periods of drug availability (signaled by a change in illumination or lever insertion) alternate with 25-min periods of unavailability over the course of a 6-h session (Zimmer et al., 2011, 2012). IntA drug self-administration has been found to produce greater addiction-like behavior than LgA drug self-administration (for reviews, see Allain et al., 2015; Kawa et al., 2019a; Samaha et al., 2021). For example, recent experiments have shown that rats’ motivation for cocaine, as measured on progressive ratio tests, was higher after IntA training than after LgA training (Algallal et al., 2020; Allain et al., 2018; Minogianis & Samaha, 2020). Studies assessing motivation with behavioral economic measures have similarly found greater cocaine motivation after IntA than after LgA training (James et al., 2019; Kawa et al., 2019b; Zimmer et al., 2012).

Increased cocaine motivation observed after IntA training has been hypothesized to be caused by neuroadaptations produced by exposure to intermittently high, “spiking” brain levels of cocaine (Allain et al., 2018; Kawa et al., 2019a, b; Samaha et al., 2021). On the IntA procedure, rats learn to self-administer at a relatively high rate during the 5-min drug availability periods. This results in a rapid rise in blood and brain levels of cocaine, which then fall to near-zero levels over the course of the subsequent 25-min non-availability period (e.g., Algallal et al., 2020; Zimmer et al., 2012). In contrast, during LgA self-administration sessions, rats maintain fairly stable levels of cocaine without the rapid rises and falls (Algallal et al., 2020; Zimmer et al., 2012). The experience of repeated spikes in brain cocaine levels is thought to be responsible for persistent changes in brain function (e.g., sensitization of cocaine’s inhibitory action at the dopamine reuptake transporter) observed after IntA self-administration, which are then thought to cause increased motivation for cocaine (Allain et al., 2021; Calipari et al., 2013, 2015; Minogianis & Samaha, 2020).

Thus far, comparisons of the effects of IntA versus LgA training on reinforcer motivation have only used drug reinforcers, and it has been assumed that spiking brain levels of a drug with direct neuropharmacological activation of reward circuitry are necessary for the behavioral changes produced by the IntA procedure. It is not yet known whether IntA training results in greater motivation for the reinforcer than LgA training when behavior is maintained by a non-drug reinforcer. To address this question, the present study compared motivation for saccharin after IntA or LgA training to determine the extent to which high spiking drug levels, and consequent neuroadaptations, are necessary for the effect or whether other aspects of the IntA training procedure may be responsible.

Experiment 1

Experiment 1 employed a between-groups design similar to that used by Algallal et al. (2020), but with saccharin as the reinforcer rather than cocaine. Separate groups of rats lever-pressed for saccharin on either the IntA or LgA procedure, after which saccharin motivation was assessed on a progressive ratio test.

Method

Subjects

Twelve adult female Long-Evans rats weighing 190–220 g upon arrival served as subjects. Rats were individually housed in plastic cages located in an animal colony room with a 12-h light:dark cycle beginning at 08:00 h. Experimental sessions took place during the light phase. Rats had ad libitum access to food and water in their home-cages throughout the experiment. All procedures were approved by American University’s Institutional Animal Care and Use Committee and were conducted in accordance with the Guide for the Care and Use of Laboratory Animals (National Academy of Sciences, 2011).

Apparatus

Training took place in six Med Associates (St. Albans, VT, USA) operant test chambers. Each chamber measured 30.5 × 24 × 29 cm and had aluminum front and rear walls with clear polycarbonate side walls. Three Med Associates retractable levers were located on the front wall of the chamber. Saccharin reinforcers were provided by operation of a Med Associates retractable sipper tube and bottle containing a 0.2% saccharin solution. The aperture through which the sipper tube was inserted was located above the middle lever, which was not used in the present experiments. A 100-mA cue light was located above the left and right levers. A 100-mA house light was located at the rear of the chamber near the ceiling.

Procedure

Acquisition

Rats were first trained on lever-press acquisition procedures. Sessions lasted 2 h and were conducted 5 days a week. Sessions began with illumination of the house light and insertion of the right lever. Initially, rats could press the lever on a fixed-ratio (FR) 1 schedule for a 20-s insertion of the saccharin sipper tube. The cue light above the lever was illuminated during the time that the tube was inserted. Additional lever presses during the tube insertion were recorded but had no consequences and were not included in the data analyses and figures presented below. Rats were trained on this procedure with 20-s saccharin tube insertions for a minimum of two sessions and until they obtained at least 25 saccharin reinforcers in a single session. Then, to promote higher response rates, the duration of the saccharin tube insertion was decreased to 10 s per reinforcer, the duration used for the remainder of the study. Rats received seven to eight acquisition sessions with the 10-s tube insertions before moving on to the next phase of the experiment. (Originally, all rats were meant to receive eight such sessions, but one rat in the IntA group inadvertently only received seven sessions. There were no increasing or decreasing trends over the last four acquisition sessions for this rat, suggesting that acquisition was complete.)

IntA and LgA training

Following acquisition, half of the rats were assigned to the IntA group (n = 6) and the other half to the LgA group (n = 6). Group assignment was made with the goal of matching groups with respect to the mean number of reinforcers obtained over the final three sessions of acquisition. Sessions in this phase were 6 h long. Because six chambers were available for the experiment and sessions lasted 6 h during this phase, the rats were run in two cohorts that received sessions every other day (following Algallal et al., 2020), Monday through Friday. There were equal numbers of rats from the IntA and LgA groups in each of the cohorts. A particular cohort of rats had three sessions (Monday, Wednesday, Friday) during even-numbered weeks of the experiment and two sessions (Tuesday, Thursday) during odd-numbered weeks, whereas the other cohort had the opposite arrangement. This meant that all rats received two to three sessions per week, with sessions separated by at least 1 day, and by more over weekends, when the rats remained in their home cages.

At the beginning of the session, the right lever was inserted into the operant chamber and the house light illuminated. The LgA group had continual access to the lever for the entirety of the 6-h session and could press the lever on an FR-1 schedule to obtain 10-s saccharin sipper tube insertions, plus cue-light illumination, as during acquisition. The IntA procedure was similar to that described by Zimmer et al. (2011). For the IntA group, the session began with illumination of the house light and a 5-min access period where the right lever was inserted into the chamber and rats could press the lever on an FR-1 schedule to obtain saccharin reinforcers. After the 5 min of access to the lever, a 25-min unavailability period occurred where the lever was retracted. The rats were unable to earn additional saccharin reinforcers during this time. The house light remained illuminated during the 25-min unavailability period. The cycle of 5-min lever insertion followed by a 25-min unavailability period repeated 12 times during the 6-h session. For each group, there were ten sessions of either LgA or IntA training during this phase. The house light remained illuminated for the entirety of the 6-h session in both groups.
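The timing of the IntA session described above can be sketched in a few lines of Python. This is only an illustration of the schedule arithmetic (12 cycles of 5 min of lever access followed by 25 min of unavailability); the function and variable names are hypothetical and do not come from the control software actually used.

```python
# Sketch of the IntA session structure: 12 cycles of 5 min of lever access
# followed by 25 min with the lever retracted. All times are in minutes.
# Names are illustrative assumptions, not the authors' control code.
ACCESS_MIN = 5
UNAVAILABLE_MIN = 25
N_CYCLES = 12


def inta_access_windows(n_cycles=N_CYCLES, access=ACCESS_MIN,
                        unavailable=UNAVAILABLE_MIN):
    """Return the (start, end) minute of each lever-availability period."""
    cycle = access + unavailable
    return [(i * cycle, i * cycle + access) for i in range(n_cycles)]


def lever_available(t_min, windows):
    """True if the lever is inserted at session time t_min on the IntA procedure."""
    return any(start <= t_min < end for start, end in windows)


windows = inta_access_windows()
session_length = N_CYCLES * (ACCESS_MIN + UNAVAILABLE_MIN)   # 360 min = 6 h
total_access = sum(end - start for start, end in windows)    # 60 min of access
print(windows[:3], session_length, total_access)
```

Under this layout the IntA group has 60 min of lever access per 6-h session, compared with 360 min for the LgA group.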

Progressive ratio test

After completing ten sessions on their respective procedures, the IntA and LgA groups were given a progressive ratio test session to measure motivation for the saccharin reinforcer. At the start of the session, the right lever was inserted and the house light was illuminated. The lever remained inserted, and the house light illuminated, for both groups throughout the duration of the session. The session started on an FR-1 schedule, but the ratio increased exponentially after each reinforcer was earned according to the equation: response ratio = [5e^(reinforcer number × 0.2)] − 5, rounded to the nearest integer (Richardson & Roberts, 1996). The sequence of ratios used was thus 1, 2, 4, 6, 9, 12, 15, 20, etc. The session ended when 1 h elapsed without a reinforcer (Richardson & Roberts, 1996).
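As a check on the ratio sequence, the progression can be computed directly. The sketch below assumes that each value is rounded to the nearest integer, which reproduces the sequence listed above; the function names are hypothetical.

```python
import math


def pr_ratio(reinforcer_number):
    """Response requirement for the n-th reinforcer (n = 1, 2, ...).

    Assumes rounding to the nearest integer, which matches the listed
    sequence 1, 2, 4, 6, 9, 12, 15, 20.
    """
    return round(5 * math.exp(reinforcer_number * 0.2) - 5)


def breakpoint_for(reinforcers_earned):
    """Final ratio completed, given how many reinforcers were earned on the test."""
    return pr_ratio(reinforcers_earned) if reinforcers_earned > 0 else 0


first_eight = [pr_ratio(n) for n in range(1, 9)]
print(first_eight)  # [1, 2, 4, 6, 9, 12, 15, 20]
```

Under the same assumption, a rat that earned nine reinforcers would have a breakpoint of `breakpoint_for(9)`, which is 25.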

Data analysis

For acquisition and LgA or IntA training, the numbers of reinforcers obtained per session were analyzed. Reinforcement rate during the time when the lever was inserted, calculated by dividing reinforcers obtained by the number of minutes that the lever was available during LgA and IntA sessions, was also compared across groups. Breakpoint, or the final ratio completed on the progressive ratio test, was the primary measure of interest. Response rate during the test was also analyzed. Because response rate was expected to vary with the ratio size, response rates were calculated separately for each reinforcer obtained on the test by dividing the ratio requirement by the time required to complete that ratio. For example, the seventh ratio on the test was 15. If a rat took 2 min to complete that ratio, its response rate for the seventh reinforcer was 7.5 responses per min. Because the reinforcers obtained on the test varied by rat, group response rates were compared only for the highest number of reinforcers obtained by all rats.
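The two rate measures described above reduce to simple divisions. The following sketch reproduces the worked example from the text (a ratio of 15 completed in 2 min) and illustrates the reinforcement-rate calculation. The session totals and per-rat reinforcer counts are hypothetical values chosen for illustration, not data from the study.

```python
def reinforcement_rate(reinforcers, lever_minutes):
    """Reinforcers per minute of lever availability."""
    return reinforcers / lever_minutes


def response_rate(ratio_requirement, minutes_to_complete):
    """Responses per minute while completing one ratio on the test."""
    return ratio_requirement / minutes_to_complete


LGA_LEVER_MIN = 360      # lever available for the whole 6-h session
INTA_LEVER_MIN = 12 * 5  # twelve 5-min availability periods

# Worked example from the text: seventh ratio (15 responses) completed in 2 min.
example_rate = response_rate(15, 2)  # 7.5 responses per min

# Hypothetical session totals, for illustration only.
lga_rate = reinforcement_rate(180, LGA_LEVER_MIN)   # 0.5 per min
inta_rate = reinforcement_rate(90, INTA_LEVER_MIN)  # 1.5 per min

# Group comparisons are restricted to ratios completed by every rat, i.e.,
# the smallest number of reinforcers earned by any rat (hypothetical counts).
reinforcers_per_rat = [9, 11, 12, 10]
common_n = min(reinforcers_per_rat)
print(example_rate, lga_rate, inta_rate, common_n)
```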

Non-parametric statistical tests were used due to non-normal distributions, as indicated by significant (p < 0.05) Shapiro-Wilk tests, for key measures. Mann-Whitney U tests were used to evaluate LgA vs. IntA between-group differences. Friedman’s tests and Wilcoxon Signed-Ranks tests were used to evaluate within-subjects effects. The significance level was set to α = 0.05 for all statistical tests. The Benjamini and Hochberg (1995) false discovery rate procedure was used to control α at less than 0.05 for collections of related multiple comparisons.
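For readers unfamiliar with the Benjamini and Hochberg (1995) step-up procedure, a minimal pure-Python sketch follows. It is not the analysis code used in the study, and the p-values in the example are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected
    while controlling the false discovery rate at q (step-up procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) such that p_(k) <= (k / m) * q.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_k = rank
    # Reject every hypothesis whose p-value has rank <= max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical family of seven related comparisons.
p_vals = [0.027, 0.30, 0.45, 0.26, 0.60, 0.81, 0.52]
decisions = benjamini_hochberg(p_vals)
print(decisions)
```

In this hypothetical family, the smallest p-value (0.027) exceeds its threshold of 0.05/7, so no comparison is declared significant even though one unadjusted p-value is below 0.05.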

Results

Acquisition

The mean numbers of acquisition sessions for the LgA and IntA groups were 10.0 and 11.2, respectively. During the final three acquisition sessions, the mean numbers of saccharin reinforcers obtained by the LgA and IntA groups were 107.2 and 96.5, respectively. There were no significant differences between the groups on either of these measures (for sessions, U[6,6] = 24, p > 0.15; for reinforcers, U[6,6] = 11, p > 0.3).
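The U statistics reported throughout the Results can be understood as pair counts. The sketch below computes both U values for two independent samples, using hypothetical reinforcer counts rather than data from the study. The text does not state which of the two U values is reported, so the code returns both.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples.

    U_x counts the (x, y) pairs in which the x value is larger, with ties
    counted as 0.5; U_y is the complementary count, so U_x + U_y = len(x) * len(y).
    """
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical reinforcer counts for two groups of six rats.
lga = [100, 120, 95, 130, 110, 88]
inta = [90, 105, 99, 85, 115, 92]
u_lga, u_inta = mann_whitney_u(lga, inta)
print(u_lga, u_inta)  # 25.0 11.0
```

With six rats per group there are 36 pairs, so each U value lies between 0 and 36; a value of 0 means that every rat in one group scored below every rat in the other.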

IntA versus LgA training

Figure 1a shows the mean number of saccharin reinforcers obtained by the LgA and IntA groups on each of the LgA or IntA sessions. The LgA group earned approximately 150–200 reinforcers per session across the ten LgA sessions. The IntA group took approximately 50 reinforcers on the first IntA session and then increased to about 90–100 reinforcers on the remaining sessions. Collapsed across sessions, the LgA rats earned significantly more reinforcers than IntA rats (U[6,6] = 1, p < 0.005), which is not surprising because LgA rats could lever-press for saccharin continuously for 6 h whereas the IntA rats only had 1 h of total reinforcer availability per session. There was no change in number of reinforcers obtained by the LgA rats across sessions (Friedman’s χ²[9] = 14.1, p > 0.10). For the IntA group, there was a marginally significant trend towards escalation of intake across sessions (χ²[9] = 16.9, p = 0.051). Figure 1b shows the percentage change in reinforcers obtained from session 1 to session 10 for the two groups. The percentage change in the IntA group (+128%) was significantly greater than in the LgA group (−5%; U[6,6] = 3, p < 0.05).

Fig. 1

(a) Mean (± SEM) reinforcers earned per session during the ten long access (LgA) or intermittent access (IntA) training sessions of Experiment 1. (b) Mean (± SEM) percentage change in reinforcers obtained from session 1 to session 10 in the LgA and IntA groups. (c) Mean (± SEM) rate of reinforcement during lever insertion for the LgA and IntA groups averaged over the final three LgA or IntA sessions. ** indicates p < 0.01, * indicates p < 0.05

Figure 1c shows, averaged over the last three LgA and IntA sessions, the reinforcement rate experienced by each group during the time that the lever was available. (Because an FR-1 schedule was used, the reinforcement rate was the same as the response rate.) Although the IntA group earned about half as many total saccharin reinforcers per session as the LgA group, they experienced a higher rate of reinforcement in the presence of the lever because the lever was available for only 60 min per session for the IntA group versus 360 min per session for the LgA group. A Mann-Whitney test confirmed that reinforcement rate was significantly higher in the IntA group (U[6,6] = 0, p < 0.005).

Figure 2 shows cumulative records from the final session for two LgA rats (top panels) and two IntA rats (bottom panels). These rats’ records were chosen as they clearly illustrate the difference in within-session patterning of responding observed across IntA and LgA groups. As the figure illustrates, the LgA rats responded at a fairly consistent and high rate during the first 1–2 h of the session. As the session progressed, however, long pauses between reinforcers occurred and rats responded little during the latter parts of the session. The IntA rats generally responded at high rates throughout the 5-min availability periods occurring early in the session, but they often did not respond at all during availability periods occurring later in the session.

Fig. 2

Representative cumulative records from the final training session from two rats in the long access (LgA) group (top panels) and two rats in the intermittent access (IntA) group (bottom panels) in Experiment 1. Slash marks indicate when saccharin reinforcers were earned

Figure 3a shows the number of saccharin reinforcers earned by rats during each hour of the first and last LgA or IntA session. On both of these sessions, the LgA group took approximately 90 reinforcers during the first hour and gradually decreased the number of reinforcers taken to about ten per hour towards the end of the session. The IntA group took about 25–35 reinforcers in the first hour and this number decreased to single digits by the end of the session. Mann-Whitney tests comparing the groups at each hour of the final training session indicated that the LgA group took significantly more reinforcers during the first hour of the session (U[6,6] = 0, p < 0.005), but the groups did not significantly differ during any of the other 5 h. Friedman’s tests confirmed that saccharin taking significantly decreased over hours for both groups on both the first and last session (all χ²[5]s ≥ 17, all ps < 0.005).

Fig. 3

(a) Mean (± SEM) reinforcers earned during each hour on the first and last intermittent access (IntA) or long access (LgA) training session. (b) Mean (± SEM) reinforcers earned per 10-min bin of the first hour of the session (first and last training session) for the LgA group

Figure 3b shows LgA group responding over 10-min bins during the first hour of the first and last training session. A 10-min bin was used because that is how much saccharin access per hour the IntA rats had. That is, in terms of saccharin availability, a 10-min bin in the LgA group is equivalent to an hour of an IntA session because the IntA group had 10 min of saccharin availability per hour (i.e., two 5-min availability periods). When saccharin taking over these successive 10-min periods of saccharin access was compared on the final session (i.e., when LgA last session data in Fig. 3b were compared to IntA last session data in Fig. 3a), the IntA group took significantly more saccharin than the LgA group during the first 10 min of saccharin access (U[6,6] = 1.5, p < 0.005), but significantly fewer reinforcers during hour 5 (U[6,6] = 2, p < 0.01). There were no other significant differences between the groups. The LgA group significantly decreased their saccharin taking over successive 10-min bins of the first hour of both the first and last session (both χ²[5]s ≥ 13.9, both ps ≤ 0.025).
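The access-matched comparison described above amounts to binning reinforcer times with different bin widths for the two groups: 10-min bins for the first hour of an LgA session and 60-min bins for an IntA session, each of which contains 10 min of saccharin access. A minimal sketch follows. The timestamps and function name are hypothetical and are used only to show the binning.

```python
def count_in_bins(event_times_min, bin_width_min, n_bins):
    """Count events (times in minutes from session start) in consecutive bins.

    Events falling at or beyond n_bins * bin_width_min are ignored.
    """
    counts = [0] * n_bins
    for t in event_times_min:
        b = int(t // bin_width_min)
        if 0 <= b < n_bins:
            counts[b] += 1
    return counts


# Hypothetical reinforcer times (min) for one LgA rat and one IntA rat.
lga_times = [0.5, 1.2, 3.0, 9.9, 10.0, 25.0, 59.0, 61.0]
inta_times = [0.3, 2.0, 31.0, 33.5, 60.5, 185.0]

# Six 10-min bins cover the first hour of the LgA session.
lga_bins = count_in_bins(lga_times, 10, 6)
# Six 60-min bins cover the IntA session; each holds 10 min of access.
inta_hours = count_in_bins(inta_times, 60, 6)
print(lga_bins, inta_hours)
```

Each element of `lga_bins` can then be compared with the element of `inta_hours` at the same position, since both represent the same cumulative amount of saccharin access.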

Progressive ratio test

Figure 4a shows the main result of interest for Experiment 1. The mean breakpoint reached by the IntA group (74.5) was approximately double that of the LgA group (36.8). A Mann-Whitney test confirmed that the IntA group reached significantly higher breakpoints than the LgA group (U[6,6] = 5, p < 0.05). The highest number of reinforcers obtained by all rats was 9. Therefore, the groups were compared on response rates observed for each of the first nine ratios completed on the test. Figure 4b shows that the groups responded at comparable rates as they completed these ratios (no group difference; all U[6,6]s ≥ 11, all ps > 0.3). Note that the response ratio for the first reinforcer was only one response, and many rats responded within the first few seconds of the start of the test session, which led to the high response rate for the first reinforcer. Figures 4c and 4d show the relation between progressive ratio (PR) breakpoint and responding on the last training session in the LgA and IntA groups, respectively. The correlation coefficients (Spearman rhos) were 0.77 (p = 0.07) and 0.93 (p < 0.001) for the LgA and IntA groups, respectively.

Fig. 4

(a) Mean (± SEM) breakpoint on the progressive ratio test for the long access (LgA) and intermittent access (IntA) groups in Experiment 1. (b) Mean (± SEM) response rates observed in the LgA and IntA groups during the ratios corresponding to the first nine reinforcers of the test. The numbers in parentheses on the x-axis indicate the ratio in effect for a particular reinforcer number during the test. (c) and (d) Scatterplots showing the relation between last training session responses and breakpoint on the progressive ratio test for the LgA and IntA groups, respectively. Note that the x- and y-axis scales differ between panels c and d. * indicates p < 0.05

Discussion

The main finding of Experiment 1 was that IntA training produced greater motivation for saccharin, as measured on a progressive ratio schedule, than LgA training. This result extends a previously observed effect of IntA versus LgA training with drug reinforcers (Algallal et al., 2020; Allain et al., 2018; Minogianis & Samaha, 2020) to a non-drug reinforcer. That increased motivation after IntA occurs with both drug and non-drug reinforcers may suggest that aspects of the training procedure, rather than rapidly spiking drug levels and consequent neuroadaptations, are responsible for the effect. As described below, Experiment 2 further addresses this point.

No escalation in the numbers of saccharin reinforcers was observed during LgA training and only a marginal escalation effect, due mainly to relatively low intake on session 1, was observed in the IntA group. While escalation of drug taking has been commonly observed after LgA self-administration, there have been exceptions where no escalation was found (e.g., Kippin et al., 2006; Minogianis et al., 2013). Few studies have investigated LgA training with non-drug reinforcers in rats, but one experiment employing a liquid sucrose reinforcer found no escalation across 21 LgA sessions lasting 6 h (Anker et al., 2010) and another study using a saccharin reinforcer actually found a decrease in intake over the course of 14 LgA sessions (Westbrook et al., 2020). IntA training with cocaine as the reinforcer can result in escalation of intake or not, depending on procedural parameters (Allain et al., 2018; Allain & Samaha, 2019; Algallal et al., 2020; Kawa et al., 2019a). Nonetheless, similar effects of IntA training on motivation for cocaine (as assessed by progressive ratio tests) were observed whether or not escalation of cocaine intake occurred (Allain et al., 2018; Allain & Samaha, 2019). The present results indicate that when saccharin is the reinforcer, increased motivation after IntA training similarly does not depend on a robust escalation of intake.

The patterns of responding depicted by the cumulative records presented in Fig. 2 suggest that rats in both groups became sated on saccharin over the course of the 6-h IntA or LgA sessions. Characteristics of the familiar satiation curve (e.g., Owen, 1960; Sidman & Stebbins, 1954) can be seen in the LgA rats, where pauses after reinforcers became more frequent and gradually became longer as the session progressed. Rats in the IntA group, on the other hand, displayed what resembled an all-or-nothing pattern, where they either responded at high rates or, later in the session, not at all during the 5-min availability periods. Interestingly, IntA rats showed these signs of apparent satiation despite earning, on average, half as many reinforcers per session as the LgA group.

Analysis of within-session responding, however, suggests that after accounting for differences in saccharin access, differences in rates of potential satiation across groups appear smaller. As Fig. 3a shows, the LgA rats took approximately 90 saccharin reinforcers in the first hour of sessions. On the last training session, IntA rats took approximately 90 saccharin reinforcers over the whole 6-h session, but this included only an hour of saccharin access. Thus, if only compared over the first hour of saccharin access (which was spread over the 6-h session for IntA rats), the groups took nearly identical numbers of saccharin reinforcers. Whether the IntA group would go on to take approximately 45 additional reinforcers in a second hour of access, as the LgA rats did, is unknown, and would require a 12-h IntA session. Comparison of groups over successive 10-min bins of saccharin availability (i.e., comparison of the LgA group’s data in Fig. 3b with the IntA group’s data in Fig. 3a) suggests that the IntA group may still sate somewhat faster than the LgA group. It is unknown why this should be the case.

Prior research indicates that the mechanisms controlling saccharin satiety are different from those regulating ingestion of other substances. Post-ingestive consequences, which inhibit further consumption of caloric solutions and food, do not control the termination of saccharin drinking (Mook et al., 1980, 1981). Instead, oropharyngeal satiety appears to determine when a rat stops drinking saccharin (Mook et al., 1981). It has been suggested that the oropharyngeal receptors act as a kind of “metering device” (Collier & Novell, 1967) and that “the rat passes a fixed amount of saccharin solution through the mouth and over the tongue before stopping” (Mook et al., 1981). Possible remaining differences in rates of satiation across LgA and IntA conditions could be due to the way that this metering mechanism integrates saccharin consumption over different time periods or kinds (continuous vs. intermittent) of saccharin access.

Experiment 2

In Experiment 1, IntA training produced greater motivation for saccharin than LgA training. Spiking drug levels could not have been responsible for this effect because no drugs were used. It may still be argued, however, that IntA saccharin could have produced some other type of lasting neuroadaptation that increased rats’ motivation for saccharin. To address this, Experiment 2 used a within-subjects design in which each rat experienced alternating IntA and LgA sessions with different levers, after which motivation for saccharin was assessed with separate progressive ratio tests on each lever. If IntA training produced lasting neuroadaptations that are the cause of greater motivation for saccharin, then this motivation should similarly increase responding on both the IntA and LgA levers during the progressive ratio tests. Observation of higher breakpoints on the IntA lever would therefore require a different explanation.

Method

Subjects

The rats used in Experiment 1 were also used in Experiment 2. They weighed 260–325 g at the start of the second experiment. Housing and feeding conditions were the same as in Experiment 1.

Apparatus

The same operant chambers used in Experiment 1 were used in Experiment 2.

Procedure

Design overview

A within-subjects design was used in Experiment 2 in which all 12 rats from Experiment 1 pressed one lever (right or left, counterbalanced) for saccharin on the IntA procedure and pressed the other lever for saccharin on the LgA procedure. The procedure (IntA or LgA) assigned to the right lever was the one that the rat had prior experience with in Experiment 1, and the new procedure was assigned to the left lever. Hence, for rats that were in the IntA group in Experiment 1, the right lever served as the IntA lever and the left lever served as the LgA lever. The opposite was true for rats that were in the LgA group in Experiment 1. This meant that the IntA and LgA procedures were evenly counterbalanced across left and right levers for the 12 subjects in Experiment 2. Further, familiarity with the IntA versus LgA procedures was also counterbalanced given that half of the rats had prior experience with IntA, and the other half had prior experience with LgA. Following training with the different procedures on the two levers, all rats were given one progressive ratio test with the IntA lever and one progressive ratio test with the LgA lever.

IntA and LgA training

Lever-press acquisition training was unnecessary because all rats had already learned to lever press in Experiment 1. As in Experiment 1, rats had training sessions every other weekday. One cohort of six rats was trained on one day and the other cohort was trained on the alternate day (the same cohorts as in Experiment 1). The first two sessions were with the right lever and the procedure (either IntA or LgA) that the rats had prior experience with. This was meant to re-establish baseline responding on the familiar procedure before introducing the new procedure. The details of the IntA and LgA procedures were the same as described in Experiment 1. After the first two sessions with the right lever, the rats had eight consecutive sessions on their new procedure with the left lever. Then, the procedure alternated over the final four sessions such that the sequence for half the rats was IntA, LgA, IntA, LgA and for the other half it was LgA, IntA, LgA, IntA.

Progressive ratio tests

After completing ten total sessions on the new procedure, all rats were given a progressive ratio test session with each lever, on separate days, to measure motivation for the saccharin reinforcer. Half of the rats experienced the test with the right lever first and the other half had the test with the left lever first. Within each of these left-right versus right-left order subgroups, half of the rats had experienced LgA on the right lever and IntA on the left lever, and the other half experienced the opposite arrangement. The progressive ratio test procedure was the same as described in Experiment 1.

Results

IntA and LgA training

Rats readily learned to press the left lever on the new procedure. Figure 5a shows the mean number of saccharin reinforcers per session over the final two LgA and IntA sessions, which alternated, for all rats. Rats earned significantly more saccharin reinforcers during the final two LgA sessions than during the final two IntA sessions (T+[12] = 78, p < 0.001). Figure 5b shows that, similar to Experiment 1, rats earned about three times more saccharin reinforcers per minute on the IntA procedure than on the LgA procedure during these sessions (T+[12] = 78, p < 0.001). There were no differences between the rats that were formerly in the LgA group or IntA group of Experiment 1 in terms of total reinforcers per session or reinforcement rate (all U[6,6]s ≥ 12, all ps > 0.3). For the rats that had IntA as the new procedure, there was no significant change in number of reinforcers obtained over the ten IntA sessions (χ²[9] = 14.3, p > 0.1). Similarly, there was no evidence of escalation in the rats that had LgA as the new procedure over the ten LgA sessions (χ²[9] = 10.5, p > 0.3).

Fig. 5

(a) Mean (± SEM) reinforcers earned per session during the final two long access (LgA) and intermittent access (IntA) sessions in Experiment 2. (b) Mean (± SEM) rate of reinforcement during lever insertion for the LgA and IntA levers averaged over the final two LgA or IntA sessions. *** indicates p < 0.001

Figure 6 shows cumulative records for two rats’ final sessions on the LgA and IntA procedures. Subject M11 (left panel) was formerly in the IntA group of Experiment 1 and Subject M6 (right panel) was formerly in the LgA group of Experiment 1. These rats were chosen because they illustrate well the difference between procedures and because they are not the same rats whose records were presented in Fig. 2. The patterns of responding on the two procedures in Experiment 2 were similar to those observed in Experiment 1. On the LgA procedure, pauses between reinforcers tended to become gradually longer as the session progressed. On the IntA procedure, rats typically either responded at high rates or, late in the session, not at all during the availability periods.

Fig. 6

Representative cumulative records from the final long access (LgA) and intermittent access (IntA) training sessions for two rats in Experiment 2

Progressive ratio tests

Figure 7a shows the mean breakpoints from the progressive ratio test. Rats reached significantly higher breakpoints on the test with the IntA lever than on the test with the LgA lever (T+[10] = 48.5, p < 0.05). The prior history of the rats did not impact test results. For the subgroup of rats that were in the IntA group in Experiment 1, mean breakpoints on the IntA and LgA levers were 52.3 and 43.5, respectively. For the rats that were in the LgA group in Experiment 1, mean breakpoints on the IntA and LgA levers were 52.7 and 42.0, respectively. These subgroups did not differ on IntA or LgA breakpoint in Experiment 2 (both U[6,6]s ≥ 14, both ps > 0.55). Further, the difference between IntA and LgA lever breakpoints in Experiment 2 did not differ across these subgroups (U[6,6] = 18, p > 0.99). Finally, the order of the tests had no impact on test results. There were no differences in IntA breakpoint, LgA breakpoint, or the difference between IntA and LgA breakpoints for the subgroups having the different test orders (all U[6,6]s ≥ 12.5, all ps > 0.35).

Fig. 7

(a) Mean (± SEM) breakpoint on the long access (LgA) and intermittent access (IntA) progressive ratio tests in Experiment 2. (b) Mean (± SEM) response rates observed on the LgA and IntA levers during the ratios corresponding to the first seven reinforcers of the test. The numbers in parentheses on the x-axis indicate the ratio in effect for a particular reinforcer number during the test. (c) and (d) Scatterplots showing the relation between last training session responses and breakpoint on the progressive ratio test for the LgA and IntA groups, respectively. Note that the x-axis scales differ between panels c and d. * indicates p < 0.05

Figure 7b shows response rates during each ratio of the test for the highest number of reinforcers obtained by all rats – seven in this case – on both tests. A pattern similar to that observed in Experiment 1 was seen here as well, with rats responding at comparable rates during the IntA and LgA tests. Wilcoxon signed-ranks tests performed for each ratio resulted in a p-value less than 0.05 only for the seventh ratio (T+[12] = 67, p = 0.027), which is not significant after adjusting α for multiple comparisons. For all other ratios, p-values were greater than 0.25 (all T+s ≤ 54). Figures 7c and 7d show the relation between PR breakpoint and last session responding for the two groups. The correlation coefficients were 0.63 (p < 0.05) and 0.68 (p < 0.025) for the LgA and IntA groups, respectively, indicating that in both conditions, the higher responders during training tended to have higher breakpoints on the test.
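The adjustment of α mentioned above can be made concrete with a short worked check. The sketch below assumes a Bonferroni correction across the seven ratios; the text does not name the specific correction used, so this is an illustration rather than the authors' exact procedure. The variable names are hypothetical.

```python
# Assumed Bonferroni correction: divide the familywise alpha by the
# number of comparisons (one Wilcoxon test per ratio, seven ratios).
alpha = 0.05
n_comparisons = 7
p_seventh_ratio = 0.027  # p-value reported for the seventh ratio

adjusted_alpha = alpha / n_comparisons  # roughly 0.0071
is_significant = p_seventh_ratio < adjusted_alpha

print(f"adjusted alpha = {adjusted_alpha:.4f}; significant: {is_significant}")
```

Under this assumption, the reported p = 0.027 exceeds the adjusted threshold, which is consistent with the statement that the seventh-ratio difference is not significant after adjustment.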

Discussion

Experiment 2 replicated the main results of Experiment 1 in a within-subjects design. Consistent with the primary outcome of Experiment 1, rats reached higher breakpoints on the progressive ratio test with the IntA lever than on the progressive ratio test with the LgA lever. These results were unaffected by procedure history; rats showed higher breakpoints on the IntA lever compared to the LgA lever regardless of which procedure they experienced in Experiment 1. Though significantly elevated breakpoints were observed on the IntA lever, the difference in breakpoints between IntA and LgA levers was somewhat smaller than the difference between IntA and LgA groups in Experiment 1. This may have been due to some generalization of learning across levers.

General discussion

The main findings of this study were that the IntA procedure produced greater progressive ratio breakpoints, thought to reflect motivation, for saccharin reinforcers than the LgA procedure in experiments using both between- and within-subjects designs. These results show the same pattern of increased reinforcer motivation as previous studies that have used cocaine reinforcers (Allain et al., 2018; Algallal et al., 2020). The similarity in increased motivation after IntA training for both drug and non-drug reinforcers suggests that factors other than the direct effects of the drug on the brain contribute to this effect. However, it may be argued that intermittent saccharin access could have produced neuroadaptations, perhaps different from those produced by cocaine, and this could explain the IntA effect on motivation in this study. For example, in humans, saccharin has been shown to partly activate food reward circuitry (Yang, 2010). This activation appears to be due to saccharin’s taste rather than to direct effects on the brain (Haase et al., 2009). Nonetheless, perhaps indirect activation of brain circuitry triggered by intermittent exposure to the taste of saccharin could produce lasting neuroadaptations. Research with rats has found that long-term exposure to saccharin in the drinking water can interfere with hippocampal integrity and produce impairments in learning tasks involving the hippocampus (Erbaş et al., 2018), suggesting that saccharin can have direct effects on the brain. Perhaps such effects could have altered the brain in a way leading to greater saccharin motivation in the IntA group than in the LgA group here.

The idea that increased saccharin motivation after IntA training was caused by neuroadaptations produced by saccharin’s direct effects on the brain is more difficult to reconcile with the results of Experiment 2. Rats still reached higher breakpoints on the IntA lever than the LgA lever in the within-subjects design of Experiment 2. If lasting neuroadaptations produced by intermittently high levels of saccharin intake increased motivation for saccharin, this should have been the case on the progressive ratio tests with both the IntA lever and the LgA lever since it was the same rat taking both tests (in counterbalanced order). That breakpoints significantly differed across levers suggests that another factor is responsible for the IntA vs. LgA effect on reinforcer motivation.

One potential explanation is that rats learned to press faster during the IntA procedure due to the time constraints imposed by each 5-min availability period. That is, during IntA training, rats might have learned that they had to press at high rates during periods of lever availability to obtain, or at least approximate, their preferred daily level of saccharin intake. In contrast, during LgA training, rats could have responded at a more leisurely pace and still obtained their preferred amount of saccharin. If, during IntA training, rats learned to press the lever faster than during LgA training, this difference could have carried over to the progressive ratio tests, making it easier for rats to reach higher breakpoints in the IntA group of Experiment 1 or on the IntA lever of Experiment 2. However, as Figs. 4b and 7b illustrate, peak response rates were generally similar across IntA and LgA conditions during the test, suggesting that faster responding is not what was responsible for higher breakpoints after IntA training. Instead, it appears that rats persisted more at the higher ratios after IntA training without necessarily responding faster.

An alternative potential explanation is that features of the IntA training procedure promoted habitual responding, while responding after LgA training remained goal-directed. Habitual responding has been thought to be stimulus-driven and controlled by the stimulus-response (S-R) association, whereas goal-directed responding is controlled by the response-outcome (R-O) association (Adams & Dickinson, 1981; Balleine & O’Doherty, 2010). More recent work indicates that the key to the formation of habits is the predictability of reinforcement signaled by S (Thrailkill et al., 2018, 2021). Manipulations such as reward devaluation through satiation (Vandaele et al., 2017) or pairing with an aversive stimulus (Bouton et al., 2021; Steinfeld & Bouton, 2021), extinction (Zapata et al., 2010), and contingency degradation (Vandaele et al., 2017) are typically used to determine whether responding is habitual or goal-directed. If behavior decreases in response to one of these manipulations, it is goal-directed; if it is insensitive to these manipulations, it is habitual. A progressive-ratio test can be thought of as a gradual approximation to extinction because as the test proceeds, more and more responses go unreinforced and ultimately the subject stops responding. There is evidence that intermittent exposure to lever insertions signaling predictable response-dependent reinforcement, which is similar to the experience of rats on the IntA procedure of the present experiment, is especially likely to make responding habitual (Thrailkill et al., 2021; Vandaele et al., 2017), whereas continuous access to a lever signaling reinforcer availability (as in the LgA condition here) makes responding goal-directed (Vandaele et al., 2017). Higher breakpoints might be expected on the progressive ratio test after IntA training if such training produced habitual responding because rats would be relatively insensitive to the gradual thinning of the reinforcement schedule. However, contrary to this intuition, prior research has shown that habitual responses are actually less resistant to extinction than goal-directed responses (Thrailkill et al., 2018; see Bouton et al., 2020, for more on the impermanence of habits). Thus, it seems unlikely that potential habit learning in the IntA group could explain the present results.

Singer et al. (2018) also provided evidence suggesting habit learning is not responsible for increased motivation observed after IntA training with cocaine as the reinforcer. They trained rats on a modified seeking-taking chain where in the first (seeking) link, subjects had to complete a “puzzle” (e.g., nose poke twice, then turn a wheel four times) to advance to the taking link. In the taking link, a retractable lever was inserted and rats could press it to obtain cocaine infusions. The lever was inserted for a 5-min availability period followed by a 25-min unavailability period, as in the usual IntA procedure. A new seeking-link puzzle was introduced each session, thereby preventing the formation of seeking habits. Nevertheless, rats increased puzzle-solving proficiency and increased the rate of seeking responses over sessions. They also showed evidence of addiction-like behavior (e.g., increased resistance to punishment, increased reinstatement) when tested on the taking lever. The authors noted that while their experiment was not designed to test whether taking responses became habitual, they did find that rats typically took four to five infusions in the first minute of each IntA availability period before stopping, which suggests that taking responses remained outcome-sensitive. This is more evidence that the pattern of responding on IntA schedules likely does not depend on habit learning.

A more promising potential explanation for the increased breakpoints after IntA training is based on differences in the Pavlovian properties conditioned to discriminative cues controlling responding after IntA versus LgA training. In the current study, the levers acted as discriminative stimuli (SDs) that signaled saccharin availability. During the time that the lever was inserted into the chamber, rats in the IntA condition earned three times more reinforcers per minute than rats in the LgA condition (in both the between- and within-subjects designs of Exps. 1 and 2, respectively). This should have resulted in more Pavlovian excitation being conditioned to the lever after IntA training than after LgA training. Research on Pavlovian-to-instrumental transfer (PIT) has shown that the Pavlovian properties of cues can motivate instrumental responding (Holland, 2004; Rescorla, 1994). In PIT experiments, a separately conditioned Pavlovian cue is typically superimposed on an operant baseline. But Pavlovian conditioning is also embedded within discriminative operant training: an SD signals a stimulus-outcome (S-O) association in addition to signaling a response-outcome (R-O) relation (Weiss, 1978). It is this type of implicit Pavlovian conditioning that is hypothesized to occur during IntA training.

Some IntA studies (e.g., Kawa et al., 2016) have used a constantly available nose-poke port as the manipulandum, rather than retractable levers, with cocaine availability periods signaled by a separate SD such as the house light turning off. (It is worth noting that prior research with non-drug reinforcers has found that light-off, rather than light-on, can serve as an effective SD in discrimination training [Weiss, 1969; Weiss et al., 2009].) Procedures where the operant manipulandum is constantly present are like traditional multiple (mult) schedules (i.e., mult FR-1 extinction in the case of IntA), which have often been used to study the Pavlovian properties conditioned to SDs (Weiss, 2014). Indeed, some learning theories (e.g., scalar expectancy theory; Gibbon & Balsam, 1981) would predict that the relatively long SΔ components (i.e., the 25-min signaled periods when responding was not reinforced) in the IntA procedure should make the SD an especially strong conditioned excitor as compared to more common mult schedules where SD and SΔ are of similar length.
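The multiple-schedule structure described above can be sketched as a simple timeline. The Python sketch below assumes a 6-h session in which 5-min FR-1 components alternate with 25-min extinction components, following the commonly used IntA parameters. The function and variable names are hypothetical, and the session length is an assumption rather than a parameter stated in this section.

```python
def inta_timeline(session_min=360, sd_min=5, ext_min=25):
    """Return a list of (start_min, end_min, component) tuples for a
    mult FR-1 extinction schedule that begins with an FR-1 component."""
    timeline = []
    t = 0
    while t < session_min:
        # Reinforced (FR-1) component, signaled by the discriminative cue.
        sd_end = min(t + sd_min, session_min)
        timeline.append((t, sd_end, "FR-1"))
        t = sd_end
        if t < session_min:
            # Non-reinforced (extinction) component.
            ext_end = min(t + ext_min, session_min)
            timeline.append((t, ext_end, "extinction"))
            t = ext_end
    return timeline


timeline = inta_timeline()
sd_total = sum(end - start for start, end, comp in timeline if comp == "FR-1")
print(len(timeline), sd_total, sd_total / 360)
```

Under these assumed parameters, the reinforced component is in effect for 60 of the 360 min, or one sixth of the session, which illustrates how much longer the non-reinforced components are than the reinforced ones.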

Research related to behavioral momentum theory (Nevin, 2012; Nevin et al., 1990) illustrates well how the Pavlovian conditioning that accrues to discriminative cues can motivate operant responding and make it resistant to disruption (e.g., by extinction, satiation, or punishment). For example, pigeons’ and rats’ responding was more resistant to extinction or satiation in the presence of cues that signal high rates of reinforcement than in the presence of cues that signal lower rates of reinforcement (Bai & Podlesnik, 2017; Shull et al., 2002). Furthermore, it was shown that this effect was due to the S-O (Pavlovian) relation rather than the R-O (operant) relation signaled by the discriminative cue (Nevin et al., 1990; Shull et al., 2002). In the present study, the presence of the IntA lever should have made responding more resistant to disruption because this lever was a discriminative cue associated with a higher rate of reinforcement. This could explain the increased persistence of responding on the progressive ratio tests observed after IntA compared to LgA training. Future experiments that manipulate the properties of the cues that are present during testing will be needed to rigorously test the Pavlovian conditioning explanation of the IntA vs. LgA effect. Such an account has the potential to parsimoniously explain the increased motivation observed after IntA training with both drug and non-drug reinforcers.

That learning factors could potentially explain the increased motivation produced by IntA training observed here does not preclude the notion that IntA drug exposure results in neuroadaptations that cause changes in behavior. For example, IntA cocaine self-administration has been shown to produce psychomotor sensitization (Allain et al., 2021; Carr et al., 2020), dopamine sensitization (Kawa et al., 2019b), and increased cocaine seeking in an incubation of craving design (Nicolas et al., 2019). Compared to when food was the reinforcer, IntA training with a cocaine reinforcer resulted in more incubation of craving and more cue-induced reinstatement as well as increased BDNF expression in the ventral tegmental area and prelimbic cortex (Gueye et al., 2019). (It should be noted, though, that cocaine cues can be more effective than food cues in promoting responding after extinction even when IntA training is not used [Ciccocioppo et al., 2004; Tunstall & Kearns, 2016].) It is less easy to see how the learning processes described above with regard to increased reinforcer motivation can account for these various other consequences of IntA drug exposure. It is possible that learning processes and cocaine-related neuroadaptations both contribute to the increased motivation for cocaine seen after IntA training. Future research will be needed to determine how important these different processes are and whether learning factors could help explain other consequences of IntA drug exposure.

In conclusion, recent drug self-administration studies have shown that different drug access conditions influence rats’ motivation for drug reinforcers (Allain et al., 2018; Algallal et al., 2020; Kawa et al., 2019a; Zimmer et al., 2012). Prior to this study, it was unclear if these behavioral changes were specific to drug reinforcers. It had been hypothesized that increased motivation was due to spiking brain cocaine levels and related neuroadaptations (Calipari et al., 2015; Siciliano & Jones, 2017; Allain et al., 2018; Algallal et al., 2020; Kawa et al., 2019a; Zimmer et al., 2012). The present results show that the difference in reinforcer motivation produced by IntA and LgA procedures is not limited to drug reinforcers. Future research is needed to determine how associations among stimuli, responses, and outcomes formed within different self-administration procedures contribute to reinforcer motivation. Testing animal models of addiction with non-drug reinforcers is crucial for understanding the extent to which observed addiction-related behavior is due to drug experiences per se or due to general learning and behavioral processes engaged by the procedures used.