Locus coeruleus-norepinephrine (LC-NE) neurons exhibit two distinct modes of activity: phasic and tonic (Aston-Jones & Bloom, 1981; Aston-Jones & Cohen, 2005). In the phasic mode, LC-NE neurons exhibit moderate baseline activity, with short bursts of higher frequency firing (10–15 Hz) associated with salient stimuli or decisions. These burst responses are thought to optimize performance within a cognitive task, promoting appropriate actions that lead to reward (e.g., lever presses; Bouret & Sara, 2004; Clayton, Rajkowski, Cohen, & Aston-Jones, 2004; Shea-Brown, Gilzenrat, & Cohen, 2008; Usher, Cohen, Servan-Schreiber, Rajkowski, & Aston-Jones, 1999). In contrast, in the tonic mode LC-NE neurons exhibit elevated and more irregular baseline activity (2–6 Hz) with few evoked (burst) responses (Aston-Jones & Cohen, 2005; Aston-Jones, Rajkowski, Kubiak, & Alexinsky, 1994). Periods of high tonic activity are associated with poor performance and increased distractibility within a given task, as animals participate in fewer trials and commit more errors (Aston-Jones et al., 1994).

Although increased tonic activity can be detrimental to performance within a task, adaptive gain theory (AGT) proposes that increased tonic activity may promote behavioral flexibility, especially when utility in a given task is low (Aston-Jones & Cohen, 2005). LC-NE tonic activity increases when reward contingencies change (Aston-Jones, Rajkowski, & Kubiak, 1997), and increasing NE via pharmacological manipulations improves behavioral flexibility in rats (McGaughy, Ross, & Eichenbaum, 2008). Furthermore, baseline pupil diameter, an indirect measure of LC-NE tonic activity (Aston-Jones & Cohen, 2005; Joshi, Li, Kalwani, & Gold, 2016; Varazzani, San-Galli, Gilardeau, & Bouret, 2015), increases as reward rate within a task wanes (Gilzenrat, Nieuwenhuis, Jepma, & Cohen, 2010). However, a causal relationship between LC tonic activity and propensity to change behavior has yet to be established.

To test for a causal relationship, we selectively stimulated LC tonic activity using designer receptors (DREADDs; Alexander et al., 2009; Armbruster, Li, Pausch, Herlitze, & Roth, 2007) as rats performed a patch-foraging task. In this task, rats repeatedly made decisions to exploit reward within a patch that depleted over time, or to leave the patch to travel to a new, full one. The marginal value theorem (MVT) describes the optimal behavior in patch-foraging tasks, which is to leave a patch when the reward rate within the current patch matches or falls below the average reward rate across all patches (Charnov, 1976). MVT makes two specific predictions about behavior in patch-foraging tasks: (i) animals should stay longer in patches that contain more reward, as it will take longer for the reward to deplete to the level that matches the average in the environment; and (ii) when the time to travel to a new patch is longer, animals should stay in all patches longer, as the average reward rate against which the current patch will be compared will be lower. We found that rat behavior qualitatively conformed to these predictions, although rats stayed longer in patches overall than is predicted by MVT, a pattern that has been consistently observed in other animals (including humans and monkeys; Constantino & Daw, 2015; Hayden, Pearson, & Platt, 2011).

In addition to the predictions derived from MVT, we predicted that increasing LC tonic activity would cause disengagement from current behavior (i.e., exploiting a particular patch). Specifically, we predicted that disengagement from exploiting a patch would result in earlier patch leaving, and that rats’ decisions to stay in the patch versus leave the patch would be less correlated with reward rate (i.e., would reflect an increase in decision noise rather than a systematic bias to leave earlier). Our experimental results showed that DREADD stimulation of LC-NE neurons, which increases LC tonic firing (Vazey & Aston-Jones, 2014), caused rats to leave patches earlier and increased decision noise, dissociating rats’ decisions from the reward rate within the patch. Additionally, DREADD stimulation impaired performance in the foraging task, reducing task participation, increasing omission rates, and—after prolonged stimulation—inducing long bouts of immobility that resembled previously reported behavioral arrest (Carter et al., 2010). Aside from bouts of immobility observed following prolonged stimulation of LC-NE neurons, these findings are consistent with the hypothesis, derived from AGT, that increases in LC tonic firing rate favor task disengagement and the pursuit of alternative behaviors by global modulation of gain that increases processing noise.

Materials and method

Animals

Adult Long-Evans rats (Charles River, Kingston, NY; n = 32) were used. Rats were housed on a reverse 12 h/12 h light/dark cycle (lights off at 7 a.m.). All behavioral testing was conducted during the dark period. Throughout behavioral testing, rats were food restricted to maintain 85% to 90% of their ad-lib feeding weight and were given ad-lib access to water. All procedures were approved by the Princeton University Institutional Animal Care and Use Committee.

Operant training

Rats were initially trained to press a lever for a 10% sucrose water reward on an FR1 paradigm, in which each lever press was rewarded with 100 μL of sucrose water. Next, rats were trained to nose poke when reward was unavailable. In this stage, the lever stopped yielding reward after a randomly selected number of lever presses (between 4 and 12). To regain access to a rewarding lever, rats were required to nose poke at the back of the chamber, which caused the lever to retract and a rewarding lever on the opposite side of the chamber to extend. Next, a delay (first 5 s, then 10 s) that simulated the time to travel between patches was introduced between the nose poke and the extension of the opposite lever. Finally, rats were trained and tested on the foraging task (described below). To move on to the next stage of training, rats were required to reach a criterion of at least 100 rewarded lever presses in a 1-h session.

Foraging task

This task simulated foraging in a patchy environment, resembling the task used in Hayden et al. (2011). Rewards were organized in discrete “patches,” and reward within a patch depleted as the animal harvested it. Animals had to repeatedly decide whether to stay in the patch to continue harvesting the depleting reward source or leave the patch to travel to a new, full patch, incurring a cost of time and effort to travel to a new patch. The task consisted of a series of trials performed in a standard operant chamber (Med Associates, St. Albans, VT; see Fig. 1a). Each rat decided to harvest from a patch by pressing an activated lever on one side of the front of the chamber, or to travel to a new, full patch by nose poking at the back of the chamber and then returning to a newly activated lever on the other side of the front of the chamber. To cue the beginning of a trial, lights above the activated lever and the nose poke illuminated, indicating that the rat could make a decision to harvest a reward from the activated patch (lever press) or to travel to a new patch (nose poke). The task was self-paced, without any deadline to make a decision. If the rat pressed the lever to harvest from the activated patch, a cue light turned on in the reward magazine next to the lever, and liquid sucrose was delivered when the rat’s head entered the magazine. If the rat did not enter the magazine within 5 s, the magazine light turned off, the rat lost the opportunity to receive a reward on the trial, and the trial was scored as an omission. Once the rat entered the magazine or the 5 s period expired, there was a 7 s intertrial interval to control the rate at which the rat could accumulate reward. For each consecutive harvest trial, the reward volume was reduced by approx. 5–15 μL via an exponential decay function to simulate patch depletion. If the rat nose poked to travel to a new patch, the lever retracted for a delay period, simulating the time to travel to a new patch. 
After this delay, the opposite lever extended, yielding a replenished reward volume, and the rat could begin to harvest from that patch (see Fig. 1a). As each decision to stay or leave occurred within a defined trial, we examined the number of trials rats spent in patches as a function of patch type and travel time.

Fig. 1

a The foraging task. The rat chooses to lever press to harvest from the patch on trial n, then receives reward followed by an ITI of 7 s. The rat then chooses to harvest from the patch again on trial n + 1, receiving a smaller reward. On trial n + 2, the rat decides to nose poke to leave the patch, which causes the lever to retract and initiates a delay simulating the travel time, after which the rat can continue harvesting from a new, replenished patch on trial n + 3. b Reward volume received as a function of trials spent in a patch and the patch type. c Number of trials rats spent in each patch for each patch type and travel time. Error bars represent ±SEM, n = 8. Rats spent more trials in patches that started with greater rewards and spent more time in all patches with the longer travel time. Rats overharvested compared to optimal behavior predicted by the MVT (shown by green line). (Color figure online)

In an initial behavioral experiment, rats encountered three different patch types within each behavioral session; each patch type differed in the amount of reward it contained, starting with either 90, 120, or 150 μL, and followed the same depletion function (see Fig. 1b). On different behavioral sessions, rats were tested on either a 10 s or 40 s travel time. For each travel time, rats were trained until reaching a criterion of stable behavior, determined by a change of less than two lever presses/patch on average over the course of the session over 3 consecutive days. They were then tested for three sessions on consecutive days. In the DREADD stimulation experiment, rats were tested on a wide range of patch types within each behavioral session, which started with 30–150 μL of reward and depleted at the same rate, using a single travel time: 10 s. In this experiment, rats were trained until reaching a criterion of stable behavior, then tested for 3 to 5 sessions for each drug treatment, with at least 1 day with no drug treatment in between each testing day.

Foraging video analysis

A single behavioral session for each condition (saline, 0.3 and 1 mg/kg CNO) of hM3Dq-expressing animals was recorded and analyzed. The frequency and duration of periods of immobility (no detectable movement for at least 10 s) were scored.

Optimal behavior

According to MVT, it is optimal to stay in the patch when the reward rate within the current patch is greater than the reward rate across patches (i.e., the cumulative reward rate in a given session). In this task, reward rate in the current patch, \( {r}_i \), is given by:

$$ {r}_i=\frac{{reward}_i}{{time}_i}, $$
(1)

where \( {reward}_i \) is the volume of sucrose water delivered on trial i, and \( {time}_i \) is the time from the start of trial i (cue lights for lever and nose poke turn on) until the start of trial \( i+1 \). The cumulative reward rate, \( {R}_i \), is calculated as the sum of rewards divided by the total time since the start of the session:

$$ {R}_i=\frac{\sum_{t=1}^i {reward}_t}{\sum_{t=1}^i {time}_t}. $$
(2)

The predicted optimal number of trials to stay in each patch was determined using a simulation, with the decision rule to leave the patch if \( {R}_i\ge {r}_i \), otherwise to stay. Response times (RTs; which impacted reward rate calculations) were sampled from the empirical distribution of rats’ RTs.
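The decision rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation used in the study: the multiplicative depletion factor (`decay`), the fixed trial duration (`trial_time`), and the function name are hypothetical placeholders, and response times are held constant rather than sampled from the empirical RT distribution.

```python
from collections import defaultdict


def simulate_mvt_session(start_volumes, travel_time, decay=0.9,
                         trial_time=10.0, n_cycles=200):
    """Mean trials per patch type for an agent that leaves when R_i >= r_i.

    Assumed placeholders (not values from the study): `decay` is a
    multiplicative per-trial depletion factor and `trial_time` is a fixed
    trial duration in seconds (response time plus intertrial interval).
    """
    total_reward = 0.0
    total_time = 0.0
    trials_by_type = defaultdict(list)

    for cycle in range(n_cycles):
        for start in start_volumes:
            reward = float(start)
            trials = 0
            while True:
                # Harvest one trial and update the session totals.
                total_reward += reward
                total_time += trial_time
                trials += 1

                r_i = reward / trial_time        # Eq. 1: rate on this trial
                R_i = total_reward / total_time  # Eq. 2: cumulative rate
                if R_i >= r_i:                   # decision rule: leave the patch
                    break
                reward *= decay                  # the patch depletes

            total_time += travel_time            # travel to the next patch
            # Keep only the second half of the session, once R_i has settled.
            if cycle >= n_cycles // 2:
                trials_by_type[start].append(trials)

    return {s: sum(t) / len(t) for s, t in trials_by_type.items()}


short_travel = simulate_mvt_session([90, 120, 150], travel_time=10)
long_travel = simulate_mvt_session([90, 120, 150], travel_time=40)
```

Under these assumptions, the simulated agent stays for more trials in patches that start with more reward and stays longer in every patch type when the travel time is 40 s rather than 10 s, which mirrors the two MVT predictions.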

Model fitting

To determine how well reward rate predicted decisions to leave patches, we used logistic decision models to fit choices as a function of the difference between the cumulative reward rate across patches and current reward rate within the patch, which we refer to as the value of leaving the patch, \( {V}_{leave} \):

$$ R=\frac{\sum reward}{\sum time}, $$
(3)
$$ {V}_{leave}=R-{r}_i $$
(4)

Reward rate within the patch, \( {r}_i \), was calculated as described above. The cumulative reward rate, \( R \), was measured as the average reward rate across the entire session. As rats were trained for many days and maintained stable behavior, we assume they had an estimate of the average reward rate in the environment from the beginning of the session.
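As a minimal sketch of this calculation, the value of leaving can be computed per trial from the session's reward volumes and trial durations. The function name and the example numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def value_of_leaving(rewards, times):
    """Return V_leave = R - r_i for each trial of one session."""
    if len(rewards) != len(times):
        raise ValueError("rewards and times must have the same length")
    # R: average reward rate across the entire session.
    session_rate = sum(rewards) / sum(times)
    # r_i: reward rate on each trial; V_leave is the difference R - r_i.
    return [session_rate - rew / t for rew, t in zip(rewards, times)]


# Hypothetical trials: reward volumes (uL) and trial durations (s).
v_leave = value_of_leaving([90.0, 80.0, 70.0], [10.0, 10.0, 10.0])
```

In this toy session the average rate is 8 μL/s, so the value of leaving is negative while the patch still pays more than average and becomes positive once it pays less.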

Logistic decision curves were fit as generalized linear mixed-effects models, with a fixed effect of group (mCherry vs. hM3Dq), random effects for drug treatment, \( {V}_{leave} \), and interactions among all variables as fixed and random effects. From this model, we are interested in two parameters: the indifference point and the slope. The indifference point, the point at which rats were equally likely to stay in versus leave the patch, represents the bias toward staying in versus leaving the patch. If the indifference point occurs at a positive value of leaving (i.e., reward rate within the patch is lower than the cumulative reward rate), it indicates overharvesting (i.e., staying too long in patches). The indifference point is measured from the intercept of the model, and change in the intercept due to group and/or drug treatment is given by the fixed effects of group, treatment, and their interaction. The slope provides a measure of how well rats’ decisions correlate with the value of leaving the patch and serves as an index of decision noise; a smaller slope (shallower curve) indicates that the value of leaving the patch is less predictive of rats’ decisions, and thus presumably more subject to noise. This parameter is given by the effect of \( {V}_{leave} \), and changes in the slope due to group and/or drug treatment are given by the interactions among \( {V}_{leave} \), group, and drug treatment.
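The roles of the two parameters can be illustrated with a single-predictor logistic curve. This sketch is a simplification: the mixed-effects models described here include group, treatment, and random-effect terms that are omitted, and the coefficient values below are hypothetical.

```python
import math


def p_leave(v_leave, intercept, slope):
    """Probability of leaving under a single-predictor logistic curve."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * v_leave)))


def indifference_point(intercept, slope):
    """Value of leaving at which staying and leaving are equally likely."""
    if slope == 0:
        raise ValueError("indifference point is undefined for a flat curve")
    return -intercept / slope


# Hypothetical coefficients: both curves cross 50% at V_leave = 0.5,
# but the second has a shallower slope (more decision noise).
steep = {"intercept": -1.0, "slope": 2.0}
shallow = {"intercept": -0.25, "slope": 0.5}

# Change in P(leave) across the same range of V_leave for each curve.
steep_change = p_leave(1.5, **steep) - p_leave(-0.5, **steep)
shallow_change = p_leave(1.5, **shallow) - p_leave(-0.5, **shallow)
```

With the same indifference point, the shallower curve changes much less across the same range of values, which is the sense in which a smaller slope indexes decisions that track the value of leaving less closely.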

Two potential concerns with this analysis are that (i) parameters may change over the course of the session and (ii) the model assumes all foraging decisions are independent and identically distributed, whereas there may have been dependency on decisions within the same patch. To address these concerns, we fit an additional GLM, adding a fixed and random effect of time into the session and random intercept for each patch visited. The slope and indifference point for each group and drug treatment were estimated as described above.

Virus injection surgery

Anesthesia was induced and maintained using isoflurane (5% and 2%, respectively). Rats were aligned in a stereotaxic frame such that bregma was 2 mm below lambda (~15° head angle). Rats received bilateral injections of an AAV-PRSx8-hM3Dq-HA or AAV-PRSx8-mCherry virus into LC at the following coordinates: AP: -4.0 mm and ML: ±1.35 mm from lambda, DV: 6–6.5 mm from skull surface. Total injection volume was 1.2 μL per hemisphere, at a rate of 100 nL/min (Vazey & Aston-Jones, 2014). After surgery, rats received meloxicam (0.1 mg/kg i.p.) for postoperative analgesia daily for 3 days and were given 7 days to recover before further testing.

CNO administration

DREADDs were activated by the selective ligand clozapine-N-oxide (CNO; NIMH Chemical Synthesis and Drug Supply Program), which was dissolved in 2% DMSO and diluted in sterile saline to a final injection volume of 1 mL/kg. CNO (0.3 mg/kg or 1 mg/kg, i.p.) or saline was administered immediately prior to the start of behavioral testing, and CNO (1 mg/kg, i.p.) was administered 2 hours before perfusion for Fos immunohistochemistry.

Histology

After a final CNO injection, animals were transcardially perfused with PBS, then 4% paraformaldehyde. Brains were extracted, postfixed overnight in 4% paraformaldehyde, and cryoprotected in PBS with 30% sucrose for 72 hr. Brains were then embedded in OCT compound and frozen using isopentane, then stored at -80 °C. Coronal sections (35 μm) throughout LC were cut on a cryostat. To confirm expression of hM3Dq and mCherry, a 1-in-4 series of 35-μm-thick sections was labeled with antibodies against tyrosine hydroxylase (TH), as a marker of LC-NE neurons, and the HA DREADD-tag or mCherry. Sections were first blocked with 5% normal donkey serum (NDS) in PBS with .25% Triton X-100 for 1 hr, then incubated overnight in primary antibodies: mouse anti-TH (1:1000, Immunostar, cat no.: 22941), rabbit anti-HA (1:1000, Cell Signaling Technologies, cat no.: 3724S), or rabbit anti-DsRed (1:1000, Clontech, cat no.: 632496). Sections were then washed 3 × 5 min in PBS and incubated for 2 hr in secondary antibodies (conjugated to Alexa Fluor dyes): donkey anti-mouse 488 (1:250, Invitrogen, cat no.: A21202) and donkey anti-rabbit 568 (1:250, Invitrogen, cat no.: A10042). Finally, sections were washed again for 3 × 5 min in PBS.

To examine expression of the immediate early gene product Fos in LC-NE neurons following CNO administration, an additional series of sections was labeled with antibodies against TH and Fos. Sections were first blocked in 10% NDS in PBS with .25% Triton X-100, then incubated overnight with primary antibodies: mouse anti-TH (1:500) and goat anti-Fos (1:100, Santa Cruz, cat no.: SC-52-G). Sections were then washed for 3 × 5 min in PBS and incubated for 2 hr in secondary antibodies: donkey anti-mouse biotin (1:100, JAX, cat no.: 715-065-150) and rabbit anti-goat 488 (1:100, Invitrogen, cat no.: A11078). Sections were washed again for 3 × 5 min in PBS and then incubated for 2 hr in streptavidin 350 (1:100, Invitrogen, cat no.: S11249) and donkey anti-rabbit 488 (1:100, Invitrogen, cat no.: A21206).

Histological analysis

To measure the density of hM3Dq-HA and mCherry staining, two images of TH and hM3Dq-HA or mCherry staining, ~mid-LC along the anterior-posterior axis, were acquired using a Zeiss LSM 700 confocal microscope and analyzed using Fiji (Schindelin et al., 2012). In each image, TH and transgene staining were binarized using the same threshold for each image, and we measured the percentage of pixels positively labeled with transgene within the pixels positively labeled with TH. The density of Fos+ cells in the LC was determined using the optical fractionator method in StereoInvestigator (MBF Bioscience, Williston, VT) with an Olympus BX-60 microscope. In three sections for each animal, a contour was drawn around the boundary of LC in each hemisphere. Fos+ cells were counted using a 40X objective. The size of the counting frame (70 × 70 μm) and grid (110 × 110 μm) were set to count ~40% of the volume of each contour. Due to the small size of the LC, differences in the volume of LC in counted sections had a great influence on the stereological estimate of Fos+ cells. Therefore, we normalized the number of Fos+ cells in counted sections to the volume of counted sections (density = Fos+ count/volume).
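The two measures described above can be sketched as follows. This is an illustration only: the actual measurements were made in Fiji and StereoInvestigator, the function names and pixel values are hypothetical, and separate thresholds for the two channels are an assumption.

```python
def percent_th_pixels_with_transgene(th_img, transgene_img,
                                     th_thresh, transgene_thresh):
    """Percentage of TH-positive pixels that are also transgene-positive.

    Images are 2-D sequences (rows of pixel intensities) of equal shape.
    """
    th_positive = 0
    both_positive = 0
    for th_row, tg_row in zip(th_img, transgene_img):
        for th_px, tg_px in zip(th_row, tg_row):
            if th_px > th_thresh:                # binarize the TH channel
                th_positive += 1
                if tg_px > transgene_thresh:     # binarize the transgene channel
                    both_positive += 1
    if th_positive == 0:
        raise ValueError("no TH-positive pixels at this threshold")
    return 100.0 * both_positive / th_positive


def fos_density(fos_count, counted_volume):
    """Fos+ cell count normalized to the volume of the counted sections."""
    if counted_volume <= 0:
        raise ValueError("counted_volume must be positive")
    return fos_count / counted_volume


# Hypothetical 2 x 2 images with 8-bit intensities.
th = [[10, 200], [220, 250]]
transgene = [[0, 180], [20, 240]]
overlap = percent_th_pixels_with_transgene(th, transgene, 128, 128)
```

In the toy images, three pixels are TH-positive and two of those are also transgene-positive, so roughly two thirds of the TH-labeled area carries transgene signal.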

Data analysis

All analyses were performed in R (R Core Team, 2015). To examine changes in the number of trials spent in patches, a two-way ANOVA, with patch type and travel time as random effects, was used for the initial behavioral experiment. A t test was used to compare rat behavior with optimal behavior. To examine differences in Fos expression due to LC stimulation, a t test on the density of cells expressing Fos between mCherry and hM3Dq animals was used. Additionally, we regressed the density of Fos+ cells on the level of mCherry or hM3Dq expression to examine whether more DREADD expression leads to greater activation of LC. For the DREADD stimulation foraging experiment, a mixed-effects model, with fixed effects for animal group (mCherry vs. hM3Dq), patch type, and drug treatment, and random effects for patch type and drug treatment, was used to examine the effects of LC stimulation on foraging behavior. To examine changes in logistic decision model parameters, we used two-way ANOVAs, with animal group as a fixed effect and drug treatment as a random effect. Mixed-effects models were conducted using the lme4 package (Bates et al., 2015). To examine simple and interaction effects, we tested contrasts using the phia (De Rosario-Martinez, 2015) and lsmeans packages in R, with Bonferroni correction for multiple comparisons in all post hoc contrasts.
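For readers unfamiliar with the Bonferroni correction applied to the post hoc contrasts, a minimal Python sketch follows. The analyses themselves were run in R; the function name and the raw p values here are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capping at 1."""
    n_comparisons = len(p_values)
    return [min(1.0, p * n_comparisons) for p in p_values]


# Hypothetical raw p values for three pairwise contrasts.
adjusted = bonferroni([0.01, 0.2, 0.5])
```

Because adjusted values are capped at 1, a corrected contrast can be reported as p = 1 even though its uncorrected p value was smaller.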

Results

Rat foraging behavior is qualitatively consistent with the marginal value theorem

In an initial behavioral experiment, rats (n = 8) experienced three different patch types within a single session, starting with either 90, 120, or 150 μL of reward and then depleting at a uniform rate. Across sessions, they were tested with two different travel time delays, 10 s and 40 s. According to MVT, rats should stay longer in patches that start with greater rewards and should stay longer in all patches when the travel time imposed between patches is longer. Behavior qualitatively followed these predictions: Rats stayed for more trials in patches that started with 120 versus 90 μL (p = .001), and with 150 versus 120 μL (p < .001); main effect of patch type on trials in patch: F(2, 14) = 93.10, p < .001. Additionally, rats stayed longer in all patch types in the 40-s versus the 10-s travel time, F(1, 7) = 78.17, p < .001. However, rats overharvested, staying in all patch types for longer than is quantitatively predicted by the MVT (simulation vs. behavior), t(7) = 11.341, p < .001 (see Fig. 1c). This overharvesting is consistent with foraging behavior observed in a variety of other species, including humans and monkeys (Calhoun & Hayden, 2015; Constantino & Daw, 2015; Hayden et al., 2011; Pyke, 1978, 1984; Stephens & Krebs, 1986).

DREADD stimulation of LC-NE neurons increases Fos expression

To stimulate LC tonic firing, we expressed the Gq-coupled DREADD hM3Dq or the fluorescent protein mCherry to control for potential effects of virus expression, selectively in LC-NE neurons. Expression was restricted to LC-NE neurons by including the synthetic dopamine-β-hydroxylase (DBH) promoter, PRSx8 (Hwang et al., 2001, 2005), in an AAV that encoded either hM3Dq (hM3Dq-HA; n = 15) or mCherry (control, n = 9). Prior studies using these vectors found that 97% of cells expressing hM3Dq-HA or mCherry colocalize with TH following injection of these AAVs into LC (Vazey & Aston-Jones, 2014). Figure 2 shows expression of hM3Dq-HA and mCherry in LC. Because hM3Dq-HA is heavily trafficked to the plasma membrane and dendrites (Vazey & Aston-Jones, 2014), counting the number of LC-NE cell bodies expressing hM3Dq-HA would not provide the most accurate measure of expression levels. Instead, we examined hM3Dq and mCherry expression by quantifying the density of hM3Dq-HA and mCherry immunofluorescence (see Figs. 2a–b), and confirmed activation of LC-NE neurons by measuring Fos expression 2 h after CNO injection (1 mg/kg). Results showed that 13/15 hM3Dq-HA rats had bilateral DREADD expression in LC and 2/15 unilateral; 9/9 mCherry rats had bilateral expression. We counted the density of Fos+ cells in LC in a subset of these rats (10 hM3Dq-HA rats—eight bilateral/two unilateral—and six mCherry rats). All 10 hM3Dq-HA expressing animals exhibited greater Fos expression in LC than the six mCherry animals, t(14) = 4.83, p < .001, confirming our ability to stimulate LC-NE neurons. Additionally, Fos expression increased with greater levels of hM3Dq-HA staining, but not with greater levels of mCherry staining (hM3Dq: β = 6734, p = .004; mCherry: β = -16.15, p = .932; see Fig. 2c).

Fig. 2

a Images of TH (top), hM3Dq-HA and Fos (middle), and merged images, showing expression of hM3Dq-HA and Fos in LC. b Images showing expression of mCherry and Fos in LC. c The density of Fos+ cells in the LC as a function of the level of expression of hM3Dq-HA or mCherry. Because hM3Dq-HA and mCherry exhibited different expression patterns, the density of expression was normalized (z scored) for comparison. The density of Fos+ cells in LC increased with more expression of hM3Dq-HA, but not with mCherry. (Color figure online)

DREADD stimulation impairs performance on the foraging task

The hM3Dq-HA and mCherry rats were tested in the foraging task, with nine different patch types, which started with a range of 30–150 μL of reward and a 10 s travel time. As the reward rate within the patch typically starts above the rats’ threshold for leaving patches (the cumulative reward rate per MVT) and depletes to this threshold, rats experience few trials in which the reward rate within the patch is below their threshold for leaving patches. By testing rats on a wider variety of patch types, including patches that start with very little reward (30–45 μL), rats encountered some patches that started with reward rate below their typical threshold for leaving patches. In this version of the task, rats stayed significantly longer in patches that started with larger rewards, F(8, 184) = 447.6, p < .001.

Rats were injected with either saline or CNO (0.3 or 1 mg/kg) and tested in the foraging task immediately following the injection. We first assessed general performance, including the number of trials completed per 1-h session, response times, and the rate of omissions (defined as trials in which the lever was pressed to harvest from the patch without then entering the reward magazine to obtain the reward). Behavioral data were binned into 10-min intervals to look at the time course of these behaviors within sessions. Following saline injections, hM3Dq-HA rats participated in an average of 361 ± 31 trials, and mCherry rats in 358 ± 31 trials. The number of completed trials declined in the last 20 min of the session for both hM3Dq-HA and mCherry rats (mean trials in first 10 min = 61.93, SD = 4.32; mean trials in last 10 min = 52.33, SD = 9.78; main effect of time on trials: t(396) = 3.497, p < .001). RTs increased slightly in the last 20 min, but the main effect of time on RT was not significant (mean RT in first 10 min = 2.06 s, SD = .70 s; mean RT in last 10 min = 3.34 s, SD = 1.79 s; main effect of time on RT: t(394) = .386). Omission rates remained constant throughout the session, at an average of 0.1% omissions for both hM3Dq-HA and mCherry rats (main effect of time on omissions: t(396) = .01, p = .992).
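The 10-min binning used for these time-course measures can be sketched as below. The function name and the assumption that trials are indexed by their start time in seconds from session onset are illustrative, not details taken from the analysis code.

```python
def trials_per_bin(trial_start_times_s, bin_width_s=600, session_length_s=3600):
    """Count trials in consecutive bins (default: six 10-min bins in 1 h)."""
    n_bins = session_length_s // bin_width_s
    counts = [0] * n_bins
    for t in trial_start_times_s:
        # Ignore timestamps outside the session window.
        if 0 <= t < session_length_s:
            counts[int(t // bin_width_s)] += 1
    return counts


# Hypothetical trial start times in seconds from session onset.
counts = trials_per_bin([0, 599, 600, 3599, 3600])
```

The same binning can be applied to response times or omissions by averaging within each bin instead of counting.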

Consistent with prior studies in which high tonic LC activity correlated with poor performance (i.e., more errors and slower RTs; Aston-Jones et al., 1994), performance of hM3Dq-HA rats sharply declined within 30 min following CNO injection. The hM3Dq-HA, but not mCherry, rats exhibited a significant reduction in trials (effect of treatment within group: hM3Dq–p < .001; mCherry–p = .821; effect of treatment between groups: p < .001; see Fig. 3a), a significant increase in RT (effect of treatment within group: hM3Dq–p < .001, mCherry–p = 1; effect of treatment between groups: p < .001; see Fig. 3b), and increased omission rates (effect of treatment within group: hM3Dq–p < .001, mCherry–p = 1; effect of treatment between groups: p < .001; see Fig. 3c) after CNO.

Fig. 3

CNO administration impaired performance in the foraging task in hM3Dq-HA, but not mCherry, animals. The impairment consisted of a reduction in the number of trials that rats participated in (a), an increase in response time (b), and an increase in the rate of omissions, that is, trials in which rats pressed the lever but did not retrieve reward (c). Testing began immediately following drug treatment, so time along the x-axis represents both time within the session and time following drug administration. Values represented as mean ± SEM; n = 15 hM3Dq, 9 mCherry. (Color figure online)

Additionally, in some animals, prolonged stimulation of LC-NE neurons induced long bouts of complete disengagement and immobility (lasting 10 s to 5 min) that resembled behavioral arrest (Carter et al., 2010). Bouts of immobility were first observed ~20 min following CNO administration and occurred more frequently and for a longer duration as time since CNO administration elapsed. We analyzed videos of hM3Dq-HA rats performing the task following saline, 0.3, and 1 mg/kg CNO injections to examine the times at which rats were immobile. The hM3Dq-HA rats exhibited bouts of immobility on a higher percentage of trials following both 0.3 and 1 mg/kg CNO than following saline injections (saline: M = 0.06%, SD = 0.12% of trials; CNO 0.3 mg/kg: M = 2.53%, SD = 3.11%; CNO 1 mg/kg: M = 3.06%, SD = 3.88%; saline vs. 0.3 mg/kg: p < .001; saline vs. 1 mg/kg: p = .031; 0.3 mg/kg vs. 1 mg/kg: p = .679). We next examined the total duration of bouts of immobility in the first 20 min, middle 20 min, and final 20 min of behavioral sessions. Following CNO injections, rats spent more time immobile in the middle and final 20 min of the session than in the first 20 min (effect of time on immobility following saline: p = 1.00; 0.3 mg/kg CNO: p = .016; 1 mg/kg CNO: p < .001).

DREADD stimulation causes earlier patch leaving

Because task participation waned, and omission rates and bouts of immobility increased over the course of the session, we restricted analyses of rats’ patch-leaving behavior to the first 30 min of testing. As rats’ general behavior changed during this period, we included time into the session as a predictor in the mixed-effects model used to examine the number of trials spent in each patch across animal group (mCherry vs. hM3Dq) and drug treatment. Overall, rats spent more trials in patches that contained more reward, main effect of patch type: t(31) = 25.598, p < .001. As predicted by AGT, hM3Dq-HA rats left patches earlier following both .3 mg/kg and 1 mg/kg CNO when compared to saline administration (saline vs. .3 mg/kg: p = .003; saline vs. 1 mg/kg: p < .001; .3 mg/kg vs. 1 mg/kg: p = .460; see Fig. 4a). Neither dose of CNO affected patch-leaving behavior in mCherry rats (saline vs. .3 mg/kg: p = .416; saline vs. 1 mg/kg: p = .592; .3 mg/kg vs. 1 mg/kg: p = .977). Additionally, we directly tested the change in behavior due to drug administration between groups, finding that the change in behavior from saline to 1 mg/kg CNO was greater in hM3Dq rats than in mCherry rats (mCherry vs. hM3Dq comparisons: saline vs. 1 mg/kg: p = .045; saline vs. .3 mg/kg: p = .197; .3 mg/kg vs. 1 mg/kg: p = .369).

Fig. 4

DREADD stimulation of LC increased patch leaving in the foraging task. a hM3Dq-HA rats stayed in patches for fewer trials following CNO administration than following saline administration. CNO administration had no effect on behavior in mCherry animals. b Likelihood of leaving patches as a function of the value of leaving the patch. Points and error bars are the average likelihood of leaving the patch within each value of leaving bin ± standard error. Lines are the average GLMM predicted probability of leaving the patch for each trial within each value of leaving bin. There is an increased likelihood of leaving the patch due to CNO administration in hM3Dq rats but not mCherry rats. CNO administration increased decision noise, indicated by a shallower slope in hM3Dq rats but not mCherry rats. (Color figure online)

DREADD stimulation increases decision noise

To examine trial-by-trial choices, we computed the psychometric function for decisions to stay versus leave the patch suggested by MVT: that is, the probability of leaving as a function of the difference between the cumulative reward rate across patches and current reward rate within the patch, which we refer to as the value of leaving the patch. Logistic decision curves were fit to the data as generalized linear mixed-effects models (GLMMs), with a fixed effect for group (hM3Dq vs. mCherry) and random effects for drug treatment and the value of leaving the patch. From this model, we assessed (i) whether rats on average leave patches earlier due to drug treatment and (ii) how closely rats’ decisions correlate with the value of leaving patches, which is given by the slope of the curve. A decrease in the slope of the curve (i.e., shallower curve) indicates that rats’ decisions are less closely correlated with the value of leaving the patch, and thus are more noisy. AGT predicts that stimulating LC tonic firing should favor disengagement from the task (and thereby earlier leaving) by increasing decision noise, as indexed by a decrease in slope (see Fig. 4).

To examine whether rats on average left patches earlier, we tested pairwise contrasts for each drug treatment (saline, .3 mg/kg CNO, and 1 mg/kg CNO) within group as well as the change between treatments across groups. The hM3Dq rats were more likely to leave patches on average due to both doses of CNO (saline vs. .3 mg/kg: p < .001; saline vs. 1 mg/kg: p < .001; .3 mg/kg vs. 1 mg/kg: p = .046). CNO had no effect on mCherry rats (saline vs. .3 mg/kg: p = .972; saline vs. 1 mg/kg: p = .801; .3 mg/kg vs. 1 mg/kg: p = .891), and the effects of CNO were significant between hM3Dq and mCherry rats (saline vs. .3 mg/kg: p = .002; saline vs. 1 mg/kg: p < .001; .3 mg/kg vs. 1 mg/kg: p = .080). To examine changes in slope, we tested the same contrasts as described above, but focusing on their interaction with the value of leaving the patch. Both doses of CNO significantly decreased the slope (i.e., increased decision noise) in hM3Dq rats (saline vs. .3 mg/kg: p = .021; saline vs. 1 mg/kg: p < .001; .3 mg/kg vs. 1 mg/kg: p = .327), but not mCherry rats (saline vs. .3 mg/kg: p = .530; saline vs. 1 mg/kg: p = .112; .3 mg/kg vs. 1 mg/kg: p = .611). The interaction between drug treatment and slope of the curve was significant between hM3Dq and mCherry rats (saline vs. .3 mg/kg: p = .013; saline vs. 1 mg/kg: p < .001; .3 mg/kg vs. 1 mg/kg: p = .107).

There were a few potential concerns with the simple form of the model described above (GLM1): (i) The model predicts that the likelihood of leaving the patch should continue to increase with the value of leaving, whereas the likelihood that rats left patches plateaued around 50%. Since rats often left the patch before the reward rate within the patch dropped substantially below the average reward rate, there are far fewer observations in this range. However, rats’ reduced likelihood of leaving relative to model predictions could still skew curve fits. (ii) As rats’ behavior (e.g., number of trials, response times, omission rates) changes over the course of the session, the likelihood of leaving and its correlation with the value of leaving may also change, which is not accommodated by the model. (iii) A related concern is that the logistic model assumes that all foraging decisions are independent and identically distributed, but there may be violations of this assumption, such as variation in the patch-leaving threshold between patches or an increasing bias toward staying in the patch with more trials spent in the patch (i.e., an increasing perseveration bias). To ensure that these concerns were not driving the effects reported above, we controlled for them by (i) fitting the same model, but omitting observations beyond the indifference point (GLM2), and (ii & iii) adding a random effect of time and a random intercept for each patch, allowing this threshold to vary across patches (GLM3). Our results were robust to controlling for all three concerns. In GLM2, when omitting observations beyond the indifference point, hM3Dq rats were still more likely to leave patches on average and exhibited greater decision noise due to CNO administration, with no changes observed in mCherry rats (effect of treatment between hM3Dq and mCherry, saline vs. .3 mg/kg: p < .001; saline vs. 1 mg/kg: p < .001; .3 mg/kg vs. 1 mg/kg: p = .115; treatment × slope between hM3Dq and mCherry, saline vs. .3 mg/kg: p < .001; saline vs. 1 mg/kg: p < .001; .3 mg/kg vs. 1 mg/kg: p = .359). The same was true for GLM3, which included all data and all parameters in GLM1 plus a random effect of time and a random intercept for each patch (effect of treatment between hM3Dq and mCherry, saline vs. .3 mg/kg: p < .003; saline vs. 1 mg/kg: p < .001; .3 mg/kg vs. 1 mg/kg: p = .043; treatment × slope between hM3Dq and mCherry, saline vs. .3 mg/kg: p < .032; saline vs. 1 mg/kg: p = .001; .3 mg/kg vs. 1 mg/kg: p = .140). As GLM2 included different data than GLM1, we could not directly compare the likelihood of each model to assess model fit. However, we used AIC and BIC to examine whether the additional parameters in GLM3 improved model fit compared to GLM1. It was unclear whether adding effects of time and a random intercept for each patch (GLM3) improved model fit, as AIC was lower (AICGLM1 = 31004.95, AICGLM3 = 30632.37), but BIC was greater (BICGLM1 = 31291.83, BICGLM3 = 31527.77). At the least, these analyses suggest that the factors listed above do not provide a compelling alternative to the interpretation that CNO increased decision noise.
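The disagreement between AIC and BIC follows from how the two criteria penalize parameters. The short sketch below uses the standard textbook definitions and the AIC and BIC values reported above; the helper names are ours, and the log-likelihoods, parameter counts, and number of observations are not reported here, so the helper functions are shown for reference only.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Values as reported in the text (lower is better for both criteria).
reported = {
    "GLM1": {"AIC": 31004.95, "BIC": 31291.83},
    "GLM3": {"AIC": 30632.37, "BIC": 31527.77},
}

delta_aic = reported["GLM3"]["AIC"] - reported["GLM1"]["AIC"]
delta_bic = reported["GLM3"]["BIC"] - reported["GLM1"]["BIC"]

print(f"AIC(GLM3) - AIC(GLM1) = {delta_aic:.2f}")  # negative: AIC favors GLM3
print(f"BIC(GLM3) - BIC(GLM1) = {delta_bic:.2f}")  # positive: BIC favors GLM1
```

Because the BIC penalty per parameter, ln(n), exceeds the AIC penalty of 2 whenever n is larger than about 7, a model with many additional parameters can improve AIC while worsening BIC, which is the pattern seen here.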

Discussion

We used a patch-foraging task to examine the influence of elevated LC tonic activity on decisions to exploit a depleting reward source within a patch versus search for a new patch. First, we found that rat foraging behavior qualitatively conforms to the predictions of MVT: rats stayed longer in patches that started with larger rewards, and stayed longer in all patches when the travel time delay between patches was longer. Additionally, we confirmed two predictions of AGT (Aston-Jones & Cohen, 2005): (i) tonic LC stimulation impaired performance within the task, including reduced participation and increased omission rates, and (ii) tonic LC stimulation caused disengagement from patch exploitation, exhibited by earlier patch leaving associated with increased decision noise.

Although rat foraging behavior qualitatively conforms to predictions of MVT, rats stayed in patches longer than is optimal as predicted by MVT, as has been found for other species (Constantino & Daw, 2015; Hayden et al., 2011; Pyke, 1984; Stephens & Krebs, 1986). The mechanism for overharvesting is unknown, but possible explanations include diminishing marginal utility, such that the rats’ perceived value of reward increased sublinearly with the actual value of the reward (e.g., rats may value 100 μL as less than twice 50 μL); discounting of future rewards obtained after the travel time delay; or uncertainty in the estimation of reward rate, a source of risk that may cause rats to stay in patches longer to ensure that they do not leave significant reward in the patch (Brunner, Kacelnik, & Gibbon, 1992; Constantino & Daw, 2015; Hayden et al., 2011; Kacelnik & Todd, 1992).
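To make the diminishing-marginal-utility account concrete, the toy sketch below finds the rate-maximizing number of harvests per patch, in the spirit of MVT, when reward is valued linearly versus sublinearly. Every parameter (starting reward, depletion factor, harvest and travel times, and the power-law exponent `alpha`) is hypothetical and is not taken from the task; the power-law utility is only one possible form of sublinear valuation.

```python
def best_number_of_trials(alpha, r0=100.0, decay=0.8, harvest_time=1.0,
                          travel_time=10.0, max_trials=200):
    """Return the number of harvests per patch that maximizes the long-run
    rate of subjective utility, where utility = reward ** alpha.

    alpha = 1 values reward linearly; alpha < 1 gives diminishing
    marginal utility. All defaults are hypothetical illustration values.
    """
    best_n, best_rate = 0, float("-inf")
    total_utility = 0.0
    reward = r0
    for n in range(1, max_trials + 1):
        total_utility += reward ** alpha   # utility of the nth harvest
        reward *= decay                    # patch depletes after each harvest
        rate = total_utility / (n * harvest_time + travel_time)
        if rate > best_rate:
            best_n, best_rate = n, rate
    return best_n


linear_n = best_number_of_trials(alpha=1.0)    # linear valuation of reward
concave_n = best_number_of_trials(alpha=0.5)   # sublinear valuation of reward

print("Harvests per patch, linear utility: ", linear_n)
print("Harvests per patch, concave utility:", concave_n)
```

Under these assumptions, subjective value depletes more slowly than the reward itself, so an agent maximizing its subjective rate stays longer than one maximizing the objective rate, which would look like overharvesting to an observer.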

Our findings that LC tonic stimulation impaired performance within the task and caused disengagement from patch exploitation are consistent with predictions of AGT (Aston-Jones & Cohen, 2005). To date, evidence in support of AGT has come from physiological recordings from LC neurons in nonhuman species (Aston-Jones et al., 1997; Aston-Jones et al., 1994; Clayton et al., 2004), and from studies in humans using pupillometry as an indirect measure of LC activity (Aston-Jones & Cohen, 2005; Joshi et al., 2016; Varazzani et al., 2015). The latter have tested predictions of AGT in a variety of behavioral tasks designed to probe exploratory behavior and have shown that baseline pupil diameter (a proxy for LC tonic firing) tracks utility in a foraging-like task (Gilzenrat et al., 2010), predicts decisions to explore an unknown reward versus exploit known rewards (Jepma & Nieuwenhuis, 2011), and tracks neural gain (Eldar, Cohen, & Niv, 2013; Warren et al., 2016), the mechanism by which NE release is proposed to impact processing noise. All of these findings have been correlational. Our findings are not only among the first to directly examine the relationship between LC function and decision making but, perhaps most importantly, establish a causal role for LC-NE function in such behavior. Our findings indicate that increasing LC tonic firing increases the likelihood of changing behavior, and does so by increasing decision noise, as predicted by AGT. These findings are consistent with the time course for effects of DREADD stimulation on LC-NE activity and neural activity in other regions (starting ~10 min following i.p. injection; Smith, Bucci, Luikart, & Mahler, 2016; Vazey & Aston-Jones, 2014). Additionally, our findings corroborate previous work showing that manipulation of LC-NE activity specifically within the ACC regulates behavioral variability (Tervo et al., 2014).

A few notes of caution are warranted in interpreting the effects of chronic DREADD stimulation, as well as the task impairments observed. Although our DREADD manipulation was selective to LC-NE neurons, behavioral and neural effects of chronically stimulating LC tonic activity via DREADDs are not well characterized. In anesthetized animals, DREADD stimulation increases the firing rate of LC-NE neurons within the physiological range, to ~5 Hz (Vazey & Aston-Jones, 2014), but its impact on firing rates or NE release in awake behaving animals is unknown. Behavior similar to the immobility we observed has been reported using optogenetic stimulation in mice, for which the intensity of the stimulation is known. Photostimulating LC tonic firing for 5 or more minutes increased measures of anxiety (McCall et al., 2015), and high-frequency photostimulation (5–20 Hz) induced behavioral arrest (Carter et al., 2010), a sustained period of immobility. One possible explanation for the immobility we observed is depletion of NE caused by long-term, high-frequency stimulation. Carter et al. (2010) found that 10 minutes of 10 Hz LC stimulation reduced NE levels in prefrontal cortex, and that treatment with NE reuptake inhibitors increased latency to behavioral arrest and reduced the duration of behavioral arrest due to high-frequency stimulation. Additionally, the LC-NE system has broad efferent projections, and DREADD stimulation affects NE release not only in regions involved in foraging decisions, but throughout the nervous system (Aston-Jones & Cohen, 2005). It is possible that increased NE release into sensorimotor systems caused impairment in task performance and immobility. However, stimulation of LC-NE neurons using the same DREADD approach found no change of locomotor activity in a novel environment (Vazey & Aston-Jones, 2014). To avoid potential confounds, we restricted our analyses to the period prior to the emergence of bouts of inactivity, to focus on task-related consequences of elevated LC activity. The cause of these bouts, their relationship to normal LC function, and any prodromal effects they may have had on the behaviors we observed all remain important questions for future research.

We also addressed several potential concerns that may have influenced the modeling of decisions in the foraging task: (i) that the likelihood of leaving the patch plateaued at around 50%, despite an increasing value of leaving the patch; (ii) that factors influencing the rats’ behavior may have changed over the course of the session; and (iii) that the assumption that all decisions are independent and identically distributed may have been violated. Controlling for these factors did not diminish the significance of the predicted results, which increased confidence in our findings. However, future studies using either a modified foraging task or a different type of task that permits more balanced sampling around the indifference point will be important to confirm that LC-NE tonic activity regulates decision noise.

Despite these limitations, our findings represent an important step in examining the behavioral function of LC tonic firing, providing causal evidence that increased LC tonic activity causes disengagement from current behavioral goals, and increases the likelihood of leaving a patch in a foraging task. Future studies will be necessary to test additional predictions of AGT, as well as predictions of related theories of LC-NE function (Bouret & Sara, 2005; Sara & Bouret, 2012; Yu & Dayan, 2005), including whether LC tonic activity promotes exploration of options with unknown values. The involvement of the LC-NE system in a variety of fundamental processes—including arousal, attention, memory, and decision execution—and its dysfunction in a variety of psychiatric disorders—including anxiety, depression, and cognitive disorders—indicate the importance of studying the function of the LC-NE system. Our findings suggest an important role for LC in attentional deficits, such as those seen in attention deficit hyperactivity disorder. Future studies will provide an important foundation for characterizing the neural mechanisms underlying many forms of adaptive and maladaptive behavior.