Comparison of the conditioned reinforcing properties of a safety signal and appetitive stimulus: effects of d-amphetamine and anxiolytics

Safety signals providing relief are hypothesised to possess conditioned reinforcing properties, supporting the acquisition of a new response (AnR) as seen with appetitive stimuli. Such responding should also be sensitive to the rate-increasing effects of d-amphetamine and to the anxiolytics 8-OH-DPAT and diazepam. This study tests whether safety signals have conditioned reinforcing properties similar to those of stimuli predicting reward. Rats received Pavlovian conditioning with either appetitive stimuli (CS+) or safety signals (conditioned inhibitors, CIs) plus truly random control (TRC) stimuli. The appetitive group received a CS+ paired with a sucrose pellet and the safety signal group, a stimulus paired with shock omission. Stimuli were tested using an AnR procedure and following systemic d-amphetamine, the 5-HT1A agonist 8-OH-DPAT and the benzodiazepine diazepam in a counterbalanced design. Effective conditioning selectively reduced contextual freezing during CI presentation in the safety signal group and increased food magazine responses (with respect to context and TRC) during CS+ presentation in the appetitive group. The appetitive stimulus strongly supported AnR but the safety signal did not. Systemic d-amphetamine significantly potentiated lever pressing in the appetitive group, but in the safety signal group it either reduced lever pressing or had no effect, depending on food deprivation state. 8-OH-DPAT and diazepam had no effect on responding in either group. The safety signal did not support AnR and, therefore, did not exhibit conditioned reinforcing properties. Furthermore, d-amphetamine decreased responding when the safety signal was presented as a consequence, whilst increasing responding with appetitive-conditioned reinforcement. These results are discussed in terms of implications for opponent motivational theory.


Introduction
Safety signals provide "relief" through signalling the absence of an aversive event, denoting places and periods within our environment free from danger. Relief itself is thought to reinforce safety-seeking behaviour, promoting such behaviour in fearful states or environments. Relief may also motivate avoidance behaviour symptomatic of many anxiety disorders. Thus, it has been argued that the pursuit of safety maintains ritualistic behaviour in order to reduce an anxious state, as in obsessive-compulsive disorder (Roper et al. 1973). The question remains, however, as to whether the reinforcement provided by a safety signal is comparable to that of a signal predicting reward.
In order to investigate the properties of safety signals, previous studies have used a procedure known as conditioned inhibition, in which a stimulus explicitly signals the absence of an otherwise expected unconditioned stimulus (US) (Pavlov 1927; Rescorla 1969). Presentation of a stimulus in the absence of an aversive US, such as shock, is thought to confer "relieving" properties to the stimulus as it predicts a period of safety and, in doing so, inhibits conditioned fear (Konorski 1967; Denny 1971; Rogan et al. 2005). Studies in rodents have demonstrated the efficacy of non-contingent presentations of a conditioned inhibitor (CI) in reducing the rate of ongoing avoidance behaviour (Moscovitch and Lolordo 1968; Shearon and Allen 1983; Litner 1969, 1971), findings that were problematic for early theories of fear in avoidance learning. One such theory, the two-factor theory of avoidance proposed by Mowrer (1947), focused particularly on the role of a warning stimulus in avoidance, stating that a conditioned stimulus, paired initially with shock and so eliciting fear (factor 1), provides relief when terminated as a result of an avoidance response (factor 2).
Through further work by Mowrer himself (1956), Dinsmoor (2001), Litner (1969, 1971) and Gray (1971, 1987), this two-factor theory of avoidance was advanced in order to include stimuli that were presented with a successful avoidance response in the absence of the US. They proposed that a CI should not only have anxiolytic properties inhibiting anxious behaviour but also, when made contingent upon an instrumental response, should function as a positive conditioned reinforcer for operant behaviour in a similar manner to a stimulus correlated with the presentation of food or water. This hypothesis is consistent with opponent motivational theory, which assumes two motivationally antagonistic systems within the brain, an appetitive and an aversive system (Konorski 1967). A safety signal that inhibits the aversive system, dis-inhibiting the appetitive system, might be perceived as motivationally equivalent to a reward-predicting stimulus that directly activates an appetitive system and, therefore, may also reinforce behaviour (Dickinson and Pearce 1977; Seymour et al. 2005; Leknes et al. 2011).
This study aims to test whether the "relief" provided by a safety signal has equivalent properties to those of an appetitive conditioned reinforcer, e.g., a stimulus predicting food, which would suggest that both act through a common positive reinforcement system, as previously argued for avoidance conditioning (e.g., Gray 1971). We, thus, first established a safety signal as a CI and then tested the potential reinforcing properties of the safety signal using an acquisition of a new response procedure (hereafter, AnR).
Conditioned inhibition can result from differing conditioning procedures (for a review, see Lolordo 1969); however, inhibitors rarely elicit active, observable behaviour, as their presentation signals the absence of a reinforcer. Rescorla (1969) proposed a two-test strategy, using a summation test and a retardation test in order to conclude inhibition has been achieved. The use of the two-test strategy is based on two alternative attentional explanations that apply for each test. The rationale is as follows. In a summation test, the putative CI is presented (i.e., tested) together with a conditioned excitor (CE); the combination of the CI and the CE should result in a reduced response compared to presentation of the CE alone (or in the presence of a control stimulus). However, this could be due to too much attention being paid to the inhibitor, diminishing responding to the excitor. In the retardation test, the putative CI is paired directly with the US. A CI requires greater excitatory training to overcome its existing conditioned inhibition than an appropriate control CS, but this could also occur due to reduced attention to the inhibitor. Neither explanation can account for the effect in the alternative test; thus, a stimulus showing the predicted effect in both tests is established as a CI.
In this study, both of these tests were used in Experiment 1 to ensure that the explicitly unpaired training regime led to the stimulus acquiring the "relieving" properties of a CI of fear.
Using the same conditioning procedure to establish a CI as in Experiment 1, the efficacy of the safety signal as a conditioned reinforcer was then tested in Experiment 2A using the stringent AnR test of conditioned reinforcement (Mackintosh 1974; Hyde 1976). This is a two-lever choice test that determines whether the putative conditioned reinforcer exerts greater effects on responding than a randomly paired stimulus (truly random control, TRC).
Further tests of the appetitive properties of the signal acting as a positive conditioned reinforcer were undertaken using pharmacological challenge with d-amphetamine in Experiment 2B. Previous studies have shown that psychomotor stimulant drugs, critically d-amphetamine, potentiate the effect of appetitive conditioned reinforcers with rate-increasing effects in an AnR procedure (Robbins 1978; Robbins et al. 1983; Sutton and Beninger 1999) via dopaminergic mechanisms (Taylor and Robbins 1986; Cador et al. 1991; Kelley and Delfs 1991). There is also some evidence that d-amphetamine potentiates the effects of negative conditioned reinforcers, as expressed by enhanced behavioural suppression (Killcross et al. 1997). We therefore compared the effects of a range of doses of d-amphetamine on responding with conditioned reinforcement provided by a safety signal. For comparison, we also tested the effects of the anxiolytic drugs, diazepam and 8-OH-DPAT, in Experiment 2B. It was hypothesised that these drugs may release behaviour suppressed by the aversive context, thus unmasking possible conditioned reinforcing effects of the safety signal.

Subjects
We used experimentally naive, male Lister-hooded rats, weighing 300 g at the start of the experiment, obtained from Charles River, UK. Four rats were housed per cage in a reverse light cycle room (12 h light:12 h dark; lights on at 0700). All experiments complied with the statutory requirements of the UK Animals (Scientific Procedures) Act 1986. Twenty-four rats were used in Experiment 1 and were maintained on free food. Thirty-two rats were used for both Experiments 2A and 2B; half of the cohort (16 rats) was food deprived to 80 % of their free-feeding weight. Eight rats were used for Experiments 3A and 3B, all of which were food deprived to 80 % of their free-feeding weight.

Apparatus
Eight operant conditioning chambers (Med Associates) were used for the three experiments, each measuring 29.5 cm×32.5 cm×23.5 cm with a Plexiglas ceiling and front door, and metal panelling on the sides and back of the chamber. The floor of the chamber was lined with absorbent paper and covered with a metal grid. Shockers (Med Associates ENV-224AMWN, 115 VAC, 60 Hz) were connected to the metal grid and used to produce a scrambled 0.5-mA footshock. All testing chambers were placed within sound- and light-attenuating boxes and interfaced to a computer through Whisker control software (Cardinal and Aitken 2010). An auditory stimulus was produced by a Med Associates tone generator (ENV-223AM) attached to one side of the chamber; white noise was produced by a Med Associates white noise generator (ENV-2255M) attached to the same wall of the chamber. Both stimuli were set to 8 dB above background level, and the tone had a frequency of 2,900 Hz. The same testing chamber was used for all stages of all experiments. Two retractable levers could be presented (during the AnR procedure) on either side of a recessed food magazine, where pellets were delivered only to the appetitive group in Experiment 2. The food magazine had an infrared beam to detect nose pokes. The levers were only extended after Pavlovian training, during the AnR and drug administration sessions.

Habituation
Rats were first habituated to the testing chamber and auditory stimuli for 2 days prior to any training. On the first day, rats received two presentations of an auditory stimulus (either tone or white noise, counterbalanced) and on the second, rats received two presentations of the alternative auditory stimulus (either white noise or tone, counterbalanced) non-contingently. The CS lasted for 16 s in Experiment 1 and 20 s in Experiments 2 and 3, and the time between presentations was 8 min and each of the two sessions lasted for 25 min.

Experimental procedure
Experiment 1: establishment of conditioned inhibitory properties of a safety signal
Explicitly unpaired training, involving the presentation of the stimulus in the absence of the US, was used to condition inhibitory properties to an auditory stimulus. Performance on the last day of training was taken as a summation test, and retardation training as a retardation test, to confirm effective inhibitory conditioning. See Table 1 for experimental design.
Inhibitory training
Rats were presented with an auditory stimulus (either a white noise or a tone, counterbalanced) for 16 s and a mild footshock (0.5 mA) as the US for 0.5 s. Inhibitory training consisted of six daily sessions; in each 25-min session, 10 presentations of the putative inhibitory stimulus (either a white noise or tone) and 10 presentations of a mild footshock (0.5 mA) were delivered explicitly unpaired, according to a randomly generated schedule. The US was never presented less than 60 s after CS termination to avoid potential forward pairings between the putative inhibitor and US. The effectiveness of this training in establishing the auditory stimulus as an aversive CI was assessed by comparing freezing during the CS with that during the 16 s prior to each CS presentation, known as the pre-CS period. If the trained stimulus functions as a CI, the rats should freeze less during the CS than during the pre-CS period. A control stimulus was not used in this experiment to avoid stimulus generalisation during the devised inhibition training protocol.
Retardation training
Rats were randomly allocated into two groups, a control group and a retardation group, with 12 rats in each. The retardation group experienced presentations of the putative inhibitor followed by a footshock (US). In the control group, the alternative, untrained but habituated stimulus was presented with the US. If the putative CI had indeed acquired inhibitory properties then the direct pairing of the putative inhibitor with the US should result in the retarded emergence of freezing behaviour compared to the control group through the course of training. The session lasted for 90 min with four pairings of either the inhibitor and footshock or a neutral stimulus and footshock depending on group assignment. Stimuli were presented for 16 s and were immediately followed by the presentation of a footshock (0.5 mA lasting for 0.5 s), and the inter-trial interval was 14 min.
Table 1 note: Retardation group n=12, control group n=12; X, inhibitory stimulus (auditory tone or white noise, counterbalanced); US, 0.5-mA shock lasting 0.5 s; Y, control stimulus (alternate auditory stimulus to X); / unpaired, paired

Data analysis
Percentage freezing time (the time spent freezing as a percentage of the total pre-CS time or total CS time) was used to measure aversive learning. Freezing was defined as the absence of all movement, aside from respiration, without regard to posture (Grossen and Kelley 1972; Bolles and Riley 1973; Bolles and Collier 1976; Fanselow and Bolles 1979). Videos of the last five trials of the last day of inhibitory training were analysed by a blind observer, recording whether the rat was moving or freezing at 2-s intervals for 32 s (16 s pre-CS and 16 s CS). An analysis of variance (ANOVA) was conducted of the last day of training with period (pre-CS vs. CS) as a within-subject factor. Videos of the retardation training, recording freezing during the four CS presentations, were also analysed with the observer blind to group. A repeated measures ANOVA of the four CS presentations during retardation training (CS1 vs. CS2 vs. CS3 vs. CS4) compared the percentage freezing time during stimulus presentation with a between-subject factor of group (retardation group vs. control group).
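The time-sampling measure described above can be illustrated with a minimal sketch. It assumes that each 2-s observation is coded as a boolean (True for freezing) and that the first 16 s of the 32-s window are the pre-CS period; the function names and the example trial are hypothetical and are not taken from the original scoring procedure.

```python
from typing import Sequence, Tuple


def percent_freezing(samples: Sequence[bool]) -> float:
    """Percentage of time-sampled observations scored as freezing."""
    if not samples:
        raise ValueError("no observations to score")
    return 100.0 * sum(samples) / len(samples)


def score_trial(observations: Sequence[bool],
                pre_cs_s: int = 16,
                interval_s: int = 2) -> Tuple[float, float]:
    """Split one trial into pre-CS and CS samples and score each period."""
    n_pre = pre_cs_s // interval_s  # 8 observations in a 16-s pre-CS period
    return (percent_freezing(observations[:n_pre]),
            percent_freezing(observations[n_pre:]))


# Hypothetical trial: freezing on 6 of 8 pre-CS samples and 2 of 8 CS samples.
trial = [True] * 6 + [False] * 2 + [True] * 2 + [False] * 6
pre_pct, cs_pct = score_trial(trial)
print(pre_pct, cs_pct)
```

A stimulus acting as a CI would be expected to give a lower CS percentage than pre-CS percentage, as in this made-up trial.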

Experiment 2
Experiment 2A sought to establish whether an aversive CI and an appetitive CS (CS+) function equivalently as conditioned reinforcers by comparing the properties of a CI, trained as in Experiment 1, directly with an appetitive CS paired with sucrose pellets. Training for an inhibitor or an appetitive stimulus was conducted in separate groups with comparable training procedures and identical stimulus and US exposure. Both groups had alternate training with a TRC stimulus, randomly presented within the session with no associative relationship to the US. After training, rats were presented with two levers in the conditioning chamber; depressing one lever led to the presentation of either the CI or CS+ depending on group, and depressing the other lever led to the presentation of the TRC. See Table 2 for experimental design. Experiment 2B assessed the effects of administration of d-amphetamine on responding for the safety signal during AnR sessions, now termed conditioned reinforcement sessions, as the subjects have had experience with the instrumental response. d-Amphetamine has been previously shown to enhance the reinforcing properties of a conditioned appetitive stimulus, selectively increasing responding to produce the appetitive conditioned reinforcer, and may do so for the safety signal. The anxiolytics 8-OH-DPAT and diazepam were also tested to see if these drugs may release responding for the safety signal during conditioned reinforcement sessions through reducing an anxious state mediated by the excitatory context.
Experiment 2A: comparison of the conditioned reinforcing properties of an appetitive CS and a safety signal
Appetitive group training
Food-deprived rats received 12 days of training, 6 days of training with an appetitive conditioned stimulus (CS+), and 6 days of training with a TRC stimulus in the same context. Training of either the appetitive CS or TRC was alternated across days. Appetitive conditioning consisted of 10 presentations of an auditory stimulus (a white noise or tone), each paired directly with a 45-mg sucrose pellet (Purina TestDiet® # 58B0 (aka # 5800-B)-AIN-76A Diet (replacement for AIN-76)) in a 25-min session.
The stimulus lasted 20 s with a sucrose pellet delivered 5 s, 10 s, 15 s, or 20 s from the start of the stimulus, randomly chosen at the start of each training day so that attention was maintained throughout stimulus presentation. TRC training consisted of 10 presentations of the alternative auditory stimulus (white noise or tone) for 20 s with a random time between presentations and 10 sucrose pellets randomly presented on a different random-time schedule in the 25-min session. Stimulus presentations were pulsed every 2 s, with a 2-s on phase and a 2-s off phase in order to mimic the 2-s stimulus presentation subsequently used in AnR (see below). Crombag et al. (2008) have shown that short duration stimuli favour AnR. However, during training, a longer stimulus was chosen in order to aid inhibitory conditioning as the inhibitor would signal a long, shock-free period during its presentation.
Safety signal group training
Rats received 12 days of training, 6 days of training with a CI and 6 days of training with a TRC stimulus. The type of training session was alternated across days. Inhibitory conditioning consisted of 10 presentations of an auditory stimulus (a white noise or tone) explicitly unpaired with a mild footshock (0.5 mA) in a 25-min session. The stimulus lasted 20 s and was never presented less than 60 s prior to a footshock to prevent forward pairings. TRC training consisted of 10 presentations of the alternative auditory stimulus (white noise or tone; counterbalanced) for 20 s and 10 footshocks randomly presented in the 25-min session in the same context. Stimuli were again pulsed as in the appetitive group.
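One way to generate an explicitly unpaired session of the kind described above is rejection sampling, sketched below. This is an illustration under stated assumptions, not the schedule generator used in the study: event times are drawn uniformly, CS presentations are not allowed to overlap, and a shock is rejected if it falls during a CS or within 60 s of its termination. All names and the seed are hypothetical.

```python
import random

SESSION_S = 25 * 60   # 25-min session, in seconds
CS_S = 20             # CS duration in Experiment 2
N_CS = 10             # CS presentations per session
N_US = 10             # footshocks per session
SAFE_S = 60           # shock-free period after CS termination


def draw_cs_onsets(rng: random.Random) -> list:
    """Draw sorted CS onsets, redrawing until no two presentations overlap."""
    while True:
        onsets = sorted(rng.uniform(0, SESSION_S - CS_S) for _ in range(N_CS))
        if all(b - a >= CS_S for a, b in zip(onsets, onsets[1:])):
            return onsets


def shock_allowed(t: float, cs_onsets: list) -> bool:
    """True if time t is outside every CS and its 60-s safe period (assumed rule)."""
    return all(not (on <= t < on + CS_S + SAFE_S) for on in cs_onsets)


def draw_schedule(seed: int = 0):
    """Return (CS onsets, shock times) for one explicitly unpaired session."""
    rng = random.Random(seed)
    cs_onsets = draw_cs_onsets(rng)
    shocks = []
    while len(shocks) < N_US:
        t = rng.uniform(0, SESSION_S)
        if shock_allowed(t, cs_onsets):  # reject shocks that would follow a CS
            shocks.append(t)
    return cs_onsets, sorted(shocks)


cs_onsets, shocks = draw_schedule(seed=1)
```

Because the excluded windows cover at most 800 s of the 1,500-s session, the shock loop always has admissible times to draw from. A TRC session would simply skip the `shock_allowed` check.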
Acquisition of a new response with an appetitive CS or safety signal
Rats were then presented with two levers in the same chamber as used for appetitive or safety signal training. Responding on one lever resulted in the presentation of the CI or CS+ depending on group assignment; responding on the other led to presentation of the TRC stimulus. The session lasted for 30 min, and stimuli were presented for 2 s on a variable ratio of one to three responses, with parameters similar to those used in previous studies of appetitive-conditioned reinforcement (Taylor and Robbins 1986; Robbins 1978; Cador et al. 1991). The first response made on either lever was reinforced so that rats experienced the response-stimulus outcome contingencies for both levers from the outset. AnR was conducted for 2 days.
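The lever contingency can be sketched as a small state machine. The sketch assumes that the response requirement is redrawn uniformly from 1, 2 or 3 after each stimulus presentation, which is one reading of "a variable ratio of one to three responses"; the class and attribute names are hypothetical.

```python
import random


class VariableRatioLever:
    """Lever that presents its stimulus on a variable ratio of 1 to 3 responses."""

    def __init__(self, rng: random.Random, low: int = 1, high: int = 3):
        self.rng = rng
        self.low = low
        self.high = high
        self.required = 1  # the first response on each lever is always reinforced
        self.count = 0

    def press(self) -> bool:
        """Register one lever press; return True if the 2-s stimulus is presented."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            # Assumed rule: next requirement drawn uniformly from low..high.
            self.required = self.rng.randint(self.low, self.high)
            return True
        return False


rng = random.Random(0)
active = VariableRatioLever(rng)    # would produce the CS+ or CI
inactive = VariableRatioLever(rng)  # would produce the TRC
first_active = active.press()
first_inactive = inactive.press()
```

Using two independent lever objects keeps the schedules for the CS+ or CI lever and the TRC lever separate, so the only programmed difference between levers is the stimulus produced.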

Data analysis
Safety signal training
Percentage freezing time was used as a measurement of the conditioned emotional response in the safety signal group as in Experiment 1. A 2×2 ANOVA of the last day of inhibitory training was conducted with period (pre-CS vs. CS) and stimulus (CI vs. TRC) as within-subject factors. The pre-CS period, the 20 s prior to CS presentation, and the stimulus period, the 20 s of CS presentation, were analysed by recording whether the rat was moving or freezing at 2-s intervals for a total of 40 s. The last five trials of the last day of training were analysed. If stimuli or the US from a previous trial overlapped with the pre-CS period of a subsequent trial, then this trial was excluded and the preceding trial was scored and included in the analysis instead.
Appetitive training
A 2×2 ANOVA of period (pre-CS vs. CS) by stimulus (CS+ vs. TRC) was performed on the mean number of food magazine approaches during the last day of training. The mean number of nose pokes (responses in which the nose of the rat breaks a photo beam on entering the food magazine) during the 5 s prior to CS onset (pre-CS) and the first 5 s of the stimulus presentation (CS) was taken as an index of learning. The first 5 s of the CS were used as they were free from reinforcement, and an equivalent period of time was chosen for the pre-CS period for analysis.
Acquisition of a new response with an appetitive stimulus or safety signal
The average number of lever presses across the 2 days of AnR was square-root transformed, because variance increased with mean responding, and taken as a measure of AnR. A mixed ANOVA was conducted with stimulus (CS+ or CI vs. TRC) as a within-subject factor and a between-subjects factor of group (safety signal vs. appetitive group).
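The AnR measure described above amounts to averaging the two daily lever-press counts and then taking the square root. A minimal sketch, with a hypothetical function name and made-up counts, follows.

```python
import math


def sqrt_mean_presses(day1: int, day2: int) -> float:
    """Square root of the mean lever-press count across the 2 days of AnR."""
    if day1 < 0 or day2 < 0:
        raise ValueError("lever-press counts cannot be negative")
    return math.sqrt((day1 + day2) / 2)


# Hypothetical rat: 30 and 20 presses on the active lever, 10 and 8 on the inactive.
active_score = sqrt_mean_presses(30, 20)
inactive_score = sqrt_mean_presses(10, 8)
print(active_score, inactive_score)
```

The transform compresses high counts more than low ones, which is why it is commonly applied to count data whose variance grows with the mean.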
Experiment 2B: effects of d-amphetamine, 8-OH-DPAT and diazepam on conditioned reinforcement with appetitive CS vs. safety signals
See Table 2 for experimental design.
Drug administration
All three drugs were counterbalanced for order of administration and order of dose for each drug in both the appetitive and safety signal group. Before a new drug was administered, rats remained in their home cages for 2 days and then received training of the target stimulus (CS+ or CI) and control stimulus (TRC) to prevent extinction of the context before the next drug was administered. Intra-peritoneal injections of d-amphetamine and diazepam were administered 15 min and 30 min, respectively, prior to a 30-min AnR session (now termed a conditioned reinforcement session as the rats had experience with the response in the previous experiment) as described in Experiment 2. 8-OH-DPAT was administered subcutaneously 30 min prior to the conditioned reinforcement session.

Data analysis
The square-root-transformed number of lever presses during the conditioned reinforcement session was analysed using dose and stimulus as within-subject factors and group (safety signal vs. appetitive) as a between-subject factor. The same analysis (without stimulus as a within-subjects factor) was conducted for the square root (SQRT) of the mean number of nose pokes within a session to provide a general measure of activity within the box. In addition, random sample scoring of the videos was also conducted as a measure of activity during conditioned reinforcement sessions, where rats' freezing behaviour was scored in 5-min bins for 40 s every 8 s. Four rats were analysed at a time with the observer blind to drug condition. Results are represented as the percentage freezing time of the total time sampled.

Experiment 3
Food deprivation is known to alter activity levels and has also been shown to interact with the effects of systemically administered d-amphetamine on behaviour (Campbell and Fibiger 1971). In order to control for these effects and produce a group comparable in state to the appetitive group of Experiment 2, a safety signal group was trained using the same protocol as in Experiment 2 but was food deprived for the entire experiment. These food-deprived, safety signal group rats were then tested in an AnR test and under d-amphetamine as performed in Experiment 2.
Experiment 3A: effects of food deprivation on the reinforcing properties of a safety signal
Training and AnR protocols were the same as for the safety signal group in Experiment 2.
Data analysis
A within-subjects ANOVA was conducted on the square-root transformed average number of lever presses across the 2 days of AnR with stimulus as a within-subjects factor (CI vs. TRC).
Experiment 3B: effects of d-amphetamine on conditioned reinforcement with safety signals in food-deprived rats
The drug was prepared in the same manner as in Experiment 2 and administered using the same protocol.

Data analysis
The square-root-transformed number of lever presses during the conditioned reinforcement sessions was analysed using dose (veh vs. 0.5 mg/kg vs. 1.5 mg/kg) and stimulus (CI vs. TRC) as within-subject factors. The same analysis (without stimulus as a within-subjects factor) was conducted for the SQRT of the mean number of nose pokes within a session to provide a general measure of activity within the box. Behaviour was not observed during the conditioned reinforcement sessions under d-amphetamine and so analysis of freezing behaviour is not included for this experiment.

Summation test: inhibitory training analysis
Training with a CS explicitly unpaired with shock led to the suppression of conditioned freezing during the presentation of the CS in the last day of training (Period F (1, 23) =24.3, p<.001) (Fig. 1a). These results indicate that the stimulus had acquired inhibitory properties as a result of the training protocol.

Retardation training
The retardation group was slower to acquire freezing (assessed as percentage freezing time during the stimulus presentation) when compared with the control group (Fig. 1b), showing that prior inhibitory training retarded the emergence of freezing behaviour when compared to a habituated, neutral stimulus. There was a main effect of trial (F (2.4, 43.3) =7.1, p<.001, Huynh-Feldt adjustment) and a between-subject effect of group (F (1,18) =5.1, p<.05). These group differences appeared through the course of retardation training, as no significant differences were observed between the retardation and control groups in freezing prior to the first CS presentation (F<1). Two rats were excluded from the control group because the testing equipment failed to deliver the first CS-US pairing, and two rats were excluded from the retardation group as outliers, their mean freezing responses being more than two standard deviations away from the mean.
Experiment 2A: comparison of the conditioned reinforcing properties of an appetitive CS and a safety signal

Appetitive group training
The paired stimulus (CS+) induced greater enhancement of nose poking at the food magazine than the TRC with respect to the pre-CS period (Stimulus × Period interaction; F (1,15) =5.7, p<.05), indicating the acquisition of Pavlovian conditioning to the appetitive US (Fig. 2a). There was also a main effect of period (F (1,15) =29.7, p<.01).

Safety signal group training
Rats froze less during the presentation of the inhibitor than during the presentation of the TRC (main effect of stimulus: F (1,15) =7.8, p<.02; main effect of period: F (1,15) =26.6, p<.001; and an interaction of Stimulus × Period: F (1,15) = 7.4, p<.02), demonstrating effective inhibitory conditioning (Fig. 2b). Pairwise comparisons of responding during the CS period revealed a significant difference between freezing during the CI vs. during the TRC (p<.005).
A two-tailed binomial test revealed a significant preference for the active lever in the appetitive group, with 13 out of 16 rats preferentially responding on the active lever that produced the appetitive CS. Analysis of the safety signal group demonstrated no difference in responding for the CI over the TRC (F<1, NS).
In the appetitive group (Fig. 4a), enhanced responding occurred on both levers with the administration of d-amphetamine (F (2,30) =7.022, p < .001), with preferential responding for the appetitive CS+ (F (1,15) =9.5, p<.02) but no significant interaction (F<1, NS). Planned comparisons revealed a significant difference between the effects of vehicle and 0.5 mg/kg (p < .01), and between the effects of d-amphetamine on the two levers at every dose level (p<.02).
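The two-tailed binomial test on lever preference in the appetitive group (13 of 16 rats favouring the active lever) can be worked through as follows. The sketch assumes a null probability of 0.5 for preferring either lever and sums the probabilities of all outcomes no more likely than the observed one; the paper does not state the exact variant of the test used, and the function name is hypothetical.

```python
from math import comb


def binomial_two_tailed(k: int, n: int) -> float:
    """Exact two-tailed binomial p-value under a null probability of 0.5.

    With p = 0.5 every outcome i has probability comb(n, i) / 2**n, so the
    outcomes "as or more extreme" than k are those with comb(n, i) <= comb(n, k).
    """
    observed = comb(n, k)
    extreme = sum(comb(n, i) for i in range(n + 1) if comb(n, i) <= observed)
    return extreme / 2 ** n


p_value = binomial_two_tailed(13, 16)
print(round(p_value, 4))  # 0.0213
```

Under these assumptions the tails comprise 0 to 3 and 13 to 16 rats, giving 1,394 of 65,536 equally likely outcomes, which is below the conventional .05 threshold and so agrees with the significant preference reported.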

Analysis of food magazine nose-poking responses
The effects of d-amphetamine on nose poke responding in the food magazine revealed a main effect of dose (F (2,60) =6.3, p<.001) with no interaction (F (2,60) <2.0) and a between-subject effect of group (F (1,30) =75.4, p<.001) reflecting an

Video analysis of behaviour
Analysis of the videos from the drug sessions revealed a decrease in percentage freezing time in the safety signal group (Fig. 4f) and immobility in the appetitive group (Fig. 4e) with increasing doses of d-amphetamine (F (2,50) = 3.3, p<.05) across both groups with no interaction. No effects were seen of 8-OH-DPAT or diazepam on freezing in the safety signal group (F (2,30) <2.5 and F<1.0, NS, respectively) nor immobility in the appetitive group (F (2,30) <2.0 and F<1.0, NS, respectively).
Experiment 3A: effects of food deprivation on the reinforcing properties of a safety signal
Non-preferential responding was observed during the AnR test (Fig. 5a) (Stimulus F<1.0, NS), suggesting food deprivation neither influenced inhibitory conditioning nor aided the transfer of any putative reinforcing properties of the safety signal.
Experiment 3B: effects of d-amphetamine on conditioned reinforcement with safety signals in food-deprived rats
d-Amphetamine (0.5-1.5 mg/kg) non-significantly increased lever pressing in the food-deprived safety signal group (Dose F (1.0, 7.2) =3.0, p>.1) (Fig. 5b). Critically, there was no significant preference for the safety signal (Stimulus F (1,7) =1.0, p>.3) nor did d-amphetamine significantly affect responding for the safety signal (Dose × Stimulus F<1.0, NS). However, d-amphetamine did dose-dependently increase nose-poking in the magazine (Fig. 5c) (F (2,14) =10.9, p<.005).
Fig. 3 Mean of the square root of lever presses during the acquisition of a new response test. a Responding on the active lever in the appetitive group led to the appetitive CS (CS+) and responding on the inactive lever led to the truly random control stimulus (TRC) for 2 s. *p<.05 with respect to TRC. b Responding on the active lever in the safety signal group led to the conditioned inhibitor (CI) and responding on the inactive lever led to the truly random control stimulus (TRC) for 2 s. All error bars in figures represent the standard error of the means (SEM)
In an additional statistical analysis, direct comparison of the results obtained with the food-deprived safety signal group vs. the sated safety signal group reported in Experiment 2 under d-amphetamine revealed no effect of the within-subject factors of Stimulus and Dose or between-subject effect of Group but a significant interaction of Dose × Group (F (2,44) =8.0, p<.005) on overall lever pressing during the conditioned reinforcement sessions. For nose-poking in the food magazine, a between-subject effect of Group (F (1,22) =24.3, p<.001) and a significant interaction of Dose × Group (F (2,44) =14.9, p<.001) were also seen. Thus, the effects of d-amphetamine on nose-poking as well as lever-pressing were dependent on food deprivation state, but there was no evidence of a differential effect on responding for the safety signal even though the two groups were run separately.
Fig. 4 The effects of systemic d-amphetamine on behaviour during a conditioned reinforcement session. a Mean of the square root of the lever presses during a conditioned reinforcement session, where responding on the active lever led to the presentation of the appetitive stimulus (CS+) and responding on the inactive lever led to the presentation of the truly random control stimulus (TRC). *p<.05 with respect to vehicle. b Mean of the square root of the lever presses; responding on the active lever led to the presentation of the inhibitory stimulus (CI) and responding on the inactive lever led to the presentation of the truly random control stimulus (TRC). *p<.05 with respect to vehicle. c The effects of systemic d-amphetamine on the mean square root transformed number of nose pokes in the appetitive group during a conditioned reinforcement session. d The effects of systemic d-amphetamine on the mean square root transformed number of nose pokes in the safety signal group during a conditioned reinforcement session. *p<.05 with respect to vehicle. e, f Random sampling of videos of immobility in the appetitive group (e) and freezing in the safety signal group (f). *p<.05 with respect to vehicle. All error bars in figures represent the standard error of the means (SEM)

Discussion
This study was designed to compare directly an appetitive stimulus paired with food with a CI of fear ("safety signal"), testing whether an excitor of one motivational system is functionally equivalent to the inhibitor of an oppositely valenced motivational system. The findings indicate that safety signals did not demonstrate appetitive-conditioned reinforcing properties as measured by the AnR procedure. Furthermore, given that psychomotor stimulant drugs have been shown to potentiate responding with conditioned reinforcement via their central dopaminergic actions (e.g., Sutton and Beninger 1999), we also tested the effects of d-amphetamine as well as certain anxiolytic drugs (diazepam and 8-OH-DPAT) on the conditioned reinforcing properties of a safety signal using AnR. Although d-amphetamine non-selectively potentiated appetitive responding maintained by conditioned reinforcers, it actually produced a reduction in responding in the safety signal group or no effect when the safety signal group was food-deprived. Thus, conditioned fear-relieving properties do not always transfer to exhibit conditioned reinforcing properties as suggested by the theories of Mowrer (1956), Gray (1971) and Dinsmoor (2001). These results will be discussed further in terms of the enhancement of conditioned reinforcement by psychomotor stimulants.
CIs rarely elicit behavioural responses, making it difficult to assess the efficacy of conditioning quantitatively. However, in this study, we demonstrated statistically robust conditioned inhibition that also met the stringent criteria for its demonstration. The CS explicitly unpaired with shock (1) acquired the ability to suppress conditioned freezing to the context (summation) (Fig. 1a) and (2) was slower to acquire excitatory conditioning to shock (retardation) (Fig. 1b), in agreement with Rescorla's (1969) criteria for conditioned inhibition. The results of Experiment 1, therefore, confirm that the explicitly unpaired training procedure in this study generated a true CI. The stimulus also acquired the ability to suppress conditioned freezing in comparison to a TRC stimulus during training in Experiment 2, emphasising that conditioned inhibition was acquired as a result of the negative CS-US contingency and not mere exposure. Bearing in mind that, once established, the CI does not undergo extinction (Zimmer-Hart and Rescorla 1974), these findings make it most unlikely that subsequent failures to demonstrate the conditioned reinforcing properties of a safety signal resulted from ineffective inhibitory conditioning due to extinction.
The AnR procedure successfully demonstrated conditioned reinforcement with an appetitive CS (Fig. 3). However, under similar circumstances, the safety signal acting as a CI did not function as a conditioned reinforcer. Attempts to assess the reinforcing properties of an inhibitor on instrumental responding have been marked by constraints and failures (for a review, see Beck 1961). One study claiming to have demonstrated the reinforcing properties of a neutral stimulus associated with the termination of shock in AnR was conducted by Kinsman and Bixenstine (1968). This study differed markedly from ours by using an AnR procedure preceded by a conditioning phase with an active shuttle box. However, flaws in the design of this study leave considerable doubt as to, firstly, whether this stimulus was, in fact, an inhibitor of fear and, secondly, whether it possessed positive reinforcing properties. Notably, they failed to use the two-test strategy of Rescorla (1969) necessary for concluding that the target stimulus was indeed an inhibitor of fear. A further, major limitation of the Kinsman and Bixenstine study was the presentation of the US within the AnR session: in order to assess the associative properties conditioned to the putative CS, it must be presented in the absence of an explicit US; otherwise, the test procedure becomes equivalent to a learning session. By contrast, we tested the effects of the CI in the shock-predicting context in the absence of foot shock. Taking into account all these relevant design considerations, the present study demonstrated that a safety signal trained as a Pavlovian CI did not support the acquisition of a new response, a stringent criterion of positive conditioned reinforcement.
Bi-directional effects of d-amphetamine on signals of reward vs. relief
A major finding is that d-amphetamine significantly increased responding in the situation in which appetitive-conditioned reinforcement was available but significantly decreased or had no effect on responding in a similar situation in which the safety signal was contingent on responding. This strongly argues against safety signals having appetitive-conditioned reinforcing effects or at least suggests that any enhancement of the reinforcing properties of a safety signal is unlikely to be dopamine-dependent, as has been shown with appetitive-conditioned reinforcers (Taylor and Robbins 1986; Cador et al. 1991). Our findings are consistent with those of Josselyn et al. (2005) who found that d-amphetamine administered to the intra-core subregion of the nucleus accumbens also failed to alter the effects of CIs, despite similar infusions of d-amphetamine potentiating the effects of appetitive-conditioned reinforcement (e.g., Taylor and Robbins 1984). We tested the effects of d-amphetamine on responding with conditioned reinforcement (directly comparing appetitive-conditioned reinforcers with safety signals) in order to assess possible differences in efficacy of the two classes of stimuli over a range of changes in response rates produced by the drug. Previous studies have shown that weak conditioned reinforcing effects can be magnified by treatment with psychomotor stimulant drugs such as d-amphetamine (e.g., Robbins et al. 1983). Generally, such potentiation is relatively selective, occurring significantly on only one lever. However, in the present case, although the appetitive-conditioned reinforcer consistently led to more responding than the TRC stimulus, d-amphetamine increased responding on both levers (Fig. 4a).
Previous studies using the AnR procedure have generally found selective increases in responding producing the CS+ and no significant increases on the control lever (resulting in significant Drug × Lever interactions) (see reviews by Sutton and Beninger 1999; Robbins et al. 1989). However, in those studies, responding on the control lever has generally had no consequence, whereas in this study, it produced a TRC stimulus, which has a random associative relationship with the US. Evidence of the selectivity of the rate-increasing effect of psychomotor stimulants with conditioned reinforcement was previously provided by the use of separate groups responding for a novel stimulus or for a TRC alone (Robbins 1976; Taylor and Robbins 1984). The present study has shown that although responding is potentiated with an appetitive-conditioned reinforcer, the potentiation can also extend to a TRC that has been paired on a number of occasions by chance with a US, but not in a predictive manner, in the same test context during AnR. This chance pairing may confer weak conditioned reinforcing properties to the TRC, which were evidently potentiated by d-amphetamine in the present study. Overall, the drug appears to act as a "gain amplifier" of responding that has a certain minimum tendency or response strength.

Fig. 5 a The mean of the square root-transformed average responses made; responding on the active lever led to the presentation of the inhibitory stimulus (CI) and responding on the inactive lever led to the presentation of the truly random control stimulus (TRC). b The mean of the square root of responses on the active lever or inactive lever during AnR when tested with three doses of d-amphetamine and c mean of the square root-transformed number of nose pokes made during AnR sessions following d-amphetamine

This propensity of psychomotor stimulants to increase responding of a certain minimal tendency has also been previously noted in the literature (Clark and Steele 1966; Lyon and Robbins 1975). Some degree of behavioural selectivity of the effect of d-amphetamine can, however, be observed in its concurrent lack of enhancement of approach responses to the food magazine as measured by nose poke responding (Fig. 4c). Such responses are normally controlled by Pavlovian influences and the increase in responding with the appetitive CS can, therefore, be attributed to a dissociation of their conditioned reinforcing and Pavlovian elicitation properties under the drug (see also Robbins 1978). It is important to note that nose-poking was dose-dependently increased by d-amphetamine in the safety signal, food-deprived group (Fig. 5c).
d-Amphetamine-induced increases in lever press responding only occurred when the food-related conditioned reinforcer contingency was available; when the safety signal was presented contingent on responding, rats failed to exhibit any drug-dependent increase in lever pressing, whether in a comparable state of food deprivation or not (Figs. 4b and 5b). In fact, significant, non-selective reductions in responding were observed (Fig. 4b) (or no net effect at all in safety signal conditioned rats tested under food deprivation, Fig. 5b). The significant dose-dependent reduction in lever pressing was also accompanied by a decrease in nose-poking in the sated safety signal group of Experiment 2 with increasing doses of d-amphetamine (Fig. 4d). The decrease in both lever pressing and nose-poking in the safety signal group of Experiment 2 could have arisen for a number of reasons. d-Amphetamine has anxiogenic effects under certain circumstances (Thiébot et al. 1991; Killcross et al. 1997; Foree et al. 1973) and DA neurons have been shown to respond to aversive stimuli (Brischoux et al. 2009; Lammel et al. 2011; Budygin et al. 2011). However, parallel decreases in freezing following d-amphetamine in this safety signal group suggest that the rats were no longer fearful of the context (Fig. 4f). Moreover, when the effects of d-amphetamine on the properties of the signal acting as a conditioned reinforcer were tested under food deprivation, the rats showed dose-dependent increases in nose-poking behaviour (i.e., visits to the empty food magazine), showing that the drug was exerting a psychomotor stimulant effect in this context; thus, the failure to exhibit rate-increasing effects in lever pressing for the safety signal is especially significant.
The relative behavioural selectivity of effects of d-amphetamine is also highlighted by the lack of effect of the anxiolytics diazepam and 8-OH-DPAT on responding for the safety signal. One prediction was that these anxiolytics would increase responding for the safety signal if it acted as an appetitive-conditioned reinforcer through reducing the response-suppressant impact of the aversive context. Another possibility is that responding for the safety signal would actually be reduced, if the drugs themselves induced "relief". However, neither diazepam nor 8-OH-DPAT had any effects on responding for the safety signal. Higher doses of both drugs could be tested in this paradigm.

Theoretical considerations
The present study failed to demonstrate any conditioned reinforcing effects of a safety signal using a traditional acquisition of a new response procedure (Hyde 1976). This failure may be due to inherent difficulties in demonstrating Pavlovian-instrumental transfer when the context for doing so elicits incompatible behaviour (Holmes et al. 2010). A more successful approach may depend on assessing the effects of a safety signal whose presentation is contingent upon instrumental avoidance behaviour (Moscovitch and Lolordo 1968; Shearon and Allen 1983; Weisman and Litner 1971). Common to these studies is the initial training of an instrumental avoidance or escape response that provides the subject with control over its aversive environment. The successful termination of an aversive event by an instrumental response is paired with the presentation of a safety signal. Thus, these dual factors, the termination of an aversive event being controlled by the subject and the presentation of the signal being contingent on this successful termination, may both be necessary for conferring conditioned reinforcing properties on the safety signal. The inability of a safety signal trained as a CI in this study to support AnR might, therefore, be attributed to the lack of control provided to the subject over its aversive environment or a failure of transfer from an inhibitor trained in the absence of instrumental behaviour to a conditioned reinforcer presented contingently upon an instrumental response. Further studies will examine the implications of this hypothesis by investigating drug effects on instrumental avoidance with and without safety signal feedback.

Summary
The putative conditioned reinforcing properties of safety signals were measured in a traditional paradigm for examining such effects, the acquisition of a new response procedure, and were compared with those produced by an appetitive CS+. No conditioned reinforcing effects of the safety signal could be detected, either at baseline or after treatment with the psychomotor stimulant d-amphetamine or the anxiolytics 8-OH-DPAT and diazepam. d-Amphetamine produced only a reduction in responding or no effect, in contrast with the potentiation of responding with appetitive-conditioned reinforcement observed following the drug. These findings suggest that safety signals do not always demonstrate conditioned reinforcing properties, thus challenging the assumptions of certain theoretical formulations.