Context-induced relapse after extinction versus punishment: similarities and differences

Results from clinical studies suggest that drug relapse and craving are often provoked by exposure to drug-associated contexts. Since 2002, this phenomenon has been modeled in laboratory animals using the ABA renewal model. In the classical version of this model, rats with a history of drug self-administration in one context (A) undergo extinction in a different context (B) and reinstate (or relapse to) drug seeking after exposure to the original drug-associated context (A). In a more recent version of the model introduced in 2013, the experimental conditions in context A are identical to those used in the classical model, but drug-reinforced responding in context B is suppressed by probabilistic punishment. The punishment-based ABA renewal model is proposed to resemble abstinence in humans, which is often initiated by the desire to avoid the negative consequences of drug use. The goal of our review is to discuss similarities and differences in mechanisms that play a role in suppression of drug seeking in context B and context-induced relapse to drug seeking in context A in the two models. We first describe psychological mechanisms that mediate extinction and punishment of drug-reinforced responding in context B. We then summarize recent findings on brain mechanisms of context-induced relapse of drug seeking after extinction, or punishment-imposed abstinence. These findings demonstrate both similarities and differences in brain mechanisms underlying relapse in the two variations of the ABA renewal model. We conclude by briefly discussing clinical implications of the preclinical studies.


Introduction
A major obstacle in the treatment of drug addiction is relapse to drug use after periods of abstinence (Hunt et al. 1971; Sinha 2011). In former drug users, drug craving and relapse during abstinence are often triggered by environments or contexts that were previously associated with drug use (O'Brien et al. 1992; Wikler 1973). This clinical scenario has been modeled in laboratory rats by using a variation of the extinction-reinstatement model (de Wit and Stewart 1981; Shaham et al. 2003) that is based on the ABA renewal model (Bouton and Bolles 1979). In the classical ABA renewal model (also termed context-induced reinstatement), rats with a history of drug self-administration in one context (A) undergo operant extinction in a different context (B) and reinstate (or relapse to) drug seeking in context A (Crombag et al. 2008). The operational definition of reinstatement or relapse in the model is significantly higher non-reinforced operant responding in the original drug self-administration training context A than in the extinction context B (Crombag et al. 2008). Since the initial demonstration with speedball (a heroin-cocaine combination), context-induced reinstatement (or relapse) of drug seeking after extinction-induced abstinence has been observed with heroin (Bossert et al. 2004), cocaine (Fuchs et al. 2005), alcohol (Chaudhri et al. 2008; Hamlin et al. 2007), nicotine (Diergaarde et al. 2008), and methamphetamine (Rubio et al. 2015; Widholm et al. 2011).
From a translational perspective, however, one important aspect of drug addiction that extinction does not model is the negative consequences of drug use. Specifically, human abstinence rarely involves lack of drug availability or overt extinction of drug-seeking responses (Epstein et al. 2006; Katz and Higgins 2003; Marlatt 2002). Instead, abstinence is typically initiated while the drug is available because of the desire to avoid the negative consequences associated with excessive drug use (Epstein and Preston 2003). Based on these considerations, we recently developed a context-induced relapse model that incorporates the negative consequences of drug use, in which alcohol self-administration is suppressed by adverse consequences (probabilistic operant punishment) (Marchant et al. 2013a). This model is a modified ABA renewal procedure in which abstinence is achieved in context B, despite alcohol availability, by punishment with response-contingent electric footshock. Using this model, we have demonstrated context-induced relapse to both alcohol and cocaine seeking when rats were tested in context A after punishment-imposed abstinence in context B (Marchant et al. 2014; Pelloux et al. 2018a, b). Our model and findings extend previous research in the addiction field on the use of punishment procedures to model the negative consequences of drug seeking and drug use (Deroche-Gamonet et al. 2004; Marchant et al. 2013b; Panlilio et al. 2003; Pelloux et al. 2007; Vanderschuren et al. 2017; Vanderschuren and Everitt 2004; Wolffgramm and Heyne 1995).
In the extinction- and punishment-based ABA renewal models, the test conditions are identical. Drug seeking in both situations is induced by exposure to drug-associated contexts, and the tests occur under extinction conditions. However, the methods used to impose abstinence are different. Extinction-imposed suppression of drug seeking occurs in the absence of the drug, while punishment-imposed suppression of drug seeking occurs in the presence of the drug. Thus, the psychological mechanisms that underlie these two processes are different, and because of this, context-induced relapse may also rely on different neuronal mechanisms.
In this review, we first discuss methodological and conceptual issues related to the study of extinction and punishment of operant responding. We then summarize recent findings on brain areas and circuits that play a role in context-induced relapse of drug seeking after extinction- and punishment-imposed abstinence. We conclude by discussing the clinical implications of the similarities and differences in mechanisms underlying relapse, as assessed in these two models.

Extinction and punishment: methodological and conceptual considerations
In operant conditioning, extinction and punishment are examples of retroactive interference learning that decreases the behavioral expression of the original response-outcome (R-O) operant learning (Bouton 1993, 2000). One key feature of retroactive interference learning is that the expression of the learned behavior is context-dependent (Bouton 1993, 2002). Indeed, renewal of reward seeking is observed in the original training context (A) when extinction or punishment occurs in a different context (B) (Baker et al. 1991; Bouton and Schepers 2015; Marchant et al. 2013a; Nakajima et al. 2000). Thus, an important similarity between extinction and punishment is that they both involve context-dependent learning that modulates the expression of the original operant association.
An important difference between extinction and punishment is that during extinction, the subject must learn a new association in which the operant response now leads to no reward (no outcome). In contrast, during punishment, the subject learns about a new relationship between the original operant response and a second stimulus (footshock), which has opposite motivational valence to the original appetitive stimulus (i.e., food or drug). Therefore, the key differences between extinction and punishment are that punishment involves a continued presence of the appetitive stimulus (reward), and that there is an additional aversive stimulus in punishment (shock). This operational difference causes differences in the underlying psychological mechanisms that are responsible for controlling behavior.
In extinction, new learning occurs regarding the fact that the response no longer leads to the drug reward (the outcome), and drug seeking is reduced. Extensive research on Pavlovian conditioning has led to the proposal that the extinction context functions as an occasion setter (Holland 1992). In this configuration, the context gates the expression of different associations (i.e., either CS-US or CS-NoUS). It has been proposed that this mechanism may also apply to context-induced reinstatement of extinguished drug seeking. However, recently Todd, Bouton, and colleagues have demonstrated that in instrumental conditioning, the extinction context forms a direct inhibitory association with the operant response (Bouton and Todd 2014; Todd et al. 2014). To date, comparable studies have not been conducted in rats trained to perform drug-reinforced instrumental responses. However, it is possible that extinction of drug-reinforced responses is also mediated by a direct inhibitory association between the extinction context and the drug-reinforced response.
The learning that occurs during punishment can form both operant and Pavlovian associations (Jean-Richard-Dit-Bressel et al. 2018). Thus, despite the fact that shock delivery is contingent on the operant response (R-O association), the introduction of a new stimulus (footshock) can cause the formation of Pavlovian stimulus-outcome (S-O) associations (e.g., lever-shock) (Jean-Richard-Dit-Bressel et al. 2018; Marchant et al. 2017). Because of this inherent confound, it is important to consider whether punishment-imposed abstinence is mediated by operant R-O punishment learning or whether Pavlovian S-O fear conditioning can account for the suppression of the operant response (Estes and Skinner 1941). In this regard, exposure to Pavlovian cues and contexts previously paired with shock can suppress ongoing operant responding (Bouton and Bolles 1979; Estes and Skinner 1941; Pickens et al. 2009). The distinction between R-O operant punishment and S-O fear conditioning is important because different neurobiological mechanisms mediate operant punishment and Pavlovian conditioned suppression (Jean-Richard-Dit-Bressel et al. 2018).
Evidence for operant R-O learning in punishment was demonstrated by Bolles et al. (1980). They showed that S-O (Pavlovian) fear learning occurs during the initial punishment session, while R-O (operant) learning emerges towards the end of the first session and is fully apparent during the next training session. Jean-Richard-Dit-Bressel and McNally (2015) confirmed this finding using a two-lever design with retractable levers. They found that punishment of the response on one lever causes moderate levels of freezing (an index of conditioned fear) to both levers during the initial punishment sessions. However, during subsequent sessions, freezing induced by the extension of the levers is reduced, responding on the punished lever is suppressed, and responding on the unpunished lever is increased. Additional evidence that R-O operant punishment mediates suppression of drug seeking in context B in the studies reviewed below is that noncontingent random shock exposure in context B, which is independent of lever pressing, has no effect on alcohol self-administration (Marchant et al. 2013a). Bouton and Schepers (2015) and Pelloux et al. (2018a) expanded on this finding using a yoked-shock design where the number and temporal distribution of shocks between the punished and yoked groups are identical. They found that yoked non-contingent shock exposure has no effect on food or cocaine self-administration in context B.
Together, the studies reviewed above indicate that Pavlovian S-O learning occurs during the initial operant punishment learning but its impact on operant responding is limited to early learning, and is likely to extinguish over time. The recent observations discussed above that contingent but not non-contingent shock selectively suppresses alcohol, food, and cocaine self-administration further indicate that punishment-imposed abstinence is primarily mediated through operant R-O associations in studies using the punishment-based ABA renewal model (Bouton and Schepers 2015; Marchant et al. 2013a; Pelloux et al. 2018a).
In conclusion, while both extinction and punishment lead to context-dependent suppression of drug seeking in context B and renewal of drug seeking in context A, there are important differences in the learning and psychological mechanisms that contribute to the suppression of the operant response in context B. A question for future research is whether there are also differences in the psychological mechanisms of renewal in context A after extinction versus punishment.

Brain mechanisms of context-induced relapse after extinction and punishment
In this section, we review results on the similarities and differences between context-induced relapse after extinction- or punishment-imposed suppression of drug seeking. We focus on studies using rats trained to self-administer alcohol or cocaine, because published studies on mechanisms of context-induced relapse after punishment with other drugs of abuse do not exist. We do not discuss circuit-related results from other studies using the classical extinction-based ABA renewal model, which we have recently reviewed elsewhere.
In Table 1, we provide correlational data from several studies in which we and other investigators have used the neuronal activity marker Fos (Cruz et al. 2013; Morgan and Curran 1991) to identify brain regions selectively activated during tests for context-induced relapse after extinction- or punishment-imposed abstinence. This table shows both similarities and differences in the regions activated during the relapse tests in the two models. We do not discuss the data described in Table 1 within the context of brain areas and circuits that play a role in context-induced relapse after extinction or punishment for two reasons. First, it cannot be ruled out that procedural differences related to the training, extinction, relapse test, and Fos assay conditions across the different studies can account for the observed differences in Fos expression (see Pelloux et al. (2018a) for a discussion of this issue). The second reason is that in the absence of follow-up functional causal role manipulations, correlational Fos data should be interpreted with caution. This is because Fos induction in different brain areas can reflect either the cause or the consequence of relapse to drug seeking and does not necessarily imply that a given brain area plays a causal role in relapse (Bossert et al. 2011; Cruz et al. 2013). In Table 2 and Fig. 1, we summarize the results on the effect of causal role neuropharmacological manipulations on context-induced relapse to drug seeking after extinction versus punishment, and discuss these data for each brain area.
Lateral hypothalamus
Results from two studies indicate that lateral hypothalamus (LH) activity is critical for context-induced relapse to alcohol seeking, independent of the method used to achieve abstinence. Reversible inactivation of LH with muscimol+baclofen (GABAa and GABAb receptor agonists) decreases context-induced relapse to alcohol seeking after either extinction or punishment (Marchant et al. 2009, 2014). A question for future research is whether this putative general role of LH in context-induced relapse generalizes to other addictive drugs.
Nucleus accumbens core
Results from several studies indicate a critical role of nucleus accumbens (NAc) core in context-induced relapse to alcohol seeking after either extinction or punishment. NAc core injections of muscimol+baclofen or the dopamine D1-family receptor antagonist SCH23390 decrease context-induced relapse to alcohol seeking after extinction, and NAc core SCH23390 injections decrease context-induced relapse after punishment (Chaudhri et al. 2008, 2010). It is currently unknown whether NAc core plays a similar role in context-induced relapse to cocaine seeking after punishment or extinction. Injections of muscimol+baclofen or glutamate receptor antagonists into NAc core decrease context-induced relapse to cocaine seeking after extinction (Fuchs et al. 2008; Xie et al. 2012), but see Cruz et al. (2014) for negative results using the Daun02 inactivation procedure in Fos-LacZ transgenic rats, in which Daun02 injections into discrete brain areas selectively inactivate Fos-expressing neurons activated by exposure to drug-associated cues and contexts (Koya et al. 2009). The functional role of NAc core in context-induced relapse to cocaine seeking after punishment-imposed abstinence has not been investigated.
Nucleus accumbens shell
NAc shell also appears to play a critical role in context-induced relapse of alcohol seeking after extinction or punishment. Injections of muscimol+baclofen or SCH23390 into NAc shell decrease context-induced relapse of alcohol seeking after extinction or punishment (Chaudhri et al. 2009). Context-induced relapse to alcohol seeking after extinction is also decreased by NAc shell injections of a mu-opioid receptor antagonist or the peptide cocaine-amphetamine-regulated transcript (CART) (Millan and McNally 2012; Perry and McNally 2013). Additionally, inhibition of the glutamatergic projection from ventral subiculum (vSub) to NAc shell decreases both context-induced relapse to alcohol seeking after punishment (Marchant et al. 2016) and context-induced relapse to heroin seeking after extinction. These studies used either a dual-virus approach to restrict expression of the inhibitory kappa opioid-receptor-based DREADD to vSub → NAc shell projection neurons or a pharmacological asymmetric disconnection procedure. In the disconnection procedure, neuronal activity of a given brain projection is inhibited either by injecting a drug that inhibits neuronal activity or by lesioning the cell body region in one hemisphere and the projection target in the other hemisphere (Gold 1966). Together, these results suggest that NAc shell activity is critical to context-induced relapse after either extinction or punishment across different drug classes.
However, NAc shell has also been implicated in the inhibition of extinguished alcohol and cocaine seeking (Peters et al. 2008). Specifically, while muscimol+baclofen inactivation of NAc shell or local injections of a glutamate AMPA receptor antagonist (NBQX) have no effect on context-induced relapse to alcohol seeking after extinction, these manipulations increase extinguished alcohol seeking in context B (Chaudhri et al. 2008). Additionally, muscimol+baclofen inactivation of NAc shell induces reinstatement of cocaine or alcohol seeking after extinction in the drug self-administration context (Millan et al. 2010; Peters et al. 2008).
Taken together, the studies reviewed suggest a complicated role of NAc shell in regulating context-induced relapse to cocaine and alcohol seeking after extinction and an unexpected dissociation between dopamine- and glutamate-mediated neurotransmission in this form of relapse. This dissociation, however, does not generalize to heroin, where inhibition of dopamine and glutamate transmission in NAc shell decreases context-induced relapse after extinction but has no effect on extinction responding in context B (Bossert et al. 2006, 2007). A recent study by Piantadosi et al. (2017) used food-trained rats to examine the role of NAc shell in mediating suppression of food seeking by punishment. They used a conflict design, where food reinforcement was first unpunished, and then punished, followed by another unpunished period. They found that NAc shell inactivation decreased food seeking during the unpunished periods, but increased food seeking during punishment. Given these findings, and in addition to the reinstatement-related findings reviewed above, we suspect that NAc shell activity will also play a complicated role in context-induced relapse after punishment that will depend on the neurotransmitter system and the neuropharmacological manipulation.
Hippocampus
Given the role of hippocampus in mediating context-dependent functions, it is perhaps unsurprising that there have been several demonstrations that it is critical for context-induced relapse. Muscimol+baclofen inactivation of the dorsal hippocampus (DH) has been shown to decrease context-induced reinstatement of extinguished cocaine seeking (Fuchs et al. 2005). Furthermore, using the pharmacological disconnection procedure, via asymmetrical, unilateral inactivation of DH and basolateral amygdala (BLA), Fuchs et al. (2007) showed that context-induced relapse of extinguished cocaine seeking is dependent on interaction between DH and BLA.
In the ventral hippocampus (VH), muscimol+baclofen inactivation also decreases context-induced relapse of extinguished cocaine seeking (Lasseter et al. 2010). This finding is consistent with that of Marchant and Bossert described above because vSub is a primary output region of VH (Groenewegen et al. 1987; Naber and Witter 1998). Muscimol+baclofen inactivation of vSub decreases context-induced relapse to heroin seeking after extinction and context-induced relapse to alcohol seeking after punishment (Bossert and Stern 2014; Marchant et al. 2016). Additionally, as mentioned above, inhibition of the vSub projection to NAc shell by either pharmacological disconnection or by chemogenetic-mediated projection-specific inhibition decreases context-induced relapse to heroin seeking after extinction and context-induced relapse to alcohol seeking after punishment (Marchant et al. 2016).
Basolateral and central amygdala
In a recent study, within the same experiments and using identical training parameters in context A, we directly tested the effect of muscimol+baclofen inactivation of basolateral and central amygdala (BLA and CeA) on context-induced relapse after either extinction or punishment in context B (Pelloux et al. 2018b). We found that either BLA or CeA inactivation decreases context-induced relapse of cocaine seeking in the classical extinction ABA renewal model. However, in rats that received punishment of cocaine self-administration in context B, BLA inactivation increases context-induced relapse in context A. We found no effect of CeA inactivation on context-induced relapse after punishment. Our manipulations in context B also revealed contrasting effects between rats trained for extinction versus punishment: either BLA or CeA inactivation provokes relapse in context B after punishment, but not after extinction. The results of this study demonstrate dissociable roles of the two amygdala subregions in context-induced relapse after extinction versus punishment.
The BLA results of our study for context-induced relapse to drug seeking after extinction are consistent with results from previous studies (Chaudhri et al. 2013; Fuchs et al. 2005; Marinelli et al. 2010; Stringfield et al. 2016). Together, these results demonstrate that the amygdala's role in context-induced relapse critically depends on the method used to achieve abstinence.

Conclusions and clinical implications
We reviewed the results from recent studies on context-induced relapse to alcohol and cocaine seeking after punishment-imposed abstinence and compared the results to those of previously published studies on context-induced relapse to drug seeking after extinction-imposed abstinence.
Fig. 1 Effect of site-specific neuropharmacological manipulations on relapse to cocaine or alcohol seeking in context A or B after punishment- or extinction-imposed abstinence. a Manipulations in rats tested for context-induced relapse after punishment. b Manipulations in rats tested for context-induced relapse after extinction. Code: dark gray, decreased drug seeking in context A; light gray, increased drug seeking in context B; white stars, increased drug seeking in context A; white, either no effect or not tested. Abbreviations: BLA, basolateral amygdala; CeA, central amygdala; NAc, nucleus accumbens; LH, lateral hypothalamus; vSub, ventral subiculum
As discussed above and outlined in Table 1 (Fos studies), and Table 2 and Fig. 1 (effect of site-specific neuropharmacological manipulations), there are both similarities and differences in the brain areas activated during the relapse tests after extinction versus punishment, and in the effect of different neuropharmacological manipulations on relapse in the two variations of the ABA renewal model. The LH, NAc core and shell, vSub, and the glutamatergic projection from vSub to NAc shell appear to be critical for context-induced relapse across drug classes, independent of the method used to achieve abstinence in context B. In contrast, the CeA plays a selective role in context-induced relapse to cocaine seeking after extinction but not punishment. Perhaps more significantly, the BLA appears to exert opposing control over cocaine seeking for context-induced relapse after extinction versus after punishment. A question for future research is whether the opposing roles of BLA generalize to other addictive drugs. Previous studies have shown notable differences in brain areas and projections that control context-induced relapse to heroin, cocaine, and alcohol seeking after extinction, as well as the seeking of these drugs in other relapse models (Badiani et al. 2011; Bossert et al. 2013).
Below we speculate about some implications of the rodent studies using the extinction- and punishment-based ABA renewal models for human drug relapse. We and others have previously proposed that the negative results from human "cue exposure" studies may be explained by the very reliable effect of exposure to the drug-associated context on drug seeking (reinstatement), even though the discrete cues were previously extinguished in a different context (Conklin 2006; Crombag et al. 2008). The general finding from these studies is that most human drug users relapse to drug use when they return to their home environment after successful extinction of the physiological and psychological responses to drug-associated discrete cues in the clinic (Carter and Tiffany 1999; Conklin and Tiffany 2002). In a similar manner, the results from our recent studies on context-induced relapse after punishment-imposed abstinence in a non-drug context are likely relevant to the high relapse rates in the home environment after periods of incarceration (Binswanger et al. 2013; Chandler et al. 2009; Dolan et al. 2005) or inpatient treatment (Hunt et al. 1971; Sinha 2011), where continued drug use typically results in adverse consequences like loss of privileges or verbal reprimand.
Finally, the finding that the role of the amygdala in context-induced relapse depends on the method used to achieve abstinence has implications for both animal models of relapse and human relapse. Regarding animal models, this finding highlights the importance of studying relapse under abstinence conditions that more closely mimic the human condition (Marchant et al. 2013b; Venniro et al. 2016). Regarding clinical implications, many human imaging studies have reported that exposure to drug-associated cues activates the amygdala (Grant et al. 1996; Jasinska et al. 2014). The results of our recent study (Pelloux et al. 2018b) imply that the role of human amygdala in drug craving and relapse will be significantly influenced by external environmental conditions, as well as the internal motivational factors that lead to abstinence in individual drug addicts. It would be of interest to determine whether variability in the motivation for abstinence in the clinical population can reliably predict any variability observed in either the neuronal responsiveness to drug-associated cues or the propensity for relapse.