In several instances it has been shown that an extinguished response can recover, indicating that extinction does not completely erase the first-learned information. One of the postextinction phenomena that supports this notion is renewal, the recovery of the extinguished response that occurs when testing takes place outside the extinction context (Bouton & Bolles, 1979; Bouton & King, 1983; Bouton & Ricker, 1994). In a typical renewal experiment, the subjects learn an association between a conditioned stimulus (CS) and an unconditioned stimulus (US) in Context A. In a second phase, which is conducted in Context B, the CS is no longer followed by the US, resulting in a decrease of the conditioned response to the CS (extinction). Finally, if the subjects are then tested again in the acquisition Context A, the originally learned behavior reappears. This procedure is called ABA renewal, with the letters denoting the contexts of acquisition, extinction, and test. Renewal has also been reported when acquisition, extinction, and testing take place in three different contexts (ABC renewal; Bouton & Bolles, 1979), and when acquisition and extinction take place in the same context and testing in a different one (AAB renewal; Bouton & Ricker, 1994).

The renewal effect suggests that extinction performance is more context-specific than initial acquisition. Several accounts have been proposed to explain this context dependency of extinction. For instance, the Rescorla–Wagner model (Rescorla & Wagner, 1972) accounts for renewal by assuming that the context of extinction acquires an inhibitory association with the US due to the nonreinforced presentations of the CS. This contextual inhibition is predicted to “protect” the CS from a complete loss of its excitatory associative strength (protection-from-extinction hypothesis; Lovibond, Davis, & O’Flaherty, 2000; Rescorla, 2003). If the inhibitory contribution of the extinction context is removed by a context change, responding to the CS recovers. There is evidence that under certain conditions an initially neutral context can acquire inhibitory strength during extinction (e.g., Polack, Laborda, & Miller, 2012, 2013); however, renewal has been reported to occur even when direct contextual inhibition was not detected (e.g., Harris, Jones, Bailey, & Westbrook, 2000). Moreover, the Rescorla–Wagner model can be applied to explain ABA and ABC renewal, but it is unable to deal with observations of AAB renewal.
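For reference, the learning rule behind this account can be written in its standard textbook form. The notation is the conventional one and is an assumption here, since the present text does not define symbols: $V_X$ is the associative strength of stimulus X, $\alpha_X$ its salience, $\beta$ a learning-rate parameter tied to the US, $\lambda$ the asymptote supported by the outcome (positive on reinforced trials, zero on nonreinforced trials), and the sum runs over all stimuli present on the trial, including the context.

```latex
\Delta V_X = \alpha_X \, \beta \, \Bigl( \lambda - \sum_{i \in \text{present}} V_i \Bigr)
```

On a nonreinforced CS trial, $\lambda = 0$, so the CS and the context both change by a negative amount whenever their summed strength is positive. A context that starts at zero is therefore driven below zero, which is the contextual inhibition that the protection-from-extinction hypothesis relies on.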

According to Bouton’s retrieval model (e.g., Bouton, 1993, 1994; see also Rosas, Callejas Aguilera, Ramos Álvarez, & Fernández Abad, 2006), contextual stimuli modulate the retrieval of different memories related to the same CS. The model assumes that extinction establishes a second, inhibitory association between the CS and the US that counteracts the previously acquired excitatory connection. Whereas retrieval of the first-learned association proceeds independently of the context, activation of the second-learned association requires the presence of the context of extinction. Bouton’s retrieval model is able to account for ABA, ABC, and AAB renewal, but predicts that all three renewal types should cause the same level of response recovery. According to the evidence, however, AAB renewal typically shows smaller levels of recovery than either ABA or ABC renewal (Thomas, Larsen, & Ayres, 2003), and sometimes it is not observed at all (Üngör & Lachnit, 2008).

Experimental extinction was the basis for the development of exposure therapy (Bouton, 2000; Bouton, Woods, Moody, Sunsay, & García-Gutiérrez, 2006), and the renewal effect provides a model for relapse, which is common in exposure-based treatments (Craske, 1999). In exposure therapy, a patient is confronted with a fear-eliciting stimulus in order to decrease the response to it. The renewal effect indicates that the therapeutic success in overcoming fears will to a certain degree be linked to the therapeutic environment, so that when a patient leaves the treatment context, relapse is likely to occur.

Due to this vulnerability of extinguished behavior to relapse, research has been dedicated to finding treatments able to prevent response recovery (for a review, see Laborda, McConnell, & Miller, 2011). As one possibility, Bouton (1991) suggested conducting extinction in several contexts rather than in a single one, to enhance the generalization of extinction to other contexts. The effectiveness of conducting extinction in multiple contexts has been examined in a variety of preparations and species. It has been shown to attenuate renewal in rats using fear conditioning (Gunther, Denniston, & Miller, 1998; Thomas, Vurbic, & Novak, 2009; Laborda & Miller, 2013) and conditioned taste aversion (Chelonis, Calton, Hart, & Schachtman, 1999). With human subjects, it has been examined in predictive learning (Neumann, 2006; Glautier, Elgueta, & Nelson, 2013), fear of spiders (Vansteenwegen et al., 2007), and fear conditioning (Bandarian Balooch & Neumann, 2011).

The aim of the present study was to examine the mechanisms underlying the effectiveness of extinction in multiple contexts to reduce response recovery. One explanation is offered by the Rescorla–Wagner (1972) model. When extinction is conducted in multiple contexts, each of the contexts acquires less inhibitory strength than the contextual inhibition that develops during extinction in a single context, which leads to a greater loss of excitatory strength of the CS. A related prediction of this explanation is that responding to a CS should decrease more slowly when extinction is conducted in multiple contexts rather than a single one (for empirical support in rats, see, e.g., Thomas et al., 2009).
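The reasoning above can be illustrated with a minimal sketch of the Rescorla–Wagner rule. It is deliberately simplified and is not the simulation reported in the Appendix: it omits filler cues, treats each context as a single stimulus, cycles the extinction contexts in a fixed order, and uses assumed salience values (0.3 for the cue, 0.1 for each context, with the US learning rate folded into them). The function names `rw_trial` and `simulate` are hypothetical.

```python
def rw_trial(V, present, lam, alpha):
    """Apply one Rescorla-Wagner update to every stimulus present on the trial.

    Returns the summed associative strength before the update, which serves
    as the modelled level of responding on that trial.
    """
    total = sum(V[s] for s in present)
    for s in present:
        V[s] += alpha[s] * (lam - total)  # the US learning rate is folded into alpha
    return total


def simulate(extinction_contexts, n_trials=12):
    """A+ in Context C1, then A- cycled through the given extinction contexts."""
    V = {s: 0.0 for s in ("A", "C1", "C2", "C3", "C4")}
    # Assumed saliences: the cue is more salient than any context.
    alpha = {"A": 0.3, "C1": 0.1, "C2": 0.1, "C3": 0.1, "C4": 0.1}
    for _ in range(n_trials):
        rw_trial(V, ("A", "C1"), 1.0, alpha)           # acquisition trial
    responding = []
    for t in range(n_trials):
        ctx = extinction_contexts[t % len(extinction_contexts)]
        responding.append(rw_trial(V, ("A", ctx), 0.0, alpha))  # extinction trial
    return V, responding


V_single, resp_single = simulate(["C2"])
V_multi, resp_multi = simulate(["C2", "C3", "C4"])

print("V(A) after single-context extinction:  ", round(V_single["A"], 3))
print("V(A) after multiple-context extinction:", round(V_multi["A"], 3))
print("Mean responding, single:  ", round(sum(resp_single) / len(resp_single), 3))
print("Mean responding, multiple:", round(sum(resp_multi) / len(resp_multi), 3))
```

Under these assumed values, the cue retains more excitatory strength after single-context extinction (the single context absorbs the inhibition), whereas mean responding across the extinction trials is higher in the multiple-context condition. Both patterns match the direction of the predictions described above; the exact numbers depend on the assumed parameters.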

Thus, the Rescorla–Wagner model predicts a strong relation between the associative properties of contexts and the rate of extinction conducted in these contexts. In the field of human predictive learning, this relationship has been investigated in two experiments by Glautier, Elgueta, and Nelson (2013). In both experiments, the authors observed reduced ABC renewal following extinction in multiple contexts as compared to extinction in a single one. Although in their first experiment they found no evidence for a difference in the rates of extinction conducted in multiple contexts and in a single context, a summation test following the extinction treatment revealed a trend for stronger contextual inhibition in the single-context than in the multiple-contexts condition. However, their second experiment yielded an opposite pattern of results: Although extinction was found to be more rapid when it was conducted in a single rather than in multiple contexts, the second experiment revealed no evidence for a difference in contextual inhibition between the single-context and multiple-contexts conditions. Thus, in contrast to the predictions of the Rescorla–Wagner (1972) model, the results from the two experiments reported by Glautier et al. showed that the associative properties of the contexts and the rate of extinction can be rather unrelated. One aim of our experiments was to further evaluate this prediction of the Rescorla–Wagner model using a different approach. In each of the present experiments we compared the rates of extinction in a single context and in multiple contexts, but in each phase we also included filler cues, with the purpose of manipulating the learning histories of the contexts in such a way that the Rescorla–Wagner model would predict faster rather than slower extinction in multiple contexts (for details, see the Appendix).

In Bouton’s (1993, 1994) retrieval model, contextual control of behavior is not a function of the associative properties of the contexts. When extinction is conducted in multiple contexts, each context switch during extinction might have some potential to cause a return of the conditioned responding, which would lead to a higher level of performance than with extinction in a single context. This effect should occur independently of the associative histories of the contexts.

Within the framework of Bouton’s retrieval model, the effectiveness of extinction in multiple contexts to reduce response recovery can be explained by assuming that the inclusion of more extinction contexts increases the number of contextual features related to extinction. This would, in turn, increase the probability that other contexts could share common features with the extinction contexts, which would facilitate the generalization of extinction across contexts. Because the context of initial learning is not encoded, the model predicts that extinction in multiple contexts should facilitate the generalization of extinction, regardless of whether testing is conducted in the acquisition context or in a novel one. In Experiment 2 we examined this prediction by directly comparing the effects of extinction in multiple contexts on ABA and ABC renewal. To our knowledge, only one study has examined this in human learning within one experiment. Using a human conditioned suppression task, Neumann (2006, Exp. 3) reported that extinction in multiple contexts completely abolished both ABA and ABC renewal. In addition, the author reported evidence for stronger ABA than ABC renewal following extinction in a single context. Although the latter finding is more in accordance with the predictions of the Rescorla–Wagner (1972) model, the study by Neumann was not aimed at assessing possible contributions of a protection-from-extinction mechanism for the observed effectiveness of extinction in multiple contexts.

In both of the present experiments we used a predictive learning scenario that asked participants to imagine being a medical doctor whose patient often suffers from stomach troubles after the consumption of different meals in different restaurants. The task was to predict the occurrence (+) or nonoccurrence (–) of this stomach trouble. On successive trials, different cues (food types) were presented in one of several contexts (restaurants), and participants were asked to predict the patient’s reaction. During the learning phases of each experiment (acquisition and extinction), participants received feedback about the outcome of each trial.

Experiment 1

Table 1 illustrates the design for the two groups of the experiment. During acquisition, all participants received training with a target cue A+ in Context 1. During extinction, half of the participants received extinction training with A– in Context 2 (Group Single), and the other half were presented with A– in Contexts 2, 3, and 4 (Group Multiple). Additionally, the training schedule in each group included filler trials, with F2+ in Context 2 during acquisition and F6+ in Context 2 during extinction.

Table 1 Design of Experiment 1

According to the predictions of the Rescorla–Wagner (1972) model, the training of the excitatory filler cues should prevent Context 2 from acquiring inhibitory strength during the extinction phase. Instead, and as was shown by the simulations conducted using the Associative Learning Theories Simulator (ALTSim; Thorwart, Schultheis, König, & Lachnit, 2009), Context 2 should become excitatory during the acquisition and extinction phases, and no protection from extinction should occur during extinction in a single context. In contrast to the prediction described above, responding to the CS during the extinction phase should decrease more slowly in a single context than in multiple contexts (see the Appendix for details about the simulations).
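The first step of this argument, that reinforced filler trials make Context 2 excitatory, can be checked with a small sketch of the Rescorla–Wagner rule. This is not the ALTSim simulation described in the Appendix: it covers only the Context 2 filler trials of the acquisition phase (F2+ and F3–), alternates them in a fixed order, and assumes a single learning rate of 0.2 for cues and context alike. The function name `rw_trial` is hypothetical.

```python
def rw_trial(V, present, lam, alpha=0.2):
    """One Rescorla-Wagner update for all stimuli present; alpha is an assumed value."""
    total = sum(V[s] for s in present)
    for s in present:
        V[s] += alpha * (lam - total)
    return total


V = {"C2": 0.0, "F2": 0.0, "F3": 0.0}
context_history = []
for _ in range(12):                     # 12 trials of each type, as in acquisition
    rw_trial(V, ("F2", "C2"), 1.0)      # F2+ in Context 2
    rw_trial(V, ("F3", "C2"), 0.0)      # F3- in Context 2
    context_history.append(V["C2"])     # strength of Context 2 after each pair

print("V(C2) after acquisition:", round(V["C2"], 3))
print("V(F3) after acquisition:", round(V["F3"], 3))
```

In this sketch Context 2 holds positive strength throughout, while the nonreinforced filler F3 becomes inhibitory. A context that is already excitatory when A– trials begin adds to the summed prediction instead of subtracting from it, so it cannot shield the target cue in the way the protection-from-extinction account requires.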

Method

Participants

The participants were 60 students from the Philipps-Universität Marburg, Germany (41 women and 19 men). Their ages varied between 18 and 33 years, with a median of 22. They either were paid (€1.50 [USD $2]), were rewarded with chocolate for participation, or received course credits. Participants were equally allocated to the different experimental groups as they arrived in the experimental room. They were tested individually and required between 10 and 15 min to complete the experiment. Participants were not authorized to use any additional material or to take notes during the experiment, and they were instructed to deactivate or silence their mobile phones. The data of nine additional participants, three from Group Multiple and six from Group Single, were excluded from the analyses because their predictions were incorrect on more than 30 % of all the trials presented during the last two blocks of acquisition and/or during the last two blocks of extinction.
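The exclusion criterion can be expressed as a short check. The sketch below is one reading of the rule, namely that the 30 % threshold is applied separately to the last two blocks of each phase and that exceeding it in either phase leads to exclusion. The data layout (a list of blocks, each a list of `(predicted, actual)` pairs) and the function names are hypothetical.

```python
def error_rate(trials):
    """Proportion of incorrect predictions; each trial is a (predicted, actual) pair."""
    wrong = sum(1 for predicted, actual in trials if predicted != actual)
    return wrong / len(trials)


def is_excluded(acq_blocks, ext_blocks, threshold=0.30):
    """True if errors exceed the threshold in the last two blocks of either phase."""
    def last_two(blocks):
        return [trial for block in blocks[-2:] for trial in block]

    return (error_rate(last_two(acq_blocks)) > threshold
            or error_rate(last_two(ext_blocks)) > threshold)


# Illustrative data: a block with no errors and a block with 3 errors out of 8.
good_block = [(True, True)] * 8
bad_block = [(True, False)] * 3 + [(True, True)] * 5

print(is_excluded([good_block] * 6, [good_block] * 4))
print(is_excluded([good_block] * 4 + [bad_block] * 2, [good_block] * 4))
```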

Apparatus

The instructions and all necessary information were presented on a notebook screen (Lenovo Thinkpad W500, screen size of 15 in. with a resolution of 1440 × 900 pixels). The experiment was programmed using the Microsoft Visual Studio 2010 Visual Basic language. Participants interacted with the computer using only the mouse. The following food types were used as cues: apples, avocados, bananas, strawberries, carrots, oranges, tomatoes, grapes, and lemons. All food types were presented as JPG images with a resolution of 300 × 300 pixels. The names of five fictitious restaurants were used as contexts, labeled (translated from the German) “To the Mug,” “At the Cathedral,” “By the Innkeeper,” “In the Kettle,” and “From the Best,” written in red, blue, yellow, green, and white font, respectively. The assignments of the different food types to Cue A and Filler Cues F1–F8, as well as the assignments of the five restaurant names to the four contexts, were randomized for each participant. The two different outcomes were the occurrence (+) or nonoccurrence (–) of stomach troubles.

Procedure

Each participant was asked initially to read the following instructions (in German) on the screen:

This study is concerned with the question of how people learn about relationships between different events. Imagine that you are a medical doctor and that one of your patients often suffers from stomach troubles after meals. Your task is to discover what causes the stomach troubles your patient is suffering from.

Your patient likes to go out for meals. To the Mug, At the Cathedral, By the Innkeeper, In the Kettle, and From the Best are your patient’s favorite restaurants. You will be told which restaurant your patient visited each day and which food he ate there. Please look carefully at the foods and the respective restaurants. Thereafter you will be asked to predict whether the patient suffers from stomach troubles. For this prediction, please click on the appropriate prediction button. After you have made your prediction, you will be informed whether your patient actually suffered from stomach troubles. Use this feedback to find out what causes the stomach troubles your patient is suffering from. Obviously, at first you will have to guess because you don’t know anything about your patient. But eventually you will learn which causes lead to stomach troubles in this patient and you will be able to make correct predictions.

For all your answers, accuracy instead of speed is essential. Please do not take any notes during the experiment. If you have any more questions, please ask now. If you don’t have any questions, please start the experiment by clicking on the “Next” button.

When a participant asked a question, it was answered by the experimenter by rephrasing the appropriate part of the instructions. After the participant clicked on the “Next” button, the learning phases started.

On each learning trial, the name of one of the restaurants appeared on top of the display surrounded by a rectangular frame of the same color as the restaurant’s name. Within the frame, a picture of one food type was shown at the center of the screen. Below that picture the name of the food was written. Participants were told that their patient had eaten the food at the restaurant. They were also instructed to make a prediction whether they expected that their patient would suffer from stomach troubles. Participants made their predictions by clicking on one of two answer buttons, labeled “Yes, I expect stomach trouble” and “No, I do not expect stomach trouble.” Immediately after participants had responded, another window appeared, telling the participants whether their patient had suffered from stomach troubles. Participants had to confirm that they had read the feedback by clicking on an “OK” button. Thereafter, the next trial started.

Acquisition

During the acquisition phase (see Table 1), all participants were given 12 trials each of A+ and F1– in Context 1, and 12 trials each of F2+ and F3– in Context 2.

Extinction

In the extinction phase, half of the participants (Group Multiple) received 12 trials of F6+ and four trials each of F7– and F8– in Context 2, together with 12 trials of A– distributed equally across Contexts 2, 3, and 4—that is, four trials in each context. The other half of the participants (Group Single) were given 12 trials each of A– and F6+ in Context 2, and four trials each of F7– and F8– in Contexts 3 and 4, respectively. Furthermore, all participants were trained during the extinction phase with 12 trials each of F4+ and F5– in Context 1. Extinction followed acquisition without a break (the transition was not signaled to the participants).

For all participants, the acquisition phase was divided into six blocks, whereas the extinction phase was divided into four blocks. Each block in acquisition consisted of two presentations of each cue, and each block in extinction comprised three presentations of each cue, except for F7– and F8–, which were presented once in each block. Thus, each block in extinction in Group Single comprised three trials with A– in Context 2, whereas each block in Group Multiple comprised one A– trial each in Contexts 2, 3, and 4. The order of presentation of the trials within each block was determined randomly for each block and participant.
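The block structure described above can be sketched as a schedule generator. This is an illustration and not the original Visual Basic program: the tuple layout, the group labels `"single"` and `"multiple"`, and the function name `build_schedule` are hypothetical, and the outcome is coded as `True` for stomach trouble (+) and `False` for no stomach trouble (–).

```python
import random


def build_schedule(group, rng):
    """Return (phase, block, cue, context, outcome) tuples for one participant."""
    schedule = []

    # Acquisition: six blocks, two presentations of each trial type per block.
    acq_types = [("A", 1, True), ("F1", 1, False), ("F2", 2, True), ("F3", 2, False)]
    for block in range(1, 7):
        trials = acq_types * 2
        rng.shuffle(trials)                      # random order within each block
        schedule += [("acquisition", block, *t) for t in trials]

    # Extinction: four blocks; fillers F4+, F5-, F6+ appear three times per block.
    for block in range(1, 5):
        trials = [("F4", 1, True), ("F5", 1, False), ("F6", 2, True)] * 3
        if group == "single":
            trials += [("A", 2, False)] * 3                  # A- only in Context 2
            trials += [("F7", 3, False), ("F8", 4, False)]   # fillers in Contexts 3, 4
        else:
            trials += [("A", c, False) for c in (2, 3, 4)]   # one A- per context
            trials += [("F7", 2, False), ("F8", 2, False)]   # fillers in Context 2
        rng.shuffle(trials)
        schedule += [("extinction", block, *t) for t in trials]

    return schedule


schedule = build_schedule("multiple", random.Random(0))
print(len(schedule), "trials in total")
```

Each acquisition block then holds 8 trials and each extinction block 14, so a participant completes 48 acquisition and 56 extinction trials, with 12 A– trials in either group.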

Results and discussion

For this and the subsequent experiment, the .05 level of significance was employed for all statistical tests, and the stated probability levels are based on the Greenhouse–Geisser (1959) adjustment of degrees of freedom where appropriate (for the sake of readability, we report uncorrected degrees of freedom).

Acquisition

The left-hand panel of Fig. 1 presents the mean percentages of stomach trouble predictions for A+ in Context 1 across the six blocks of the acquisition phase for each group. White squares represent the data from Group Single, and black squares the data from Group Multiple. As can be seen in the figure, the mean prediction to A+ increased across the blocks, and there were no differences in responding to A+ between groups. This was confirmed by a 6 × 2 (Block [1, 2, 3, 4, 5, 6] × Group [Single, Multiple]) analysis of variance (ANOVA). A main effect of block was found, F(5, 290) = 50.07, p < .001, indicating an increase of stomach trouble predictions to A+ over the course of acquisition training, but neither an effect of group, F < 1, nor a Block × Group interaction, F(5, 290) = 1.74, p = .160, was detected, showing that there was no difference in prediction levels between the groups.

Fig. 1

The left-hand panel shows the mean proportions of predictions of stomach troubles in response to A in Context 1 across the six blocks in the acquisition phase of Experiment 1, separately for Group Multiple (black squares) and Group Single (white squares). The right-hand panel shows the mean proportions of predictions of stomach troubles in response to A in Context 2, for Group Single, and in Contexts 2, 3, and 4, for Group Multiple, across the four blocks in the extinction phase of Experiment 1. Error bars denote standard errors of the means

Extinction

The right-hand panel of Fig. 1 presents the mean percentages of stomach trouble predictions for A– in Context 2 in Group Single and for A– in Contexts 2, 3, and 4 in Group Multiple, across the four blocks of the extinction phase. As is shown in the figure, the means of stomach trouble predictions decreased across the blocks, confirming that the response to A was extinguished. The figure also shows a higher level of responding in Group Multiple than in Group Single across the extinction blocks; that is, extinction was slower when conducted in three contexts rather than one context. A 4 × 2 (Block [1, 2, 3, 4] × Group [Single, Multiple]) ANOVA showed a significant main effect of block, F(3, 174) = 50.86, p < .001, as well as an effect of group, F(1, 58) = 5.10, p = .028, indicating that the number of stomach trouble predictions was higher in Group Multiple than in Group Single. No Block × Group interaction was detected, F(3, 174) = 1.07, p = .35.

The results of the present experiment showed that conducting extinction in multiple contexts caused a higher level of responding during extinction than conducting extinction in a single context. The Rescorla–Wagner (1972) model is unable to deal with this finding if the specific parameters of the experimental design are taken into account. Due to the training of the filler cues, the model would predict that Context 2 should acquire excitatory strength during the acquisition and extinction phases. In that case, no contextual inhibition would be present to protect the target cue from extinction when this was conducted in a single context, and the rate of extinction in this condition should have been slower than in the condition in which extinction was conducted in multiple contexts.

The results of Experiment 1 are consistent with Bouton’s retrieval theory (Bouton, 1993, 1994). Each time the context switched within the extinction phase, responding to the target cue could recover, which would slow down extinction as compared to extinction in a single context. This prediction remains unaffected by the associative properties of the contexts, since contextual control in the model is achieved by a hierarchical mechanism rather than by the direct associative strengths acquired by the contexts.

According to the retrieval model proposed by Bouton (1993, 1994), extinction in multiple contexts enhances the generalization of extinction learning across contexts by increasing the number of contextual features that are associated with extinction. It follows from this generalization hypothesis that extinction in multiple contexts should decrease ABA and ABC renewal in the same manner, because contextual stimuli are not encoded in the model until the CS becomes ambiguous during extinction. The aim of the following experiment was to test this prediction.

Experiment 2

Table 2 illustrates the design for the four groups of this experiment. The first two phases of Experiment 2 were identical to those from Experiment 1. Thus, following acquisition training with a target cue A in Context 1, half of the participants received extinction of the target cue in Context 2, whereas the other half received extinction in Contexts 2, 3, and 4. During a final test phase, half of the participants who received extinction in a single context and half of the participants who received extinction in multiple contexts were presented with A in Context 1 as well as in Context 2 (Group SingleABA and Group MultipleABA, respectively). The other half of each extinction condition was tested with A in Contexts 5 and 2 (Group SingleABC and Group MultipleABC, respectively).

Table 2 Design of Experiment 2

According to the retrieval model proposed by Bouton (1993, 1994; see also Bouton et al., 2006), response recovery during the test phase should be stronger in the two groups with extinction in a single context than in the two groups with extinction in multiple contexts. Moreover, the reduction in renewal due to extinction in multiple contexts should be the same in the ABA and ABC conditions.

Method

Participants, apparatus, and procedure

The participants were 120 students from the Philipps-Universität Marburg, Germany (79 women and 41 men). Their ages varied between 17 and 30 years, with a median of 22. Participants were equally allocated to the four experimental groups as they arrived in the experimental room. The data of 31 additional participants—nine from Group SingleABA, seven from Group MultipleABA, seven from SingleABC, and eight from MultipleABC—were excluded from the analyses because their predictions were incorrect on more than 30 % of all the trials presented during the last two blocks in the acquisition phase and/or during the last two blocks of the extinction phase.

The instructions, stimuli, and procedure were the same as those used in Experiment 1, unless stated otherwise. For each participant, the five restaurant names “To the Mug,” “At the Cathedral,” “By the Innkeeper,” “In the Kettle,” and “From the Best” were randomly assigned to Contexts 1 to 5.

Test

After participants had completed the extinction phase, they received a test phase that was introduced by the following instructions: “Now the feedback of whether your patient actually suffers from stomach trouble will be omitted. Nevertheless, please try to predict the occurrence or nonoccurrence of stomach trouble as accurately as possible.” The test trials were identical to the learning trials, with the exception that the feedback window was omitted. Half of the participants who received extinction in a single context and the other half who received extinction in multiple contexts were presented with A trials in Contexts 1 and 2 (Group SingleABA and Group MultipleABA, respectively). The other half with extinction in a single context and the other half with extinction in multiple contexts were presented with A trials in Context 2 and in Context 5 (Group SingleABC and Group MultipleABC, respectively). Each trial type was presented on four occasions. This phase was divided into two blocks, and within each block each trial type was presented two times. The order of presentation of the trials within each block was determined randomly.

Results and discussion

Acquisition

The left-hand panel of Fig. 2 presents the mean percentages of stomach trouble predictions for A+ in Context 1 across the six blocks of acquisition for each group. Squares represent the data from groups SingleABA (white) and MultipleABA (black), and triangles the data from groups SingleABC (white) and MultipleABC (black). As can be seen in the figure, the mean predictions to A+ increased across the blocks, and there were no differences in responding to A+ between groups. This was confirmed by a 6 × 4 (Block [1, 2, 3, 4, 5, 6] × Group [SingleABA, MultipleABA, SingleABC, MultipleABC]) ANOVA. A main effect of block was found, F(5, 580) = 71.48, p < .001, indicating an increase of stomach trouble predictions to A+ over the course of acquisition training, but we found neither an effect of group, F < 1, nor a Block × Group interaction, F(15, 580) = 1.31, p = .23, showing no difference in prediction levels between groups.

Fig. 2

The left-hand panel shows the mean proportions of predictions of stomach troubles in response to A in Context 1 across the six blocks in the acquisition phase of Experiment 2, separately for Groups MultipleABA (black squares), SingleABA (white squares), MultipleABC (black triangles), and SingleABC (white triangles). The right-hand panel shows the mean proportions of predictions of stomach troubles in response to A in Context 2, for Groups SingleABA and SingleABC, and in Contexts 2, 3, and 4, for Groups MultipleABA and MultipleABC, across the four blocks in the extinction phase of Experiment 2. Error bars denote standard errors of the means

Extinction

The right-hand panel of Fig. 2 presents the mean percentages of stomach trouble predictions for A– in Context 2 for Groups SingleABA and SingleABC, and in Contexts 2, 3, and 4 for Groups MultipleABA and MultipleABC, across the blocks of the extinction phase. As is shown in the figure, the mean of the stomach trouble predictions decreased across the blocks for each of the four groups. The figure also shows that the levels of responding during extinction were higher in groups MultipleABA and MultipleABC than in groups SingleABA and SingleABC. A 2 × 2 × 4 (Renewal Type [ABA, ABC] × Extinction Treatment [Single, Multiple] × Block [1, 2, 3, 4]) ANOVA supported this conclusion. A main effect of block was detected, F(3, 348) = 136.33, p < .001, as well as a main effect of extinction treatment, F(1, 116) = 4.07, p = .046, showing that the number of stomach trouble predictions was higher during extinction in multiple contexts than during extinction in one context. All remaining main effects and interactions were not significant, all Fs < 2.17, all ps > .10.

Test

Figure 3 depicts responding to A during the test phase in terms of the mean proportions of stomach trouble predictions, collapsed across the four test trials presented in each context. The left-hand bars present the predictions for groups MultipleABA and SingleABA in Contexts 1 and 2, and the right-hand bars show the predictions for groups MultipleABC and SingleABC in Contexts 5 and 2.

Fig. 3

Mean proportions of predictions of stomach troubles in response to A during the test phase of Experiment 2, collapsed across the four presentations within the same context. The left-hand bars present the predictions for Groups MultipleABA and SingleABA in Contexts 1 and 2, and the right-hand bars show the predictions for Groups MultipleABC and SingleABC in Contexts 5 and 2. Error bars denote standard errors of the means

As the figure demonstrates, the participants in Groups SingleABA and MultipleABA showed a higher level of responding to A in Context 1 than in Context 2 (ABA renewal), whereas the participants in groups MultipleABC and SingleABC differed in their response patterns. The participants in Group SingleABC showed a higher level of responding in Context 5 than in Context 2 (ABC renewal), whereas the participants in Group MultipleABC showed similar levels of performance across the contexts. A 2 × 2 × 2 (Context [test, extinction] × Renewal Type [ABA, ABC] × Extinction Treatment [single, multiple]) ANOVA showed a main effect of context, F(1, 116) = 24.44, p < .001; a main effect of renewal type, F(1, 116) = 8.99, p = .003; and a Context × Extinction Treatment interaction, F(1, 116) = 4.73, p = .032. Most importantly, the ANOVA also revealed a Context × Renewal Type × Extinction Treatment interaction, F(1, 116) = 7.45, p = .007, indicating that the effectiveness of extinction in multiple contexts on context dependency was modulated by the type of renewal. The main effect of extinction treatment and the remaining interactions failed to reach significance, all Fs < 3.76, all ps > .06.

To decompose the Context × Renewal Type × Extinction Treatment interaction, we conducted a 2 × 2 (Context [test, extinction] × Group [Single, Multiple]) ANOVA for each renewal condition. For groups MultipleABA and SingleABA, the analysis revealed a main effect of context, F(1, 58) = 20.1, p < .001, indicating that responding to A was stronger in Context 1 than in Context 2. We found no main effect of group, F(1, 58) = 1.73, p = .19, and no Context × Group interaction, F < 1, indicating no detectable difference in the strength of renewal between the two groups.

For groups MultipleABC and SingleABC, the analysis yielded a main effect of context, F(1, 58) = 5.50, p = .02, and a Context × Group interaction, F(1, 58) = 14.64, p < .001, showing that the context dependency of responding was stronger in Group SingleABC than in Group MultipleABC. No main effect of group was detected, F < 1. Two paired-samples t tests were conducted to explore the Context × Group interaction. Whereas the participants in Group SingleABC responded more strongly in Context 5 than in Context 2, t(29) = 4.27, p < .001, the participants in Group MultipleABC showed the same levels of responding across the contexts, t(29) = 1.07, p = .29.

As in Experiment 1, we observed that extinction in three contexts resulted in a higher response level during the extinction phase than did extinction in a single context. Furthermore, we observed that extinction in multiple contexts reduced response recovery when the test was conducted in a new, neutral context (ABC renewal), but had no detectable impact when the test took place in the acquisition context (ABA renewal). This dissociation between ABA and ABC renewal is inconsistent with the predictions from Bouton’s (1993, 1994) retrieval model. That theory assumes that ABA and ABC renewal are caused by the same mechanism, and therefore that extinction in multiple contexts should affect both renewal types equally, which was not the case.

General discussion

In two human predictive-learning experiments, we investigated the effects of conducting extinction in multiple contexts on both extinction rate and renewal. In each experiment, we found that extinction proceeded more slowly when it was conducted in three contexts than when it was conducted in one context. Moreover, Experiment 2 showed that extinction in multiple contexts prevented response recovery in a new, neutral context (ABC renewal) but had no detectable impact on recovery in the original acquisition context (ABA renewal).

For each of the present experiments, the Rescorla–Wagner (1972) model predicted that extinction in multiple contexts should not have been slower than extinction in one context. According to the simulations, Context 2 (extinction context), in which excitatory filler cues were presented, should have become excitatory during the acquisition and extinction phases. Thus, no protection from extinction should have occurred in the condition with a single extinction context, and the rate of extinction in this condition should have been slower than in the condition in which extinction was conducted in multiple contexts.
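
For reference, the learning rule behind these predictions is usually written as shown below. The notation is the conventional one for the Rescorla–Wagner model and is not defined in this section: V_X is the associative strength of cue X, α_X and β are learning-rate parameters for the cue and the outcome, λ is the asymptote supported by the outcome (zero on nonreinforced trials), and the sum runs over all cues present on the trial, including the context.

```latex
\Delta V_X = \alpha_X \,\beta \,\Bigl(\lambda - \sum_{i \in \text{trial}} V_i\Bigr)
```

Because the context enters the summed prediction, its associative strength affects both the predicted response to A on a given trial and the size of the error term that drives changes in A on nonreinforced trials.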

We confirmed this prediction in simulations of the Rescorla–Wagner model with several parameter variations. These simulations showed no difference between the different parameter sets, establishing that the predictions provided by the Rescorla–Wagner model were reliable and did not depend on the assigned parameters (for details, see the Appendix).
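
To make the logic of such simulations concrete, the following is a minimal sketch of a Rescorla–Wagner simulation in which contexts are treated as ordinary cues. It is not the simulation reported in the Appendix. The trial sequences, the single filler cue F, the trial counts, and the learning rate of 0.2 are all hypothetical simplifications chosen for illustration.

```python
from collections import defaultdict

def rw_trial(V, cues, lam, alpha=0.2, beta=1.0):
    """Apply one Rescorla-Wagner update in place; return the pre-update prediction."""
    total = sum(V[c] for c in cues)           # summed strength of all cues present
    delta = alpha * beta * (lam - total)      # shared error-driven change
    for c in cues:
        V[c] += delta
    return total

def run(trials, V=None, alpha=0.2, beta=1.0):
    """Run a list of (cues, lambda) trials; return strengths and per-trial predictions."""
    V = defaultdict(float) if V is None else V
    responses = [rw_trial(V, cues, lam, alpha, beta) for cues, lam in trials]
    return V, responses

# Hypothetical acquisition: A+ in Context 1, interleaved with filler F+ in Context 2.
acquisition = [(("A", "ctx1"), 1.0), (("F", "ctx2"), 1.0)] * 20

# Hypothetical extinction: 12 A- trials in one context or spread over three contexts.
single = [(("A", "ctx2"), 0.0)] * 12
multiple = [(("A", ctx), 0.0) for ctx in ("ctx2", "ctx3", "ctx4")] * 4

V0, _ = run(acquisition)
V_single, resp_single = run(single, V=defaultdict(float, V0))
V_multi, resp_multi = run(multiple, V=defaultdict(float, V0))

print("ctx2 strength before extinction:", round(V0["ctx2"], 3))
print("mean predicted response, single:", round(sum(resp_single) / len(resp_single), 3))
print("mean predicted response, multiple:", round(sum(resp_multi) / len(resp_multi), 3))
```

In this toy design the filler trials leave Context 2 excitatory before extinction begins, which is the situation described above. The predicted response on each extinction trial is the sum of the strengths of A and the current context. Rerunning the sketch with other values of `alpha` is one simple way to check whether the ordering of the two conditions depends on the parameters.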

Note that the reported results of the groups with extinction in multiple contexts, and likewise their simulations, represent the average responding to A in Contexts 2, 3, and 4, whereas the results of the groups with extinction in a single context consist of the average response to A only in Context 2. For the present experimental design, the Rescorla–Wagner model predicts different associative strengths for the contexts due to the different treatments they received. In particular, Context 2 should become more excitatory than Contexts 3 and 4. However, our design did not allow for testing these different associative strengths, and further experimentation will be necessary to examine whether the learning history of the contexts affects their associative strengths, as is predicted by the Rescorla–Wagner model.

Our finding that extinction proceeds faster when it is conducted in a single context rather than in multiple contexts is consistent with Bouton’s (1993) retrieval account. According to the theory, each context switch during extinction might have some potential to cause a return of responding, which would lead to a higher level of performance than would extinction in a single context. However, the retrieval model is unable to deal with our findings from Experiment 2, that extinction in multiple contexts prevented ABC renewal but did not affect the recovery levels in ABA renewal. The model assumes that contextual stimuli are not encoded until a CS undergoes extinction. For this reason, the theory is unable to anticipate dissociations between the different types of renewal.

Our findings regarding the extinction rate are consistent with previous studies (e.g., Bouton et al., 2006a, b; Glautier et al., 2013; Thomas et al., 2009). The present experiments extend these studies by demonstrating that extinction proceeds more slowly in multiple contexts, even if there is no basis for contextual inhibition. However, there is also evidence that extinction in multiple contexts does not necessarily have an effect on the extinction rate (e.g., Glautier et al., 2013; Neumann, 2006). The reasons for this difference are not clear. Potentially, the number of contexts and the number of context changes might be crucial factors.

Our finding that extinction in multiple contexts reduced ABC renewal is consistent with previous evidence (e.g., Bouton et al., 2006a, b; Glautier et al., 2013; Gunther et al., 1998; Neumann, 2006; Vansteenwegen et al., 2007; but see Bouton et al., 2006a), and the present study demonstrates the generality of the previous work. In the case of ABA renewal, however, the evidence is less clear. In accordance with the present study, other researchers have also reported no attenuation of ABA renewal due to extinction in multiple contexts (e.g., Betancourt et al., 2008; Bouton et al., 2006a; Neumann et al., 2007). However, there have also been a number of demonstrations that extinction in multiple contexts can attenuate ABA renewal (e.g., Chelonis et al., 1999; Neumann, 2006). Several procedural factors might explain these discrepancies. For example, Thomas et al. (2009) reported that massive extinction in multiple contexts attenuated ABA renewal, but moderate extinction in multiple contexts was not effective. Furthermore, Bandarian Balooch and Neumann (2011) reported the prevention of ABA renewal due to extinction in multiple contexts only when the extinction contexts were perceptually similar to the acquisition context. Thus, it is possible that we might have found attenuation of ABA renewal either with a longer extinction phase or with more similar contexts.

To assess ABC renewal in Experiment 2, we introduced a novel context during the test phase that had not been trained or introduced in any way to our participants during the previous stages of the experiment. Thus, this test context differed from the acquisition context not only in terms of its associative learning history, but also in terms of familiarity. Both of these factors might have contributed to the difference in the effects of extinction in multiple contexts on ABA and ABC renewal. Further experimental work will be required to uncover the importance of contextual learning history and contextual familiarity for the effect of extinction in multiple contexts on the strength of response recovery.

The results of Experiment 2 suggest that ABC renewal is easier to prevent by extinction in multiple contexts than is ABA renewal. This observation extends the scope of the documented differences between ABA and ABC renewal. Harris et al. (2000), for instance, reported stronger ABA renewal than ABC renewal after extinction using an aversive-conditioning preparation with rats. Similar findings were reported with human subjects by Havermans et al. (2005) and by Neumann (2006) in a conditioned suppression task, and in human predictive learning (Üngör & Lachnit, 2006). In order to explain these findings, some authors (e.g., Delamater, 2004; Harris et al., 2000; Havermans et al., 2005) have suggested that during extinction, the context of initial learning retrospectively acquires the ability to modulate acquisition performance.

A similar assumption might be used as a post hoc explanation of the results of Experiment 2. When extinction is conducted in multiple contexts, the CS is followed by the outcome only in the acquisition context, whereas the outcome is absent in the remaining contexts. This could lead the participants to treat their experience in the acquisition context as an exception to the rule and to treat extinction as the general case, which would prompt generalization of extinction to novel contexts. This suggestion is also consistent with results reported by Gunther, Denniston, and Miller (1998, Exp. 2). They examined whether extinction in multiple contexts attenuates ABC renewal following acquisition in multiple contexts. The results showed that when acquisition was conducted in multiple contexts, extinction in multiple contexts failed to reduce renewal. Further experiments should examine whether the context specificity of acquisition plays a role in the effectiveness of extinction in multiple contexts, for instance, by extending the results reported by Gunther et al. to ABA renewal.