For several decades, auditory sensory memory was generally conceived as being composed of two types of memory: (1) a short type that lasts for much less than a second and is thought to be the basis of such well-known phenomena as loudness summation and could, in principle, also be used for discrimination between sounds that are separated by very short silent intervals and (2) a long type that lasts for several seconds and is the basis for discrimination between sequentially presented sounds, especially when separated by long intervals (for theoretical discussions, see Cowan, 1984; Demany & Semal, 2008; Massaro, 1972; Näätänen & Winkler, 1999). However, the notion of passive loss of memory in such explicit memory-based judgments has been challenged (Cowan, 2008). For example, Cowan and colleagues showed, in a two-alternative forced choice frequency discrimination task, that discrimination performance declined not only with a larger delay between two tones, but also with a shorter delay between trials; the results strongly suggested the presence of true passive loss, in addition to an interfering effect of the previous trial (Cowan, Saults, & Nugent, 1997; see also Cowan, Saults, & Nugent, 2001; Deutsch, 1970; Mathias, Micheyl, & Bailey, 2010; McKeown & Mercer, 2012; Mercer & McKeown, 2010b; Ruusuvirta, Wikgren, & Astikainen, 2008). A similar debate exists in the literature about whether verbal short-term memory and episodic memory decline or not (e.g., Altmann, 2009; Lewandowsky, Oberauer, & Brown, 2009). As far as we know, previous studies addressing this issue utilized explicit prospective memory paradigms, meaning that listeners knew prior to each trial that they would have to make memory judgments on stimuli that were presented. Thus, in order to test the generality of these findings, it is important to determine whether implicit auditory memories are lost, interfere with each other, or persist despite the passage of time and the presentation of other, subsequent patterns.

Temporal context effects, which refer to the effects of prior stimuli or prior percepts on perception, provide a useful tool for investigating implicit auditory memory. Context effects have been observed for the perception of basic auditory attributes, such as loudness (e.g., Gordon & Schneider, 2007; Marks, 1993; Oberfeld, 2007; Plack & Viemeister, 1992; Viemeister & Bacon, 1982; Zeng, Turner, & Relkin, 1991), pitch (e.g., Dawe, Platt, & Welsh, 1998; Okada & Kashino, 2003; Repp, 1997; Serman, Semal, & Demany, 2008; Shu, Swindale, & Cynader, 1993), timbre (e.g., Summerfield, Haggard, Foster, & Gray, 1984), and sound location (e.g., Kashino & Nishida, 1998; Kopco, Best, & Shinn-Cunningham, 2007), as well as for the perception of speech—specifically, phoneme boundaries (e.g., Holt, 2005; Samuel, 1986), vocal affect (e.g., Bestelmeyer, Rouger, DeBruine, & Belin, 2010), and voice gender (e.g., Zaske, Schweinberger, Kaufmann, & Kawahara, 2009). However, very little is known about the time course of these implicit influences of prior stimuli or prior percepts and whether they undergo interference or show persistent effects.

In the present study, we utilized recently discovered context effects in auditory scene analysis tasks (Snyder, Carter, Hannon, & Alain, 2009; Snyder, Carter, Lee, Hannon, & Alain, 2008; Snyder, Holder, Weintraub, Carter, & Alain, 2009; Snyder & Weintraub, 2011; see also Riecke, Mendelsohn, Schreiner, & Formisano, 2009; Riecke et al., 2011) to determine the extent to which implicit memories consist of memories that are lost versus persistent. Context effects in stream segregation have been demonstrated using sequences of tones organized into a repeating triplet pattern, ABA-ABA-. . . , where A and B represent tones of different frequencies and the dash represents a silent gap (for reviews of research on stream segregation relevant to this study, see Snyder & Alain, 2007; Snyder, Gregg, Weintraub, & Alain, 2012). It has been known for many years that such sequences can give rise to two different percepts, depending on the frequency separation (Δƒ) between the A and B tones: For small Δƒs (e.g., 1 semitone), the sequence is almost invariably heard as a single “stream” of sounds with a galloping rhythm; for large Δƒs (e.g., 18 semitones), the galloping rhythm is lost, and what the listener hears instead are two parallel and separate sound streams, one (A-A-. . .) having a faster tempo than the other (B---B---. . .) (Bregman & Campbell, 1971; Miller & Heise, 1950; Van Noorden, 1975). For intermediate Δƒs (e.g., 6 semitones), the sequence can be heard either as one galloping stream or as two separate isochronous streams, depending on factors such as the time elapsed since the onset of the sequence (Anstis & Saida, 1985; Bregman, 1978) and the listener’s perceptual goals (Van Noorden, 1975) or attentional state (Carlyon, Cusack, Foxton, & Robertson, 2001). 
The process of perception changing from one stream to two streams with longer sequences is called buildup and has been attributed to adaptation of frequency-tuned neurons in the brainstem and the auditory cortex (Micheyl, Tian, Carlyon, & Rauschecker, 2005; Pressnitzer, Sayles, Micheyl, & Winter, 2008). Consistent with the idea of adaptation of frequency-tuned neurons causing buildup is the finding that a preceding sequence of single-frequency tones that match the frequency of A or B tones can facilitate streaming (Beauvois & Meddis, 1997; Haywood & Roberts, 2010; Rogers & Bregman, 1993).

It was also recently shown that the Δƒ of the previously presented sequence(s) and whether the listener perceived the previous sequence(s) as one stream or two have a major impact on subsequent perceptual judgments (Snyder, Carter, et al., 2009; Snyder et al., 2008; Snyder, Holder, et al., 2009; Snyder & Weintraub, 2011). Specifically, these studies have consistently found that listeners are more likely to report perceiving a test sequence as two separate streams when the preceding (or context) sequence contained a small Δƒ than when the context sequence contained a large Δƒ (Snyder, Carter, et al., 2009; Snyder et al., 2008; Snyder, Holder, et al., 2009; Snyder & Weintraub, 2011). In addition to this contrastive effect of prior stimulation, a facilitative effect of prior judgment was also found, whereby listeners were more likely to report perceiving the test sequence as two separate streams if they had also reported perceiving the context sequence as two streams than if they had reported perceiving the context sequence as a single stream (Snyder, Carter et al., 2009; Snyder et al., 2008; Snyder, Holder et al., 2009). The facilitative effect of prior perception is dissociated from the prior Δƒ effect by measuring the prior perception effect when the Δƒ of the context and test are held constant, usually at an intermediate Δƒ level (e.g., five or six semitones).

Similar to buildup, it is possible to explain the effect of prior Δƒ as resulting from adaptation of neurons tuned to large (or small) Δƒs, under the assumption that reduced responsiveness of neurons would cause greater relative responses in neurons tuned to Δƒ of the opposite size—that is, small (or large)—for which we have found some evidence using event-related potentials (Snyder, Holder, et al., 2009). However, an adaptation explanation does not account for the facilitative nature of the prior perception context effect. Instead, this may result from a type of perceptual priming found in visual studies (for a review, see Pearson & Brascamp, 2008; see the General Discussion section for more details). Alternatively, a framework based on adjustments to perceptual judgment criteria might be able to account for both the contrastive nature of the prior Δƒ effect and the facilitative nature of the prior perception effect (Treisman & Williams, 1984; for an in-depth discussion, see Snyder & Weintraub, 2011).

In addition to furthering our understanding of sound segregation mechanisms (e.g., Snyder, Carter et al., 2009), these context effects provide a tool for studying the implicit effects of memory and how they influence the perceptual organization of sound. If the traditional notion of a single long auditory memory store containing information that is lost over time applies to implicit auditory memory of sound patterns, the streaming context effects should simply get smaller as the time between the context and test patterns is increased. In a previous study, Snyder and colleagues examined the time course of the prior Δƒ effect in some detail and showed that it does become smaller over time (Snyder et al., 2008). In particular, when there was a longer interstimulus interval (ISI; e.g., 5.76 vs. 1.44 s) from the end of one ABA- sequence to the beginning of the next, the size of the prior Δƒ effect was diminished. Additionally, the effect of the previous trial was larger than the effect of the trial before the previous one, which was, in turn, larger than the one before that. Together, these results were consistent with a memory that is lost over tens of seconds, until the memory is no longer present. However, an alternative explanation is that the presence of intervening stimuli actively interfered with the context effect from sequences other than the immediately prior one, as opposed to a more passive process of loss. Such a finding would be consistent with the proposals described above—namely, that interference, rather than or in addition to loss, is responsible for the effects of decreased memory performance (Cowan, 2008; Mercer & McKeown, 2010b)—and would extend these findings to implicit influences of auditory sensory memory. This study will also allow us to examine whether these two context effects result from similar underlying memories by directly comparing the effects of ISI on the prior Δƒ and prior perception effects in different participants.

To accomplish this, we utilized trials containing two successive ABA- context periods, followed by a test ABA- sequence. We examined the effect of prior Δƒ and prior perceptual interpretation of ABA- sequences during the first context period (context1) and the second context period (context2) on perception during the test. In these experiments, to identify temporal loss of the effects of the two contexts, we manipulated the ISI between the second context and the test, and we also compared the size of the context effects for the first and second contexts, as in a previous study (Snyder et al., 2008). However, unlike in the previous study, we also performed experiments in which the second context period consisted of silence, which allowed us to determine the extent to which the effect of the first context sequence was lost over time or was interfered with by the second context sequence when both context sequences were presented. This relies on the assumption that we can compare the effect of context1 when context2 is present or absent by estimating the effect of context1 in the presence of context2 by averaging across conditions in which context2 Δƒ is small or large or in which perception is “one stream” or “two streams”; we therefore assume that we can estimate the effect of context2 presence without the confounding influence of the context2 Δƒ or percept. Additionally, we performed experiments in which the first context period consisted of silence, to verify whether the loss of the context2 effect was a true loss effect and not the result of interference from the preceding context1. Finally, we performed an additional experiment that looked for loss and interference using a method more directly inspired by one of the previous auditory memory studies we described above (Cowan et al., 1997).

Experiments 1A–1C

We now examine the effect of the prior frequency separations on the perception of streaming during a test period.

Method

Participants

Forty-seven self-reported normal-hearing adults (22 men and 25 women; age range = 18–49 years, mean age = 20.9 years) from the University of Nevada, Las Vegas psychology participant pool participated after giving written informed consent according to the guidelines of the University’s Office for the Protection of Research Subjects. Thirteen participants took part in Experiment 1A, 16 participants took part in Experiment 1B, and 18 participants took part in Experiment 1C.

Stimuli

The stimuli were pure tones (80 ms in duration, including 10-ms rise/fall times with linear ramps) organized into a repeating temporal pattern, ABA-, where A and B represent tones of different frequencies and the dash stands for a silent gap. The onset asynchrony between adjacent tones within each ABA- triplet and the duration of the silent gap were 120 ms each. The stimuli were synthesized off-line using MATLAB (The MathWorks Inc., Natick, MA) and presented using a custom program written in the Presentation language (Neurobehavioral Systems, Inc., Albany, CA). Sounds were generated using an SB X-Fi sound card (Creative Technology, Ltd.) and were delivered binaurally via Sennheiser HD 280 headphones (Sennheiser Electronic Corporation, Old Lyme, CT) at approximately 70 dB SPL.
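As an illustration of the timing described above, the following Python sketch builds one ABA- triplet from 80-ms tones with 10-ms linear ramps, placed at 120-ms onset asynchronies and followed by a 120-ms silent slot. It is not the MATLAB code used in the study; the sampling rate, the function names, and the unit peak amplitude are assumptions made for the example.

```python
import math

SAMPLE_RATE = 44100  # assumption: the sampling rate is not reported in the text


def tone(freq_hz, dur_s=0.080, ramp_s=0.010, rate=SAMPLE_RATE):
    """Pure tone with linear onset and offset ramps, peak amplitude 1."""
    n = round(dur_s * rate)
    n_ramp = round(ramp_s * rate)
    samples = []
    for i in range(n):
        # Gain rises linearly from 0, holds at 1, then falls linearly to 0.
        gain = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp)
        samples.append(gain * math.sin(2 * math.pi * freq_hz * i / rate))
    return samples


def aba_triplet(freq_a, freq_b, soa_s=0.120, rate=SAMPLE_RATE):
    """One ABA- cycle: three tone slots and one silent slot of equal length."""
    slot = round(soa_s * rate)
    out = [0.0] * (4 * slot)  # the fourth slot stays silent (the dash)
    for k, freq in enumerate((freq_a, freq_b, freq_a)):
        samples = tone(freq, rate=rate)
        out[k * slot:k * slot + len(samples)] = samples
    return out


triplet = aba_triplet(300, 424)
cycle_ms = 1000 * len(triplet) / SAMPLE_RATE  # duration of one ABA- cycle
```

Under these assumptions each cycle lasts 480 ms, which matches the 6.72-s duration reported for a sequence of 14 triplets.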

Procedure

Each trial in Experiment 1A consisted of three successive sequences: two context sequences (context1 followed by context2), and a test sequence (test). Each sequence consisted of 14 ABA- triplets, for a total duration of 6.72 s. The two context sequences were separated by a 1.44-s silence. The duration of the silent gap between the second context and test sequence was variable, being equal to either 1.44 or 8.64 s. Consecutive trials were separated by a silent interval of 5 s. The frequency of the A tones was fixed at 300 Hz. In the test sequence, the frequency of the B tones was equal to 424 Hz. This corresponds to an A–B frequency separation (Δƒ) of 6 semitones. This Δƒ was chosen because it usually leads to an ambiguous, or bistable, percept: The sequence can be perceived as a single stream or as two separate streams. For the context sequences, the frequency of the B tones was either the same for both contexts or different between context1 and context2, depending on the trial. It was constant within a given context, being equal to 357 Hz, which corresponded to a Δƒ of 3 semitones between the A and B tones, or to 600 Hz, which corresponded to a Δƒ of 12 semitones between the A and B tones. This resulted in eight different types of trials (2 context1 Δƒs × 2 context2 Δƒs × 2 silent gaps). The stimuli in Experiment 1B were identical to those in Experiment 1A, except that context2 was replaced by a silent interval of the same duration. Therefore, in this experiment, there were only four types of trials (2 context1 Δƒs × 2 silent gaps). Similarly, the stimuli in Experiment 1C were identical to those in Experiment 1A, except that context1 was replaced by a silent interval of the same duration. Therefore, in this experiment, there were only four types of trials (2 context2 Δƒs × 2 silent gaps).
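The arithmetic behind these stimulus parameters can be checked with a short Python sketch. It assumes equal-tempered semitones (a frequency ratio of 2^(1/12) per semitone) and a 480-ms ABA- cycle; the names are illustrative and do not come from the original experiment scripts.

```python
from itertools import product

A_FREQ_HZ = 300
TRIPLET_MS = 480            # three 120-ms tone slots plus one 120-ms silent slot
TRIPLETS_PER_SEQUENCE = 14


def b_frequency(freq_a_hz, semitones):
    """Frequency lying `semitones` equal-tempered semitones above freq_a_hz."""
    return freq_a_hz * 2 ** (semitones / 12)


def trial_types(context1_dfs, context2_dfs, gaps_s):
    """All combinations of context1 Δƒ, context2 Δƒ, and silent gap."""
    return list(product(context1_dfs, context2_dfs, gaps_s))


sequence_ms = TRIPLETS_PER_SEQUENCE * TRIPLET_MS
context_b = {df: round(b_frequency(A_FREQ_HZ, df)) for df in (3, 12)}
test_b = round(b_frequency(A_FREQ_HZ, 6))

types_1a = trial_types((3, 12), (3, 12), (1.44, 8.64))
# None stands for a context period replaced by silence (Experiment 1B).
types_1b = trial_types((3, 12), (None,), (1.44, 8.64))
```

Rounded to the nearest hertz, this reproduces the B-tone frequencies given in the text (357, 424, and 600 Hz) and the eight and four trial types of Experiments 1A and 1B.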

In Experiment 1A, five blocks of 24 trials were presented (3 of each of the eight trial types). In Experiment 1B, four blocks of 24 trials were presented (6 of each of the four trial types). In Experiment 1C, four blocks of 24 trials were presented (6 of each of the four trial types). The different types of trials were randomly intermingled within a block. Blocks were tested in counterbalanced orders using a Latin-square design. Prior to Experiment 1A, we presented each of the eight trial types once as practice in random order; for Experiments 1B and 1C, we presented each of the four trial types twice as practice in random order.

Participants were instructed to press and hold down the key labeled “1” on the computer keyboard (number pad) whenever they perceived the sound sequence as a single stream and to press and hold down the key labeled “2” whenever they perceived the sound sequence as two streams. They were asked to release all keys during silent intervals. In addition, the participants were encouraged not to actively try to hear the sequence one way or the other but, rather, to listen “neutrally” and attentively to the sequences. They were not informed that the trials were structured into context and test periods; they were simply told to respond whenever sounds were played. Button presses and releases were recorded by the Presentation software and stored for off-line analysis. During the tests, participants were seated in a quiet room and were asked to maintain fixation on a white cross on a black background in the center of a computer screen throughout the experiment, in order to minimize potential visual influences on the auditory percepts.

Data analysis

The timings of the keypresses were used to construct the time courses of perceiving “two streams,” separately, for each experiment, trial type, and participant. The time series for each trial represented a total duration of 22.56 or 29.76 s and consisted of 48 or 63 time points, depending on the duration of the variable silence between context2 and the test period. Each time point represented the instantaneous reported perception at the start of an ABA- cycle (i.e., every 480 ms). Each of the time points was coded as “1 stream” if no button had been pressed previously during the trial, if the “1-stream” button had been pressed most recently during the trial, or if the “1-stream” button was pressed immediately after the current time point and closer to the current time point than to the next time point. A data point was coded as “2 streams” only if the most recently pressed button during the trial was the “2-streams” button or if the “2-streams” button was pressed immediately after the current time point and closer to the current time point than to the next time point. For each participant, we calculated the proportion of trials on which two streams were reported at each time point within a condition, by averaging the time series across all the trials within the same condition; an example of these average time courses is plotted in Fig. 1. In order to quantify streaming for statistical analysis, we calculated the proportion of total time that each participant reported two streams by averaging all the time points together from the test period, thus eliminating information about the time course of streaming. Finally, we used these proportions to calculate the following difference score to quantify the effect of prior Δƒ during the test: The mean proportion of hearing two streams during the test when the prior Δƒ was 12 semitones was subtracted from the mean proportion of hearing two streams during the test when the prior Δƒ was 3 semitones.
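The coding rule and the difference score can be sketched in Python as follows. This is one possible reading of the rule, not the authors' analysis code: it assumes that only keypress times are needed (releases are ignored), that "closer to the current time point than to the next" means less than half a cycle (240 ms) after the time point, and that presses are supplied as time-sorted (time in ms, key) pairs. All names are hypothetical.

```python
STEP_MS = 480  # one ABA- cycle


def n_time_points(gap_ms, sequence_ms=6720, fixed_gap_ms=1440):
    """Number of cycle onsets in a trial: three sequences plus two silences."""
    return (3 * sequence_ms + fixed_gap_ms + gap_ms) // STEP_MS


def code_time_points(presses, n_points, step_ms=STEP_MS):
    """Code each cycle onset as 1 (one stream) or 2 (two streams).

    presses: list of (time_ms, key) pairs sorted by time, with key 1 or 2.
    """
    coded = []
    for k in range(n_points):
        t = k * step_ms
        percept = 1  # no press yet in the trial counts as one stream
        for press_time, key in presses:
            if press_time <= t:
                percept = key  # most recent press so far
            elif press_time - t < step_ms / 2:
                percept = key  # press just after t, closer to t than to t + step
                break
            else:
                break
        coded.append(percept)
    return coded


def proportion_two_streams(coded):
    """Proportion of time points coded as two streams."""
    return sum(1 for c in coded if c == 2) / len(coded)


def prior_df_effect(p_after_small_df, p_after_large_df):
    """Difference score: streaming after a 3-semitone context minus a 12-semitone one."""
    return p_after_small_df - p_after_large_df
```

With the 1.44-s and 8.64-s variable silences, `n_time_points` gives the 48 and 63 time points stated above. A positive difference score indicates the contrastive effect of prior Δƒ.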

Fig. 1

Time course data from Experiment 1A: Proportion of time the tone sequences were heard as two streams (streaming) for the context1 period (left portion of panels), the context2 period (middle portion of panels), and the test period (right portion of panels). In Experiment 1A, the Δƒ during context2 (compare thick vs. thin lines) had a larger effect on perception during the test than did the context1 Δƒ (compare solid vs. dotted lines). The effect of context2 declined more with a longer silent duration preceding the test than did the context1 effect (compare upper and lower panels). Schematic representations of the ABA- patterns are shown above each data panel.

For brevity, statistical analyses will be reported only for the data from test periods of trials. For Experiment 1A, the average proportions of hearing two streams during the test period for different trial types were entered into a repeated measures analysis of variance (ANOVA) to test for differences in the effect of prior Δƒ on streaming depending on the context being evaluated (context1 vs. context2) and the duration of the variable silent period (1.44 or 8.64 s) as within-subjects factors. For Experiments 1B and 1C, separate ANOVAs were performed, using the duration of the variable silent period as the within-subjects factor. For Experiments 1A and 1B, to assess the persistence of the effect of prior Δƒ from context1, prior Δƒ difference scores were entered into a one-sample t-test to test whether the effect was significantly larger than 0. To directly compare Experiments 1A and 1B and, thus, the effect of the presence or absence of context2 on the effect of the context1 Δƒ, we averaged the Experiment 1A data across the different context2 Δƒ levels and entered the data into a mixed-measures ANOVA with variable silent duration as the within-subjects factor and context2 presence/absence as the between-subjects factor.
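For readers who want to reproduce the persistence test, the one-sample t statistic on the difference scores can be computed as in the following Python sketch, which uses only the standard library. The function name is hypothetical, the study's own statistical software is not specified here, and the sketch returns only the statistic and its degrees of freedom (a p value would additionally require the t distribution).

```python
import math
import statistics


def one_sample_t(scores, mu=0.0):
    """Return (t, df) for testing whether the mean of `scores` differs from mu."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
    t = (mean - mu) / (sd / math.sqrt(n))
    return t, n - 1
```

Each element of `scores` would be one participant's prior Δƒ difference score for a given context and silent duration, so that df equals the number of participants minus one.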

Results and discussion

In general, participants followed instructions by continuously indicating whether they were hearing one or two streams during the context and test periods and ceasing to press buttons during the silent period between the context and test. Because the results of most interest are based on data collected during the test periods, we present detailed results only from the test period. However, as is shown in Fig. 1, expected effects were observed during the context (cf. Anstis & Saida, 1985; Bregman, 1978). In particular, when the current Δƒ was large, participants tended to report hearing two streams more often than when the Δƒ was small; and only after multiple repetitions of the ABA- pattern did participants tend to report two streams.

As is shown in Figs. 1 and 2, perceptual judgments during the test in Experiment 1 were influenced by Δƒ during the context, with a larger prior Δƒ leading to less streaming than a smaller prior Δƒ, replicating previous findings (Snyder, Carter, et al., 2009; Snyder et al., 2008; Snyder, Holder, et al., 2009; Snyder & Weintraub, 2011). The effect of context2 was larger than the effect of context1 (see Fig. 2, top left panel), F(1, 12) = 9.54, p < .01, ηp² = .443, consistent with the fact that context2 was closer in time to the test than was context1. Nonetheless, the size of the effect of prior Δƒ in context1 was significantly larger than 0 [context1–short, t(12) = 3.16, p < .01; context1–long, t(12) = 3.93, p < .005]. However, there was no main effect of the silent duration, F(1, 12) = 2.29, p = .15, ηp² = .160; instead, there was an interaction between context and silent duration, F(1, 12) = 8.79, p < .025, ηp² = .423, because the effect of context2 (effect for short silence = 0.25, effect for long silence = 0.15), F(1, 12) = 10.704, p < .01, ηp² = .471, declined more than the effect of context1 (effect for short silence = 0.06, effect for long silence = 0.09), F(1, 12) = 0.954, p = .35, ηp² = .074, which did not show loss at all. Consistent with this finding, Fig. 2 (middle left panel) shows that in Experiment 1B, there was little if any loss of the context1 effect (effect for short silence = 0.12, effect for long silence = 0.15), since we observed no significant main effect of silent duration, F(1, 15) = 1.74, p = .21, ηp² = .104. Again, the size of the effect of prior Δƒ was significantly larger than 0 [context1–short, t(15) = 3.80, p < .005; context1–long, t(15) = 4.13, p < .001]. Finally, Fig. 2 (bottom left panel) shows that in Experiment 1C, there was substantial loss for the effect of context2 in the absence of a preceding context1 (effect for short silence = 0.30, effect for long silence = 0.19), confirming that loss does occur over a short time period, F(1, 17) = 15.18, p < .005, ηp² = .472.
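The partial eta squared values printed alongside each F statistic can be recovered from the F value and its degrees of freedom using the standard identity: partial eta squared = (F × df_effect) / (F × df_effect + df_error). A minimal Python check, with a hypothetical function name:

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Context main effect in Experiment 1A: F(1, 12) = 9.54
eta_context_1a = round(partial_eta_squared(9.54, 1, 12), 3)
# Silent-duration effect in Experiment 1C: F(1, 17) = 15.18
eta_silence_1c = round(partial_eta_squared(15.18, 1, 17), 3)
```

Both values agree with those reported above (.443 and .472), which is a quick way to confirm that an effect size and its F statistic belong together.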

Fig. 2

Context1 and context2 effect sizes for Experiments 1 and 2. Experiments 1A–1C show effects of prior Δƒ for short and long delays after context1 or context2. Experiments 2A–2C show effects of prior perception. Within-subjects error is based on the pooled and scaled SEMpairedDiff for all trial type pairs, such that \( \mathrm{SEM} = \sqrt{\overline{\left( \frac{1}{\sqrt{2}}\, \mathrm{SEM}^{\mathrm{pairedDiff}}_{..} \right)^{2}}} \) (Franz & Loftus, 2012). The horizontal line and the two dots indicate that all corresponding SEMpairedDiff are pooled. This method results in one within-subjects error equivalent to that calculated using the method suggested in Loftus and Masson (1994).

Fig. 3

Context effect sizes for Experiment 3 as a function of the length of the context2–test delay. Ratios between the context1–context2 delay and context2–test delay are indicated by black (1:1), dark gray (1:2), and light gray (1:6) bars. Within-subjects error is based on the pooled and scaled SEMpairedDiff for all trial type pairs (Franz & Loftus, 2012)

These results are consistent with our previous findings that the effect of prior Δƒ lessens with larger silent intervals and that more distant contexts have less of an effect than more recent ones (Snyder et al., 2008). However, the results are not completely consistent with our previous interpretation that these effects simply reflect passive loss of the effect of prior Δƒ. Rather, the persistent effect of context1, which showed no further loss when the silent duration was increased, suggests a sustained memory for context1. Furthermore, the relatively small effect of context1 may result from interference from context2, in addition to or instead of passive loss; this possibility is consistent with the visibly larger effect of context1 in Experiment 1B than in Experiment 1A. However, the ANOVA comparing these two experiments directly revealed no significant effect of context2 presence, F(1, 27) = 2.66, p = .11, ηp² = .090. Thus, no strong conclusion can be made about the interfering effect of context2, although a much larger sample size (76 participants, as determined using G*Power software) might yield a significant effect.

Experiments 2A–2C

We now examine effects of prior perception during the two previously presented contexts on perceptual judgments of streaming during a test period. We also directly compare the effects of prior Δƒ and prior perception to determine whether similar memory processes underlie these two separate context effects.

Method

Participants

Sixty-eight self-reported normal-hearing adults (36 men and 32 women; age range = 18–47 years, mean age = 20.9 years) from the University of Nevada, Las Vegas psychology participant pool participated after giving written informed consent according to the guidelines of the University’s Office for the Protection of Research Subjects. Twenty participants took part in Experiment 2A, 20 participants took part in Experiment 2B, and 28 participants took part in Experiment 2C.

Stimuli

The stimuli were similar to those in the previous experiment.

Materials and procedure

The materials and procedure were the same as those in Experiment 1, except as follows. Within each block of trials in Experiment 2A, the Δƒs of context1 and context2 were constant. In block 1, the Δƒ was six semitones, and after each block, the Δƒs of both context1 and context2 were adjusted up or down by one semitone if the participant reported hearing one or two streams, respectively, for more than 60 % of their total responses during the previous block. This was done to generate similar numbers of 1- and 2-stream responses at the end of each context. Experiment 2B was conducted in the same fashion, except that context2 was always silent, as in Experiment 1B. Experiment 2C was conducted in the same fashion, except that context1 was always silent, as in Experiment 1C. In Experiments 2A, 2B, and 2C, five blocks of 12 trials (6 of each of the two trial types) were presented. Three practice trials were presented prior to beginning each experiment, 2 with a short silent interval before the test and 1 with a long interval before the test.
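The block-to-block adjustment of the context Δƒ amounts to a simple adaptive rule, sketched below in Python. The function name and the representation of responses as a list of coded percepts (1 or 2) are assumptions for illustration; the text does not specify how "total responses" were tallied.

```python
def next_block_df(current_df, percepts, threshold=0.60):
    """Context Δƒ (in semitones) for the next block.

    percepts: coded responses from the previous block, each 1 or 2.
    Raise Δƒ by one semitone after mostly one-stream reports, lower it by
    one semitone after mostly two-stream reports, and otherwise keep it.
    """
    if not percepts:
        return current_df
    p_one = percepts.count(1) / len(percepts)
    p_two = percepts.count(2) / len(percepts)
    if p_one > threshold:
        return current_df + 1
    if p_two > threshold:
        return current_df - 1
    return current_df
```

Starting from six semitones in block 1, this rule pushes each participant toward a Δƒ at which one-stream and two-stream reports are roughly balanced.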

Data analysis

The data were processed and analyzed as in Experiment 1, except as follows. For Experiment 2A, trials were sorted and averaged together within each silent interval condition and according to which perception was reported at the end of context1 and context2, resulting in eight conditions. Trials from Experiments 2B and 2C were sorted in a similar fashion, according to perceptual judgments at the end of context1 or context2 and the silent duration, resulting in four conditions as in Experiments 1B and 1C, respectively. We first determined the average Δƒ for each condition for Experiments 2A, 2B, and 2C to make sure that perception at the end of the contexts, rather than Δƒ, was modulating perception during the test. Next, we used the proportions of streaming during the test to calculate the following difference score to quantify the effect of prior perception: The mean proportion of hearing two streams during the test when the prior percept was 1 stream was subtracted from the proportion of hearing two streams during the test when the prior percept was 2 streams.

For Experiment 2A, the average proportions of hearing two streams during the test period for different trial types were entered into a repeated measures ANOVA to test for differences in the effect of prior perception on streaming depending on the context being evaluated (context1 vs. context2) and the duration of the variable silent period (1.44 or 8.64 s) as within-subjects factors. In performing this analysis, we recognize that the percepts during the two contexts are not independent of each other and that the analysis is, therefore, quasi-experimental, but in order to be able to better compare results across experiments, we chose to perform this analysis nevertheless. For Experiments 2B and 2C, separate ANOVAs were performed, using the duration of the variable silent period as the within-subjects factor. For Experiments 2A and 2B, to assess the persistence of the effect of prior perception from context1, prior perception difference scores were entered into a one-sample t-test in order to test whether the effect was significantly larger than 0. To directly compare Experiments 2A and 2B and, thus, the moderating effect of the presence or absence of context2 on the effect of the context1 percept, we averaged the Experiment 2A data across the different context2 percepts and entered the data into a mixed-measures ANOVA with variable silent duration as the within-subjects factor and context2 presence/absence as the between-subjects factor. To make sure that our main manipulations of prior perception and silent period had effects above and beyond the possibly confounding effect of Δƒ, which was allowed to vary individually for each participant from block to block, we performed linear mixed-model analyses on the prior perception effect sizes with condition (Experiment 2A: context1/short delay, context1/long delay, context2/short delay, context2/long delay; Experiment 2B: context1/short delay, context1/long delay; Experiment 2C: context2/short delay, context2/long delay) and average Δƒ for each condition as fixed factors and participant as the random factor.

Results and discussion

Table 1 shows the average Δƒ values for each of the conditions in Experiments 2A, 2B, and 2C, which were all around the starting value of six semitones. To determine whether there were significant differences in Δƒ for the different conditions, we performed ANOVAs on the Δƒ values, with perception and delay size as within-subjects factors. For Experiment 2A, there were no main effects of percept or delay, but there was a significant interaction between these factors, F(3, 57) = 3.14, p < .05, ηp² = .142. This was primarily due to a relatively low value for the condition in which two streams were perceived at the end of context1, followed by one stream at the end of context2, with a delay of 1.44 s. For Experiments 2B and 2C, neither the main effects nor the interaction was significant, although in both experiments there was a marginal effect of percept [Experiment 2B, F(1, 19) = 3.77, p = .07, ηp² = .166; Experiment 2C, F(1, 27) = 4.15, p = .052, ηp² = .133]. In both experiments, this was due to slightly higher values for the two-streams conditions. However, given the very small differences in Δƒ between conditions (<1 semitone), it is highly unlikely that these differences would have much effect on perception of streaming, on top of the differences due to the prior perception effects. To verify this, we examined the linear mixed-model analysis on the prior perception effect sizes, which showed a significant effect of condition for Experiment 2A, F(3, 75) = 2.91, p < .05, and a marginal effect of Δƒ, F(1, 75) = 2.79, p = .10. For Experiment 2B, neither effect was significant [condition, F(1, 37) = 0.07, p = .787; Δƒ, F(1, 75) = 0.36, p = .55]. For Experiment 2C, also, neither effect was significant [condition, F(1, 53) = 1.41, p = .24; Δƒ, F(1, 53) = 1.63, p = .21].

Table 1 Mean Δƒ values for conditions in Experiments 2A, 2B, and 2C

As is shown in Fig. 2 (top right panel), unlike in Experiment 1, the effect of context2 was not significantly larger than the effect of context1, F(1, 19) = 1.60, p = .22, ηp² = .078, although there was a trend in this same direction. The effect of prior perception in context1 was significantly larger than 0 [context1–short, t(19) = 3.66, p < .005; context1–long, t(19) = 3.27, p < .005]. There was a main effect of silent duration, F(1, 19) = 7.81, p < .025, ηp² = .291, but there was only a trend toward an interaction between context and silent duration, F(1, 19) = 2.97, p = .10, ηp² = .135. As in Experiment 1, this was due to the fact that, overall, the effect of the contexts was smaller with larger silent intervals and this effect was most dramatic for context2 (effect for short silence = 0.21, effect for long silence = 0.07), F(1, 19) = 8.953, p < .01, ηp² = .320, and less so for context1 (effect for short silence = 0.11, effect for long silence = 0.08), F(1, 19) = 0.318, p = .58, ηp² = .016. Consistent with these findings, Fig. 2 (middle right panel) shows that in Experiment 2B, there was little if any loss of the context1 effect (effect for short silence = 0.14, effect for long silence = 0.12), and we observed no significant main effect of silent duration, F(1, 19) = 0.65, p = .43, ηp² = .033. Again, the effect of prior perception in context1 was significantly larger than 0 [context1–short, t(19) = 2.85, p < .025; context1–long, t(19) = 2.59, p < .025]. Figure 2 (bottom right panel) shows that in Experiment 2C, there was weak evidence for loss for the effect of context2 in the absence of a preceding context1 (effect for short silence = 0.16, effect for long silence = 0.09), although the effect of silent duration was only marginally significant and not as large as in Experiment 1C, F(1, 27) = 3.23, p = .08, ηp² = .107.
The ANOVA comparing Experiments 2A and 2B directly showed no main effect of context2 presence on the effect of context1, F(1, 38) = 0.54, p = .46, ηp² = .014, suggesting a lack of interference of context2 on the memory for context1.

In order to directly compare the results between Experiments 1 and 2, three separate mixed-measures ANOVAs were carried out. The first ANOVA combined the data from Experiments 1A and 2A, with context1, context2, and silent duration as within-subjects factors, and context effect type (prior Δƒ vs. prior perception) as a between-subjects factor. As when examining the two experiments separately, this revealed a main effect of context across the two experiments, F(1, 31) = 9.88, p < .005, ηp² = .242, due to the effect of context2 being stronger than the effect of context1. We also observed a main effect of silent duration, F(1, 31) = 8.08, p < .01, ηp² = .207, due to the effect of contexts getting smaller with larger silent durations. We also observed an interaction between context and silent duration, F(1, 31) = 7.33, p < .025, ηp² = .191, such that the effect of silent duration was a larger and significant effect for context2, F(1, 31) = 14.308, p < .001, ηp² = .316, but a smaller and nonsignificant effect for context1, F(1, 31) = 0.017, p = .90, ηp² = .001. There was no main effect or interaction involving context effect type, consistent with similar memory processes operating regardless of whether examining the effect of prior Δƒ or prior perception.

The second ANOVA combined data from Experiments 1B and 2B, with context1 and silent duration as within-subjects factors and context effect type as a between-subjects factor. As when examining the two experiments separately, there was no effect of silent duration, F(1, 34) = 0.09, p = .766, ηp² = .003, again showing that context1 does not decline further with the larger silent duration. There was also no main effect or interaction involving context effect type, again suggesting similar memory processes for the effects of prior Δƒ and prior perception.

The third ANOVA combined data from Experiments 1C and 2C, with context2 and silent duration as within-subjects factors and context effect type as a between-subjects factor. As when the two experiments were examined separately, there was a main effect of silent duration, F(1, 44) = 11.42, p < .01, ηp² = .206, again showing that context2 loss does occur over a short time period. There was a main effect of context effect type, F(1, 44) = 4.32, p < .05, ηp² = .089, because the effect of prior Δƒ was larger than the effect of prior perception. Most important, there was no interaction between context effect type and silent duration, F(1, 44) = 0.61, p = .44, ηp² = .014, again suggesting similar memory processes for the effects of prior Δƒ and prior perception.

Experiment 3

The purpose of Experiment 3 was to search for interference effects in addition to passive loss of information using a procedure inspired by a previous study on auditory discrimination judgments (Cowan et al., 1997; see also Winkler, Schröger, & Cowan, 2001). The Cowan et al. (1997) study showed that frequency discrimination performance decreased as a result of the passage of time between two to-be-discriminated tones in addition to how temporally proximate the last tone of the previous trial was to the first tone of the current trial, relative to the proximity between the two tones within the current trial. In other words, the ratio of the two time intervals in question can affect memory for the first tone of the trial, such that closer relative proximity of the to-be-discriminated tones leads to better discrimination performance. This can be considered a type of interference effect because the greater temporal proximity of events across trials reduces the ability to compare events within a trial. Therefore, in the present experiment, we varied the amount of time between context1 and context2, as well as the amount of time between context2 and the test, in such a way that we could test whether increasing the pretest delay resulted in a smaller context2 effect as a result of actual loss of memory for the context2 Δƒ (i.e., the passage of time) and/or due to the greater relative temporal proximity of context2 to context1. In so doing, we assumed that the greater the temporal proximity of context2 to context1, the larger the interference that context1 might have on the context2 Δƒ effect.

Method

Participants

Thirty-four self-reported normal-hearing adults (14 men and 20 women; age range = 18–48 years, mean age = 21.0 years) from the University of Nevada, Las Vegas psychology participant pool participated after giving written informed consent according to the guidelines of the University’s Office for the Protection of Research Subjects. One participant was later excluded for reporting one stream for the majority of trials across all conditions.

Stimuli

The stimuli were the same as those in Experiment 1.

Materials and procedure

The materials and procedure were the same as those in Experiment 1, except as follows. The frequency of the A tones was fixed at 300 Hz. The context1 and test Δƒs were fixed at 6 semitones, whereas the context2 Δƒ was 3 or 12 semitones. We chose to keep the context1 Δƒ constant so as to make the number of trial types (see below) and study duration more practical. The context2 Δƒ was crossed with five different possible combinations of context1–context2 silent intervals and context2–test silent intervals: (1) 2.88 and 2.88 s, (2) 8.64 and 8.64 s, (3) 1.44 and 2.88 s, (4) 4.32 and 8.64 s, and (5) 1.44 and 8.64 s, yielding ten different trial types. Therefore, the silent-interval ratio between context1–context2 and context2–test was 1:1, 1:1, 1:2, 1:2, or 1:6, respectively. Consecutive trials were separated by a silent interval of 5 s.
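
The ratios listed above follow directly from the interval pairs. The short Python sketch below is only an illustration and was not part of the original procedure; the names `SILENT_INTERVALS` and `interval_ratio` are hypothetical, and the second interval of combination (4) is taken as 8.64 s, the value that the stated 1:2 ratio requires.

```python
from fractions import Fraction

# (context1-context2 silence, context2-test silence) in seconds, combinations (1)-(5).
# Combination (4) uses 8.64 s, an assumption based on the stated 1:2 ratio.
SILENT_INTERVALS = [
    ("2.88", "2.88"),
    ("8.64", "8.64"),
    ("1.44", "2.88"),
    ("4.32", "8.64"),
    ("1.44", "8.64"),
]


def interval_ratio(first_s: str, second_s: str) -> str:
    """Return the reduced ratio first:second as a string such as '1:2'."""
    # Fraction parses decimal strings exactly, avoiding float rounding.
    ratio = Fraction(first_s) / Fraction(second_s)
    return f"{ratio.numerator}:{ratio.denominator}"


ratios = [interval_ratio(a, b) for a, b in SILENT_INTERVALS]
print(ratios)
```

Crossing these five combinations with the two context2 Δƒ values (3 or 12 semitones) gives the ten trial types.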

Five blocks of 20 trials were presented (2 of each of the ten trial types). The different types of trials were randomly intermingled within a block. Prior to the experiment, we presented 8 trials chosen randomly from all ten trial types as practice in random order.

Data analysis

The data were processed and analyzed as in Experiment 1, except as follows. The time series for each trial represented a total duration of 24.00, 25.44, 29.76, 32.64, or 36.96 s and consisted of 50, 53, 62, 68, or 77 time points, respectively, depending on the duration of the variable context1–context2 silence and the context2–test silence. The average proportions of hearing two streams during the test period for different trial types were entered into a repeated measures ANOVA to test for differences in the effect of prior Δƒ on streaming, depending on the duration of the context2–test silent interval (2.88 or 8.64 s) and the ratio between the context1–context2 and context2–test silent intervals (1:1 or 1:2) as within-subjects factors. Note that this ANOVA did not include the condition with intervals of 1.44 and 8.64 s. A second one-way repeated measures ANOVA was run to test for differences between the three conditions that each had a context2–test silent interval of 8.64 s as a function of the ratio between the context1–context2 and context2–test silent intervals (1:1, 1:2, or 1:6). This second ANOVA allowed us to determine whether the relative temporal proximity of context2 to context1 and the test, as opposed to the absolute time between context2 and the test, affected the size of the context2 Δƒ effect on streaming during the test.
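
The reported durations and time-point counts can be checked against the silent intervals with simple arithmetic. In the sketch below, the 0.48-s bin width and the 19.68 s of non-silent time per trial are not stated in this section; both are assumptions inferred from the reported totals (e.g., 24.00 s spread over 50 points), and all names are illustrative. The second interval of combination (4) is again taken as 8.64 s.

```python
# Work in centiseconds so that all arithmetic is exact integer arithmetic.
BIN_CS = 48       # assumed bin width: 0.48 s (inferred, not stated here)
SOUND_CS = 1968   # assumed non-silent time per trial: 19.68 s (inferred)

# (context1-context2 silence, context2-test silence) in centiseconds.
SILENCES_CS = [(288, 288), (864, 864), (144, 288), (432, 864), (144, 864)]


def trial_summary(gap1_cs: int, gap2_cs: int) -> tuple[float, int]:
    """Return (total duration in seconds, number of time points) for one trial."""
    total_cs = SOUND_CS + gap1_cs + gap2_cs
    points, remainder = divmod(total_cs, BIN_CS)
    if remainder:
        raise ValueError("trial duration is not a whole number of bins")
    return total_cs / 100, points


summaries = sorted(trial_summary(g1, g2) for g1, g2 in SILENCES_CS)
for duration_s, n_points in summaries:
    print(f"{duration_s:.2f} s -> {n_points} time points")
```

Under these assumptions the five combinations reproduce the durations and counts given in the text.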

Results and discussion

Consistent with the findings of Cowan et al. (1997), we found that the passage of time per se resulted in a loss of the context2 effect size (Fig. 3), even when controlling for the ratio between the context1–context2 and context2–test silent intervals, F(1, 32) = 19.87, p < .001, ηp² = .383. However, unlike in the study by Cowan et al. (1997), we found that when controlling for the context2–test silent interval, there was no effect of the ratio between the two time intervals, F(2, 64) = 2.32, p = .11, ηp² = .068, such that the effect of context2 was not different when the silent-interval ratio deviated from a 1:1 ratio (i.e., 1:2 or 1:6), although there was a trend in the expected direction. Thus, this experiment, like the earlier ones, provides only weak evidence that interference contributed to the loss of implicit auditory memory observed in Experiments 1 and 2; in addition, it independently confirmed the loss of information due to the simple passage of several seconds of time. This experiment only tested for interference and loss for the context2 Δƒ, so future experiments are needed to examine whether similar principles govern auditory memory for prior perception and for context1 Δƒ.
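
As a consistency check on the effect sizes reported above, partial eta squared can be recovered from an F ratio and its degrees of freedom through the standard identity (F × df1) / (F × df1 + df2). The sketch below is an illustration only, with a hypothetical helper name; it is not taken from the original analysis scripts.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from F and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Effect of the context2-test interval, F(1, 32) = 19.87
eta_interval = round(partial_eta_squared(19.87, 1, 32), 3)
# Effect of the silent-interval ratio, F(2, 64) = 2.32
eta_ratio = round(partial_eta_squared(2.32, 2, 64), 3)
print(eta_interval, eta_ratio)
```

Both values agree with the reported effect sizes to three decimal places.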

Experiment 4

The main purpose of Experiment 4 was to verify our assumption that the effects of prior Δƒ and prior perception are indeed separate effects. In particular, it was recently suggested to us that while the effect of prior perception seemed likely to be a true phenomenon, the effect of prior Δƒ might also be attributable to the prior percept (L. Demany, personal communication). The reasoning behind this is that when there is a large Δƒ during the context, this usually results in perception of two streams during the context, but perception might tend to switch to one stream during the test because the context and test stimuli do not match each other. In this case, the effect of prior Δƒ could be considered an effect of prior perception that has a contrastive, rather than facilitative, effect due to the change in Δƒ between the context and test. To test this theory, we reasoned that if the theory were correct, the effect of prior perception would be facilitative only when the context and test have the same Δƒ. Alternatively, if the facilitative nature of the prior perception effect were more robust, it would not become contrastive even when the context and test have a different Δƒ.

Method

Participants

Twenty-one self-reported normal-hearing adults (7 men and 14 women; age range = 18–35 years, mean age = 22.9 years) from the University of Nevada, Las Vegas psychology participant pool participated after giving written informed consent according to the guidelines of the University’s Office for the Protection of Research Subjects.

Stimuli

The stimuli were similar to those in the previous experiments.

Materials and procedure

The materials and procedure were the same as those in Experiments 1 and 2, except as follows. Each trial consisted of only two successive sequences: a context sequence and a test sequence. On each trial, the frequencies of the A and B tones of the test were fixed at 300 and 400 Hz or 500 and 668 Hz, respectively. This corresponds to an A–B frequency separation (Δƒ) of five semitones. As with a Δƒ of six semitones as used in Experiments 1 and 2, this Δƒ usually leads to an ambiguous, or bistable, percept. Two different frequency ranges were tested in this experiment to reduce the likelihood that participants would assign one perceptual report to one particular stimulus combination. For the context sequences, the frequency of the A tones was the same as in the test, and the B tones were constant within a given context, being equal to 357, 400, or 450 Hz when the A tone was 300 Hz, which corresponded to a Δƒ of three, five, or seven semitones between the A and B tones. On trials with an A tone frequency of 500 Hz, the B tones were 595, 668, or 750 Hz, again corresponding to Δƒs of three, five, or seven semitones. This resulted in six different types of trials (3 context Δƒs × 2 A tone frequencies).
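
The correspondence between the tone frequencies and the Δƒ values can be verified with the equal-tempered relation Δƒ = 12 × log2(fB / fA). The following sketch is illustrative only; the helper name `semitones` and the layout of `TONE_PAIRS` are hypothetical.

```python
import math


def semitones(f_a_hz: float, f_b_hz: float) -> float:
    """Frequency separation between two tones in equal-tempered semitones."""
    return 12 * math.log2(f_b_hz / f_a_hz)


# A-tone frequency -> the three B-tone frequencies used with it (Hz).
TONE_PAIRS = {300: (357, 400, 450), 500: (595, 668, 750)}

rounded = {
    f_a: [round(semitones(f_a, f_b)) for f_b in f_bs]
    for f_a, f_bs in TONE_PAIRS.items()
}
print(rounded)
```

The B-tone frequencies are rounded to whole hertz, so the exact separations differ slightly from integer semitone values (for example, 300 to 400 Hz is just under five semitones).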

Five blocks of 30 trials were presented (5 of each of the six trial types). The different types of trials were randomly intermingled within a block. Trial order was randomized every block and for every participant separately. Prior to the experiment, participants completed a practice of 10 trials selected randomly from an array of 12 trials (2 of each of the six trial types).
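
One way to generate such blocks is sketched below. This is a minimal illustration under stated assumptions, not the software actually used: the function `build_block`, the tuple layout of a trial type, and the use of a seeded `random.Random` are all hypothetical choices.

```python
import random
from itertools import product

# Six trial types: context delta-f (semitones) crossed with A-tone frequency (Hz).
TRIAL_TYPES = list(product((3, 5, 7), (300, 500)))


def build_block(trial_types, repeats, rng):
    """Return one block with `repeats` copies of each trial type, shuffled."""
    block = [trial for trial in trial_types for _ in range(repeats)]
    rng.shuffle(block)  # a fresh random order for every block
    return block


rng = random.Random(2024)  # hypothetical per-participant seed
blocks = [build_block(TRIAL_TYPES, 5, rng) for _ in range(5)]
print(len(blocks), len(blocks[0]))
```

Shuffling each block separately keeps the five-per-type balance within every block while still giving each block, and each participant, a different order.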

Data analysis

The data were processed and analyzed as in Experiments 1 and 2, except as follows. The time series for each trial represented a total duration of 14.88 s and consisted of 31 time points. For each participant, we examined the effect of prior Δƒ on perception during the test. Trials were therefore sorted and averaged together within each Δƒ/A-tone-frequency condition. The average proportions of hearing two streams during the test period for different trial types were entered into a repeated measures ANOVA, with prior Δƒ (three, five, or seven semitones) and the A tone frequency (300 or 500 Hz) as within-subjects factors.

Within the same data, we also examined the effect of prior perception on perception during the test. Trials were therefore sorted and averaged together within each Δƒ/A-tone-frequency condition and according to which perception was reported at the end of the context, resulting in 12 conditions (i.e., 2 A-tone frequencies × 3 prior Δƒs × 2 prior percepts). The average proportions of hearing two streams during the test period for different trial types were entered into a repeated measures ANOVA, with the prior Δƒ, the A-tone frequency, and the prior percept (one or two streams) as within-subjects factors. Three participants were excluded from this analysis because they never perceived a particular context type as two streams and, therefore, could not be included in a repeated measures analysis (A tone = 300 Hz, three-semitone prior Δƒ: 1 participant; A tone = 500 Hz, three-semitone prior Δƒ: 2 participants).
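
The sorting step and the exclusion rule described above can be sketched as follows. This is an illustration under assumptions, not the original analysis code: the trial record layout (A-tone frequency, prior Δƒ, prior percept, proportion of two-stream reports during the test) and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import product

# All 12 cells: 2 A-tone frequencies x 3 prior delta-f values x 2 prior percepts.
ALL_CELLS = set(product((300, 500), (3, 5, 7), (1, 2)))


def cell_means(trials):
    """Average the test-period proportion of 'two streams' within each cell."""
    by_cell = defaultdict(list)
    for a_freq, prior_df, prior_percept, prop_two_streams in trials:
        by_cell[(a_freq, prior_df, prior_percept)].append(prop_two_streams)
    return {cell: sum(vals) / len(vals) for cell, vals in by_cell.items()}


def has_complete_design(means):
    """A participant enters the repeated measures ANOVA only if no cell is empty."""
    return ALL_CELLS.issubset(means)


# Made-up example: this participant never reported two streams after the
# 300-Hz, three-semitone context, so that cell is missing.
example_trials = [(a, df, p, 0.5) for (a, df, p) in ALL_CELLS if (a, df, p) != (300, 3, 2)]
print(has_complete_design(cell_means(example_trials)))
```

A participant with any empty cell has no value to contribute for that condition, which is why such participants cannot be included in a fully within-subjects analysis.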

Results and discussion

Figure 4 shows the average proportion of time the participants reported the “two streams” percept, averaged across all participants during Experiment 4. Perceptual judgments during the test were influenced by Δƒ during the context, with larger prior Δƒ leading to less streaming, as compared with when a smaller Δƒ was presented, F(2, 40) = 9.71, p < .025, ηp² = .327. This occurred even though the difference in Δƒ between the context and test was smaller (i.e., only two semitones) than in the previous studies, in which differences in Δƒ between the context and test were at least three semitones. There was also a significant effect of A-tone frequency, with more perception of two streams when the A tone was 500 Hz, F(1, 20) = 13.51, p < .025, ηp² = .403, consistent with the previous finding that the Δƒ below which it is not possible to hear two streams (the so-called fission boundary; Van Noorden, 1975) is smaller in higher frequency ranges (Rose & Moore, 2000). However, there was no interaction between these two factors, F(2, 40) = 2.16, p = .141, ηp² = .097.

Fig. 4

Proportion of time the test sequences were heard as two streams (streaming) for Experiment 4. The contrastive effect of context Δƒ (three, five, or seven semitones) can be seen during the test period when the A tone was 300 Hz (top panel) or 500 Hz (bottom panel). The facilitative effect of context perception (one stream or two streams) can be seen by comparing the light and dark bars. The general pattern occurred regardless of the context Δƒ, although it was largest when the context and test both had a Δƒ of five semitones (middle set of bars). Within-subjects error is based on the pooled and scaled SEMpairedDiff for all trial type pairs (Franz & Loftus, 2012)

As is shown in Fig. 4, perceptual judgments during the test were highly influenced by perception during the context, with perceiving two streams during the context leading to more streaming during the test, as compared with when one stream was perceived during the context, F(1, 17) = 50.95, p < .001, ηp² = .750, consistent with previous studies (Snyder, Carter, et al., 2009; Snyder et al., 2008; Snyder, Holder, et al., 2009). As in the previous ANOVA, there was a significant effect of prior Δƒ, F(2, 34) = 14.92, p < .001, ηp² = .467, and a marginal effect of A-tone frequency, F(1, 17) = 3.83, p = .067, ηp² = .184. Additionally, there was an interaction between prior perception and prior Δƒ, F(2, 34) = 29.96, p < .001, ηp² = .638, due to a larger effect of prior perception when the prior Δƒ was five or seven semitones, as compared with when the prior Δƒ was three semitones. Importantly, however, there was no indication that the effect of prior perception became contrastive when the Δƒ of the context and test were not both five semitones. Thus, it appears that the facilitative nature of the prior-perception effect is a robust feature of the phenomenon and can be observed in the same data set as the contrastive effect of prior Δƒ, strongly supporting the assumption that there are, indeed, two separate context effects.

General discussion

The experiments described above clearly replicate previous findings that a smaller prior Δƒ and prior perception of two streams make perceptual judgments of two streams more likely during the test pattern (Snyder, Carter et al., 2009; Snyder et al., 2008; Snyder, Holder, et al., 2009; Snyder & Weintraub, 2011). The results also replicated the finding that the effect of prior Δƒ declines over time (Snyder et al., 2008), as shown by a main effect of the silent duration between the context periods and the test. One novel result from Experiment 2 was that the effect of prior perception also becomes smaller with a longer silent duration. The most important findings, however, were that in addition to evidence for passive loss of the memory underlying these context effects, we also showed evidence for a long-lasting memory for both context effects (Experiments 1 and 2). We now discuss the implications of these findings in greater detail—specifically, how they update our understanding of auditory memory for sound patterns generally and for perceptual context effects in particular.

Relation to general models of auditory memory

As was described earlier, studies have questioned whether delay-related auditory short-term memory declines can be attributed solely to passive loss over time by providing evidence that interference due to stimuli from previous trials can weaken otherwise persistent memory (Cowan et al., 1997; Ruusuvirta et al., 2008). More recently, McKeown and colleagues have also demonstrated that when a pair of sequentially presented stimuli are discriminated, the pattern of interference due to previous and intervening stimuli is most consistent with a model that includes feature-updating mechanisms of persistent memories (McKeown & Wellsted, 2009; Mercer & McKeown, 2010a, 2010b; see also Näätänen & Winkler, 1999). Importantly, the present data on streaming context effects support the notion that auditory memory for sound patterns consists of information that is persistent over time (Experiments 1 and 2) and extend this notion to implicit memory effects. However, the data of Experiments 1–3 showed only minimal evidence for interfering effects of intervening stimuli (e.g., the moderating effect of context2 on the context1 effects). It is possible that the observed influence of context1 on perception during the test was lessened because context2 already biased perception of the test toward the extremes (i.e., proportions of “two streams” responses close to 0 or 1), where results are likely to be less influenced by prior stimuli and prior percepts because of floor and ceiling effects, which is consistent with our findings for the prior Δƒ effect (Snyder et al., 2008). Thus, interfering stimuli that do not have similar effects on perception as the context stimulus of interest may be more appropriate for examining interference of streaming context effects, as compared with the biased ABA- sequences we used here.

Researchers often assume that when loss of information over time is observed, this is due to a particular type of passive loss called decay, which is characterized by a gradual decline in the memory over time (Cowan, 2008). However, another possibility is that loss of information can occur more abruptly, called sudden death, a phenomenon that has been found to occur in visual memory for color and shape (Zhang & Luck, 2009). To our knowledge, this possibility has not been explored in the auditory domain, although it could explain information loss observed in the literature, including the present findings. Future studies should therefore design auditory memory experiments in such a manner that gradual and sudden information loss can be independently estimated, which requires participants to have the opportunity to match remembered stimuli using a continuous range of responses (cf. Zhang & Luck, 2009), as opposed to more commonly used forced choice responses. Auditory qualities such as pitch, loudness, duration, and timbre would potentially work in such experiments to determine whether auditory memory is lost gradually or suddenly.

The present data are consistent with a wide range of studies demonstrating robust implicit memory effects. For example, research on statistical learning has shown that passive exposure to sequences of speech and tone stimuli in which some sounds are more likely to be followed by other particular sounds can result in listeners being able to later recognize the more probable sound sequences (e.g., Conway & Christiansen, 2005; Saffran, Aslin, & Newport, 1996; Saffran, Johnson, Aslin, & Newport, 1999). Similarly, complex sounds that were repeatedly presented on multiple trials resulted in them being much easier to discriminate, as compared with similar sounds that were not repeated across trials (Agus, Thorpe, & Pressnitzer, 2010). There is also evidence for robust implicit-memory effects in word priming that showed a lack of decline in memory strength over time as long as the priming stimulus was made supraliminal; when the priming stimulus was subliminally presented, priming still occurred but was weaker and declined further in strength over the course of a second (Dupoux, de Gardelle, & Kouider, 2008). There is also reason to believe that similar stores are used for short- and long-term memory (Cowan, 2008; Ranganath & Blumenfeld, 2005).

Finally, there is recent evidence for robust short-term auditory implicit memory. Specifically, when a “target” pure tone is presented simultaneously with masking pure tones, followed by a pure tone of a slightly different frequency as compared with the critical tone, listeners are able to judge the direction of frequency change, despite the fact that the “target” tone is not detected as a separate entity (Demany, Pressnitzer, & Semal, 2009; Demany & Ramos, 2005; Demany, Semal, & Pressnitzer, 2010b). Although performance on this task declines when there is a larger time interval between the to-be-discriminated tones and when there is a larger number of masking tones, these two factors do not interact in such a way that memory for the first tone declines precipitously for more complex sounds (Demany, Semal, Cazalets, & Pressnitzer, 2010a; Demany, Trost, Serman, & Semal, 2008), as is the case with vision (Phillips, 1974). An important limitation to these studies, however, is that the memory system recruited for detection of frequency shifts might be different from the memory system(s) involved in implicit or explicit comparisons between complex auditory scenes (including ABA- patterns) (Demany, Semal, et al., 2010b; for further discussion, see Snyder & Gregg, 2011).

Relation to context effects in vision

The study of implicit context effects, such as the ones examined here, has a long tradition in experimental psychology, especially in visual psychophysics research. For example, contrastive aftereffects from staring at moving or tilted patterns are fairly well accepted to result from adaptation in cortical sensory areas that have neurons that are tuned to motion and orientation, respectively (e.g., Addams, 1834; Gibson & Radner, 1937; for reviews, see Clifford, 2002; Kohn, 2007). The effects of prior perceptual interpretations have also been studied recently by vision scientists (e.g., Leopold, Wilke, Maier, & Logothetis, 2002; for a review, see Pearson & Brascamp, 2008), with studies showing that the underlying memory encodes stimulus details associated with the percept (e.g., Chen & He, 2004). As with the streaming context effects studied here, the strength of visual context effects also becomes smaller as the interval between context and test patterns is increased. For example, in one study, a context pattern consisting of a square-wave grating that was moving unambiguously either to the left or to the right was shown for a brief duration, followed by a variable duration blank screen and, finally, a test grating that was ambiguously moving to the left or right (Kanai & Verstraten, 2005). The largest contrastive effect of the context pattern (i.e., perceiving the test direction as opposite to the context direction) occurred when the blank screen duration was 120 ms and became smaller with longer blank screens, with the effect declining completely after 2,000 ms. When the context and test pattern were both ambiguous, the observed context effect was facilitative (i.e., the perceived context direction tended to match the perceived test direction, even though both were ambiguous); however, this context effect became larger as the blank duration increased, suggesting an increasing (as opposed to declining or static) strength of the perceptual memory.
Notably, this pattern of results in visual motion perception differs from the streaming effects observed here because (1) we found that the effects of both prior stimuli and prior percepts underwent loss, with no evidence of increasing effects of prior perception over time, and (2) we found evidence for a second persistent effect of both prior stimuli and prior perception. Finally, our Experiment 4 results, which suggest that the effects of prior Δƒ and prior perception reflect distinct effects, are consistent with recent fMRI evidence suggesting that contrastive effects of prior stimulus and facilitative effects of prior perception on perception of ambiguous visual stimuli map onto anatomically distinct cortical networks (Schwiedrzik et al., in press).

Relation to context effects in hearing

It is unclear whether the time course of streaming context effects represents a general trend across different auditory context effects or whether different patterns of behavior would be observed for different stimuli and tasks, such as those mentioned in the introduction. However, the time course of streaming buildup provides an interesting comparison with the time course of the streaming context effects assessed in the present study. As with the context effects, streaming buildup also shows decreasing strength with longer temporal delays between ABA- patterns (Beauvois & Meddis, 1997; Bregman, 1978). However, these previous studies did not determine whether buildup had declined down to a baseline level, because they did not include a control condition with a silent context. For example, in one of these studies, a repeating A-A-A-. . . pattern was followed with a variable delay by a repeating ABAB. . . test pattern (Beauvois & Meddis, 1997); but because no silent context pattern was included, it was not possible to determine whether the observed decline of buildup was complete. Thus, it would be important for future studies to include such a silent-context control to determine whether streaming buildup also has a persistent component, as we believe is characteristic of the streaming context effects studied here.

Similar memory underlying streaming context effects?

The effects of prior stimuli and prior perception showed similar patterns of change with increased delay between context2 and the test period and between context1 and the test period. Future studies should evaluate these patterns with a finer sampling of this range of delays (up to about 17 s) to determine whether the effects of prior stimuli and prior perception truly have the same pattern of memory loss, which might result from being different features of a common memory or set of memories. In such a scenario, the memories for previous auditory patterns might be multidimensional, containing information not only about the stimulus characteristics of prior auditory patterns, but also about the perceptual interpretation that resulted from them. Such a conclusion would be consistent with evidence in the visual domain. In particular, several studies have shown that when an otherwise continuously presented bistable visual pattern is occasionally interrupted, the extent to which the same perceptual interpretation is retained across interruptions is greater when certain stimulus features are also more similar across interruptions (Chen & He, 2004; Maier, Wilke, Logothetis, & Leopold, 2003; Pearson & Clifford, 2004).

If the same memories do indeed contain information about the prior stimulus and the prior percept, manipulating the physical similarity between the context and test patterns should equally disrupt the prior stimulus and prior perception effects. Previously, we found only minimal disruption of the prior Δƒ effect when the context and test patterns were played in different frequency ranges (Snyder, Carter et al., 2009). However, when the context and test patterns had different rhythms (e.g., ABA- vs. AB--), the effect was disrupted to a greater extent (Snyder & Weintraub, 2011). Thus, similar effects of changing the frequency range and the rhythm from context to test should be observed when examining the effect of prior perception.

Summary and conclusions

The present study used streaming context effects to shed light on auditory implicit memory for sound patterns. The evidence clearly demonstrated that implicit memories for prior ABA- patterns (or at least the extent to which they influence later judgments) decline over the first few seconds but then show persistence for at least several seconds. The data could therefore be explained by the existence of multiple memories for prior ABA- patterns (some of which decline and some of which do not) or a single memory that initially declines but then stabilizes at a level that is different from baseline. Furthermore, evidence was provided that memory for repeated ABA- patterns contains information about both stimulus characteristics (Δƒ) and perceptual interpretations (one vs. two streams) and is able to influence subsequent perceptual reports of ABA- patterns. Given the fact that the effects of prior Δƒ and prior perception act in opposite directions, one possibility is that a common memory for prior ABA- patterns is used by two distinct mechanisms, which act in contrastive and facilitative manners, respectively (e.g., Kinchla & Smyzer, 1967). However, it is still possible that different memories underlie the two context effects.