Distraction by unexpected sounds

Stimulus predictability is a powerful factor capable of shaping behavioral performance. Predictable target stimuli help us act faster and more efficiently, while unpredictable distractors are especially prone to disrupt ongoing performance. The predictability of target and distractor stimuli has typically been the object of separate and independent lines of research, however. Here, we seek to answer the following question: Do unexpected distractors obligatorily affect ongoing task performance independently of the predictability of the target stimuli, or is distraction by unexpected stimuli reduced when upcoming stimuli and actions are predictable?

Task-irrelevant sounds unexpectedly deviating from an otherwise repeated sequence induce an orienting response marked by specific electrophysiological responses (Berti & Schröger, 2001; Escera, Alho, Winkler, & Näätänen, 1998; Schröger, 1996, 2005) and behavioral distraction (e.g., Pacheco-Unguetti & Parmentier, 2014; Parmentier, Elford, Escera, Andrés, & Miguel, 2008). This is typically studied in simple categorization tasks (e.g., judging the parity of visual digits) involving task-irrelevant sounds (e.g., Parmentier, Turner, & Perez, 2014; Parmentier, Vasilev, & Andrés, 2018).

Behaviorally, unexpected sounds lengthen response times to target stimuli in ongoing tasks due to the involuntary shift of attention to and from the unexpected sound (Parmentier et al., 2008; Schröger, 1996) and the transient inhibition of motor actions (Parmentier, 2016; Vasilev, Parmentier, Angele, & Kirkby, 2019; Wessel, 2017; Wessel & Aron, 2013; Wessel & Huber, 2019). Unexpected sounds distract because they violate the cognitive system’s predictions (Parmentier, Elsley, Andrés, & Barceló, 2011; Schröger, Bendixen, Trujillo-Barreto, & Roeber, 2007). Consequently, distraction lessens or disappears when unexpected sounds are made predictable (e.g., Horváth & Bendixen, 2012; Parmentier & Hebrero, 2013; Sussman, Winkler, & Schröger, 2003).

Sequence learning

Participants responding to repeated sequences of stimuli do so increasingly fast, indicative of sequence learning, even though participants are typically unaware of it. This can be observed in the serial reaction time (SRT) task (e.g., Nissen & Bullemer, 1987; Robertson, 2007). Typically, this task requires participants to respond to a sequence of visual locations by pressing spatially corresponding keys. Unbeknownst to participants, the sequence follows a repeating pattern. Learning is measured by the progressive shortening of response times (RTs) as the sequence is repeated, and by their sharp increase upon the introduction of a modified or new sequence (Cohen, Ivry, & Keele, 1990; Reed & Johnson, 1994).

Implicit sequence learning is reduced when participants perform a concurrent auditory secondary task (Shanks & Channon, 2002; Wierzchon, Gaillard, Asanowicz, & Cleeremans, 2012). This may be due to competition for attention (Nissen & Bullemer, 1987), the integration of the visual and auditory stimuli into an unstructured sequence of events (Heuer & Schmidtke, 1996), or interference with the expression of learning (Frensch, Wenke, & Rünger, 1999). Importantly, in contrast to the oddball paradigm, past SRT studies using auditory distractors required participants to attend and respond to the auditory stimuli, and did not use sounds capturing attention by deviating from a structured sequence. Our study addresses this issue.

The present study

While evidence indicates that distraction by unexpected sounds is reduced or eliminated when these are made predictable, the role of stimulus–response predictability on novelty distraction remains unknown. On the one hand, the orienting response to unexpected stimuli is often considered to be an obligatory sensory phenomenon (Escera et al., 1998; Schröger, 1996; Sokolov, 1963), and its behavioral consequences have been shown to occur in a range of tasks (see Parmentier, 2014, for a review). On the other hand, some evidence suggests that novelty distraction might be subject to top-down control, as suggested by the reduction of distraction when unexpected sounds are announced by explicit visual cues (Horváth, Roeber, Bendixen, & Schröger, 2008; Sussman et al., 2003). Importantly, in past cross-modal oddball studies, auditory distractors were presented ahead of target processing and response preparation. Hence, it remains unknown whether novel sounds yield distraction when stimuli and responses are predictable ahead of the distractors’ presentation. To examine this issue, we used an SRT task to induce learning of a sequence of visual stimuli and then measured the effect of novel sounds on response times for the learned versus a new sequence. If novelty distraction is contingent upon target and response uncertainty, or if the cognitive system prioritizes response over the orienting to novel sounds, then novelty distraction should be reduced for the learned relative to the new sequence. In contrast, if novel sounds capture attention in an obligatory fashion, then equal levels of distraction should be observed for learned and new sequences.

Method

Participants

Fifty-two undergraduate students (12 males, 40 females), between the ages of 18 and 30 years (M = 21.15 years, SD = 2.62 years) took part in this experiment in exchange for a small honorarium. All participants reported normal hearing and normal or corrected-to-normal vision. Four participants were left-handed. All were undergraduate students from the University of the Balearic Islands and gave their written consent to take part in the study, which was approved by the Bioethics Committee of the University of the Balearic Islands. Since there is no prior work examining the effect of target predictability on novelty distraction, we based our sample size calculation on the assumption that if novelty distraction is contingent upon target stimuli uncertainty, making these stimuli predictable through sequence learning should have a medium to large impact on novelty distraction. Under such premise, we hypothesized an effect size of dz = 0.5 for the most relevant test in our study (the comparison of novelty distraction for a learned and a new sequence). For this effect size, a Type I error of .05 and a power of .95, the minimum sample size is 34.

Materials and procedure

We present a schematic illustration of the participants’ task and experimental design in Fig. 1. The SRT task consisted of 15 blocks of 96 trials each. In each trial, one of four locations (marked by black boxes framed in a white border and arranged horizontally in the middle of the screen) turned blue. The participant’s task was to press one of four response keys (C, V, B, N, using index and middle fingers of both hands) in accordance with the highlighted location. As soon as the participant pressed the correct key, this location turned black again, and 300 ms passed before the next trial began. On wrong responses, the location’s frame turned red, and the participant was required to respond again until pressing the correct key. At the end of each block, the mean response time and percentage of correct responses were displayed on the screen, together with a comparison with performance on the previous block.

Fig. 1

Schematic of the serial reaction time task and experimental design. In each trial of the serial reaction time task, one of four visual boxes was highlighted, and participants pressed the corresponding key. Blocks 1–11 were presented in the absence of auditory stimuli. In Block 1 (warm-up), the sequence of locations was quasirandom. In Blocks 2–8 and 10–11, a second-order conditional sequence (SOC1) was looped eight times (starting from a different random point in every block). In Block 9, a new second-order conditional sequence was introduced (SOC2). In Blocks 12–15, auditory distractors were introduced. In each of these blocks, 80/96 (83.3%) trials involved the presentation of the standard sound (STD). In the remaining 16/96 (16.7%) trials, novel sounds were presented (NOV). For half the participants (Group 1), SOC1 was presented in Blocks 12 & 13, whereas SOC2 was presented in Blocks 14 & 15 (this order was reversed for Group 2). In Block 16, all participants performed a production task (in the absence of auditory distractors), in which they were instructed to explicitly predict the next visual location (PROD).

The crucial manipulation related to the sequencing of the locations and the presence or absence of auditory distractors. Block 1 allowed participants to warm up (and was therefore not included in the analysis) using a quasirandom sequence of locations: 96 trials were presented, with each location (1–4) presented equally often, no immediate repetitions, and no trills (e.g., 121). Blocks 2 to 15 used two second-order conditional (SOC) sequences of 12 locations looped eight times. In second-order conditional sequences, each location is followed equally often by each of the remaining locations, and locations are dependent on the target location of the previous two trials (e.g., 213243142341). For each participant, two second-order conditional sequences (hereafter referred to as SOC1 and SOC2) were picked among a set of possible sequences, such that these two sequences shared no triplet (e.g., 213243142341, 231241342143), and did not contain trills (e.g., 232) or runs (e.g., 123 or 321). SOC1, the sequence to be learned, was presented in Blocks 2 to 8, and 10 to 11, starting at a randomly picked position in each block. In Block 9, participants were presented with SOC2 instead of SOC1.
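
The defining properties of second-order conditional sequences can be illustrated with a short Python sketch. It is not the software used to build the experimental sequences; the function names are hypothetical, and the sketch only checks the pair and triplet properties of a looped sequence (it does not screen for trills or runs).

```python
from collections import Counter


def cyclic_ngrams(seq, n):
    """Return all n-grams of a looped sequence, wrapping around its end."""
    doubled = seq + seq[: n - 1]
    return [doubled[i : i + n] for i in range(len(seq))]


def is_second_order_conditional(seq, locations="1234"):
    """Check that every ordered pair of distinct locations occurs exactly once.

    When this holds, each location is followed equally often by each of the
    other locations, and the next location is fully determined by the two
    preceding ones.
    """
    pairs = Counter(cyclic_ngrams(seq, 2))
    expected = {a + b for a in locations for b in locations if a != b}
    return set(pairs) == expected and all(count == 1 for count in pairs.values())


def shared_triplets(seq_a, seq_b):
    """Return the triplets that two looped sequences have in common."""
    return set(cyclic_ngrams(seq_a, 3)) & set(cyclic_ngrams(seq_b, 3))


# The two example sequences quoted in the text.
soc1, soc2 = "213243142341", "231241342143"
print(is_second_order_conditional(soc1), is_second_order_conditional(soc2))
print(shared_triplets(soc1, soc2))
```

Treating the sequence as cyclic reflects the fact that it was looped within each block and could start at any position.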

In Blocks 12–15, short auditory stimuli were introduced immediately before each visual location. In each block, a 650 Hz sinewave tone (with 10-ms fade-in and fade-out) was used in 80 of the 96 trials (this sound is hereafter referred to as the standard sound). In the remaining trials, the novel auditory stimuli were short environmental sounds (e.g., hammer, telephone ring, drill, rain) randomly selected without replacement from a set of 100 sounds adapted from Andrés, Parmentier, and Escera (2006). The distribution of the novel trials across the block was quasirandom, with the following constraints: Each block started with at least five standard trials, novel trials were separated by two or more standard trials, and each successive group of 24 trials contained four novel trials. All auditory stimuli lasted 150 ms, were normalized, and were presented binaurally through headphones (approx. intensity of 70 dB SPL). The sounds’ offset coincided with the target location’s onset. For half the participants, SOC1 was used in Blocks 12–13 and SOC2 in Blocks 14–15 (and vice versa for the remaining participants).
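
One way to satisfy the constraints on the placement of novel trials is simple rejection sampling, sketched below in Python. This is an assumption about how such a schedule could be generated, not a description of the original experiment software; the function and parameter names are hypothetical, and trial positions are 0-based.

```python
import random


def novel_trial_positions(n_trials=96, group_size=24, novels_per_group=4,
                          min_lead_in=5, min_gap=2, rng=None):
    """Draw 0-based positions of novel trials within one block.

    Candidate schedules are redrawn until the block starts with at least
    `min_lead_in` standard trials and consecutive novel trials are separated
    by at least `min_gap` standard trials.
    """
    rng = rng or random.Random()
    while True:
        positions = []
        # Four novel trials in each successive group of 24 trials.
        for start in range(0, n_trials, group_size):
            positions.extend(
                rng.sample(range(start, start + group_size), novels_per_group)
            )
        positions.sort()
        starts_with_standards = positions[0] >= min_lead_in
        well_separated = all(
            later - earlier > min_gap
            for earlier, later in zip(positions, positions[1:])
        )
        if starts_with_standards and well_separated:
            return positions


schedule = novel_trial_positions(rng=random.Random(2019))
print(schedule)
```

Rejection sampling is inefficient but easy to verify: every returned schedule satisfies all constraints by construction.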

In Block 16, participants completed a production task using the same stimuli and keys as in the SRT task, but under new instructions: Participants were asked to explicitly predict the next location. The sequence used was SOC1, starting at a random position and looped 12 times.

Data analysis

Together with frequentist statistical tests, we report the Bayes factor (BF10) to assess the credibility of the experimental hypothesis relative to that of the null hypothesis given the data. Values below 1/3 constitute strong support for the null effect, and values above 3 provide strong support for the experimental hypothesis (e.g., Jeffreys, 1961). Effect sizes are reported as \( {\eta}_p^2 \) for F tests, and as Cohen’s dz for dependent sample t tests (Lakens, 2013). The 95% confidence intervals displayed on figures were calculated following Jarmasz and Hollands (2009).

The analysis consisted of three steps. First, we carried out analyses to demonstrate the learning of SOC1 in the absence of auditory distractors. This was done (1) by measuring the linear regression slope of RTs across Blocks 2–8 for each individual participant and comparing them to zero using a one-tailed one-sample t test under the hypothesis of a negative slope; and (2) by comparing RTs for SOC1 (Blocks 8 & 10) relative to RTs for SOC2 (Block 9) under the hypothesis of an increase of RTs in Block 9 (t test for dependent samples). To evaluate learning beyond Block 11, we compared RTs in Block 11 and RTs in the standard trials of Blocks 12–15 using a t test for dependent samples. Second, we examined the impact of novel sounds for the learned (SOC1) and the new (SOC2) sequences (Blocks 12–15). We did this in two ways. We analyzed RTs as a function of the sequence (SOC1 vs. SOC2) and the type of sound (standard, novel) using an ANOVA for repeated measures. To compare further the degree of deviance distraction between SOC1 and SOC2, we also conducted an equivalence test (Lakens, Scheel, & Isager, 2018) and Bayesian estimation (Kruschke, 2013, 2015; Kruschke & Liddell, 2018). As a further test of the relationship between SOC1 learning and the susceptibility of SOC1 to distraction by novel sounds, we computed two correlation coefficients: one between the learning slope (Blocks 2–8) and distraction for SOC1 (RTnovel − RTstandard), and another comparing the RT difference between Block 9 (SOC2) and Blocks 8 and 10 (SOC1) to distraction for novel sounds for SOC1. Third, to assess explicit knowledge of SOC1, we conducted a one-sample t test to compare the participants’ mean proportion of correct predictions (Block 16) to chance (1/3, since a location was never repeated in immediate succession across the experiment). The analysis was conducted using JASP (JASP Team, 2019) and R (R Development Core Team, 2019).
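
The first step of the analysis (individual learning slopes tested against zero) can be sketched in Python as follows. The analyses reported here were run in JASP and R; the code below is only an illustration with hypothetical function names and made-up RT values, and it returns the t statistic and dz without computing a p value.

```python
import math
import statistics


def rt_slope(mean_rts_by_block, first_block=2):
    """Least-squares slope (ms per block) of mean RT against block number."""
    blocks = [first_block + i for i in range(len(mean_rts_by_block))]
    mean_block = statistics.fmean(blocks)
    mean_rt = statistics.fmean(mean_rts_by_block)
    sxy = sum((b - mean_block) * (rt - mean_rt)
              for b, rt in zip(blocks, mean_rts_by_block))
    sxx = sum((b - mean_block) ** 2 for b in blocks)
    return sxy / sxx


def one_sample_t(values, mu=0.0):
    """Return (t, dz, df) for a one-sample t test of `values` against `mu`."""
    n = len(values)
    dz = (statistics.fmean(values) - mu) / statistics.stdev(values)
    return dz * math.sqrt(n), dz, n - 1


# Hypothetical mean RTs (ms) in Blocks 2-8 for three participants.
participants = [
    [510, 500, 495, 490, 480, 478, 470],
    [450, 452, 445, 440, 441, 435, 430],
    [600, 590, 592, 580, 575, 570, 572],
]
slopes = [rt_slope(rts) for rts in participants]
print(one_sample_t(slopes))
```

A negative mean slope indicates that RTs shortened across blocks, which is the pattern expected under sequence learning.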

Results

Learning of SOC1

Linear regression slopes were calculated across Blocks 2–8 for all participants. The mean slope was negative (M = −3.410, SD = 8.092) and significantly different from zero: t(51) = 3.039, p = .002, dz = 0.421 (95% CI of dz: 0.181 to infinity), BF10 = 17.386. RTs were significantly longer for SOC2 (Block 9) than for SOC1 in Blocks 8 and 10 combined: t(51) = 8.128, p < .001, dz = 1.127 (95% CI of dz: 0.830 to infinity), BF10 = 2.186 × \( 10^{8} \). In sum, RTs for SOC1 decreased significantly across Blocks 2–8 and increased sharply for SOC2 (see Fig. 2), providing solid evidence of sequence learning for SOC1.
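
As a consistency check on these values, the test statistic and effect size of a one-sample or dependent-samples t test are linked through the sample size (n = 52). For the learning slopes:

\[ |t(51)| = \frac{|M|}{SD/\sqrt{n}} = \frac{3.410}{8.092/\sqrt{52}} \approx 3.039, \qquad d_z = \frac{|M|}{SD} = \frac{3.410}{8.092} \approx 0.421 \]

and for the comparison of Block 9 with Blocks 8 and 10:

\[ d_z = \frac{t}{\sqrt{n}} = \frac{8.128}{\sqrt{52}} \approx 1.127 \]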

Fig. 2

Response times (RTs) for correct responses across Blocks 1 to 11. Block 1 consisted of a warm-up random sequence of locations (RND). In Blocks 2–8 and 10–11, participants were presented with the looped presentation of SOC1. In Block 9, a new sequence (SOC2) was looped. Error bars represent the 95% confidence interval based on the main effect of block (Jarmasz & Hollands, 2009)

The comparison of RTs in Block 11 and in Blocks 12–15 (standard trials) revealed a significant shortening of RTs in the latter, thereby indicating that learning of SOC1 continued after the introduction of the auditory distractors: t(51) = 2.823, p = .007, dz = 0.392 (95% CI of dz: 0.108 to 0.672), BF10 = 5.192.

Effect of novel versus standard sounds on learned (SOC1) versus new (SOC2) sequences

The 2 (SOC1 vs. SOC2) × 2 (standard vs. novel) ANOVA conducted on RTs in Blocks 12–15 revealed a main effect of sequence (faster responses for SOC1): F(1, 51) = 48.221, MSE = 1278.394, p < .001, \( {\eta}_p^2 \) = .491, BF10 = 3.659 × \( 10^{15} \). Novel sounds yielded longer RTs than the standard sound: F(1, 51) = 23.356, MSE = 258.972, p < .001, \( {\eta}_p^2 \) = .314, BF10 = 2.545. Importantly, novelty distraction was equivalent for SOC1 and SOC2 (see Fig. 3): F(1, 51) = 1.031, MSE = 211.055, p = .315, \( {\eta}_p^2 \) = .020, BF10 = 0.233. Hence, there was no evidence of a difference in deviance distraction between SOC1 and SOC2, and the null effect was supported by the Bayes factor. Nevertheless, because this null effect is of key importance for our conclusions, we sought to test it further. To do so, we compared distraction (RTnovel − RTstandard) for SOC1 and SOC2 using two additional techniques: equivalence testing (Lakens et al., 2018) and Bayesian estimation (Kruschke, 2013, 2015). Both were carried out in R (R Development Core Team, 2019) using the TOST (Lakens, 2018) and BEST (Kruschke & Meredith, 2018) packages, respectively.

Fig. 3

Response times (RTs) for the learned (SOC1) and new (SOC2) sequences as a function of the type of auditory distractor (novel versus standard). Error bars represent the 95% confidence interval based on the interaction term of the 2 (SOC1 vs. SOC2) × 2 (novel vs. standard) ANOVA for repeated measures (Jarmasz & Hollands, 2009)

Equivalence testing consists in using two one-sided tests to determine whether an effect is equivalent to the null effect and reject the presence of a smallest effect size of interest. This technique requires researchers to specify boundary effect size values against which to test their data. In the absence of any existing study comparable to ours on which to base this effect size, we opted for the second recommended approach, which consists in calculating the smallest effect size detectable with a power of .95 given our sample size (Lakens et al., 2018). Given our sample size, this effect size parameter (dz) is 0.462. The equivalence test was significant, t(51) = 2.319, p = .012, given equivalence bounds (on a raw scale) of −13.435 and 13.435 ms. In line with our previous analysis, the null hypothesis test result was not significant, t(51) = −1.015, p = .315. In other words, the difference in deviance distraction between SOC1 and SOC2 was not statistically different from zero and was statistically equivalent to zero, supporting the null effect.
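
The logic of the two one-sided tests can be sketched in a few lines of Python. This is a simplified illustration, not the TOST package used for the reported analysis: the function name and the example data are hypothetical, and the critical t value must be supplied by the reader (for df = 51 and a one-sided alpha of .05 it is approximately 1.68).

```python
import math
import statistics


def tost_paired(differences, low_bound, high_bound, t_crit):
    """Two one-sided tests on a set of paired differences.

    Returns the t statistic against each equivalence bound and whether both
    one-sided tests exceed the critical value `t_crit` (a positive number).
    """
    n = len(differences)
    mean = statistics.fmean(differences)
    se = statistics.stdev(differences) / math.sqrt(n)
    t_lower = (mean - low_bound) / se    # H0: true difference <= low_bound
    t_upper = (mean - high_bound) / se   # H0: true difference >= high_bound
    equivalent = t_lower > t_crit and t_upper < -t_crit
    return t_lower, t_upper, equivalent


# Hypothetical differences in distraction (ms) for five participants;
# 2.132 is the one-sided critical t for df = 4 and alpha = .05.
diffs = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(tost_paired(diffs, -3.0, 3.0, t_crit=2.132))
```

Equivalence is declared only when the observed difference is significantly above the lower bound and significantly below the upper bound.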

Bayesian estimation takes a different approach to null hypothesis significance testing by calculating credible ranges of values for model parameters in the light of the empirical data (Kruschke, 2013; Kruschke & Liddell, 2018). The two key model parameters of interest here are the mean difference in distraction between SOC1 and SOC2 and the effect size. Using BEST’s minimally informative default priors (“so that the prior has minimal influence on the estimation, and the data dominate the Bayesian inference,” Kruschke, 2013, p. 576), Bayesian estimation reallocates credibility to the model’s parameters in a way that best accommodates the empirical data. The posterior distributions for the mean difference and effect size were approximated using the Markov chain Monte Carlo (MCMC) method, which generates a large sample of credible parameter values from the posterior distribution (using BEST’s default size for this sample, or chain length, of 100,000). Credible intervals were calculated in the form of 95% highest density intervals (95% HDI). Using this method, the point estimate for the mean difference in distraction between SOC1 and SOC2 (distractionSOC2 − distractionSOC1) was 3.510 ms (95% HDI: −4.719 to 11.565), and the point estimate of the effect size was 0.125 (95% HDI: −0.164 to 0.414). Of key importance, both 95% HDIs included the zero value, thereby providing strong support for the null effect. The Gelman–Rubin diagnostic value and effective sample size (ESS) for the mean difference estimate were 1 (confirming that convergence was reached) and 58,344 (i.e., superior to the recommended 10,000; Kruschke, 2015), respectively.
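
To illustrate what a highest density interval is, the following Python sketch computes one from a set of posterior samples by finding the narrowest window that contains the requested share of the samples. It is a generic illustration under that definition and not the routine implemented in the BEST package; the function name and the sample values are hypothetical.

```python
import math


def highest_density_interval(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = math.ceil(mass * n)  # number of samples inside the interval
    # Slide a window of that size over the sorted samples, keep the narrowest.
    best_start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[best_start], ordered[best_start + window - 1]


# Hypothetical, skewed set of samples: the 50% HDI sits where they are densest.
low, high = highest_density_interval([0, 1, 2, 3, 10, 20, 30, 40], mass=0.5)
print(low, high)
```

Unlike an equal-tailed interval, the HDI always covers the most credible parameter values, which matters when the posterior distribution is skewed.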

Altogether, the equivalent levels of deviance distraction for SOC1 and SOC2 received strong support from all three techniques we used (Bayes factor, equivalence testing, and Bayesian estimation).

As a final way to assess the relationship between target predictability and deviance distraction, we examined whether distraction varied as a function of the degree of prior knowledge of the target stimuli. The correlation between the learning slope for SOC1 across Blocks 2 to 8 and novelty distraction (RTnovel − RTstandard) for SOC1 was not significant: r = −.214, p = .936, BF10 = 0.072. The correlation between the cost of introducing SOC2 in Block 9 (relative to SOC1 in Blocks 8 and 10) and novelty distraction for SOC1 was not significant either: r = .127, p = .815, BF10 = 0.097. Hence, the evidence strongly indicates no relationship between SOC1 learning and its subsequent susceptibility to distraction by novel sounds.

Production task

The proportion of correct predictions in Block 16 (M = .357, SD = .164) was not significantly different from chance: t(51) = 0.993, p = .326, dz = 0.138 (95% CI of dz: −0.136 to 0.410), BF10 = 0.241.

Discussion

We examined whether the negative impact of novel sounds on an ongoing visual task was mediated by the predictability of the stimuli and responses in that task. Using an SRT task, we observed solid evidence of sequence learning and distraction by novel sounds. Importantly, however, novelty distraction was equivalent for learned and new sequences.

Our data suggest that novelty distraction is not mediated by the predictability (induced through implicit learning) of the sequence of target stimuli and responses. Instead, it appears to occur in an obligatory fashion (e.g., Schröger, 1996; Sokolov, 1963). Of course, the predictability of the stimuli does not dispense with their processing altogether (target stimuli must, at a minimum, be registered and compared with predictions). Nevertheless, target processing and response production are substantially facilitated when the stimuli are made predictable through implicit learning. In fact, the reduction of RTs for the repeated sequence across Blocks 2 to 11 was substantial. If we assume that the incompressible lower limit of RTs (accounting for physical limitations of the perceptual and efferent systems) is about 200 ms (Jain, Bansal, Kumar, & Singh, 2015; Jensen & Munro, 1979), the reduction of RTs at Block 11 represents around 21% of the variable component of RTs at Block 2 (corresponding to a large effect; dz = .830, 95% CI: 0.511 to 1.142). Combined with the strong evidence of sequence learning across Blocks 2 to 11, this observation strongly suggests that the absence of an interaction between sound type (novel vs. standard) and sequence (learned vs. new) in Blocks 12–15 cannot be attributed to a weak learning effect or to equivalent demands on stimulus processing for learned and new sequences. Instead, our data suggest that novel sounds disrupt ongoing cognitive processes, including the programming of a response or the maintenance of a programmed response plan. This may reflect the shift of attention to and away from the novel sounds (Parmentier et al., 2008) and/or a transient inhibition of motor plans (e.g., Wessel, 2017).

Some have suggested that randomly ordered auditory stimuli interspersed with the repetition of a visual sequence in the SRT task can disrupt sequence learning through the integration of both types of stimuli into a nonpredictable bimodal sequence (Schmidtke & Heuer, 1997), or that tones affect the expression of learning (Frensch et al., 1999). We think this is unlikely to account for our results, however. First, these propositions derive from studies in which, contrary to ours, participants were instructed to attend and respond to the auditory stimuli. When infrequent to-be-ignored tones are presented in the SRT task, learning does occur (Jiménez & Vázquez, 2005). In any case, we provided clear evidence of sequence learning before and after the introduction of the auditory distractors. Hence, we can safely rule out the possibility that the introduction of the auditory stimuli made our stimulus sequence equivalent to a new, unlearned sequence.

Whether learning is purely implicit or involves some explicit knowledge has generated much debate among implicit memory researchers (e.g., Destrebecqz & Cleeremans, 2001; Jiménez, Vaquero, & Lupiáñez, 2006; Rowland & Shanks, 2006; Shanks & Johnstone, 1999; Shanks, Rowland, & Ranger, 2005). Explicit sequence knowledge is notoriously difficult to measure. However, the results of our production task did not reveal any evidence of such knowledge. While we would not wish to make general claims, the data suggest that SOC1 was learned implicitly in our study. Importantly, however, this issue is inconsequential for our purpose: What mattered to us was that SOC1 be learned, irrespective of implicit/explicit underlying mechanisms.

In conclusion, our data suggest that, at least as assessed by means of the SRT task and induced implicitly, the predictability of target stimuli and responses does not shield the cognitive system from the detrimental effect of novel sounds. This contrasts with the finding that the predictability of the distractors does so (Parmentier et al., 2011; Schröger et al., 2007; Sussman et al., 2003). We conclude that the anticipation of target stimuli and responses does not shield participants from novelty distraction and that the latter is an obligatory attentional effect.