Introduction

Humans respond faster to visual stimuli when auditory stimuli are presented in close temporal proximity (e.g., Bernstein, 1970; Bernstein, Clark, & Edelstein, 1969; L. K. Morrell, 1967; Nickerson, 1973; Stoffels, Van Der Molen, & Keuss, 1985). This accessory stimulus effect occurs even when auditory stimuli are completely irrelevant to the task and provide no information about the response to be made (Bernstein, Chu, Briggs, & Schurman, 1973; Keuss, van der Zee, & van den Bree, 1990; Morrell, 1968; Stahl & Rammsayer, 2005). Despite a research tradition of several decades, the processes underlying the accessory stimulus effect are still hotly debated (e.g., Jepma, Wagenmakers, Band, & Nieuwenhuis, 2009; Tona, Murphy, Brown, & Nieuwenhuis, 2016). Two prominent accounts hold that accessory stimuli speed up the selection of responses (Hackley et al., 2009; cf. Hackley & Valle-Inclán, 1999) or accelerate sensory processing before decision making (Jepma et al., 2009; thereby impacting on subsequent processes of cognitive control, Nieuwenhuis & de Kleijn, 2013; Schneider, 2018; Weinbach & Henik, 2012). Faster response selection could stem from an increase of cortical arousal provoked by the accessory stimuli (e.g., Hackley et al., 2009). Faster sensory processing could likewise stem from an increase in arousal (as discussed by Ásgeirsson & Nieuwenhuis, 2017; Bundesen, Vangkilde, & Habekost, 2015; Petersen, Petersen, Bundesen, Vangkilde, & Habekost, 2017) or could be due to the more specific multisensory integration of the stimulus energies (Jepma et al., 2009).

For visual tasks, auditory accessory stimuli seem to be especially suited to elicit the described processing benefits. Auditory stimuli are generally processed faster than visual ones (Woodworth & Schlosberg, 1954). Thus, despite their temporal proximity, auditory accessory stimuli are processed before visual target stimuli and could prepare the cognitive system for the visual task in a stimulus-unspecific manner (Los & Van der Burg, 2013).

Stimulus-unspecific preparation processes have been extensively studied within the literature on phasic alertness (e.g., Fan, McCandliss, Fossella, Flombaum, & Posner, 2005; Fan, McCandliss, Sommer, Raz, & Posner, 2002; Petersen et al., 2017; Posner, 1978, 2008; Posner & Petersen, 1990). Phasic alertness refers to short-term increases of the brain’s general readiness for responding to external information (Posner, 1978; Posner & Petersen, 1990). Similar to accessory stimuli, alerting cues (e.g., brief tones) have been found to lower reaction times (RTs) to visual target stimuli (Callejas, Lupiáñez, & Tudela, 2004; Callejas, Lupiáñez, Funes, & Tudela, 2005; Hackley & Valle-Inclán, 1998; for a review, see Hackley, 2009).

In contrast to accessory stimuli, however, alerting cues often precede targets by several hundred milliseconds (Callejas et al., 2005; Callejas et al., 2004; although it should be noted that this distinction is not always made, e.g., Bernstein, 1970; Hackley & Valle-Inclán, 1999). Due to the apparent similarity of accessory stimulus and alerting effects, one might suppose that they reflect the same underlying processes. In this view, the two effects should, if at all, differ only quantitatively as a function of their differing temporal distance from the target stimuli. This view is compatible with the idea that alerting and accessory stimulation both increase the brain’s readiness for responding to external information (cf. Posner, 1978; Posner & Petersen, 1990; Tona et al., 2016). However, inconsistent with this view is a qualitative difference between alerting and accessory stimulation: Accessory stimuli are temporally close enough to targets for their multisensory integration into a unified percept, whereas this is not the case for alerting cues (see Diederich & Colonius, 2008; Slutsky & Recanzone, 2001). Indeed, accessory stimulus effects have prominently been explained as a result of the multisensory integration of stimulus energies (e.g., Bernstein, Rose, & Ashe, 1970; Jepma et al., 2009), but the different timing precludes this explanation for alerting effects. Therefore, it is still an open question whether or not alerting and accessory stimulation stem from the same underlying processes.

In classic studies (Callejas et al., 2005; Callejas et al., 2004; Fan et al., 2005; Fan et al., 2002; Hackley, 2009; Posner, 1978), alerting cues signaled the imminent appearance of target stimuli. Thus, alerting effects could have been driven by temporal expectations of target stimuli raised by the alerting cues (Los, Kruijne, & Meeter, 2014; Nobre, Correa, & Coull, 2007; Nobre & van Ede, 2017). However, this hypothesis was falsified by recent studies finding alerting effects even when the temporal expectation for targets after alerting cues was kept constant (Petersen et al., 2017; Weinbach & Henik, 2013; Weinbach & Henik, 2012). Specifically, this was achieved by drawing waiting times for targets after alerting cues (across trials) from non-aging probability distributions, so that the probability that a target appeared given it had not yet appeared was constant over time (Näätänen, 1971). Alerting effects might still reflect a form of preparation, but these findings argue that this preparation is not based on temporal expectation. Similar to some explanations of the accessory stimulus effect (Tona et al., 2016), it has been proposed that alerting effects (effects of warning signals) can result from brief surges of arousal that are triggered automatically (e.g., Hackley, 2009; Hackley et al., 2009).

Importantly, despite the partial overlap of explanations, it is still unclear whether and how phasic alertness and accessory stimulation work in concert to shape choice reaction performance. To address this question, the present study investigated how auditory alerting modulates the effects of subsequent accessory stimuli that accompany the targets of a visual choice reaction task. Results showed that accessory stimuli helped performance in the absence of alerting cues but impaired performance when alerting cues preceded the accessory and target stimulus (Experiment 1). This reversed accessory stimulus effect did not seem due to stimulus expectations regarding the combination of accessory stimuli and alerting cues (Experiment 2).

Methods

Participants

Nineteen paid participants performed Experiment 1. They were between 19 and 35 years old (median = 23 years), seven were male, 12 were female, 17 were right-handed, and two were left-handed. One additional participant had to be excluded from analysis because of an experimentation error. Twenty-six new paid participants performed Experiment 2. They were between 18 and 32 years old (median = 23.5 years), 12 were male, 14 were female, 21 were right-handed, four were left-handed, and one was ambidextrous.

All participants reported normal or corrected-to-normal vision and normal hearing, and gave written informed consent before participation. The experiment conformed to the ethical guidelines of the German Psychological Association (DGPs) and was approved by Bielefeld University’s ethics committee (2019-015).

Apparatus and stimuli

The experiments took place in a dark room. Visual stimuli were projected onto a screen (physical dimensions: 208.5 × 117 cm, center 172 cm above ground) using a PROPixx projector (Vpixx Technologies, Saint-Bruno, QC, Canada; front projection, 382 cm from the screen and 248 cm above ground), 180 cm from the participants, running at 120 Hz with a resolution of 1,920 × 1,080 px. Before experimentation, the projector was warmed up for at least 5 min (cf. Poth & Horstmann, 2017). Auditory stimuli were presented using loudspeakers (Philips Multimedia Speaker System A 1.2 Fun Power/MMS 101, Philips, Amsterdam, The Netherlands), placed 4 cm below and 45 cm to the left and right of screen center. Responses were collected using a button box whose two employed buttons were constantly illuminated (ResponsePixx, controlled by a PROPixx, Vpixx Technologies, Saint-Bruno, QC, Canada). The experiment was controlled using the Psychtoolbox3 extension (Kleiner, Brainard, & Pelli, 2007) for Matlab (The Mathworks, Natick, MA, USA), running on Ubuntu 14.04.5.

Luminance of visual stimuli was measured using an LS-110 luminance meter (Konica Minolta, Osaka, Japan). Visual stimuli were presented against a black background (0.03 cd/m²) and 7° (of visual angle) below the center of the elevated screen, so that they were about level with participants’ heads. A small gray square (0.2 × 0.2°, 10 cd/m²) was used as central fixation stimulus. The target stimulus was a larger white square (1 × 1°, 51 cd/m²), shown 5° to the left or right of screen center. Sound level of auditory stimuli was measured using a SLM01 sound level meter (Tacklife, Shenzhen Temie Technology, Shenzhen, China). In Experiments 1 and 2, the auditory alerting cues and accessory stimuli were identical sine tones with a frequency of 900 Hz, a sound level of 74 dB(A) SPL (against the 47 dB(A) SPL background noise of the projector), and a duration of 50 ms.

The timing of visual and auditory stimuli was externally measured (cf. Poth et al., 2018) using a microphone capsule and a BPW-34 photodiode (Vishay Semiconductors, Malvern, PA, USA), sampled at 2.5 kHz using a TDS 2022B oscilloscope (Tektronix, Beaverton, OR, USA). Twelve runs of an identical trial were measured, in which the auditory accessory stimulus and the visual target stimulus were programmed to follow the auditory alerting cue with a stimulus-onset asynchrony (SOA) of 250 ms. For the present experimental setup and software, the measurements demonstrated that the SOA between the alerting and the accessory stimulus was on average 8 ms shorter than programmed (SD = 1 ms). Likewise, the accessory stimulus appeared on average 5 ms (SD = 1 ms) after the visual target. Except when stated otherwise, the stimulus timing described below is reported as programmed.

Design and procedure

Experiment 1

Figure 1a illustrates the general experimental paradigm. Trials started with the presentation of the fixation stimulus. After a waiting time that varied between 1,400 and 5,800 ms, either the alerting cue (alert condition) or a silent sound object of identical duration (no alert condition) was played. To keep the expectation of the alerting cue constant over time (cf. Petersen et al., 2017), the waiting time for the alerting cue was drawn from a non-aging geometric distribution (in steps of 200 ms and with a probability of 1/3, see Fig. 1b; for the present data on average 1,801 ms). After the offset of the alerting cue or the respective time in the no alert condition, the waiting time for the visual target began. This waiting time ranged from 200 to 733 ms and was again drawn from a geometric distribution (in steps of 33 ms and with a probability of 2/5; for the present data on average 250 ms) to control temporal expectations (Fig. 1b). In the accessory condition, target onset was accompanied by the accessory stimulus (i.e., the accessory stimulus followed the target after 5 ms on average; see the external measurements reported above). In the no accessory condition, a silent sound object whose duration was identical to that of the accessory stimulus was played. The fixation stimulus was extinguished with the onset of the visual target. The target appeared to the left or right of screen center (equally often across trials). Participants’ task was to indicate target location as fast and as accurately as possible by pressing the corresponding response button (the left button for targets on the left, the right for targets on the right). The target stayed on-screen until participants had responded, after which the next trial started automatically.
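The non-aging property of these waiting times can be illustrated with a minimal Python sketch. It is not the authors' Psychtoolbox code. It assumes that the 33-ms step corresponds to 100/3 ms (four frames at 120 Hz), that the ranges correspond to at most 22 steps (alerting cue) and 16 steps (target), and that any remaining probability mass is assigned to the longest waiting time; all function and variable names are hypothetical.

```python
import random


def step_probabilities(p, n_steps_max):
    """Probability of 0..n_steps_max additional steps; the final step absorbs the tail."""
    probs = [p * (1 - p) ** k for k in range(n_steps_max)]
    probs.append((1 - p) ** n_steps_max)  # ASSUMPTION: tail mass goes to the longest wait
    return probs


def hazard_rates(probs):
    """P(event at step k | no event before step k) for every step."""
    remaining = 1.0
    rates = []
    for prob in probs:
        rates.append(prob / remaining)
        remaining -= prob
    return rates


def expected_wait_ms(minimum_ms, step_ms, p, n_steps_max):
    """Expected waiting time of the truncated geometric distribution."""
    probs = step_probabilities(p, n_steps_max)
    return minimum_ms + step_ms * sum(k * prob for k, prob in enumerate(probs))


def sample_wait_ms(minimum_ms, step_ms, p, n_steps_max, rng):
    """Draw a single waiting time for one trial."""
    probs = step_probabilities(p, n_steps_max)
    steps = rng.choices(range(len(probs)), weights=probs)[0]
    return minimum_ms + steps * step_ms


# Parameters from the text; step size and step counts of the target are assumptions
CUE = dict(minimum_ms=1400, step_ms=200, p=1 / 3, n_steps_max=22)
TARGET = dict(minimum_ms=200, step_ms=100 / 3, p=2 / 5, n_steps_max=16)

if __name__ == "__main__":
    rng = random.Random(1)
    print(sample_wait_ms(rng=rng, **CUE), sample_wait_ms(rng=rng, **TARGET))
    print(expected_wait_ms(**CUE), expected_wait_ms(**TARGET))
    print(hazard_rates(step_probabilities(1 / 3, 22))[:3])
```

Under these assumptions, the hazard rate equals the step probability at every step before the last one, and the expected waiting times are about 1,800 ms and 250 ms, close to the averages reported for the present data.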

Fig. 1

Experimental paradigm. (a) Single trial. Participants fixated a fixation stimulus. Either an auditory alerting cue or no alerting cue was played, after which the target stimulus (a white square) appeared to the left or right of screen center. The target was either accompanied by an auditory accessory stimulus or presented alone. Participants responded by pressing the button corresponding to the position of the target as fast and as accurately as possible. (b) Waiting times for alerting cues and waiting times for targets and accessory stimuli followed the non-aging geometric distribution to keep temporal expectations constant over time.

Participants performed 400 trials (2 [alert vs. no alert condition] × 2 [accessory vs. no accessory condition] × 100 [trials per cell of the design]) in random order. Before the experiment, they performed eight training trials (two per cell of the design) in random order.

Experiment 2

The design and procedure were identical to the ones of Experiment 1 with the following exceptions. Participants performed 800 trials in total, 320 trials of the no alert – no accessory condition, 120 trials of the alert – no accessory condition, 120 trials of the no alert – accessory condition, and 240 trials of the alert – accessory condition. Thus, across trials, the probability that no tone was played was .4, the probability of one tone and the probability of two tones were both .3. These probabilities were chosen to equate the expectation for a single and for two tones.
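The tone probabilities follow directly from the trial counts per condition. The following Python sketch recomputes them for both experiments; the dictionary layout and the function name are illustrative choices, and only the trial counts are taken from the text.

```python
from fractions import Fraction


def tone_probabilities(trials_per_condition):
    """Map the number of tones on a trial (0, 1, or 2) to its probability across trials."""
    total = sum(trials_per_condition.values())
    probs = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}
    for (alert, accessory), n_trials in trials_per_condition.items():
        # Each alerting cue and each accessory stimulus contributes one tone
        probs[int(alert) + int(accessory)] += Fraction(n_trials, total)
    return probs


# Keys are (alerting cue present, accessory stimulus present)
EXPERIMENT_1 = {(False, False): 100, (True, False): 100, (False, True): 100, (True, True): 100}
EXPERIMENT_2 = {(False, False): 320, (True, False): 120, (False, True): 120, (True, True): 240}

if __name__ == "__main__":
    for name, design in (("Experiment 1", EXPERIMENT_1), ("Experiment 2", EXPERIMENT_2)):
        print(name, {k: float(v) for k, v in tone_probabilities(design).items()})
```

In Experiment 1, a single tone is twice as probable as two tones (.50 vs. .25), whereas in Experiment 2 both are equally probable (.30 each).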

Results and discussion

Statistical analyses were performed in R (3.4.4; R Core Team, 2018), including the packages dplyr (0.7.5; Wickham, François, Henry, & Müller, 2018), ggplot2 (Wickham, 2016), and ez (4.4-0; Lawrence, 2016). Experimental conditions were compared using repeated-measures analyses of variance (rmANOVAs) with type-III sums of squares and \( {\eta}_G^2 \) as effect size (Bakeman, 2005), followed up by pairwise paired t-tests with Cohen’s dz effect size (Cohen, 1988). Pairwise comparisons were complemented by Bayesian t-tests (Rouder, Speckman, Sun, Morey, & Iverson, 2009), whose Bayes factor (BF10) quantifies the evidence in favor of the alternative hypothesis (i.e., the likelihood of the data given the alternative hypothesis) relative to the evidence in favor of the null hypothesis (i.e., the likelihood of the data given the null hypothesis). Bayes factors were computed using the R-package BayesFactor (0.9.12-4.2; Morey & Rouder, 2018, using current standard settings).
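In symbols, the two pairwise measures can be written as follows. The notation is chosen here for illustration and does not appear in the original: \( y \) denotes the observed data, \( H_1 \) and \( H_0 \) the alternative and null hypotheses, and \( \bar{x}_{\Delta} \) and \( s_{\Delta} \) the mean and standard deviation of the within-participant differences between the two compared conditions.

\[
\mathrm{BF}_{10} = \frac{p(y \mid H_1)}{p(y \mid H_0)}, \qquad d_z = \frac{\bar{x}_{\Delta}}{s_{\Delta}}
\]

Values of BF10 above 1 thus indicate that the data are more likely under the alternative than under the null hypothesis.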

Experiment 1

Participants’ RT in the experimental conditions was assessed as mean RT across trials. Trials with erroneous responses (2.00%), anticipatory responses (RTs < 100 ms, 0.58%), or with extremely long RTs (RTs > 2 SDs of the respective participant, on average between 3.89% and 4.63% in the conditions) were excluded. Figure 2 visualizes the results of Experiment 1. A significant main effect showed that alerting (M = 273 ms, SD = 40 ms) lowered RTs relative to the no alert condition (M = 320 ms, SD = 47 ms), F(1, 18) = 112.959, p < .001, \( {\eta}_G^2 \) = .243.
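The exclusion steps can be sketched in Python as follows. This is an illustration, not the authors' R code. It assumes that "extremely long" means more than 2 SDs above the participant's mean RT, and that this criterion is computed per condition after errors and anticipatory responses have been removed; neither detail is specified in the text. The function name and data layout are hypothetical.

```python
from statistics import mean, stdev


def participant_mean_rt(trials, min_rt_ms=100, sd_cutoff=2.0):
    """Mean RT of one participant in one condition after trial exclusion.

    trials: list of (rt_ms, correct) tuples.
    """
    # Drop erroneous responses and anticipatory responses (RT < 100 ms), as in the text
    valid = [rt for rt, correct in trials if correct and rt >= min_rt_ms]
    # ASSUMPTION: "extremely long" = more than sd_cutoff SDs above the mean of valid trials
    upper_limit = mean(valid) + sd_cutoff * stdev(valid)
    kept = [rt for rt in valid if rt <= upper_limit]
    return mean(kept)


if __name__ == "__main__":
    example = [(300, True), (310, True), (290, True), (305, True), (295, True),
               (300, True), (310, True), (290, True), (305, True), (295, True),
               (900, True),   # extremely long RT
               (50, True),    # anticipatory response
               (300, False)]  # erroneous response
    print(participant_mean_rt(example))
```

With the made-up example trials, the 900-ms, 50-ms, and erroneous trials are dropped, leaving a mean RT of 300 ms.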

Fig. 2

Results of Experiment 1. Means of participants’ mean reaction times in the four experimental conditions. Error bars denote 95% confidence intervals (for within-designs, Morey, 2008, http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/)

The main effect of accessory stimulus was significant as well, F(1, 18) = 9.633, p = .006, \( {\eta}_G^2 \) = .006, but its interpretation was precluded by a disordinal interaction of alerting and accessory stimulus (Fig. 2), F(1, 18) = 43.619, p < .001, \( {\eta}_G^2 \) = .050. Within the no-alert condition, the classic accessory stimulus effect became evident as shorter RTs in the accessory (M = 307 ms, SD = 43 ms) compared with the no-accessory condition (M = 333 ms, SD = 48 ms), t(18) = -7.203, p < .001, dz = -1.65, BF10 = 17024.22 (Fig. 2). Within the alert condition, however, the accessory stimulus effect was reversed: RTs were longer when the accessory was presented (M = 279 ms, SD = 43 ms) compared with when there was no accessory stimulus in addition to the alerting cue (M = 266 ms, SD = 36 ms), t(18) = 3.741, p = .001, dz = 0.858, BF10 = 26.09 (Fig. 2). Thus, the alerting cue not only extinguished the supporting effect of the subsequent accessory stimulus, but made it hamper participants’ choice reaction.
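For paired t-tests, Cohen's dz is related to the test statistic by dz = t / √n, where n is the number of participants (here df + 1 = 19). The following Python sketch uses this identity to recompute the reported effect sizes from the reported t values; the function name is illustrative.

```python
import math


def cohens_dz_from_t(t_value, n_participants):
    """Cohen's dz of a paired t-test, recovered from t and the number of participants."""
    return t_value / math.sqrt(n_participants)


if __name__ == "__main__":
    # Experiment 1, t(18), hence n = 19
    print(round(cohens_dz_from_t(-7.203, 19), 2))  # no-alert condition
    print(round(cohens_dz_from_t(3.741, 19), 3))   # alert condition
```

The recomputed values (-1.65 and 0.858) match those reported above.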

Alerting effects and accessory stimulus effects have been hypothesized to rely on the same underlying mechanisms (Los & Van der Burg, 2013; Posner, 1978). In this case, alerting cues could raise phasic alertness to its maximum, so that subsequent accessory stimuli cannot improve performance further. This would predict that alerting extinguished the accessory stimulus effect, but not that alerting reversed the effect. The reversed accessory stimulus effect might seem to imply that the alerting cue and the identical accessory stimulus together cause an inhibition of choice reaction. Importantly, however, the reversal of the effect might also stem from violations of participants’ expectation. In Experiment 1, all experimental conditions were performed equally often. Thus, on 25% of the trials, there was no tone (no alert – no accessory condition); on 50% of the trials, there was one tone (alert – no accessory, as well as no alert – accessory condition); and on 25% of the trials there were two tones (alert – accessory condition). Participants should therefore have had the overall expectation (Näätänen, 1990; Schröger, 1997) that a single tone would be played. As a result, the presentation of two tones could have violated participants’ expectations, which in turn could have slowed down their responses by drawing attention from the visual task to the auditory modality (Parmentier, Elsley, Andrés, & Barceló, 2011).

The results of Experiment 1 suggest that auditory alerting reverses the beneficial effects of accessory stimuli on visual choice reaction, when the combination of alerting cues and accessory stimuli violates participants’ expectations. Therefore, Experiment 2 tested how alerting and accessory stimulation interact when participants’ expectation that a single tone (alerting cue or accessory stimulus) would be presented matched their expectation for two tones (alerting cue and accessory stimulus).

Experiment 2

Trials with erroneous responses (1.67%), anticipatory responses (0.52%), or extremely long RTs (on average between 3.58% and 4.17% in the conditions) were excluded from analysis.

The results of Experiment 2 are visualized in Fig. 3. Alerting (M = 295 ms, SD = 73 ms) significantly lowered RT compared with the no-alert condition (M = 343 ms, SD = 78 ms), F(1, 25) = 162.80, p < .001, \( {\eta}_G^2 \) = .093. Likewise, accessory stimuli (M = 316 ms, SD = 78 ms) lowered RT relative to the no-accessory condition (M = 322 ms, SD = 80 ms), F(1, 25) = 7.611, p = .011, \( {\eta}_G^2 \) = .002. As in Experiment 1, this accessory stimulus effect was qualified by a disordinal interaction with alerting, F(1, 25) = 83.052, p < .001, \( {\eta}_G^2 \) = .014. Replicating Experiment 1, the classic accessory stimulus effect appeared as shorter RTs in the accessory (M = 331 ms, SD = 79 ms) than in the no-accessory condition (M = 355 ms, SD = 78 ms) within the no-alert condition, t(25) = -11.638, p < .001, dz = -2.28, BF10 > 6.377 × \( 10^8 \). Again replicating Experiment 1, within the alert condition, the accessory stimulus effect was reversed: RTs were longer in the accessory (M = 301 ms, SD = 77 ms) compared with the no-accessory (M = 290 ms, SD = 69 ms) condition, t(25) = 3.041, p = .005, dz = 0.596, BF10 = 7.880. Although the reversed effect seemed somewhat smaller in Experiment 2 than in Experiment 1, a cross-experiment ANOVA did not detect a significant difference in the accessory stimulus effect of the alert conditions between Experiment 2 and Experiment 1, F(1, 43) = 0.042, p = .838, \( {\eta}_G^2 \) < .001.
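The disordinal pattern can be made explicit by computing the accessory stimulus effect (accessory minus no-accessory RT) separately for each alerting condition. The Python sketch below does this from the rounded cell means reported for both experiments; because it works on rounded group means, it is only a descriptive illustration and does not replace the inferential tests. The names are hypothetical.

```python
# Cell means in ms, keyed by (alerting condition, accessory condition)
CELL_MEANS_MS = {
    "Experiment 1": {
        ("no alert", "accessory"): 307, ("no alert", "no accessory"): 333,
        ("alert", "accessory"): 279, ("alert", "no accessory"): 266,
    },
    "Experiment 2": {
        ("no alert", "accessory"): 331, ("no alert", "no accessory"): 355,
        ("alert", "accessory"): 301, ("alert", "no accessory"): 290,
    },
}


def accessory_effect_ms(cells, alerting_condition):
    """Accessory minus no-accessory RT; negative values indicate faster responses."""
    return cells[(alerting_condition, "accessory")] - cells[(alerting_condition, "no accessory")]


if __name__ == "__main__":
    for experiment, cells in CELL_MEANS_MS.items():
        for alerting_condition in ("no alert", "alert"):
            print(experiment, alerting_condition, accessory_effect_ms(cells, alerting_condition))
```

In both experiments the effect changes sign with alerting: -26 ms versus +13 ms in Experiment 1 and -24 ms versus +11 ms in Experiment 2.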

Fig. 3

Results of Experiment 2. Means of participants’ mean reaction times in the four experimental conditions. Error bars denote 95% confidence intervals (for within-designs, Morey, 2008, see above)

The findings of Experiment 2 corroborate those of Experiment 1. As in Experiment 1, phasic alerting reversed the otherwise beneficial effect of accessory stimuli on choice reaction. This reversed accessory stimulus effect does not seem to have arisen from violated stimulus expectations, because these were controlled for in Experiment 2.

General discussion

This study showed that auditory phasic alertness and auditory accessory stimulation interactively determine visual choice reaction performance. Replicating classic findings (Bernstein et al., 1973; Keuss et al., 1990; Morrell, 1968; Stahl & Rammsayer, 2005), accessory stimuli sped up responding in the absence of alerting cues. Likewise, alerting cues facilitated responding in the absence of accessory stimuli (Fan et al., 2005; Fan et al., 2002; Hackley, 2009; Posner, 1978). In contrast to the classic findings, the present study reveals that combining accessory stimulation with alerting changes the effects on performance in a qualitative fashion. Experiment 1 showed that accessory stimuli impair performance when an alerting cue precedes them on the same trial. Experiment 2 showed that this reversed accessory stimulus effect is not due to violated stimulus expectations. That is, the reversed effect was also found when alerting cues and accessory stimuli were equally expected to occur together or alone. Taken together, the present findings offer the new view that accessory stimulation is not beneficial for performance per se, but that this depends on one’s current state of phasic alertness.

Alerting reverses the beneficial accessory stimulus effect: The role of stimulus expectations

In Experiment 1, alerting cues and accessory stimuli occurred together on a trial with much lower probability than when either one of the stimuli occurred on its own. Based on these probabilities (e.g., Näätänen, 1990; Schröger, 1997), participants could have developed the overall expectation that a single tone rather than two tones would be presented. Thus, the rare combination of alerting cues and accessory stimuli within the context of the task would have violated participants’ expectations (e.g., Parmentier et al., 2011). This could have happened even though all auditory stimuli were completely irrelevant to the participants’ visual task, because expectations regarding auditory stimuli are assumed to be created and matched against new stimuli automatically and involuntarily (Schröger, 1997). Generally, expectation violations have been found to slow down responding in a large variety of tasks and settings (as reviewed by Horstmann, 2015). More specifically, violated auditory expectations have been found to impair performance in concurrent visual tasks (Parmentier et al., 2011). This has been interpreted as an involuntary capture of attention to the auditory modality, impairing performance in the visual task by cutting necessary processing resources (Parmentier et al., 2011). The results of Experiment 1 might have arisen because these detrimental consequences of expectation violations overpowered the beneficial effects of accessory stimulation. However, the results were replicated in Experiment 2, in which no expectation violation should have taken place, because stimulus expectations had been controlled for. Therefore, it seems we can rule out that expectation-related factors underlie the reversed accessory stimulus effect.

Phasic alertness and accessory stimulation: Underlying processes

One might assume that alerting and the accessory stimulus effect share their underlying mechanisms. Auditory accessory stimuli are processed faster than visual targets (Woodworth & Schlosberg, 1954), so that they could affect processing for the visual task by increasing phasic alertness (cf. Los & Van der Burg, 2013). The accessory stimulus effect would then be a special case of the alerting effect, in line with ideas that both alerting (Petersen et al., 2017; Sturm & Willmes, 2001) and accessory stimulation (Tona et al., 2016) exert their effects by up-regulating neuronal arousal (cf. Aston-Jones & Cohen, 2005; Mather, Clewett, Sakaki, & Harley, 2016). Across a wide range of tasks, the level of arousal and the quality of performance seem to follow an inverted U relationship, so that performance is best at intermediate levels of arousal and declines both with lower and with higher arousal (Yerkes & Dodson, 1908; see also Aston-Jones & Cohen, 2005; Bundesen et al., 2015). With this in mind, one could speculate about the present findings as follows. Performance was best when alerting cues were presented on their own, indicating that the level of arousal was optimal in this condition. Performance for the accessory stimulus alone was weaker, maybe because only a sub-optimal level of arousal was reached as the (close to) simultaneous presentation of accessory and target constrained the time for arousal to develop. Finally, performance might have been impaired when the accessory stimulus was added on top of the alerting cue, because their combination imposed a state of “overarousal” on the participant. Such a state would be detrimental for performance, for example because it hampers engagement in and focus on the current task (e.g., Aston-Jones & Cohen, 2005).

In the present experiments, alerting cues and accessory stimuli consisted of identical tones, so that one might assume that processing the alerting cue inhibited processing of the subsequent accessory stimulus. This would resemble mechanisms proposed to explain the weakening of startle responses to intense stimuli by shortly preceding stimuli (Swerdlow, Blumenthal, Sutherland, Weber, & Talledo, 2007). However, this would predict that the effects of accessory stimulation would be reduced by preceding alerting, but it is incompatible with the observed qualitative changes of effects. Thus, although such stimulus-specific effects cannot explain the present findings, it seems to be an interesting goal for future studies to find out how the reversed accessory stimulus effect is driven by stimulus-specific processes.

Conclusion

In sum, the present study establishes a new link between auditory phasic alertness and the effects of auditory accessory stimuli in visual choice reaction tasks. Replicating classic effects, phasic alerting and accessory stimulation each on their own support choice reaction performance. Importantly, however, combining the two yields a qualitatively different pattern: Adding accessory stimulation on top of a previous alert impairs rather than improves performance. In this way, the findings show that accessory stimulation is not always beneficial for performance, but that this depends on the current situation with its other stimuli and associated levels of phasic alertness.