Attentional modulations of audiovisual interactions in apparent motion: Temporal ventriloquism effects on perceived visual speed

Duyar, Aysun; Pavan, Andrea; Kafaligonul, Hulusi

doi:10.3758/s13414-022-02555-7

Published: 22 August 2022

Volume 84, pages 2167–2185, (2022)

Attention, Perception, & Psychophysics

Abstract

The timing of brief stationary sounds has been shown to alter different aspects of visual motion, such as speed estimation. These effects of auditory timing have been explained by temporal ventriloquism and auditory dominance over visual information in the temporal domain. Although previous studies provide unprecedented evidence for the multisensory nature of speed estimation, how attention is involved in these audiovisual interactions remains unclear. Here, we aimed to understand the effects of spatial attention on these audiovisual interactions in time. We utilized a set of audiovisual stimuli that elicit temporal ventriloquism in visual apparent motion and asked participants to perform a speed comparison task. We manipulated attention either in the visual or auditory domain and systematically changed the number of moving objects in the visual field. When attention was diverted to a stationary object in the visual field via a secondary task, the temporal ventriloquism effects on perceived speed decreased. On the other hand, focusing attention on the auditory stimuli facilitated these effects consistently across different difficulty levels of the secondary auditory task. Moreover, the effects of auditory timing on perceived speed did not change with the number of moving objects and were present in all the experimental conditions. Taken together, our findings revealed differential effects of allocating attentional resources in the visual and auditory domains. These behavioral results also demonstrate that reliable temporal ventriloquism effects on visual motion can be induced even in the presence of multiple moving objects in the visual field and under different perceptual load conditions.

Introduction

Motion perception is an important aspect of our daily experience. To perform proper actions and interact with a dynamic environment, humans (and many other species) precisely estimate the direction and speed of moving objects. Accordingly, motion has become an extensively investigated visual feature (Burr & Thompson, 2011; Kolers, 1972; Nakayama, 1985; Nishida, 2011). In studies investigating motion perception, the manipulations have been mainly based on visual stimulation and hence restricted to the visual modality. On the other hand, multisensory research has ushered in a new perspective on motion perception, wherein the information provided by other modalities (e.g., audition) is also involved in the computations underlying motion perception (Soto-Faraco et al., 2003; Soto-Faraco & Väljamäe, 2012). To date, various audiovisual paradigms have been developed to demonstrate the multisensory nature of motion processing. Of particular relevance to the current study, the timing of brief static sounds (e.g., clicks) can alter apparent motion perception (Getzmann, 2007; Shi et al., 2010). Specifically, the time interval between static clicks has been found to modulate perceived direction, speed, and sensitivity to visual apparent motion (Freeman & Driver, 2008; Kafaligonul & Stoner, 2010, 2012; Ogulmus et al., 2018).

In these studies, the experimental design is typically based on two-frame apparent motion. Two concurrent brief sounds (e.g., clicks) have been used for auditory stimulation, and the time interval between them is systematically changed. The auditory time interval of these static sounds has been shown to modulate different aspects of motion perception. For example, previous research indicated that auditory time intervals can alter the perceived speed of two-frame apparent motion (Kafaligonul & Stoner, 2010; Ogulmus et al., 2018). The apparent motion with a short auditory time interval is perceived to move faster than the one with a long time interval, although apparent motions are the same in terms of visual stimulation. These effects of auditory timing on apparent motion percept have been interpreted as a consequence of a well-known phenomenon called “temporal ventriloquism.” In general, temporal ventriloquism refers to the ability of brief sounds to drive the perceived timing of brief visual events when these stimuli are presented at different times (Fendrich & Corballis, 2001; Morein-Zamir et al., 2003; Recanzone, 2003). This illusion makes adaptive sense given the auditory system’s superior temporal resolution, and such dominance has been mostly described as brief sounds affecting (e.g., capturing) visual events in time (Burr et al., 2009; Vroomen & Keetels, 2010; Welch & Warren, 1980). In the case of two-frame apparent motion paradigms, the static clicks may similarly drive the timing of visual motion frames (or the time interval between them). Hence, a decrease or an increase in the perceived time interval between the two motion frames may lead to faster and slower motion percepts, respectively.

The effects of auditory time interval on apparent motion provide important evidence that audiovisual interactions in the temporal domain play a critical role in motion perception. There is also neurophysiological evidence that auditory timing can affect the amplitude of evoked activities at both early and later stages of motion processing (Kaya et al., 2017; Kaya & Kafaligonul, 2019). These findings suggested that the effects of auditory time intervals on motion perception may be the outcome of a dynamic interplay between different cortical regions. An important question to address is how attention is involved in these interactions at different stages of cortical processing. Attention allows prioritization of relevant information for further processing according to context and task demands. The role of attention is complicated and context-dependent in crossmodal interactions. An emerging notion suggests that multisensory processing and attention interact in a complex, multifaceted manner. In agreement with this perspective, mounting evidence suggests that attention can take place at different levels of multisensory processing (Teder-Sälejärvi et al., 1999). Furthermore, the bottom-up (stimulus-driven) and top-down (goal-driven) attention may have differential effects at distinct stages of processing (Koelewijn et al., 2010; Macaluso et al., 2016; Talsma et al., 2010). Spatial attention can affect processing across sensory modalities, such that the processing of irrelevant visual information is enhanced in the attended (auditory) location and vice versa (Spence & Driver, 1996). In particular, attentional allocation enhances perception across sensory modalities in motion perception (e.g., Beer & Röder, 2004a, 2004b). Attentional demands increase with additional tasks and/or with the task difficulty, which results in increased perceptual load. 
Perceptual load can influence audiovisual interactions in space, as well as the speed of audiovisual feature binding (e.g., Alsius et al., 2005; Eramudugolla et al., 2011; Evans, 2020).

Freeman and Driver (2008) investigated whether this form of audiovisual motion illusion (i.e., temporal ventriloquism effects on apparent motion) may be achieved simply by focusing attention on specific visual intervals. The auditory clicks may conceivably capture attention, potentially making some intervals between apparent motion frames more salient than others and affecting motion perception without changing the perceived visual timing. Their behavioral findings rejected this attention-capture account. Moreover, Kafaligonul and Stoner (2012) aimed to understand the involvement of the attention-based motion system. They found that click timing can affect visual motion processing even when attentional tracking is ruled out (i.e., without the involvement of higher-order attentional and/or position tracking mechanisms). Therefore, these previous studies suggest that attention may not be required for this audiovisual temporal illusion to occur, highlighting the automatic nature of audiovisual interactions. Nevertheless, attention can have a modulatory influence on these audiovisual interactions in time, and little is known about such a modulatory role. This is mainly because, in previous research, the visual apparent motion was the primary, task-relevant stimulus and the auditory clicks were secondary, task-irrelevant stimuli. In other words, observers performed a perceptual task on visual motion while passively listening to the static clicks. Accordingly, the observers focused their attention on visual motion, and there was no systematic manipulation of attention either in the visual field or across modalities. On the other hand, such manipulations of attention have important implications for daily life situations.

In everyday life, the stimulation of the external environment is complex, and we are frequently exposed to more than one moving object in the visual field. Furthermore, the sensory relevance and attentional demands constantly change. Using complex stimulus configurations, previous research investigated the roles of feature similarity and crossmodal correspondence in temporal ventriloquism (Boyce, Lindsay, et al., 2020a; see also Chen et al., 2018). Although previous findings revealed significant effects of similarity, they also indicated that the featural differences did not abolish temporal ventriloquism (Boyce, Whiteford, et al., 2020b; Klimova et al., 2017). This also applies to the number of stimuli in the visual and auditory domains. Contrary to the original descriptions (Morein-Zamir et al., 2003), an equal number of auditory and visual stimuli (e.g., the number of visual objects and clicks) may not be necessary to elicit temporal ventriloquism effects on the perception of apparent motion (Getzmann, 2007; Ogulmus et al., 2018). Besides having important implications for audiovisual binding in the temporal domain (see Experiment 1), these results pave the way to investigate the role of spatial attention and to manipulate sensory relevance and attentional demands. Within the context of temporal ventriloquism effects on perceived speed, there is still no systematic research on the number of visual stimuli and the role of spatial attention in these audiovisual interactions. An important question is whether the auditory time interval can alter the perception of more than one moving object and whether it does so when attention is distributed within the visual field. In the present study, we first aimed to address this question by investigating the effects of auditory time interval on speed perception. We systematically manipulated the number of concurrent moving objects in the visual field under different attention conditions. Additionally, we included a secondary perceptual task on the visual events (i.e., a dual-task paradigm) to assess the allocation of attentional resources. We next asked whether focusing attentional resources on the auditory click would modulate these audiovisual interactions in time. In this part of the study, we introduced a secondary task on the location of static clicks and systematically manipulated the secondary task difficulty by shifting the position of the sound source, which also allowed us to examine whether the possible modulations due to perceptual load on the auditory stimulation depend on task difficulty.

Experiment 1

Using a visual search (i.e., pip and pop) paradigm, previous research revealed that audiovisual integration decreases drastically with more than one static object in the visual field (Olivers et al., 2016; Van der Burg et al., 2013). According to these findings, the number of visual events that may be linked to a single auditory event is limited. On the other hand, behavioral studies combining temporal ventriloquism and apparent motion indicated that auditory time intervals can affect more than one moving object (e.g., Ogulmus et al., 2018). These findings suggest that the timing of a single auditory click may drive the timing of more than one object presented in each motion frame, because two-frame apparent motion and two concurrent clicks were typically used in previous research, and the effects of temporal ventriloquism have been mostly described as each click affecting the perceived timing of each apparent motion frame (or the time interval demarcated by these frames; Chen & Vroomen, 2013). However, there is still no systematic investigation on testing the limits of these audiovisual interactions in terms of the number of moving objects in the visual field. Therefore, in the first experiment, we examined auditory time interval effects on perceived speed by systematically manipulating the number of moving objects and spatial attention in the visual field. Based on the hypothesis that there is a limited capacity of binding auditory and visual events, we expected to have an increase in the amount of temporal ventriloquism effects on perceived visual speed when observers attended to a single moving object in the visual field.

Moreover, dual-task paradigms (i.e., having a secondary task) have been used to manipulate attentional resources in multisensory paradigms. Previous work showed that attentional demands modulate audiovisual processing and binding (e.g., Alsius et al., 2005; Mozolic et al., 2008; Ren et al., 2020; Ren et al., 2021). Using a secondary task in the visual domain, these studies indicated that audiovisual interactions were greatly reduced when participants concurrently performed an unrelated visual task. Accordingly, we also assessed whether the allocation of attentional resources in the visual domain alters the amount of auditory time interval effects on perceived speed by introducing a secondary task on the fixation target. Based on the previous research on different audiovisual paradigms, we hypothesized that diverting attention away from moving stimuli would decrease the binding and hence audiovisual interactions in time.

Methods

Participants

Twelve participants (age range: 21–29 years) completed all the training and main experimental sessions. All participants had normal or corrected-to-normal vision and normal hearing. None had a history of neurological disorders by self-report. Before their participation, they were informed about experimental procedures and signed a consent form. The sample size was determined based on our previous behavioral studies examining the effects of auditory time interval on perceived visual speed (Kafaligonul & Stoner, 2010; Ogulmus et al., 2018; see also the behavioral study reported in Kaya et al., 2017). In particular, Ogulmus et al. (2018) used a design based on comparing two consecutive apparent motions with different auditory time intervals. All the sample sizes reported in the present study were also commensurate with the original research by Van der Burg et al. (2013) investigating the capacity of audiovisual binding. All procedures were in accordance with the Declaration of Helsinki (World Medical Association, 2013) and approved by the local Ethics Committee of Bilkent University.

Apparatus

We used MATLAB (The MathWorks, Natick, MA, USA) with the Psychtoolbox 3.0 extension (Brainard, 1997; Kleiner et al., 2007; Pelli, 1997) to control stimulation, experimental design, and data acquisition. The visual stimuli were displayed on a 20-inch CRT screen (1,280 × 1,024–pixel resolution, 100-Hz refresh rate) at a viewing distance of 57 cm. The display was gamma-corrected using a SpectroCAL (Cambridge Research Systems, Rochester, Kent, UK) photometer. The auditory stimuli were emitted by two-channel speakers positioned next to the display on each side. The center of speakers (i.e., the horizontal midpoint between the two speakers) was vertically aligned with the display and 57 cm away from the participants. The sound pressure level (SPL) was regularly measured with a sound-level meter (SL-4010 Lutron, Lutron Electronics, Taipei, TW). A chin rest was used to stabilize the head position and constrain movements. The experiments were performed in a dimly lit and sound-attenuated testing room. Except for a speaker change in Experiment 4, the same apparatus and testing room were used in all the experiments.

Stimuli and procedure

The design was based on comparing the speed of two consecutive apparent motions moving in the same direction (Kafaligonul & Stoner, 2010; Ogulmus et al., 2018). A small square (0.5° length, 108 cd/m²) at the center of the display (0.56 cd/m² background luminance) served as a fixation marker. Each apparent motion consisted of two motion frames (Fig. 1a). In each motion frame, an equal number of objects (2, 4, or 8 objects) were presented on an imaginary circle (inner circle radius: 2.15°, outer circle radius: 3.85°) around the fixation. The shape of each object was pseudorandomly assigned to a circle (0.6° diameter, 54.5 cd/m²) or a square (0.6° length, 54.5 cd/m²). When there were two objects, the stimuli were positioned on the left and right side of the fixation. Therefore, the resulting movement was horizontal, and there was a 180° angle between the motion directions. For the 4 and 8 object presentations, the positions of objects were equally spaced in each frame to have 90° and 45° angles between neighboring motion directions, respectively (Fig. 1b). Apparent motions were generated by presenting each frame for 50 ms and having a 100-ms blank interval between them (ISI_v, interstimulus interval). During the blank interval, there was only the fixation at the center of the display (Fig. 1a). Based on the overall motion direction (outwards or inwards) during a trial, the motion frame in which objects were positioned either on the inner circle or the outer circle was presented first. A pair of static clicks was also introduced during the presentation of each two-frame apparent motion. Each click had a duration of 20 ms (rectangular windowed 480-Hz sine-wave carrier, 44.1-kHz sampling rate), and the SPL was 78 dB. The pair of clicks was introduced with a time interval (ISI_a) and temporally centered with respect to the pair of motion frames.
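For illustration only, the spatial layout described above can be sketched in Python (the study itself used MATLAB with Psychtoolbox; this is not the authors' code). The function names and the starting angle are assumptions of the sketch: the text fixes only the two-object case (left and right of fixation), which a starting angle of 0° reproduces.

```python
import math

INNER_RADIUS_DEG = 2.15  # inner circle radius reported in the text
OUTER_RADIUS_DEG = 3.85  # outer circle radius reported in the text


def frame_positions(n_objects, radius_deg, start_angle_deg=0.0):
    """Return (x, y) positions in degrees of visual angle, equally spaced on a circle.

    start_angle_deg is an assumption of this sketch, not a value from the text.
    """
    step = 360.0 / n_objects
    positions = []
    for k in range(n_objects):
        theta = math.radians(start_angle_deg + k * step)
        positions.append((radius_deg * math.cos(theta), radius_deg * math.sin(theta)))
    return positions


def apparent_motion_frames(n_objects, direction="outward"):
    """Return (first_frame, second_frame) position lists for one apparent motion."""
    inner = frame_positions(n_objects, INNER_RADIUS_DEG)
    outer = frame_positions(n_objects, OUTER_RADIUS_DEG)
    # Outward motion shows the inner-circle frame first; inward motion the reverse.
    return (inner, outer) if direction == "outward" else (outer, inner)
```

With 2, 4, or 8 objects, neighboring motion directions are separated by 180°, 90°, and 45°, respectively, as in the description above.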

For each trial, the number of objects in an apparent motion frame was pseudorandomly selected from the three conditions (2, 4, or 8 objects). The two-frame apparent motion stimuli were shown twice. The interval between the two consecutive presentations was 700 ms (i.e., the ISI between the first and second apparent motion presentation, see Fig. 1a for the timeline). The two apparent motions were visually identical, but the auditory time interval between the concurrent sounds was different. For one of the apparent motion presentations, the time interval between static clicks was shorter than the visual time interval between the two motion frames (short ISI_a = 20 ms). For the other one, the auditory time interval was longer than the visual time interval (long ISI_a = 240 ms). The order of auditory time intervals (short vs. long) was randomized across trials. The timeline of events, including the auditory time intervals, was based on previous studies (Kafaligonul & Stoner, 2010; Kaya & Kafaligonul, 2019; Ogulmus et al., 2018). Observers were instructed to fixate during a trial and to indicate, by pressing one of two keys on a standard keyboard, which of the consecutive apparent motions appeared to move faster (i.e., two-interval forced-choice paradigm). Participants were allowed to respond at the end of each trial with no time pressure.
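As a worked example of the timing, the click onsets implied by the reported durations can be computed as follows. This is a sketch under two assumptions that the text does not state explicitly: ISI_a is measured from the offset of the first click to the onset of the second (mirroring the definition of ISI_v), and "temporally centered" means that the midpoint of the click pair coincides with the midpoint of the frame pair. The function name is hypothetical.

```python
FRAME_MS = 50    # duration of each motion frame
ISI_V_MS = 100   # blank interval between the two motion frames
CLICK_MS = 20    # duration of each click


def click_onsets(isi_a_ms, frame_ms=FRAME_MS, isi_v_ms=ISI_V_MS, click_ms=CLICK_MS):
    """Return the onsets (ms) of the two clicks relative to first-frame onset.

    Assumes offset-to-onset ISI_a and midpoint alignment of the click pair
    with the frame pair (both are assumptions of this sketch).
    """
    visual_span = 2 * frame_ms + isi_v_ms      # 200 ms with the reported values
    auditory_span = 2 * click_ms + isi_a_ms
    first_onset = (visual_span - auditory_span) / 2
    second_onset = first_onset + click_ms + isi_a_ms
    return first_onset, second_onset


print(click_onsets(20))   # (70.0, 110.0): both clicks fall inside the blank interval
print(click_onsets(240))  # (-40.0, 220.0): clicks flank the two motion frames
```

Under these assumptions, the short interval places both clicks between the two frames, whereas the long interval places the first click before the first frame and the second click after the second frame.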

As in previous research (Kaya & Kafaligonul, 2019; Ogulmus et al., 2018), there was no additional task in the neutral (baseline) condition. The observers were asked to distribute their attention to all moving objects in the visual field and to make a comparison based on the overall speed (see also Table 1 for a comparison of attention conditions). The participants were informed that clicks would accompany the moving objects but were told to base their responses solely on the visual stimulation. In the cued condition, a brief (70 ms) square cue (0.5° length, blue: 20.4 cd/m² or red: 35 cd/m²) was presented before the first apparent motion presentation (Fig. 1a). The cue was located at the center of the trajectory of one of the upcoming moving objects. The onset timing (i.e., onset asynchrony) between the cue and the first apparent motion was varied between 270 and 300 ms. The range of cue timing was selected to have sustained attention along the path of one of the moving objects (Nakayama & Mackeben, 1989; Ward, 2008). The observers were instructed to attend only to the moving object that would appear at the cue location and to compare the speed of that particular object. They also performed a secondary task by reporting the cue color. Since the cue was presented before the first apparent motion, this secondary task was included in the design just to make sure that observers did not ignore the cue and that they oriented attention to a specific location. In the fixation (color) condition, the observers were instructed to distribute their attention in the visual field and judge the overall speed as in the neutral condition. However, during the presentation of each apparent motion, the fixation color was turned to either red or green for 70 ms (see also Fig. 1a), and the onset of color change was varied within the visual time interval (ISI_v = 100 ms). As a secondary task, the participants were also asked to report whether the fixation color change was the same across the two apparent motion presentations or not. Since the fixation color change occurred during the presentation of each apparent motion, the secondary task in this condition was included in the design to specifically manipulate attentional resources in the visual field and divert attention away from the moving objects. These three attention conditions (neutral, cued, and fixation) were run in separate blocks. The order of these blocks was randomized across participants. Each block consisted of 384 trials (3 numbers of moving objects × 128 trials per condition).

Table 1 List and comparison of all attention conditions used in the study (the conditions of each experiment are grouped in separate rows)

Training and performance testing

Before the main behavioral experiment, each participant first engaged in practice/training blocks. These blocks allowed us to evaluate whether a participant can reliably compare the speed of two successive apparent motions in our experimental design and settings. There were no auditory clicks in the practice blocks, and the number of objects in each apparent motion frame was fixed to four (i.e., 4 moving object condition of the main experiment; Fig. 1). As in previous research (Kafaligonul & Stoner, 2010; Ogulmus et al., 2018), one of the two successive apparent motions was used as a “reference” stimulus. The reference had a 100 ms time interval between apparent motion frames (ISI_ref = 100 ms). The other “test” apparent motion had a time interval (ISI_test) that varied pseudorandomly from trial to trial: 20, 40, 60, 80, 100, 120, 140, 160, 180, and 200 ms. As in the main experiment (Fig. 1), the reference and test stimuli were separated by a delay of 700 ms, and their order was randomized from trial to trial. The reference and test apparent motions were not distinguished in the instructions to the participants. At the end of each trial, participants performed a speed comparison by indicating which apparent motion (i.e., first or second motion) appeared to move faster.

A practice block included a total of 120 trials (10 ISI_test × 12 trials per condition). After each practice block, the percentage of trials in which the test apparent motion was reported as faster was computed for each ISI_test condition. This percentage was expected to be high (above 75%) for short ISIs (i.e., ISI_test << ISI_ref), to decrease as ISI_test got longer, and to fall below 25% for long ISIs (i.e., ISI_test >> ISI_ref). These percentage values were plotted as a function of ISI_test, and a complementary error function (\( 1-\frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt \)) was fitted to these values using psignifit (Version 2.5.6). The software package implements the maximum likelihood method described by Wichmann and Hill (2001a, 2001b). The 50% point on the resultant curve yields the point of subjective equality (PSE). The PSE is the ISI_test for which the test apparent motion was reported as faster than the reference on 50% of the trials (see also Fig. S1 for sample data). To be eligible to continue with the main experimental session, we required that the PSE was reliably estimated based on the data for the whole ISI_test range (20–200 ms). We expected the percentage values of the two short ISI_test conditions (faster test: 20, 40 ms) to be at or above 75% and those of the two long ISI_test conditions (slower test: 180, 200 ms) to be at or below 25%. We also required the values in at least three of these four extreme ISI_test conditions to be in the expected range. Participants were trained by repeating the practice block until they reached these criteria.
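The training criterion can be illustrated with a minimal Python sketch. It is not psignifit: it replaces that package's fitting routine with a coarse maximum-likelihood grid search, omits lapse-rate parameters, and assumes a particular parameterization (the complementary error function scaled by 0.5 so that the curve runs from 1 to 0, with a PSE and a spread parameter sigma). The function names, the grid ranges, and the example counts are hypothetical.

```python
import math


def p_test_faster(isi_test, pse, sigma):
    """Decreasing psychometric function based on the complementary error function.

    The 0.5 scaling and the (pse, sigma) parameterization are assumptions.
    """
    return 0.5 * math.erfc((isi_test - pse) / (math.sqrt(2) * sigma))


def fit_pse(isis, n_faster, n_trials):
    """Coarse maximum-likelihood grid search; returns (pse, sigma) in ms."""
    best = None
    for pse in range(20, 201):            # candidate PSEs in 1-ms steps
        for sigma in range(5, 151, 5):    # candidate spread values
            log_lik = 0.0
            for isi, k in zip(isis, n_faster):
                # Clamp to avoid log(0) at the extremes of the curve.
                p = min(max(p_test_faster(isi, pse, sigma), 1e-6), 1 - 1e-6)
                log_lik += k * math.log(p) + (n_trials - k) * math.log(1 - p)
            if best is None or log_lik > best[0]:
                best = (log_lik, pse, sigma)
    return best[1], best[2]


def meets_criteria(isis, pct_faster):
    """True if at least three of the four extreme conditions are in range."""
    pct = dict(zip(isis, pct_faster))
    hits = sum(pct[isi] >= 75 for isi in (20, 40))
    hits += sum(pct[isi] <= 25 for isi in (180, 200))
    return hits >= 3


# Hypothetical counts of "test faster" responses out of 12 trials per ISI_test.
isis = [20, 40, 60, 80, 100, 120, 140, 160, 180, 200]
n_faster = [12, 12, 11, 9, 6, 3, 1, 0, 0, 0]
pse, sigma = fit_pse(isis, n_faster, n_trials=12)
eligible = meets_criteria(isis, [100 * k / 12 for k in n_faster])
```

For these made-up counts, which are symmetric around the 100-ms reference, the fitted PSE lies close to ISI_ref, and the observer would pass the eligibility check.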

Results

The results of Experiment 1 are shown in Fig. 2. To quantify auditory time interval effects on perceived speed, we computed the percentage of trials in which the apparent motion with a short auditory time interval was perceived to move faster than the one with a long auditory interval. In all the experimental conditions, the mean percentage values were above the 50% chance level (Fig. 2a). A series of one-sided one-sample permutation tests (sampling permutation distribution 5k) were performed on the percentage value of each condition to assess whether these values were greater than the chance level. The resultant p values were corrected with the Holm method for nine comparisons (i.e., 3 attention conditions × 3 number of objects). All the data analyses were performed in R (Version 4.1.2; R Core Team, 2021). The results showed that for all the conditions the percentage values were significantly higher than 50% (neutral: p_adj < .001, p_adj = .0024, p_adj = .0016; cue color: p_adj = .0016, p_adj = .0032, p_adj = .0054; fixation color: p_adj = .003, p_adj = .0032, p_adj = .0032 for 2, 4, and 8 objects, respectively). These results indicate reliable temporal ventriloquism (i.e., auditory time interval) effects on perceived visual speed in all the conditions tested.
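The logic of these tests can be sketched in Python (the reported analyses were run in R; the exact R routines are not specified here). The sketch assumes a sign-flip scheme for the one-sample permutation test, with 5,000 sampled permutations, and implements the standard Holm step-down adjustment. Function names and the example data are hypothetical.

```python
import random


def perm_test_greater(values, chance=50.0, n_perm=5000, seed=0):
    """One-sided one-sample permutation test of mean(values) > chance.

    Assumes a sign-flip scheme: under the null hypothesis, deviations from
    chance are equally likely to be positive or negative.
    """
    rng = random.Random(seed)
    diffs = [v - chance for v in values]
    observed = sum(diffs) / len(diffs)
    count = 0
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if sum(flipped) / len(flipped) >= observed:
            count += 1
    # Add-one correction keeps the estimated p value strictly above zero.
    return (count + 1) / (n_perm + 1)


def holm(p_values):
    """Holm step-down adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply the k-th smallest p value by (m - k + 1), enforce monotonicity.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


# Hypothetical percentages for 12 observers in one condition.
example = [62, 70, 58, 66, 75, 61, 68, 72, 59, 64, 71, 67]
p_raw = perm_test_greater(example)
```

In the design above, nine such p values (3 attention conditions × 3 numbers of objects) would be passed together to `holm`.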

According to a Shapiro–Wilk test, the residuals of the percentage values of apparent motion perceived as faster were not normally distributed (W = 0.95, p < .001). Additionally, for Experiment 1, the data are likely to follow a uniform distribution (data distribution was assessed using the R function descdist with 1k bootstrapped values). Therefore, we used the aligned rank transform (ART), a procedure for the nonparametric analysis of variance in multifactor designs (Higgins et al., 1990; Higgins & Tashtoush, 1994; Salter & Fawcett, 1993; Wobbrock et al., 2011). With this technique, a linear mixed model can be implemented once the data are aligned and ranked for each main and interaction effect. Pairwise comparisons were conducted using the ART-C procedure (Elkin et al., 2021). A linear mixed model with a random intercept across participants, including the attention conditions (neutral, cue color, and fixation color) and the number of objects (2, 4, and 8) as within-subjects factors, revealed only a significant effect of attention condition, F(2, 88) = 4.55, p = .013 (number of objects: F(2, 88) = 0.27, p = .77; interaction between attention and number of objects: F(4, 88) = 0.089, p = .98). For the main effect of attention, Holm-corrected post hoc comparisons revealed a significant difference between the neutral and the cue color condition (p_adj = .028) and between the neutral and the fixation color condition (p_adj = .024), but not between the cue color and fixation color conditions (p_adj = .84).
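The align-and-rank step of the ART can be sketched as follows for a two-factor layout. This is a simplified illustration, not the analysis pipeline used in the study: it ignores the random intercept for participants, stops before the model fit, and does not cover ART-C. For each effect of interest, the response is stripped of all effects by subtracting the cell mean, the estimated effect of interest is added back, and the aligned values are ranked with ties receiving average ranks. Function names and the example rows are hypothetical.

```python
from collections import defaultdict


def mean(xs):
    return sum(xs) / len(xs)


def midranks(values):
    """Ranks from 1..n, with tied values assigned their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def art_ranks(rows, effect):
    """rows: list of (a_level, b_level, y); effect: 'A', 'B', or 'AxB'."""
    grand = mean([y for _, _, y in rows])
    by_a, by_b, by_cell = defaultdict(list), defaultdict(list), defaultdict(list)
    for a, b, y in rows:
        by_a[a].append(y)
        by_b[b].append(y)
        by_cell[(a, b)].append(y)
    aligned = []
    for a, b, y in rows:
        cell = mean(by_cell[(a, b)])
        residual = y - cell
        estimates = {
            "A": mean(by_a[a]) - grand,
            "B": mean(by_b[b]) - grand,
            "AxB": cell - mean(by_a[a]) - mean(by_b[b]) + grand,
        }
        # Rounding guards against spurious tie-breaking from float noise.
        aligned.append(round(residual + estimates[effect], 9))
    return midranks(aligned)


# Hypothetical data: (attention condition, number of objects, percentage).
rows = [("neutral", 2, 70), ("neutral", 2, 65), ("neutral", 4, 72), ("neutral", 4, 60),
        ("cue", 2, 55), ("cue", 2, 58), ("cue", 4, 52), ("cue", 4, 61)]
ranks_attention = art_ranks(rows, "A")
```

A separate set of ranks is produced for each main effect and for the interaction, and only the effect for which the data were aligned is interpreted in the corresponding model.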

Figure 2b shows the averaged performance values for the secondary task. Participants either reported the cue color or discriminated the color change of the fixation square. A series of one-sided one-sample permutation tests (sampling permutation distribution 5k) were performed for each condition on the accuracy values to assess whether accuracies across conditions and number of objects were greater than 75%. The results showed that for all the conditions the percentage values were significantly higher than 75% (Holm-corrected comparisons; cue color: p_adj = .0012, p_adj = .002, p_adj = .002; fixation color: p_adj = .0116, p_adj = .0036, p_adj = .021, for 2, 4, and 8 objects, respectively). For accuracy values, residuals were not normally distributed (W = 0.94, p = .0024). Therefore, we again used the ART with a linear mixed model. The analysis revealed only a significant effect of the attention condition, F(1, 55) = 80, p < .001 (number of objects: F(2, 55) = 0.047, p = .95; interaction between attention condition and number of objects: F(2, 55) = 0.16, p = .85). Overall, the accuracy values suggest that observers attended to the cue location or fixation target and performed the secondary task according to the instructions.

Discussion

The auditory time interval effects on perceived speed were present in all conditions, and the results did not indicate a significant effect of the number of moving objects. Given that the effects of auditory time intervals have been mostly described as each click altering the perceived timing of each apparent motion frame, these findings suggest that the timing of a single auditory click can drive the timing of more than one object presented in the visual field. There was a significant main effect of attention. Based on the hypothesis that there is a limited capacity for the number of visual events that can be bound to a single auditory event, we expected to have higher percentage values (Fig. 2a) for the cued condition, in which observers attended to a single moving object. However, compared with the neutral condition, the auditory time interval effects were significantly lower in the cued condition. More importantly, these results revealed a significant effect of perceptual load/attention demands in the visual field. In the fixation condition, we diverted attention to a stationary object (i.e., fixation target) during the presentation of each apparent motion. Based on previous research (Alsius et al., 2005; Ren et al., 2020; Ren et al., 2021), we expected a decrease in the amount of audiovisual interactions and hence lower percentage values in this condition compared with the neutral condition. In line with this original prediction, the percentage values for the fixation condition were significantly lower than those of the neutral condition.

Experiment 2

Contrary to the original prediction, a spatial cue did not enhance auditory time interval effects on perceived visual speed in the previous experiment. Spatial attention was manipulated in a goal-driven manner (Theeuwes & Failing, 2020) by using a static cue and introducing a secondary task relevant to the cue. Although the participants were instructed carefully, it is still conceivable that they might have allocated their attention to the cue itself rather than to the moving object at the cued location. Moreover, the high perceptual load due to the color discrimination followed by the speed comparison in a dual-task paradigm might have overshadowed any potential cueing effects in the spatial domain. For instance, having a secondary task on cue color (i.e., an object other than the moving stimuli) might have decreased audiovisual interactions. This decrease might have canceled out any enhancement due to cueing and allocation of attention at the specific location of the moving object. Hence, the spatial cue, together with a secondary task, might not efficiently modulate temporal ventriloquism effects on perceived speed. To address these concerns and restrict the contribution of other confounding factors, we re-examined a potential modulatory role of spatial cueing in a control experiment, using a simplified experimental procedure without a secondary task.