A byproduct of the ongoing medialization of our everyday life is that we are confronted with a growing overload of stimuli in our environment competing for our attention. For example, if we browse a link in order to acquire specific information, not only the intended content appears but additional windows with advertisements also pop up, or special latest offers are announced in sidebars. In order to be able to interact optimally with our environment, it is increasingly important to be able to distinguish between our actually intended effects (e.g., selecting the relevant information on one specific webpage) and all the stimuli that were not directly elicited by our intentional action (i.e., external stimuli, e.g., distracting advertisements). This ability to distinguish action effects from external stimuli enables us to learn action-effect relations for future goal-directed interactions with our environment (e.g., to quickly succeed in finding the necessary information on a webpage).

A basic phenomenon that goes along with our ability to distinguish between own action effects and external stimuli is a characteristic bias in time perception of these events. Recent studies (e.g., Haggard, 2005; Haggard, Aschersleben, Gehrke, & Prinz, 2002; Haggard, Clark, & Kalogeras, 2002; Ruess, Thomaschke, Haering, Wenke, & Kiesel, 2017; Ruess, Thomaschke, & Kiesel, 2017b) show that time perception differs substantially for own action effects and other stimuli. The perceived time point of own action effects is shifted toward the causing action, thereby being perceived earlier compared to the perceived time point of stimuli that are not caused by own actions (i.e., external stimuli). This phenomenon of biased time perception is typically referred to as temporal or intentional binding (IB; e.g., Haggard, Aschersleben, et al., 2002). It serves as an important implicit indicator of sense of agency (e.g., Moore, 2016; Moore, Wegner, & Haggard, 2009; Ruess, Thomaschke, & Kiesel, 2017a; but see, e.g., Buehner, 2012; see General Discussion).

Studies investigating IB traditionally employ the so-called clock paradigm (Haggard, Clark, et al., 2002). In this paradigm, participants see a rotating clock hand and press a key that causes an effect tone. Afterwards the participants are asked to estimate the position of the rotating clock hand at the moment when they heard the tone (experimental condition). This time point estimate is compared to the time point estimate of a tone in a condition in which a tone is presented without any causing action of the participant (baseline condition). The clock paradigm has been employed in many different studies, showing on the one hand the robustness of the phenomenon, and on the other hand its modulation by different influencing factors (e.g., Moore & Haggard, 2008; Moore & Obhi, 2012).

One main factor influencing IB established in various studies (e.g., Haggard, Clark, et al., 2002; Humphreys & Buehner, 2009; Ruess, Thomaschke, Haering, et al., 2017; Ruess, Thomaschke, & Kiesel, 2017b) is the delay between the action and the resulting tone (i.e., the delay duration). On the one hand, IB has been shown to decrease for more delayed (i.e., later) action effects (e.g., Haggard, Clark, et al., 2002). On the other hand, recent studies found IB to reach a maximum for certain ranges of delay duration (i.e., effects occurring after a delay of about 200 ms to 400 ms). Yet this specific range of effect delay depended on the specific action-effect constellation, for example, whether the effect followed its causing action always after the same, temporally predictable delay, or after different, temporally unpredictable delays (Ruess, Thomaschke, & Kiesel, 2017b). Consequently, IB seems to be a very fine-tuned bias that depends on specific characteristics of action and effect.

Given the large amount of research on the phenomenon of IB (for a review, see Moore & Obhi, 2012), it is surprising that almost all previous studies employed auditory action effects when assessing IB with the clock paradigm (e.g., Barlas & Obhi, 2013; Capozzi, Becchio, Garbarini, Savazzi, & Pia, 2016; Haggard, 2017; Obhi & Hall, 2011; Saito, Takahata, Murai, & Takahashi, 2015). Referring to our initial example, however, it becomes obvious that in our everyday life we are also confronted with innumerable visual stimuli. Many actions aim at producing visual effects and, especially due to increasing medialization, it is of great importance to be able to distinguish not only auditory effects from external tone stimuli but also visual action effects from external visual stimuli. Consider, for example, reading a selected webpage while distracting latest offers are announced in sidebars.

Some studies presented visual action effects by employing measurement methods other than the clock paradigm (e.g., Buehner, 2012; Fereday & Buehner, 2017; Zhao, Chen, Yan, & Fu, 2013; see General Discussion). Yet the dominant method in the field is the clock paradigm. To our knowledge, only one study, by Moretto, Walsh, and Haggard (2011), used the clock paradigm to investigate how we perceive the time point of visual action effects compared to the time point of action-unrelated visual stimuli. Yet that study (Moretto et al., 2011) focused on emotional influence on IB and presented some effect pictures that were of high emotional intensity (e.g., dead bodies). As IB has been shown to be strongly influenced by the emotional content of the action effects (e.g., Yoshie & Haggard, 2013), an assessment of the influence of stimulus modality on IB requires investigating visual IB with emotionally neutral visual effects.

A few previous studies modified the so-called clock paradigm to allow for presentation of visual instead of auditory stimuli during clock presentation (e.g., Bratzke, Bryce, & Seifried-Dübon, 2014; Carlson, Hogendoorn, & Verstraten, 2006; Yabe & Goodale, 2015). Yet, these studies did not investigate IB but rather dual tasking, visual attention, or reactions to stimuli.

In the present study, we modified the clock paradigm along the lines of these studies. Instead of the participants’ action causing the occurrence of an auditory tone, the participants’ keypress caused the clock face to change its color for a short duration (150 ms). Participants were asked to estimate the position of the clock hand when the color change occurred (experimental condition). This time point estimate was compared to a time point estimate of a condition in which the color change of the clock face was not caused by the participant’s action.

Furthermore, motivated by the vast number of studies pointing out the relevance of the delay duration for the magnitude of IB (e.g., Haggard, Clark, et al., 2002; Humphreys & Buehner, 2009; Ruess, Thomaschke, Haering, et al., 2017; Ruess, Thomaschke, & Kiesel, 2017b; Wen, Yamashita, & Asama, 2015), we employed five different delay durations (150 ms, 250 ms, 350 ms, 450 ms, and 650 ms). For auditory effect tones, there is preliminary evidence that IB becomes especially strong for effects presented about 200 ms to 400 ms after action execution (Ruess, Thomaschke, & Kiesel, 2017b). Yet the impact of delay duration on IB of visual effects might be different.

In the present study, we employed five different delay durations for three different reasons: First, with only one, single delay duration, we might accidentally have targeted an action-effect delay at which visual IB was minimal. Second, using five different delay durations allowed us to investigate the temporal dynamics of a potential visual IB. And third, we aimed to compare the temporal dynamics of the IB of visual effects to the already established temporal dynamics of the IB of auditory effects (e.g., Haggard, Clark, et al., 2002; Ruess, Thomaschke, & Kiesel, 2017b). For this, we conducted a second experiment with the same five delay durations but employing the clock paradigm in its classic version by presenting auditory action effect tones.

Experiment 1: Visual effects

In Experiment 1, participants saw a rotating clock hand, and, in the experimental condition, they were asked to press one of two possible response keys at a freely chosen point in time. We chose this procedure because stronger IB in conditions with multiple action alternatives compared to only one fixed action has been reported in previous studies (e.g., Barlas, Hockley, & Obhi, 2017a, 2017b; Barlas & Obhi, 2013). Each key press contingently produced one of two possible effects; that is, the clock face changed its color for 150 ms either to red or to green. This visual effect occurred after a delay of either 150 ms, 250 ms, 350 ms, 450 ms, or 650 ms (varying block-wise). In the baseline condition, the visual stimulus (i.e., color change of the clock face) was presented without preceding action. In both conditions, participants were asked to estimate the respective points in time when the clock color had changed in relation to the position of the rotating clock hand. Finally, IB was computed as the difference of mean estimates in experimental (i.e., with causing action) compared to baseline (i.e., without causing action) conditions (see Fig. 1; cf. Haggard, Clark, et al., 2002).

Fig. 1

Clock paradigm (Haggard, Clark, et al., 2002) with visual action effect (Experiment 1). In the experimental condition, participants saw a rotating clock hand and were asked to press a key. A visual effect (i.e., color change of the clock face in Experiment 1, or in Experiment 2 an auditory effect, i.e., tone occurrence) followed participants’ key presses. In the baseline condition, the visual color change (or tone occurrence in Experiment 2) was presented without preceding action. After the clock hand stopped, participants estimated the position of the clock hand at onset of the visual stimulus (or auditory stimulus in Experiment 2). Intentional binding (IB) was calculated as the difference between the mean estimates in experimental and baseline conditions

Method

Participants

Forty-eight participants (38 females, mean age = 25.73 years, SD = 6.71, range: 19–57 years, 44 right-handed) were tested and received 8 Euros or course credit for compensation. We assumed IB to be of medium effect size. In order to detect IB with a reasonable power (1 − β = .9), we would have needed at least 44 participants. Thus, we had planned to collect data of at least 44 participants. Because some more participants had signed up for the study, we finally collected the data of 51 participants. Yet, data of three participants were excluded due to technical problems or failure to follow the instructions, resulting in the data of 48 participants that were analyzed. Prerequisites for participating were normal or corrected-to-normal vision, no dyschromatopsia, no red-green blindness or colorblindness, and normal or corrected-to-normal auditory perception.
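The sample-size reasoning above can be approximated as follows. This is only a rough sketch and not the authors' actual calculation: it assumes a two-sided one-sample t test of IB against zero, α = .05, and Cohen's d = 0.5 as the "medium" effect, none of which is stated explicitly in the text. It also replaces the t distribution with a normal approximation, and the function name `approx_n` is hypothetical.

```python
import math
from statistics import NormalDist

def approx_n(d, alpha=0.05, power=0.90):
    """Normal-approximation sample size for a two-sided one-sample test.

    d      assumed standardized effect size (Cohen's d)
    alpha  assumed two-sided significance level
    power  desired power (1 - beta)
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the test
    z_beta = z.inv_cdf(power)             # quantile matching the target power
    return math.ceil(((z_alpha + z_beta) / d) ** 2)

n_required = approx_n(0.5)  # assumed medium effect, power = .90
```

Under these assumptions the approximation gives 43 participants. An exact calculation based on the t distribution yields a slightly larger number, which would be compatible with the minimum of 44 reported above.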

Apparatus and stimuli

The experiment was run using the E-Prime 2.0 software (Schneider, Eschmann, & Zuccolotto, 2012), and it was presented on a standard PC with a 24-in. LCD screen (1024 pixels × 768 pixels, 144-Hz refresh rate). We employed a slight modification of the classic version of the clock paradigm (Haggard, Clark, et al., 2002) with the so-called Libet Clock (Libet, Gleason, Wright, & Pearl, 1983; Wundt, 1887). We presented a visual display of an analogue clock with 12 marked “minute” intervals and a clock hand that revolved over the dial at a continuous pace (diameter 6.1 cm; clock hand 2.2 cm; 2,560 ms/full rotation; see Fig. 1; for a new open source tool see, e.g., Garaizar, Cubillas, & Matute, 2016). In the experimental conditions, the keys “D” and “F” of a standard computer keyboard were pressed with the index and the middle finger of the left hand (i.e., as action). The participants entered their time estimates with their right hand using the number pad of the keyboard (1–9) and confirmed by pressing the space or backspace buttons, thereby triggering the next trial. As visual effect stimuli, the clock face changed colors for 150 ms, from black color to red (i.e., RGB 255, 0, 0) or green (i.e., RGB 0, 255, 0). In the experimental conditions, the two colors of the visual effect were contingently mapped to one of the two response keys and the mapping was counterbalanced across participants.

Procedure

The Libet Clock (Libet et al., 1983; Wundt, 1887) served as the reference for participants’ time estimates (see Fig. 1) and was presented at trial start. The clock hand immediately started to rotate from a random position. In the experimental conditions, participants were instructed to wait until the clock hand had revolved at least once before they were free to press one of the two possible response keys at a freely chosen point in time. They were instructed not to press at a pre-planned point in time or clock position and to randomly choose which key to press, merely trying to press both keys roughly equally often. This key press (i.e., the action) was followed by the effect (i.e., the visual color change of the clock face) after a delay duration of either 150 ms, 250 ms, 350 ms, 450 ms, or 650 ms. The different delay durations were realized in separate blocks; thus, within a block, the occurrence of the effect was temporally predictable. We decided to employ 650 ms as the longest delay duration (instead of 550 ms) because this delay duration of 650 ms was also employed in the study by Haggard, Clark, and Kalogeras (2002), where they investigated the influence of delay duration on IB of auditory effects. Further, we restricted our design to only five delay durations, thereby omitting the 550 ms delay duration, because we intended to limit the duration of the entire experiment to 1 hour, in order to prevent fatigue or attentional changes throughout the experiment that might influence the results. Thus, the first four delay durations were spaced in 100 ms steps (i.e., 150 ms, 250 ms, 350 ms, and 450 ms); only the last step, between the 450 ms and 650 ms delay durations, was 200 ms.

In the baseline condition, no action was required, and one of the two visual stimuli (i.e., clock color change to red or green) was presented randomly 2,560 ms to 5,120 ms after the trial had started. In both conditions, the clock hand disappeared 2,000 ms to 3,000 ms after the visual color change, and a prompt to estimate the point in time of this color change appeared on the screen. Participants were then asked to estimate the position of the clock hand at the moment of visual stimulus onset, that is, of the color change (in minutes 1–60).

At the beginning of the experiment, for each participant, there was a training phase of five training trials for both the baseline and for the experimental condition (one trial for each of the five delay durations, respectively). Afterwards, the main experiment started with a baseline block (visual stimulus occurring without preceding action), followed by five experimental blocks (action producing the visual effect; each block with a different delay duration), and finished with another baseline block. The order of the five delay durations was randomized between participants. Each of the two baseline blocks consisted of 20 trials (20 trials × 2 blocks = 40 trials), and the experimental blocks consisted of 40 trials (40 trials per delay × 5 blocks = 200 trials overall). In total, the experiment lasted about 1 hour.
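The block structure described above can be sketched as follows. This is an illustrative reconstruction, not the E-Prime script used in the study; the function name `build_schedule` and the tuple layout are hypothetical, and the training trials are omitted.

```python
import random

DELAYS_MS = [150, 250, 350, 450, 650]   # action-effect delays, one per block
BASELINE_TRIALS = 20                    # trials per baseline block
EXPERIMENTAL_TRIALS = 40                # trials per experimental block

def build_schedule(rng):
    """Return (block_type, delay_ms, n_trials) tuples for one participant."""
    delays = DELAYS_MS[:]
    rng.shuffle(delays)  # delay order is randomized between participants
    schedule = [("baseline", None, BASELINE_TRIALS)]
    schedule += [("experimental", d, EXPERIMENTAL_TRIALS) for d in delays]
    schedule.append(("baseline", None, BASELINE_TRIALS))
    return schedule

schedule = build_schedule(random.Random(1))  # seed chosen arbitrarily
total_trials = sum(n for _, _, n in schedule)
```

Each schedule contains seven blocks and 240 trials in total (40 baseline plus 200 experimental), matching the counts given in the text.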

Data analysis

The differences between estimated and actual position of the clock hand at effect occurrence were computed trial-wise for each participant. These differences were averaged separately for each of the five delay duration blocks of the experimental condition and for the baseline condition. The angle differences were transformed into temporal differences (angle difference × 2,560 ms/60) and trials in which this difference deviated more than ±2.5 SD from the participant’s mean difference in the respective condition were excluded from analysis (on average, 1.78% for each participant in Experiment 1; for a similar procedure, see Ruess, Thomaschke, & Kiesel, 2017b). Finally, IB was calculated as the difference between the mean shift of the perceived point in time of the effect in the baseline in comparison to the experimental condition (separately for all five delay durations). We computed this difference in such a way that measures had positive values when IB occurred; thus, results are reported as baseline minus experimental condition values. Consequently, positive values for the IB effect mean that the time point of the effect was temporally perceived earlier than it actually occurred, that is, shifted toward the action (p level of .05 for all results). The data of two participants were excluded due to a significant deviation of their IB from the mean IB of all participants regarding at least one of the delay durations (Tukey, 1977). Greenhouse-Geisser-corrected statistics are reported, where appropriate. Please note that the same baseline condition was employed as comparison for all five delay duration conditions, similar as in previous studies investigating IB for different conditions (e.g., Haggard, Clark, et al., 2002; Walsh & Haggard, 2013).
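The analysis steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' analysis code: all function names are hypothetical, the wrap-around handling of estimates that cross the 60/0 mark is an assumption the text does not describe, and the outlier rule for whole participants (Tukey, 1977) is not included.

```python
from statistics import mean, stdev

MS_PER_MINUTE = 2560 / 60  # one clock "minute" at 2,560 ms per rotation

def minute_error_to_ms(estimated, actual):
    """Signed estimation error in ms (negative = judged earlier than actual).

    Assumption: errors are wrapped to the shorter way around the dial,
    so an estimate of 59 for an actual position of 1 counts as -2 minutes.
    """
    diff = (estimated - actual + 30) % 60 - 30
    return diff * MS_PER_MINUTE

def trim(errors_ms, k=2.5):
    """Drop trials deviating more than k SDs from the condition mean."""
    m, s = mean(errors_ms), stdev(errors_ms)
    return [e for e in errors_ms if abs(e - m) <= k * s]

def intentional_binding(baseline_ms, experimental_ms):
    """IB = baseline minus experimental mean error; positive values
    indicate that the effect was perceived earlier after an action."""
    return mean(trim(baseline_ms)) - mean(trim(experimental_ms))
```

In use, `intentional_binding` would be called once per participant and delay duration, each time with the same baseline errors, mirroring the shared baseline described above.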

Results

A one-way within-subjects ANOVA on IB, with the factor delay duration, revealed that IB decreased for longer delay durations, F(4, 180) = 8.60, p < .001, η2p = .16 (see Fig. 2).

Fig. 2

Intentional binding (IB) for the effect depending on effect delay (150 ms, 250 ms, 350 ms, 450 ms, and 650 ms) for Experiment 1, with visual effects and Experiment 2 with auditory effects. IB for the effects is depicted on the y-axis with positive values (compare Method for calculation of IB). Error bars represent standard errors

To assess whether IB for the five different delay durations was significantly different from zero, we conducted separate t tests (see Table 1 in the Appendix). We found significant IB for all delay durations, 150 ms: t(45) = 8.17, M = 53 ms, SE = 6.55, p < .001, d = 1.19; 250 ms: t(45) = 5.54, M = 43 ms, SE = 7.69, p < .001, d = 0.82; 350 ms: t(45) = 4.59, M = 36 ms, SE = 7.76, p < .001, d = 0.68; 450 ms: t(45) = 2.84, M = 26 ms, SE = 9.04, p = .007, d = 0.42; except for the 650 ms delay duration: t(45) = 1.50, M = 11 ms, SE = 7.67, p = .141, d = 0.21.

Single contrast comparisons showed that IB was significantly different between the 150 ms and 650 ms delay duration, MDiff(150 vs. 650 ms) = 41.97, SEDiff(150 vs. 650 ms) = 8.34, p < .001; between the 250 ms and 650 ms delay duration, MDiff(250 vs. 650 ms) = 31.10, SEDiff(250 vs. 650 ms) = 8.04, p = .004; and between the 350 ms and 650 ms delay duration, MDiff(350 vs. 650 ms) = 24.15, SEDiff(350 vs. 650 ms) = 5.61, p = .001. Additionally, IB was marginally different between the 150 ms and 450 ms delay duration, MDiff(150 vs. 450 ms) = 27.75, SEDiff(150 vs. 450 ms) = 10.26, p = .096. The IB differences of all other delay durations were not significant (see Footnote 1).

Experiment 2: Auditory effects

In Experiment 1, significant IB was observed for visual action effects, and the magnitude of IB depended on the delay duration between the participants’ key press and the visual effect. For longer delay durations, visual IB decreased. In Experiment 2, auditory effects instead of visual effects were employed in order to investigate IB of the same five different delay durations as in Experiment 1 (i.e., 150 ms, 250 ms, 350 ms, 450 ms, and 650 ms). Thus, instead of a color change of the clock face (Experiment 1; i.e., visual effect), an auditory effect tone was presented in Experiment 2. In this classic version of the clock paradigm (Haggard, Clark, et al., 2002), participants were asked to estimate the position of the rotating clock hand when they heard the onset of the tone.

Method

Participants

Fifty-three participants (44 females, mean age = 26.28 years, SD = 8.04, range: 18–63 years, 49 right-handed) were tested and received 8 Euros or course credit for compensation. Prerequisite for participating was normal or corrected-to-normal auditory perception. We intended to collect data of only 48 participants, like in Experiment 1; however, more participants were invited and tested to compensate for potential nonappearance. Thus, in total, 53 individuals participated in Experiment 2.

Apparatus, stimuli, procedure, and data analysis

Apparatus, stimuli, procedure, and data analysis were identical to Experiment 1, except the change of the modality of the action effect (in experimental conditions) and of the externally caused stimulus (in the baseline condition). In Experiment 2, two sinusoidal tones (400 Hz or 800 Hz) were presented for 150 ms by Auna Base DJ 10014216 headphones (due to technical problems, 10 participants had to use different headphones, i.e., Auna ANC-10 10028682). For all participants, the effect tones were mapped contingently to the two response keys in a SMARC compatible manner (Mudd, 1963), that is, left key (i.e., “D”) to 400 Hz, and right key (i.e., “F”) to 800 Hz.

In Experiment 2, for each participant, on average, 2.19% of all trials were excluded from analysis because the temporal differences of the actual and estimated time point of the effect deviated more than ±2.5 SD from the participant’s mean difference in the respective condition (see the Data Analysis section in Experiment 1). Additionally, the data of two participants were excluded due to a significant deviation of their IB from the mean IB of all participants regarding at least one of the delay durations (Tukey, 1977).

Results

A one-way within-subjects ANOVA on IB, with the factor delay duration, revealed that IB decreased for longer delay durations, F(4, 200) = 6.04, p < .001, η2p = .11 (see Fig. 2). To assess whether IB for the five different delay durations was significantly different from zero, we conducted separate t tests (see Table 1 in the Appendix). We found significant IB for all delay durations, 150 ms: t(50) = 12.02, M = 83 ms, SE = 6.93, p < .001, d = 1.68; 250 ms: t(50) = 9.80, M = 79 ms, SE = 8.02, p < .001, d = 1.38; 350 ms: t(50) = 7.24, M = 75 ms, SE = 10.32, p < .001, d = 1.02; 450 ms: t(50) = 6.71, M = 72 ms, SE = 10.68, p < .001, d = 0.94; 650 ms: t(50) = 5.97, M = 44 ms, SE = 7.37, p < .001, d = 0.84.

Single contrast comparisons showed that IB was significantly different between the 150 ms and 650 ms delay duration, MDiff(150 vs. 650 ms) = 39.33, SEDiff(150 vs. 650 ms) = 9.59, p = .002; between the 250 ms and 650 ms delay duration, MDiff(250 vs. 650 ms) = 34.63, SEDiff(250 vs. 650 ms) = 8.65, p = .002; between the 350 ms and 650 ms delay duration, MDiff(350 vs. 650 ms) = 30.75, SEDiff(350 vs. 650 ms) = 7.81, p = .003; and between the 450 ms and 650 ms delay duration, MDiff(450 vs. 650 ms) = 27.72, SEDiff(450 vs. 650 ms) = 6.75, p = .001. The IB differences of all other delay durations were not significant.

A between-subjects comparison of IB in Experiment 1 and Experiment 2 (mixed ANOVA, with experiment as between-subjects factor and the within-subjects factor delay duration) revealed significantly stronger IB in Experiment 2 with auditory effects than in Experiment 1 with visual effects, F(1, 95) = 16.03, p < .001, η2p = .14. Again, the factor delay duration influenced IB significantly, F(4, 380) = 13.46, p < .001, η2p = .12, yet there was no interaction of Experiment × delay duration, F(4, 380) = .56, p > .250, η2p = .01.
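As a consistency check, the reported partial eta squared values can be recovered from the F statistics and their degrees of freedom using the standard identity η2p = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the values reported in the text; the function name and the dictionary labels are hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# (F, df_effect, df_error) as reported in the Results sections
reported_tests = {
    "Exp. 1, delay duration": (8.60, 4, 180),
    "Exp. 2, delay duration": (6.04, 4, 200),
    "Mixed ANOVA, experiment": (16.03, 1, 95),
    "Mixed ANOVA, delay duration": (13.46, 4, 380),
    "Mixed ANOVA, interaction": (0.56, 4, 380),
}

effect_sizes = {
    label: round(partial_eta_squared(*values), 2)
    for label, values in reported_tests.items()
}
```

Rounded to two decimals, the recomputed values are .16, .11, .14, .12, and .01, which agree with the effect sizes reported for the corresponding tests.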

General discussion

Recent studies have shown that we perceive the time point of an effect caused by an own action earlier compared to the time point of an external stimulus (e.g., Haering & Kiesel, 2014; Haggard, Poonian, & Walsh, 2009; Obhi & Hall, 2011; Wolpe, Haggard, Siebner, & Rowe, 2013; Wolpe & Rowe, 2014). Yet, to our knowledge, in all previous studies using the clock paradigm to assess IB, the action effects were presented auditorily (but see Moretto et al., 2011; see the Introduction), that is, a tone elicited by the participant’s action was perceived earlier compared to a tone without any eliciting action of the participant (e.g., Ruess, Thomaschke, & Kiesel, 2017b). In the present study, we investigated whether the perceived time point of a visual action effect is shifted toward the causing action. Additionally, we investigated how IB differs for visual action effects (Experiment 1) in comparison to auditory action effects (Experiment 2). Employing a slightly modified version of the so-called clock paradigm (Haggard, Clark, et al., 2002), we found that, like the IB of auditory action effects, the perceived time point of visual action effects is shifted toward the causing action. Additionally, IB of visual action effects depended on the delay duration of the visual effect, with stronger IB for earlier compared to later visual action effects. This was similar to the IB of auditory action effects, with stronger IB for earlier compared to later auditory action effects. Yet, overall IB was weaker for visual (Experiment 1) in comparison to auditory (Experiment 2) action effects.

The modification of the classic version of the clock paradigm in order to be able to investigate visual IB, and the observed IB for visual action effects in the sense of perceiving own visual action effects earlier compared to external visual stimuli, is an interesting, novel finding. It extends the usability of the clock paradigm and allows for investigation of many subsequent research questions about the temporal perception of visual action effects. This is important because in our everyday life, we are confronted also with innumerable visual stimuli. Especially, for comparing IB of visual and auditory effects and the still debatable mechanisms that might underlie IB (e.g., Fereday & Buehner, 2017; Moore & Obhi, 2012; Wenke & Haggard, 2009) the modified clock paradigm with visual action effects offers new possibilities for future research. An example might be the investigation of common, effect-unspecific and modality-unspecific in comparison to effect-specific and modality-specific underlying mechanisms of IB.

Our results showed that the magnitude of IB for visual effects depends on the delay duration of the effects, with weaker IB for later compared to earlier visual action effects. This is in line with previous research investigating IB with the classic version of the clock paradigm employing auditory action effects (Haggard, Clark, et al., 2002; Ruess, Thomaschke, Haering, et al., 2017; Ruess, Thomaschke, & Kiesel, 2017b). Interestingly, however, by looking at single contrast comparisons for the five different delay durations, we found that the magnitude of IB differed only for delay durations that differed by at least 300 ms (150 ms vs. 650 ms, 250 ms vs. 650 ms, 350 ms vs. 650 ms, and 150 ms vs. 450 ms). These results might indicate some resolution constraints of our subjective time perception mechanisms for visual stimuli. Future research is needed to investigate this issue in more detail.

Comparable to the influence of delay duration on the magnitude of IB of visual effects, the IB of auditory effects in Experiment 2 also depended on the delay duration, with weaker IB for later auditory action effects. On the one hand, this is in line with Haggard, Clark, et al.’s (2002) and Ruess, Thomaschke, Haering, et al.’s (2017) studies showing weaker IB for later auditory action effects. On the other hand, a recent study indicated that IB might initially increase for very short delay durations (between delay durations of about 100 ms to 250 ms) and decrease for longer delay durations (Ruess, Thomaschke, & Kiesel, 2017b). In the present study, we merely observed weaker IB for longer delay durations. Yet, somewhat comparable to the IB of visual effects, the single contrast comparisons for auditory action effects revealed significantly different IB only for each of the shorter delay durations in comparison to the longest delay duration (150 ms vs. 650 ms, 250 ms vs. 650 ms, 350 ms vs. 650 ms, 450 ms vs. 650 ms), whereas the magnitude of IB of effects after the different shorter delay durations (150 ms, 250 ms, 350 ms, 450 ms) was not significantly different. Thus, these null findings concerning differences in the magnitude of IB for short delay durations do not offer further insights with regard to a potential increase of IB with delay duration in the range of very short delay durations (100 ms to 250 ms) as it was found in a previous study (Ruess, Thomaschke, & Kiesel, 2017b).

Comparing the IB of visual effects (Experiment 1) with the IB of auditory effects (Experiment 2), our results showed that the perceived time point of visual effects is shifted less toward the causing action compared to the perceived time point of auditory effects. IB is often employed as a measure for sense of agency (e.g., Moore, 2016; Moore, Wegner, & Haggard, 2009; Ruess, Thomaschke, & Kiesel, 2017a). Thus, our results might be, cautiously, interpreted in terms of a weaker sense of agency for visual compared to auditory action effects. This may yield important implications for human-machine interfaces. For example, important feedback we receive if we do not succeed in causing an intended effect should be auditory rather than visual. Furthermore, it might be generally helpful for visual action effects, for which it is of great importance to have a strong experience of sense of agency, to add an auditory signal.

However, it is not yet clear from previous literature how close the link between IB and sense of agency is. There have been at least some previous findings casting doubt on direct conclusions from IB on sense of agency (e.g., Buehner, 2012; Buehner & Humphreys, 2009; Dewey & Knoblich, 2014). For example, Buehner (2012) reported that intentional actions are not necessary for IB, showing binding not only in conditions with intentional agents that caused the effects but also in conditions with purely mechanical causes of effects (i.e., a machine that caused the effects), implying a more general causal binding. Thus, future studies are needed to clarify whether the observed differences in IB for the visual and auditory modality go along with differences in the sense of agency for these modalities.

The reasons for the weaker IB of visual action effects compared to the IB of auditory action effects are not yet identified. One possible reason might be related to separate timing systems for visual and auditory stimuli and different speeds of the visual in comparison to the auditory pacemaker (e.g., Wearden, Edwards, Fakhir, & Percival, 1998). An alternative possibility is that the perceived intensity of the visual color change of the clock face might have been different compared to the perceived intensity of the auditory effect tone. If this intensity difference of visual and auditory stimuli was influenced by the preceding action in our experimental conditions, this might explain the weaker IB of visual in comparison to auditory effects.

Yet, instead of reflecting weaker IB for visual in comparison to auditory action effects, the results might, alternatively, be due to differences between the modified and the classic version of the clock paradigm. The two versions differ in that the visual clock paradigm (Experiment 1) employs the same modality for reference object (i.e., visual clock face) and action effect (visual color change of clock face). In contrast, the auditory clock paradigm (Experiment 2) employs different modalities for reference object (i.e., visual clock face) and action effect (auditory effect occurrence). Possibly, this might change the salience of the same-modality stimulus (in the visual clock paradigm) compared to the different-modality stimulus (in the auditory clock paradigm). Such a difference in salience depending on the modality of the effect stimulus in comparison to the reference object might be influenced differently by the eliciting action in the experimental conditions, thereby resulting in the different magnitudes of IB. Consequently, an additional version of the paradigm would be helpful, in which the reference object is auditory (e.g., Repp, 2011) and effects of both visual and auditory modality are presented.

It might be more difficult to accurately estimate the clock hand’s position at color change when the color change is less obvious, allowing for a stronger bias by IB. In the current version of the experiment, the visual IB might have been weaker than the auditory IB because the close perceptual proximity between color change and clock face might have made the temporal comparison so direct that it could hardly be biased in the action conditions. Thus, another interesting modification of the visual clock paradigm would be to limit the visual color change to only an inner part of the clock face (e.g., the clock hand) instead of changing the color of the whole clock face (i.e., as in our Experiment 1).

Another potential extension of our study would be the replication of these findings in a direct within-subjects comparison between auditory and visual IB. In such a design, the IB of visual and auditory effects could be investigated in separate blocks, and, additionally, multimodal (visuo-auditory) effect blocks or blocks with intermixed trials of both effect modalities could be employed. Investigating such multimodal action effects would be particularly interesting because of their high ecological validity for real-life action effects. It may offer important insights as to whether there is a modality-specific superiority effect on IB. In particular, such an investigation could provide further clues as to why IB was stronger for auditory compared to visual effects.

Some studies have investigated IB with methods other than the so-called clock paradigm (e.g., Engbert, Wohlschläger, & Haggard, 2008; Humphreys & Buehner, 2009; Nolden, Haering, & Kiesel, 2012; Wen et al., 2015). The most prominent alternative method does not ask for the perceived time points of action effects (i.e., as in the clock paradigm) but for the perceived duration of the delay between action and effect (e.g., Humphreys & Buehner, 2009). These delay duration estimates are compared to delay duration estimates in conditions where the participant is stimulated passively, and this passive stimulation is followed by an effect. Studies employing this delay duration method showed IB in the sense that the delay between an action and the effect it causes was underestimated (i.e., perceived as shorter) compared to a delay of similar duration between a passive stimulation and an external stimulus (e.g., Humphreys & Buehner, 2009; Nolden et al., 2012).

Although the majority of these studies with duration estimates employed auditory effects, some studies investigated the perceived delay duration between action execution and visual effects (e.g., Fereday & Buehner, 2017; Haering & Kiesel, 2014; Nolden et al., 2012; Zhao et al., 2013). These studies also found an underestimation of the delay duration between an action and a visual effect. Thus, some existing research already indicates that visual action effects are subject to IB. However, the two research methods (i.e., time point estimates vs. delay duration estimates) are influenced by, and can be explained by, completely different underlying mechanisms (e.g., Wenke & Haggard, 2009). Whereas time point estimates could be explained by a subjective shift of the time point of the effect occurrence, duration estimates could be explained by both a shift of the subjective time point of effect occurrence and a slowing of the inner clock after action execution. Such a slowing of the inner clock would, comparable to a shift of the time point of effect occurrence, lead to an underestimation of a delay that we caused by our action in comparison to a delay that we did not cause by our action. In fact, recent research indicates that IB can differ depending on which of the two measurement methods is employed, that is, time point or delay duration estimates (for a discussion, see, e.g., Ruess, Thomaschke, & Kiesel, 2017b).
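
The distinction between the two mechanisms can be illustrated with a deliberately simplified toy model. The following sketch is not an implementation from any of the cited studies; the function names, the multiplicative clock-rate assumption, and all numerical values (a 250 ms delay, a 25 ms shift, a clock rate of 0.9) are hypothetical and chosen only for illustration. It shows that a shift of the perceived effect time and a slowing of the inner clock can yield the same underestimation of delay duration, whereas only the shift changes the time point estimate.

```python
def perceived_delay(actual_delay_ms, effect_shift_ms=0.0, clock_rate=1.0):
    """Perceived action-effect delay in a toy model (illustrative assumption).

    effect_shift_ms: how much earlier the effect is perceived (hypothetical).
    clock_rate: inner-clock rate relative to baseline; values below 1.0
                represent a slowed clock after action execution (hypothetical).
    """
    return (actual_delay_ms - effect_shift_ms) * clock_rate


def perceived_effect_time(actual_effect_time_ms, effect_shift_ms=0.0):
    """Perceived time point of the effect; in this toy model it depends
    only on the shift, not on the clock rate."""
    return actual_effect_time_ms - effect_shift_ms


ACTUAL_DELAY_MS = 250.0  # hypothetical action-effect delay

# Delay duration estimates under three hypothetical scenarios.
baseline = perceived_delay(ACTUAL_DELAY_MS)
shift_only = perceived_delay(ACTUAL_DELAY_MS, effect_shift_ms=25.0)
slowing_only = perceived_delay(ACTUAL_DELAY_MS, clock_rate=0.9)

# Time point estimates: the effect occurs 250 ms after the action.
time_point_baseline = perceived_effect_time(ACTUAL_DELAY_MS)
time_point_shifted = perceived_effect_time(ACTUAL_DELAY_MS, effect_shift_ms=25.0)

print("delay estimates:", baseline, shift_only, round(slowing_only, 1))
print("time point estimates:", time_point_baseline, time_point_shifted)
```

In this sketch, both the shift-only and the slowing-only scenario shorten the perceived delay by the same amount, so a delay duration estimate alone could not tell them apart; the time point estimate, by contrast, changes only when the effect itself is shifted.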

Consequently, evidence for IB measured as duration estimates between action and visual effects does not automatically imply that IB would also occur for time point estimates of visual effects. Yet, especially for investigating the influence of effect modality on IB, the measure of the subjective time point of the effect is of special interest. Thus, it is an important, novel finding that not merely the delay duration between action and visual effect but also the subjective time point of the visual effect itself is subject to IB.

Another way to investigate IB is the effect anticipation method (e.g., Buehner & Humphreys, 2009). Buehner (2012) employed this paradigm and presented LED flashes as visual effects. The LED flashes were either caused by an own action (i.e., self-causal condition) or by a machine that was initialized by the participants (i.e., machine-causal condition), or they were simply preceded by another predicting (but not causing) visual stimulus (i.e., baseline condition). Participants were asked to press a button at the moment they expected the visual effect to occur. Buehner (2012) observed an earlier prediction for a visual effect in the self-causal and in the machine-causal conditions compared with the baseline condition. Thus, these results on the binding of visual effects are in line with our results. Yet the anticipation paradigm employed by Buehner (2012) may be based on very different mechanisms compared with the clock paradigm we employed, in which we asked for time point estimates of effect occurrence. In the effect anticipation paradigm, preparation and temporal updating of time estimation are required, whereas the clock paradigm is based on perceiving the effect stimulus and the clock hand’s position in order to retrospectively estimate the time point of effect occurrence. Again, it seems that the measurement method used to assess IB has to be considered when interpreting IB results, as different underlying mechanisms may be involved in different IB paradigms. Thus, our results extend previous knowledge by showing that the time point of a visual effect is also estimated earlier when investigated with the clock paradigm.

Taken together, our results offer a first indication that, as with auditory effects, the perceived time point of visual action effects is shifted toward the causing action (i.e., visual IB). This shift of the subjective time point of visual action effects depends on the delay duration of the effect. As for auditory action effects, earlier visual action effects showed stronger IB compared to later action effects. Yet, overall, the IB of visual effects was weaker compared to the IB of auditory effects.