Humans interact with their environment in at least two different ways, according to whether the pattern of behaviour is selected by the agents themselves, or specified by external information.

One type of action is primarily performed in response to environmental demands. Here, the selection of the action is already implied by external information. For example, we usually stop at a crossing when the traffic light turns red. Such actions are considered to be under stimulus control. Arbitrary stimulus–action mappings can be established through learning (e.g., Logan, 1988; Thorndike, 1911), or by instructions which specify an intentional set (e.g., Brass, Wenke, Spengler, & Waszak, 2009; De Houwer, Beckers, Vandorpe, & Custers, 2005; Waszak, Wenke, & Brass, 2008; Wenke, Gaschler, & Nattkemper, 2007; Wenke, Gaschler, Nattkemper, & Frensch, 2009). Once stimulus–response (S–R) associations are established, external stimuli trigger their associated re-actions more or less automatically (Hommel, 2000; Woodworth, 1938).

In another type of action, agents themselves select an appropriate action. Their selection normally aims to produce specific environmental effects (Hommel, Müsseler, Aschersleben, & Prinz, 2001; Lotze, 1852; Prinz, 1997), or to satisfy particular needs (Skinner, 1953). For example, someone who remembers a friend’s birthday may select a specific present with the intention of making the friend happy. This class of actions has two distinguishing psychological characteristics: they involve internal generation of the action, and they involve mental representation of the effects of the action.

The brain regions involved in these two types of actions are dissociable at least in part (Goldberg, 1985; also see Waszak, Wascher, Keller, Koch, Aschersleben, & Rosenbaum, 2005). Responding to external stimuli predominantly recruits a circuit involving the parietal lobes and the lateral premotor areas (PMA). In contrast, internal action generation more heavily relies on fronto-median brain areas including the pre-supplementary motor area (pre-SMA) and cingulate motor areas (CMA). Interestingly, different brain systems are not only involved in internal versus external decisions on what to do (Cunnington, Windischberger, Robinson, & Moser, 2006; Dirnberger, Fickel, Lindinger, Lang, & Jahanshahi, 1998; Lau, Rogers, & Passingham, 2006; Müller, Brass, Waszak, & Prinz, 2007), but also in self-initiated versus externally triggered timing of actions (i.e., in the decision regarding when to act; Cunnington, Windischberger, Deecke, & Moser, 2002; Deiber, Manabu, Ibanez, Sadato, & Hallet, 1999; Jahanshahi, Jenkins, Brown, Marsden, Passingham & Brooks, 1995; Jenkins, Jahanshahi, Jueptner, Passingham & Brooks, 2000; Krieghoff, Brass, Prinz, & Waszak, 2009).

The intentional and reactive routes to action necessarily converge on a single motor execution system, which Sherrington (1906) termed the ‘final common path’. Thus, both pre-SMA and premotor cortex initiate movements by their projections to the primary motor cortex. Intentionally selected and externally specified actions therefore have the same kinematics (Jahanshahi et al., 1995). However, surprisingly little is known about how the mode of selection affects the subjective experience of action. Specifically, only a few studies (Haggard, Aschersleben, Gehrke, & Prinz, 2002a; Haggard, Clark, & Kalogeras, 2002b; Haggard & Clark, 2003; Repp & Knoblich, 2007; Sebanz & Lackner, 2007) directly investigated how the mode of action selection affects the experience of being in control of one’s actions and their effects: the so-called sense of agency. These studies appear to be generally consistent with the intuitive prediction that intentional selection of an action produces a more pronounced sense of agency than external specification of the same action.

We measured the experience of action using a temporal binding effect. Temporal binding refers to the finding that actions are perceived to occur later, and their effects (tones) earlier, when voluntary actions are performed in an operant context than in control conditions where the time of actions or tones is judged in isolation (Haggard et al., 2002b). That is, operant actions and their effects are attracted towards each other in perceived time. This temporal attraction effect cannot be explained by mere contiguity-based association of events, because it was absent when two sensory events (i.e., two successive tones) or two motor events (i.e., two successive keypresses) occurred separated by a similar interval (Haggard et al., 2002a). Further, these temporal shifts were reversed when tones followed TMS-induced passive movements rather than voluntary actions (Haggard et al., 2002b). Instead, temporal binding seems specific to intentional actions, and has therefore also been termed “intentional binding” (Haggard et al., 2002b).

Aims of the study

So far, most of the studies on action experience (e.g., Haggard et al., 2002b; Haggard & Clark, 2003) manipulated “voluntariness” by comparing active self-initiated movements and passive involuntary movements. Therefore, they do not allow any conclusions regarding how the mode of action selection affects action experience. The current experiment addresses this issue directly by studying active movements under various conditions of action selection within the same task context.

Most previous studies investigating the functional and neuronal basis of selection mode compared internally generated and stimulus-driven action with respect to either action selection (what-dimension; e.g., Herwig, Prinz, & Waszak, 2007; Müller et al., 2007) or action timing (when-dimension; e.g., Jahanshahi et al., 1995). In our study, by contrast, both dimensions varied simultaneously and independently. Participants performed either a left or a right key press (action choice) in either the first or the second of two designated intervals (action timing). Participants made left or right keypresses, either in response to a specific cue, or as they freely chose. Moreover, the time of each keypress could either be explicitly cued to occur in one of two intervals, or subjects freely chose in which interval to act. The task is based on the paradigm introduced by Krieghoff et al. (2009), and is summarized in Fig. 1.

Fig. 1

Outline of the procedure. When the clock hand starts rotating, a symbolic cue signals which action to perform during the first or the second interval (shaded areas on the clock). The first character of the cue specifies the what-dimension, and the second character the when-dimension. In the example, participants are supposed to press the left key during an interval of their choosing (external choice and internal timing). In the operant movement and operant tone conditions, participants’ movements were contingently followed by one of four tones after 250 ms. Participants either judged the onset time of their movements (operant movement) or of the tone (operant tone). In the respective baseline conditions, participants judged either movement or tone onset in isolation. Binding scores were determined by subtracting judgement errors in the baseline conditions from judgement errors in the operant conditions

Limiting the manipulation of timing selection to the choice between two alternative intervals maximizes comparability of the what- and the when-dimensions. However, in our task participants always internally determined the time point for initiating action within the designated interval, independent of whether they internally chose the specific interval, or whether the interval was specified by the cue. Therefore, this task does not compare completely free with completely fixed selections, but instead systematically varies the degree of internal generation required across the two conditions.

Each keypress was followed by a tone. We used four different tones that were contingently mapped to the specific combination of the what and when parameters of an action (i.e., a left keypress during the first interval evoked tone-1, a left keypress during the second interval tone-2, etc.). Note that these action–effect mappings remained the same whether the selection of action was internally generated or externally triggered. That is, each combination of what- and when-selection produced a specific outcome, allowing us to directly compare how mode of selection influences effect prediction. Temporal binding was assessed as in previous studies (Haggard et al., 2002b). Participants monitored a rotating clock hand (Libet, Gleason, Wright, & Pearl, 1983) while making a movement and/or hearing a tone. They judged the time of movement or tone onset in separate blocks. Comparing the perceived time of movements in blocks where they evoked tones with blocks without tones provides a measure of binding of actions to effects. Similarly, the perceived time of tones in blocks where the participant evoked tones through their actions was compared with blocks where tones occurred without action to measure the binding of effects to actions (see “Methods”; Fig. 1).

This design allowed us to compare several hypotheses regarding how selection mode might affect temporal binding. These hypotheses are summarized in Fig. 2. On one hypothesis (Hypothesis 1), binding should be more pronounced for internally selected than for stimulus-based actions. This view reflects the folk notion of “free will”, which links the sense of control to free selection between alternative actions. If the overall degree of internality of selection determines binding independent of dimension, then most binding should be observed for completely internal actions, least binding for completely stimulus-cued actions, and intermediate levels of binding in mixed selection conditions where either action identity or action timing is specified by the cue, but not both (Fig. 2, 1a). Our design also allowed us to compare whether selecting what action to perform, or when to perform it had the greater effect on action experience. If binding depends on internal selection of what action to perform (e.g., Herwig et al., 2007; Herwig & Waszak, 2009; but see Hommel et al., 2001, for a review of findings that demonstrated action–effect binding for externally chosen actions), then we should observe more binding when participants can freely choose between action alternatives than when the cue specifies the action (Fig. 2, 1b). Conversely, if temporal binding primarily depends on internal timing of actions (e.g., Haggard et al., 2002b), then binding should be more pronounced in the internally than the externally timed conditions (Fig. 2, 1c).

Fig. 2

Illustration of the predictions according to the different hypotheses. The x-axis shows the different selection conditions, the y-axis the expected amount of binding. 1 Internality matters: a Degree of internality matters, b Internal choice matters, c Internal timing matters. 2 Selection mode compatibility matters. See text for details

An alternative “selection compatibility” hypothesis (see Fig. 2, 2) predicts that temporal binding might depend on a common mode of selection for what- and when-information. That is, pronounced binding should occur when the same route to action (i.e., either the internally generated route or the stimulus-driven route) controls both action selection and action timing. By the same token, one should observe less temporal binding when the output of the two systems has to be combined in the mixed mode conditions. This possibility is supported by previous reports of interference between internally timed and stimulus-triggered action (Astor-Jack & Haggard, 2005; Obhi & Haggard, 2004). For example, the reaction time to make a simple response to an external trigger stimulus is increased when the internal action system has prepared to make the same action. Such interference does not occur when the same system specifies all parameters of an action.

Methods



Twenty-five paid subjects (mean age 28.7 years, SD = 11.01; 10 females) participated in the experiment with local ethics committee approval.

Design and procedure

The procedure of the experiment was based on Haggard et al. (2002b) and is summarized in Fig. 1.

Participants rested their left and right index fingers on the response keys (F4 and F9 keys, respectively). They faced a 17 in. screen, showing a small clock face marked at 5 ‘minute’ intervals (Libet et al., 1983; see Haggard et al., 2002b, for details). The two response intervals were signalled by shaded areas on the clock face, each covering 16 ‘minutes’ of the clock face.

The experimenter initiated rotation of a single clock hand (length 12 mm), which moved clockwise with a period of 2,560 ms. When the clock hand started to move, a cue was presented simultaneously above and below the clock in 36 pt Arial font for 1,000 ms. The cue consisted of two symbols separated by a blank space. The first symbol indicated that a left response (L), or that a right response (R) was required, or that participants were free to decide whether to make a left or right response (?). The second symbol signalled in which interval the action should be made. ‘1’ and ‘2’ signalled a response during the first and the second interval of the second clock rotation, respectively, whereas a ‘?’ left the choice of the response interval to the participant. There were thus four factorially arranged conditions which differed in how the keypress action was selected, and how the time of action was selected. That is, the selection of action alternative could be internal (internal choice) or stimulus-based (external choice). Similarly, the time of action could be selected internally (internal timing) or stimulus-based (external timing).
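The mapping from a cue string to one of the four selection conditions can be sketched as follows. This is only an illustration: the function name `parse_cue`, the `Selection` tuple, and the string labels are assumptions made for the example, not part of the original experiment software.

```python
from typing import NamedTuple, Optional


class Selection(NamedTuple):
    choice_mode: str         # "internal" or "external" selection of the key
    timing_mode: str         # "internal" or "external" selection of the interval
    key: Optional[str]       # "left"/"right" if cued, None if freely chosen
    interval: Optional[int]  # 1 or 2 if cued, None if freely chosen


def parse_cue(cue: str) -> Selection:
    """Map a two-symbol cue such as 'L ?' onto a selection condition."""
    what, when = cue.split()
    keys = {"L": "left", "R": "right", "?": None}
    intervals = {"1": 1, "2": 2, "?": None}
    if what not in keys or when not in intervals:
        raise ValueError(f"unknown cue: {cue!r}")
    return Selection(
        choice_mode="internal" if what == "?" else "external",
        timing_mode="internal" if when == "?" else "external",
        key=keys[what],
        interval=intervals[when],
    )
```

For instance, `parse_cue("L ?")` describes the condition shown in Fig. 1: the key is cued (external choice) while the interval is left to the participant (internal timing).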

Depending on condition, participants pressed the left or the right key during the first or second interval and/or heard one of four tones presented via loudspeakers connected to the PC. All tones were of 60 ms duration and equal amplitude. However, they differed regarding pitch (450 and 590 Hz), and whether or not white noise was added to the tone. Pitch was mapped to the what-dimension (i.e., left or right keypress), and the noise dimension was associated with action timing (first or second interval). The assignment of tones to specific hand–interval combinations was constant for a given subject throughout the experiment, but was counterbalanced across participants.
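A minimal sketch of this action–tone mapping, assuming a simple two-parameter counterbalancing scheme. The function `make_tone_map` and its arguments are hypothetical; the text gives only the two pitches and the pitch/noise-to-dimension assignment, not how the assignment was rotated across participants.

```python
from itertools import product

PITCHES_HZ = (450, 590)  # the two pitches reported in the text


def make_tone_map(pitch_for_left: int = 450, noise_in_first: bool = False) -> dict:
    """Return {(key, interval): (pitch_hz, has_noise)} for one participant.

    The two arguments stand in for the counterbalancing across participants;
    the exact scheme used in the experiment is an assumption here.
    """
    if pitch_for_left not in PITCHES_HZ:
        raise ValueError("pitch_for_left must be 450 or 590")
    pitch_for_right = PITCHES_HZ[1] if pitch_for_left == PITCHES_HZ[0] else PITCHES_HZ[0]
    pitch = {"left": pitch_for_left, "right": pitch_for_right}    # what-dimension
    noise = {1: noise_in_first, 2: not noise_in_first}            # when-dimension
    return {
        (key, interval): (pitch[key], noise[interval])
        for key, interval in product(("left", "right"), (1, 2))
    }
```

Because pitch depends only on the key and noise only on the interval, each of the four key–interval combinations yields a distinct tone.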

The clock hand continued to rotate for a randomly chosen period between 1,500 and 2,500 ms after the subject made their action or heard the auditory tone. On experimental trials, subjects verbally reported the position of the clock hand at which they experienced one of two designated events once the clock had stopped (see Fig. 1). Subjects were encouraged to make their reports as accurate as possible, using unmarked and fractional clock positions as required. The events judged were the time of keypress movement or the onset time of the auditory tone. Participants were told before each block which event to judge.

Each participant performed four different blocks of 80 trials each. Each block contained an equal number of trials in each of the four selection conditions (internal choice and internal timing, internal choice and external timing, external choice and internal timing, external choice and external timing). Each action alternative and each action interval was required equally often in externally specified trials. Trials within a block were presented in random order.
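One way to construct such a block is sketched below. The function `build_block` is hypothetical, and the balancing of keys and intervals within the fully cued trials (five trials per key–interval pair) is an assumption consistent with, but not stated in, the description above.

```python
import random


def build_block(n_trials: int = 80, seed=None) -> list:
    """Return a shuffled list of cue strings with equal numbers per condition."""
    if n_trials % 16:
        raise ValueError("n_trials must be divisible by 16 for full balancing")
    per_condition = n_trials // 4
    cues = (
        ["? ?"] * per_condition                                   # internal choice, internal timing
        + [f"? {i}" for i in "12"] * (per_condition // 2)         # internal choice, external timing
        + [f"{k} ?" for k in "LR"] * (per_condition // 2)         # external choice, internal timing
        + [f"{k} {i}" for k in "LR" for i in "12"] * (per_condition // 4)  # both cued
    )
    random.Random(seed).shuffle(cues)  # random trial order within the block
    return cues
```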

Two of these four blocks are termed operant blocks, because subjects’ keypresses caused a tone to occur after a delay of 250 ms. The identity of the tone depended on the key and the interval participants actually selected. On trials where the subject was cued to make one action, but made another in error, the tone depended on the action, not on the cue. Subjects judged the time of the keypress in one of these operant blocks and judged the time of tone onset in the other (see Fig. 1).

The remaining two blocks were baseline blocks designed to assess the temporal experience of actions and their effects when these occurred in isolation. Baseline conditions were identical to operant conditions in every way other than that each event (movements, tones) occurred in isolation. In one baseline condition, subjects made key-presses that were not followed by tones. They were instructed to judge the perceived time of action (see Fig. 1). In a second baseline block, they made no action, but judged the onset of a tone (see Fig. 1). Tones in the latter occurred at random latencies approximating the timing of subjects’ actions in the different choice conditions. In stimulus-triggered trials of the baseline block for tone judgements they always heard the tone that was signalled by the cue. In those trials with non-informative cues indicating internal selection (‘?’) each tone appeared equally often. Importantly, each baseline condition involved either actions or their sensory effects, but never the two in conjunction.

Order of blocks was counterbalanced across subjects with the following constraints: (1) The two blocks requiring the same event to be judged (keypress or tone) always followed each other. (2) The order of conditions (baseline, operant) was the same for both judgements for a given subject.
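The two constraints leave only a small set of admissible block orders, which can be enumerated directly. The sketch below is illustrative; the block labels and the function `valid_orders` are assumptions introduced for the example.

```python
from itertools import permutations

# Each block is a (judged event, condition) pair.
BLOCKS = [(j, c) for j in ("movement", "tone") for c in ("baseline", "operant")]


def valid_orders() -> list:
    """Enumerate block orders that satisfy constraints (1) and (2)."""
    orders = []
    for order in permutations(BLOCKS):
        judgements = [j for j, _ in order]
        conditions = [c for _, c in order]
        # (1) blocks with the same judged event are adjacent
        same_judgement_adjacent = (
            judgements[0] == judgements[1] and judgements[2] == judgements[3]
        )
        # (2) baseline/operant order is identical for both judged events
        same_condition_order = conditions[:2] == conditions[2:]
        if same_judgement_adjacent and same_condition_order:
            orders.append(order)
    return orders
```

Under this reading of the constraints, four orders remain: two possible orders of the judged event crossed with two possible orders of baseline and operant blocks.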

Prior to the time judgment tasks participants worked through a practice block of 32 trials containing 8 trials of each of the four selection modes. Practice trials differed from the time judgment task in the following ways: (a) Participants were instructed to find out which action produced which tone; (b) Participants only experienced the operant context in which their keypresses caused a tone; (c) They received visual error feedback. The error feedback was presented after the tone and informed subjects whether they had pressed the wrong key and whether they had pressed during the wrong interval on stimulus-based trials, or whether they had missed both intervals on any kind of trial; (d) They did not have to judge the timing of any events. They were informed that the practice block would be followed by a memory test for the action-tone assignment.

A memory test followed the practice block in order to ensure that participants had learned the mappings between cues, actions and tones. The memory test consisted of two sections in which each of the four tones was presented twice. Participants either indicated verbally whether a given tone was produced by a left or right key press action, or they indicated whether it was associated with pressing during the first versus second interval. The order of these judgements was randomized. Participants passed the memory test when they made no more than one error per section. If they failed, they were given two further opportunities to repeat the practice and the memory test.

Data analysis

Our analyses focussed on the judged time of events. For each event, the judgement error was defined as the difference between the time that the event was reported to occur using the clock, and the actual time at which it occurred. A positive judgement error indicates a delay in event perception, while a negative judgement error indicates anticipation. First, we determined judgement errors for each combination of the four selection conditions (internal choice/internal timing etc.) and the two events judged (movement, tone). Data are shown separately for operant and baseline blocks in Table 1.
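As a worked illustration of this definition, clock positions can be converted to milliseconds using the 2,560 ms rotation period. The sketch assumes a 60-‘minute’ clock face and wraps differences across the 60/0 boundary; the wrap-around handling and the function name `judgement_error_ms` are assumptions, not details given in the text.

```python
CLOCK_PERIOD_MS = 2560          # rotation period of the clock hand
MINUTES_PER_ROTATION = 60       # assumed: standard 60-'minute' clock face
MS_PER_MINUTE = CLOCK_PERIOD_MS / MINUTES_PER_ROTATION


def judgement_error_ms(reported_min: float, actual_min: float) -> float:
    """Reported minus actual clock position, in ms.

    Positive values mean the event was perceived late, negative values
    mean it was anticipated. Differences are wrapped into [-30, 30) minutes.
    """
    half = MINUTES_PER_ROTATION / 2
    diff = (reported_min - actual_min + half) % MINUTES_PER_ROTATION - half
    return diff * MS_PER_MINUTE
```

With these constants one clock ‘minute’ corresponds to about 42.7 ms, so a report three ‘minutes’ after the true position is a judgement error of 128 ms.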

Table 1 Mean judgement errors (standard error across subjects), in milliseconds, as a function of mode of action choice and action timing

We then subtracted each participant’s mean judgment error in each baseline condition from her/his mean judgment error for the same event in the corresponding operant conditions. This gives a measure of binding, for each combination of selection and event.
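The subtraction can be sketched as below. The judgement errors in the example are invented purely for illustration, and the function name `binding_score` is hypothetical.

```python
from statistics import mean


def binding_score(operant_errors_ms, baseline_errors_ms) -> float:
    """Mean operant judgement error minus mean baseline judgement error (ms)."""
    return mean(operant_errors_ms) - mean(baseline_errors_ms)


# Hypothetical judgement errors (ms) for one participant and one selection condition.
movement_shift = binding_score([30, 10, 20], [-10, 0, -5])   # 20 - (-5) = 25
tone_shift = binding_score([-50, -30, -40], [10, 20, 0])     # -40 - 10 = -50
```

In this invented example the action is perceived 25 ms later and the tone 50 ms earlier in the operant context, the pattern expected under temporal binding.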


Results

Four participants (mean age 40.4 years) did not pass the memory test for action-tone assignments after three consecutive runs and were excluded from the experiment. Two further participants made many (>20%) errors on stimulus-based trials during the experiment and were also excluded. Finally, three further subjects were excluded from the analyses because their judgements of event timing using the clock were highly variable from trial to trial. Specifically, the standard deviation of judgements averaged above 100 ms, across all selection conditions and both judged events. The standard deviation of repeated event judgements has been used previously to identify subjects who perform particularly erratically in the cross-modal timing task (Haggard et al., 2002a). Note that this exclusion criterion is independent of both overall mean judgement error, and of differences between conditions in mean judgement error.
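The variability criterion could be applied along the following lines. This is a sketch under assumptions: the text does not say whether the sample or population standard deviation was used (the sample version is used here), and the function `is_erratic` and the data layout are hypothetical.

```python
from statistics import mean, stdev

SD_CUTOFF_MS = 100  # threshold reported in the text


def is_erratic(errors_by_cell: dict) -> bool:
    """Return True if the mean within-cell SD of judgement errors exceeds the cutoff.

    errors_by_cell maps (selection condition, judged event) to a list of
    single-trial judgement errors in ms for one participant.
    """
    cell_sds = [stdev(errors) for errors in errors_by_cell.values()]
    return mean(cell_sds) > SD_CUTOFF_MS
```

Because the criterion uses only trial-to-trial spread, shifting all of a participant's judgements by a constant leaves the outcome unchanged.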

The data of the remaining 16 subjects (6 females, mean age 24.3 years, SD = 4.5) were processed as described above. Only “correct” trials were analysed. That is, we excluded a small number of trials in which subjects failed to respond with the keypress or at the time specified by the cue (1.0%). We also excluded trials in action conditions in which participants failed to press a key during the second rotation of the clock (0.9% of all trials).

On internal selection trials participants chose the right key on 63% of the trials, and the left key on 37%, suggesting that they preferred to press the key with their dominant hand. This bias was similar for the judgements in the baseline condition for movements (62%) as well as the operant conditions for movements (65%) and tones (63%). The first and second intervals were internally chosen on 67 and 33% of the trials, respectively, possibly indicating a boredom-related bias against waiting until the second interval. Again, this bias did not differ between baseline movement trials (68%), operant movement trials (69%), and operant tone trials (64%).

Our analyses focussed on (a) binding scores, and (b) on judgement errors in the baseline conditions (see Table 1). For both types of analyses we collapsed the data across left and right hands and across first and second intervals.

Binding scores

Binding scores were analysed in two complementary ways. First, we conducted ANOVAs on binding scores for each judgement type (see “Mean Shifts” in Table 1) in order to establish the overall pattern of results and to ensure that the same pattern held for movement and tone judgements. Second, we set up contrasts for overall binding scores (i.e., the sum of the absolute values of the binding scores in the 2 judgement conditions). Although the contrast analyses correspond to a subset of the data in the ANOVA, we report them because they allow the various hypotheses to be compared directly using comparison contrasts (see below).

Analyses of variance

The omnibus ANOVA with judgement type (movement, tone), mode of action choice (internal choice, external choice), and mode of when-selection (internal timing, external timing) as within-subjects factors yielded a significant main effect of judgement, F(1,15) = 45.68, p < 0.01, MSE = 6,237.14. This arose because the shifts for the two judgements were in opposite directions, in line with the classic binding effect. The perceived time of actions was later (i.e., closer to tones) in operant blocks than in baseline blocks, while the perceived time of tones was earlier (i.e., closer to actions) in operant blocks than in baseline blocks. The only other significant effect was a three-way interaction between the event judged, the mode of action choice, and the mode of action timing on judgement, F(1,15) = 12.99, p < 0.01, MSE = 351.79. Post hoc tests showed that overall binding for completely stimulus-based actions did not differ from binding for entirely internally specified actions, t(15) = 0.18, p > 0.8. Similarly, binding in the two mixed selection mode conditions did not differ from each other, t(15) = −0.27, p > 0.7, suggesting that the interaction arose because binding was stronger when both the action alternative and action timing were selected in the same way, either internally or stimulus-based, than when the two selections were made in different ways. This is explored more formally in the contrast analyses. The main effects of mode of action selection and mode of timing selection, and the two-way interactions did not reach significance (all Fs < 1, p > 0.5).

To explore the three-way interaction further, we performed separate ANOVAs for each judgment type. Note that the overall binding effect corresponds to a positive shift in event judgements for action, and a negative shift for tones. Therefore, the effect of selection condition on binding should operate in different directions for the two events. Indeed, the pattern of binding was essentially the same for movement and tone judgements. The interactions between mode of action choice and mode of action timing were F(1,15) = 7.32, p < 0.05, MSE = 234.03, for judgements of movements, and F(1,15) = 5.00, p < 0.05, MSE = 588.07, for judgements of tones. The main effects of selection mode of action choice and of action timing were again not significant (all Fs < 1, p > 0.5).

Contrast analyses

Contrasts were designed to test specific hypotheses about how mode of selection influences experience of operant action (see Fig. 2). The contrast coefficients used to test the alternative hypotheses are shown in Table 2, and Fig. 3 shows the overall binding scores. Hypothesis 1a (degree of internality matters; see Fig. 2, 1a) was modelled by a linear trend. Alternatively, if binding primarily reflects internal choice (Hypothesis 1b; see Fig. 2, 1b), or internal action timing (Hypothesis 1c; Fig. 2, 1c), then the corresponding main effect contrasts should best explain the data. Finally, if selection mode compatibility (Hypothesis 2; Fig. 2, 2) determines binding, then a quadratic trend across selection conditions should best account for the data.

Table 2 Standardized coefficients for the contrast analyses
Fig. 3

Mean overall binding scores (sum of absolute values of shifts for movements and tones in operant conditions) as a function of selection mode (error bars represent standard errors across individuals)

Comparison contrasts that directly tested the different hypotheses against each other were specified by subtracting the standardized coefficients of the contrasts included in the comparison (Rosenthal, Rosnow, & Rubin, 2000; see Table 2). The squared correlation between the coefficients and the means for each choice condition furthermore indicates how much variance can be explained by a given trend.
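The logic of these contrasts can be sketched as follows. Note the assumptions: the raw coefficients below are plausible choices for the four hypotheses but are not taken from Table 2, the condition order is assumed, the standardization (z-scoring the weights) is one common reading of Rosenthal et al. (2000), and the data are invented. The within-subject contrast test is computed as the squared one-sample t on each subject's contrast score, giving F(1, n − 1).

```python
from math import sqrt
from statistics import mean, pstdev, stdev

# Assumed condition order: internal/internal, internal choice + external timing,
# external choice + internal timing, external/external.
RAW = {
    "linear (1a)": [1, 0, 0, -1],
    "choice (1b)": [1, 1, -1, -1],
    "timing (1c)": [1, -1, 1, -1],
    "quadratic (2)": [1, -1, -1, 1],
}


def standardize(weights):
    """z-score contrast weights (mean 0, population SD 1)."""
    m, s = mean(weights), pstdev(weights)
    return [(w - m) / s for w in weights]


def r_squared(weights, condition_means):
    """Squared Pearson correlation between weights and condition means."""
    mw, mx = mean(weights), mean(condition_means)
    cov = sum((w - mw) * (x - mx) for w, x in zip(weights, condition_means))
    ss_w = sum((w - mw) ** 2 for w in weights)
    ss_x = sum((x - mx) ** 2 for x in condition_means)
    return cov ** 2 / (ss_w * ss_x)


def contrast_F(weights, data):
    """F(1, n-1) for a within-subject contrast: squared t on subject contrast scores."""
    scores = [sum(w * x for w, x in zip(weights, row)) for row in data]
    t = mean(scores) / (stdev(scores) / sqrt(len(scores)))
    return t ** 2


def comparison(weights_a, weights_b):
    """Comparison contrast: difference of the standardized weights."""
    return [a - b for a, b in zip(standardize(weights_a), standardize(weights_b))]


# Invented overall binding scores (ms): one row per subject, one column per condition.
data = [[52, 31, 29, 50], [48, 28, 33, 49], [55, 35, 30, 51], [45, 26, 28, 50]]
condition_means = [mean(col) for col in zip(*data)]
quad_vs_linear = comparison(RAW["quadratic (2)"], RAW["linear (1a)"])
```

In this invented data set binding is high in the two pure conditions and low in the two mixed ones, so the quadratic weights correlate perfectly with the condition means while the linear weights do not correlate at all; `contrast_F(quad_vs_linear, data)` then tests whether the quadratic pattern fits better than the linear one.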

As indicated by the amount of binding across selection conditions in Fig. 3, and consistent with the patterns of interaction in the omnibus ANOVA, only the quadratic trend testing Hypothesis 2 (selection mode compatibility influences binding; Fig. 2, 2) was significant, F(1,15) = 12.56, p < 0.01. This accounted for a large proportion (r² = 0.97) of variance between conditions. The linear trend testing Hypothesis 1a (more internal selection causes more binding, independent of dimensions; Fig. 2, 1a) did not reach significance, F(1,15) < 1, and did not explain much variance, r² = 0.0035. The main effect contrasts testing Hypotheses 1b (mode of action choice influences binding; Fig. 2, 1b), F(1,15) < 1, and 1c (mode of action timing influences binding; Fig. 2, 1c), F(1,15) < 1, did not account for much of the variance in the data either (both r² < 0.022).

Importantly, the comparison contrasts revealed that the fit of the quadratic trend is significantly better than that of the linear trend, F(1,15) = 5.55, p < 0.05. It was also better than the main effect contrasts, for mode of action selection F(1,15) = 7.11, p < 0.05, and mode of timing selection, F(1,15) = 4.55, p < 0.05.

Judgement errors in baseline conditions

The different trial types used cue strings that differed in information content. Different effect tones could be predicted from the informative cues in the stimulus-based conditions, but not from the non-informative cues characteristic for the internal selection conditions. In particular, information in the cue might encourage binding between cue and action (i.e., S–R binding) and binding between cue and effect. Therefore, we wanted to assess whether differences between the cues, rather than any differences in action-related binding processes, could explain the pattern of results across conditions. To this end we ran ANOVAs on baseline judgement errors (see Table 1). Repeated measures ANOVAs with factors of mode of action choice and mode of action timing showed no significant effects, either on judgements of action in baseline blocks (all Fs < 1.5, p > 0.25), or on judgements of tones in baseline blocks (all Fs < 2.9, p > 0.11). In short, there was no indication that the information content of the cue encouraged either cue-action binding or cue-tone binding in the baseline conditions of our experiment. Therefore, the binding found in the operant conditions presumably reflects the influence of action–tone associations on experience, rather than influences of the cues on action experience or tone experience directly.


Discussion

Our results showed equally strong temporal binding for entirely internally generated actions and for completely stimulus-based actions. In contrast, temporal binding was reduced in the two mixed selection conditions in which just one dimension (either what action to perform or when to perform it) was internally chosen while the other was externally cued. This pattern of results was symmetrical for judgements of actions and of tones. It was moreover specific to action–effect binding, since no systematic differences occurred in baseline blocks in which actions or tones were judged in isolation. In short, we found an effect of selection mode compatibility in temporal binding. Specifically, temporal binding does not reflect internal generation per se (Hypotheses 1), but occurs whenever both action selection and action timing parameters of an action are specified by the same system, either internal or stimulus based (Hypothesis 2; see Forstmann et al., 2008, for similar findings regarding task rule selection).

Internal selection and temporal binding

Our results do not support a special status of internal selection for action experience. We observed neither a linear increase in temporal binding with increasing internality of selection, nor a main effect of mode of action selection or mode of timing selection (see Fig. 2, 1). Instead, our findings revealed a clear quadratic trend, indicating an effect of selection mode compatibility, as outlined above.

These results suggest that action–effect binding is not a direct consequence of the internal generation of voluntary actions. Previous studies (Haggard et al., 2002b; Haggard & Clark, 2003) comparing voluntary self-initiated actions with passive involuntary movements followed by the same perceptual effects did not find binding between passive movements and their perceptual consequences. However, the stimulus-based movements in the present experiment (external choice, external timing) clearly require premotor preparation and initiation of action, while passive movements do not. The effect in the entirely stimulus-based condition of our study suggests that simple responsive actions can lead to temporal action–effect binding if they involve efferent processes related to preparing and initiating active movements. The origin of the intention to act in a particular way (internal or external) seems of minor importance. If binding is taken as an indirect measure of sense of agency (Synofzik, Vosgerau, & Newen, 2008), our result fits with the intuition that we feel agency for actions and events both when we decide to perform them, and when we are instructed to perform them.

Previous experiments suggested that associations between actions and effects are strong for internally generated actions, whereas stimulus-based actions lead to S–R binding (e.g., Haggard et al., 2002a; Keller, Wascher, Prinz, Waszak, Koch, & Rosenbaum, 2006; Waszak et al., 2005). By contrast, there was no evidence for S–R binding in the stimulus-based conditions of our study: cueing a specific action alternative or its timing in baseline blocks did not influence the perceived time compared to internal generation. Instead, symmetrical temporal binding of movements and tones was observed in the operant conditions of both the completely stimulus driven and the entirely internal selection conditions, suggesting comparable action–effect binding in both selection conditions.

Assuming that the different measures of temporal binding used in the different studies assess comparable aspects of action experience, one possible reason for the diverging results concerns the contingencies between cues, actions, and perceptual action effects in the stimulus-based conditions. For instance, in temporal bisection experiments (e.g., Waszak et al., 2005) particular action alternatives (left and right key presses) were correlated either with the preceding stimulus or with the ensuing effect. That is, actions were contingently paired with the ensuing effect stimuli only in the internal generation condition. In the stimulus-based conditions actions were correlated with the preceding stimuli, but not with their effects. In contrast, in the present experiment, each combination of action alternative and timing interval contingently produced a specific effect, regardless of how the action was selected. Given that action–effect contingency has been shown to strongly affect temporal binding (Moore & Haggard, 2008; Moore, Lagnado, Deal, & Haggard, 2009), the perfect action–effect contingency in this experiment might have encouraged action–effect binding, while bisection experiments might have discouraged it. In sum, the present findings suggest that stimulus-based actions can lead to action–effect binding comparable to the binding that results from entirely internally generated actions, provided that all action parameters are specified by the stimulus-driven system and that specific actions consistently produce particular effects.

Unity of action selection leads to unified action experience

If temporal binding does not reflect internality of choice, what does it capture? We suggest that temporal binding in our experiment reflects the unity of action programming in the brain. On this view, the internal and the stimulus-driven systems not only prepare their “own” actions (Astor-Jack & Haggard, 2005; Obhi & Haggard, 2004), but also generate their own action experience. When the same system specifies both parameters of an action, this leads to a more coherent experience of action, expressed as stronger binding. By contrast, combining information from different systems incurs coordination costs, with a resulting disunity of experience.

Although temporal action–effect binding is quantitatively similar for completely internally generated and entirely stimulus-based actions, the binding could reflect different processes in the two conditions. In particular, binding involves both a preconstructive and a reconstructive component (Moore & Haggard, 2008). Accordingly, both anticipation of the expected effect before it occurs (Greenwald, 1970; Haggard, 2005; Hommel et al., 2001; Prinz, 1997) and retrospective inference based on the fit between actions, experienced effects, and “prior thoughts” (Wegner & Wheatley, 1999; Wegner, 2002) contribute to temporal binding. One possible explanation for our findings is that the different selection conditions in our experiment differentially favour preconstructive and reconstructive binding processes.

On this view, binding between internally generated actions and the effects they consistently produce is primarily driven by effect anticipation (Greenwald, 1970; Haggard, 2005; Hommel et al., 2001; Prinz, 1997). In contrast, binding in the stimulus-based selection conditions may mainly rely on reconstructive processes based on the fit between cues, actions, and experienced effects. If this were the case, the reduced temporal binding effect in the mixed selection mode conditions of the present experiment would reflect coordination costs that arise when preconstructive and reconstructive binding processes have to be combined. Such a dual-mechanism account might also explain why some previous studies did not observe a functional role of action–effect bindings in the control of stimulus-based actions (e.g., Herwig et al., 2007).

Clearly, future experiments will need to address more directly the exact nature of the mechanisms underlying binding between stimulus-based actions and their perceptual consequences.

Temporal binding and sense of agency

Temporal binding has been taken as an implicit measure of sense of agency, the experience that “I did this” (Engbert, Wohlschläger, & Haggard, 2008; Haggard et al., 2002b), linking together voluntary (internally generated) actions and their intended effects by way of efferent predictive processes (Haggard, 2005). The current data constrain this view in two ways. First, the large binding effect when both selection of action and selection of timing are cued suggests that internal selection is not necessary for binding, and, by implication, for agency. In other words, in some situations we feel as much in control of our actions and their consequences when we do what we are told as when we do what we want. This captures the intuition that we retain responsibility for our own actions even when following instructions (Milgram, 1963).

Second, the coordination costs observed in the mixed selection mode conditions of our experiment indicate that efferent processes such as those involved in initiating an action or in predicting the effect of an action may not be sufficient for binding to occur. We found pronounced binding only when both the what- and the when-dimension of an action were prepared by the same system, even though the mixed mode conditions clearly also involved efferent processes. Pronounced temporal binding reflects a “coherent, harmonious ongoing flow of action processing” (Synofzik et al., 2008, p. 228). Such a flow could occur either for internally generated actions or for actions appropriately linked to the external environment. This account could explain why we often do not have a very pronounced sense of agency in everyday situations that require adaptation to external demands (e.g., keeping deadlines) and internal generation (e.g., being creative) at the same time.


Previous research showed that action selection can occur via either internally generated or externally cued routes within the cortical motor systems. Both routes can nevertheless produce similar motor output and similar environmental effects: our experiment shows that a characteristic signature of the experience of action, namely the binding of actions to effects, can equally be produced by either internal or external selection of action parameters. However, the extent of action–effect binding depends on whether or not both the selection of an action alternative and the timing of an action are specified within the same system. The strongest binding was observed for completely stimulus-based and entirely internally generated actions, suggesting that temporal action–effect binding reflects unity of action programming in the human brain.