Adaptation to motor-visual and motor-auditory temporal lags transfer across modalities
Previous research has shown that the timing of a sensory-motor event is recalibrated after brief exposure to delayed feedback of a voluntary action (Stetson et al. 2006). Here, we examined whether it is the sensory or the motor event that is shifted in time. We compared lag adaptation for action-feedback in visuo-motor pairs and audio-motor pairs using an adaptation-test paradigm. Participants were exposed to a constant lag (50 or 150 ms) between their voluntary action (finger tap) and its sensory feedback (flash or tone pip) during an adaptation period (~3 min). Immediately afterwards, they performed a temporal order judgment (TOJ) task about the tap-feedback test stimulus pairings. The modality of the feedback stimulus was either the same as the adapted one (within-modal) or different (cross-modal). The results showed that the point of subjective simultaneity (PSS) was uniformly shifted in the direction of the exposed lag within and across modalities (motor-visual, motor-auditory). This suggests that the temporal recalibration effect (TRE) of sensory-motor events is mainly caused by a shift in the motor component.
Keywords: Adaptation, Temporal recalibration, Voluntary action, Vision, Audition, Temporal order judgment
Studies on multisensory temporal perception have demonstrated that the brain corrects for small temporal asynchronies between the different senses that may arise naturally due to differences in transmission and processing time (Harris et al. 2009; Keetels and Vroomen 2009). Corrections may occur either immediately while a multisensory stimulus is being processed, as demonstrated in ‘temporal ventriloquism’ where an abrupt sound or touch ‘attracts’ the temporal occurrence of a visual flash (Scheier et al. 1999; Morein-Zamir et al. 2003; Vroomen and de Gelder 2004; Vroomen and Keetels 2006, 2009; Keetels et al. 2007; Keetels and Vroomen 2008a), or on a larger time scale reflecting adaptive changes in synchrony perception (i.e., ‘temporal recalibration’; Fujisaki et al. 2004; Vroomen et al. 2004). Temporal recalibration was originally demonstrated between vision and audition, but it has since been reported for other modality pairings as well (visuo-tactile or visuo-motor; Sugita and Suzuki 2003; Navarra et al. 2005; Miyazaki et al. 2006; Keetels and Vroomen 2007, 2008b; Hanson et al. 2008; Takahashi et al. 2008; Vatakis et al. 2008; Haggard and Tsakiris 2009). As an example, Vroomen and Keetels exposed participants for 3 min to sound-first or light-first stimulus pairs (a tone and a flash) presented at ~100–200 ms lags. After this exposure phase to delayed flashes or tones, participants performed a temporal order judgment task (TOJ; “which came first, sound or light?”) or a simultaneity judgment task (“simultaneous or successive?”) about a sound/light test stimulus. The results showed that the point of subjective simultaneity (the PSS, the relative time at which the two stimuli are perceived as maximally simultaneous) was shifted towards the adapted lag.
So, after adaptation to light-first exposure, sound/light stimuli in which the light came slightly before the sound were perceived as synchronous, while after sound-first exposure, sound-first stimuli were perceived as simultaneous.
The mechanism underlying temporal recalibration, though, remains elusive at this point. One option is that only the criterion for simultaneity between the adapted modalities is adjusted. As an example, after exposure to light-first sound/light pairings, participants may change their criterion for audiovisual simultaneity in such a way that light-first stimuli are taken to be simultaneous. On this view, other modality pairings (e.g., vision/touch) would be unaffected and the change in criterion should also not affect unimodal processing of visually and auditorily presented stimuli. Alternatively, it may also be the case that one modality (vision, audition, touch) is ‘shifted’ towards the other, possibly because the sensory threshold for stimulus detection in the adapted modality is changed. For example, in an attempt to perceive simultaneity during light-first exposure, participants might delay processing time in the visual modality by adopting a more stringent criterion for sensory detection. After exposure to light-first audiovisual stimuli, one would then expect slower processing times of visual stimuli in general, and other modality pairings that involve the visual modality, say vision/touch, should also be affected. Since for the audiovisual case it is a common belief that the auditory system codes temporal information more precisely than the visual (Welch 1978), one might expect that after audiovisual lag adaptation there is a shift of vision towards audition. In line with this prediction, Harrar and Harris (2008) indeed observed that the simple reaction time to a light was increased after exposure to light-first audiovisual pairings, whereas simple reaction time to a sound or touch was unaffected by this exposure regime. Possibly, then, participants adopted a more stringent criterion for visual detection after light-first exposure. Others, though, did not observe that the threshold for visual stimuli was adjusted, but rather that of sounds.
For example, Navarra et al. (2009) exposed participants to vision-first audiovisual asynchronies and reported that participants’ simple reaction times to sounds but, critically, not to visual stimuli were changed, possibly because here the criterion for auditory detection was adjusted.
In an attempt to further examine the mechanism underlying temporal recalibration, Hanson et al. (2008) explored whether a ‘supramodal’ (general and modality-independent) mechanism underlies temporal recalibration by examining lag adaptation to audiovisual, audio-tactile and tactile-visual asynchronies. The data showed that a brief period of repeated exposure to ±90 ms asynchrony in any of these pairings resulted in shifts of about 70 ms of the PSS in subsequent TOJ tasks, and that the size of the shifts was similar across the three pairings. This led the authors to conclude that there is a single mechanism underlying temporal recalibration. Different results, though, were reported by Harrar and Harris (2008). They exposed participants for 5 min to ~100 ms lags of light-first stimuli for the audiovisual case, and touch-first stimuli for the auditory-tactile and visual-tactile case. The expected shift of the PSS in the direction of the exposed lag was only found for audiovisual exposure and audiovisual test stimuli, but no shifts (or a shift in the opposite direction) were found for test stimuli presented in other modalities or for audio-tactile and visual-tactile exposure stimuli. These results might lead one to conclude that there was only a change in criterion for audiovisual simultaneity, as other modality pairings were not affected in the predicted direction. Conflicting results, though, were obtained by Di Luca et al. (2007). They exposed participants to asynchronous audiovisual pairs (~200 ms lags of sound-first and light-first) and measured the PSS for audiovisual, audio-tactile and visual-tactile test stimuli. Besides obtaining a shift in the PSS for audiovisual pairs, the effect was found to generalize to audio-tactile, but not to visual-tactile test pairs, a pattern that led the authors to conclude that adaptation resulted in a phenomenal shift of the auditory event.
Taken together, it thus appears that some have obtained results compatible with a criterion shift of audiovisual simultaneity, while others obtained results that can be accounted for by delays in either the auditory or the visual modality. Clearly, then, more research is needed to understand the full pattern of results and the way temporal recalibration generalizes across the specific exposure stimuli.
Here, we further examined the mechanism underlying temporal recalibration using a motor task (i.e., tapping) rather than a purely sensory one. A motor task is interesting because active motion of a self-initiated tap not only involves the sensory feedback from the finger that touched a key or a pad, but also the plan of the motor action that is converted into a series of muscle activations which carry out the movement. A copy of that motor command, the so-called efference copy, is available to many parts of the brain long before the actual movement occurs (~250 ms, Libet et al. 1983), and this efference copy might be used to predict the timing of an action and its sensory feedback (Winter et al. 2008). As a first approximation, one might expect the timing of motor actions and their sensory feedback to be rather rigid because there is extra information available about the timing of the motor component and because sensory feedback is normally expected to occur only after motor actions are initiated. In line with this, some have argued that lag adaptation only occurs for the audiovisual case (because the relative arrival times of sound and light vary with distance), but not for somatosensory stimuli (Miyazaki et al. 2006). Nevertheless, the ability to correctly judge motor-sensory temporal order has been demonstrated to be flexible as well (see also Cunningham et al. 2001; Stetson et al. 2006). As an example, Stetson et al. adapted participants to short delays between self-initiated key presses and subsequently delivered light flashes. After a short exposure phase to delayed flashes, participants performed a TOJ task about a tap and flash test stimulus (tap-first or flash-first?). The results showed that the PSS was shifted towards the adapted lag, consistent with previous reports on audio–visual temporal recalibration (Fujisaki et al. 2004; Vroomen et al. 2004; Keetels and Vroomen 2007; Hanson et al. 2008).
In fact, in the most dramatic case, a visual flash presented at an unexpectedly short delay after a finger tap was actually perceived as occurring before the tap, an experience that runs against the law of causality. At present, though, it is still unclear whether the criterion for simultaneity between the two specific stimuli was adjusted, or whether it is the visual or motor component that was shifted towards the other one.
In the present study, we adopted a motor-sensory task to examine the generalization of temporal recalibration across modalities. Participants tapped their finger on a touch pad during an exposure phase for about 3 min. After a delay of either 50 or 150 ms following each tap, either a tone pip or a flash was presented. After exposure to these motor-auditory or motor-visual lags, a motor-visual or motor-auditory test stimulus was presented, and participants judged whether the stimulus had occurred before or after the tap. If lag adaptation affects the criterion of a specific combination of two modalities (i.e., the criterion for motor-visual or motor-auditory simultaneity), there should be no transfer to the other modality. It might also be the case that lag adaptation shifts a specific modality (e.g., a shift in audition, vision, or the motor component). If the auditory modality was shifted (when did the sound occur?), one would expect a shift of the PSS in the motor-auditory test after motor-auditory adaptation, but not for the other combination. Likewise, if only the visual modality were shifted (when did the flash occur?), one would expect a shift of the PSS in the motor-visual test after motor-visual adaptation, but not for the other case. If the motor system adapts (when did I move the finger or touch the pad?), one would expect a uniform transfer of adaptation across the motor-auditory and motor-visual test stimuli, because both involve a motor component.
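The four alternatives above can be laid out as a small prediction table. The following is a minimal sketch of one reading of that paragraph, not the authors' analysis: a component hypothesis (auditory, visual, or motor) is assumed to predict a PSS shift only when the shifted component is part of both the adapted pair and the test pair, whereas the pair-specific criterion hypothesis predicts a shift only when the two pairs are identical. The names `PAIRS`, `predicts_shift`, and `prediction_table` are illustrative.

```python
# Components involved in each tap-feedback pairing used in the study.
PAIRS = {
    "motor-visual": {"motor", "visual"},
    "motor-auditory": {"motor", "auditory"},
}

HYPOTHESES = ("pair-specific criterion", "auditory", "visual", "motor")

# All adapted-modality x test-modality combinations (within- and cross-modal).
CONDITIONS = [(adapted, test) for adapted in PAIRS for test in PAIRS]


def predicts_shift(hypothesis, adapted, test):
    """Return True if the hypothesis predicts a PSS shift in this condition."""
    if hypothesis == "pair-specific criterion":
        # Only the simultaneity criterion of the adapted pairing changes.
        return adapted == test
    # Assumption: a shifted component matters only if it took part in the
    # adaptation and is also part of the test pairing.
    return hypothesis in PAIRS[adapted] and hypothesis in PAIRS[test]


def prediction_table():
    """Map each hypothesis to its predictions across all four conditions."""
    return {
        h: [predicts_shift(h, adapted, test) for adapted, test in CONDITIONS]
        for h in HYPOTHESES
    }


table = prediction_table()
for hypothesis, row in table.items():
    print(hypothesis, row)

# Hypotheses that predict a uniform shift within and across modalities.
uniform = [h for h, row in table.items() if all(row)]
print(uniform)  # ['motor']
```

Under these assumptions, only the motor hypothesis predicts a shift in all four adaptation-test combinations, which is why uniform transfer is treated as the signature of a motor shift.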
The three authors and two skilled participants (four male, mean age 34.6) from Tilburg University participated. All had normal hearing and normal or corrected-to-normal vision. Four of them were right-handed.
Stimuli and apparatus
Participants sat at a desk in a dimly lit and soundproof booth looking at a CRT display at about 65 cm viewing distance. The visual stimulus consisted of a 1-cm white square (9 cd/m²) flashed for 30 ms on a black background (0 cd/m²). The auditory stimulus consisted of a 2,000 Hz pure tone pip (30 ms duration, 2 ms rise/fall slope) presented via headphones (Sony MDR-XD100) at 70 dB(A). White noise was continuously presented via headphones at 59 dB(A) to mask the sound of the taps. A custom-made touch pad was used for detecting the precise timing of the finger taps. The temporal resolution of the response device was about 1 ms as verified on a multiple trace oscilloscope.
There were four within-subjects factors: the adapted modality (motor-visual, motor-auditory), the exposure lag during the adaptation phase (50 ms, 150 ms), the modality of the test stimuli (same as or different from the adapted one), and the stimulus-onset asynchrony (SOA) between the tap and the test stimulus (0, 50, 100, 150, and 200 ms).¹ These specific SOA values were chosen because they covered the range from ‘stimulus clearly before the tap’ to ‘stimulus clearly after the tap’. The whole test consisted of 1,000 trials with 25 repetitions for each of the 40 conditions. The adapted modality, exposure lag, and the modality of the test were all blocked, while the SOA varied randomly within a block of 125 trials. The two exposure lags were split across two consecutive days and counterbalanced for order across participants.
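The trial counts in this design follow directly from the factor levels. As a minimal sketch (not the authors' experiment code), the blocked structure can be enumerated as follows; the function name `build_blocks`, the dictionary keys, and the random seed are illustrative, and block order and day-wise counterbalancing are deliberately not modelled.

```python
from itertools import product
import random

ADAPTED_MODALITY = ("motor-visual", "motor-auditory")
EXPOSURE_LAG_MS = (50, 150)
TEST_MODALITY = ("same", "different")  # relative to the adapted modality
SOA_MS = (0, 50, 100, 150, 200)
REPETITIONS = 25


def build_blocks(seed=0):
    """Return one block per blocked-factor combination with shuffled SOAs."""
    rng = random.Random(seed)
    blocks = []
    for adapted, lag, test in product(
        ADAPTED_MODALITY, EXPOSURE_LAG_MS, TEST_MODALITY
    ):
        # Each SOA is repeated 25 times and randomized within the block.
        soas = list(SOA_MS) * REPETITIONS
        rng.shuffle(soas)
        blocks.append(
            {"adapted": adapted, "lag_ms": lag, "test": test, "soas_ms": soas}
        )
    return blocks


blocks = build_blocks()
n_conditions = len(blocks) * len(SOA_MS)
n_trials = sum(len(block["soas_ms"]) for block in blocks)
print(len(blocks), n_conditions, n_trials)  # 8 40 1000
```

The 2 × 2 × 2 blocked factors give 8 blocks of 125 trials each, and crossing them with the 5 SOAs gives the 40 conditions and 1,000 trials stated in the text.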
Immediately after adaptation, testing started. A test trial consisted of five “top-up” tap-feedback pairs using the same lag as in the adaptation phase. Then, after a short delay varying between 850 and 1,250 ms and signaled by the fixation cross becoming bright, participants made two taps (at an intertap interval of ~750 ms), each accompanied by a critical flash (or, depending on condition, a tone) presented at one of the five SOAs relative to each tap. Participants then judged whether the two final sound or light stimuli had occurred before or after the two taps. The unspeeded response was made by pressing one of two buttons on a special keyboard with the non-dominant hand. Note that we used two taps rather than a single one as test stimulus because the two ‘shots’ increase sensitivity for temporal order, thus lowering JNDs and reducing noise (Morein-Zamir et al. 2004). After the response, the next top-up/test stimulus was presented. Each block of 125 trials took about 20 min with a short break after 65 trials.
To acquaint participants with the procedures, experimental trials were preceded by a practice session for tapping at a constant pace of ~750 ms. Participants were trained for ~5 min to maintain a constant tap interval as induced via an auditory pacer signal. The intertap interval between two consecutive taps was also shown continuously on the screen, and participants tried to keep it at 750 ms. Practice then continued with TOJ trials in which only the extreme SOAs were presented (0 and 200 ms).
Trials of the training session were excluded from further analysis. Performance on the catch trials in the adaptation phase was completely flawless, except for one participant who missed a single catch trial. Participants were thus indeed looking at the light or listening to the sound during the exposure phase. The average intertap interval in the adaptation phase was 672 ms, which was somewhat faster than participants were originally trained on, but there was no correlation between tapping speed and the amount of temporal recalibration (r = −0.408, p = 0.50), and tapping speed as such was therefore not further analyzed.
Mean points of subjective simultaneity (PSSs) and just noticeable differences (JNDs) in ms
As is clearly visible, exposure to the 150-ms lag indeed shifted the PSS in the predicted direction compared with the 50-ms lag and, most importantly, this shift was uniform across conditions. This generalization was confirmed in ANOVAs on the PSSs and JNDs with adapted modality, exposure lag, and modality of test as within-subjects factors. In the ANOVA on the PSSs, only the main effect of exposure lag was significant, F(1, 4) = 14.21, p = 0.02, indicating that the PSS was shifted by 29 ms in the direction of the lag (29% of the 100-ms difference between the two exposure lags). The effects of adapted modality, F(1, 4) = 1.68, p = 0.27, modality of the test, F(1, 4) = 1.01, p = 0.37, and all interactions were non-significant.
In the ANOVA on the JNDs, none of the main effects was significant: adapted modality, F(1, 4) = 2.04, p = 0.23, modality of test, F(1, 4) = 1.68, p = 0.27, exposure lag, F(1, 4) = 0.04, p = 0.85. There was a tendency for JNDs to be slightly worse in motor-visual adaptation followed by a motor-visual test, possibly reflecting lesser temporal accuracy in the visual system, but the interaction between the adapted modality and modality of the test was non-significant, F(1, 4) = 1.84, p = 0.25. All other interactions were also non-significant.
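This section reports PSSs and JNDs without spelling out how they were derived from the TOJ responses. The following is a minimal sketch of one common approach, which may differ from the procedure actually used: the proportion of "stimulus after tap" responses at each SOA is probit-transformed and regressed on SOA, the PSS is taken as the 50% point, and the JND as half the distance between the 25% and 75% points. The function name `fit_pss_jnd`, the clipping rule, and the demo data are assumptions; the demo proportions are synthetic and are not data from the study.

```python
from statistics import NormalDist


def fit_pss_jnd(soas_ms, p_after, n_trials=25):
    """Estimate (PSS, JND) in ms from proportions of 'after' responses.

    Fits z(p) = (SOA - PSS) / sigma by ordinary least squares on the
    probit-transformed proportions (a simple stand-in for a full
    maximum-likelihood psychometric fit).
    """
    unit_normal = NormalDist()
    # Clip proportions so the inverse CDF stays finite at 0 and 1.
    lo, hi = 0.5 / n_trials, 1 - 0.5 / n_trials
    z = [unit_normal.inv_cdf(min(max(p, lo), hi)) for p in p_after]

    n = len(soas_ms)
    mean_x = sum(soas_ms) / n
    mean_z = sum(z) / n
    sxx = sum((x - mean_x) ** 2 for x in soas_ms)
    sxz = sum((x - mean_x) * (zi - mean_z) for x, zi in zip(soas_ms, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x

    pss = -intercept / slope            # SOA at which p = 0.5
    sigma = 1 / slope                   # spread of the fitted Gaussian
    jnd = unit_normal.inv_cdf(0.75) * sigma  # half the 25%-75% distance
    return pss, jnd


# Synthetic demo: an observer with a true PSS of 100 ms and sigma of 60 ms.
soas = [0, 50, 100, 150, 200]
demo_p = [NormalDist(100, 60).cdf(s) for s in soas]
pss, jnd = fit_pss_jnd(soas, demo_p)
print(round(pss, 1), round(jnd, 1))  # 100.0 40.5
```

With one PSS per participant and condition, the lag effect reported above corresponds to the difference between the PSSs obtained after 150-ms and 50-ms exposure.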
Here we demonstrate that exposure to a voluntary action (a finger tap) and a delayed auditory or visual feedback stimulus that is associated with this action induces a shift in the subjective temporal order of both the auditory and visual event. Presumably, temporal delays were adjusted during recalibration so that the two signals moved toward simultaneity because events appearing at a consistent delay after motor actions are interpreted as consequences of those actions. The brain then recalibrates timing judgments to make them consistent with a prior expectation that sensory feedback will follow motor actions without delay. As reported before, flashes at unexpectedly short delays after a finger tap were consistently perceived as occurring before the tap (Stetson et al. 2006). This finding might, in isolation, be explained by assuming that participants had adjusted their criterion for motor-visual simultaneity. However, our study demonstrates that the same phenomenon occurs with tones, and, most importantly, that the effect generalizes across modalities, as equivalent shifts were obtained whether participants were tested in the same modality as the adapted one or in a different one. This pattern of results is most easily explained by assuming that it is the motor component that has been shifted, rather than that the specific criteria for simultaneity were adjusted, or that the visual and auditory modalities were shifted in time. Most likely, participants thus shifted their interpretation about when they moved their finger or when they touched the pad.
At first sight, this may seem quite remarkable if one considers that we experience a strong sense of conscious control when generating self-paced motor actions. Yet, several authors have demonstrated that this sense may be illusory, and that the timing of perceived intentions and actions is quite flexible (Lau et al. 2007; Haggard and Tsakiris 2009). Together with the previously mentioned studies on pure sensory temporal recalibration, it thus seems that the perceived timing of visual, auditory, and motor events is flexible in all three cases.
It is of interest to note that JNDs in the present study were relatively small compared with previous reports using crossmodal temporal order judgments, where JNDs are usually on the order of about 40–80 ms (Keetels and Vroomen in press). Possibly, JNDs were small here because participants were trained and because participants were allowed to give two taps (with two accompanying tones/flashes) rather than a single one. This usually improves sensitivity and reduces noise (Morein-Zamir et al. 2004). More importantly, JNDs were also found to be invariant across modalities and adapted lags. Each of the conditions thus remained equally difficult after lag adaptation. This finding is in contrast with studies that reported that after exposure to asynchronous pairs, there is an increase in the JND rather than a shift in the PSS (Winter et al. 2008; Navarra et al. 2009). It has been argued that this increase in JND is the first stage of temporal recalibration, which may later be followed by a shift in the PSS if the adaptation regime is maintained (Navarra et al. 2009). Our results, though, suggest that the nervous system has the ability to adaptively recalibrate sensory temporal relationships without a discernible loss of sensitivity. This agrees with informal reports from observers who felt that during adaptation, the physically asynchronous stimulus pairs felt close to being perceptually synchronous. The JND data also suggest that this phenomenon is not a product of a loss in sensitivity, but rather that the signals are re-aligned relative to one another.
Further research will be needed to gain a fuller understanding of the mechanisms underlying temporal recalibration. A critical question for future work is how motor-sensory adaptation relates to pure sensory temporal recalibration. One possibility is that motor-sensory recalibration is in fact a purely sensory phenomenon because proprioception (when did I move my finger) or touch (when did my finger hit the pad) rather than the timing of the intention of the self-initiated motor command was adjusted. Another question is the extent to which motor-sensory recalibration depends on the task involved. One possibility is that attention during the exposure phase plays a role. For example, it may be that recalibration becomes even bigger if participants pay attention to the intersensory delay rather than to a unimodal aspect of the stimulus (like detecting a visual or auditory deviant, as in the present case). Previous experience with intersensory timing variability may also be of importance. For example, a delayed feedback signal after a finger tap may in fact be quite natural because humans are exposed to response keys that vary in sensitivity (e.g., it takes about 25 ms before a stroke on a keyboard is visible as a letter on a computer screen, while other buttons, like those of a remote control, are even slower). There are other examples, though, like hearing oneself speak or seeing oneself move in a mirror, for which there is in real life virtually no variability between the movement and the perceptual consequences of that movement. It remains for future research to examine whether in these cases there is flexibility in the system as well.
¹ Note that the actual delay in the 0-ms SOA condition was not zero due to hardware limitations. The average SOA was ~10 ms for the tap-flash condition and ~6 ms for the tap-tone condition.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
- Di Luca M, Machulla T, Ernst MO (2007) Perceived timing across modalities. In: International Intersensory Research Symposium 2007: perception and action
- Haggard P, Tsakiris M (2009) The experience of agency: feelings, judgments, and responsibility. Curr Dir Psychol Sci 18:242–246
- Harris LR, Harrar V, Jaekl PM, Kopinska A (2009) Mechanisms of simultaneity constancy. In: Nijhawan R (ed) Issues of space and time in perception and action. Cambridge University Press (in press)
- Keetels M, Vroomen J (2009) Perception of synchrony between the senses. In: Murray MM, Wallace MT (eds) Frontiers in the neural bases of multisensory processes. Taylor and Francis (in press)
- Morein-Zamir S, Li K, Kingstone A (2004) Looking at the window of perceived simultaneity: repetition, rate, and crossmodal asymmetries. In: The 5th meeting of the International Multisensory Research Forum, Barcelona, Spain (June 2–5), www.imrf.info/2004/131
- Scheier CR, Nijhawan R, Shimojo S (1999) Sound alters visual temporal resolution. Investig Ophthalmol Vis Sci 40:4169
- Welch RB (1978) Perceptual modification: adapting to altered sensory environments. Academic Press, New York