Experimental Brain Research (2008) 185:655

Enhancement of response times to bi- and tri-modal sensory stimuli during active movements


  • D. Hecht
    • The Touch Laboratory, Department of Education in Technology and Science, Technion - Israel Institute of Technology
    • The Brain-Behavior Research Center, University of Haifa
  • Miriam Reiner
    • The Touch Laboratory, Department of Education in Technology and Science, Technion - Israel Institute of Technology
  • Avi Karni
    • The Brain-Behavior Research Center, University of Haifa
Research Article

DOI: 10.1007/s00221-007-1191-x

Cite this article as:
Hecht, D., Reiner, M. & Karni, A. Exp Brain Res (2008) 185: 655. doi:10.1007/s00221-007-1191-x


Simultaneous activation of two sensory modalities can improve perception and enhance performance. This multi-sensory enhancement has previously been observed only in conditions wherein participants were not performing any movement. Since tactile perception is attenuated during active movements, we investigated whether bi- and tri-modal enhancement can also occur when participants are presented with tactile stimuli while engaged in active movements. Participants held a pen-like stylus and performed bidirectional writing-like movements inside a restricted workspace. During these movements participants were given a uni-modal sensory signal (visual––a thin gray line; auditory––a brief sound; haptic––a mechanical resisting force delivered through the stylus) or a bi- or tri-modal combination of these uni-modal signals, and their task was to respond, by pressing a button on the stylus, as soon as any one of these three stimuli was detected. Results showed that the tri-modal combination was detected faster than any of the bi-modal combinations, which in turn were detected faster than any of the uni-modal signals. These facilitations exceeded the “Race model” predictions. A breakdown of the time gained in the bi-modal combinations by hemispace, hand and gender provides further support for the “inverse effectiveness” principle, as the maximal bi-modal enhancements occurred for the least effective uni-modal responses.


Keywords: Multi-sensory enhancement · Tri-modal · Detection time · Tactile attenuation · Inverse effectiveness


Introduction

Many of our daily experiences are multi-modal by nature. For instance, face-to-face conversations usually involve producing and receiving auditory as well as visual cues––voice, lip movements and gestures. Under noisy conditions, seeing the speaker’s lip movements and gestures can produce a gain in speech perception equivalent to increasing the auditory signal-to-noise ratio by up to 15 decibels (Sumby and Pollack 1954). Similarly, we recognize animals faster when their pictures are presented together with their characteristic sounds than when they are presented uni-modally––picture or sound (Molholm et al. 2004), and odors are perceived more intensely when they are accompanied by a congruent color or flavor (Zellner and Kautz 1990; Zellner et al. 1991; Dalton et al. 2000). Posture and movement control depend on inputs from the visual, somatosensory and vestibular systems (Mergner and Rosemeier 1998). Thus, our brains constantly combine multiple cues from different sensory channels to form a coherent percept of the events in our surrounding environment.

Experimental observations show that even synthetic single-modality stimuli (e.g., a light flash, white noise) are detected better if presented in close temporal and spatial proximity to a stimulus from a different sensory modality. This enhancement can occur even when there is no semantic relationship between the signals. For instance, participants tended to rate a weak light as brighter when it was accompanied by a pulse of white noise than when the light was presented alone (Stein et al. 1996). The opposite effect, of an irrelevant light enhancing the detectability of a sound, was also reported (Lovelace et al. 2003). Reaction times (RT) for detecting a visual signal were slower in the uni-modal condition than when a sound was heard in close temporal synchrony with the visual signal (Doyle and Snowden 2001; Fort et al. 2002). Similarly, manual and saccadic RTs for detecting a visual signal were faster when an electrical pulse to the finger was delivered simultaneously with the visual signal (Forster et al. 2002; Diederich et al. 2003). Even patients with visual deficits (e.g., hemianopia or neglect) can show an inter-sensory enhancement in the affected hemi-field, with detection of visual stimuli improved when a simultaneous accessory sound is presented at the same spatial location (Frassinetti et al. 2002, 2005).

Can tri-modal combinations of stimuli enhance performance even beyond the bi-modal effects? Tri-modal combinations of auditory, visual and haptic stimuli were investigated in two previous studies. Todd (1912) compared RT in uni-modal stimulation with RT in bi-modal and tri-modal combinations in two experimental paradigms. When participants were instructed to “respond as quickly as possible to any stimulus or stimuli combination” (“redundant target” paradigm) there was no significant difference in RT between the bi-modal and the tri-modal conditions. However, when instructed to “respond only to a specific target modality” (“focused attention” paradigm) bi-modal and tri-modal facilitation effects were found. Recently, Diederich and Colonius (2004) reported a tri-modal enhancement beyond the bi-modal effect also in a redundant target task, i.e., one in which participants responded to any stimulus or stimuli combination.

These tri-modal experiments, however, were conducted with the participants waiting passively for the stimulus or stimuli. Several studies suggest that the ability to detect tactile sensation is attenuated during active movement. The transmission of somatosensory stimuli is modulated during movement, and the threshold for detecting tactile stimuli during movement increases (Chapman et al. 1987; Post et al. 1994; Duysens et al. 1995). A series of experiments indicated three factors influencing the magnitude of movement-related “gating” of tactile detection: (a) hand movement velocity––faster movements produced a larger gating effect (Angel and Malenka 1982; Schmidt et al. 1990a); (b) stimulus location––attenuation of tactile perception was maximal when the stimulation was delivered to the moving finger/hand and minimal if it was directed to more distant body parts (Schmidt et al. 1990b; Williams et al. 1998); and (c) stimulus intensity––reductions in the proportion of stimuli detected were larger at the lowest stimulus intensities and progressively smaller at higher intensities (Williams and Chapman 2000). A current hypothesis suggests that the signals controlling the attenuation of tactile stimuli during movement originate from both the central and peripheral nervous systems, and thus both bottom-up transmission and cortical processing may be attenuated during movement (Williams and Chapman 2002). This dual-suppression hypothesis is supported by evidence of presynaptic inhibition of cutaneous inputs at the spinal cord afferents during movements (Seki et al. 2003), and by experiments in which motor cortex output commands for voluntary movements were delayed by transcranial magnetic stimulation: tactile sensation was attenuated even in the absence of movement during the delay, suggesting that sensory suppression also relies on central signals related to the preparation for movement, presumably upstream of the primary motor cortex (Voss et al. 2005).
Thus, it is not clear whether bi-modal combinations of haptic-visual and haptic-auditory stimuli can facilitate detection when delivered while participants are engaged in an active movement. Similarly, it is not known whether combining audio-visual and haptic stimuli, while participants are engaged in active movements, can enhance detection beyond the bi-modal effects. Therefore, the present study was designed to answer the following questions: (a) Can bi-modal combinations of haptic-visual and haptic-auditory stimulation during active movement speed up detection despite the movement-related “gating” of tactile perception? (b) Would tri-modal stimulation (haptic-auditory-visual) during an active movement result in an additional facilitation, greater than the bi-modal effect?



Methods

Participants

Sixteen students participated in the experiment, eight males and eight females (mean age: 23.5 years, ±2.1). All were right-handed according to the Edinburgh inventory (Oldfield 1971), with normal hearing, normal or corrected-to-normal vision and no known tactile dysfunction. Subjects were paid for their participation and gave their consent to be included in this study; they were unaware of the purpose of the experiment, except that it tested eye-hand coordination in different conditions. The experiment was carried out under the guidelines of the Technion’s ethics committee.

Apparatus and stimuli

We used a virtual-reality touch-enabled computer interface capable of providing users with visual, auditory and haptic stimuli. The assembly included a computer screen that was tilted 45° and was reflected on a semitransparent horizontal mirror (Fig. 1a). The participants viewed this reflection from above. A pen-like robotic arm (stylus), gripped and moved as in handwriting or drawing, was placed below the mirror surface and was represented as a stick figure in the visual workspace. Full technical descriptions of this virtual haptic system are available at http://www.reachin.se and http://www.sensable.com.
Fig. 1

a Experimental setup. The visual display from the computer screen was reflected onto the horizontal mirror. Participants held the pen-like stylus in their hand and received the haptic sensations through it. b The rectangular workspace. The rectangle was located, alternately, right or left of the fixation cross. Participants held the stylus and performed upward and downward writing-like movements (broken-line arrow) inside the rectangle space. During these active movements, the computer randomly generated uni-modal, bi-modal or tri-modal sensory stimulation

Participants were presented visually with a green cross (0.5 × 0.5 cm) at the center of their visual field, and a vertical rectangle (16.5 × 3 cm) which constituted the workspace, its center positioned 7 cm to the side of the cross (Fig. 1b). On each trial participants held the stylus in their hand and moved it along the vertical axis of the workspace (upward or downward movements), staying inside the rectangle’s vertical borders and crossing its horizontal borders only in order to initiate the next trial. During every upward or downward movement, the computer generated a sensory stimulation, either uni-modal––visual (V), auditory (A) or haptic (H); bi-modal––a combination of the auditory and visual (AV), the haptic and visual (HV) or the auditory and haptic (AH) stimulations; or tri-modal––a combination of the auditory, haptic and visual (AHV) stimulations. The visual stimulus consisted of a thin, grey, horizontal line (length: 3 cm, width: one pixel) which appeared inside the rectangular workspace. The auditory stimulus was a compound sound pattern of a honk (8 kHz, 42 dB SPL) presented from loudspeakers located either left or right of the workspace, 60 cm from the participants’ ears. The haptic stimulus was a mechanical resisting force (0.35 Newton) delivered through the stylus, a pen-like robotic arm controlled by a programmable engine.


Procedure

Participants sat comfortably in front of the virtual reality system, directing their eyes at the cross at the center of the display, with their heads fixed using a forehead and chin rest. Participants were instructed to respond, by pressing a button on the stylus, as soon as they detected any one of the three stimuli or any of their combinations (a redundant target paradigm). Pressing the response button stopped all stimulus presentation. Although the participants’ movements triggered the stimulations, they could not anticipate the timing or location of the stimulation. To this end, the rectangular workspace was divided horizontally into 13 equal units. These subdivisions existed only in the programming code and were invisible to the users. On each trial, as participants moved the stylus upward or downward inside the workspace they crossed these subdivisions, and the stimulation was triggered at a randomly chosen crossing, from the 5th to the 13th. (For example, in the Nth trial, with the stylus moving upward, the stimulation would be delivered after crossing the 5th subdivision from the bottom, the starting point; in the (N + 1)th trial, with the stylus moving downward, the stimulation would be delivered after crossing the 12th unit from the top; etc.) In order to initiate the next trial, after responding to the stimulation, the participants were instructed to complete the upward or downward movement until the workspace’s horizontal boundary was crossed.
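As an illustration, the per-trial trigger logic described above can be sketched as follows (the names and structure are ours, not the authors’ code):

```python
import random

N_UNITS = 13             # invisible, equal horizontal subdivisions of the workspace
TRIGGER_RANGE = (5, 13)  # the stimulus fires after the 5th to 13th crossing

def pick_trigger_unit():
    """Choose, per trial, how many subdivision crossings precede the stimulus."""
    return random.randint(*TRIGGER_RANGE)

def should_trigger(crossings_so_far, trigger_unit):
    """True once the stylus has crossed the randomly chosen subdivision."""
    return crossings_so_far >= trigger_unit
```

Because the triggering crossing varies unpredictably from trial to trial, participants cannot time their responses to the movement itself.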

Each subject was pre-trained briefly (about 10 trials in each stimulation condition) on the task before data recording began. The experiment was carried out in blocks of 40 trials (20 upward movements and 20 downward movements), each block consisting of a single stimulation condition, with 3 min of rest between blocks. This arrangement was implemented to avoid the possible prolongation of RT to uni-modal stimuli when delivered immediately after multi-modal stimuli (Nickerson 1973). Each participant completed a total of 48 blocks of stimulus conditions: blocks with auditory stimulation, presented uni-modally or in combination (A, AV, AH, AHV), were each performed 8 times, once with each combination of sound source (left or right speaker location), hand (left or right) and workspace location (left of, or right of, fixation); blocks without auditory stimulation (H, V, HV) were each performed 4 times, once with each combination of hand (left or right) and workspace location (left of, or right of, fixation). The order of stimulus-combinations was randomized across subjects. The hand performing the movements alternated between blocks (the other hand resting freely on the table). The rectangular workspace appeared on the right or left of the fixation cross in a counterbalanced order.


Two temporal alignment methods can be used for synchronizing multi-sensory stimuli. The most common is stimulus onset synchronization, in which the two (or three) stimuli begin at the same time. The other is response synchronization, in which the later-detected stimulus (in uni-modal presentations) precedes the earlier-detected stimulus by a time interval that equals the RT difference, so that the motor responses are synchronized (e.g., if RTs are 320 and 290 ms to visual and auditory signals respectively, their bi-modal combination is arranged with the onset of the visual signal first, followed after 30 ms by the auditory signal). Previous studies found that RTs were shorter and the multi-sensory enhancement was larger with the latter synchronization method (Diederich and Colonius 2004, 1987; Hershenson 1962; Miller 1986). In the current study, the bi- and tri-modal stimuli were therefore response synchronized. This synchronization was implemented for each participant based on his or her individual RT to the different stimuli. Response times were measured from the initiation of the first stimulus until the participant’s reaction.
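The onset-delay computation described above reduces to subtracting each uni-modal RT from the slowest one; a minimal sketch (our naming, not the authors’ code):

```python
def response_synchronized_onsets(unimodal_rts):
    """Given a participant's mean uni-modal RTs (ms), return the onset delay
    (ms) for each modality so that the predicted motor responses coincide:
    the slowest-detected stimulus starts first (delay 0), and faster-detected
    stimuli are delayed by the RT difference."""
    slowest = max(unimodal_rts.values())
    return {modality: slowest - rt for modality, rt in unimodal_rts.items()}

# The example from the text: visual RT 320 ms, auditory RT 290 ms
onsets = response_synchronized_onsets({"visual": 320, "auditory": 290})
# the visual signal starts at 0 ms and the auditory signal 30 ms later
```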

Data analysis

Data were analyzed on two levels. First, comparisons were conducted between the mean RT in each bi-modal condition and the mean RT in the shortest of its uni-modal components. Similarly, we compared the mean RT in the tri-modal condition with the RT in the shortest of the bi-modal conditions. Second, to test whether the “Race model” can account for the observed facilitations, RT distribution analyses were performed and the actual RT distributions in the bi- and tri-modal conditions were compared to the distributions of the sum of the corresponding uni-modal conditions.


Results

All stimuli were above threshold levels and there were no failures to detect them. Since RT data for all conditions were collected from the same participants performing repetitive trials of different stimulus-combinations with different hands and in the different hemispaces, we analyzed these hierarchical data with the Mixed Models procedure (with the Bonferroni adjustment method for multiple comparisons), using repeated measures analyses with stimulation conditions and trials as within-subject comparisons. Separate analyses were conducted for the bi-modal effects (comparing all uni- and bi-modal combinations; conditions A, V, H, AH, AV, HV) and for the tri-modal effect (comparing all bi-modal combinations with the tri-modal combination; conditions AH, AV, HV, AHV).

Bi-modal enhancement

Statistical analysis indicated a significant main effect for condition [F(5,1131) = 535.66, P < 0.0005]. Mean detection times of the uni-modal stimuli were the longest (401 ± 90 ms, 325 ± 84 ms, 317 ± 74 ms, RT to the visual, haptic and auditory stimulus, respectively). As can be seen in Fig. 2, all three bi-modal combinations were detected faster than any uni-modal condition (249 ± 79 ms, 247 ± 74 ms, 226 ± 63 ms, RT to the audio-visual, haptic-visual, audio-haptic stimulus combinations, respectively).
Fig. 2

Group-averaged response times in the uni-, bi- and tri-modal stimulations. Each participant was tested in each of the stimulation conditions. Bars: RTs with standard deviations. Numerals: average RTs

Paired comparisons analysis between uni-modal and bi-modal combinations revealed that: (a) when participants were stimulated with a bi-modal combination of auditory and visual signals simultaneously, their RT was significantly faster than the shortest of the corresponding uni-modal components––auditory stimulation [t(1131) = 20.78, P < 0.0005]. (b) When participants were stimulated with a bi-modal combination of haptic and visual signals simultaneously, their RT was significantly faster than the shortest of the corresponding uni-modal components––haptic stimulation [t(1131) = 17.14, P < 0.0005]. (c) When participants were stimulated with a bi-modal combination of auditory and haptic signals simultaneously, their RT was also significantly faster than the shortest corresponding uni-modal component––auditory stimulation [t(1131) = 27.88, P < 0.0005]. Thus, bi-modal stimuli were detected faster than the shortest of their corresponding uni-modal component stimuli.

Tri-modal enhancement

A significant difference between stimulus-combinations was found when the bi-modal and tri-modal stimulation combinations were compared [F(3,877) = 56.28, P < 0.0005]. Mean RT in the tri-modal combination of auditory, haptic and visual signals, was 218 ± 62 ms, faster than the shortest RT to a bi-modal combination (i.e., to auditory and haptic stimulation). Paired comparisons analysis revealed that the difference between these two conditions was significant [t(877) = 3.18, P < 0.009]. Tri-modal stimuli were thus detected faster than the shortest bi-modal stimuli.

Within-subjects analysis

The three bi-modal enhancements were present in all 16 participants, and the tri-modal advantage occurred in 14 participants, indicating that the advantage in detecting multimodal stimuli was characteristic of most of the individual participants’ performances.


Hemispace effects

Analysis of the RT to uni-modal and multi-modal stimulus-combinations as a function of the tested hemispace, revealed a hemispace effect [F(1,1125) = 6.98, P < 0.008]. The interaction between stimulus-combinations and hemispace in the uni- and bi-modal comparison was also significant [F(5,1125) = 3.19, P = 0.007]. This interaction was not significant in the bi- and tri-modal comparisons. As can be seen in Fig. 3, detection of the haptic signal was significantly faster in the left visual hemispace than in the right visual hemispace [F(1,1125) = 13.36, P < 0.0005].
Fig. 3

Average response times by hemispace


Hand effects

Analysis of the RT to uni-modal and multi-modal stimulus-combinations as a function of the tested hand revealed clear hand effects in both the uni- and bi-modal comparisons [F(1,1125) = 53.06, P < 0.0005] and the bi- and tri-modal comparisons [F(1,873) = 36.04, P < 0.0005]. The interaction between stimulus-combination and hand in the uni- and bi-modal comparison was significant [F(5,1125) = 2.25, P = 0.047]. This interaction was not significant in the bi- and tri-modal comparisons. As can be seen in Fig. 4, right hand responses were faster than left hand responses in all stimulus conditions. This difference was significant for the responses to the visual signal [F(1,1125) = 8.18, P = 0.004], to the haptic signal [F(1,1125) = 27.82, P < 0.0005], to the auditory signal [F(1,1125) = 9.08, P = 0.002], to the combination of audio and visual signals [F(1,1125) = 4.22, P = 0.040], to the combination of haptic and audio signals [F(1,1125) = 9.99, P = 0.001] and to the combination of visual, haptic and audio signals [F(1,873) = 23.55, P < 0.0005].
Fig. 4

Average response times by hands


Gender effects

Analysis of the RT to uni-modal and multi-modal stimulus-combinations as a function of gender did not reveal significant gender effects in the uni- and bi-modal comparisons or in the bi- and tri-modal comparisons. However, the interaction between stimulus-combination and gender was significant in both the uni- and bi-modal comparisons [F(5,1126) = 4.91, P < 0.0005] and the bi- and tri-modal comparisons [F(3,874) = 2.66, P = 0.047]. As can be seen in Fig. 5, males were faster in the V, H and HV conditions and females were faster in all combinations that included an auditory signal (A, AV, AH, AHV), but none of these differences was statistically significant.
Fig. 5

Average response times by gender

Multi-sensory enhancement (MSE)

We further analyzed the gains in response times accrued in the inter-sensory interactions, by calculating the differences between RT in each bi-modal combination and RT in the shortest respective uni-modal component. In a similar way, the tri-modal gain was calculated as the difference between RT to the tri-modal combination and RT to the shortest of the bi-modal combinations. These differences reflect multi-sensory enhancements (MSE), and are summarized in Table 1.
Table 1

Time (mean ± SD in ms) gained from an additional signal in a different modality

Multi-sensory enhancements

MSEav     68 ± 20
MSEhv     78 ± 29
MSEah     91 ± 23
MSEahv     9 ± 6

MSE multi-sensory enhancement; subscripts: the combined modalities; a auditory, v visual, h haptic
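These gains follow directly from the group-mean RTs reported above; as a sketch (variable names are ours):

```python
def multisensory_enhancement(multi_rt, component_rts):
    """MSE = RT of the fastest component condition minus RT of the
    multi-modal condition (a positive value means facilitation)."""
    return min(component_rts) - multi_rt

# Group-mean RTs (ms) from the Results: A = 317, V = 401, H = 325;
# AV = 249, HV = 247, AH = 226; AHV = 218
mse_av = multisensory_enhancement(249, [317, 401])        # 68 ms
mse_hv = multisensory_enhancement(247, [325, 401])        # 78 ms
mse_ah = multisensory_enhancement(226, [317, 325])        # 91 ms
mse_tri = multisensory_enhancement(218, [249, 247, 226])  # 8 ms from the group
# means (Table 1 reports 9 ± 6, computed from the per-subject values)
```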

A breakdown of these MSE by hemispace, hand and gender revealed a number of significant differences (Table 2). Comparing MSE in males and females indicated a significant gender difference in the enhancement accrued in the bi-modal auditory-visual combination over the uni-modal auditory condition. That is, the absolute gain in the bi-modal combination (MSEav) was 13 ms greater in females than in males [t(1126) = −2.09, P < 0.03].
Table 2

Time (in ms) gained from the additional stimulus for each gender, hand and hemispace

Multi-sensory enhancement (MSE)

(Only the significant entries of Table 2 are recoverable here: MSEav was 13 ms greater in females than in males; MSEhv was 23 ms greater for the left hand than for the right hand; and MSEhv was 21 ms greater in the right hemispace than in the left hemispace.)

There was also a significant hand difference in the enhancement accrued in the bi-modal haptic-visual combination over the uni-modal haptic condition (H-HV), as the time gained in the bi-modal combination (MSEhv) was 23 ms greater in the left hand than in the right hand [t(1125) = −2.67, P < 0.007]. Note that the RT differences between the hands in the haptic-visual combination were not statistically significant in the uni-/bi-modal comparisons [F(1,1125) = 2.25, P = 0.134], although there was a trend for longer left hand RT in the bi-/tri-modal comparisons [F(1,873) = 3.07, P = 0.08]. Nevertheless, the differences in the MSE between hands––i.e., the time gained by the additional stimulus for each hand––were significant. Moreover, the MSE for the left hand was larger than the MSE for the right hand, even though right-hand RTs were shorter than left-hand RTs.

A significant hemispace difference was also found in the enhancement accrued in the bi-modal haptic-visual combination over uni-modal haptic condition (MSEhv(Right hemispace)–MSEhv(Left hemispace)), as the MSE was 21 ms greater when the signals were in the right hemispace than in the left hemispace [t(1125) = 2.32, P < 0.02].

Altogether, our results show that there was a significant effect of hemispace and hand on detection times, with an advantage for the detection of the haptic signal in the left visual hemispace, and an advantage for the right hand. The relative multi-modal gains, however, were higher in the right hemispace and for the left hand. There was no significant effect of the participants’ gender on absolute performance but females gained relatively more from combined stimuli in the auditory and visual modalities.

Direction of movement, and hand-hemispace congruency

In half of the trials, participants were stimulated while performing upward movements, and in the other half, while performing downward movements. There were no significant differences in RT or MSE between the two movement directions. Likewise, there were no significant differences in RT or MSE whether the hand was congruent (ipsilateral) with the hemispace (right-hand movements in the right hemispace or left-hand movements in the left hemispace) or incongruent (contralateral) with it (right-hand movements in the left hemispace or left-hand movements in the right hemispace).

The “Race model” does not explain the enhancements

To test whether the “Race model” (Raab 1962) can account for the observed facilitations, RT distribution analyses were performed and the actual RT distributions in the bi- and tri-modal conditions were compared to the distributions of the sum of the corresponding uni-modal conditions, using the method developed by Miller (1982) specifically for testing the “Race model” and a recently published computer algorithm by Ulrich et al. (2007). The “Race model” suggests that the shorter RT for bi-modal signals may be due to a simple statistical principle, rather than reflecting an integration process for multi-sensory signals. According to this model, each stimulus is detected separately and processed in parallel. In trials with bi-modal stimuli, a response is triggered as soon as the first stimulus is detected. Thus, RT is determined by the latency of a single detection process in trials with one stimulus, whereas it is determined by the faster of two detection processes in trials with bi-modal signals. Because the average time of the winner of a race is usually shorter than the average time of each single process, RT in trials with bi-modal signals is expected to be faster than in trials with only one stimulus.

Miller (1982) developed a method for testing the “Race model” by calculating the maximal facilitation that can be expected from a race. For two arbitrary events E1 and E2, P(E1 ∪ E2) ≤ P(E1) + P(E2); thus, in a race between the signals, the cumulative probability of any given RT value in the bi-modal condition can be at most the sum of the uni-modal probabilities. This Race-model inequality provides an upper bound on the statistical facilitation that can be produced by a race. By plotting the cumulative density function (CDF) of the observed RT in the bi-modal condition against the race-model boundary––the CDF of the sum of uni-modal probabilities––one can test the race hypothesis, since it requires that for every value of RT the bi-modal CDF remain within the predicted boundary (i.e., to the right of, or superimposed on, the sum of uni-modal probabilities CDF). If the Race-model inequality is violated for any value of RT, the race hypothesis can be rejected, and a multi-sensory integration process is assumed.
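The inequality check can be sketched in a few lines; the following is a simplified illustration at a set of percentiles (it is not the Ulrich et al. 2007 algorithm, and the function names are ours):

```python
def _quantile(sample, p):
    """Nearest-rank empirical p-th percentile."""
    s = sorted(sample)
    k = max(0, min(len(s) - 1, int(round(p / 100.0 * len(s))) - 1))
    return s[k]

def _ecdf(sample, t):
    """Empirical cumulative distribution function evaluated at time t."""
    return sum(x <= t for x in sample) / len(sample)

def race_model_violated(rt_uni1, rt_uni2, rt_bimodal,
                        percentiles=range(5, 80, 10)):
    """Miller's (1982) inequality: the bi-modal CDF may not exceed the
    (capped) sum of the uni-modal CDFs.  The bound is checked at each
    tested percentile of the observed bi-modal RT distribution; any
    exceedance rejects the race model."""
    for p in percentiles:
        t = _quantile(rt_bimodal, p)
        bound = min(1.0, _ecdf(rt_uni1, t) + _ecdf(rt_uni2, t))
        if p / 100.0 > bound:   # bi-modal CDF above the race bound
            return True
    return False
```

A real analysis would, as in the paper, compare the CDFs statistically at each percentile across participants rather than returning a single boolean.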

The results of the probability analyses are shown in Fig. 6. The distributions of the responses to bi-sensory stimuli exceeded the Race-model predictions, violating the Race-model inequality. Paired t-tests showed that the bi-modal auditory-visual, haptic-visual and auditory-haptic distributions were significantly different from the distributions of the sums of their corresponding uni-modal signals (P < 0.0005) for each pair of percentiles from the 5th through the 75th. Similarly, the probability function of the tri-modal auditory-haptic-visual condition was significantly different from the sum of the uni-modal auditory, haptic and visual probabilities at the 5th through the 75th percentiles (P < 0.01).
Fig. 6

Cumulative density functions (CDFs) representing probability scores of RT for the uni-modal signals (A stars, V circles, H diamonds), for the boundary of the “Race model” prediction, i.e., the sum of the corresponding (two or three) uni-modal signals (triangles), and for the actual (measured) multi-modal signals (squares). Each data point on a CDF represents a percentile of the RT distribution, from the 5th up to the 95th percentile in increments of 10%. Evidence for multi-sensory integration comes from the violation of the “Race model” inequality requirement that the actual multi-modal CDFs remain within the boundary, i.e., to the right of (or superimposed on) the sum of uni-modal CDFs


Discussion

The results of the current study provide clear evidence that the detection of redundant bi-modal signals is faster than that of each of their uni-modal components even when the stimuli are presented during active movements. Furthermore, the results showed that RTs to tri-modal signals were shorter than the RTs to all of the bi-modal combinations. The occurrence of these inter-sensory enhancements even when there was no meaningful relationship or semantic connection between the two (or three) stimuli suggests that the mere activation of two sensory modalities at the same time may suffice for perceptual improvement, possibly due to a brief increase in vigilance and attention (Bertelson and Tisseyre 1969; Posner et al. 1973; Sanders 1980). One cannot rule out an effect of the stimulus presentation stagger in the multi-modal conditions. This stagger was on the order of 76 ± 28 ms and 84 ± 31 ms for the haptic and auditory stimuli, respectively, relative to the initiation of the visual stimuli, and 8 ± 9 ms for the auditory stimuli relative to the initiation of the haptic stimuli. However, as the RTs for detection of multi-modal stimuli were recorded from the onset of the initial uni-modal component stimulus in each of the multi-modal conditions, the staggering of the stimuli per se cannot account for the advantage in detecting multi-modal stimuli without assuming cross-modal or supra-modal effects (e.g., increased vigilance and attention).

Altogether, the tri-modal combination resulted in only a small, albeit statistically significant, shortening of RT. As can be seen from the current results and from the results of a study on passive participants (Diederich and Colonius 2004), the magnitude of the additional tri-modal effect was smaller than the bi-modal effect: MSE were in the range of 60–95 ms in the bi-modal combinations but only 6–12 ms in the tri-modal combination. This may reflect a lower limit of MSE, a “floor effect”. This proposal is testable: one would predict that adding a fourth signal from another sensory system (e.g., olfaction/gustation) would be unlikely to result in further shortening of the RT compared to the tri-modal combination.

The current results cannot be explained by the “Race model” (Raab 1962) but are compatible with the co-activation hypothesis (Smith 1968; Miller 1982). The latter hypothesis states that (a) a (neural) activation is initiated by the sensory experience and builds over time until some criterion for response initiation is reached, and (b) that activations from different sensory channels can combine in satisfying that criterion. Thus, responses for multi-modal signals are faster because multiple processing channels, rather than one, provide the activation buildup towards satisfying a single criterion (Smith 1968).

A more sophisticated explanation, combining Smith's notion with Bayesian statistical principles, was recently proposed (Körding and Wolpert 2006; Rowland et al. 2007; Bays and Wolpert 2007). It is based on the notion that each sensory modality by itself provides the CNS with imperfect and variable sensory inputs. According to Bayesian inference principles, the imperfect estimate obtained from one sensory input can be improved by taking into account the probabilities of signals from another sensory modality. Thus, the brain may minimize the uncertainty of imperfect and noisy sensory inputs by combining probabilities across multiple sensory signals to refine its estimates, which in turn leads to faster RT to multi-modal signals.
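The core of this account is inverse-variance weighting: when two cues are combined optimally, the reliability (inverse variance) of the joint estimate is the sum of the individual reliabilities, so the combined estimate is always less noisy than either cue alone. A minimal sketch; the noise values are illustrative assumptions, not estimates from this study:

```python
import math

# Noise (standard deviation) of two hypothetical uni-modal estimates,
# in arbitrary units; the values are invented for illustration.
sigma_h = 12.0  # "haptic" estimate
sigma_v = 8.0   # "visual" estimate

# Each cue is weighted by its relative reliability (1 / variance).
w_h = (1 / sigma_h**2) / (1 / sigma_h**2 + 1 / sigma_v**2)
w_v = 1 - w_h

# Combined reliability is the sum of reliabilities:
# 1 / sigma_hv^2 = 1 / sigma_h^2 + 1 / sigma_v^2
sigma_hv = math.sqrt(1 / (1 / sigma_h**2 + 1 / sigma_v**2))

print(round(sigma_hv, 2))  # below min(sigma_h, sigma_v) = 8.0
```

Under this account, the sharper combined estimate reaches a response criterion sooner, which is one way to link Bayesian cue combination to the shorter multi-modal RT observed here.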

The results indicated a clear hand effect, with right-hand responses faster than left-hand responses in all stimulus conditions. The shorter RT of the right hand were expected, since all participants were right-handed. However, the results also indicated that haptic stimuli in the left hemispace were detected faster by both hands. This faster detection of the haptic stimulus in the left, as compared to the right, hemispace can be explained by the right-hemisphere advantage in trajectory perception (Boulinguez et al. 2003). In an experiment requiring participants to predict whether a ballistic trajectory would coincide with a stationary target, Boulinguez et al. (2003) found shorter RT, for both hands, when the stimulus was presented to the left rather than the right visual hemifield. They hypothesized that the right hemisphere is specialized for trajectory perception and that this hemispheric asymmetry is independent of handedness. In the current study, the haptic signal was detected during active movements confined to a specific path (along the given rectangle). Our finding of shorter RT in the left hemispace may therefore reflect, in part, a right-hemisphere advantage for hand-trajectory perception in the type of exploratory movements employed by the participants.

Oliveri et al. (1999) used transcranial magnetic stimulation (TMS) as a "virtual lesion" technique to investigate interference with tactile perception. TMS over the right parietal cortex produced more errors in tactile detection than TMS over the left parietal cortex, not only for contralateral but also for ipsilateral stimuli. This greater sensitivity of the right hemisphere to TMS led Oliveri et al. (1999) to hypothesize that the larger share of tactile processing occurs in the right hemisphere. If so, our finding of faster detection of the haptic signal when presented in the left hemispace may reflect, at least in part, a greater involvement of the right hemisphere in tactile detection.

Inverse effectiveness

An important notion that emerged from multi-sensory integration studies is that the maximal bi-modal enhancement is seen for the least effective uni-modal responses; this constitutes the inverse effectiveness principle. For instance, physiological recordings in the mammalian superior colliculi showed that the firing rates of multisensory-responsive neurons were greater under bi-modal stimulation than those evoked by the most effective uni-modal stimulus (Wallace et al. 1998; Jiang et al. 2001; Perrault et al. 2003). Moreover, this enhancement was inversely related to the magnitude of the uni-modal responses, so that combinations of the least effective stimuli, those that were difficult to perceive or identify, showed the most profound enhancements (Wallace et al. 1998; Jiang et al. 2001; Perrault et al. 2003). Similar results were also reported in other multisensory integration sites, including the primary auditory cortex (Kayser et al. 2005; Ghazanfar et al. 2005).
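In this physiological literature, enhancement is commonly quantified as the percentage change of the combined response relative to the most effective uni-modal response. A minimal sketch of that index; the firing rates are invented purely to illustrate the inverse-effectiveness pattern:

```python
def enhancement(combined, best_unimodal):
    """Percent multisensory enhancement: change of the combined
    response relative to the most effective uni-modal response."""
    return 100.0 * (combined - best_unimodal) / best_unimodal

# Inverse effectiveness: the same absolute gain (8 spikes here) is a
# proportionally larger enhancement when the uni-modal response is weak.
weak = enhancement(combined=12.0, best_unimodal=4.0)    # low-efficacy stimuli
strong = enhancement(combined=48.0, best_unimodal=40.0) # high-efficacy stimuli
print(weak, strong)  # 200.0 vs 20.0 percent
```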

Such inverse effectiveness has also been reported in human studies. For example, in the Diederich and Colonius (2004) study, the time gained in bi-modal stimulus combinations, relative to uni-modal conditions, was larger for low-intensity stimuli and smaller for high-intensity stimuli. Similarly, when the performance of young and aged individuals was compared, the aged responded more slowly to uni-modal signals, yet the time gained in the bi-modal combination was significantly greater in the aged (Laurienti et al. 2006). Electrophysiological patterns of multisensory integration at the early stage of sensory analysis (before 150 ms post-stimulus) likewise showed that in visually-dominant subjects little integration occurred in the visual cortex, whereas a clear effect was displayed in the auditory cortex. Conversely, auditory-dominant subjects showed early integration effects in the visual cortex but no effect in the auditory cortex, suggesting that multisensory integration enhances neural activity predominantly in the cortex of the "non-dominant" sensory modality (Giard and Peronnet 1999).

All three significant MSE differences reported in the current study (Table 2) are in accordance with, and thus provide further support for, the inverse effectiveness principle. First, more time was gained in the bi-modal haptic-visual combination (compared to the uni-modal haptic signal) by the left hand than by the right hand. This difference in MSE effectiveness was inversely related to left-hand performance, as left-hand RTs were always slower in our right-handed participants. Second, in the bi-modal haptic-visual combination the MSE was larger in the right hemispace. Again, as detection of the haptic signal presented on its own was significantly faster in the left hemispace, the results show that MSE effectiveness was larger in the "weaker" hemispace. The same point can perhaps be made for the third MSE difference found in the current study, the gender difference: the MSE in the bi-modal audio-visual combination was greater for females than for males. The mean RT for the auditory signal was similar for males and females (317 and 316 ms, respectively); however, the RT for the visual stimulus showed a small male advantage (395 and 407 ms, respectively). Thus, the MSE effectiveness was significantly greater in the slower-performing group. Overall, our results suggest that the addition of a second signal in a different modality resulted in significant time gains for detection, and that these multi-modality-dependent gains were larger for the slowest uni-modal conditions (left hand, females, right hemispace) and smaller for the fastest uni-modal conditions (right hand, males, left hemispace).
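In behavioral terms, the MSE discussed here is the time gained by a multi-modal signal over the faster of its uni-modal components. A sketch of that computation using the uni-modal gender means quoted above; the bi-modal audio-visual RTs (290 and 280 ms) are invented for illustration and are not the values measured in this study:

```python
def mse(rt_bimodal, rt_uni_1, rt_uni_2):
    """Multisensory enhancement (ms): fastest uni-modal RT
    minus the bi-modal RT."""
    return min(rt_uni_1, rt_uni_2) - rt_bimodal

# Uni-modal means from the text; bi-modal RTs are assumed.
males = mse(rt_bimodal=290, rt_uni_1=317, rt_uni_2=395)
females = mse(rt_bimodal=280, rt_uni_1=316, rt_uni_2=407)
print(males, females)  # 27 36: the slower uni-modal group gains more
```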

The current study was undertaken to test whether bi- and tri-modal stimulus combinations delivered while participants are engaged in active movements may result in enhanced performance despite the movement-related "gating" of tactile perception. Given the inverse effectiveness principle, the magnitude (in ms) and significance of the bi- and tri-modal enhancements may reflect the fact that participants were tested while engaged in active movement, a condition that reduces and "gates" tactile perception. The more "difficult" haptic detection condition imposed in the current study may thus, paradoxically, have resulted in large multisensory gains. It is therefore not clear whether the same multi-modal signals would produce equally large effects in conditions where participants experience haptic stimulation in a more passive manner. In conditions affording highly effective tactile perception, the bi- and tri-modal enhancements may be smaller in magnitude.

In sum, the results of the current study indicate that redundant, semantically-unrelated tri-modal (visual, auditory and haptic) signals are detected faster than their bi-modal combinations, which in turn are detected faster than the corresponding uni-modal components. These enhancements occurred even though the haptic stimuli were delivered while participants were performing active movements. As our technologies become more sophisticated and combine signals from multiple sensory channels, it is of practical importance to establish, and reassuring to find, enhanced perception when information is delivered in several sensory modalities, even when the haptic component is provided during active exploration. The latter point is crucial because many multi-modal interfaces with haptic technologies are designed for users engaged in manipulation and active limb movements. Our results show that although haptic perception is attenuated during active movement, including it in multi-modal signals is nevertheless advantageous. The finding that the maximal bi-modal enhancements occurred for the least effective uni-modal responses is in line with the inverse effectiveness principle. For applied purposes, this principle suggests that multi-modal signaling may be most valuable precisely where a single uni-modal stimulus is least salient.


This research was funded by the EU research project PRESENCCIA––Presence: Research Encompassing Sensory Enhancement, Neuroscience, Cerebral-Computer Interfaces and Applications. We thank Mr. Gad Halevy for programming the computer for the experiment, and Ms. Ayelet Gal-Oz for her help in collecting the data. We also thank Mrs. Tatiana Gelfeld for her help in the race-model analysis.

Copyright information

© Springer-Verlag 2007