Twenty years ago, Aglioti, DeSouza, and Goodale (1995) reported that reach-to-grasp kinematics were unaffected by illusory size differences and interpreted this finding within the framework of Goodale and Milner’s (1992) two-visual-systems hypothesis (TVSH). According to the TVSH, actions resist visual illusions because the visual networks of the dorsal “stream” process objects using absolute metrics, coding them for actions in effector-specific spatial frames of reference. In contrast, perceptual judgments are routinely “fooled” by visual illusions, because the ventral stream processes objects using relative metrics within scene-based reference frames.

In the wake of Aglioti et al.’s (1995) study, many investigators questioned the TVSH interpretation, pointing to possible confounds in Aglioti et al.’s experimental design, such as an attentional mismatch between the “perception” and “action” tasks (e.g., Franz, Gegenfurtner, Bülthoff, & Fahle, 2000); the availability of online visual feedback for the grasps (Haffenden & Goodale, 1998); the hand’s avoidance of the surrounding annulus that induces the Ebbinghaus illusion (Franz, Bülthoff, & Fahle, 2003; Haffenden, Schiff, & Goodale, 2001); differences in the computed functions that relate changes in stimulus size to changes in the measured response, and the effect that these functions have on calculating the magnitude of the illusion bias (Franz, Fahle, Bülthoff, & Gegenfurtner, 2001); and a mismatch in the availability of haptic feedback when the hand makes contact with the target (Haffenden & Goodale, 1998). There is still no consensus on whether this dissociation reflects a fundamental difference in ventral- and dorsal-stream visual processing, as is outlined in the TVSH (for reviews, see Franz & Gegenfurtner, 2008; Westwood & Goodale, 2011).

Here we take a different approach, motivated by the observation that the vast majority of studies in this debate have assumed that illusory effects are stable across time. As Coren and Girgus (1978) discussed, participants show reductions in illusory bias for the Müller-Lyer illusion when tested over the course of consecutive days. Coren and Girgus’s review of studies that tracked the eye movements of participants performing perceptual judgments supports the contention that illusion decrements can be explained by changes in the underlying sampling strategies. Over time, fixations tend to land closer to the physical endpoints of the test lines, rather than the endpoints implied by the illusion. These changes in fixation presumably lead iteratively to changes in the nature of the information that is available when making a judgment about line length.

In the studies reviewed by Coren and Girgus (1978), participants made perceptual judgments; they did not interact physically with the target. Few investigators have asked whether any effects of the illusion on actions might also show a decrement over time. Franz et al. (2001) estimated the effect of the Ebbinghaus illusion on each trial for both judgments of disk size and grasps and found the effects to be stable (though understandably noisy) across both modes of responding. Gonzalez, Ganel, Whitwell, Morrissey, and Goodale (2008) showed that unfamiliar thumb and ring-finger grasps executed with the right hand are initially biased by the illusion, but develop an immunity over three consecutive days of testing. The possibility that the effect of an illusion on action can diminish with practice suggests that motor learning and error minimization might play roles in the resistance of traditional precision grasps to illusory-size bias. Whether or not perceptual judgments can access this process remains an open question.

In the present study, we assessed changes in the strength of the Ponzo illusion over time, both for perceptual judgments and for natural precision grasps. It is possible that the effects of the Ponzo illusion would initially be observable on grip aperture, but that repeated grasping of targets embedded in the display would lead to a rapid reduction in the influence of the illusion, in a manner analogous to that seen with fingertip force scaling in the context of the size–weight illusion (see Buckingham, 2014, for a review).

Method

Participants

We tested 35 participants (M = 20.4 years, SD = 2.3; 20 female) in the main experiment and eight participants (M = 20.3 years, SD = 1.7, four female) in a control study. All of the participants were self-reported right-hand dominant, provided informed consent in accordance with the protocol approved by the University of Western Ontario local ethics committee, and were compensated $10 CAD.

Apparatus

The Ponzo display is depicted in Fig. 1a. A bar appears to be longer when it is positioned where the background edges converge rather than where the background edges diverge. The target bars were 2 mm wide and 4 mm tall and differed in length (48 vs. 50 mm), allowing us to separate the effects of real differences in target length on the two modes of responding (grasps and manual estimations) from the effects of illusory differences in length. Note that the experimenter presented only one of the two bars on any given trial, at one of the two possible locations on the display.

Fig. 1

(a) Ponzo background displays used in the present study. The two orientations depicted were counterbalanced across participants. Note that two bars are included to show the classic Ponzo illusion. In this study, only one bar was presented per trial. (b) The manual estimation and grasping tasks used in the present study. Participants either indicated the apparent length of the target by displacing their thumb and index fingers while keeping the rest of their hand stationary, or they reached out to pick up the target across its length. Again, only one target bar was presented per trial. (c) The manual estimation and grasping tasks were timed using goggles equipped with lenses that the experimenter could switch between translucent and transparent states. Participants were instructed to respond when the goggles cleared. The goggles remained open for 2.5 s after participants released the start button. Thus, both tasks were performed in a visual “closed” loop. (d) The experimental protocol was designed (1) to parse variance in the response into the effect of the real difference in target length and the magnitude of an illusory bias induced by the Ponzo display, and (2) to allow us to operationalize the timing for each of these effects by binning adjacent groups of four consecutive trials for temporal analysis.

LCD shutter goggles (PLATO goggles; Translucent Technologies, Toronto, ON, Canada) were used to control participants’ view of the workspace. The workspace itself consisted of a black table with a start button located 5 cm in front of the participant along the midsagittal plane. The kinematic data were collected at 200 Hz using an optoelectronic system (OPTOTRAK 3020; Northern Digital, Waterloo, ON, Canada) that recorded the 3-D locations of three infrared light-emitting diodes (IREDs).

Procedure

Participants were seated comfortably at the table. Using adhesive tape, the experimenter attached one IRED to the inside corner of the participant’s thumb nail, a second IRED to the inside corner of the index-finger nail, and a third to the wrist. The tape did not cover the pads of the fingers, allowing normal tactile feedback when the participants touched the target objects. For the manual estimation task, the participants were asked to keep their thumb and index finger pinched together while using these same fingers to depress a start button at the beginning of each trial. When the goggles cleared, the participants were asked to look at the target bar and then to indicate its apparent length by displacing their thumb and index finger a matching amount (see Fig. 1b). The participants were asked to refrain from reaching toward the stimulus by keeping the edge of their hand firmly on the surface of the table while opening their finger and thumb. Five practice trials were administered using stimuli that were not used in the experiment, on a nonillusion background.

Next, participants practiced reaching to grasp the practice stimuli. Participants were again instructed to begin each trial by keeping their thumb and index finger pinched together on the start button. When the goggles cleared, the participants were asked to view the target bar and to reach out to pick it up along its length (see Fig. 1b). Again, participants practiced this task five times. For both the manual estimation and grasping tasks, the goggles remained clear for 2.5 s following the release of the start button (see Fig. 1c), allowing participants enough time to perform their grasps and manual estimations with full vision.

Experimental design

We employed an ABA design. Specifically, a block of trials in which participants performed manual estimations (A) was followed by a block of trials in which participants performed the grasping task (B), which was then followed by a final block of manual estimation trials (A). In a brief follow-up control experiment designed to test the possibility that a general effect of haptic feedback could account for our principal findings, participants performed a single block of manual estimation trials but were instructed to reach out and pick up the targets immediately after each estimate (e.g., Ganel, Tanzer, & Goodale, 2008).

We counterbalanced the two possible horizontal orientations of the Ponzo display across participants. Furthermore, to minimize the enhancement of the perceptual bias that might occur if two targets were presented, one at each end of the display (see, e.g., Franz et al., 2000), we presented only one target bar per trial.

To quantify changes in the illusory-size effect on the responses across time, we grouped the trials into “bins” of four unique and consecutive trials (see Fig. 1d). The trials in a given bin contained one of each of the four possible combinations of target length and target position (converging vs. diverging) on the Ponzo display. In this way, the responses on at least two trials would contribute to the sample stability of the effects of interest when we examined them on a bin-by-bin basis. Trial orders were constructed by randomly selecting ten of these four-trial bins from the population of the 24 possible four-trial combinations for each participant. In total, 40 grasping trials were bracketed by two blocks of 20 manual estimation trials to complete the ABA format.
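The binning scheme can be sketched in a few lines of Python. This is an illustration only, not the authors' procedure: the names `all_bin_orders` and `build_trial_order` are hypothetical, and sampling the ten bins without replacement is an assumption, since the text does not say whether a given four-trial order could recur.

```python
import itertools
import random

LENGTHS_MM = (48, 50)                      # the two real target lengths
POSITIONS = ("converging", "diverging")    # the two locations on the display


def all_bin_orders():
    """Return every ordering of the four length-by-position conditions (4! = 24)."""
    conditions = list(itertools.product(LENGTHS_MM, POSITIONS))
    return list(itertools.permutations(conditions))


def build_trial_order(n_bins=10, seed=None):
    """Draw n_bins four-trial bins and flatten them into one trial sequence.

    Assumption: bins are drawn without replacement from the 24 orderings.
    """
    rng = random.Random(seed)
    bins = rng.sample(all_bin_orders(), n_bins)
    return [trial for one_bin in bins for trial in one_bin]


if __name__ == "__main__":
    order = build_trial_order(seed=1)
    print(len(order), "trials; first bin:", order[:4])
```

Because each bin is a permutation of the same four conditions, any run of four consecutive trials that starts on a bin boundary contains every length-by-position combination exactly once.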

Data processing and analysis

The data were analyzed offline with custom software written in MATLAB (Mathworks Inc., Natick, MA, USA). The data from the IREDs were low-pass filtered at 20 Hz using a second-order Butterworth digital filter. Grip aperture was computed as the Euclidean distance between the IRED placed on the thumb and the IRED placed on the index finger, and the instantaneous velocities were computed for each of the three IREDs and for grip aperture. The peak grip aperture was defined as the largest grip aperture within a velocity-based temporal search window designed to capture the forward-reach component of the movement. The manual estimate aperture was defined using a measure of grip stability (grip aperture velocity).
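The aperture computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: the function names are hypothetical, the Butterworth filtering step is omitted, velocity is approximated by finite differences, and the search window is passed in as plain sample indices because the text does not give the velocity thresholds that define it.

```python
import math

SAMPLE_RATE_HZ = 200.0  # sampling rate reported for the motion-tracking system


def grip_aperture(thumb_xyz, index_xyz):
    """Euclidean distance between thumb and index-finger markers, per sample."""
    return [math.dist(t, i) for t, i in zip(thumb_xyz, index_xyz)]


def velocity(series, rate_hz=SAMPLE_RATE_HZ):
    """Finite-difference velocity: central differences, one-sided at the ends."""
    n = len(series)
    if n < 2:
        return [0.0] * n
    out = []
    for k in range(n):
        lo, hi = max(k - 1, 0), min(k + 1, n - 1)
        out.append((series[hi] - series[lo]) * rate_hz / (hi - lo))
    return out


def peak_in_window(aperture, start, stop):
    """Largest aperture within a search window given as sample indices.

    Assumption: the caller derives start/stop from a velocity criterion.
    """
    return max(aperture[start:stop])
```

Keeping the window selection separate from the peak search means any velocity-based criterion can be substituted without touching the aperture code.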

Sensitivity to the real difference in target length was determined by subtracting the mean dependent measures (peak grip aperture or manual estimate aperture) for responses to the 48-mm target from those for responses to the 50-mm target. The magnitude of bias induced by the Ponzo display was determined by subtracting the mean values for responses directed at targets positioned where the Ponzo display diverged (i.e., the illusory “short” condition) from those for responses directed at targets positioned where the Ponzo display converged (i.e., the illusory “long” condition), with positive values reflecting an illusion-based bias.
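The two difference scores can be written out directly. The sketch below assumes each trial is stored as a hypothetical `(length_mm, position, aperture_mm)` tuple; the function names and the sample apertures are illustrative, not data from the study.

```python
from statistics import mean


def length_sensitivity(trials):
    """Mean aperture for 50-mm targets minus mean aperture for 48-mm targets."""
    long_bar = mean(a for length, _, a in trials if length == 50)
    short_bar = mean(a for length, _, a in trials if length == 48)
    return long_bar - short_bar


def illusion_bias(trials):
    """Mean aperture at the converging end minus mean at the diverging end.

    Positive values indicate a bias in the direction of the illusion.
    """
    converging = mean(a for _, pos, a in trials if pos == "converging")
    diverging = mean(a for _, pos, a in trials if pos == "diverging")
    return converging - diverging


if __name__ == "__main__":
    # One made-up four-trial bin (apertures in mm), for illustration only.
    one_bin = [
        (48, "converging", 70.0),
        (48, "diverging", 67.0),
        (50, "converging", 72.0),
        (50, "diverging", 69.0),
    ]
    print(length_sensitivity(one_bin), illusion_bias(one_bin))
```

Because every bin crosses both lengths with both positions, each score averages over the other factor, which is what lets the real and illusory effects be separated.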

The effects of target length and the magnitude of the bias induced by the Ponzo display were computed for each task for each participant using session-wise averages. We used paired t tests to determine (1) whether the responses, on the whole, were sensitive to an illusion-based bias and to the real difference in length between the target bars and (2) whether this sensitivity differed between the two tasks. We also examined these effects over a shorter temporal scope by computing the influence of the illusion on the first half (20 trials) and the second half (20 trials) of the block of grasping, to test for a simple change in the effect. We performed a similar test on the manual estimation trials. All statistical tests used an alpha criterion of .05 (two-tailed).
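For reference, the paired t statistic used in these comparisons can be computed as sketched below. The helper name `paired_t` is hypothetical and the example values are made up; the p value is left out because it requires the t distribution, which the Python standard library does not provide.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Return (t, df) for a paired-samples t test on two equal-length lists."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # sample SD with n - 1
    return mean(diffs) / standard_error, n - 1


if __name__ == "__main__":
    t, df = paired_t([3, 5, 7, 9], [1, 2, 3, 4])
    print(round(t, 3), df)
```

With 35 participants contributing one score per task, the degrees of freedom are 34, which matches the t(34) values reported in the Results.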

Results

The analysis of the sensitivity of the grasps and the manual estimations to the real 2-mm difference in target length was used to determine whether an analysis of the raw effects of the illusion was justified or required an adjustment for differences in sensitivity. As Fig. 2a indicates, the grasps and the manual estimates each showed sensitivity to the 2-mm difference in target length. Importantly, the sensitivities did not differ significantly from one another, t(34) = 0.5, p = .68. We also found no significant difference in sensitivity for the manual estimations and grasps between the first [t(34) = 0.87, p = .34] and second [t(34) = 1.59, p = .12] halves of the trials. Finally, a between-task comparison of sensitivity within the first and second halves of the trials yielded no significant differences [first half: t(34) = 0.18, p = .86; second half: t(34) = 1.38, p = .18]. Thus, an adjustment to the raw effects was not required, so we report our analysis of the raw effects themselves.

Fig. 2

Session-wise averages for the separable effects of (a) the 2-mm difference in target length on the manual estimations and grasps, and (b) the magnitude of the illusory bias induced by the Ponzo display on the manual estimations and grasps.

As Fig. 2b indicates, the Ponzo display induced a significantly larger overall illusion-based bias in the manual estimations than it did in the grasps, t(34) = 2.6, p < .02. The overall bias on the manual estimations differed significantly from zero, t(34) = 4.94, p < 3 × 10⁻⁵. This was not the case, however, for the grasps, t(34) = 1.64, p = .11.

We then tested for a possible difference in the effects of the illusion on the first block of 20 manual estimation trials (Bins 1–5) and the second block of 20 trials (Bins 6–10). As Fig. 3a indicates, no significant difference was found, t(34) = 0.12, p = .91. This was not the case for the grasps. As Fig. 3a also indicates, the grasps were significantly less affected by the illusion in the second than in the first half, t(34) = 2.52, p < .02. Importantly, however, as Fig. 3b shows, the effect of the illusion observed in the first half of the grasping trials was largely due to the first few trials. Indeed, whereas the effect of the illusion in the first bin differed significantly from zero, t(34) = 3.13, p < 4 × 10⁻³ (corrected α criterion ≤ 5 × 10⁻³), the remaining nine bins did not.
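The corrected criterion quoted for the bin-wise tests equals what a Bonferroni adjustment of the .05 alpha across the ten bins would give. The text does not name the correction, so the following is an assumption, with m denoting the number of bins tested:

```latex
\alpha_{\text{corrected}} = \frac{\alpha}{m} = \frac{.05}{10} = 5 \times 10^{-3}
```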

Fig. 3

Temporal analysis of the magnitudes of the illusion induced by the Ponzo display on the manual estimations and the grasps. (a) The manual estimations fail to show any difference between the first half and the second half of the trials, whereas the grasps show a significantly smaller magnitude of illusory bias in the second half than in the first. (b) Bin-by-bin analysis of the magnitudes of illusory bias for the manual estimations and grasps. The manual estimations yield no significant evidence for a linear or an exponential trend, whereas the grasps reveal an exponential decay in illusory bias. (c) The decrements in illusory bias observed in the grasps are largely a function of the changes in the peak grip aperture occurring across the first several grasping trials directed at targets positioned where the edges of the Ponzo display diverge (i.e., where the Ponzo display makes the target bar appear to be shorter than it actually is). These changes in peak grip aperture follow a rational function reasonably well.

To quantify this reduction in the effect of the illusion, we modeled the data by fitting a “baseline” linear function, f(x) = bx + a, and a “test” exponential decay function, f(x) = a·b^x, to the set of ten bins of group means. Both functions entail estimating two parameters, and the exponential function has the added advantage of being monotonic and possessing an asymptote. We used the coefficient of determination (R²) to identify which of the two functions provided a closer approximation to the data and performed an F test on the proportion of variance explained by the model. Note that neither model can be nested in the other to perform a formal F test to contrast the two. Thus, we adopted a straightforward diagnostic: The function that yielded the higher R² provided the better fit.
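The R² comparison can be sketched in plain Python. This is an illustration under assumptions, not the authors' fitting routine: the exponential model is read here as f(x) = a·b^x with 0 < b < 1, it is fitted by a grid search over b with a solved in closed form by least squares, and all function names are hypothetical. The example data are synthetic.

```python
def r_squared(y, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - fit) ** 2 for obs, fit in zip(y, predicted))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return 1.0 - ss_res / ss_tot


def fit_linear(x, y):
    """Ordinary least squares for f(x) = b*x + a; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


def fit_exponential(x, y, steps=2000):
    """Least-squares fit of f(x) = a * b**x by grid search; returns (a, b).

    Assumption: decay, so b is searched on the open interval (0, 1).
    For each candidate b the best a has a closed-form solution.
    """
    best = None
    for k in range(1, steps):
        b = k / steps
        basis = [b ** xi for xi in x]
        a = sum(yi * g for yi, g in zip(y, basis)) / sum(g * g for g in basis)
        sse = sum((yi - a * g) ** 2 for yi, g in zip(y, basis))
        if best is None or sse < best[0]:
            best = (sse, a, b)
    return best[1], best[2]


if __name__ == "__main__":
    bins = list(range(1, 11))
    effect = [3.0 * 0.6 ** k for k in bins]          # synthetic decay
    a_lin, b_lin = fit_linear(bins, effect)
    a_exp, b_exp = fit_exponential(bins, effect)
    r2_lin = r_squared(effect, [b_lin * k + a_lin for k in bins])
    r2_exp = r_squared(effect, [a_exp * b_exp ** k for k in bins])
    print("linear R2:", r2_lin, "exponential R2:", r2_exp)
```

Both models have two free parameters, so comparing their R² values does not favor either through extra flexibility, which is the rationale given in the text for the simple diagnostic.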

The exponential decay function yielded a larger coefficient of determination (R² = .72, p < 2 × 10⁻³) than did the linear one (R² = .56, p < .02). For completeness, Fig. 3b also indicates the group mean illusion effects across successive bins for the manual estimations. Neither linear nor exponential fits captured significant amounts of variance for Bins 1–5 or Bins 6–10 (all p values > .12). Thus, the overall average adequately describes the variability in the illusory-size effect across the bins of manual estimations.

To take a closer look at the decrements in the effects of the illusion for the grasps, we parsed the effect in each bin into its composite, complementary conditions (illusory “short” and “long” configurations). We reasoned that a decrement in the influence of the illusion would be evidenced by an initial difference in peak grip aperture between the “short” and “long” configurations that would be minimized over the course of several trials as the peak grip apertures converged toward one another. Notably, the difference in peak grip apertures between these two conditions was independent of the difference in the real lengths of the targets.

As Fig. 3c shows, the peak grip aperture for grasps directed at “illusory long” targets remained relatively steady across the bins. Indeed, a linear fit of this data set failed to account for a significant proportion of variability (p = .62). Examination of the peak grip apertures for grasps directed at “illusory short” targets tells a different story (see Fig. 3c). Here, the peak grip aperture increased across successive bins toward a horizontal asymptote, approaching the overall average peak grip aperture for the complementary condition. We decided to fit a “baseline” linear f(x) = bx + a and then a “test” rational function of the form f(x) = bx/(x + a) to this data set. We opted to use this rational function due to its simplicity. Like the linear and exponential functions, only two parameters are estimated for this function, and like the exponential decay function, it is monotonic and possesses an asymptote. We adopted the same diagnostic outlined above: The function that yielded the higher R² provided the better fit. The rational function provided a better fit (R² = .75, p < 2 × 10⁻³) than the linear one (R² = .58, p < .02).
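The two properties claimed for the rational function can be checked directly. The short derivation below assumes a > 0 and b > 0 and bin numbers x > 0; these sign constraints are assumptions, since the fitted parameter values are not reported:

```latex
f(x) = \frac{b\,x}{x + a}, \qquad
f'(x) = \frac{b\,(x + a) - b\,x}{(x + a)^{2}} = \frac{a\,b}{(x + a)^{2}} > 0, \qquad
\lim_{x \to \infty} f(x) = b
```

Under these assumptions the function rises monotonically toward the horizontal asymptote b, and a sets how quickly it gets there: f(a) = b/2, so a is the bin number at which half of the asymptotic value is reached.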

To determine whether or not sensory feedback (visual and haptic) could account for the reduction in the illusory-size bias, we asked an additional group of eight participants to perform a series of manual estimations. Critically, after each manual estimation was completed, the participants reached out to pick up the targets. The goggles remained open for 5 s following the release of the start button, allowing sufficient time to complete both tasks. If haptic feedback of target size could account for the reduction in the bias in some general way, then we should observe a similar reduction for the manual estimations. Despite the haptic feedback in this task, we still observed a significant effect of the illusion on the manual estimations, M = 2.2 mm, SD = 1.3 mm, t(7) = 4.79, p < 3 × 10⁻³, which did not differ significantly from that observed in the group that had received no haptic feedback, t(41) = 0.62, p = .54. Levene’s test detected no violation of homogeneity of variance between the two samples, F(1, 41) = 0.123, p = .73. In short, providing haptic feedback after each manual estimation did not affect the magnitude or variability of the illusion-driven bias.

Discussion

The aims of this study were (1) to chart the time course of the illusory effects of a Ponzo display on visually guided grasping and (2) to determine whether or not the effects of such displays on grasps and manual estimations follow distinct trial-to-trial time courses. We found a significant effect of the illusion on manual estimations that remained relatively stable throughout testing. The grasps, in contrast, showed a significant effect of the Ponzo display initially, but this effect decayed exponentially over the series of iterative acts. These findings suggest different time courses for the effects of the Ponzo display on the two different response modes: Grasps showed a much more rapid decrement in the biasing effect of the Ponzo display than did manual estimations.

We also examined the peak grip aperture of the grasping movements across the series of grasping trials by breaking down the effect of the illusion into its constituent conditions. We reasoned that if the effect of the display on grip aperture was due to the illusion, then the peak grip apertures for the display configurations would differ initially, before converging over the course of several trials. In contrast to this prediction, we found that the peak grip aperture remained relatively constant when the target appeared at the converging edges of the Ponzo display (i.e., the illusory “long” condition). When the target was near the diverging edges (i.e., the illusory “short” condition), the peak grip aperture was smaller for the first few grasping trials and then quickly increased over the course of the subsequent trials to the levels observed for the grasps in the “long” condition.

These asymmetric effects of the Ponzo display on peak grip aperture suggest that the sensorimotor system may have treated the diverging edges as an obstacle. Programming a smaller grip aperture would be an effective way to minimize the probability of collision with obstacles that flank a goal object (Mon-Williams, Tresilian, Coppard, & Carson, 2001; Voudouris, Smeets, & Brenner, 2012). Indeed, these effects have been systematically explored using the Ebbinghaus display (e.g., Haffenden et al., 2001), highlighting the danger of conflating illusory and obstacle-based effects. Here, however, both the illusion-based and obstacle-based effects of the Ponzo display would operate in the same direction. Importantly, regardless of whether the observed effect on the grasps was illusion-based or obstacle-based, our findings strongly suggest that the sensorimotor system treated this bias as an error, minimizing it over the course of several iterative grasps.

Importantly, an error-minimizing account of action (e.g., Shadmehr, Smith, & Krakauer, 2010) can explain the rapid decrease in either an illusion-based or an obstacle-based effect. In the context of reaching and grasping, the sensorimotor system’s modus operandi is to specify parameters of the motor program that result in the hand’s smooth and efficient acquisition of the goal object. Error detection, minimization, and movement updating would be engaged particularly when the grasp aperture underestimates the length of the target bar, because the hand needs to open widely enough for the fingers to approach the opposing sides at angles that are suitable for smooth contact and a subsequent lifting phase.

Error minimization is a critical component in contemporary computational theories of motor control, movement updating, and motor learning. According to these theories, feedforward “inverse” models generate motor commands, whereas “forward” models accept copies of those commands to generate predictions about the sensory feedback from movement execution. A comparison of the expected and observed sensory consequences yields corrective error signals that are incorporated into the next volley of motor commands (Kawato, 1999). Movement updating can occur during the movement itself and/or after the movement is completed (Desmurget & Grafton, 2000; Kawato, 1999). In the present study, both visual and haptic sensory feedback were available, and either source of feedback could have been used offline to help fine-tune the parameterization of the inverse model’s program for more accurate motor commands. Notably, however, feedback and its role in online updating cannot explain our key findings, because online updating had the same opportunity from the outset to mitigate the effect of the display.

In summary, our results show that although the perceptual estimates and grasp apertures are equally sensitive to real differences in target length on initial trials, only the perceptual estimates remain biased by the illusion over repeated measurements. The relative stability of the perceptual bias induced by the display, despite the fact that participants were allowed to grasp the targets after each estimate, suggests that the error minimization for grasps is refractory to conscious access. Taken together, our findings support the notion of a relatively encapsulated sensorimotor system (housed in fronto-parietal areas of the cerebral cortex) that is refractory to conscious access and serviced not only by visual feedback and feedforward information processed in the dorsal stream, but also by the haptic sources of feedback from pathways located in the anterior parietal cortex. Our findings also suggest that at least some of the dorsal stream’s overall immunity to the effects of pictorial displays can be explained by error minimization and movement-updating processes of this system which rapidly refine the programming of visually guided grasping.