Rapid decrement in the effects of the Ponzo display dissociates action and perception
It has been demonstrated that pictorial illusions have a smaller influence on grasping than they do on perceptual judgments. Yet to date this work has not considered the reduced influence of an illusion as it is measured repeatedly. Here we studied this decrement in the context of a Ponzo illusion to further characterize the dissociation between vision for perception and for action. Participants first manually estimated the lengths of single targets in a Ponzo display with their thumb and index finger, then actually grasped these targets in another series of trials, and then manually estimated the target lengths again in a final set of trials. The results showed that although the perceptual estimates and grasp apertures were equally sensitive to real differences in target length on the initial trials, only the perceptual estimates remained biased by the illusion over repeated measurements. In contrast, the illusion’s effect on the grasps decreased rapidly, vanishing entirely after only a few trials. Interestingly, a closer examination of the grasp data revealed that this initial effect was driven largely by undersizing the grip aperture for the display configuration in which the target was positioned between the diverging background lines (i.e., when the targets appeared to be shorter than they really were). This asymmetry between grasping apparently shorter and longer targets suggests that the sensorimotor system may initially treat the edges of the configuration as obstacles to be avoided. This finding highlights the sensorimotor system’s ability to rapidly update motor programs through error feedback, manifesting as an immunity to the effects of illusion displays even after only a few trials.
Keywords: Ponzo illusion, Illusion decrement, Action, Perception, Grasping, Sensorimotor, Visuomotor, Visual feedback, Haptic feedback, Error minimization, Motor learning, Two-visual-systems hypothesis
Twenty years ago, Aglioti, DeSouza, and Goodale (1995) reported that reach-to-grasp kinematics were unaffected by illusory size differences and interpreted this finding within the framework of Goodale and Milner’s (1992) two-visual-systems hypothesis (TVSH). According to the TVSH, actions resist visual illusions because the visual networks of the dorsal “stream” process objects using absolute metrics, coding them for actions in effector-specific spatial frames of reference. In contrast, perceptual judgments are routinely “fooled” by visual illusions, because the ventral stream processes objects using relative metrics within scene-based reference frames.
In the wake of Aglioti et al.’s (1995) study, many investigators questioned the TVSH interpretation, pointing to possible confounds in Aglioti et al.’s experimental design, such as an attentional mismatch between the “perception” and “action” tasks (e.g., Franz, Gegenfurtner, Bülthoff, & Fahle, 2000); the availability of online visual feedback for the grasps (Haffenden & Goodale, 1998); the hand’s avoidance of the surrounding annulus that induces the Ebbinghaus illusion (Franz, Bülthoff, & Fahle, 2003; Haffenden, Schiff, & Goodale, 2001); differences in the computed functions that relate changes in stimulus size to changes in the measured response, and the effect that these functions have on calculating the magnitude of the illusion bias (Franz, Fahle, Bülthoff, & Gegenfurtner, 2001); and a mismatch in the availability of haptic feedback when the hand makes contact with the target (Haffenden & Goodale, 1998). There is still no consensus on whether this dissociation reflects a fundamental difference in ventral- and dorsal-stream visual processing, as is outlined in the TVSH (for reviews, see Franz & Gegenfurtner, 2008; Westwood & Goodale, 2011).
Here we take a different approach, motivated by the observation that the vast majority of studies in this debate have assumed that illusory effects are stable across time. As Coren and Girgus (1978) discussed, participants show reductions in illusory bias for the Müller-Lyer illusion when tested over the course of consecutive days. Coren and Girgus’s review of studies that tracked the eye movements of participants performing perceptual judgments supports the contention that illusion decrements can be explained by changes in the underlying sampling strategies. Over time, fixations tend to land closer to the physical endpoints of the test lines, rather than the endpoints implied by the illusion. These changes in fixation presumably lead iteratively to changes in the nature of the information that is available when making a judgment about line length.
In the studies reviewed by Coren and Girgus (1978), participants made perceptual judgments; they did not interact physically with the target. Few investigators have asked whether any effects of the illusion on actions might also show a decrement over time. Franz et al. (2001) estimated the effect of the Ebbinghaus illusion on each trial for both judgments of disk size and grasps and found the effects to be stable (though understandably noisy) across both modes of responding. Gonzalez, Ganel, Whitwell, Morrissey, and Goodale (2008) showed that unfamiliar thumb and ring-finger grasps executed with the right hand are initially biased by the illusion, but develop an immunity over three consecutive days of testing. The possibility that the effect of an illusion on action can diminish with practice suggests that motor learning and error minimization might play roles in the resistance of traditional precision grasps to illusory-size bias. Whether or not perceptual judgments can access this process remains an open question.
In the present study, we assessed changes in the strength of the Ponzo illusion over time, both for perceptual judgments and for natural precision grasps. It is possible that the effects of the Ponzo illusion would initially be observable on grip aperture, but that repeated grasping of targets embedded in the display would lead to a rapid reduction in the influence of the illusion, in a manner analogous to that seen with fingertip force scaling in the context of the size–weight illusion (see Buckingham, 2014, for a review).
We tested 35 participants (M = 20.4 years, SD = 2.3; 20 female) in the main experiment and eight participants (M = 20.3 years, SD = 1.7; four female) in a control study. All of the participants reported being right-hand dominant, provided informed consent in accordance with the protocol approved by the University of Western Ontario local ethics committee, and were compensated $10 CAD.
LCD shutter goggles (PLATO goggles; Translucent Technologies, Toronto, ON, Canada) were used to control participants’ view of the workspace. The workspace itself consisted of a black table with a start button located 5 cm in front of the participant along the midsagittal plane. The kinematic data were collected at 200 Hz using an optoelectronic system (OPTOTRAK 3020; Northern Digital, Waterloo, ON, Canada) that recorded the 3-D locations of three infrared light-emitting diodes (IREDs).
Participants were seated comfortably at the table. Using adhesive tape, the experimenter attached one IRED to the inside corner of the participant’s thumb nail, a second IRED to the inside corner of the index-finger nail, and a third to the wrist. The tape did not cover the pads of the fingers, allowing normal tactile feedback when the participants touched the target objects. For the manual estimation task, the participants were asked to keep their thumb and index finger pinched together while using these same fingers to depress a start button at the beginning of each trial. When the goggles cleared, the participants were asked to look at the target bar and then to indicate its apparent length by displacing their thumb and index finger a matching amount (see Fig. 1b). The participants were asked to refrain from reaching toward the stimulus by keeping the edge of their hand firmly on the surface of the table while opening their finger and thumb. Five practice trials were administered using stimuli that were not used in the experiment, on a nonillusion background.
Next, participants practiced reaching to grasp the practice stimuli. Participants were again instructed to begin each trial by keeping their thumb and index finger pinched together on the start button. When the goggles cleared, the participants were asked to view the target bar and to reach out to pick it up along its length (see Fig. 1b). Again, participants practiced this task five times. For both the manual estimation and grasping tasks, the goggles remained clear for 2.5 s following the release of the start button (see Fig. 1c), allowing participants enough time to perform their grasps and manual estimations with full vision.
We employed an ABA design. Specifically, a block of trials in which participants performed manual estimations (A) was followed by a block of trials in which participants performed the grasping task (B), which was then followed by a final block of manual estimation trials (A). In a brief follow-up control experiment designed to test the possibility that a general effect of haptic feedback could account for our principal findings, participants performed a single block of manual estimation trials but were instructed to reach out and pick up the targets immediately after each estimate (e.g., Ganel, Tanzer, & Goodale, 2008).
We counterbalanced the two possible horizontal orientations of the Ponzo display across participants. Furthermore, to minimize the enhancement of the perceptual bias that might occur if two targets were presented, one at each end of the display (see, e.g., Franz et al., 2000), we presented only one target bar per trial.
To quantify changes in the illusory-size effect on the responses across time, we grouped the trials into “bins” of four unique and consecutive trials (see Fig. 1d). The trials in a given bin contained one of each of the four possible combinations of target length and target position (converging vs. diverging) on the Ponzo display. In this way, the responses on at least two trials would contribute to the sample stability of the effects of interest when we examined them on a bin-by-bin basis. Trial orders were constructed by randomly selecting ten of these four-trial bins from the population of the 24 possible four-trial combinations for each participant. In total, 40 grasping trials were bracketed by two blocks of 20 manual estimation trials to complete the ABA format.
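The binning scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `TRIAL_TYPES`, `ALL_BINS`, and `make_trial_order` are hypothetical, the two target lengths are taken from the analysis section, and the bins are drawn without replacement, which the text does not specify.

```python
import itertools
import random

# The four unique trial types: each target length at each display position.
TARGET_LENGTHS_MM = (48, 50)  # lengths named in the analysis section
POSITIONS = ("converging", "diverging")
TRIAL_TYPES = list(itertools.product(TARGET_LENGTHS_MM, POSITIONS))

# All 4! = 24 orderings of the four trial types (the population of bins).
ALL_BINS = list(itertools.permutations(TRIAL_TYPES))


def make_trial_order(n_bins, rng):
    """Return a flat trial list built from n_bins randomly chosen bins.

    Assumption: bins are sampled without replacement; the text does not say.
    """
    chosen = rng.sample(ALL_BINS, n_bins)
    return [trial for one_bin in chosen for trial in one_bin]


rng = random.Random(0)  # seeded only so the sketch is reproducible
grasp_order = make_trial_order(10, rng)  # 10 bins x 4 trials = 40 grasps
```

Because every bin contains each length-by-position combination exactly once, an illusion effect and a length effect can be computed within any single bin.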
Data processing and analysis
The data were analyzed offline with custom software written in MATLAB (Mathworks Inc., Natick, MA, USA). The data from the IREDs were low-pass filtered at 20 Hz using a second-order Butterworth digital filter. Grip aperture was computed as the Euclidean distance between the IRED placed on the thumb and the IRED placed on the index finger, and the instantaneous velocities were computed for each of the three IREDs and for grip aperture. The peak grip aperture was defined as the largest grip aperture within a velocity-based temporal search window designed to capture the forward-reach component of the movement. The manual estimate aperture was defined using a measure of grip stability (grip aperture velocity).
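As a rough illustration of the aperture measures (not the authors' MATLAB code), the Python sketch below computes grip aperture frame by frame and takes its peak inside a velocity-based window. The window rule used here (the first through the last frame bounding wrist speeds above a threshold) and the 50 mm/s threshold are assumptions, as are all function names; filtering is omitted.

```python
import math

SAMPLE_RATE_HZ = 200.0  # sampling rate reported in the text


def grip_aperture(thumb, index):
    """Euclidean thumb-to-index distance (mm) on every frame."""
    return [math.dist(t, i) for t, i in zip(thumb, index)]


def speed(points, rate_hz=SAMPLE_RATE_HZ):
    """Frame-to-frame speed (mm/s) of one marker."""
    return [math.dist(p, q) * rate_hz for p, q in zip(points, points[1:])]


def reach_window(wrist, threshold_mm_s=50.0):
    """Hypothetical search window bounded by wrist speed above a threshold."""
    moving = [k for k, v in enumerate(speed(wrist)) if v > threshold_mm_s]
    if not moving:
        raise ValueError("wrist never exceeded the speed threshold")
    # speed[k] spans frames k and k + 1, so the window ends at moving[-1] + 1
    return moving[0], moving[-1] + 1


def peak_grip_aperture(thumb, index, wrist):
    """Largest grip aperture inside the reach window (inclusive bounds)."""
    first, last = reach_window(wrist)
    return max(grip_aperture(thumb, index)[first:last + 1])


# Toy trial (NOT study data): the wrist moves on frames 1-4, then stops.
wrist_x = [0, 0, 5, 10, 15, 15, 15]
opening = [0, 10, 40, 70, 60, 50, 90]  # frame 6 lies outside the window
wrist = [(x, 0.0, 0.0) for x in wrist_x]
thumb = [(x, 0.0, 50.0) for x in wrist_x]
index = [(x, ap, 50.0) for x, ap in zip(wrist_x, opening)]
peak = peak_grip_aperture(thumb, index, wrist)
```

Restricting the search to the reach window keeps a late re-opening of the hand (frame 6 in the toy trial) from being mistaken for the peak.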
Sensitivity to the real difference in target length was determined by subtracting the mean dependent measures (peak grip aperture or manual estimate aperture) for responses to the 48-mm target from those for responses to the 50-mm target. The magnitude of bias induced by the Ponzo display was determined by subtracting the mean values for responses directed at targets positioned where the Ponzo display diverged (i.e., the illusory “short” condition) from those for responses directed at targets positioned where the Ponzo display converged (i.e., the illusory “long” condition), with positive values reflecting an illusion-based bias.
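The two difference scores defined above can be written out directly. The following Python sketch uses hypothetical helper names and made-up aperture values purely to show the direction of each subtraction.

```python
from statistics import mean


def condition_mean(trials, **conditions):
    """Mean aperture over the trials matching every given condition."""
    values = [t["aperture"] for t in trials
              if all(t[key] == value for key, value in conditions.items())]
    return mean(values)


def length_sensitivity(trials):
    """Mean response to the 50-mm target minus that to the 48-mm target."""
    return condition_mean(trials, length_mm=50) - condition_mean(trials, length_mm=48)


def illusion_effect(trials):
    """Converging ("long") minus diverging ("short"); positive = illusion bias."""
    return (condition_mean(trials, position="converging")
            - condition_mean(trials, position="diverging"))


# Made-up apertures (mm) for one four-trial bin, NOT study data.
trials = [
    {"length_mm": 48, "position": "converging", "aperture": 60.0},
    {"length_mm": 48, "position": "diverging", "aperture": 58.0},
    {"length_mm": 50, "position": "converging", "aperture": 63.0},
    {"length_mm": 50, "position": "diverging", "aperture": 61.0},
]
```

With these toy values, `length_sensitivity(trials)` is 3.0 mm and `illusion_effect(trials)` is 2.0 mm; because the design is balanced, each score averages over the levels of the other factor.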
The effects of target length and the magnitude of the bias induced by the Ponzo display were computed for each task for each participant using session-wise averages. We used paired t tests to determine (1) whether the responses, on the whole, were sensitive to an illusion-based bias and to the real difference in length between the target bars and (2) whether this sensitivity differed between the two tasks. We also examined these effects over a shorter temporal scope by computing the influence of the illusion on the first half (20 trials) and the second half (20 trials) of the block of grasping, to test for a simple change in the effect. We performed a similar test on the manual estimation trials. All statistical tests used an alpha criterion of .05 (two-tailed).
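For readers who want the arithmetic behind the paired t tests, a minimal Python sketch follows. The function name and the data are hypothetical; the p value is left out because the standard library has no t distribution, so it would come from a table or a statistics package.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for a paired t test on two equal-length samples.

    Passing a list of zeros as `second` tests `first` against zero.
    """
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Made-up per-participant illusion effects (mm), NOT study data.
estimation_effect = [2.0, 3.0, 4.0, 5.0]
grasp_effect = [1.0, 1.0, 3.0, 3.0]
t_value, df = paired_t(estimation_effect, grasp_effect)
```

The same function covers both uses described above: comparing the two tasks within participants, and testing a single task's effect against zero.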
As Fig. 2b indicates, the Ponzo display induced a significantly larger overall illusion-based bias in the manual estimations than it did in the grasps, t(34) = 2.6, p < .02. The overall bias on the manual estimations differed significantly from zero, t(34) = 4.94, p < 3 × 10⁻⁵. This was not the case, however, for the grasps, t(34) = 1.64, p = .11.
To quantify this reduction in the effect of the illusion, we modeled the data by fitting a “baseline” linear function, f(x) = bx + a, and a “test” exponential decay function, f(x) = a^(-bx), to the set of ten bins of group means. Both functions entail estimating two parameters, and the exponential function has the added advantage of being monotonic and possessing an asymptote. We used the coefficient of determination (R²) to identify which of the two functions provided a closer approximation to the data and performed an F test on the proportion of variance explained by the model. Note that neither model can be nested in the other to perform a formal F test to contrast the two. Thus, we adopted a straightforward diagnostic: The function that yielded the higher R² provided the better fit.
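The R²-based comparison can be illustrated with a short Python sketch. Several things here are assumptions rather than the authors' procedure: the decay is parameterized generically as a·exp(-b·x), it is fitted by regressing ln(y) on x (a log-linear shortcut that requires positive values, swapped in for a nonlinear least-squares fit), and the per-bin effects are invented.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares for y = b*x + a; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def exp_decay_fit(xs, ys):
    """Fit y = a*exp(-b*x) by regressing ln(y) on x (assumes all y > 0)."""
    intercept, slope = linear_fit(xs, [math.log(y) for y in ys])
    return math.exp(intercept), -slope


# Invented illusion effects (mm) for bins 1-10, NOT the study's data.
bins = list(range(1, 11))
effects = [4.0 * math.exp(-0.5 * x) * (1.1 if x % 2 else 0.9) for x in bins]

a_lin, b_lin = linear_fit(bins, effects)
r2_lin = r_squared(effects, [b_lin * x + a_lin for x in bins])

a_exp, b_exp = exp_decay_fit(bins, effects)
r2_exp = r_squared(effects, [a_exp * math.exp(-b_exp * x) for x in bins])

better = "exponential" if r2_exp > r2_lin else "linear"
```

Both R² values are computed on the original scale, so the two models are compared on the same footing even though the decay parameters were estimated in log space.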
The exponential decay function yielded a larger coefficient of determination (R² = .72, p < 2 × 10⁻³) than did the linear one (R² = .56, p < .02). For completeness, Fig. 3b also indicates the group mean illusion effects across successive bins for the manual estimations. Neither linear nor exponential fits captured significant amounts of variance for Bins 1–5 or Bins 6–10 (all p values > .12). Thus, the overall average adequately describes the variability in the illusory-size effect across the bins of manual estimations.
To take a closer look at the decrements in the effects of the illusion for the grasps, we parsed the effect in each bin into its composite, complementary conditions (illusory “short” and “long” configurations). We reasoned that a decrement in the influence of the illusion would be evidenced by an initial difference in peak grip aperture between the “short” and “long” configurations that would be minimized over the course of several trials as the peak grip apertures converged toward one another. Notably, the difference in peak grip apertures between these two conditions was independent of the difference in the real lengths of the targets.
As Fig. 3c shows, the peak grip aperture for grasps directed at “illusory long” targets remained relatively steady across the bins. Indeed, a linear fit of this data set failed to account for a significant proportion of variability (p = .62). Examination of the peak grip apertures for grasps directed at “illusory short” targets tells a different story (see Fig. 3c). Here, the peak grip aperture increased across successive bins toward a horizontal asymptote, approaching the overall average peak grip aperture for the complementary condition. We fitted a “baseline” linear function, f(x) = bx + a, and then a “test” rational function of the form f(x) = bx/(x + a) to this data set. We opted to use this rational function due to its simplicity. As with the linear and exponential functions, only two parameters are estimated for this function, and like the exponential decay function, it is monotonic and possesses an asymptote. We adopted the same diagnostic outlined above: The function that yielded the higher R² provided the better fit. The rational function provided a better fit (R² = .75, p < 2 × 10⁻³) than the linear one (R² = .58, p < .02).
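To make the rational function concrete, the Python sketch below fits f(x) = bx/(x + a) using the linearization 1/y = 1/b + (a/b)(1/x). This reciprocal-regression shortcut is an assumption chosen for brevity, not necessarily how the fit was obtained, and the aperture values are noise-free toy numbers.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = slope*x + intercept; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rational_fit(xs, ys):
    """Fit y = b*x / (x + a) by regressing 1/y on 1/x.

    The intercept estimates 1/b and the slope estimates a/b.
    """
    intercept, slope = linear_fit([1.0 / x for x in xs], [1.0 / y for y in ys])
    b = 1.0 / intercept
    a = slope * b
    return a, b


# Noise-free toy apertures (mm) for bins 1-10, NOT the study's data.
bins = list(range(1, 11))
apertures = [70.0 * x / (x + 0.1) for x in bins]
a_hat, b_hat = rational_fit(bins, apertures)
```

In this parameterization b is the horizontal asymptote (the aperture approached over many bins), and a controls how quickly the curve rises toward it.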
To determine whether or not sensory feedback (visual and haptic) could account for the reduction in the illusory-size bias, we asked an additional group of eight participants to perform a series of manual estimations. Critically, after each manual estimation was completed, the participants reached out to pick up the targets. The goggles remained open for 5 s following the release of the start button, allowing sufficient time to complete both tasks. If haptic feedback of target size could account for the reduction in the bias in some general way, then we should observe a similar reduction for the manual estimations. Despite the haptic feedback in this task, we still observed a significant effect of the illusion on the manual estimations, M = 2.2 mm, SD = 1.3 mm, t(7) = 4.79, p < 3 × 10⁻³, which did not differ significantly from that observed in the group that had received no haptic feedback, t(41) = 0.62, p = .54. Levene’s test detected no violation of homogeneity of variance between the two samples, F(1, 41) = 0.123, p = .73. In short, providing haptic feedback after each manual estimation did not affect the magnitude or variability of the illusion-driven bias.
The aims of this study were (1) to chart the time course of the illusory effects of a Ponzo display on visually guided grasping and (2) to determine whether or not the effects of such displays on grasps and manual estimations follow distinct trial-to-trial time courses. We found a significant effect of the illusion on manual estimations that remained relatively stable throughout testing. The grasps, in contrast, showed a significant effect of the Ponzo display initially, but this effect decayed exponentially over the series of iterative acts. These findings suggest different time courses for the effects of the Ponzo display on the two different response modes: Grasps showed a much more rapid decrement in the biasing effect of the Ponzo display than did manual estimations.
We also examined the peak grip aperture of the grasping movements across the series of grasping trials by breaking down the effect of the illusion into its constituent conditions. We reasoned that if the effect of the display on grip aperture was due to the illusion, then the peak grip apertures for the display configurations would differ initially, before converging over the course of several trials. In contrast to this prediction, we found that the peak grip aperture remained relatively constant when the target appeared at the converging edges of the Ponzo display (i.e., the illusory “long” condition). When the target was near the diverging edges (i.e., the illusory “short” condition), the peak grip aperture was smaller for the first few grasping trials and then quickly increased over the course of the subsequent trials to the levels observed for the grasps in the “long” condition.
These asymmetric effects of the Ponzo display on peak grip aperture suggest that the sensorimotor system may have treated the diverging edges as an obstacle. Programming a smaller grip aperture would be an effective way to minimize the probability of collision with obstacles that flank a goal object (Mon-Williams, Tresilian, Coppard, & Carson, 2001; Voudouris, Smeets, & Brenner, 2012). Indeed, these effects have been systematically explored using the Ebbinghaus display (e.g., Haffenden et al., 2001), highlighting the danger of conflating illusory and obstacle-based effects. Here, however, both the illusion-based and obstacle-based effects of the Ponzo display would operate in the same direction. Importantly, regardless of whether the observed effect on the grasps was illusion-based or obstacle-based, our findings strongly suggest that the sensorimotor system treated this bias as an error, minimizing it over the course of several iterative grasps.
Importantly, an error-minimizing account of action (e.g., Shadmehr, Smith, & Krakauer, 2010) can explain the rapid decrease in either an illusion-based or an obstacle-based effect. In the context of reaching and grasping, the sensorimotor system’s modus operandi is to specify parameters of the motor program that result in the hand’s smooth and efficient acquisition of the goal object. Error detection, minimization, and movement updating would be engaged particularly when the grasp aperture underestimates the length of the target bar, because the hand needs to open widely enough for the fingers to approach the opposing sides at angles that are suitable for smooth contact and a subsequent lifting phase.
Error minimization is a critical component in contemporary computational theories of motor control, movement updating, and motor learning. According to these theories, feedforward “inverse” models generate motor commands, whereas “forward” models accept copies of those commands to generate predictions about the sensory feedback from movement execution. A comparison of the expected and observed sensory consequences yields corrective error signals that are incorporated into the next volley of motor commands (Kawato, 1999). Movement updating can occur during the movement itself and/or after the movement is completed (Desmurget & Grafton, 2000; Kawato, 1999). In the present study, both visual and haptic sensory feedback were available, and either source of feedback could have been used offline to help fine-tune the parameterization of the inverse model’s program for more accurate motor commands. Notably, however, feedback and its role in online updating cannot explain our key findings, because online updating had the same opportunity from the outset to mitigate the effect of the display.
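A toy simulation can show why trial-by-trial error correction would produce a rapidly decaying bias. The Python sketch below is not a model fitted in this study: it assumes, purely for illustration, that a fixed fraction of the sensed aperture error is removed after every grasp, and the function name, starting bias, and learning rate are all hypothetical.

```python
def simulate_bias(initial_bias_mm, learning_rate, n_trials):
    """Bias on each trial when a fixed fraction of the error is corrected offline.

    Assumes the error sensed on a trial equals the current bias, and that
    the next motor program is adjusted by learning_rate times that error.
    """
    bias = initial_bias_mm
    history = []
    for _ in range(n_trials):
        history.append(bias)
        bias -= learning_rate * bias  # update applied before the next trial
    return history


# Hypothetical values: a 3-mm initial bias, half of the error corrected per trial.
bias_by_trial = simulate_bias(3.0, 0.5, 6)
```

Under this rule the bias on trial n is the initial bias multiplied by (1 − learning rate) raised to the power n, that is, a geometric decay of the kind an exponential function describes.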
In summary, our results show that although the perceptual estimates and grasp apertures are equally sensitive to real differences in target length on initial trials, only the perceptual estimates remain biased by the illusion over repeated measurements. The relative stability of the perceptual bias induced by the display, despite the fact that participants were allowed to grasp the targets after each estimate, suggests that the error minimization for grasps is refractory to conscious access. Taken together, our findings support the notion of a relatively encapsulated sensorimotor system (housed in fronto-parietal areas of the cerebral cortex) that is refractory to conscious access and serviced not only by visual feedback and feedforward information processed in the dorsal stream, but also by the haptic sources of feedback from pathways located in the anterior parietal cortex. Our findings also suggest that at least some of the dorsal stream’s overall immunity to the effects of pictorial displays can be explained by error minimization and movement-updating processes of this system which rapidly refine the programming of visually guided grasping.
The work that the authors put into this project was made possible by grants from the Natural Sciences and Engineering Research Council of Canada to M.A.G. and to J.T.E. We thank Stefanie Fortunado and Jessica Mikkila for their help collecting the preliminary data. Preliminary findings were presented at the 12th Annual International Meeting of the Vision Sciences Society (Whitwell et al., 2012).
- Coren, S., & Girgus, J. (1978). Seeing is deceiving: The psychology of visual illusions. Hillsdale, NJ: Erlbaum.
- Gonzalez, C. L. R., Ganel, T., Whitwell, R. L., Morrissey, B., & Goodale, M. A. (2008). Practice makes perfect, but only with the right hand: Sensitivity to perceptual illusions with awkward grasps decreases with practice in the right but not the left hand. Neuropsychologia, 46, 624–631. doi:10.1016/j.neuropsychologia.2007.09.006