An object’s perceived size arises from the interaction and integration of the angular size of the object (as projected on the retina) with numerous contextual cues. Most relevant to the current study, these cues include physical and perceived distance (Berryhill, Fendrich, & Olson, 2009; Boring, 1940; Emmert, 1881; Ponzo, 1911; von Bezold, 1884), and the relative size of different objects in a scene (Cormack, Coren, & Girgus, 1979; Kunnapas, 1955; Mruczek, Blair, Strother, & Caplovitz, 2017b; Roberts, Harris, & Yates, 2005; Rock & Ebenholtz, 1959). These contextual influences are revealed in classic visual illusions that demonstrate size contrast (e.g., Ebbinghaus illusions; see Fig. 1, top right) and size constancy (e.g., the corridor illusion; see Fig. 1, top middle) effects. In this paper, we investigate how dynamic stimulus components influence this cue integration process.

Fig. 1

The six unique conditions presented in Experiment 1. The experiment used a 2 × 3 factorial design with context (columns: isolated vs. Ebbinghaus vs. corridor) and motion condition (rows: static vs. dynamic) as factors. The white arrows indicate the direction of motion of a single target stimulus during dynamic trials and were not visible to the participant. Green fixation spots (white in the figure) were used to aid in tracking and fixation, but participants were instructed to freely view the displays for as long as they needed before indicating at which position (upper or lower) the central circle appeared larger

We recently described a novel illusory effect that we term dynamic illusory size contrast, or the DISC effect, which highlights the role of dynamic visual information in modulating the contribution of different cues to perceived size (Mruczek, Blair, & Caplovitz, 2014; Mruczek, Blair, Strother, & Caplovitz, 2017a). In the DISC effect, the viewer perceives a target object to be dramatically shrinking when (1) it is surrounded by an expanding context and (2) there are additional dynamic cues such as eye movements or target motion. Importantly, the expanding context is necessary but not sufficient by itself to induce an illusory percept. The DISC effect is perhaps best illustrated by the dynamic Ebbinghaus illusion (Mruczek, Blair, Strother, & Caplovitz, 2015). In the dynamic Ebbinghaus illusion, the combination of expanding inducers and target motion yields an illusion that is almost twice as strong as the classic, static Ebbinghaus illusion. However, the expanding inducers alone, in the absence of additional eye movements or target motion, yield an illusion that is only half as strong as the static Ebbinghaus illusion. Thus, the DISC effect represents more than the classic size-contrast illusion played out dynamically over time. Rather, the DISC effect depends critically on the interaction between a size-contrast effect and motion in the retinal image, leading to changes in the relative contribution of different cues (e.g., angular size and relative size) to perceived size.

Considering these empirical observations, we have proposed the precision hypothesis (previously referred to as the uncertainty hypothesis; Mruczek et al., 2014; Mruczek et al., 2015, 2017a). This hypothesis states that the precision of the representation of an object’s angular size will influence how it will interact with representations of the surrounding context. In other words, representational precision of angular size will determine the degree to which contextual effects alter perceived size. In the DISC effect, we hypothesize that the dynamic nature of the target object impairs the brain’s ability to precisely represent the angular size of that object. Hence, other visual cues make stronger contributions to the target’s perceived size. This model is consistent with Bayesian models of cue integration (Angelaki, Gu, & Deangelis, 2011; Kersten, Mamassian, & Yuille, 2004; Knill & Pouget, 2004), in which the weight assigned to each cue during the integration process at the behavioral level is proportional to its reliability.

Here, we explore the effects of motion dynamics on a different class of size illusions. In size constancy illusions, such as the corridor or Ponzo illusions, two stimuli that have the same angular size but appear to be at different distances are perceived to have different physical sizes. We created a dynamic version of the corridor illusion, in which a single target translates along the corridor background. We directly compared the effects of motion dynamics on the Ebbinghaus and corridor illusions using similar manipulations and across different viewing conditions. To summarize our main result, we replicated our findings for the Ebbinghaus illusion: dynamic target motion led to stronger illusion magnitudes. In contrast, dynamic target motion led to weaker illusion magnitudes for the corridor illusion. Moreover, individuals for whom stimulus dynamics led to the greatest increase in the magnitude of the Ebbinghaus illusion tended to have the greatest decrease in the magnitude of the corridor illusion. Our results highlight the diverse effects that motion dynamics can have on static size illusions, raise new hypotheses regarding the mechanisms supporting these effects across different classes of illusions, and place constraints on current neural models of size perception.

Experiment 1

The goal of Experiment 1 was to make a direct comparison between the effects of image dynamics on the Ebbinghaus and corridor illusions in the same set of participants using matched stimulus parameters.

Method

Participants

Twenty participants (including two experimenters) completed Experiment 1. To compute the expected statistical power of the experiments reported in this paper, we used G*Power (Version 3.1.9.4; Faul, Erdfelder, Lang, & Buchner, 2007). Based on our previous studies, we anticipated an effect size of approximately d = 1.01 (based on the comparison of the dynamic and static Ebbinghaus illusions in Mruczek et al., 2015). For two-tailed tests at α = .05 and a related-samples design with a sample size of 20, this yields an expected power of .99 in Experiment 1. The observed effect sizes for the complementary comparisons in Experiment 1 were at or above the anticipated effect size (see Experiment 1: Results and Discussion section).

With the exception of the two experimenters, all participants were student volunteers from Worcester State University who participated in exchange for course credit. Prior to participating, each observer provided informed written consent. All participants reported normal or corrected-to-normal vision, and all participants except the authors were naïve to the specific aims and designs of the experiments. All procedures were approved by the Institutional Review Board of Worcester State University.

Apparatus and display

Stimuli were generated and presented using the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997) for MATLAB (The MathWorks Inc., Natick, MA). The experimental setup used a Dell UltraSharp 1908FP monitor (19 in, 1,280 × 1,024-pixel resolution, 60-Hz refresh rate) driven by a Mac mini computer (2.6 GHz, 8 GB of DDR3 SDRAM) with an Intel Iris graphics processor (1,536 MB). Participants viewed the stimuli binocularly from ~75 cm.

Design and procedure

Experiment 1 used a 2 × 3 factorial design, with context (isolated vs. Ebbinghaus vs. corridor) and motion condition (static vs. dynamic) as factors. Thus, there were six unique conditions: static isolated, dynamic isolated, static Ebbinghaus, dynamic Ebbinghaus, static corridor, and dynamic corridor (see Fig. 1). The inclusion of the isolated conditions allowed us to quantify any response bias or other factors that may influence the perceived size of the target, in the absence of any surrounding context. We first describe the difference between the static and dynamic conditions (for the isolated context; see Fig. 1, left column), and then describe the different contexts.

On static trials (see Fig. 1, top row), two circles were presented, one in the upper right quadrant and one in the lower right quadrant of the monitor. The position of the upper circle was about 4.5° above and 4.65° right of the center of the screen, and the position of the lower circle was about 3.0° below and 1.9° right of the center of the screen. The exact position of the stimuli was jittered on a trial-by-trial basis by up to 0.2° in a random direction to add some uncertainty of the stimulus position with respect to the borders of the monitor. The two circles were 8° apart along a direction 70° from vertical. The upper circle was deemed the “standard” and was always 1.5° ± 0.03° in diameter. The lower circle was deemed the “target” and its size on each trial was controlled by an adaptive staircase procedure (see below). A small green dot (0.1° in diameter), to aid in fixation, was superimposed on both the standard and the target. The participants’ task was to indicate with a keyboard press which of the two circles appeared to be larger (“O” for the upper circle, or “M” for the lower circle). The static display remained until the participant made a response; there was no time limit on any trial.

The stimulus parameters and task on dynamic trials were chosen to match and complement those of static trials. On dynamic trials (see Fig. 1, bottom row), a single circle was presented and dynamically translated throughout the trial. A small green dot (0.1° in diameter), to aid in fixation, was superimposed on the circle. The circle always started in the upper right quadrant and translated at a constant speed (8°/s) between its starting position and a point in the lower right quadrant that was 8° away along a direction 70° from vertical. Thus, the end points of the translating target in the dynamic conditions matched the position of the standard and target circles in the static conditions. In addition to translating, the circle also smoothly changed size between its starting and end point positions. In the upper position, the circle was deemed the “standard” and was always 1.5° ± 0.03° in diameter. In the lower position, at the end point of its translation vector, the circle was deemed the “target” and its size on each trial was controlled by an adaptive staircase procedure (see below). The size of the circle was changed smoothly from the standard position to the target position as the circle translated. As in static trials, the participants’ task was to indicate with a keyboard press where the circle appeared to be larger (“O” for the upper position, or “M” for the lower position). The circle continued to translate and change size back and forth between the two positions until the participant made a response.

Three different contexts were used for both static and dynamic trials. In the isolated context (described above), there were no additional stimulus elements. Next, we describe the additional contextual elements that were present on corridor (Fig. 1, middle column) and Ebbinghaus (Fig. 1, right column) trials. As on isolated trials, the stimulus display was present until the participant made a response.

For the static corridor condition, the standard and target circles were superimposed on a background image of a corridor. The background provided linear perspective and shading cues consistent with the upper circle being positioned farther away from, and the lower circle being positioned closer to, the participant. If the standard and target were the exact same physical sizes, this corridor configuration would be expected to lead to the upper standard circle being perceived as larger than the lower target circle (i.e., the classic corridor illusion). As on isolated trials, the positions of the target circles were slightly jittered; however, the position of the background image was not. Thus, the positions of the circles were slightly shifted relative to the corridor background on each trial. Participants indicated which circle, upper or lower, appeared larger.

For the dynamic corridor condition, a single circle appeared superimposed on the same background image of a corridor as in the static corridor condition and followed the same trajectory as described in the isolated dynamic condition above. The translating circle appeared to traverse the corridor from back to front, and then reverse directions from front to back. Participants indicated whether the circle appeared larger when it was in the upper or lower position.

For the static Ebbinghaus condition, both the standard and target circles were surrounded by six equally-spaced circles or “inducers.” To match the perceptual effects of the corridor illusion, the standard, which was always positioned in the upper position, was surrounded by small-and-close inducers (0.53° diameter, 1.13° eccentricity) and the target, which was always positioned in the lower position, was surrounded by large-and-far inducers (3.19° diameter, 3.75° eccentricity). If the standard and target were the exact same physical sizes, this Ebbinghaus configuration would be expected to lead to the upper standard circle being perceived as larger than the lower target circle (as is the case for the corridor configuration described above). As on isolated trials, participants indicated which circle, upper or lower, appeared larger.

For the dynamic Ebbinghaus condition, the circle was surrounded by six equally spaced circles or inducers. The size of the inducers changed smoothly during the translation of the stimulus, from small and close at the upper starting position to large and far at the lower position. As on isolated trials, participants indicated at which position in its translation, upper or lower, the circle appeared to be larger.

For each condition, the participants made perceptual judgments regarding the perceived size of two circles (static trials) or of one dynamically changing circle (dynamic trials). As stated above, the size of the standard circle (i.e., the upper circle on static trials or the starting position of the circle on dynamic trials) on all trials was 1.5° ± 0.03° in diameter. In contrast, the size of the target circle (i.e., the lower circle on static trials or the lower position of the circle on dynamic trials) on each trial was controlled by an adaptive staircase procedure and depended on the participant’s previous response pattern. The target size was selected from one of 29 equally spaced values from 0.32° to 4.20° (i.e., from 78.4% smaller than to 180% larger than the standard). The upper and lower limits of this range were selected to maximize the range of target sizes while ensuring that the target was never smaller than the green fixation point and never large enough to touch the large-and-far inducers. For each of the six unique conditions, there were four pseudorandomly interleaved staircases, with half starting from the minimum target size and half starting from the maximum target size. Each staircase continued for 15 trials. For each staircase, independently, the target size on subsequent trials was adjusted based on the participant’s response on the previous trial of that staircase. If the participant indicated that the target circle was smaller than the standard, then the next trial in that staircase sequence would use a target with a larger size relative to the standard. During the first five trials of each staircase, the target size was shifted by five steps along the 29 possible values. During the middle five trials, the target size was shifted by two steps, and for the final five trials, the target size was shifted by one step. This method allowed us to sample the majority of trials near the perceptual point of equality for the standard and target, with relatively fewer samples taken for extreme differences.
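The staircase rule described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the code used in the experiment (which ran in MATLAB). The names `LEVELS`, `step_size`, `run_staircase`, and `pct_change` are hypothetical, and three details are assumptions that the text does not spell out: a "larger" response shrinks the next target, the step size is set by the trial just completed, and the level index is clamped at the ends of the range.

```python
N_LEVELS = 29                    # number of candidate target sizes
MIN_DEG, MAX_DEG = 0.32, 4.20    # smallest and largest target diameter (deg)
STANDARD_DEG = 1.5               # nominal diameter of the standard (deg)

# 29 equally spaced target diameters between the two limits.
LEVELS = [MIN_DEG + i * (MAX_DEG - MIN_DEG) / (N_LEVELS - 1)
          for i in range(N_LEVELS)]


def pct_change(size_deg):
    """Target size expressed as percentage change relative to the standard."""
    return 100.0 * (size_deg - STANDARD_DEG) / STANDARD_DEG


def step_size(trial_index):
    """Step (in levels) for a 0-based trial index within a 15-trial staircase."""
    if trial_index < 5:
        return 5
    if trial_index < 10:
        return 2
    return 1


def run_staircase(start_index, respond, n_trials=15):
    """Run one staircase.

    respond(size_deg) must return True when the target is judged larger
    than the standard. Returns a list of (size_deg, judged_larger) pairs.
    """
    index, history = start_index, []
    for t in range(n_trials):
        size = LEVELS[index]
        judged_larger = respond(size)
        history.append((size, judged_larger))
        # Assumed symmetric rule: "larger" -> smaller target next, and vice versa.
        delta = -step_size(t) if judged_larger else step_size(t)
        # Assumption: keep the index inside the 29 available levels.
        index = max(0, min(N_LEVELS - 1, index + delta))
    return history


# Noise-free hypothetical observer whose point of equality sits at 2.0 deg.
history = run_staircase(0, lambda size: size > 2.0)
```

With a noise-free observer the sampled sizes close in on the assumed point of equality as the step shrinks from five levels to one, which is the sampling behavior the paragraph describes.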

Each participant in Experiment 1 completed two sessions on different days. One participant’s data yielded extremely poor psychometric curve fits for one of the two sessions, and we limited our analysis to a single session’s data for this participant. This participant completed 60 trials for each condition, for a grand total of 360 trials (1 session × 6 conditions × 4 interleaved staircases/condition × 15 trials/staircase). The other 19 participants completed 120 trials for each condition, for a grand total of 720 trials (2 sessions × 6 conditions × 4 interleaved staircases/condition × 15 trials/staircase). During each session, participants were given self-paced breaks every 36 trials (i.e., every 10% of the session total).

Data analysis

For all experiments, we analyzed the data using standard parametric statistical tests, although we note that nonparametric alternatives (i.e., permutation tests) yielded the same basic pattern of results and did not change the interpretation. For all statistical analyses, we used an alpha of 0.05. Given the within-subjects design of our experiments and our a priori focus on comparisons across conditions, we report and plot 95% confidence intervals (CIs) with between-subjects variance removed (Cousineau, 2005; Morey, 2008).
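As a rough illustration of what removing between-subjects variance involves, the following Python sketch normalizes each participant's scores (Cousineau, 2005) and applies the correction factor of Morey (2008). It is a sketch under stated assumptions rather than the analysis code used here: the function name `within_subject_ci` is hypothetical, and the critical t value is passed in by the caller (for 20 participants, the two-tailed 95% value for 19 degrees of freedom is approximately 2.093).

```python
import math


def within_subject_ci(data, t_crit):
    """Within-subject confidence intervals.

    data: list of rows, one row per participant, one value per condition.
    t_crit: critical t value for n - 1 degrees of freedom (caller-supplied).
    Returns a list of (mean, half_width) tuples, one per condition.
    """
    n = len(data)        # participants
    k = len(data[0])     # within-subject conditions
    grand_mean = sum(sum(row) for row in data) / (n * k)

    # Cousineau normalization: subtract each participant's own mean and
    # add back the grand mean, which removes between-subjects variance.
    normalized = [[x - sum(row) / k + grand_mean for x in row] for row in data]

    # Morey correction: rescale the SD by sqrt(k / (k - 1)).
    correction = math.sqrt(k / (k - 1))

    results = []
    for j in range(k):
        column = [row[j] for row in normalized]
        mean = sum(column) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
        results.append((mean, t_crit * correction * sd / math.sqrt(n)))
    return results
```

When participants differ only by a constant offset across conditions, the normalized scores are identical across participants and the interval collapses to zero width, which shows that only condition-by-participant variability contributes to these intervals.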

Data from all trials of a given condition, independent of the staircase procedure, were combined to estimate psychometric curves (Leek, 2001; Leek, Hanna, & Marshall, 1992) describing the relationship between the physical size of the circle in the target position and the participant’s perception of the target circle’s size relative to the standard circle. This was plotted as the proportion of trials in which the participant reported that the target circle was larger than the standard as a function of the actual target circle size (see Fig. 2 for an example). Target sizes (and the resulting point of subjective equality [PSE] and illusion magnitudes) are reported as the percentage change (%∆) relative to the standard, with zero representing targets that were physically identical in size to the standard. Negative target sizes indicate that the target circle was smaller than the standard; positive target sizes indicate that the target circle was larger than the standard. Given the two-alternative forced-choice paradigm of both experiments, we used the following sigmoidal binomial-logit function to model the data (Wichmann & Hill, 2001) with the MATLAB glmfit command, independently for each condition and participant:

$$ f(x)=\frac{{\mathrm{e}}^{b_1+b_2x}}{1+{\mathrm{e}}^{b_1+b_2x}}. $$
Fig. 2

Computation of illusion magnitudes from psychometric curves. a Example of the psychometric curve fitting for the static Ebbinghaus condition for one representative participant. The proportion of trials in which the participant reported that the target circle was larger is plotted against the actual size of the target relative to the standard. Zero percentage change (%∆) indicates that the target was the same size as the standard; positive values indicate that the target was larger than the standard. The raw data (black circles) were modeled with a sigmoidal binomial-logit function (gray line). The black arrow indicates the point of subjective equality (PSE), defined as the actual target size at which the fitted curve crossed 0.5 (horizontal dashed gray line). The PSE (35.0) represents the target size at which the participant had an equal probability of perceiving the target as larger or smaller than the standard for a given condition. If no illusory percept was observed, we would expect a PSE of zero (vertical dashed gray line). Error bars represent SEM. b Psychometric curves for all six conditions for the same representative participant. Solid curves represent static conditions and dashed curves represent dynamic conditions. c Bar chart of the PSEs for all six conditions for the representative participant. We calculated illusion magnitudes for the corridor and Ebbinghaus conditions by subtracting the PSE of the corresponding static or dynamic isolated condition. Double arrowed lines represent the four illusion magnitudes of interest for Experiment 1: static Ebbinghaus, static corridor, dynamic Ebbinghaus, and dynamic corridor (from left to right)

When fitting psychometric curves to the behavioral data, we excluded data from the most extreme values of the target size (i.e., the starting points for the interleaved staircases), as errors on these trials can strongly impact the psychometric curve fits (because it is very unlikely that the staircase procedure would return to these extreme values). However, we note that for the present data sets, the exclusion of these extreme data points did not change the pattern or interpretation of the results.

The PSE was determined by interpolating the chance-level response probability (0.5) from the function fit to the data (PSE = −b1/b2). For a given condition, the PSE represents the size of the target circle at which the participant had an equal probability of perceiving the target to be larger or smaller than the standard. For the static conditions, the PSE represents the size of the target circle such that the participant perceived the target and standard circles to be the same size. For the dynamic conditions, the PSE represents the size of the circle in the lower target position (at the lower extreme of the animation cycle, which defines the growth rate of the circle), such that the participant perceived a minimal change in the size of the circle over the entire animation cycle. Under conditions in which there is no illusory percept, the PSE is expected to be zero. A PSE greater than zero indicates that the target had to be physically larger than the standard for participants to perceive them to be the same size. In other words, a PSE greater than zero indicates that the contextual components of the display caused the target to appear smaller than it was (relative to the standard).
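To make the fitting step concrete, here is a minimal Python sketch of a two-parameter binomial-logit fit and the PSE computation. The fits reported here were obtained with MATLAB's glmfit; the sketch substitutes a hand-written Newton's method for the same maximum-likelihood problem, so it is a stand-in rather than the original analysis. The function names (`fit_logistic`, `pse`, `max_slope`) and the example data are hypothetical.

```python
import math


def fit_logistic(x, y, n_iter=50):
    """Maximum-likelihood fit of p(x) = 1 / (1 + exp(-(b1 + b2 * x))).

    x: target sizes (percentage change from the standard), one per trial.
    y: 1 if the target was reported as larger, otherwise 0.
    Uses Newton's method on the two-parameter log-likelihood.
    """
    b1, b2 = 0.0, 0.0
    for _ in range(n_iter):
        g1 = g2 = h11 = h12 = h22 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b1 + b2 * xi)))
            w = p * (1.0 - p)
            g1 += yi - p             # gradient with respect to b1
            g2 += (yi - p) * xi      # gradient with respect to b2
            h11 += w                 # entries of the information matrix
            h12 += w * xi
            h22 += w * xi * xi
        det = h11 * h22 - h12 * h12
        d1 = (h22 * g1 - h12 * g2) / det
        d2 = (h11 * g2 - h12 * g1) / det
        b1, b2 = b1 + d1, b2 + d2
        if abs(d1) + abs(d2) < 1e-10:
            break
    return b1, b2


def pse(b1, b2):
    """Target size at which the fitted curve crosses 0.5."""
    return -b1 / b2


def max_slope(b1, b2):
    """Slope of the fitted curve at its inflection point (per unit of x)."""
    return b2 / 4.0


# Made-up data: 10 trials at each of five target sizes, with the number
# of "target larger" responses rising symmetrically around 30 %-change.
levels = [10, 20, 30, 40, 50]
n_larger = [1, 3, 5, 7, 9]
x, y = [], []
for level, k in zip(levels, n_larger):
    x += [level] * 10
    y += [1] * k + [0] * (10 - k)

b1, b2 = fit_logistic(x, y)
print(pse(b1, b2), max_slope(b1, b2))
```

Because the made-up responses are symmetric about a 30% size change, the fitted PSE should land at that value. Real data sets that are perfectly separable (no overlap between "larger" and "smaller" responses) would need additional safeguards that this sketch omits.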

To account for potential response biases and to isolate the effects of the contextual cues (i.e., inducers for the Ebbinghaus illusion and background for the corridor illusion), we computed an illusion magnitude for each of the Ebbinghaus and corridor conditions by subtracting the PSE of the corresponding isolated condition. Indeed, PSEs for the isolated conditions were significantly greater than zero, indicating that factors other than the contextual cues of interest did influence perceived size in our task (see also Mruczek et al., 2015). Because we subtract out the PSE from the isolated conditions, these biases do not contribute to the final illusion magnitudes reported below. As with the PSEs, illusion magnitudes greater than zero indicate that the target had to be physically larger than the standard for participants to perceive them to be the same size.
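The subtraction can be written out explicitly. In the Python sketch below, the function name, the dictionary layout, and all PSE values are hypothetical and serve only to show which isolated condition is paired with which illusion condition.

```python
def illusion_magnitudes(pse_by_condition):
    """Subtract the matching isolated-condition PSE from each illusion PSE.

    pse_by_condition: dict keyed by (motion, context), where motion is
    "static" or "dynamic" and context is "isolated", "ebbinghaus", or
    "corridor". Values are PSEs in percentage change from the standard.
    """
    return {
        (motion, context): pse_by_condition[(motion, context)]
        - pse_by_condition[(motion, "isolated")]
        for motion in ("static", "dynamic")
        for context in ("ebbinghaus", "corridor")
    }


# Made-up PSEs for one hypothetical participant.
example_pse = {
    ("static", "isolated"): 5.0,
    ("static", "ebbinghaus"): 35.0,
    ("static", "corridor"): 28.0,
    ("dynamic", "isolated"): 8.0,
    ("dynamic", "ebbinghaus"): 70.0,
    ("dynamic", "corridor"): 20.0,
}
magnitudes = illusion_magnitudes(example_pse)
```

Static conditions are always referenced to the static isolated PSE and dynamic conditions to the dynamic isolated PSE, so any bias specific to one motion condition is removed from the corresponding illusion magnitudes.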

To determine whether an illusory effect was observed for each condition, illusion magnitudes were compared against zero using one-sample t tests. At the group level, normalized illusion magnitudes were compared across the Ebbinghaus and corridor conditions of interest using a repeated-measures ANOVA. Post hoc tests were completed using the Bonferroni correction (α = 0.0083).
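The corrected alpha levels follow from dividing the familywise alpha by the number of comparisons. The short Python check below uses a hypothetical helper, `bonferroni_alpha`, and assumes that the six post hoc tests are all pairwise comparisons among the four illusion conditions.

```python
import math


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha under the Bonferroni correction."""
    return alpha / n_comparisons


# Four illusion conditions give comb(4, 2) = 6 pairwise comparisons.
n_pairs = math.comb(4, 2)
alpha_illusion = bonferroni_alpha(0.05, n_pairs)

# Three a priori dynamic-versus-static comparisons are used for the slopes.
alpha_slopes = bonferroni_alpha(0.05, 3)
```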

In addition to PSEs, we also extracted the maximum slope of the psychometric curve (i.e., slope at the inflection point). The slope of the psychometric curve provides a quantitative measure of the participant’s certainty in their behavioral response. At the group level, slopes were compared across all conditions using a repeated-measures ANOVA. Post hoc comparisons were made for a set of three a priori pairs based on our hypothesis (dynamic vs. static conditions for each of the three contexts) using the Bonferroni correction (α = 0.017). Violations of sphericity, as indicated by Mauchly’s test, were corrected using the Greenhouse–Geisser correction.
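For reference, the maximum slope of the binomial-logit function can be derived directly from the fitted parameters. The derivation below is standard calculus; the scaling of the slope values reported in the Results may differ from this per-unit form, depending on the units in which target size and response proportion were expressed.

$$ f^{\prime }(x)={b}_2\,f(x)\left(1-f(x)\right) $$

The product $f(x)\left(1-f(x)\right)$ is largest when $f(x)=0.5$, which occurs at the inflection point $x=-{b}_1/{b}_2$ (i.e., at the PSE). The maximum slope is therefore

$$ f^{\prime}\left(-{b}_1/{b}_2\right)={b}_2/4. $$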

To quantify the reliability of the illusion magnitude and slope estimates, we computed each of these metrics separately for the two sessions for each participant. We calculated intersession reliability as the Pearson correlation of values from the first session and second session across participants, independently for each condition.

Additionally, we performed a series of individual differences analyses. For these analyses, we computed the difference in illusion magnitude and slope, separately, between the dynamic and static conditions of the same context (e.g., dynamic Ebbinghaus illusion magnitude − static Ebbinghaus illusion magnitude). We then compared these metrics across participants. First, we calculated the Pearson correlation between the illusion magnitude changes for the two illusion types (corridor and Ebbinghaus) across participants. Second, we calculated the Pearson correlation between the slope change for the no-context (isolated) condition and the illusion magnitude change for each of the two illusion types, separately.
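A minimal Python sketch of the difference-score correlation is given below. The function names are hypothetical, and the statistical software actually used for these correlations is not stated in the text.

```python
import math


def dynamic_minus_static(dynamic, static):
    """Per-participant difference scores (dynamic minus static)."""
    return [d - s for d, s in zip(dynamic, static)]


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)
```

For the first analysis, `pearson_r` would be applied to the Ebbinghaus and corridor difference scores, with one value per participant in each list. A negative coefficient then indicates that participants with larger increases for one illusion tend to show larger decreases for the other.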

Results and discussion

Psychometric curves and associated metrics

Figure 2a shows data from one representative participant for one condition (static Ebbinghaus). This is plotted as the proportion of trials in which the participant reported that the target (lower position circle) appeared to be larger than the standard (upper position circle) as a function of the actual size of the target in units of percentage change (%∆) from the standard. The PSE was derived from the corresponding psychometric function fit to the data. In the example shown, the PSE was 35.0, indicating that the target had to be larger than the standard to be perceived to be the same size as the standard for this condition. Figure 2b shows the psychometric curves and PSEs for all six unique conditions of Experiment 1 for the same representative participant. As schematically depicted in Figure 2c, we subtracted the PSE of the isolated conditions from the corresponding Ebbinghaus and corridor conditions (e.g., static Ebbinghaus − static isolated). The resulting illusion magnitudes quantify the impact of the contextual cues of the inducers or the corridor image, independent of other factors such as the target’s vertical position or eccentricity.

Group analysis: Illusion magnitudes

Figure 3 shows the illusion magnitudes for the static and dynamic versions of the Ebbinghaus and corridor illusions averaged over all participants. Illusion magnitudes for the static corridor, t(19) = 7.70, p < .001, d = 1.72; dynamic corridor, t(19) = 5.34, p < .001, d = 1.19; static Ebbinghaus, t(19) = 9.46, p < .001, d = 2.12; and dynamic Ebbinghaus, t(19) = 12.16, p < .001, d = 2.72, were all significantly greater than zero (Ms and 95% CIs for each condition are listed below). Thus, all four conditions led to a consistent illusory effect.
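The reported effect sizes are numerically consistent with the common one-sample relation d = t/√n, with n = 20. The Python check below illustrates that relation; it assumes, but the text does not state, that this is how d was computed, and the function name is hypothetical.

```python
import math


def cohens_d_from_t(t, n):
    """Cohen's d for a one-sample (or paired) t test: d = t / sqrt(n)."""
    return t / math.sqrt(n)


# (t statistic, reported d) for the four one-sample tests against zero.
reported = {
    "static corridor": (7.70, 1.72),
    "dynamic corridor": (5.34, 1.19),
    "static Ebbinghaus": (9.46, 2.12),
    "dynamic Ebbinghaus": (12.16, 2.72),
}
recomputed = {name: round(cohens_d_from_t(t, 20), 2)
              for name, (t, _) in reported.items()}
```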

Fig. 3

Illusion magnitudes from the main conditions of interest of Experiment 1. Illusion magnitudes for all four conditions were significantly greater than zero. In the figure, we display the p values for the comparisons that were most relevant to our hypothesis. However, all pairwise comparisons were significantly different (Bonferroni corrected α = 0.0083). Error bars represent 95% CI with between-subjects variance removed

A repeated-measures ANOVA revealed a significant main effect of illusion type, F(1, 19) = 117.16, p < .001, ηp² = .86. On average, illusion magnitudes for the Ebbinghaus illusion (M = 49.5, 95% CI [43.5, 55.6]) were significantly greater than for the corridor illusion (M = 18.2, 95% CI [12.1, 24.2]). Additionally, there was a significant main effect of motion context, F(1, 19) = 44.91, p < .001, ηp² = .70. On average, illusion magnitudes for the dynamic conditions (M = 39.7, 95% CI [36.0, 43.4]) were significantly greater than for the static conditions (M = 28.0, 95% CI [24.3, 31.6]). However, both main effects were largely driven by a significant interaction between illusion type and motion context, F(1, 19) = 71.11, p < .001, ηp² = .79. Post hoc analyses revealed significant differences for all pairwise comparisons (p < .0082, Bonferroni corrected α = 0.0083). Replicating our previous results (Mruczek et al., 2015), illusion magnitudes for the dynamic Ebbinghaus (M = 67.4, 95% CI [59.2, 75.6]) were significantly larger (i.e., stronger illusion) than those for the static Ebbinghaus (M = 31.6, 95% CI [26.5, 36.8]), t(19) = −9.00, p < .001, d = −2.01. In contrast, illusion magnitudes for the dynamic corridor (M = 12.0, 95% CI [5.0, 19.0]) were significantly smaller (i.e., weaker illusion) than those for the static corridor (M = 24.3, 95% CI [21.2, 27.4]), t(19) = 4.79, p = .0001, d = 1.07.

Group analysis: Slopes

We have previously proposed (Mruczek et al., 2014; Mruczek et al., 2015) that the dynamic motion of the target results in a less precise representation of that target. To support this contention in the context of the current experiment, we extracted a quantitative behavioral metric of precision—namely, the maximum slope of the psychometric curves (i.e., the slope at the inflection point). Steep slopes indicate more consistent decision boundaries (i.e., closer to a step function), and thus a more precise representation of the target.

We compared slopes across all six conditions using a repeated-measures ANOVA (see Fig. 4). This analysis revealed a significant main effect of context, F(1.35, 25.64) = 29.26, p < .001, ηp² = .61, a significant main effect of motion, F(1, 19) = 20.52, p < .001, ηp² = .52, and a significant interaction between context and motion, F(1.35, 25.64) = 29.26, p < .001, ηp² = .61. The most important result is the main effect of motion. Specifically, slopes were significantly lower for the dynamic (M = 4.05, 95% CI [−0.54, 8.63]) compared with the static (M = 13.97, 95% CI [9.39, 18.56]) conditions. The same direction of effect was observed for each pairwise comparison of dynamic and static conditions across context (see Fig. 4). Slopes were significantly lower for the dynamic no-context (M = 5.82, 95% CI [2.89, 8.74]) compared with the static no-context (M = 23.63, 95% CI [17.19, 30.07]) condition, t(19) = 5.17, p = .0001, d = 1.16, and for the dynamic corridor (M = 4.03, 95% CI [1.17, 6.88]) compared with the static corridor (M = 10.90, 95% CI [7.05, 14.76]), t(19) = 2.82, p = .011, d = 0.63. The difference in slopes between the dynamic Ebbinghaus (M = 2.30, 95% CI [−1.42, 6.02]) and static Ebbinghaus (M = 7.39, 95% CI [4.81, 9.96]) conditions was marginally significant, in that it did not survive the Bonferroni correction, t(19) = 2.35, p = .0296, d = 0.53 (Bonferroni corrected α = 0.017 for three a priori pairwise comparisons). Overall, this pattern of results provides behavioral support for our contention that the representational precision of the target was reduced under the dynamic conditions.

Fig. 4

Slopes from the six conditions of Experiment 1. For each context, the dynamic illusion led to shallower slopes compared with the static illusion, indicating that participants’ responses were less consistent around the PSE on dynamic trials (Bonferroni corrected α = 0.017 for three a priori pairwise comparisons). Error bars represent 95% CI with between-subjects variance removed

Individual differences analyses

In addition to the group-level results presented above, we analyzed the pattern of illusion magnitudes and slopes observed across participants. Consistent with previous reports (Grzeczkowski, Clarke, Francis, Mast, & Herzog, 2017; but see Schwarzkopf, Song, & Rees, 2011; Song, Schwarzkopf, & Rees, 2011), we observed a significant positive correlation between illusion magnitudes for the static Ebbinghaus and static corridor illusions, r(18) = .71, p < .001, R² = .50. However, the same did not hold for a comparison across dynamic versions of the Ebbinghaus and corridor illusions, r(18) = .36, p = .12.

Our main intention for the analysis of individual differences was to explore how the addition of motion dynamics altered illusion magnitudes across participants. If the effects of motion dynamics on the corridor and Ebbinghaus illusion reflect a common underlying mechanism, then we would expect that participants who showed the biggest decrease in illusion magnitude for the dynamic corridor (relative to the static corridor) would also show the biggest increase in illusion magnitude for the dynamic Ebbinghaus (relative to the static Ebbinghaus). To test this hypothesis, we computed the difference in illusion magnitudes between the dynamic and static conditions for both illusion types, separately for each participant. There was a significant inverse correlation between this dynamic–static illusion magnitude difference for the corridor and Ebbinghaus illusions, r(18) = −.50, p = .026, R² = .25 (see Fig. 5a). Specifically, participants for whom motion dynamics greatly increased the magnitude of the Ebbinghaus illusion were also the participants for whom motion dynamics greatly decreased the magnitude of the corridor illusion.

Fig. 5

Individual difference analysis of the effects of motion dynamics on the two illusion types. In both panels, each dot represents data from one participant. a For each participant, we compared the difference in illusion magnitude (IM) between the dynamic (Dyn) and static (Stat) conditions, separately for the Ebbinghaus (Ebb) and corridor (Corr) illusions. There was a significant inverse relationship between these values. Specifically, participants for whom motion dynamics led to the largest increase in the strength of the Ebbinghaus illusion (positive values on x-axis) also showed the largest decrease in the strength of the corridor illusion (negative values on the y-axis). b For each participant, we compared the difference in slopes between the dynamic and static no-context conditions with the difference in illusion magnitudes across the same motion conditions for the Ebbinghaus (left) and corridor (right) illusions. Overall, participants who experienced the largest change in slope due to motion dynamics showed the smallest change in illusion magnitudes. Although significant, the direction of this effect (positive correlation for Ebbinghaus illusion and negative correlation for corridor illusion) was in the opposite direction of our prediction

The precision hypothesis predicts that participants who had the biggest change in slope across dynamic and static conditions should show the biggest change in illusion magnitude across dynamic and static conditions. To test this hypothesis, we computed the difference between the slopes of the no-context dynamic and static conditions, and compared these with the change in illusion magnitudes across the dynamic and static conditions. There was a significant positive correlation between the dynamic–static slope difference for the no-context condition and the dynamic–static illusion magnitude difference for the Ebbinghaus illusion, r(18) = .51, p = .022, R² = .26 (see Fig. 5b, left). Additionally, there was a significant negative correlation between the dynamic–static slope difference for the no-context condition and the dynamic–static illusion magnitude difference for the corridor illusion, r(18) = −.58, p = .008, R² = .34 (see Fig. 5b, right). However, although significant, these correlations are in the opposite direction to that predicted by the precision hypothesis. These results indicate that the participants for whom motion dynamics led to the greatest decrease in slope (i.e., largest decrease in precision of the target representation) showed the smallest increase for the dynamic Ebbinghaus illusion and the smallest decrease for the dynamic corridor illusion. For this analysis, we used the no-context condition to quantify the change in slope from the static to the dynamic condition, under the assumption that it would be the most direct measure of the desired quantity. However, as shown below, slopes for the no-context condition were less reliable, showing a high degree of intersession variability. This is likely due to the extremely steep slopes for this condition (see Fig. 4). Regardless, when we reran this analysis using the slope change in the Ebbinghaus or corridor illusions, we did not find any significant correlations. Additionally, these nonsignificant trends remained in the opposite direction to that predicted.

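Overall, the individual differences analyses provided only partial support for the precision hypothesis: the effects of motion dynamics on the two illusions were inversely related across participants, but the correlations with slope change ran opposite to the predicted direction. Future studies using alternate manipulations of representational precision (Mruczek et al., 2015) may lead to more reliable slope measures for the no-context condition, allowing for more conclusive data regarding these predictions.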

Reliability of illusion magnitudes and slopes

The analyses presented above are predicated on reliable measures of illusion magnitudes and slopes in our experimental paradigm. For the 19 participants who had good psychometric curve fits across both data collection sessions, we compared independent measures of these metrics across sessions to assess their intersession reliability. One participant was excluded from this analysis due to extremely poor psychometric curve fits for one session. The intersession reliability results are presented in Table 1.

Table 1 Intersession reliability measures for PSEs, illusion magnitudes, and slopes.

Experiment 2

In Experiment 1, participants freely viewed the display until making a response. Thus, there were no time or eye movement constraints. In Experiment 2, we replicate our results for the corridor illusion under different viewing conditions. Participants were instructed to fixate a peripheral spot to the left of the static or translating circles. We compared illusion magnitudes across static and dynamic versions of the corridor illusion.

Method

Participants

Ten participants (one experimenter) completed Experiment 2. The effect size for the comparison of the dynamic and static corridor illusions in Experiment 1 was d = 1.07. For two-tailed tests at α = .05 and a related-samples design with a sample size of 10, this yields an expected power of .85 for Experiment 2. The observed effect size for Experiment 2 was above the anticipated effect size (see Experiment 2: Results and Discussion section).
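The expected power stated above can be approximated by simulation. The sketch below is an illustration under stated assumptions, not the procedure the authors used: difference scores are assumed to be normally distributed with a standardized mean of d, and the two-tailed critical value for df = 9 (2.262) is supplied by hand. The function name `simulated_power` and the simulation settings are hypothetical.

```python
import math
import random

def simulated_power(d, n, t_crit, n_sims=20000, seed=1):
    """Estimate the power of a two-tailed paired t test by simulation.

    Difference scores are drawn from a normal distribution with mean d and
    SD 1, so the standardized effect size is d (an assumption of this sketch).
    t_crit is the two-tailed critical t for df = n - 1, given by the caller.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        xs = [rng.gauss(d, 1.0) for _ in range(n)]
        mean = sum(xs) / n
        var = sum((x - mean) ** 2 for x in xs) / (n - 1)
        t = mean / math.sqrt(var / n)
        if abs(t) > t_crit:
            hits += 1
    return hits / n_sims

# Experiment 2 planning values: d = 1.07, n = 10, alpha = .05 (two-tailed).
power_exp2 = simulated_power(d=1.07, n=10, t_crit=2.262)
print(f"Simulated power: {power_exp2:.2f}")
```

With these settings the estimate should fall close to the reported value of .85, with small deviations due to simulation noise.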

All participants, save one experimenter, consisted of student volunteers participating in exchange for course credit from the University of Nevada, Reno. Prior to participating, each observer provided informed written consent. All participants reported normal or corrected-to-normal vision and all participants, except the authors, were naïve to the specific aims and designs of the experiments. All procedures were approved by the Institutional Review Board of the University of Nevada, Reno.

Apparatus and display

Stimuli were generated and presented using the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997) for MATLAB (The MathWorks Inc., Natick, MA). The experimental setup used a ViewSonic Graphic Series G220fb monitor (20 in, 1,024 × 768-pixel resolution, 85-Hz refresh rate) driven by a Mac mini computer (2.5 GHz, 16 GB of DDR3 SDRAM) with an Intel HD Graphics 4000 graphics processor (768 MB). Participants viewed the stimuli binocularly from 73 cm with their chin positioned in a chin rest.

Design and procedure

Experiment 2 used a 2 × 2 factorial design with context (isolated vs. corridor) and motion condition (static vs. dynamic) as factors. Thus, there were four unique conditions: static isolated, dynamic isolated, static corridor, and dynamic corridor (see Fig. 6). The four conditions for Experiment 2 differed from the same conditions in Experiment 1 in terms of a fixation requirement, a limited trial duration, and a fixation-only cue period. We focus our description below on the differences between the two experiments.

Fig. 6

The four unique conditions presented in Experiment 2. The experiment used a 2 × 2 factorial design with context (columns: isolated vs. corridor) and motion condition (rows: static vs. dynamic) as factors. The white arrows indicate the direction of motion of a single target stimulus during dynamic trials, and were not visible to the participant. Participants were instructed to maintain fixation on a green fixation spot (white in the figure) to the left of the stimulus. Each trial lasted 1 s, after which the participant reported at which position (upper or lower) the central circle appeared larger

For all trials in Experiment 2, participants were instructed to maintain fixation on a green fixation point positioned 3° from the center of the vector connecting the standard and target positions, along the direction perpendicular to that vector. The position of the upper circle was about 4.5° above and 4.7° right of the center of the screen, and the position of the lower circle was about 3.0° below and 1.95° right of the center of the screen. As in Experiment 1, the two positions were 8° apart, along a direction approximately 20° from vertical (i.e., 70° from horizontal).
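The stated circle positions can be checked against the stated separation with a few lines of arithmetic. This is only a consistency check on the coordinates given above; the variable names are illustrative.

```python
import math

# Positions in degrees of visual angle: (right of center, above center).
upper = (4.7, 4.5)
lower = (1.95, -3.0)

dx = upper[0] - lower[0]
dy = upper[1] - lower[1]

separation = math.hypot(dx, dy)                         # distance between positions
angle_from_vertical = math.degrees(math.atan2(dx, dy))  # tilt of the path from vertical
angle_from_horizontal = 90.0 - angle_from_vertical
midpoint = ((upper[0] + lower[0]) / 2, (upper[1] + lower[1]) / 2)

print(f"separation = {separation:.2f} deg")
print(f"angle from vertical = {angle_from_vertical:.1f} deg")
print(f"angle from horizontal = {angle_from_horizontal:.1f} deg")
print(f"midpoint = ({midpoint[0]:.3f}, {midpoint[1]:.3f})")
```

The coordinates give a separation of about 8° along a path tilted roughly 20° from vertical, which is equivalent to 70° from horizontal.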

The stimulus for each trial in Experiment 2 was preceded by a 500-ms fixation-only period, in which the green fixation spot was displayed but no other stimulus components were displayed, and a 200-ms cue period, in which the initial stimulus configuration was displayed but no translation of the circle occurred for dynamic trials. After the cue period, the stimulus remained visible for an additional 1 s for static trials, or translated smoothly (8°/s) from the upper starting position (i.e., the standard position) to the lower position (i.e., the target position) over 1 s for dynamic trials. At the end of this 1 s period, the stimulus was removed from the display, and the fixation spot remained until the participant made a response.

The size of the target circle on static trials or the size of the circle in the target position on dynamic trials was controlled by the same adaptive staircase procedure as described for Experiment 1 (see above). For each of the four unique conditions, there were six pseudorandomly interleaved staircases (15 trials each), with half starting from the minimum target size and half starting from the maximum target size.

Each participant in Experiment 2 partook in one session in which they completed 90 trials for each condition, for a grand total of 360 trials (4 conditions × 6 interleaved staircases/condition × 15 trials/staircase). Participants were given self-paced breaks every 36 trials (i.e., every 10% of the session total).

Data analysis

Data analysis followed the same procedure as outlined for Experiment 1. PSEs, illusion magnitudes, and slopes were extracted for each condition from psychometric curves fit to data from all trials, excluding the data from the most extreme values of the target size (i.e., the starting points for the interleaved staircases).
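To make the relationship between the PSE, the slope, and precision concrete, the sketch below uses a logistic psychometric function. The functional form is an assumption made for illustration, since the function fitted in the study is specified in the Experiment 1 methods rather than here, and the parameter values are hypothetical. For a logistic with scale parameter s, the PSE is the 50% point and the maximum slope is 1/(4s), so a steeper curve corresponds to a more precise representation of target size.

```python
import math

def logistic(x, pse, scale):
    """Probability of reporting the target as larger at target size x
    (logistic form assumed for this sketch)."""
    return 1.0 / (1.0 + math.exp(-(x - pse) / scale))

def max_slope(scale):
    """Slope of the logistic at its steepest point, which is the PSE."""
    return 1.0 / (4.0 * scale)

# Hypothetical parameters: the same PSE with a precise and an imprecise observer.
pse, precise_scale, imprecise_scale = 2.0, 0.05, 0.20

at_pse = logistic(pse, pse, precise_scale)  # 0.5 by definition of the PSE

# Numerical check that the slope at the PSE matches the closed form.
h = 1e-6
numeric_slope = (logistic(pse + h, pse, precise_scale)
                 - logistic(pse - h, pse, precise_scale)) / (2 * h)

print(at_pse, numeric_slope, max_slope(precise_scale), max_slope(imprecise_scale))
```

In this parameterization, a shallower slope for dynamic trials would correspond to a larger scale parameter, that is, to less consistent responses around the PSE.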

To determine whether an illusory effect was observed for each condition, illusion magnitudes were compared against zero using one-sample t tests. At the group level, normalized illusion magnitudes were compared across the corridor conditions of interest using a paired t test. At the group level, slopes were compared across all conditions using a repeated-measures ANOVA. Post hoc comparisons were made for a set of two a priori pairs based on our hypothesis (dynamic vs. static conditions for each of the two contexts) using the Bonferroni correction (α = 0.025).

Results and discussion

Group analysis: Illusion magnitudes

Figure 7 shows the comparison of illusion magnitudes for the static and dynamic corridor conditions averaged over all participants. Illusion magnitudes for the static corridor, t(9) = 10.31, p < .001, d = 3.26, and the dynamic corridor, t(9) = 3.49, p = .007, d = 1.10, were both significantly greater than zero. Consistent with Experiment 1, a paired t test revealed that illusion magnitudes for the dynamic corridor (M = 10.2, 95% CI [3.3, 17.2]) were significantly smaller (i.e., weaker illusion) than those for the static corridor (M = 24.6, 95% CI [17.6, 31.5]), t(9) = 4.69, p = .001, d = 1.48. The magnitude of the mean difference between illusion magnitudes for the static and dynamic conditions across participants was 14.4 (95% CI [7.5, 21.3]).
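The reported effect sizes are consistent with the standard conversion for one-sample and paired t tests, d = t/√n. The sketch below applies that conversion to the t values reported above with n = 10. It is a consistency check under that assumption, since the text does not state how d was computed.

```python
import math

def cohens_d_from_t(t, n):
    """Cohen's d for a one-sample or paired t test with n participants."""
    return t / math.sqrt(n)

# t values reported for Experiment 2 (n = 10 participants).
reported_t = {
    "static corridor vs. zero": 10.31,
    "dynamic corridor vs. zero": 3.49,
    "static vs. dynamic corridor": 4.69,
}

d_values = {label: cohens_d_from_t(t, 10) for label, t in reported_t.items()}
for label, d in d_values.items():
    print(f"{label}: d = {d:.2f}")
```

The three conversions round to 3.26, 1.10, and 1.48, matching the values given in the text.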

Fig. 7

Illusion magnitudes from the main conditions of interest in Experiment 2. Illusion magnitudes for both conditions were significantly greater than zero. Additionally, the dynamic corridor illusion led to significantly weaker illusion magnitudes compared with the static corridor illusion. Error bars represent 95% CI with between-subjects variance removed

Group analysis: Slopes

To complement the slope analysis for Experiment 1, we compared slopes across the four conditions of Experiment 2 using a repeated-measures ANOVA. To reiterate, we expected that the maximum slope of the psychometric curves, which quantifies the precision of the target representation, would be lower for the dynamic illusion conditions.

We observed marginally significant main effects of context, F(1, 9) = 4.09, p = .074, ηp² = .31, and motion, F(1, 9) = 4.99, p = .052, ηp² = .36, with no significant interaction between context and motion, F(1, 9) = 3.13, p = .11. Consistent with our expectations, slopes were lower for the dynamic (M = 2.92, 95% CI [1.99, 3.85]) compared with the static (M = 8.29, 95% CI [2.96, 13.61]) conditions. This relationship also held for the pairwise comparison of dynamic and static conditions within each context. Slopes were marginally lower for the dynamic no-context (M = 3.15, 95% CI [2.16, 4.14]) compared with the static no-context (M = 12.45, 95% CI [2.26, 22.63]) condition, t(9) = 2.03, p = .07, d = 0.64, and for the dynamic corridor (M = 2.69, 95% CI [1.33, 4.05]) compared with the static corridor (M = 4.12, 95% CI [2.95, 5.30]), t(9) = 2.33, p = .045, d = 0.74 (Bonferroni corrected α = 0.025 for two a priori pairwise comparisons). Although only trending toward significance, these data are consistent with our results from Experiment 1 and suggest that motion dynamics reduced the representational precision of the target.

Experiment 3

Experiments 1 and 2 compared static and dynamic versions of the corridor and Ebbinghaus illusions. As shown in the analysis of psychometric curve slopes for Experiment 1, the dynamic condition yielded a less precise representation of the target. In Experiment 3, we manipulated viewing conditions within the dynamic corridor illusion. Compared with smooth pursuit, peripheral viewing is expected to decrease target representation precision due to differences in the acuity of foveal and peripheral retina. Thus, we predicted that peripheral viewing would result in a decreased magnitude for the corridor illusion, complementing the results of Experiments 1 and 2. This experiment also used the method of adjustment to obtain a direct measure of the PSE on every trial.

Method

Participants

Eight participants (one experimenter) completed Experiment 3. In a previous study (Mruczek et al., 2015) we made a similar comparison of viewing conditions (fixating a static inducer with a moving target versus tracking a moving target with smooth pursuit) for the dynamic Ebbinghaus illusion and observed an effect size of approximately d = 1.34. For two-tailed tests at α = .05 and a related-samples design with a sample size of 8, this yields an expected power of .90 for Experiment 3. The observed effect size for Experiment 3 was above the anticipated effect size (see Experiment 3: Results and Discussion section).

All participants, save one experimenter, consisted of student volunteers participating in exchange for course credit from the University of Nevada, Reno. Prior to participating, each observer provided informed written consent. All participants reported normal or corrected-to-normal vision and all participants, except the authors, were naïve to the specific aims and designs of the experiments. All procedures were approved by the Institutional Review Board of the University of Nevada, Reno.

Apparatus and display

Experiment 3 used the same experimental setup as in Experiment 2.

Design and procedure

All trials in Experiment 3 used the dynamic corridor for the motion and context parameters, with slight modifications from the first two experiments. Specifically, the target object was a shaded sphere with a shadow, enhancing the 3-D perspective of the display (see Fig. 8). The sphere translated at a constant speed (5.2°/s) between a position in the upper right (corresponding to the back of the corridor) and lower left (corresponding to the front of the corridor) throughout a given trial. The upper position was 5.3° right and 5.3° above the center of the screen, and the lower position was 1.3° right and 4.3° below the center of the screen.

Fig. 8

The two conditions presented in Experiment 3. All trials in the experiment used the dynamic corridor illusion, with differences in fixation requirements defining conditions. In the pursuit condition (right), the fixation spot was centered on the spherical target and moved along with the target ball as it translated along the corridor. In the peripheral fixation condition, the fixation spot was superimposed on the left wall of the corridor background (left). The white arrows indicate the direction of motion of the sphere during each trial and were not visible to the participant. Each trial lasted until the participant pressed the space bar indicating that the self-adjusted size change of the sphere resulted in the perception of no change in sphere size

Participants were instructed to maintain fixation on a red fixation spot (0.2° diameter) throughout all trials. There were two conditions for Experiment 3 defined by the position of the fixation spot. In the smooth pursuit condition, the fixation spot was centered on the spherical target and moved along with the target ball as it translated along the corridor (Fig. 8, right). In the peripheral fixation condition, the fixation spot was overlaid on the left wall of the corridor background at a position that was 3.1° left and 2.8° above the midpoint of the vector connecting the upper and lower positions of the target sphere’s translation path. This corresponded to a location superimposed on the left wall of the corridor, near the “front” of the corridor (Fig. 8, left).

Experiment 3 used the method of adjustment in which participants controlled the size of the sphere as it translated in the corridor using a computer mouse. In addition to translating, the sphere smoothly changed size between its upper and lower positions. Whichever position (upper or lower) was the initial position of the sphere on a given trial was deemed the “standard.” The standard sphere was either 1.8°, 1.9°, or 2°, chosen randomly on each trial. In the opposite position, at the endpoint of its translation vector, the sphere was deemed the “target” and its size was controlled by the participant moving the computer mouse. The target could be any size between 0.15° and 5°, with higher values associated with higher mouse positions, and vice versa. The size of the target at the start of each trial was set to one of the extreme values (much larger or much smaller than the standard), with half of the trials starting from the smallest extreme and half starting from the largest extreme.

The size of the sphere always changed smoothly from the standard position to the target position as the sphere translated. The participant’s task was to adjust the size of the sphere in the target position such that it did not appear to change size during its translation. Each trial continued, with the sphere translating and changing size back and forth between the two positions, until the participant pressed the space bar.
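The adjustment procedure can be sketched as two mappings: one from mouse position to target size, and one from position along the translation path to the displayed sphere size. Both mappings are assumed to be linear here, which the text does not specify (it states only that size changed "smoothly" and that higher mouse positions gave larger sizes). The function names are hypothetical.

```python
def mouse_to_target_size(mouse_frac, min_size=0.15, max_size=5.0):
    """Map normalized mouse height (0 = lowest, 1 = highest) to target size
    in degrees. The linear mapping is an assumption of this sketch."""
    mouse_frac = min(1.0, max(0.0, mouse_frac))  # clamp to the allowed range
    return min_size + mouse_frac * (max_size - min_size)

def sphere_size(path_frac, standard_size, target_size):
    """Displayed size at a fraction of the path (0 = standard position,
    1 = target position). Linear interpolation is assumed."""
    return standard_size + path_frac * (target_size - standard_size)

# Example: a 1.9 deg standard with the mouse at 40% of its range.
target = mouse_to_target_size(0.4)
sizes = [sphere_size(f, 1.9, target) for f in (0.0, 0.5, 1.0)]
print(target, sizes)
```

Under these assumptions, the participant's final mouse position fixes the target size, and therefore the rate of size change across the path, for that trial.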

Five participants in Experiment 3 completed 40 trials for each condition, for a grand total of 80 trials (2 conditions × 2 initial position of standard × 2 initial target sizes × 10 repetitions). Three participants completed 20 trials for each condition, for a grand total of 40 trials (2 conditions × 2 initial position of standard × 2 initial target sizes × 5 repetitions). During each session, participants were given self-paced breaks after completing every 10% of the total session trials.
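The trial counts above follow from fully crossing the three two-level factors and repeating each cell. A minimal sketch of such a trial list is given below. The factor labels and the function name are illustrative, and shuffling the order is an assumption, since the text does not describe how trials were ordered.

```python
import itertools
import random

def build_trials(repetitions, seed=0):
    """Build a fully crossed trial list for the Experiment 3 design."""
    conditions = ["smooth_pursuit", "peripheral_fixation"]
    standard_positions = ["upper", "lower"]
    initial_target_sizes = ["smallest", "largest"]
    cells = itertools.product(conditions, standard_positions, initial_target_sizes)
    trials = [cell for cell in cells for _ in range(repetitions)]
    random.Random(seed).shuffle(trials)  # trial order is an assumption here
    return trials

long_session = build_trials(repetitions=10)   # 2 x 2 x 2 x 10
short_session = build_trials(repetitions=5)   # 2 x 2 x 2 x 5
print(len(long_session), len(short_session))
```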

Data analysis

Using the method of adjustment, the response on each trial provides an estimate of the PSE for that condition. Specifically, the PSE represents the size of the sphere at the target position (at one extreme of the corridor, which defines the growth rate of the sphere), such that the participant perceived a minimal change in the size of the sphere over the entire animation (i.e., subjectively equivalent to an unchanging sphere).

To equate the sign for PSEs derived from trials in which the upper (i.e., “far”) or lower (i.e., “close”) sphere was the standard, which are predicted to have opposite effects on the perceived size of the target, we inverted the sign of the PSEs for trials in which the standard was in the lower position. Thus, all PSEs are reported as the size of the target sphere in the lower (i.e., “close”) position to perceptually match the standard sphere in the upper (i.e., “far”) position. Positive PSEs indicate that the target sphere had to be physically larger to appear perceptually equivalent to the standard.

To account for potential outliers, we took the conservative approach of discarding, for each participant, the trials with the highest and lowest recorded PSEs for each condition. Because Experiment 3 did not include a no-context condition, we report PSEs directly, rather than normalized illusion magnitudes, as in Experiments 1 and 2. At the group level, PSEs were compared across the two conditions using a paired t test.
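The sign normalization and outlier trimming described above can be sketched as follows. Expressing the PSE as a percent difference between target and standard is an assumption made for this illustration, as the units of the PSE are not defined in this section, and the function names and example values are hypothetical.

```python
def normalized_pse(target_size, standard_size, standard_position):
    """Signed PSE, expressed as a percent difference (an assumption here).

    The sign is inverted when the standard was in the lower position, so that
    positive values always mean the lower ("close") sphere had to be
    physically larger to match the upper ("far") sphere.
    """
    pse = 100.0 * (target_size - standard_size) / standard_size
    return -pse if standard_position == "lower" else pse

def trimmed_mean_pse(pses):
    """Discard the single highest and single lowest PSE, then average."""
    if len(pses) < 3:
        raise ValueError("need at least three trials to trim both extremes")
    kept = sorted(pses)[1:-1]
    return sum(kept) / len(kept)

# Hypothetical trials: (adjusted target size, standard size, standard position).
trials = [(2.6, 2.0, "upper"), (1.4, 2.0, "lower"), (2.47, 1.9, "upper"),
          (3.6, 1.8, "upper"), (1.71, 1.9, "lower")]
pses = [normalized_pse(*trial) for trial in trials]
print(pses, trimmed_mean_pse(pses))
```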

Results and discussion

Group analysis: PSEs

Figure 9 shows the comparison of PSEs for the smooth pursuit and peripheral fixation conditions averaged over all participants. PSEs for the smooth pursuit, t(7) = 5.62, p < .001, d = 1.99, and the peripheral fixation, t(7) = 5.58, p < .001, d = 1.97, conditions were both significantly greater than zero. Consistent with our predictions, a paired t test revealed that PSEs for the smooth pursuit condition (M = 30.0, 95% CI [27.0, 33.1]) were significantly larger (i.e., stronger illusion) than those for the peripheral fixation condition (M = 25.0, 95% CI [22.0, 28.0]), t(7) = 3.97, p = .005, d = 1.40. The magnitude of the mean difference between PSEs for the smooth pursuit and peripheral fixation conditions across participants was 5.05 (95% CI [2.0, 8.1]).

Fig. 9

PSEs from the main conditions of interest of Experiment 3. PSEs for both conditions were significantly greater than zero. Additionally, the smooth pursuit condition led to significantly larger PSEs (i.e., a stronger illusion) compared with the peripheral fixation condition. Error bars represent 95% CI with between-subjects variance removed

General discussion

Across three experiments, we explored the effects of dynamic motion on classic size contrast (Ebbinghaus illusion) and size constancy (corridor illusion) stimuli. In Experiment 1, illusion magnitudes for the dynamic Ebbinghaus illusion were over twice as large as for the static Ebbinghaus illusion (replicating our previous results; Mruczek et al., 2015). In contrast, illusion magnitudes for the dynamic corridor illusion were only half as large as for the static corridor illusion. To demonstrate the consistency of this novel effect, we replicated the results for the static versus dynamic corridor illusion under peripheral fixation and time-constrained viewing conditions (Experiment 2) and across smooth pursuit and peripheral fixation of the dynamic corridor illusion (Experiment 3).

Target precision and additional factors

We have previously posed what we call the precision hypothesis to explain why motion dynamics have the effects on perceived size that they do. The precision hypothesis states that the integration between an object’s angular size and the surrounding contextual cues will depend on the representational precision of the angular size of the object. The results obtained in the current set of experiments are largely, although not completely, consistent with this hypothesis. In the dynamic versions of the Ebbinghaus and corridor illusions, motion dynamics resulting from target motion and eye movements decreased the precision of the representation of the angular size of the target object. An analysis of psychometric slopes in Experiment 1 confirmed this decrease in target precision: slopes were lower for the dynamic versus static conditions for both the Ebbinghaus and corridor illusions. For the Ebbinghaus illusion, decreasing the precision of the angular size representation of a target object increased the strength of the size contrast illusion. In other words, the contextual surround had a larger influence on the perceived size of the target. In contrast, for the corridor illusion, decreasing the precision of the angular size representation of a target object decreased the strength of the size constancy illusion. This observation is especially intriguing given that differences between the dynamic and static illusions in Experiment 1 were inversely correlated across individuals, suggesting a common underlying mechanism.
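One generic way to illustrate how the influence of context could depend on the precision of the angular size representation is inverse-variance (reliability-weighted) cue combination. The sketch below uses this textbook rule purely as an illustration. It is not the model proposed here, the numerical values are hypothetical, and it reproduces only the Ebbinghaus pattern, in which lower precision gives context more influence. It does not capture the reduced illusion observed for the dynamic corridor.

```python
def integrate(angular_estimate, angular_sd, context_estimate, context_sd):
    """Combine two size estimates, weighting each by its inverse variance.

    Returns the combined estimate and the weight given to the context cue.
    """
    angular_precision = 1.0 / angular_sd ** 2
    context_precision = 1.0 / context_sd ** 2
    w_angular = angular_precision / (angular_precision + context_precision)
    combined = w_angular * angular_estimate + (1.0 - w_angular) * context_estimate
    return combined, 1.0 - w_angular

# Hypothetical values: the angular cue says 1.0 deg, the context suggests 1.2 deg.
precise, w_context_precise = integrate(1.0, 0.1, 1.2, 0.2)      # e.g., static viewing
imprecise, w_context_imprecise = integrate(1.0, 0.4, 1.2, 0.2)  # e.g., dynamic viewing
print(precise, w_context_precise, imprecise, w_context_imprecise)
```

In this toy example, reducing the precision of the angular estimate shifts the combined estimate toward the value suggested by the context.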

What might explain the opposite effects of motion dynamics on these two illusions? The core principle underlying the corridor illusion is that the size of an object’s retinal image is determined by the actual size of the object scaled by viewing distance. Thus, if two objects have the same retinal size, then the farther of the two must be larger, information that then becomes represented in the perceived size of the objects. One possibility that would be consistent with the precision hypothesis is that in order to make the distance-dependent adjustment, the visual system must first have a precise representation of the object’s retinal size. Indeed, recent evidence suggests that the integration of distance cues during size constancy in V1 occurs after 150 ms (Chen, Sperandio, Henry, & Goodale, 2019). We note that at this stage of investigation, this explanation is speculative. Additionally, one observation from Experiment 1 was not consistent with another prediction of the precision hypothesis. Specifically, participants who experienced the largest change in slope across motion conditions would be expected to show the largest difference in illusion magnitudes across those same conditions. We observed, however, that participants who experienced the biggest effect of motion dynamics on target precision as measured by psychometric slope showed the smallest change in illusory effect, which runs in the opposite direction relative to our prediction. This may suggest an inadequacy of the psychometric slope as a measure of target representation precision, or that the precision hypothesis does not fully capture the effects of motion dynamics on size contrast and size constancy illusions.
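The geometric principle stated above, that retinal image size reflects physical size scaled by viewing distance, can be written as θ = 2·atan(S / 2D) for an object of physical size S at distance D. The sketch below illustrates this relation. The specific distances are hypothetical and are chosen only to show that, at a fixed angular size, a farther object must be physically larger.

```python
import math

def angular_size_deg(physical_size, distance):
    """Visual angle (deg) subtended by an object of a given size and distance
    (same length units for both arguments)."""
    return math.degrees(2.0 * math.atan(physical_size / (2.0 * distance)))

def physical_size(angular_deg, distance):
    """Physical size implied by a visual angle (deg) at a given distance."""
    return 2.0 * distance * math.tan(math.radians(angular_deg) / 2.0)

# Two objects with the same 2 deg retinal size at hypothetical distances (cm).
near_size = physical_size(2.0, 73.0)
far_size = physical_size(2.0, 146.0)
print(near_size, far_size, far_size / near_size)
```

Doubling the distance at a constant visual angle doubles the implied physical size, which is the inference the corridor's depth cues are thought to drive.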

Beyond our precision hypothesis, we note several alternative explanations for the results obtained here. In particular, an important difference across the studied illusion variants is that in the dynamic Ebbinghaus illusion the contextual inducers are themselves highly dynamic due to translation and a drastic size change. In contrast, in the dynamic corridor illusion the corridor provides a static context over the course of each trial. This difference may have multiple related implications. First, the distribution of attention is known to be influenced by motion (Hillstrom & Yantis, 1994), and attention to inducers can alter the magnitude of the Ebbinghaus illusion (Shulman, 1992). Attention may be strongly drawn to the dynamic context in the dynamic Ebbinghaus illusion, but away from the static context in the dynamic corridor illusion. Second, because the translation of the inducers in the dynamic Ebbinghaus illusion matches the translation of the target, there may be strong perceptual grouping of the target and inducers by common fate motion. Such contextual grouping has been shown to enhance neuronal interactions via horizontal and feedback connections (Roelfsema, 2006). It is not as readily apparent, however, how such an effect would apply to the corridor illusion, in which the target (i.e., figure) is inherently segmented from the corridor (i.e., ground), irrespective of the target motion.

Size perception may also use a prior that objects do not change their physical size during simple transformations, such as translation. In the dynamic Ebbinghaus illusion, this prior may be overridden by the dynamic nature of the contextual elements, which are highly salient, as discussed above. In contrast, this prior may dominate in the presence of a static background providing depth cues in the dynamic corridor illusion.

More quantitative models of the effects of motion dynamics, attention, and other manipulations of target precision, as well as an exploration of these factors in other visual illusions, may help resolve these inconsistencies in future work.

Implications for neural models of size perception

Neuronally, the retinotopic organization of early visual cortex (Hubel & Wiesel, 1959; Tootell, Switkes, Silverman, & Hamilton, 1988) serves as a natural basis for the representation of angular size, the amount of the visual scene taken up by an object as projected onto the retina. Intriguingly, recent evidence has implicated V1 in the representation of an object’s perceived rather than its angular size (Murray, Boyaci, & Kersten, 2006; Pooresmaeili, Arrighi, Biagi, & Morrone, 2013; Schwarzkopf et al., 2011; Sperandio, Chouinard, & Goodale, 2012). Using size illusions to dissociate perceived and physical size, these studies have consistently found that the spatial spread of activity in V1, or the size of V1 itself, is correlated with the object’s perceived size and not just its angular size.

There is some evidence that certain aspects of perceived size may be represented directly in V1, whereas others are not. For example, the Ponzo illusion shows strong interocular transfer, indicating the involvement of higher visual areas, whereas the Ebbinghaus illusion shows weak interocular transfer consistent with an earlier V1 representation (Song et al., 2011). In the context of the corridor illusion, it has been hypothesized that the integration of contextual cues, which are represented across multiple regions of visual cortex, occurs via feedback to V1 (Chen, Sperandio, et al., 2019; Fang, Boyaci, Kersten, & Murray, 2008; Liu et al., 2009).

The most recent mechanistic model to account for the influence of the contextual background on perceived size in the corridor illusion posits a shift in the receptive field positions of primary visual cortex cells (MacEvoy & Fitzpatrick, 2006). For spatial positions at the “far” end of the corridor, V1 receptive fields are compressed. This leads to a larger spread of activity in topographically organized V1 and a corresponding increase in the perceived size of the object. For spatial positions at the “near” end of the corridor, V1 receptive fields are spread out, leading to the opposite effect. These background-mediated receptive field shifts ostensibly allow for the direct representation of perceived size within V1, which we refer to as the direct hypothesis. This model is supported by results from single-unit neurophysiology in primate V1 (Ni, Murray, & Horwitz, 2014) and fMRI studies of voxel-based receptive field models in human early visual cortex (He, Mo, Wang, & Fang, 2015). For example, Ni et al. (2014) identified neural correlates of perceived size in V1 arising as early as 30–60 ms after the target appeared on the corridor background.

Importantly, this model does not easily account for our observations of the dynamic corridor illusion, in which a single target is moving along the corridor. Specifically, the dynamic corridor illusion leads to a weaker illusion magnitude compared with the classic static corridor illusion. In contrast to this observation, the direct hypothesis predicts that changes in target position should simply resize the object in the same manner as for stationary targets, hence predicting no change in illusion magnitude across static and dynamic conditions. Thus, the degree to which this mechanism alone underlies perceived size remains unclear, which raises the possibility that other mechanisms instantiated within V1 or other cortical areas also play an important role in the representation of perceived size. Indeed, the neurophysiological results of Ni et al. (2014) did not fully account for differences between angular and perceived size as measured by behavioral responses (only 25% and 63% for the two monkey participants; see also, Chen, McManus, Valsecchi, Harris, & Gegenfurtner, 2019).

Our observations also strongly suggest that individual cues, such as the depth cues provided by the background image in the corridor illusion, do not have a constant and obligatory effect. Rather, we propose that an initial cortical representation of angular size is subsequently integrated with contextual cues prior to the representation of perceived size in V1 and elsewhere.

Beyond size perception

Our results highlight the diverse effects that motion dynamics can have on classic visual size illusions. Motion dynamics have also been shown to influence the magnitudes of other forms of visual illusions. One compelling example can be observed with motion-induced position shifts, in which the perceived position of an object (e.g., an internally drifting Gabor patch viewed through a stationary aperture) can be subtly influenced by the direction and speed with which it is internally drifting (De Valois & De Valois, 1991; Fischer & Whitney, 2009; Ramachandran & Anstis, 1990; Whitney et al., 2003). Several modified versions of this effect, such as the infinite regress illusion (Tse & Hsieh, 2006), the curveball illusion (Shapiro, Lu, Huang, Knight, & Ennis, 2010), and the double-drift illusion (Lisi & Cavanagh, 2015), reveal that motion-induced position shifts can be greatly enhanced if the object itself is translating (e.g., an internally drifting Gabor patch viewed through a moving aperture).

While it is generally accepted that the perceived motion trajectory observed in these double-dynamic displays arises from an integration of the internal and external motion of the stimuli, how these motion signals are integrated is not fully understood. Analogous to our precision hypothesis, Kwon, Tadin, and Knill (2015) modeled the effects of motion on the perception of position using a Bayesian framework, in which retinal motion is attributed to a combination of internal and external motion of an object in a manner that reflects the amount of uncertainty in the perceived position of the stimulus. If the position of the object is uncertain, then retinal motion is largely attributed to external motion (i.e., a change in position); if the position of the object is known with a high degree of certainty, then retinal motion is largely attributed to internal motion. For example, the magnitude of the perceived positional shift in the curveball illusion is large when viewed peripherally (i.e., high uncertainty in target position), but the effect disappears when viewed foveally (i.e., low uncertainty in target position; Kwon et al., 2015; Shapiro et al., 2010).

Similarly, based on observations with the same double-drift stimulus, Lisi and Cavanagh (2015) provided evidence that external motion of the target interferes with the accumulation of local error signals between the retinotopic location of the target and its perceived location. This, in turn, allows for greater errors to be accumulated before a perceptual reset threshold is reached.

Our proposed model of size perception, in which the influence of contextual cues depends largely on the precision of the angular size of the target, is similar in nature to these models of motion and position perception. In the context of a double-drifting Gabor, the precision hypothesis suggests that external motion makes the representation of its retinotopic location at any moment less precise and would thus increase the influence that internal motion has on perceived position. Alternatively, the reduced precision of position could equivalently interfere with the local accumulation of error signals. Further empirical work investigating representation precision and error accumulation across a range of stimuli will be necessary to tease these hypotheses apart.
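
The core relationship in the precision hypothesis, namely that a less precise angular size estimate gives contextual cues more influence, can be illustrated with a minimal numerical sketch. The sketch assumes standard inverse-variance (reliability-weighted) cue combination, which is an illustrative choice rather than a model fitted to our data; the function name `combine_size_cues` and all numerical values are hypothetical and expressed in arbitrary units.

```python
def combine_size_cues(angular_size, angular_sd, context_size, context_sd):
    """Precision-weighted average of two size estimates.

    Assumption: each cue is weighted by its precision (1 / variance).
    This is a generic cue-combination rule used for illustration only.
    """
    angular_precision = 1.0 / angular_sd ** 2
    context_precision = 1.0 / context_sd ** 2
    # Share of the final estimate contributed by the contextual cue.
    context_weight = context_precision / (angular_precision + context_precision)
    perceived = (1.0 - context_weight) * angular_size + context_weight * context_size
    return perceived, context_weight


# Hypothetical values: the context suggests a size of 1.2, the retina 1.0.
# "Static": angular size is encoded precisely (small SD).
static_size, static_weight = combine_size_cues(1.0, 0.1, 1.2, 0.2)
# "Dynamic": motion is assumed to make the angular estimate less precise.
dynamic_size, dynamic_weight = combine_size_cues(1.0, 0.3, 1.2, 0.2)

print(f"static:  perceived={static_size:.3f}, context weight={static_weight:.3f}")
print(f"dynamic: perceived={dynamic_size:.3f}, context weight={dynamic_weight:.3f}")
```

With these assumed values, the contextual weight is larger in the dynamic case than in the static case, so the combined estimate moves further from the angular size toward the size suggested by the context. The sketch captures only this single relationship; it does not model how motion might also alter the contextual cues themselves.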

Conclusion

Visual motion can greatly influence contextual effects on the perceived size of an object. The nature of these effects depends on the stimulus. Here, we demonstrated that image dynamics can increase the magnitude of the Ebbinghaus illusion and decrease the magnitude of the corridor illusion. These opposing effects place important constraints on existing and future computational and neural models of size perception.