Structure-from-motion: dissociating perception, neural persistence, and sensory memory of illusory depth and illusory rotation
In the structure-from-motion paradigm, physical motion on a screen produces the vivid illusion of an object rotating in depth. Here, we show how to dissociate illusory depth and illusory rotation in a structure-from-motion stimulus using a rotationally asymmetric shape and reversals of physical motion. Reversals of physical motion create a conflict between the original illusory states and the new physical motion: Either illusory depth remains constant and illusory rotation reverses, or illusory rotation stays the same and illusory depth reverses. When physical motion reverses after the interruption in presentation, we find that illusory rotation tends to remain constant for long blank durations (Tblank ≥ 0.5 s), but illusory depth is stabilized if interruptions are short (Tblank ≤ 0.1 s). The stability of illusory depth over brief interruptions is consistent with the effect of neural persistence. When this is curtailed using a mask, stability of ambiguous vision (for either illusory depth or illusory rotation) is disrupted. We also examined the selectivity of the neural persistence of illusory depth. We found that it relies on a static representation of an interpolated illusory object, since changes to low-level display properties had little detrimental effect. We discuss our findings with respect to other types of history dependence in multistable displays (sensory stabilization memory, neural fatigue, etc.). Our results suggest that when brief interruptions are used during the presentation of multistable displays, switches in perception are likely to rely on the same neural mechanisms as spontaneous switches, rather than switches due to the initial percept choice at the stimulus onset.
Keywords: 3D perception; Depth and shape from motion; Motion in depth; Rivalry/bistable perception; Multi-stable displays
When the sensory system faces ambiguous input that is consistent with several equally plausible interpretations, it does not settle on a single percept. Instead, an observer perceives semistochastic switches between all possible alternatives, a phenomenon called multistable perception (Blake & Logothetis, 2002; Leopold & Logothetis, 1999). In the present study, we employed a structure-from-motion (SFM) stimulus (also known as the kinetic depth effect or depth from motion) to examine the interaction of the two features in conflict: illusory rotation and illusory depth. In an SFM display, dots move back and forth across the screen in a certain pattern to produce a vivid impression of a 3-D object rotating in depth (Sperling & Dosher, 1994; Wallach & O’Connell, 1953) (see Movie 1). Although competition occurs for both illusory rotation and illusory depth, most prior work concentrated on the ambiguity of illusory rotation alone.
The second reason is that the question about the illusory depth of the interpolated object is meaningless when the object is depth symmetric. In this case, individual dots interpolate to the same illusory object for both alternative states of their own illusory depth. In Fig. 1a, it is impossible to know whether the illusory sphere (depicted as a gray circle) follows the example dots and reverses its depth, although current evidence strongly suggests that it remains constant (Brouwer & van Ee, 2006; Li & Kingdom, 1999; Pastukhov, Vonau, & Braun, 2012; Petersik, 1979; Stonkute, Braun, & Pastukhov, 2012; Treder & Meulenbroek, 2010; Zivotofsky & Goldstein, 2007). Taking into account that commonly used object shapes such as spheres or cylinders are rotationally symmetric and, thus, depth symmetric, it is easy to see why illusory depth was overlooked.
Nonetheless, it is still possible to dissociate illusory depth and illusory rotation in SFM displays. For this, we used a rotationally asymmetric band-shaped object. This resolves the latter issue, since the depth order of individual dots now reflects the illusory depth of the interpolated illusory shape (Fig. 1b). In order to break up the link between the states of illusory rotation and illusory depth, we varied the direction of the physical motion of the dots. This alters the relationship between the two illusory properties and reveals all four possible combinations (compare Fig. 1b, c).
Our initial interest was in the interaction of sensory stabilization memories of illusory depth and illusory rotation. The sensory stabilization memory of multistable displays is a history trace that biases perception toward the recently dominant state: When a presentation is interrupted, the same state tends to dominate perception when the presentation resumes (Adams, 1954; Brascamp et al., 2008; Leopold, Wilke, Maier, & Logothetis, 2002; Orbach, Ehrlich, & Heath, 1963; Pastukhov & Braun, 2008; for a review, see Pearson & Brascamp, 2008). This effect is particularly evident after long (>1 s) interruptions in presentation. One of its hallmarks is high specificity: Only the information about the feature in conflict is retained. For example, with the SFM, sensory stabilization memory of illusory rotation is specific to the particular axis of rotation, but not to other display properties, such as size, color, speed of rotation, or exact shape (Maier, Wilke, Logothetis, & Leopold, 2003). Each competing illusory state leaves such a stabilizing trace, and the final perception reflects the relative strengths of all traces (Pastukhov & Braun, 2008). Moreover, several competing features, such as eye dominance, color, and the orientation of the binocular rivalry display used by Pearson and Clifford (2004), each leave an independent sensory stabilization memory trace, all of which interact during the following perception. In our case, this means that illusory rotation and illusory depth must each produce an independent sensory stabilization memory trace, and we must be able to examine how they compete or cooperate at the onset of the subsequent presentation.
Although we used blank durations of up to 4 s, our primary focus was on interruptions shorter than 1 s. This region of parameter space is particularly interesting, since it produces a nonmonotonic relationship between blank durations and the stability of perception (Kornmeier & Bach, 2004; Noest, van Ee, Nijs, & van Wezel, 2007; Orbach et al., 1963). It has been argued that the mechanisms behind perceptual switches differ between short (<0.4–0.6 s) and long (>0.6–1 s) interruptions (Kornmeier & Bach, 2012). In the latter case, perceptual competition “starts from scratch,” in the sense that the activity of both neural populations that represent competing states returns to the baseline and neither population is dominant or suppressed. Accordingly, any change in dominance relies on the same mechanisms as the initial perceptual choice at the start of the stimulation; it is influenced only by history effects and various biases, but not by the current states of competing neural populations. In the former case, the activity of competing neuronal populations should not decay completely during the blank period. Accordingly, the percept choice at the next presentation onset is dependent on a persisting neural representation of competing states, just like during a spontaneous switch. Clarifying this issue is important, since it has strong bearing on the interpretation of results from M/EEG experiments that rely on such intermittent presentations.
When we manipulated the stimulus to force competition between memory traces of illusory rotation and illusory depth, we found that illusory depth determines perception following short interruptions (≤0.1 s), while the prior state of illusory rotation determines perception after long blank durations (≥0.5 s). However, the observed influence of the original illusory depth is not due to the sensory stabilization memory but is consistent with influence of neural persistence, the continued response of neurons after stimulus offset (Coltheart, 1980; Irwin & Thomas, 2008; Sperling, 1960), and can be curtailed by a stimulus mask. We find that the same neural persistence stabilizes perception of illusory rotation during classic intermittent presentation with short blank durations. Finally, we show that the representation that persists contains a static snapshot of the interpolated illusory shape and appears to be agnostic with respect to the properties or distribution of individual dots.
Taken together, our findings lead to two conclusions. First, they demonstrate that although illusory rotation and illusory depth are typically linked in perception, they are represented independently and are likely to have distinct neural correlates (although possibly within the same cortical region). Second, they confirm that during brief interruptions (<100–200 ms), competing neural populations persist and perceptual reversals rely on the same mechanisms as spontaneous switches during continuous viewing.
Procedures were approved by the medical ethics board of the Otto-von-Guericke Universität, Magdeburg: “Ethik-komission der Otto-von-Guericke-Universität an der Medizinischen Fakultät.” All participants had normal or corrected-to-normal vision. Apart from the authors, observers were naïve as to the purpose of the experiments and were paid for their participation. Six observers (3 females) participated in Experiments 1–4. Five observers (2 females) participated in Experiments 5 and 6.
Stimuli were generated with MATLAB using the Psychophysics Toolbox (Brainard, 1997) and were displayed on a CRT screen (Iiyama VisionMaster Pro 514, iiyama.com) with a spatial resolution of 1,600 × 1,200 pixels and a refresh rate of 100 Hz. The viewing distance was 73 cm, so that each pixel subtended approximately 0.019°. In all experiments, background luminance was kept at 36 cd/m², and environmental luminance at 80 cd/m².
Experiment 1: Presentation with constant physical motion
In our first experiment, we sought to replicate prior work on the intermittent presentation of multistable displays (Adams, 1954; Leopold et al., 2002; Orbach et al., 1963; Pastukhov & Braun, 2008; Pearson & Brascamp, 2008) using a different shape of the SFM stimulus: an illusory band. Unlike a sphere or a cylinder (the most commonly used SFM shapes), a band is not rotation symmetric and, therefore, has both well-defined ambiguous illusory rotation and ambiguous illusory depth.
Six observers (3 females) participated in Experiment 1.
The SFM stimulus consisted of 2,000 dots distributed randomly over the surface of an illusory band with a height of 5.7°. The diameter of the individual dots was 0.057°, and their luminance was 110 cd/m².
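To make the reported geometry concrete, the following minimal sketch converts the stimulus sizes from degrees of visual angle to screen pixels, using the reported value of approximately 0.019° per pixel. The function name and the linear (small-angle) conversion are illustrative assumptions, not the authors' MATLAB code.

```python
# Degrees of visual angle subtended by one pixel, as reported for this setup.
DEG_PER_PIXEL = 0.019


def deg_to_pixels(size_deg: float, deg_per_pixel: float = DEG_PER_PIXEL) -> int:
    """Convert a size in degrees of visual angle to whole pixels.

    Assumes a linear relation between angle and pixels, which is a
    reasonable approximation only for small angles near fixation.
    """
    return round(size_deg / deg_per_pixel)


band_height_px = deg_to_pixels(5.7)     # band height reported as 5.7 deg
dot_diameter_px = deg_to_pixels(0.057)  # dot diameter reported as 0.057 deg
print(band_height_px, dot_diameter_px)
```

Under these assumptions, the band is about 300 pixels high and each dot about 3 pixels across.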
Summary of the illusory band trajectories used in Experiment 1: (1) constant illusory depth and rotation; (2) reversed illusory depth and rotation.
Following each presentation, observers were prompted with a question mark to indicate the constancy of the illusory rotation. They pressed the left arrow key if illusory rotation remained constant during the entire trial (i.e., the illusory band rotated in the same direction during both T1 and T2). Alternatively, they pressed the right arrow key if illusory rotation reversed exactly once during the trial. If illusory rotation reversed more than once or if the illusory band “split” into two independently rotating half-rings, observers pressed the down arrow key (<1 % of trials had noncanonical perceptions and were discarded; for more information on noncanonical perceptions of SFM, please refer to Chen & He, 2004; Hol, Koene, & van Ee, 2003; Treder & Meulenbroek, 2010).
In our first experiment, the physical motion of the dots on the screen remained unperturbed; at the onset of the second presentation interval, T2, dots continued to move on the screen in the same direction as before (compare the motions of two example dots during intervals T1 and T2 in Fig. 2b). Here, illusory properties—depth and rotation—may both remain constant or may both change (Fig. 2b shows illusory rotation and depth in the top and bottom rows, respectively). When the illusory rotation and corresponding illusory depth remain constant on average (the survival probability of illusory depth and rotation, PD+R = PD = PR > .5), it reveals the influence of facilitatory effects that stabilize perception—for example, sensory stabilization memory (Adams, 1954; Leopold et al., 2002; Orbach et al., 1963; Pastukhov & Braun, 2008). If observers report frequent reversals of illusory rotation (PD+R < .5), this shows a negative history effect due, for example, to an accumulation of neural adaptation (Blake, Westendorf, & Fox, 1990; Kang & Blake, 2010; Nawrot & Blake, 1989; Pastukhov & Braun, 2011; Wolfe, 1984). Values of survival probability PD+R near .5 are ambiguous: They can be interpreted as a complete decay of all history traces or as a cancellation of the positive and negative history traces.
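The survival probability described above can be sketched as a simple proportion over valid trials. The key labels, the function names, and the width of the "ambiguous" band around .5 are hypothetical choices made for illustration; the example responses are made up and are not data from the study.

```python
from collections import Counter

# Hypothetical labels for the three response keys described in the procedure.
CONSTANT, REVERSED, NONCANONICAL = "left", "right", "down"


def survival_probability(responses):
    """Fraction of valid trials with constant illusory rotation.

    Noncanonical trials ("down") are discarded, as in the text.
    Returns None when no valid trial remains.
    """
    counts = Counter(responses)
    n_valid = counts[CONSTANT] + counts[REVERSED]
    if n_valid == 0:
        return None
    return counts[CONSTANT] / n_valid


def interpret(p, tolerance=0.05):
    """Qualitative reading of a survival probability.

    `tolerance` is an arbitrary illustrative band around .5.
    """
    if abs(p - 0.5) <= tolerance:
        return "ambiguous"
    if p > 0.5:
        return "stabilizing (positive history effect)"
    return "destabilizing (negative history effect)"


example = ["left", "left", "right", "left", "down"]  # made-up responses
p = survival_probability(example)  # 3 of the 4 valid trials are constant
print(p, interpret(p))
```

With unchanged physical motion, as in Experiment 1, this single proportion corresponds to PD+R, since depth and rotation can only survive or reverse together.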
Results for Experiment 1 are presented in Fig. 2c. In general, these results are consistent with previous reports. During the continuous presentation, observers almost never perceived a reversal in the illusory rotation [PD+R(uninterrupted) = .99 ± .006], due to a combination of the brief presentation duration (1.5 s) and the illusory shape used in the study (for details on how shape of the illusory object influences the probability of the spontaneous switch, please refer to Pastukhov et al., 2012). For nonzero blank durations, perception is very stable, with a survival probability PD+R remaining at .75 ± .04, even after 4 s. While this is consistent with earlier work showing that sensory stabilization memory is long-lasting (Leopold et al., 2002; Pastukhov & Braun, 2008), the observed decay is somewhat faster than in prior reports (approximately 10 s, instead of tens of seconds). Another notable difference between our results and prior work on intermittent presentation of multistable displays is the high perceptual stability for intermediate-length interruptions with Tblank ∈ [0.2, 0.5] s. In earlier reports, these intermediate blank durations have been found to destabilize illusory perceptions, with more switches reported than in the continuous presentation (Kornmeier & Bach, 2004; Orbach et al., 1963).
Both of these differences are likely to be explained by our presentation schedule, which was trial based rather than block based. First, in our case, the initial direction of physical motion was randomized between trials, which could have biased the initial direction of illusory rotation. As a result, a new illusory rotation state could be the opposite of the state on the previous trial and produce a sensory stabilization memory that has to compete against the “older” trace, making within-trial perception less stable. Second, in earlier studies, presentation and blank intervals followed each other in an orderly fashion, with observers reporting regularly on the perceived illusory state. In our case, each trial was preceded by a random onset delay (0.5–1.0 s) and followed by a response interval that terminated only after the response was obtained (mean response time was 0.6 ± 0.3 s). This means that the average blank interval was at least 0.8 s long, providing enough time for the visual system to recover from neural fatigue (Alais, Cass, O’Shea, & Blake, 2010; Kang & Blake, 2010; Pastukhov & Braun, 2011; van Ee, 2009).
Experiment 2: Presentation with reversed physical motion
In our second experiment, we sought to compare the results of Experiment 1 (where illusory depth and illusory rotation biased observers toward the same perception) to the situation in which they competed and biased alternative outcomes. To this end, we used a forced ambiguous switch paradigm: The physical motion of individual dots reversed at the onset of the second presentation interval, T2. This creates a conflict between the new physical motion and the initial illusory states, so that one of them has to be adjusted (switched). Accordingly, it allows us to study competition between history traces of illusory depth and illusory rotation.
Six observers (3 females) participated in Experiment 2.
The stimulus was identical to that in Experiment 1.
Summary of the illusory band trajectories used in Experiment 2: (1) constant illusory depth with reversed illusory rotation; (2) reversed illusory depth with constant illusory rotation.
To examine the competition of history traces for illusory depth and illusory rotation, we modified the procedure of Experiment 1 in one important way. At the beginning of the second presentation interval T2, all dots reversed their physical motion (a forced ambiguous switch paradigm; see the two example dots in Fig. 3a). This makes the original illusory rotation and depth incompatible with the new physical motion, and it becomes necessary to alter one (but not both) of the illusory properties to resolve the conflict. If an observer reports that the illusory rotation reversed during a trial, it implies that the illusory depth remained constant (see Fig. 3a, illusory rotation and depth, top row). Conversely, a constant illusory rotation implies that only the illusory depth of the band reversed (Fig. 3a, illusory rotation and depth, bottom row).
Because memory traces for the illusory depth and the illusory rotation compete, there are two corresponding survival probabilities that complement each other to unity: PD + PR = 1. Here, we chose to plot the probability of survival of illusory depth (Fig. 3b, PD, red open circles and the left vertical axis). High values of PD > .5 mean that perception tended to follow the original illusory depth, and the illusory rotation was reversed. Conversely, low values of PD < .5 mean that perception followed the original illusory rotation, and illusory depth was instead reversed. Values of PD near .5 are ambiguous and can indicate either a lack of any history effect or that memories of illusory depth and illusory rotation cancel each other out.
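The logic that links a rotation report to the implied state of illusory depth can be written out explicitly. The sketch below is an illustration under stated assumptions: reports are coded as booleans (True means that rotation was reported as constant), and all function names are hypothetical.

```python
def illusory_outcome(physical_motion_reversed, rotation_reported_constant):
    """Return (rotation_constant, depth_constant) implied by a rotation report.

    With unchanged physical motion, depth and rotation change together.
    With reversed physical motion, exactly one of the two must change.
    """
    rotation_constant = rotation_reported_constant
    if physical_motion_reversed:
        depth_constant = not rotation_constant
    else:
        depth_constant = rotation_constant
    return rotation_constant, depth_constant


def depth_and_rotation_survival(reports, physical_motion_reversed):
    """Survival probabilities (P_D, P_R) over a list of boolean reports."""
    outcomes = [illusory_outcome(physical_motion_reversed, r) for r in reports]
    n = len(outcomes)
    p_rotation = sum(rot for rot, _ in outcomes) / n
    p_depth = sum(dep for _, dep in outcomes) / n
    return p_depth, p_rotation


reports = [True, False, False, False]  # made-up reports, not study data
p_d, p_r = depth_and_rotation_survival(reports, physical_motion_reversed=True)
print(p_d, p_r, p_d + p_r)
```

For reversed physical motion the two probabilities sum to one by construction, which is why plotting PD alone is sufficient.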
The results of Experiment 2 are plotted in Fig. 3b (PD, red open circles and the left vertical axis). To facilitate comparison, the results of Experiment 1 are replotted (blue circles and inverted right vertical axis; values positioned lower correspond to higher survival probabilities of illusory depth and rotation and, correspondingly, to higher stability of perceptions). When presentation was uninterrupted, illusory depth remained constant, consistent with previous observations (Pastukhov et al., 2012). For long blank durations (Tblank ≥ 0.5 s), perception followed the memory of illusory rotation instead. Interestingly, there was no significant difference between the survival probability of the illusory depth alone (PD in the present condition) and the survival probability of the illusory depth and rotation together (PD+R) in Experiment 1 (Fig. 3b; paired t-test, all ps > .28). This suggests that the sensory stabilization memory of the illusory depth has little influence on the perception, at least for the paradigm used and for the chosen blank intervals. Short blank durations (Tblank ∈ [0.1, 0.2] s) fall somewhere between the two extreme cases (continuous motion and long blank durations), with longer blank intervals leading to a stronger influence of the original state of the illusory rotation.
The constancy of the illusory depth during an uninterrupted presentation is likely to be explained by the ecological validity of involved transformations. Pastukhov et al. (2012) have recently shown that even when dealing with the ambiguous illusory states, the visual system takes into account the ecological plausibility of transitions between them. During the forced ambiguous switch, a reversal of the illusory rotation is preferred over the inversion of the illusory depth (this is true only when the object is not depth symmetric, as in the present experiment; for further details, please refer to Pastukhov et al., 2012; Stonkute et al., 2012). However, these transformation constraints are unlikely to be relevant if there are interruptions in the perception of the illusory object. When the object is occluded, it is conceivable that it may continue to rotate; it may be replaced by another object, and so forth. No particular transition path can be assumed. As a result, perception is determined by the perceptual memory of the illusory rotation.
Importantly, this makes it possible to determine whether a particular interruption in the presentation is long enough to ensure that the activity of both competing neural populations has returned to the baseline. In our case, perceptual outcomes for short blank intervals of Tblank = 0.1 s suggest that a representation of the dominant illusory depth still persists, keeping the “ecological plausibility” constraint relevant. The most likely mechanism behind this effect is neural persistence, in which neurons continue to respond after stimulus offset (Coltheart, 1980; Irwin & Thomas, 2008; Sperling, 1960). The observed effective time range (<0.5 s) is consistent with time constants reported for both static (Coltheart, 1980; Irwin & Thomas, 2008; Sperling, 1960) and motion-defined (Demkiw & Michaels, 1976; Shioiri & Cavanagh, 1992) displays.
Experiment 3: Effect of masking on persistence of illusory depth
The purpose of our third experiment was to confirm or reject the involvement of neural persistence in the stabilization of illusory depth during brief blank intervals. Unlike sensory stabilization memory (Maier et al., 2003), neural persistence is sensitive to masking (Irwin & Thomas, 2008; Loftus, Johnson, & Shimamura, 1985; Sperling, 1963). Here, we examined whether the presence of a mask would disrupt the persistence of illusory depth during a short blank interval (Tblank = 0.1 s).
Six observers (3 females) participated in Experiment 3.
The main stimulus was identical to that in Experiment 1. The mask stimulus was a yellow sphere (diameter of 5.7°) that consisted of 500 dots (diameters of 0.057°). The sphere color helped differentiate it from the main stimulus. The mask rotated around the horizontal axis, orthogonal to the rotation axis of the main stimulus (the white illusory band rotated around the vertical axis).
To examine the effect of masking on the stability of illusory depth during short blank intervals, we used a slightly modified version of the procedure from Experiment 2. Here, each blank interval contained a mask, a yellow illusory sphere that rotated ambiguously around the horizontal axis (whereas the main illusory band stimulus rotated around the vertical axis; see Fig. 4a and Movie 8). If persistence of illusory depth depends on neural persistence, this mask should destabilize it (Irwin & Thomas, 2008; Loftus et al., 1985; Sperling, 1963). In contrast, if persistence of illusory depth depends on other types of memory, such as neural adaptation and sensory stabilization memory, we should observe no effect (Maier et al., 2003; Nawrot & Blake, 1989). To minimize confusion, the mask was a yellow illusory sphere, and observers were instructed to report only on the stability of the illusory rotation of the white illusory band.
The results of Experiment 3 are plotted in Fig. 4b (filled red circles). To facilitate comparison, we also replotted the results of Experiment 2 (open red circles). The presence of the orthogonally rotating sphere significantly destabilized illusory depth (paired t-test, t(5) = 5.91, p = .002), and perceptual reports were similar to those for the long blank intervals in Experiment 2. This indicates that the mask was indeed effective in curtailing the neural persistence of illusory depth.
Experiment 4: Effect of masking on persistence of illusory rotation
In our fourth experiment, we examined whether neural persistence is also responsible for the stabilization of illusory rotation in SFM. To this end, we used a classic block-based intermittent presentation of an ambiguously rotating sphere (instead of the illusory band) and examined whether masking strongly destabilized its illusory rotation during brief interruptions.
Six observers (3 females) participated in Experiment 4.
The main stimulus was a sphere (diameter of 5.7°), comprising 500 randomly distributed dots (diameters of 0.057°). It was white and rotated around the vertical axis. The mask stimulus was a yellow illusory sphere rotating around the horizontal axis (see the Method section of Experiment 3 for details). The mask was presented for 50 ms in the middle of the blank interval.
Observers continuously reported on the perceived direction of rotation of the main stimulus: the left arrow key was pressed if the front surface rotated to the left, the right arrow key was pressed if the front surface rotated to the right, and the down arrow key was pressed for unclear percepts (<1 % of trials).
To examine the role of neural persistence in the stabilization of illusory rotation, we combined the classic intermittent presentation schedule with masking. The relationship between the switching rate and blank durations is nonmonotonic: The switching rate increases as the blank duration increases up to ~400–600 ms but then decreases, and perception is profoundly stabilized, for blank intervals longer than 1 s (Klink et al., 2008; Kornmeier & Bach, 2004; Noest et al., 2007; Orbach et al., 1963). Taking into account our observations, this inverted-U curve can be divided into three parts: an early part (0 ms < Tblank < ~500 ms) in which neural persistence stabilizes perception and neural fatigue destabilizes perception (Alais et al., 2010; Kang & Blake, 2010; Nawrot & Blake, 1989; Pastukhov & Braun, 2011; van Ee, 2009; Wolfe, 1984), a middle part (~500 ms < Tblank < ~1,000 ms) in which neural persistence has already decayed and perception is destabilized by neural fatigue and stabilized by sensory stabilization memory, and a late part (Tblank > ~1,000 ms) in which the influence of neural adaptation is minimal and perception is strongly stabilized by sensory stabilization memory (Adams, 1954; Leopold et al., 2002; Orbach et al., 1963; Pastukhov & Braun, 2008; Pearson & Brascamp, 2008).
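The proposed three-part division can be summarized as a lookup from blank duration to the influences assumed to dominate. This is only a sketch of the hypothesis as stated: the limits are approximate in the text (~500 ms and ~1,000 ms), and assigning a duration that falls exactly on a limit to the later regime is an arbitrary choice made here.

```python
def blank_regime(t_blank_ms, early_limit_ms=500.0, late_limit_ms=1000.0):
    """Classify a blank duration into the three hypothesized regimes.

    The default limits mirror the approximate values in the text and are
    not sharp boundaries.
    """
    if t_blank_ms <= 0:
        raise ValueError("blank duration must be positive")
    if t_blank_ms < early_limit_ms:
        return {
            "regime": "early",
            "stabilizing": "neural persistence",
            "destabilizing": "neural fatigue",
        }
    if t_blank_ms < late_limit_ms:
        return {
            "regime": "middle",
            "stabilizing": "sensory stabilization memory",
            "destabilizing": "neural fatigue",
        }
    return {
        "regime": "late",
        "stabilizing": "sensory stabilization memory",
        "destabilizing": "minimal influence of neural adaptation",
    }


for t in (100, 600, 4000):
    print(t, blank_regime(t)["regime"])
```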
To test this hypothesis, we investigated whether the same masking paradigm would disrupt neural persistence and destabilize illusory rotation during classic intermittent presentations with short blank intervals. We compared the switching rate of the illusory rotation when the illusory sphere was presented continuously (baseline condition), intermittently, or intermittently with a masking stimulus (Fig. 5a). The main stimulus was a white illusory sphere rotating around the vertical axis. We used a sphere here, instead of a band shape, because a sphere is rotationally symmetric and does not restrict spontaneous switches to a specific range of rotation angles (Pastukhov et al., 2012). The main stimulus was presented during the TON interval (TON = 1 s). During the TOFF interval (TOFF ∈ {0.1, 0.2, 0.4, 1} s), the screen either remained blank (blank condition) or contained a mask stimulus presented for 50 ms in the middle of the TOFF interval. As in Experiment 3, the mask was a yellow sphere rotating around the horizontal axis. Unlike in Experiments 1–3, the presentation schedule was block based and consisted of 60 TON and 59 TOFF intervals (Fig. 5a). Observers reported continuously on the dominant direction of illusory rotation, with no additional time given for responses (left and right arrow keys to indicate direction; down arrow key for unclear percepts, <1 % of total trials).
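The block structure described above can be laid out as a timeline of events. The sketch below is a hypothetical reconstruction, not the authors' code: it works in integer milliseconds to avoid rounding drift, and the event labels and function name are illustrative.

```python
def build_schedule(t_on_ms=1000, t_off_ms=400, n_on=60, mask=False, mask_ms=50):
    """Return a list of (label, start_ms, end_ms) events for one block.

    n_on stimulus intervals are interleaved with n_on - 1 off intervals.
    When mask is True, a mask of mask_ms is centered in each off interval.
    """
    events = []
    t = 0
    for i in range(n_on):
        events.append(("stimulus", t, t + t_on_ms))
        t += t_on_ms
        if i == n_on - 1:
            break  # no off interval after the last stimulus interval
        if mask:
            pad = (t_off_ms - mask_ms) // 2  # blank time before the mask
            events.append(("blank", t, t + pad))
            events.append(("mask", t + pad, t + pad + mask_ms))
            events.append(("blank", t + pad + mask_ms, t + t_off_ms))
        else:
            events.append(("blank", t, t + t_off_ms))
        t += t_off_ms
    return events


block = build_schedule(t_off_ms=100, mask=True)
first_mask = next(e for e in block if e[0] == "mask")
print(first_mask, block[-1])
```

With TOFF = 0.1 s, the first mask in this sketch runs from 1,025 to 1,075 ms, and the block ends at 65,900 ms (60 × 1,000 ms of stimulus plus 59 × 100 ms of off time).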
As a measure of perceptual stability, we used the number of perceptual reversals per minute of the presentation time (ΣTON, excluding intervals TOFF): Rate = Nreversals/ΣTON. The results from intermittent presentations without the mask stimulus replicate findings from earlier studies (Fig. 5b, filled circles). The switching rate initially increased, as compared with the continuous presentation, and was highest for blank intervals of 0.4 s. For the 1-s blank duration, we observed the expected perceptual stabilization.
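The rate measure defined above can be sketched as follows. The representation of the reports as one dominant percept per TON interval is a simplifying assumption (observers in the experiment reported continuously), and the percept labels and example sequence are made up.

```python
def count_reversals(percepts):
    """Count changes between consecutive clear percepts.

    Unclear percepts (anything other than "left" or "right") are skipped.
    """
    clear = [p for p in percepts if p in ("left", "right")]
    return sum(1 for a, b in zip(clear, clear[1:]) if a != b)


def switching_rate_per_minute(n_reversals, n_on_intervals=60, t_on_s=1.0):
    """Reversals per minute of presentation time, excluding off intervals."""
    total_on_min = n_on_intervals * t_on_s / 60.0
    return n_reversals / total_on_min


percepts = ["left", "left", "right", "unclear", "right", "left"]  # made-up
n = count_reversals(percepts)
print(n, switching_rate_per_minute(n))
```

Because a block contains 60 TON intervals of 1 s each, the total presentation time is exactly one minute, so the rate per block equals the raw number of reversals.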
The presence of the orthogonally rotating mask created a dependence in which the switching rate decreased monotonically as the blank duration increased (Fig. 5b, open circles). This is compatible with our hypothesis that masking abolishes the neural persistence of the illusory rotation. Here, competition for the perceptual dominance at the stimulus onset is dominated by neural fatigue, leading to a significantly higher switching rate (t(4) = −10.8, p = .004, for Tblank = 0.1 s; t(4) = −5.2, p = .0065, for Tblank = 0.2 s). We conclude that neural persistence stabilizes ambiguous perception during brief interruptions in presentation.
Experiment 5: Effect of stimulus changes on persistence of illusory depth
In Experiment 5, we sought to examine the specificity of the illusory depth representations that persist during short blank intervals. We were interested in whether persistence depends on stable representation at the level of individual dots or on the representation of the interpolated illusory shape. To this end, we measured the stability of illusory depth after we modified nonmotion properties of the dots (color and size), motion and location of the dots (by using a smaller shape), or the shape itself (by using either a uniform sphere with a stripe or a drum volume). As a control, we used an unmodified illusory band shape.
Five observers (2 females) participated in Experiment 5.
Descriptions of the stimuli used during the second presentation interval, T2, in Experiment 5 (stimulus: description of alterations)
Control
- Original band: main stimulus, unaltered.
Nonmotion dot attributes
- Color of dots: same as original band, but colored red.
- Size of dots: same as original band, but dots were twice the original size (0.114° instead of 0.057°).
Motion dot attributes
- Object size: same as original band, but half the size (height of 2.85° instead of 5.7°).
Interpolated illusory object
- Sphere with a color stripe: a sphere consisting of 2,000 dots uniformly distributed on its surface. Dots were colored white, except for a red stripe with dimensions identical to the original band. Due to the difference in shape between the sphere and the illusory band, the red stripe was included to ensure a similar axis of symmetry.
- Drum volume: a shape that was created by adding side surfaces to the original band, turning it into a volume with dimensions and symmetry similar to the original band.
The procedure was identical to that in Experiment 2 (Fig. 6a), aside from the stimuli used during the second presentation interval T2 (see above). A single blank duration was used, with Tblank = 0.1 s.
- 1) Nonmotion dot attributes. It is possible that all aspects of the scene must remain constant, including nonmotion properties of dots. In this case, a change in their color or size should destabilize illusory depth.
- 2) Motion dot attributes. Since the motion of individual dots is used to interpolate the illusory shape (Treue, Andersen, Ando, & Hildreth, 1995), it may be necessary for the motion and location of individual dots to remain constant. In this case, persistence of illusory depth should be disrupted if the size of the illusory object is changed, altering the location and the motion of individual dots. This manipulation should not affect the representation of the interpolated illusory object itself, since neurons that are selective for SFM can tolerate large changes in the stimulus size (Mysore, Vogels, Raiguel, Todd, & Orban, 2010).
Object size: the second stimulus was the original band (presented during interval T1), but reduced to half the original size (2.85° instead of 5.7°) (see Movie 11).
- 3) Interpolated illusory object. Finally, it is possible that only the 3-D-interpolated illusory object must remain constant, so that only changes to the shape itself disrupt the persistence of illusory depth. Note that this would differentiate neural persistence from other types of the sensory memory, since neither neural fatigue nor sensory stabilization memory of SFM is selective for object shape (Maier et al., 2003; Nawrot & Blake, 1989).
Sphere with a color stripe: the second stimulus was a uniform white sphere, with a stripe of red dots that had dimensions identical to those of the original band. Although the sphere has a different shape than the original band, the red stripe ensured a similar axis of symmetry (see Movie 12).
The drum volume: the second stimulus (the drum) was created by adding side surfaces to the original band, turning it into a volume (instead of a surface) with identical dimensions and symmetry (see Movie 13).
A priori, the prime candidate for the basis of neural persistence of illusory depth was the interpolated illusory object representation. Earlier work on SFM displays showed that the stability of perception depends primarily on the constancy of the interpolated illusory object (the surface interpolation hypothesis; Treue et al., 1995). Even large changes to the motion and location of individual dots, such as limiting their lifetimes (Brouwer & van Ee, 2006) or reversing their physical motions (Li & Kingdom, 1999; Pastukhov et al., 2012; Petersik & Dannemiller, 2004; Stonkute et al., 2012; Zivotofsky & Goldstein, 2007), may not destabilize perception as long as the illusory shape remains the same.
The results of the present experiment confirm our hypothesis (Fig. 6c). Changes to either the nonmotion or the motion attributes of the dots slightly destabilize illusory depth, although this effect does not reach significance in our group of 5 observers (two-sample t tests, t(13) = 0.99, p = .34, and t(8) = 1.81, p = .107, for the nonmotion and motion conditions, respectively). In contrast, changes to the shape itself strongly destabilize illusory depth (shape condition, t(13) = 4.49, p = .0006; the significance level after Bonferroni correction for multiple comparisons was α = .05/3 = .0167).
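The multiple-comparison logic can be illustrated with a minimal sketch. It only re-applies the Bonferroni threshold to the three p values reported in the text; the function name and the condition labels are illustrative and are not taken from the original analysis.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return the Bonferroni-corrected threshold and a per-test significance flag."""
    threshold = alpha / len(p_values)  # alpha divided by the number of comparisons
    flags = {name: p < threshold for name, p in p_values.items()}
    return threshold, flags


# p values as reported in the text for the three conditions
reported = {"nonmotion": 0.34, "motion": 0.107, "shape": 0.0006}

threshold, flags = bonferroni_significant(reported)
print(round(threshold, 4), flags)
```

Under this threshold only the shape condition counts as significant, in line with the conclusion drawn in the text.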
The dependence of illusory depth stability on the shape of the illusory object differentiates it from illusory rotation. In general, the only changes that destabilize illusory rotation are those affecting the location and axis of rotation of the object (but, curiously, not the speed of rotation) (Chen & He, 2004; Maier et al., 2003). Changes to the object itself (shape, size, and color) do not destabilize it. This suggests that the representation of illusory rotation and its respective memory trace are independent of the representation of the illusory shape itself.
In contrast, the representation of illusory depth appears to be tightly linked with the representation of illusory shape. Most notable here is a strong destabilization of illusory depth when the shape is changed from an illusory band (surface) to an illusory drum (volume). The applied changes are minimal: Additional side surfaces do not alter the axis of symmetry, overall shape, or dimensions of the object. This shows that in contrast to illusory rotation, the representation of illusory depth is not abstracted away from an object—for example, via a link with its axis of symmetry.
Experiment 6: Persistence of illusory depth shows no evidence of dynamic update
In our final experiment, we examined whether a persisting representation of illusory depth is dynamically updated during the blank period in which the stimulus is not visible. It is possible that a dynamic update would advance the representation in the direction of rotation. Experiments on spatiotemporal relatability show that the representation of a partially occluded object is dynamically updated through time (Palmer, Kellman, & Shipley, 2006). In the tunnel effect, a representation of an object is maintained and dynamically updated while it moves behind an occluder (Flombaum & Scholl, 2006; Kawachi & Gyoba, 2006). In our case, this would produce a growing mismatch between a persisting representation and the actual stimulus, since the initial angle of rotation at the onset of T2 was always identical to the final one of interval T1. To test this hypothesis, we measured the stability of illusory depth using different starting angles of rotation for interval T2.
Five observers (2 females) participated in Experiment 6.
The stimulus was identical to that in Experiment 1.
Table: Angles of rotation used during the second presentation interval, T2, in Experiment 6, following the clockwise illusory rotation in T1

| T2, constant illusory depth/reversed illusory rotation | T2, reversed illusory depth/constant illusory rotation |
| --- | --- |
| 18° → 63° | 342° → 297° |
| 27° → 72° | 333° → 288° |
| 36° → 81° | 324° → 279° |
| 40.5° → 85.5° | 319.5° → 274.5° |
| 45° → 90° | 315° → 270° |
| 49.5° → 94.5° | 310.5° → 265.5° |
| 54° → 99° | 306° → 261° |
| 63° → 108° | 297° → 252° |
| 72° → 117° | 288° → 243° |
To examine whether the representation of illusory depth is dynamically updated while the object is not visible, we repeated Experiment 2 using different starting angles of rotation at the beginning of the second presentation interval, T2. Possibilities for the reappearance of the band were the following: in the same orientation as at the end of T1 (Fig. 7, top row; the 0°/0 ms condition replicates Experiment 2); as it would appear if it kept rotating in the same direction and with the same speed for 50, 100, 200, or 300 ms (positive angles of rotation and time intervals in Fig. 7); or as it would appear if, during the blank, it rotated with the same speed in the opposite direction for 50, 100, 200, or 300 ms (negative angles of rotation and time intervals in Fig. 7). If the representation were dynamically updated during the invisibility period, we should observe fewer illusory rotation reversals when the illusory band appears at “correctly advanced” angles of rotation (i.e., advanced by 9° when Tblank = 100 ms or by 18° when Tblank = 200 ms, marked with filled circles). However, if the representation were static, the peak of the responses should be located near the original angle of rotation (0° and 0 ms), and any deviation, due to either positive or negative advances in time, should be equally disruptive.
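The relation between the time offsets and the T2 angles listed in the table of angles can be sketched as follows. This is an illustrative reconstruction, not the authors' stimulus code: the rotation speed of 90°/s is inferred from the stated 9° advance per 100 ms, and the 45° starting angle at 0 ms, the 45° sweep per interval, and the pairing of table rows with offsets are inferred from the table itself. All names are hypothetical.

```python
ROTATION_SPEED_DEG_PER_S = 90.0  # inferred: 9 deg per 100 ms
BASE_START_DEG = 45.0            # inferred: starting angle in the 0 ms condition
SWEEP_DEG = 45.0                 # inferred: each T2 interval in the table spans 45 deg

OFFSETS_MS = [-300, -200, -100, -50, 0, 50, 100, 200, 300]


def t2_angles(offset_ms):
    """Return (start, end) angles for both T2 alternatives at a given time offset."""
    advance = ROTATION_SPEED_DEG_PER_S * offset_ms / 1000.0
    start = BASE_START_DEG + advance
    end = start + SWEEP_DEG
    constant_depth = (start, end)                  # reversed illusory rotation
    reversed_depth = (360.0 - start, 360.0 - end)  # constant illusory rotation
    return constant_depth, reversed_depth


for offset in OFFSETS_MS:
    print(offset, t2_angles(offset))
```

With these assumptions, the nine offsets reproduce the nine rows of the table, with the second column being the first one mirrored about 0°/360°.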
The results of this experiment are presented in the lower row of Fig. 7, with orange marking the Tblank = 100 ms condition and green marking the Tblank = 200 ms condition. We find no evidence of dynamic updating for either blank duration. The means of the distributions do not significantly differ from zero: μ100 = −3, t-test with p = .85; μ200 = 3.5, t-test with p = .83. Moreover, the distributions are highly symmetric, showing no skewness to the right (γ100 = 0.034, γ200 = 0.028). We conclude that the persistence of illusory depth depends on a static representation of the interpolated object.
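The two descriptive statistics used here, the mean and the skewness of a response distribution over time offsets, can be sketched as below. How the authors computed them is not stated, so this is only one plausible formulation (weighted moments). The weights are hypothetical placeholder values chosen to be perfectly symmetric; they are not the measured data.

```python
def weighted_mean_and_skewness(offsets, weights):
    """Weighted mean and skewness (third standardized moment) of a distribution."""
    total = sum(weights)
    mean = sum(o * w for o, w in zip(offsets, weights)) / total
    variance = sum(w * (o - mean) ** 2 for o, w in zip(offsets, weights)) / total
    third_moment = sum(w * (o - mean) ** 3 for o, w in zip(offsets, weights)) / total
    return mean, third_moment / variance ** 1.5


offsets_ms = [-300, -200, -100, -50, 0, 50, 100, 200, 300]
# Hypothetical, perfectly symmetric weights peaking at 0 ms (placeholder values only)
toy_weights = [1, 2, 3, 4, 5, 4, 3, 2, 1]

mean, skew = weighted_mean_and_skewness(offsets_ms, toy_weights)
print(mean, skew)
```

A static representation predicts a distribution of this kind, centered on 0 ms with skewness near zero, whereas dynamic updating would shift the mean toward positive offsets or skew the distribution to the right.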
Illusory depth and illusory rotation in SFM may be dissociated using a forced ambiguous switch paradigm. We have examined how the respective memory traces of illusory depth and rotation cooperate or compete following an interruption in the presentation. Our results show qualitatively different effects for uninterrupted presentations and presentations with brief interruptions (Tblank = 0.1 s), as compared with presentations with long interruptions (Tblank ≥ 0.5 s). In the former situations, the memory of illusory depth dominates, in that the illusory depth is typically maintained over the interruption (while the illusory rotation is reversed). In the latter situation, the memory of the illusory rotation dominates, in that the illusory rotation is typically maintained (while the illusory depth is reversed). We find no influence of a depth memory after long interruptions, since the relative likelihoods of the two illusory percepts are the same regardless of which percept is favored by depth memory. The constancy of illusory depth over brief interruptions is ensured by neural persistence. When this is curtailed by masking, illusory depth is strongly destabilized, just as with long interruptions. The stabilizing effect of neural persistence for brief interruptions in presentations of SFM is not specific to the forced ambiguous switch paradigm or the illusory depth. When a mask is presented during the off-intervals of the classic intermittent presentation with a spherical SFM shape, illusory rotation is also dramatically destabilized for brief interruptions. Finally, we showed that neural persistence, which ensures the stability of illusory depth during brief interruptions, relies on a static representation of the interpolated illusory object.
Independent representations of illusory depth and illusory rotation
Our results provide additional support for the idea that illusory depth and illusory rotation are represented independently despite being linked and constrained by the same physical motion (see the Introduction and Fig. 1 for a detailed explanation). If we assume that a single neuronal population encodes a unique combination of illusory depth and illusory rotation, the forced ambiguous switch paradigm used in Experiment 2 should show no specific dependence on the past, since any two new combinations of illusory depth and rotation are equally different from the original one. Instead, we find that perception is determined either by the neural persistence of illusory depth alone (short interruptions) or by the sensory stabilization memory of illusory rotation alone (long interruptions). Moreover, in contrast to illusory rotation, the persistence of illusory depth depends on the constancy of the illusory shape (Experiment 5).
This result is consistent with earlier research on transition probabilities in SFM (Pastukhov et al., 2012). The observed probability distribution of the spontaneous switch (which involves a simultaneous switch of both illusory depth and rotation; see Fig. 1) was well approximated by a product of the probability distributions of illusory depth and illusory rotation. This strongly suggests that illusory depth and illusory rotation are represented independently and their transitions are governed by two independent random processes.
The independence of illusory depth and illusory rotation suggests that their respective neural correlates may be found in different cortical areas. Alternatively, different subpopulations within one cortical region may represent both properties independently. Prior imaging studies have demonstrated that perception of SFM elicits an activation in a distributed network of occipital and extrastriate areas, including the hMT/V5 complex, the ventral and dorsal intraparietal sulci (VIPS and DIPS), the lateral occipital sulcus (LOS), the superior temporal sulcus (STS), and areas V2 and V3A (Beer, Watanabe, Ni, Sasaki, & Andersen, 2009; Brouwer & van Ee, 2007; Orban, 2011; Paradis et al., 2000; Peuskens et al., 2004; Vanduffel et al., 2002). Accordingly, at the moment, it is impossible to reliably establish the exact relationship between these perceptual components, their respective memory traces, and the neural correlates of SFM. Further research, using the forced ambiguous switch paradigm or other procedures that dissociate illusory depth and illusory rotation, is necessary to shed more light on this issue.
Influence of history on the perception of multistable displays
One reason multistable displays have attracted scientific attention since the 19th century (Necker, 1832; von Helmholtz, 1866; Wheatstone, 1838) is that they are particularly revealing about even minor influences on perception. Prior research has uncovered a plethora of forces that bias perception in favor of one of the outcomes. Attention has the most versatile influence and can alter the balance between competing percepts (Chong, Tadin, & Blake, 2005; Meng & Tong, 2004; Mitchell, Stoner, & Reynolds, 2004) or influence the overall switching rate (Brouwer & van Ee, 2006; Pastukhov & Braun, 2007; Struber & Stadler, 1999). Factors that create a constant shift of balance in favor of a specific state are, to name a few, the observer-specific bias (Carter & Cavanagh, 2007; Medith, 1967; Shannon, Patrick, Jiang, Bernat, & He, 2011), stimulus properties (Dosher, Sperling, & Wurst, 1986; Levelt, 1965; Nawrot & Blake, 1991; Orbach et al., 1963), and the context (Fang & He, 2004; Mudrik, Deouell, & Lamy, 2011). Forces that reflect the viewing history of multistable displays are neural fatigue (Alais et al., 2010; Kang & Blake, 2010; Pastukhov & Braun, 2011; van Ee, 2009), perceptual memory (Adams, 1954; Leopold et al., 2002; Orbach et al., 1963; Pastukhov & Braun, 2008; Pearson & Brascamp, 2008), cue recruitment (Harrison & Backus, 2010, 2012), and expectations-driven perception (Denison, Piazza, & Silver, 2011; Maloney, Dal Martello, Sahm, & Spillmann, 2005).
Finally, our results show that neural persistence, the continued response of neurons after stimulus offset (Coltheart, 1980; Irwin & Thomas, 2008; Sperling, 1960), also needs to be taken into account when interruptions are brief. Its role in multistable perception was originally demonstrated by O’Shea and Crassini (1984), using temporally interleaved dichoptic displays. Its importance was also highlighted by experiments that used an unambiguous priming stimulus to induce “flash facilitation” in binocular rivalry (Brascamp, Knapen, Kanai, van Ee, & van den Berg, 2007).
Here, the forced ambiguous switch paradigm was used to reveal the influence of neural persistence, and masking was used to demonstrate that its disruption leads to a destabilization of multistable perception.
One curious aspect of our findings is that in Experiment 2, only illusory depth is stabilized during brief interruptions, while long blank intervals reveal an influence of sensory stabilization memory on illusory rotation alone. It is unlikely that no neural persistence and no sensory stabilization memory are generated by illusory rotation and illusory depth, respectively. When illusory depth is irrelevant, as in Experiment 4, the neural persistence of illusory rotation clearly stabilizes perception (Fig. 5b). It is likely that in the forced ambiguous switch paradigm, its influence is diminished due to the conflict with illusory depth and concerns over the ecological validity of the transformations involved (the latter gives strong preference for the stable perception of illusory depth; see the Results section and Pastukhov et al., 2012, for details).
The absence of sensory stabilization memory for illusory depth after long interruptions is more puzzling, since we have no alternative condition that would reveal its existence. In principle, it should be present: All ambiguous displays, in contrast to the nonambiguous ones, reliably produce sensory stabilization memory traces for the feature in conflict (Pastukhov & Braun, 2012; Sterzer & Rees, 2008). Similarly, if several illusory properties compete, they each leave a memory trace (Pearson & Clifford, 2004). One possible explanation is the difference between exposure times of illusory depth and rotation: Strengths of perceptual memory traces strongly depend on the dominance time of the particular illusory state (Pastukhov & Braun, 2008, 2012). In our case, the direction of illusory rotation remains constant for 1 s during the first presentation interval, T1. In contrast, the illusory depth of a shape constantly changes (since the illusory band is rotationally asymmetric), and its particular state is dominant for several hundreds of milliseconds at most. It is more likely that the dominance of a particular illusory depth configuration lasts only tens of milliseconds, resulting in a very weak memory trace. If this is the case, a perceptual memory of illusory depth can be revealed when illusory depth is artificially stabilized near a particular orientation, while illusory rotation is destabilized to reverse the strength of corresponding memory traces.
Finally, our results show that neural persistence and sensory stabilization memory rely on different neural mechanisms and different neural populations. As we stated above, in Experiment 2, the relative strength of neural persistence and sensory stabilization memory reverses over time. Also, in contrast to sensory stabilization memory, neural persistence can be disrupted by masking (Experiments 3 and 4). Taken together, these findings suggest that future imaging studies should be expected to reveal that the neural correlates of sensory stabilization memory are different from the neural correlates of competing perceptual states.
Neural persistence of competing states during interruptions in the presentation
Our results also help explain the differences between perceptual switches during intermittent presentations with short and long interruptions. It has been suggested previously that perceptual reversals during brief interruptions (less than 400–500 ms) rely on the same mechanisms as spontaneous switches during continuous presentation. Our results indicate that, during such brief interruptions, there is not enough time for the activity of competing neuronal populations to completely decay. Accordingly, the percept choice at the next presentation onset is dependent on a persisting neural representation of competing states, just as during spontaneous switches. In contrast, when the activity of competing neural populations decays completely during longer blank intervals, the presentation that follows can trigger a new percept choice, just as the very first presentation of a session triggers an initial percept choice.
The original evidence for this distinction comes from psychophysics and EEG studies. Behavioral studies show that the relationship between the blank interval duration and the switching rate is nonmonotonic and has an inverted-U shape. Initially, the switching rate increases, reaching its maximum around blank intervals of 400–600 ms (Kornmeier & Bach, 2004; Noest et al., 2007; Orbach et al., 1963) (the duration of the on-interval has only a marginal effect; Klink et al., 2008), while longer blank intervals lead to a progressively more stable perception. Accordingly, it was hypothesized that these two parts of the curve may correspond to qualitatively different processing (Kornmeier & Bach, 2012).
More direct evidence comes from EEG studies. If ambiguous displays are presented intermittently, one observes a reversal positivity that occurs approximately 130 ms after the stimulus onset. However, this is true only for relatively brief blank intervals of 10–600 ms (Britz, Landis, & Michel, 2009; Britz & Pitts, 2011; Kornmeier & Bach, 2005, 2012; Kornmeier, Ehm, Bigalke, & Bach, 2007). It is absent for the long blank durations associated with perceptual stabilization (O’Donnell, Hendler, & Squires, 1988). Again, this suggests that the mechanisms behind perceptual reversals during brief interruptions are more similar to those behind spontaneous switches during continuous presentation than to perceptual switches following long interruptions.
The results of Experiment 2 help to further clarify this issue. Here, an uninterrupted presentation of the forced ambiguous switch paradigm leads to a reliable persistence of illusory depth, while blank intervals of 1 s (a clear case of truly intermittent presentation) favor illusory rotation (see Fig. 4). Since outcomes of the stimulus manipulation are qualitatively different for these two extreme cases, they can be used as indicators of whether the activity of neural populations has completely decayed (as with long blank durations) or still persists (as with uninterrupted presentations). The pattern in our results shows that the activity of neural populations clearly persists over brief interruptions (Tblank = 0.1 s), since perception strongly follows illusory depth as in the continuous case. Note that this is not a necessary outcome; when neural persistence is curtailed by a mask, responses and perceptions are similar to those found in the long blank intermittent presentations, even for brief interruptions (see Experiments 3 and 4, Figs. 4 and 5).
Our results suggest that perceptual alternations that occur during intermittent presentation with very brief intervals (≤100 ms) rely on the same mechanisms as spontaneous perceptual switches. However, one has to keep in mind several important differences between continuous and interrupted stimulus presentations. First, during interrupted presentation, stimulus onsets introduce a strong exogenous signal that can trigger perceptual alternations. Second, for longer blank intervals, the influence of neural fatigue becomes progressively more evident, as witnessed by an increase in the destabilization of multistable perception. On the other hand, one must note that even if the presentation is not interrupted by the experimenter, it is frequently interrupted by the observer. Eyeblinks and microsaccades occur fairly frequently (rates are >0.5 Hz for eyeblinks [Monster, Chan, & O’Connor, 1978] and >0.25 Hz for microsaccades [Martinez-Conde, Macknik, Troncoso, & Hubel, 2009]) and are known to influence the dynamics of multistable perception (Leopold et al., 2002; van Dam & van Ee, 2006). Accordingly, from the observer’s perspective, a reasonably long (for example, a few tens of seconds) physically continuous presentation is never truly continuous. In view of that, we conclude that very brief blank intervals (≤100 ms) are an appropriate tool for studying the mechanisms of spontaneous perceptual switching during intermittent presentation.
We would like to thank Witold Klare for his assistance in piloting experiments. Clip art used in Fig. 1 was obtained from the Open Clip Art Library (openclipart.org) and is used under the Public Domain license.
- Levelt, W. J. (1965). On binocular rivalry. Soesterberg, The Netherlands: Institute for Perception RVO-TNO.
- Mysore, S. G., Vogels, R., Raiguel, S. E., Todd, J. T., & Orban, G. A. (2010). The selectivity of neurons in the macaque fundus of the superior temporal area for three-dimensional structure from motion. The Journal of Neuroscience, 30(46), 15491–508. doi:10.1523/JNEUROSCI.0820-10.2010
- Necker, L. A. (1832). Observations on some remarkable phenomena seen in Switzerland; and an optical phenomenon which occurs on viewing of a crystal or geometrical solid. Philosophical Magazine, 1, 329–337.
- Paradis, A. L., Cornilleau-Pérès, V., Droulez, J., Van De Moortele, P. F., Lobel, E., Berthoz, A., Le Bihan, D., et al. (2000). Visual perception of motion and 3-D structure from motion: An fMRI study. Cerebral Cortex, 10(8), 772–83.
- Pastukhov, A., & Braun, J. (2012). Disparate time-courses of adaptation and facilitation in multi-stable perception. Learning & Memory.
- Sperling, G., & Dosher, B. A. (1994). Depth from motion. In T. V. Papathomas, A. G. Charles Chubb, & E. Kowler (Eds.), Early Vision and Beyond (pp. 133–142). Cambridge, MA: MIT Press.
- Sterzer, P., & Rees, G. (2008). A neural basis for percept stabilization in binocular rivalry. Journal of Cognitive Neuroscience, 20(3), 389–99. doi:10.1162/jocn.2008.20039
- Stonkute, S., Braun, J., & Pastukhov, A. (2012). The role of attention in ambiguous reversals of structure-from-motion. PLoS ONE, 7(5), e37734. doi:10.1371/journal.pone.0037734
- von Helmholtz, H. (1866). Treatise on Physiological Optics (Vol. 3). Birmingham, Alabama: The Optical Society of America.