Certain displays that are consistent with several equally likely interpretations cause the visual system to endlessly switch between these alternatives (Leopold & Logothetis, 1999; Tong, Meng & Blake, 2006). Classic examples of such multistable displays are binocular rivalry (von Helmholtz, 1925; Wheatstone, 1838), the Necker cube (Necker, 1832), and structure-from-motion (Sperling & Dosher, 1994; Wallach & O’Connell, 1953). During continuous viewing, alternations of perception are generated endogenously and cannot be prevented even by top-down attention control (Meng & Tong, 2004; van Ee, van Dam & Brouwer, 2005). It is therefore all the more surprising that multistable perception can be profoundly stabilized by means of an intermittent presentation, provided that blank intervals are sufficiently long (Adams, 1954; Leopold, Wilke, Maier & Logothetis, 2002; Orbach, Ehrlich & Heath, 1963; Ramachandran & Anstis, 1983).

The stabilization of multistable perception is ensured by sensory memory, an implicit visual memory that biases the ambiguous perception toward a recent most-dominant state (Pastukhov & Braun, 2008; Pearson & Brascamp, 2008). Sensory memory traces are generated and maintained independently for each type of multistable display (Maier, Wilke, Logothetis & Leopold, 2003), for each alternative state (Brascamp, Knapen, Kanai, Noest, van Ee & van den Berg, 2008; Pastukhov & Braun, 2008), and for each conflicting feature (Pearson & Clifford, 2004). However, they are observed only for fully ambiguous displays and not for their disambiguated versions (Brascamp, Knapen, Kanai, van Ee & van den Berg, 2007; Nawrot & Blake, 1989; Pastukhov & Braun, 2013a; Sterzer & Rees, 2008). The sensory memory of a multistable display is modeled as an interaction between memory and sensory representations (Gigante, Mattia, Braun & Del Giudice, 2009), but it is equally possible to model its effects purely at the level of sensory representations (Noest, van Ee, Nijs & van Wezel, 2007; Noest & van Wezel, 2012). Current imaging studies generally support the former idea, as they show an involvement of higher-level cognitive regions in perceptual stabilization (Schwiedrzik, Ruff, Lazar, Leitner, Singer & Melloni, 2013; Sterzer & Rees, 2008), in addition to the specialized visual areas (Brascamp, Kanai, Walsh & van Ee, 2010; Sterzer & Rees, 2008).

As we noted earlier, when several features of a multistable display are ambiguous, each of them produces an independent sensory memory trace. Pearson and Clifford (2004) used a binocular rivalry display that was ambiguous with respect to the eye of origin, grating color, and orientation. They found that perception in subsequent trials was predicted by the competition between respective independent sensory memory traces. However, this was the only example of such displays with multiple ambiguous features and it is unclear whether the observed independence is a general rule or is specific to binocular rivalry.

We investigated this question using structure-from-motion (SFM), another type of multistable display with two ambiguous properties. SFM is produced by a 2-D flow that generates a vivid impression of a 3-D object rotating in depth (Sperling & Dosher, 1994; Wallach & O’Connell, 1953) (see Movie 1; for the best results, please ensure that the presentation is looped). Importantly, for our study, both the illusory rotation of the 3-D object and its illusory depth (i.e., how close individual dots or parts of an object appear to the observer) were ambiguous (Pastukhov & Braun, 2013b; Pastukhov, Vonau & Braun, 2012; Stonkute, Braun & Pastukhov, 2012). The relationship between them is illustrated in Fig. 1. Both are derived from the same on-screen motion (top row in Fig. 1), such that only two combinations of perceptual states are possible (middle and bottom rows in Fig. 1). However, the states of illusory rotation and illusory depth can be dissociated by reversing the on-screen motion (compare the directions of the on-screen motion and the possible combinations of perceptual states in Figs. 1a and 1b).

The existence of a sensory memory of illusory rotation is well documented, and its properties have been extensively studied (Chen & He, 2004; Leopold et al., 2002; Maier et al., 2003; Pastukhov & Braun, 2008, 2013b). In contrast, evidence for a sensory memory of illusory depth is only indirect. We have recently reported that the perception of rotationally asymmetric shapes is more stable when objects are presented at the same angle of rotation as before, rather than at a different angle (Pastukhov, Füllekrug & Braun, 2013). These results are consistent with additional stabilization being generated by the sensory memory of illusory depth: When an SFM object is presented at the same angle of rotation as before, the sensory memory of illusory depth would favor the same perceptual state, complementing the effect of the sensory memory of illusory rotation. However, this is a post-hoc explanation of the effect, and it does not prove the existence of an independent sensory memory of illusory depth in SFM. Thus, the first goal of our study was to investigate sensory memory of illusory depth directly. We demonstrated that both illusory depth and illusory rotation generate independent sensory memory traces, confirming and extending earlier results (Pearson & Clifford, 2004).

Fig. 1

Relationship between ambiguous illusory depth and ambiguous illusory rotation in structure-from-motion (SFM) displays. The top row shows static snapshots of the SFM display (frontal view, x–y plane). The bottom rows show the perceived ambiguous illusory rotation and illusory depth, as if viewed from above (schematic top view, x–z plane). Example dots show the correspondence between the on-screen motion and the inferred states of illusory rotation and illusory depth. (a) The illusory rotation and illusory depth of a band are ambiguous. The band can be perceived with its front surface rotating left or right, that is, clockwise (middle row) or counterclockwise (bottom row) when viewed from above. For a given on-screen motion, only two combinations of illusory rotation and illusory depth are possible. During continuous viewing, perception spontaneously switches between these two alternatives. Note that a spontaneous perceptual switch entails simultaneous changes in both illusory rotation and illusory depth. (b) When the on-screen motion of an illusory band is opposite (relative to the motion of the example dots in panel a), the association between the states of illusory rotation and illusory depth is reversed.

Our second goal was to investigate potential similarities or differences in the neural mechanisms of sensory memories of illusory depth and illusory rotation. As we noted above, both illusory properties are derived from the same on-screen motion and their perceptual states are associated. Accordingly, we were curious whether their respective sensory memory traces would exhibit similar properties (suggesting similar underlying neural mechanisms) or would behave differently (this would suggest that qualitatively different neural mechanisms are involved). To this end, we examined the specificity of sensory memory of illusory depth using a selective-adaptation paradigm (Schacter, Wig & Stevens, 2007) and compared it to the specificity of sensory memory of illusory rotation (Chen & He, 2004; Maier et al., 2003; Pastukhov et al., 2013a). Our results suggested that both rely on similar neural mechanisms.

In addition, we will discuss how the existence of two independent traces for a single multistable display advances our understanding of the neural mechanisms behind sensory memory.

General method

Observers

Nine observers (five females, four males), including the second and third authors, participated in the experiments. The procedures were approved by the medical ethics board of the Otto-von-Guericke University, Magdeburg: Ethik-Kommission der Otto-von-Guericke-Universität an der Medizinischen Fakultät. All participants had normal or corrected-to-normal vision. Apart from the authors, observers did not know the purpose of the experiments and were paid for their participation.

Apparatus

Stimuli were generated with MATLAB using the Psychophysics Toolbox (Brainard, 1997) and displayed on a CRT screen (Iiyama VisionMaster Pro 514, iiyama.com) with a spatial resolution of 1,600 × 1,200 pixels and a refresh rate of 100 Hz. The viewing distance was 73 cm, with individual pixels subtending approximately 0.019º. In all experiments, background luminance was kept at 36 cd/m² and environmental luminance at 80 cd/m².

Structure-from-motion display

The main structure-from-motion (SFM) stimulus, referred to as a band throughout the study, consisted of 500 dots distributed randomly over the surface of a band. It had a height of 5.7º, with individual dots having a size of 0.057º and a luminance of 110 cd/m². The correspondence between the on-screen orthographic projection and the polar coordinate system is depicted in Fig. 2. The observer is positioned at 270º.
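As a quick consistency check on the stimulus dimensions, the sketch below converts the visual angles given above into approximate pixel counts, using the stated value of roughly 0.019º per pixel. The function name `deg_to_pixels` is hypothetical, and the conversion assumes a constant angular size per pixel, which holds only approximately near the center of the screen.

```python
# Approximate angular size of one pixel, as stated in the Apparatus section.
DEG_PER_PIXEL = 0.019

def deg_to_pixels(angle_deg, deg_per_pixel=DEG_PER_PIXEL):
    """Convert a visual angle in degrees to the nearest whole number of pixels.

    Hypothetical helper: assumes every pixel subtends the same visual angle.
    """
    return round(angle_deg / deg_per_pixel)

band_height_px = deg_to_pixels(5.7)   # band height of 5.7 degrees
dot_size_px = deg_to_pixels(0.057)    # dot size of 0.057 degrees
print(band_height_px, dot_size_px)
```

Under these assumptions the band is about 300 pixels tall and each dot about 3 pixels across.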

Fig. 2

Correspondence between on-screen orthographic projection and the polar coordinate system. (Top row) Snapshots of the stimulus as seen on the screen, in the x–y plane (orthographic projection). (Bottom row) Stimulus as if seen from above, in the x–z plane (polar projection). The horizontal axis has 0º orientation, and angles of rotation are measured counterclockwise. The observer is positioned at 270º

The motion of the SFM band was restricted to a range of [37º, 73º] for two reasons. First, the band object is depth-symmetric when it is at 0º/180º (“frontal” view; left column in Fig. 2) and 90º/270º (“orthogonal” view; right column in Fig. 2). In these two cases, there is only one global state of illusory depth: Mirroring the object relative to the zero-depth line results in the same shape. Accordingly, it is impossible for an observer to report on the state of illusory depth when the band is presented at one of these angles of rotation. Second, crossing 0º/180º or 90º/270º lines reverses the association between states of illusory depth and illusory rotation for the illusory band (see Fig. 1). However, our intention was to dissociate their states between trials rather than within trials. Therefore, we have sidestepped both issues by staying away from critical angles of rotation and limiting the motion of the band to [37º, 73º].
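The depth-symmetry argument above rests on the fact that an orthographic projection discards the sign of depth. The following minimal sketch illustrates this with arbitrary random dots rather than the actual band geometry, which is not reproduced here: a set of dots rotated by some angle, and its depth-mirrored counterpart rotated by the opposite angle, project to the same screen positions. All function names are hypothetical.

```python
import math
import random

def rotate_about_vertical(point, angle_deg):
    """Rotate an (x, y, z) point about the vertical (y) axis."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) + z * math.sin(a),
            y,
            -x * math.sin(a) + z * math.cos(a))

def orthographic(point):
    """Orthographic projection onto the screen: depth (z) is dropped."""
    x, y, _z = point
    return (x, y)

def mirror_depth(point):
    """Mirror a point relative to the zero-depth plane."""
    x, y, z = point
    return (x, y, -z)

def max_screen_difference(points, angle_deg):
    """Largest on-screen discrepancy between the two interpretations."""
    worst = 0.0
    for p in points:
        ax, ay = orthographic(rotate_about_vertical(p, angle_deg))
        bx, by = orthographic(rotate_about_vertical(mirror_depth(p), -angle_deg))
        worst = max(worst, abs(ax - bx), abs(ay - by))
    return worst

# Arbitrary illustrative dot cloud (not the band used in the experiments).
rng = random.Random(0)
dots = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
        for _ in range(500)]
print(max_screen_difference(dots, 36.0))
```

Because the two interpretations are indistinguishable on the screen, an object whose depth-mirrored version has the same shape offers only one global state of illusory depth to report.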

Responses

Observers reported which side of the band (left or right) appeared to be closer to them using arrow keys (respectively, the left and right keys). They were instructed to withhold responses when perception was mixed (e.g., two half-rings were rotating independently) or unclear. Trials with no responses or with multiple responses were discarded.

Measure of perceptual stability: Probability of survival

As a measure of perceptual stability, we used the probability of survival (\( {P}_{\mathrm{surv}} \)), which is the probability that the same perceptual state of an illusory property is reported on two consecutive trials. Values of \( {P}_{\mathrm{surv}} > .5 \) show that the illusory property tended to remain stable, with high values close to 1 indicating strong stabilization. Values below .5 denote a perceptual destabilization, with lower values close to 0 signifying a stronger negative history effect.
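The definition above can be sketched as a short function. This is an illustration only: the name `survival_probability`, the use of `None` to mark discarded trials, and the decision to skip any pair that contains a discarded trial are assumptions, not details taken from the analysis code.

```python
def survival_probability(states):
    """Fraction of consecutive valid trial pairs with the same reported state.

    `states` is a sequence of reported perceptual states in presentation
    order; None marks a discarded trial (assumed convention). Pairs that
    include a discarded trial are skipped (also an assumption).
    """
    same = 0
    pairs = 0
    for prev, curr in zip(states, states[1:]):
        if prev is None or curr is None:
            continue
        pairs += 1
        if prev == curr:
            same += 1
    if pairs == 0:
        raise ValueError("no valid consecutive trial pairs")
    return same / pairs

# Made-up response sequence, for illustration only.
reports = ["left", "left", "left", None, "right", "right", "left", "left"]
print(survival_probability(reports))
```

In this made-up sequence, five valid pairs remain and four of them repeat the previous state, giving a survival probability of .8.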

Note that due to an observer-specific bias (Carter & Cavanagh, 2007), the baseline level of \( {P}_{\mathrm{surv}} \) may be different from .5. To establish an observer-specific baseline, we computed the preferred initial state of illusory depth over all experiments and conditions. Observers showed only a small and nonsignificant bias toward one of the perceptual states of illusory depth. The right side was reported to be closer on 48% ± 11% of the trials. This was not significantly different from 50% [paired sample t test, t(8) = −0.16, p = .88]. Accordingly, a baseline of .5 was used for statistical testing.

Statistical analysis

For valid trials (excluding trials with multiple or no responses), only two responses were possible: either the left or the right side of the band appeared closer to the observer. Accordingly, we analyzed the distributions of responses by means of binomial statistics. In figures, error bars represent a 95% confidence interval around the average binomial proportion based on the mean number of valid trials (\( N_{\mathrm{total}} - N_{\mathrm{no\ resp}} - N_{\mathrm{mixed}} \)).
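The text does not state which binomial interval was used for the error bars. As one possible illustration, the sketch below uses the Wilson score interval, a standard choice that is swapped in here as an assumption. The function names and the trial counts in the example are hypothetical.

```python
import math

def valid_trials(n_total, n_no_resp, n_mixed):
    """Number of valid trials after discarding no-response and mixed trials."""
    return n_total - n_no_resp - n_mixed

def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion.

    z = 1.959964 corresponds to a two-sided 95% interval. This is an assumed
    method, not necessarily the one used for the published error bars.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half

# Hypothetical counts, for illustration only.
n_valid = valid_trials(n_total=800, n_no_resp=24, n_mixed=0)
low, high = wilson_interval(successes=round(0.9 * n_valid), n=n_valid)
print(n_valid, low, high)
```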

For pair-wise comparisons, we determined the highest significance level for which the true difference between two binomial proportions \( P_1 \) and \( P_2 \) was not equal to zero. In other words, if \( D = P_2 - P_1 < 0 \), we determined the highest significance level for which \( D_{\mathrm{true}} < 0 \). Conversely, for \( D = P_2 - P_1 > 0 \), we determined the highest significance level for which \( D_{\mathrm{true}} > 0 \). The confidence interval for the true difference between two binomial proportions was estimated using the “Accurate Confidence Intervals” toolbox (see the MATLAB central file exchange and Ross, 2003).
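The logic of this comparison can be illustrated with a simpler method than the toolbox cited above. The sketch below uses a normal (Wald) approximation for the difference of two proportions, which is an assumption made for brevity and will not reproduce the toolbox values exactly. It returns the observed difference and the largest two-sided confidence level at which the approximate interval still excludes zero. The function name and example counts are hypothetical.

```python
import math

def max_confidence_excluding_zero(k1, n1, k2, n2):
    """Observed difference P2 - P1 and the supremum confidence level at which
    a Wald interval for the difference excludes zero.

    Normal approximation only (assumed); not the method of Ross (2003).
    """
    p1, p2 = k1 / n1, k2 / n2
    d = p2 - p1
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    if se == 0:
        raise ValueError("standard error is zero; approximation undefined")
    z = abs(d) / se
    # A two-sided interval at confidence c has half-width z_c * se, where
    # c = erf(z_c / sqrt(2)); it excludes zero exactly when z_c < z.
    return d, math.erf(z / math.sqrt(2))

# Hypothetical counts, for illustration only.
d, confidence = max_confidence_excluding_zero(k1=50, n1=100, k2=90, n2=100)
print(d, confidence)
```

One minus the returned confidence level corresponds to the usual two-sided p value under the same approximation.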

Experiment 1: Sensory memory of illusory depth

In our first experiment, we sought to establish whether ambiguous illusory depth of SFM generates a sensory memory trace that is independent from the sensory memory of illusory rotation. We used a band shape, which is rotationally asymmetric and has a well-defined illusory depth (Pastukhov & Braun, 2013b; Pastukhov et al., 2012). To dissociate perception and sensory memories of illusory rotation and illusory depth, we combined an intermittent presentation with a variable on-screen motion (a so-called forced ambiguous switch paradigm; Pastukhov & Braun, 2013b; Pastukhov et al., 2012; Stonkute et al., 2012).

Method

The SFM band was presented intermittently (T on = 0.5 s, T off = 1 s) and its on-screen motion was restricted to a limited range of rotation angles: Θ ∈ [37º, 73º]; see the General Method section for details.

The speed of rotation was 0.2 Hz. The direction of the on-screen motion was randomized, so that in the model space the stimulus was equally likely to rotate in a “positive” (Θon = 37º → Θoff = 73º) or “negative” (Θon = 73º → Θoff = 37º) direction (see Movie 2). A single experimental session consisted of ten blocks. Each block contained 80 T on and 80 T off intervals (400 trials per direction of rotation). Trials with no responses or with multiple responses were discarded (respectively, ~3% and 0% of total trials).
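The timing parameters above are mutually consistent, as the following arithmetic sketch shows: at 0.2 Hz the object turns 72º per second, so a 0.5-s presentation sweeps exactly the 36º between 37º and 73º. The function name `sweep_deg` is hypothetical, and the per-direction trial count is the expected value under random assignment of direction.

```python
def sweep_deg(rotation_hz, duration_s):
    """Angle in degrees covered at a given rotation rate and duration."""
    return 360.0 * rotation_hz * duration_s

sweep = sweep_deg(rotation_hz=0.2, duration_s=0.5)  # one T_on interval
angle_range = 73 - 37                               # permitted range in degrees

blocks_per_session = 10
trials_per_block = 80
trials_total = blocks_per_session * trials_per_block
expected_trials_per_direction = trials_total / 2    # two directions, randomized

print(sweep, angle_range, trials_total, expected_trials_per_direction)
```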

Results and discussion

When the directions of on-screen motion on two consecutive trials i and i + 1 were identical [e.g., Θon(i) = 73º → Θoff(i) = 37º and Θon(i + 1) = 73º → Θoff(i + 1) = 37º], the states of illusory rotation and illusory depth were associated. As is illustrated in Fig. 3a, illusory rotation and illusory depth either both remained stable (Fig. 3a, top outcome) or both reversed (Fig. 3a, bottom outcome). This corresponded to a typical intermittent-presentation procedure used in studies of sensory memory (Adams, 1954; Leopold et al., 2002; Orbach et al., 1963; Pastukhov et al., 2013a). Consistent with earlier work, long blank intervals (T off = 1 s) resulted in strong stabilization of perception (Fig. 3b; \( {P}_{\mathrm{surv}}^{\mathrm{depth}} = {P}_{\mathrm{surv}}^{\mathrm{rotation}} = .9 \), 95% confidence interval using the binomial distribution = [.87, .93]). \( {P}_{\mathrm{surv}} \) was significantly above the baseline level of .5 for every observer (the probability that the binomial proportion was not significantly different from .5 was p = .00013 ± .00015).

Fig. 3

Independent sensory memory of illusory depth. (a, c) Top row: Snapshots of the stimulus seen on the screen (x–y plane), with the directions of the on-screen motion indicated by two example dots. Bottom row: States of illusory depth and rotation, as if seen from above (x–z plane; 0 corresponds to the horizontal axis, and angles are measured counterclockwise). (a) When the directions of the on-screen motion were identical on two consecutive trials [e.g., Θon(i) = 73º → Θoff(i) = 37º and Θon(i + 1) = 73º → Θoff(i + 1) = 37º], the states of illusory rotation and illusory depth were associated. At the onset of trial T on(i + 1), both illusory properties either remained stable (top outcome) or reversed (bottom outcome). (b) This condition led to strong perceptual stabilization, as can be seen in the mean and 95% confidence interval for the binomial distribution (\( {P}_{\mathrm{surv}}^{\mathrm{depth}} = {P}_{\mathrm{surv}}^{\mathrm{rotation}} = .9 \) [.87, .93]). \( {P}_{\mathrm{surv}} \) was significantly above the baseline level of .5 for every observer (the probability that the binomial proportion was not significantly different from .5 was p = .00013 ± .00015). (c) When the directions of the on-screen motion were opposite [e.g., Θon(i) = 73º → Θoff(i) = 37º and Θon(i + 1) = 37º → Θoff(i + 1) = 73º], sensory memories of illusory depth and illusory rotation competed, \( {P}_{\mathrm{surv}}^{\mathrm{depth}} = 1 - {P}_{\mathrm{surv}}^{\mathrm{rotation}} \). At the onset of trial T on(i + 1), either illusory depth remained stable while illusory rotation reversed (top outcome) or illusory rotation remained stable while illusory depth reversed (bottom outcome). (d) In this condition, illusory depth tended to remain stable at the expense of the illusory rotation (\( {P}_{\mathrm{surv}}^{\mathrm{depth}} \gg .5 \), mean and 95% confidence interval for the binomial distribution). The probability that the binomial proportion was not significantly different from .5 was p = .0007 ± .002.

When the direction of the on-screen motion in trial i + 1 was opposite to that of trial i [e.g., Θon(i) = 73º → Θoff(i) = 37º and Θon(i + 1) = 37º → Θoff(i + 1) = 73º], the new physical motion was compatible with alternative combinations of two illusory perceptual states. As is illustrated in Fig. 3c, the sensory memory traces of illusory rotation and illusory depth competed, biasing perception toward opposite states (a so-called forced-ambiguous-switch paradigm; Pastukhov & Braun, 2013b; Pastukhov et al., 2012; Stonkute et al., 2012). Accordingly, only one of the illusory properties could remain stable: \( {P}_{\mathrm{surv}}^{\mathrm{depth}} = 1 - {P}_{\mathrm{surv}}^{\mathrm{rotation}} \). Either illusory depth could persist at the expense of the stability of illusory rotation (Fig. 3c, top outcome, \( {P}_{\mathrm{surv}}^{\mathrm{depth}} > .5 \)), or the direction of illusory rotation could remain constant while illusory depth reversed (Fig. 3c, bottom outcome, \( {P}_{\mathrm{surv}}^{\mathrm{depth}} < .5 \)). If a single neuronal population codes for a combination of illusory depth and illusory rotation, both combinations of the two perceptual states would be equally likely (\( {P}_{\mathrm{surv}}^{\mathrm{depth}} = .5 \)).
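The coupling between the two illusory properties can be made explicit with a small sketch. It uses an illustrative encoding that is an assumption, not the authors' notation: depth state and on-screen motion direction are each coded as +1 or −1, and the rotation state is their product, so that reversing the on-screen motion flips the association between depth and rotation. Which sign stands for clockwise rotation is arbitrary here.

```python
from itertools import product

def rotation_state(depth, motion):
    """Illustrative encoding: rotation is fixed by depth and on-screen motion."""
    return depth * motion

def survival_flags(depth_prev, motion_prev, depth_next, motion_next):
    """Return (depth_survived, rotation_survived) for two consecutive trials."""
    depth_survived = depth_prev == depth_next
    rotation_survived = (rotation_state(depth_prev, motion_prev)
                         == rotation_state(depth_next, motion_next))
    return depth_survived, rotation_survived

signs = (+1, -1)
for d_prev, m_prev, d_next, m_next in product(signs, repeat=4):
    depth_ok, rotation_ok = survival_flags(d_prev, m_prev, d_next, m_next)
    if m_prev == m_next:
        # Same on-screen motion: both properties persist or both reverse.
        assert depth_ok == rotation_ok
    else:
        # Opposite on-screen motion: exactly one property can persist.
        assert depth_ok != rotation_ok
```

Averaged over trials, the second case gives a depth survival probability equal to one minus the rotation survival probability, which is the relation stated above.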

For the selected combination of the presentation schedule and the range of rotation angles, we found that illusory depth tended to remain stable but that illusory rotation reversed (Fig. 3d; \( {P}_{\mathrm{surv}}^{\mathrm{depth}} = .74 \), 95% confidence interval for the binomial distribution = [.69, .77]). \( {P}_{\mathrm{surv}}^{\mathrm{depth}} \) was significantly above the baseline level of .5 for every observer (the probability that the binomial proportion was not significantly different from .5 was p = .0007 ± .002).

Our results established that the illusory depth of an SFM object generates a separate sensory memory trace. Disparate persistence demonstrates that its representation is independent of the sensory memory of illusory rotation. This confirms the hypothesis that, in a multistable display with multiple ambiguous properties, each of these properties produces an independent sensory memory trace.

Experiment 2: Feature selectivity of sensory memories of illusory depth

Illusory depth and illusory rotation generate independent sensory memory traces, although both are derived from the same on-screen motion and, under normal circumstances, their perceptual states are associated (see the introduction for details). In our second experiment, we investigated whether this relationship is reflected as similar constraints on the neural mechanisms behind sensory memories. To this end, we have examined the specificity of sensory memory of illusory depth using a selective-adaptation paradigm (Schacter et al., 2007) and compared it to the specificity of sensory memory of illusory rotation (Chen & He, 2004; Maier et al., 2003; Pastukhov et al., 2013a).

In a selective-adaptation paradigm, the strength of the history effect for a primary feature is measured while changing one of the secondary features of an otherwise constant display. If the changes reduce the strength of the history effect, it may suggest that primary and secondary features have a joint representation (McCollough, 1965; Ware & Mitchell, 1974). Measurements of selectivity help to better characterize neural mechanisms behind the history effect, constraining models of such mechanisms and guiding imaging studies (Malach, 2012; Schacter et al., 2007). If two history effects have a comparable specificity, this indicates a similarity in their neural mechanisms. Conversely, when the specificities differ, it provides a strong indication that different neural mechanisms are involved. For example, we recently demonstrated that specificities of sensory memory and perceptual adaptation in SFM are strikingly different and that they rely on separate neural mechanisms (Pastukhov et al., 2013a; Pastukhov, Lissner & Braun, 2013).

In the case of an SFM display, the strength of the sensory memory of illusory rotation is dependent on the shape and volumetric property of an object, but not on the color or size (Chen & He, 2004; Maier et al., 2003; Pastukhov et al., 2013a). Accordingly, in the present experiment we examined the dependence of the sensory memory of illusory depth on color, size, and volumetric property of an SFM object.

The experimental procedure mostly followed that of Experiment 1. The only modification was a reversal of the on-screen motion halfway through the T on interval. Its purpose was to maximize the strength of the sensory memory of illusory depth and minimize that for illusory rotation (see the Method section below).

Method

The standard stimulus was the band object (see the General Method section for details). The alternative stimulus conditions were:

1) Color change (Movie 3): same shape as the standard band, but colored red
2) Size change (Movie 4): same shape as the standard band, but half the size (shape height of 2.85º, dot size of 0.03º)
3) Fill change (Movie 5): a drum object with dimensions identical to the band, but with dots distributed uniformly within the volume

Two stimuli (the standard band and one of the alternative displays) were presented in pseudorandom order, ensuring that all pairs occurred equally often (one-back history randomization).
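One way to obtain such a one-back balanced order is sketched below. It is an illustration under stated assumptions, not the procedure actually used: every ordered pair of stimuli is given the same number of required transitions, a random walk consumes them, and the walk is restarted if it gets stuck before all transitions are used. The function name and the `repeats` parameter are hypothetical.

```python
import random
from collections import Counter

def one_back_balanced(stimuli, repeats, rng=None, max_tries=10_000):
    """Random sequence in which every ordered (previous, current) pair of
    stimuli occurs exactly `repeats` times.

    Simple restart-based sketch; the resulting sequence has
    len(stimuli) ** 2 * repeats + 1 entries.
    """
    rng = rng or random.Random()
    for _ in range(max_tries):
        # For each stimulus, the list of successors that must still follow it.
        remaining = {s: [t for t in stimuli for _ in range(repeats)]
                     for s in stimuli}
        for successors in remaining.values():
            rng.shuffle(successors)
        seq = [stimuli[0]]
        while remaining[seq[-1]]:
            seq.append(remaining[seq[-1]].pop())
        if all(not successors for successors in remaining.values()):
            return seq
    raise RuntimeError("no balanced sequence found; increase max_tries")

order = one_back_balanced(["band", "drum"], repeats=5, rng=random.Random(1))
pair_counts = Counter(zip(order, order[1:]))
print(pair_counts)
```

With two stimuli, the four possible prime–probe pairs (band–band, band–drum, drum–band, drum–drum) then each occur equally often.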

To minimize the influence of sensory memory of illusory rotation, we slightly modified the presentation schedule from that in Experiment 1. The direction of the on-screen motion reversed upon reaching the limit of the rotation angle range, and the presentation continued until the starting angle was reached again (e.g., 37º → 73º → 37º or 73º → 37º → 73º; see Movies 3–5). To accommodate this change, the T on duration was doubled to T on = 1 s, while T off remained the same (T off = 1 s). This “flip-flop” presentation schedule should minimize interference from illusory rotation, as sensory memory traces for both directions of rotation have a comparable strength and should cancel each other out (Pastukhov & Braun, 2008).

For the selected range of rotation angles, an inversion of the on-screen motion was expected to result in reliable reversals of illusory rotation, while illusory depth should have remained stable (Pastukhov & Braun, 2013b; Pastukhov et al., 2012). In the present experiment, illusory depth always remained stable, with none of the trials containing multiple responses.

Each condition was measured during a single experimental session (from a total of three experimental sessions). A single experimental session consisted of ten blocks, each of which contained 80 T on and 80 T off intervals. Trials with no responses were discarded (~4% of total trials).

Results and discussion

The results and statistical analysis for Experiment 2 are presented in Fig. 4 and Table 1. The effect of display changes on the strength of sensory memories was analyzed via a three-way analysis of variance (ANOVA). The independent factors in this analysis were Prime–Probe Congruency, Identity of a Prime (trial i), and Identity of a Probe (trial i + 1).
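Before an analysis of this kind, trials have to be sorted by the identity of the prime and of the probe. The sketch below shows one way to tabulate the survival probability per prime–probe cell. The data layout (a list of stimulus and reported-state tuples, with `None` marking a discarded trial), the function name, and the example data are all assumptions made for illustration.

```python
from collections import defaultdict

def survival_by_pair(trials):
    """Survival probability for each (prime, probe) combination.

    `trials` is a list of (stimulus, state) tuples in presentation order;
    state None marks a discarded trial (assumed convention), and pairs that
    include a discarded trial are skipped.
    """
    counts = defaultdict(lambda: [0, 0])  # (prime, probe) -> [same, total]
    for (prime, s_prev), (probe, s_next) in zip(trials, trials[1:]):
        if s_prev is None or s_next is None:
            continue
        cell = counts[(prime, probe)]
        cell[1] += 1
        if s_prev == s_next:
            cell[0] += 1
    return {pair: same / total for pair, (same, total) in counts.items()}

# Made-up trial sequence, for illustration only.
trials = [("band", "left"), ("band", "left"), ("drum", "right"),
          ("drum", "right"), ("band", "right"), ("drum", None),
          ("band", "left")]
table = survival_by_pair(trials)
print(table)
```

Cells in which prime and probe are the same object correspond to the congruent level of the Prime–Probe Congruency factor, and the remaining cells to the incongruent level.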

Fig. 4

Effect of display changes on the stability of illusory depth (mean and 95% confidence interval, using binomial statistics). (a, b) Changes in the color (a) or size (b) of the band had no significant effect on the stability of illusory depth. (c) In contrast, a change in the volumetric property (from a hollow band to a filled drum, or vice versa) significantly destabilized illusory depth. (d) For the volumetric property condition, the prime–probe congruency effect remained significant for lags of up to 22 trials. The p values for the prime–probe congruency effect in a three-way ANOVA are plotted as a function of lag. See the text for details.

Table 1 Experiment 2: Three-way analysis of variance with Prime–Probe Congruency, Prime Identity, and Probe Identity as independent factors

Neither color nor size changes had a significant effect on the stability of illusory depth (Figs. 4a and 4b). In contrast, a change in the volumetric property (whether the object was a hollow band or a filled drum) significantly decreased the strength of a sensory memory (Fig. 4c).

The observed properties of the sensory memory of illusory depth (a strong dependence on the volumetric property, but not on color or size) are identical to those of illusory rotation (Chen & He, 2004; Maier et al., 2003; Pastukhov et al., 2013a). In addition, the natures of the dependence on the volumetric property were similar for both ambiguous illusory properties. First, both illusory rotation and depth were modulated by the congruency between a prime and a probe, but not by the identity of the prime. The main effect of prime–probe congruency was F(1, 32) = 11.8, p = .0017, for illusory depth, and \( p < 10^{-8} \) for illusory rotation (Pastukhov et al., 2013a). The main effect of prime identity was F(1, 32) = 0.09, p = .77 for illusory depth, and p = .16 for illusory rotation (Pastukhov et al., 2013a). Second, as for illusory rotation (Pastukhov et al., 2013a), the observed dependence was order-independent, that is, not influenced by the order of presentation of the two objects. \( {P}_{\mathrm{surv}} \)(band, drum) and \( {P}_{\mathrm{surv}} \)(drum, band) were not significantly different (α = .31, difference of binomial proportions; see the Statistical Analysis section in the General Method for details). Third, in neither case was the strength of the sensory memory correlated with the total volume occupied by the object. In the present experiment, although the total volume of the filled drum was approximately three times bigger than that of the hollow band, the strength of their sensory memory traces was similar [\( {P}_{\mathrm{surv}} \)(band, band) was not significantly different from \( {P}_{\mathrm{surv}} \)(drum, drum), α = .38, difference of binomial proportions]. Finally, the time scale of the sensory memory of illusory depth was comparable to that of the sensory memory of illusory rotation. Repeating the three-way ANOVA for various lags showed that the effect of the prime–probe congruency remained significant for 22 trials (Fig. 4d), as compared to approximately 20 trials for illusory rotation (Pastukhov et al., 2013b).

Taken together, our results suggest that the same or similar mechanisms working at the same level of representation determine the dependence of sensory memories on both illusory properties in SFM displays.

General discussion

In the present article, we examined whether illusory depth of structure-from-motion (SFM) displays produces a sensory memory trace that stabilizes perception later on. We used a combination of an intermittent presentation and a forced-ambiguous-switch paradigm (Pastukhov & Braun, 2013b; Pastukhov et al., 2012; Stonkute et al., 2012) to dissociate perception and sensory memories of illusory rotation and illusory depth in SFM displays. We showed that illusory depth of SFM produces a sensory memory trace independent of the related sensory memory of illusory rotation. We also found that, despite their independence, sensory memories of both illusory properties exhibit an identical feature specificity. In both cases, sensory memories were strongly modulated by changes to the volumetric property of an SFM object, but not by changes to its color or size.

Independent sensory memories of independent ambiguous features

In the present work, we demonstrated that illusory depth and illusory rotation produce independent sensory memories. They can work in accord, biasing perception toward the same combination of perceptual states (see the first condition of Exp. 1). Alternatively, they may compete, biasing perception toward opposing alternative states (see the second condition of Exp. 1 and Exp. 2). In the latter case, the outcome of the competition is determined by the relative strengths of the two memory traces. For example, the limited range of rotation angles used in Experiment 1 tips the balance in favor of illusory depth. Alternatively, crossing symmetry lines (0º/180º or 90º/270º) balances memory traces of illusory depth, giving an advantage to sensory memory of illusory rotation (Pastukhov & Braun, 2013b).

Our results confirm and extend the original report of Pearson and Clifford (2004). The independence of sensory memories for independent ambiguous properties holds true for both binocular rivalry and structure-from-motion displays. Although the number of experimental displays is still very limited, one hypothesis for future studies is clear: When two or more ambiguous properties have independent perceptual representations, they should generate independent sensory memory traces.

Constraints on neural mechanisms of sensory memory

The neural mechanisms of sensory memory of multistable displays are still poorly understood. Current evidence from imaging studies suggests that they involve both sensory-level representations in the specialized visual areas (Brascamp et al., 2010; Sterzer & Rees, 2008) and higher-level cognitive regions (Schwiedrzik et al., 2013; Sterzer & Rees, 2008). However, the exact roles of higher-level areas and the nature of sensory representations involved in sensory memories are presently unclear.

In our previous work, we were able to define an upper limit for neural representations that are involved in a sensory memory of illusory rotation (Pastukhov et al., 2013a). The dependence of sensory memories on SFM shape was gradual and strongly correlated with the similarity between the objects. However, a significant effect of sensory memory was observed, even when two objects were clearly perceptually different. The same was true for sensory memories of illusory depth (see Exp. 2). Accordingly, sensory memory does not seem to rely on high-level representations of canonical shapes (e.g., ring, sphere, cylinder, etc.).

Our present results allow us to add a lower limit as well. The independence of sensory memories of illusory rotation and illusory depth rules out any neural population that jointly encodes these two properties. For example, this excludes speed-gradient-selective neurons that code for a combination of rotation and depth (Mysore, Vogels, Raiguel, Todd & Orban, 2010; Orban, 2011), and are involved in negative aftereffects due to neural fatigue (Uomori & Nishida, 1994).

To summarize, neural mechanisms engaged in sensory memories of illusory depth and illusory rotation rely on intermediate-level representations. The nature of these representations is presently unknown, but they are likely to involve a variety of SFM object properties including shape, volume and other surface- or volume-based features. They are likely to be nonextensive, such that the number of neurons engaged in the representation is independent of the total volume occupied by the SFM object. Also, these sensory representations do not seem to contain any information about the color or size of an SFM object.

Relationship between sensory memories of illusory depth and illusory rotation

Although illusory depth and illusory rotation generate independent sensory memory traces, the two are remarkably similar in their dependence on the properties of the SFM object. Both were unaffected by changes in either color or size (Exp. 2, as well as Chen & He, 2004; Maier et al., 2003). For both illusory properties, however, the strength of sensory memory was significantly reduced when a hollow object was substituted for a filled one, or vice versa (Exp. 2, as well as Pastukhov et al., 2013a). Moreover, the natures of the latter dependence were very similar for illusory depth and illusory rotation (see Exp. 2 for a detailed comparison).

The independence of sensory memories, coupled with their identical specificities, allows us to formulate two hypotheses with respect to underlying neural populations. As we have argued above (see the Constraints on Neural Mechanisms of Sensory Memory section), both memories engage intermediate feature-based object representations. First, it is possible that each relies on an independent (but ultimately redundant) joint representation: illusory depth + features of the SFM object and illusory rotation + features of the SFM object. Alternatively, sensory memory representations of illusory depth and illusory rotation, although they are independent, may be linked and modulated via a single representation of an interpolated SFM object (Husain, Treue & Andersen, 1989; Treue, Andersen, Ando & Hildreth, 1995) or of a combination of its features. The present results do not favor a particular hypothesis and further experiments are required to resolve the issue.

Conclusion

Sensory memory of illusory depth is independent from sensory memory of illusory rotation. Both sensory memories show similar dependencies on the constancy of the volumetric property, but not on the constancy of color or size. This suggests that sensory memories of both illusory depth and illusory rotation rely on similar neural mechanisms. Also, in both cases, sensory memory representations include a feature-based representation of an SFM object.