Masking is among the most popular and enduring techniques in vision research. More than just a blunt tool for disrupting perception, masking has advanced understanding of how the visual system processes information and how these computations relate to visual experience (Breitmeyer & Ogmen, 2006). One particular kind of masking—object-substitution masking (OSM; Enns & Di Lollo, 1997)—has been especially helpful toward this end, encouraging careful examination of how object recognition and visual awareness unfold (Dux, Visser, Goodhew, & Lipp, 2010; Enns & Di Lollo, 1997), potentially via feedforward and feedback processing (Di Lollo, 2014; Enns, 2004). Like all forms of masking, OSM is typically measured in terms of its capacity to disrupt visual processing, but it is also known to disrupt visual awareness (Harrison, Rajsic, & Wilson, 2016). Indeed, these two outcomes tend to co-occur with such regularity that they can easily be mistaken as being one and the same, or at least strongly correlated (Fahrenfort, Scholte, & Lamme, 2007). Yet object discrimination and detection during OSM can occur independently (Gellatly, Pilling, Cole, & Skarratt, 2006; Kahan & Enns, 2010). This independence between visual processing and awareness seems to be strong in early stages of the ventral cortical pathway where simple visual features, such as object size, are represented (Choo & Franconeri, 2010) but not at higher levels of the visual hierarchy where more complex objects and faces are represented (Carlson, Rauschenberger, & Verstraten, 2007; Reiss & Hoffman, 2007). Although OSM is known to have a later locus of interference than other types of masking, such as noise or metacontrast masking (Chakravarthi & Cavanagh, 2009), the precise stage of visual analysis at which its disruptive effects on visual processing become bound to the mechanisms of visual awareness is still relatively unclear. 
In the current investigation, we evaluated the extent to which OSM’s effects on visual awareness and visual processing are dissociated at intermediate stages of visual analysis, where two-dimensional features of complex shapes are encoded (Dumoulin & Hess, 2007).

In most OSM paradigms, a target object is typically shown for a brief amount of time, flanked by four or more masking dots. When these masking dots persist after the target disappears, they tend to disrupt discrimination of its features and often eliminate awareness of its presence altogether (Gellatly et al., 2006). Although recent evidence suggests that OSM is not special in terms of its interaction with spatial attention (Argyropoulos, Gellatly, Pilling, & Carter, 2013; Filmer, Mattingley, & Dux, 2014a; Filmer, Mattingley, & Dux, 2014b; Goodhew & Edwards, 2016; Pilling, Gellatly, Argyropoulos, & Skarratt, 2014), it can still be differentiated from other types of masking in terms of its time course (Enns, 2004) and stage of interruption (Chakravarthi & Cavanagh, 2009). It also is particularly valuable for understanding how the visual system creates and maintains stable representations of objects (Goodhew, 2017; Lleras & Moore, 2003; Moore & Lleras, 2005). Exactly how OSM disrupts processing and perception has been the focus of some recent discussion (Di Lollo, 2014; Goodhew, 2017; Põder, 2012). Resolving this debate was not our goal. Rather, we hoped to provide more general knowledge about the limits of information processing and awareness during OSM. Nevertheless, these accounts are relevant to understanding our hypotheses and results, thus we discuss them briefly below.

Both early and more recent accounts of object substitution (Di Lollo, 2014; Enns & Di Lollo, 1997; Enns, 2004) focus on the extent to which it selectively disrupts the kind of re-entrant communication between higher- and lower-level visual areas (e.g., extrastriate areas and V1) that appears to be necessary for visual awareness of objects (Lamme, Supèr, & Spekreijse, 1998; Pascual-Leone & Walsh, 2001; Silvanto, Cowey, Lavie, & Walsh, 2005). More specifically, these accounts of OSM propose that when feedforward information about a masked object (e.g., a shape with masking dots nearby) arrives in later stages of vision, these areas generate a preliminary hypothesis about that object’s identity or location. This hypothesis is then tested by sending information about that object (in this case, the shape and masking dots) back to earlier visual areas for comparison with the most current sensory input. According to this account, when the feedback representation matches the current input, a visual experience of that object is likely to follow shortly thereafter. OSM may disrupt this matching process by allowing masking dots to remain on the screen at the time this feedback arrives (Jannati, Spalek, & Di Lollo, 2013), possibly interfering with figure/ground segmentation (Di Lollo, Enns, & Rensink, 2000). In this case, the representation of the target is lost and then replaced by that of the mask. Several recent investigations that directly manipulated the availability of re-entrant signals provide support for the iterative framework within this account (Boehler, Schoenfeld, Heinze, & Hopf, 2008; Pascual-Leone & Walsh, 2001). 
Notably, this substitution model of OSM features little or even zero interference both at the level of local contour interactions (Di Lollo et al., 2000) and across the initial wave of feedforward activity from V1 to higher-level visual areas (Enns, 2004; Goodhew, Dux, Lipp, & Visser, 2012; Jannati, Spalek, & Di Lollo, 2013; Kotsoni, Csibra, Mareschal, & Johnson, 2007).

A second and more recent account of OSM, often referred to as the object-updating account, is quite similar to the substitution account described above, except that rather than substituting one representation for another, a single, ongoing representation of the target object is updated to only include the masking dots (Goodhew, 2017; Lleras & Moore, 2003; Moore & Lleras, 2005). Importantly, iterative processing and minimal interference are still plausible mechanisms within this updating account (Filmer et al., 2014a, b; Pilling et al., 2014).

A third characterization of OSM proposes that local interference, such as lateral inhibition or the addition of noise from the masks, may degrade processing and perception of an object (Bridgeman, 2006; Macknik & Martinez-Conde, 2007; Põder, 2012). These models tend not to include iterative processing as a central piece of the puzzle and instead focus on the accumulation of interference in the representation of an object during feedforward processing. Importantly, we note that all three of these accounts (1) allow that a feedforward sweep of analysis should occur during OSM, and (2) predict that when OSM eliminates awareness of an object, that masked object’s ability to nevertheless influence increasingly complex perceptual judgments should depend, at least to some extent, on the strength of its representation as it moves up this feedforward sweep. We measured the extent to which this kind of lingering representation resonates at intermediate levels of the ventral pathway. Regardless of which of the above accounts is most accurate, our results should sharpen general understanding of the kinds of visual analyses that can be expected to occur during OSM both with and without awareness of an object’s presence.

Recent work suggests that OSM tends to permit stronger visual processing at lower than at higher levels of analysis. For example, the processing of relatively simple features of objects, such as size or orientation, appears to persist during OSM, even when these objects are not visible (Choo & Franconeri, 2010; Jacoby, Kamke, & Mattingley, 2012). The lingering representation of these unseen objects is so strong, in fact, that it can influence perceptual judgments of other nearby objects that are clearly visible. Conversely, OSM is associated not just with disruptions of visibility, but also with weakened processing at high levels of vision (e.g., LOC) during the perception of objects (Carlson et al., 2007) and faces (Reiss & Hoffman, 2007). This suggests that, as with other kinds of masking (Sweeny, Grabowecky, & Suzuki, 2011a), the disruptive effects of OSM on visual processing may become bound to the mechanisms of visual awareness at some intermediate stage of analysis or later.

To test this hypothesis, we evaluated the extent to which OSM disrupted the processing and visual awareness of aspect ratio. This two-dimensional feature of shapes (Regan & Hamstra, 1992; Suzuki & Cavanagh, 1998) is known to be encoded by neurons in intermediate stages of visual analysis (e.g., V3/VP and V4; Dumoulin & Hess, 2007). Aspect ratio is important for a variety of visual judgments, including figure-ground segmentation and quick shape discriminations (Elder & Zucker, 1993; Koffka, 1935), as well as basic evaluations about the structure of objects (Biederman, 1987) and faces (Young & Yamane, 1992). Like other investigations (Choo & Franconeri, 2010), we did not ask observers to discriminate the appearance of a masked shape, because this could have been confusing on trials in which that shape was not visible. Rather, we evaluated the extent to which the aspect ratio of a masked shape influenced the appearance of a different shape that was clearly visible and seen nearby. Previous work demonstrated that when two ellipses were seen briefly and with equally distributed attention, they tended to distort each other’s appearance (Sweeny, Grabowecky, & Suzuki, 2011b). We evaluated a similar outcome, except that in the current investigation we attempted to manipulate the processing and awareness of one shape within each pair by introducing object-substitution masking on a subset of trials. Additionally, we wanted to tease apart potentially distinct effects of masking and awareness by gathering information about each observer’s subjective awareness on each trial. Based on a pilot study, we expected that when two ellipses were presented near one another, each shape’s aspect ratio would appear more like that of the other, an aspect-ratio attraction effect (see Footnote 1). More importantly, we expected that this attraction effect would be present, albeit weakened, when one shape from the pair received OSM. 
We made this prediction based on the assumption that any effect of averaging would likely depend on feedforward representation, which would presumably decay but not be eliminated at this intermediate stage of visual analysis, either due to the accumulation of random neural noise (Faisal, Selen, & Wolpert, 2008) or interference from the masking dots.

Method

Observers

Thirty students from the University of Denver gave informed consent to participate in the experiment. We selected this sample size because it was sufficient to observe significant shape interactions in a pilot study with similar ellipses, spatial locations, and timing, but without masking (see Footnote 1). Each observer had normal or corrected-to-normal visual acuity. All experimental protocols were approved by the University of Denver IRB.

Stimuli

We used a stimulus set from a previous investigation (Sweeny et al., 2011a). This set included 11 ellipses with a range of aspect ratios and equivalent areas (log aspect ratios: wide, −0.374, −0.311, −0.221, −0.131, −0.043; circular, 0.0; tall, +0.043, +0.131, +0.221, +0.311, +0.374), symmetrically distributed in log scale around a circle. The diameter of a circular ellipse was 2.7°, and the width (or height) of the widest (or tallest) ellipse on the screen was 4° of visual angle. Ellipses were drawn with dark gray lines (thickness = 0.4°, luminance = 18.3 cd/m²). Each ellipse was blurred (using a 2.0-pixel Gaussian blur) to minimize aliasing. On all trials, four black dots (0.7° diameter, luminance = 1.1 cd/m²) appeared around each ellipse, each equidistant (2.5°) from that ellipse’s center. All stimuli were presented on a gray background (RGB = 170, 170, 170; luminance = 43.65 cd/m²) on an 18” monitor using Matlab with the Psychophysics toolbox (Brainard, 1997) at a viewing distance of 57 cm. Ellipses were always presented in pairs. Some pairs included only wide ellipses (log aspect ratios: −0.311, −0.131), some pairs included a moderately wide (−0.221) or tall (+0.221) ellipse and a circle, and other pairs included only tall ellipses (+0.131, +0.311). We also included trials in which each ellipse was paired with itself. These particular trials allowed us to gather baseline measurements of individual bias in the perception of each ellipse’s aspect ratio, which we would then subtract out before measuring shape interactions.
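The equal-area constraint on the stimulus set can be illustrated with a short sketch. It is not the original stimulus-generation code (which was written in Matlab), and it rests on two assumptions that the text does not state: that the log aspect ratio is the base-10 logarithm of height divided by width, and that the function name `ellipse_axes` and its `base` parameter are purely illustrative. Under those assumptions, an ellipse with the same area as the 2.7° circle has a width and height whose product equals the squared circle diameter. The computed extents need not reproduce the reported 4° maximum exactly, for example because of line thickness or rounding.

```python
import math

# Log aspect ratios reported for the 11-ellipse set (negative = wide, positive = tall).
LOG_ASPECT_RATIOS = [-0.374, -0.311, -0.221, -0.131, -0.043, 0.0,
                     0.043, 0.131, 0.221, 0.311, 0.374]
CIRCLE_DIAMETER_DEG = 2.7  # diameter of the circular ellipse, in degrees of visual angle


def ellipse_axes(log_ar, circle_diameter=CIRCLE_DIAMETER_DEG, base=10.0):
    """Return (width, height) of an ellipse with the same area as the circle.

    Assumption (not stated in the text): log_ar = log_base(height / width).
    Equal area requires width * height == circle_diameter ** 2.
    """
    ratio = base ** log_ar                       # height / width
    height = circle_diameter * math.sqrt(ratio)
    width = circle_diameter / math.sqrt(ratio)
    return width, height


for log_ar in LOG_ASPECT_RATIOS:
    w, h = ellipse_axes(log_ar)
    print(f"{log_ar:+.3f}: width = {w:.2f} deg, height = {h:.2f} deg")
```

Splitting the ratio symmetrically between the two axes (multiplying one by the square root and dividing the other) is what keeps the area constant across the set.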

Ellipse pairs were presented in horizontal or vertical spatial organizations (Figure 1a). Horizontal pairs spanned the vertical meridian (i.e., with an ellipse in both the left and right visual hemifields) and were presented in either the upper or lower visual field. We refer to this as the between-hemifield condition. Vertical pairs were presented solely within either the left or right visual hemifield. We refer to this as the within-hemifield condition. Ellipses were closer to each other in the between-hemifield condition (6.4°, center to center) than in the within-hemifield condition (7.2°, center to center). Each ellipse was presented along an iso-acuity orbit (Rovamo & Virsu, 1979) around the fixation point. This ensured that, regardless of the distances between pairs of ellipses, each individual shape would be seen with the same visual acuity, 4.6° from the fixation point.

Figure 1

(a) Typical trial sequence, drawn to scale. Each trial contained a pair of shapes. In the within-hemifield condition, the pair of shapes appeared entirely in the left or right visual hemifield. In the between-hemifield condition, one shape appeared in the left visual field and one shape appeared in the right visual field, both above or both below fixation. Each shape was surrounded by a quartet of black masking dots. On some trials, a set of masking dots remained on the screen after the offset of the shapes, presumably masking the shape that appeared in that location. On other trials, all masking dots offset with the shapes and were followed by a blank screen instead of trailing dots. The post-cue, a centrally presented arrow pointing up, down, to the left, or to the right, indicated which shape from the pair the observer should rate in terms of its aspect ratio. (b) The magnitude-matching screen.

These particular spatial arrangements allowed us to gather additional information about the mechanisms of shape interactions. Spatial proximity may be the primary factor in determining how strongly an object might distort the appearance of another object seen nearby. If this were true, then perceptual averaging would be greatest when the shapes were physically closer, in the between-hemifield condition. This between-hemifield arrangement ensured that an effect of spatial proximity would have to occur despite greater cortical distance between the representations of each shape. Alternatively, cortical proximity may be the primary factor in determining the magnitude of shape interactions. If this were true, then perceptual averaging would be strongest in the within-hemifield condition. In this case, despite increased spatial distance, the populations of cells representing each shape would still be able to communicate via local connectivity within each cerebral hemisphere. We selected this approach for pitting spatial and cortical proximity against each other based on a similar investigation with faces (Sweeny, Grabowecky, Paller, & Suzuki, 2009).

Procedure

Observers began the experiment by completing five randomly generated practice trials with the experimenter. Observers were allowed to complete additional practice trials until they indicated that they were comfortable with the experimental procedures. Each trial began with the presentation of a fixation point (0.2 × 0.3°) at the center of the screen for a randomly determined duration between 1,000 msec and 1,500 msec. The experimenter encouraged observers to hold their gaze on the fixation point, emphasizing that looking elsewhere would not improve task performance since the location of the cued ellipse would not be apparent until after the shapes had disappeared. Next, an ellipse pair appeared for 20 msec, either in a within-hemifield or between-hemifield arrangement, which was determined randomly on each trial (Figure 1a). We used this brief presentation to increase the effectiveness of masking and to prevent observers from making deliberate saccades or shifts of attention to either of the shapes. Each ellipse was surrounded by four black dots. On trials with no masking, the flanking dots and ellipses offset simultaneously and were followed by a blank gray screen for 240 msec. On trials with masking, one set of masking dots offset with one ellipse from the pair (the unmasked ellipse), whereas the masking dots surrounding the other ellipse remained on the screen for an additional 240 msec. Previous investigations have shown that feedback activity tends to arrive in early visual areas with a latency of 80-120 msec (Jannati et al., 2013) and that the timing of this reentrant activity is related to the effectiveness of OSM (Kotsoni et al., 2007). Thus, according to re-entrant accounts of OSM, our 240-msec lag time should have been more than adequate to induce strong masking during late stages of object representation (Enns, 2004). 
The location of the trailing mask was counterbalanced so that it appeared around each shape in each spatial arrangement an equal number of times.
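The trial timing described above can be summarized in a small sketch. This is an illustration only, not the Psychophysics Toolbox code used in the experiment. The function name `build_timeline`, the phase labels, and the choice of a uniform distribution over whole milliseconds for the fixation duration are assumptions. The 800-msec arrow cue that follows the 240-msec interval is taken from the procedure described in the next paragraph.

```python
import random

STIMULUS_MS = 20    # duration of the ellipse pair and its flanking dots
TRAILING_MS = 240   # trailing mask dots (masking) or blank screen (no masking)
CUE_MS = 800        # duration of the arrow post-cue


def build_timeline(masked, rng=random):
    """Return a list of (phase, onset_ms, duration_ms) tuples for one trial."""
    # Assumption: fixation duration is drawn uniformly from 1000-1500 ms.
    fixation_ms = rng.randint(1000, 1500)
    interval = "trailing mask dots" if masked else "blank screen"
    phases = [
        ("fixation", fixation_ms),
        ("ellipse pair + dots", STIMULUS_MS),
        (interval, TRAILING_MS),
        ("arrow post-cue", CUE_MS),
    ]
    timeline, onset = [], 0
    for name, duration in phases:
        timeline.append((name, onset, duration))
        onset += duration
    return timeline


for phase in build_timeline(masked=True, rng=random.Random(0)):
    print(phase)
```

In both trial types the post-cue begins 260 msec after the onset of the ellipses, so the only difference between masking and no-masking trials is whether one set of dots stays on the screen during the 240-msec interval.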

Regardless of whether a trial was intended to induce masking or not, we still presented flanking dots during the presentation of each ellipse. This prevented observers from identifying the to-be-rated ellipse while the pair was on the screen. Observers only learned the location of the ellipse to be rated after the shapes and any masking dots had offset, at which time an arrow (1.2 × 1.2°, luminance = 0.86 cd/m²) replaced the fixation point for 800 msec, pointing up or down in the within-hemifield condition, or left or right in the between-hemifield condition (Figure 1a). After the arrow cue, observers viewed a magnitude-matching screen consisting of 10 ellipses, paired with response numbers 1-10 (Figure 1b). As in Sweeny et al. (2011b), we excluded a circle from the magnitude-matching screen to preclude observers who were not confident about their response from selecting a circle by default. On each trial, observers selected the response ellipse with an aspect ratio that most closely matched their perception of the cued ellipse. The same screen also prompted observers to indicate how many ellipses from the pair they were able to see clearly (1 or 2) using the left and right arrow keys. This inquiry about subjective awareness allowed us to sort our data according to phenomenology on a trial-by-trial basis and thus separately measure the extent to which OSM interfered with shape interactions with and without simultaneously disrupting visual awareness. Observers made their aspect ratio response before completing their awareness response (see Footnote 2). The response screen remained visible until both responses were recorded, which triggered the start of the next trial. The experiment included 480 trials and lasted approximately 50 minutes.

Results

General aspect ratio sensitivity

Before evaluating interactive effects between shapes, we first confirmed that observers were indeed using information from the shapes to guide their responses by calculating the slope of the relationship between the physical aspect ratios of the ellipses and their perceived aspect ratios using the 1-10 aspect ratio scale from the magnitude-matching screen. For simplicity, we conducted this preliminary analysis only on trials in which the ellipses in each pair were identical, and then collapsed across data from the within- and between-hemifield spatial arrangements. With an average slope of 1.04 (SD = 0.505), observers were very sensitive to the aspect ratios of the ellipses (compared against a slope of zero using a one-sample t-test: t[29] = 11.37, p < 0.001, d = 2.07). Thus, any effect of shape attraction in the analyses below would occur over and above this general sensitivity to aspect ratio.
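The sensitivity check above amounts to fitting one regression slope per observer and then testing the set of slopes against zero. The sketch below shows that logic using only the Python standard library. It is not the authors' analysis code; the helper names `ols_slope` and `one_sample_t` are illustrative, and the data passed to them are hypothetical. Cohen's d is computed here as the mean divided by the sample standard deviation, which is an assumption about how the reported effect size was obtained.

```python
import math
from statistics import mean, stdev


def ols_slope(physical, perceived):
    """Least-squares slope of perceived ratings regressed on physical values."""
    mx, my = mean(physical), mean(perceived)
    sxx = sum((x - mx) ** 2 for x in physical)
    sxy = sum((x - mx) * (y - my) for x, y in zip(physical, perceived))
    return sxy / sxx


def one_sample_t(values, null=0.0):
    """Return (t, degrees of freedom, Cohen's d) for a one-sample t-test."""
    n = len(values)
    m, sd = mean(values), stdev(values)
    t = (m - null) / (sd / math.sqrt(n))
    d = (m - null) / sd
    return t, n - 1, d


# Hypothetical per-observer slopes, for illustration only.
slopes = [0.9, 1.3, 0.6, 1.1, 1.4]
t, df, d = one_sample_t(slopes)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```

A slope near zero would indicate that ratings did not track the physical aspect ratio at all, which is why zero is the natural null value for this test.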

Aspect ratio attraction index

To account for individual biases in perceiving either a tall or a flat aspect ratio, we created an aspect-ratio-attraction index based on the difference between the rating of each ellipse when paired with a separate ellipse with a different aspect ratio (e.g., a circle and a tall ellipse) and that same ellipse’s rating when paired with itself (e.g., two circles; see Footnote 3). We performed this computation separately for each observer. The sign of the attraction index was coded such that a positive value reflected a response in the direction toward the aspect ratio of the paired ellipse (attraction) and a negative value reflected a response away from the aspect ratio of the paired ellipse (repulsion). For example, if a circle appeared taller when paired with a tall ellipse than when paired with another circle, the attraction index would have a positive sign.
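The sign coding of the attraction index can be made concrete with a minimal sketch. It is an illustration under stated assumptions rather than the authors' code: the function name `attraction_index` is hypothetical, and the `taller_is_higher` flag encodes the assumption that larger response numbers on the 1-10 magnitude-matching screen correspond to taller ellipses, which the text does not specify.

```python
def attraction_index(rating_paired, rating_self, cued_log_ar, paired_log_ar,
                     taller_is_higher=True):
    """Signed shift of a cued ellipse's rating toward its paired ellipse.

    rating_paired: mean rating of the cued ellipse when paired with a
                   different ellipse.
    rating_self:   mean rating of the same ellipse when paired with itself
                   (the baseline that removes individual bias).
    Positive values indicate attraction, negative values indicate repulsion.
    """
    if paired_log_ar == cued_log_ar:
        raise ValueError("The index is undefined for identical pairs.")
    shift = rating_paired - rating_self
    # +1 if moving toward the paired ellipse means a higher rating, else -1.
    toward = 1.0 if paired_log_ar > cued_log_ar else -1.0
    if not taller_is_higher:  # assumption about the response scale
        toward = -toward
    return toward * shift


# A circle rated 5.5 next to another circle and 6.0 next to a tall ellipse
# appears taller in the second case, so the index is positive (attraction).
print(attraction_index(6.0, 5.5, cued_log_ar=0.0, paired_log_ar=0.221))
```

Subtracting the self-paired baseline first means that an observer who rates every shape as slightly too tall contributes no spurious attraction or repulsion.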

Aspect ratio attraction: Trial-type analysis

We began our analysis of aspect ratio attraction as simply as possible, sorting the data based on the presence or absence of masking dots independent of the experience of the observer. First, we conducted a repeated-measures ANOVA on the attraction index with factors of masking (OSM, no OSM), cued ellipse (flat, circle, tall), and arrangement (within-, between-hemifield). This analysis revealed a main effect of masking, F(1,29) = 10.17, p = 0.003, ηp² = 0.259, a main effect of cued ellipse, F(2,28) = 6.617, p = 0.004, ηp² = 0.321, and a main effect of arrangement, F(1,29) = 14.62, p = 0.001, ηp² = 0.335. None of the interactions between these factors were significant. The main effect of masking confirmed our prediction that perceptual attraction between shapes would be stronger in the absence of masking. A one-sample t-test against a null value of zero confirmed that attraction occurred in the no-masking condition, t(29) = 3.85, p < 0.001, d = 0.703. Surprisingly, a significant, albeit weakened, effect of attraction also persisted in the masking condition, t(29) = 2.59, p = 0.01, d = 0.473 (Figure 2).

Figure 2

Effects of aspect-ratio attraction based on the presence of masking. The attraction index—a metric of the amount of perceptual averaging between two ellipses—is shown separately for the no masking and masking trials, collapsed across the aspect ratios of the cued ellipse. Error bars represent ±1 SEM that has not been corrected for multiple comparisons in order to emphasize comparisons against a null value of zero.

The main effect of cued ellipse reflected the fact that attraction was stronger when pairs included circles, which we confirmed with paired-samples t-tests (pairs with circles vs. pairs with flat ellipses; t[29] = 3.02, p = 0.005, d = 0.552, pairs with circles vs. pairs with tall ellipses; t[29] = 3.14, p = 0.003, d = 0.574). This was likely because the aspect ratio difference between shapes in pairs with circles (log transformed difference = 0.221) was greater than in pairs that did not include circles (log transformed difference = 0.18), suggesting that the strength of attraction increases with the physical (or perceptual) distinction between shapes.

The main effect of arrangement revealed that ellipses presented in different visual hemifields produced stronger attraction effects than ellipses presented within the same hemifield. This suggests that spatial proximity was more important for influencing attraction between shapes than cortical proximity, since shapes in the between-hemifield condition were closer in space (yet farther in cortical distance) to one another than shapes in the within-hemifield condition. This had nothing to do with OSM: the effect of arrangement did not interact with the effect of masking, and we also observed a trend for this pattern when we analyzed data from the no-masking trials alone, t(29) = 1.73, p = 0.09, d = 0.316. Overall, this result is notable, because it suggests that attraction effects do not simply reflect a generic response bias that occurs indiscriminately whenever two objects are seen at the same time. Rather, they tend to increase with spatial proximity. The lack of other interactions in our ANOVA indicated that the presence of masking diminished attraction similarly regardless of the aspect ratios or arrangements of the ellipses.

We also conducted an exploratory analysis to evaluate the extent to which redundancy across shape pairs influenced the effectiveness of OSM. We recorded the proportion of trials in the masking condition in which each observer indicated having seen only one ellipse from the pair. We did this separately for trials in which the two ellipses from the pair were identical (e.g., two circles) or different (e.g., a circle and a flat ellipse). We then conducted a paired-samples t-test on the success of OSM as a function of ellipse pairing (same, different). OSM was less effective at eliminating awareness when the two ellipses in a pair had the same aspect ratio (M = 29% of trials, SD = 20%) than when they had different aspect ratios (M = 40% of trials, SD = 27%), t(29) = 3.692, p < 0.001, d = 0.674. This surprising result is consistent with models of redundancy gain, in which identical stimuli receive especially strong visual representation via probability summation or signal integration (Guzman-Martinez, Grabowecky, Palafox, & Suzuki, 2011).

Aspect ratio attraction: Experience-based analysis

In the preceding analyses, trials were categorized as masking or no-masking based solely on whether masking dots lingered after the offset of the shapes on that trial. Although this kind of approach is favored more often than not in studies of OSM, it does not take into account an observer’s subjective visual experience. Thus, the analyses reported above do not necessarily account for the phenomenology and visual awareness of each observer. We therefore repeated our primary analysis from above, this time using each observer’s reports of subjective awareness to assign particular trials to each condition. This allowed us to evaluate the extent to which shape attraction occurred as a function of OSM’s effect on visual awareness. We limited the no-masking condition to contain data only from trials in which (a) masking dots did not linger after the offset of the shapes and (b) observers reported seeing two shapes. We divided the data from trials in which the dots remained visible after the ellipses disappeared—the original masking condition—into two new sub-conditions: the masking/aware condition, which contained data only from trials in which observers nevertheless reported seeing both shapes, and the masking/unaware condition, which contained data only from trials in which observers reported seeing just one shape. The reassignment of data across these different conditions was unpredictable since it depended entirely on each observer’s subjective threshold for reporting visual awareness. We thus collapsed our data across the conditions less central to our investigation (e.g., spatial arrangement, paired ellipse, etc.) to minimize the likelihood of missing data and observers. We focused specifically on measuring the overall effect of attraction in the no-masking, masking/aware, and masking/unaware conditions, only including observers who provided data for each condition.
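The reassignment rule described above can be sketched as a small classification step. This is an illustration, not the authors' analysis code. The names `classify_trial` and `condition_means` are hypothetical, and the per-trial attraction values passed in are a simplification, since the index in the text is computed from ratings relative to a self-paired baseline.

```python
from statistics import mean


def classify_trial(trailing_mask, n_seen):
    """Assign a trial to an experience-based condition, or None if excluded.

    trailing_mask: True if masking dots lingered after the shapes offset.
    n_seen:        number of ellipses the observer reported seeing (1 or 2).
    """
    if not trailing_mask:
        # No-masking trials are kept only when both shapes were reported.
        return "no-masking" if n_seen == 2 else None
    return "masking/aware" if n_seen == 2 else "masking/unaware"


def condition_means(trials):
    """Average a per-trial measure within each experience-based condition.

    trials: iterable of (trailing_mask, n_seen, value) tuples for one observer.
    Conditions with no trials are simply absent from the returned dict.
    """
    buckets = {}
    for trailing_mask, n_seen, value in trials:
        condition = classify_trial(trailing_mask, n_seen)
        if condition is not None:
            buckets.setdefault(condition, []).append(value)
    return {condition: mean(values) for condition, values in buckets.items()}


# Hypothetical trials for one observer, for illustration only.
example = [(False, 2, 0.4), (False, 1, 9.0), (True, 2, 0.2),
           (True, 1, 0.1), (True, 1, 0.3)]
print(condition_means(example))
```

Because a condition can end up empty for a given observer, a group analysis built on this step would need to check which observers contribute data to every condition, as the text describes.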

A one-sample t-test against a null value of zero confirmed a significant effect of attraction in the no-masking condition, t(28) = 3.48, p = 0.001, d = 0.647 (Figure 3). One observer did not produce data for this condition, and was omitted from this particular analysis. Attraction also occurred on trials in which the masking dots were present but observers still reported seeing two shapes—the masking/aware condition, t(29) = 2.56, p = 0.01, d = 0.468 (Figure 3). Twenty-eight of our 30 observers produced trials in which they only reported awareness of one shape from the pair—the masking/unaware condition. Surprisingly, we found a significant effect of attraction even in this condition, t(27) = 2.09, p = 0.04, d = 0.396. We conducted paired-samples t-tests to compare attraction effects across these three conditions using only data from the 28 observers with no missing data. Attraction was stronger in the no-masking condition than in the masking/unaware condition, t(27) = 2.25, p = 0.03, d = 0.425. There was a trend for stronger attraction in the no-masking condition than in the masking/aware condition, t(27) = 1.71, p = 0.09, d = 0.324, and no difference between the masking/aware and masking/unaware conditions, t(27) = 0.995, p = 0.32, d = 0.188.

Figure 3

Effects of aspect ratio attraction based on the presence of masking dots (no masking, masking) and the phenomenology on each trial (aware; A(+), unaware; U(−)). Error bars represent ±1 SEM that has not been corrected for multiple comparisons to emphasize comparisons against a null value of zero.

Ruling out response bias

It was important to rule out the possibility that our main results could have emerged from a simple and common bias to select responses from the center of the magnitude-matching range (Crawford, Huttenlocher, & Engebretson, 2000; Duffy, Huttenlocher, Hedges, & Crawford, 2010; Hollingworth, 1910) and not actual averaging between the aspect ratios of the ellipses. For example, on trials in which a circle was paired with a tall ellipse and the tall ellipse was cued as the shape to be rated, an observer randomly pressing the buttons from the middle of the response range would have produced data suggestive of attraction. To determine whether this occurred, we calculated the attraction index separately for ellipse pairings for which an actual effect of attraction would have elicited a response toward the center of the response range (e.g., a cued tall shape paired with a circle), and also for pairings in which attraction would have elicited a response away from the center of the response range (e.g., a cued circle paired with a tall shape). For simplicity, we calculated these scores separately for trials from the no-masking condition. Attraction was actually stronger when it pushed responses away from, rather than toward the center of the response range, t(29) = 2.61, p = 0.01, d = 0.476. This demonstrates that a center-bias cannot account for our attraction effects and instead suggests the presence of a perceptual effect. Additionally, this pattern is consistent with reports that perception of simple and complex features tends to be distorted away from null values or categorical boundaries (Mareschal, Morgan, & Solomon, 2008; Solomon, 2000; Sweeny, Grabowecky, Kim, & Suzuki, 2011; Sweeny, Haroz, & Whitney, 2012).

Hierarchical shape interactions

Previous work showed that when multiple shapes were presented near each other, their global organization influenced the local perception of each shape within the pair (Sweeny et al., 2011b). For example, when a pair of ellipses was seen in a vertical global organization (e.g., one shape above the other, as in our within-hemifield condition), each shape within the pair appeared slightly taller, over and above any local distortions occurring simultaneously between the shapes in the pair. We analyzed our data to determine if a similar global effect occurred in the current investigation. For simplicity, we used data from trials in which the ellipses within each shape pair were identical, and we evaluated this effect separately for trials from the no-masking and masking conditions (according to trial type, as in our original analysis above). We conducted a repeated-measures ANOVA on raw aspect-ratio responses (not on the attraction index) with factors of organization (vertical in the within-hemifield condition, and horizontal in the between-hemifield condition) and mask (no-masking, masking). This analysis revealed a significant main effect of organization, F(1,29) = 22.56, p < 0.001, ηp² = 0.437, indicating that ellipses were perceived as taller when they were presented in a vertical organization than when they were presented in a horizontal organization. There were no other significant main effects or interactions, indicating that this effect was equally strong regardless of the persistent presence of masking dots around the second shape in the pair. This may seem surprising, because interfering with the presence of the second shape in each pair could have weakened the extent to which a global organization was perceived. Yet even if the second ellipse was not visible, its masking dots would have been present throughout the trial, presumably contributing to the presence of a vertical or horizontal pattern at the global level. 
In general, this effect is consistent with reverse-hierarchical models of visual processing which propose that awareness of visual information proceeds from rapid global-level analyses to more detailed analyses of local object features (Hochstein & Ahissar, 2002). We note that this global effect is orthogonal to our primary effect of shape-to-shape distortions, and it appears to rely on different mechanisms (Sweeny et al., 2011b).
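As a rough illustration of the organization main effect reported above: in a fully within-subject 2 × 2 design, the main effect of a two-level factor is equivalent to a paired t test on each observer's marginal means, with F = t² and partial eta squared = F / (F + error df). The sketch below assumes that equivalence; the function name, dictionary keys, and four-observer data are hypothetical and not taken from the study.

```python
import math
import statistics

def organization_main_effect(cells):
    """Main effect of organization in a 2 x 2 within-subject design.

    cells: list of per-observer dicts keyed by (organization, mask),
    each value being that observer's mean aspect-ratio response.
    """
    diffs = []
    for obs in cells:
        vertical = (obs[("vertical", "no_mask")] + obs[("vertical", "mask")]) / 2
        horizontal = (obs[("horizontal", "no_mask")] + obs[("horizontal", "mask")]) / 2
        diffs.append(vertical - horizontal)
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    f = t ** 2                       # F(1, n - 1) for a two-level factor
    df_error = n - 1
    eta_p2 = f / (f + df_error)      # partial eta squared
    return f, df_error, eta_p2

# Hypothetical per-observer cell means (not real data)
data = [
    {("vertical", "no_mask"): 0.12, ("vertical", "mask"): 0.10,
     ("horizontal", "no_mask"): 0.03, ("horizontal", "mask"): 0.05},
    {("vertical", "no_mask"): 0.08, ("vertical", "mask"): 0.11,
     ("horizontal", "no_mask"): 0.06, ("horizontal", "mask"): 0.02},
    {("vertical", "no_mask"): 0.15, ("vertical", "mask"): 0.09,
     ("horizontal", "no_mask"): 0.01, ("horizontal", "mask"): 0.04},
    {("vertical", "no_mask"): 0.07, ("vertical", "mask"): 0.06,
     ("horizontal", "no_mask"): 0.05, ("horizontal", "mask"): 0.03},
]
f, df_error, eta_p2 = organization_main_effect(data)
print(f"F(1,{df_error}) = {f:.2f}, partial eta squared = {eta_p2:.3f}")

# Consistency check on the reported statistics
reported_eta_p2 = 22.56 / (22.56 + 29)
print(f"partial eta squared implied by F(1,29) = 22.56: {reported_eta_p2:.3f}")
```

The value implied by F(1,29) = 22.56 is approximately 0.44, in line with the reported effect size of 0.437.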

Discussion

We showed that visual analyses central to the perception of shapes persist despite object-substitution masking, even when subjective visual awareness of an object is completely eliminated. Specifically, when two shapes were presented briefly and simultaneously, each shape’s aspect ratio appeared more similar to that of the other. This perceptual attraction still occurred when one shape from a pair was the target of object-substitution masking, albeit to a weakened extent. Strikingly, attraction between a pair of shapes even persisted when observers reported seeing only one object from the pair. These results are important, because they deepen understanding of the extent to which visual analyses that occur in intermediate stages of the ventral pathway may (or may not) be bound to the mechanisms of visual awareness. They also expand the boundaries of neural computations known to persist in the face of object-substitution masking.

The attractive effects in this investigation were not due to response bias. Additional analyses confirmed that effects of attraction could not simply be accounted for by observers selecting from the middle of the response range. Our findings would also have been unlikely had observers accidentally rated the uncued ellipse from a pair, even on a subset of trials. If this had occurred, it is difficult to understand why the more nuanced results from our analyses would have emerged, such as stronger averaging with closer spatial arrangement. Most important, we still observed a significant effect of perceptual attraction on trials in which the observers did not even see the uncued ellipse.

By demonstrating that a shape’s aspect ratio can be encoded even when that shape is invisible, and that this representation has consequences for the perception of other objects that are clearly visible, the current investigation advances understanding of the depth of object-substitution masking, and potentially its mechanisms as well. Previous work showed that information encoded in early stages of visual analysis, like an object’s size (Chong, Joo, Emmanouil, & Treisman, 2008), can still influence perception when suppressed from awareness by OSM (Choo & Franconeri, 2010). Specifically, when observers viewed a collection of circles with varying diameters (some of which received OSM and some of which did not), the sizes of the masked objects still biased estimates of the average circle size in the group. Choo and Franconeri interpreted these results as indicating that the process of size averaging can proceed with undiminished strength based exclusively on an early feedforward sweep of information likely to persist during OSM, without the need for reentrant processes that occur later in time (Di Lollo et al., 2000; Jannati et al., 2013; Kotsoni et al., 2007). We demonstrated a similar effect of attraction in the perception of aspect ratio, a more complex visual feature encoded in intermediate stages of the ventral pathway (e.g., V3/VP and V4; Dumoulin & Hess, 2007). If OSM does isolate an early stream of feedforward processing (a feature of both substitution and updating accounts of OSM), then our results would suggest that this initial wave of visual analysis remains relatively robust even at intermediate stages of the ventral pathway. It is notable that OSM did reduce the strength of aspect ratio attraction. In this regard, our findings dovetail nicely with those from Jacoby et al. (2013), who also found some weakening from OSM on the perception of orientation and size.
Assuming that feedforward processing in OSM is free of inhibitory contour interactions (Di Lollo, 2014), how might this decay have arisen? One possibility is that the relative strength of visual signal compared to internal noise may degrade as information about an object travels from low to high levels of the ventral pathway. Indeed, neural noise is known to accumulate up successive stages of visual processing (Faisal et al., 2008). Alternatively, if one instead assumes that OSM directly interferes with feedforward activation (Macknik & Martinez-Conde, 2007; Põder, 2012), this decay could simply reflect more opportunities for inhibition at each stage of visual analysis. Determining which of these characterizations is more accurate was not the goal of our investigation. However, we suspect that all models of OSM could still benefit by incorporating the pattern of reduced representation without awareness across the visual hierarchy highlighted by the current work.

It is reasonable to wonder at which stage of visual analysis the lingering activity that survives OSM might be too weak to influence perception. Other researchers have demonstrated that even during OSM, information from other objects like shapes (Prime, Pluchino, Eimer, Dell’acqua, & Jolicœur, 2011; Woodman & Luck, 2003) and arrows (Chen & Treisman, 2009) can still be processed with enough strength to influence attention. Semantic processing (Goodhew, Visser, Lipp, & Dux, 2011) and categorization of letters (Goodhew, Greenwood, & Edwards, 2016) are known to persist as well. Processes such as feature integration also might occur during OSM (Chakravarthi & Cavanagh, 2009; although see Gellatly et al., 2006; Jacoby et al., 2012). Considering that processing of aspect ratio was nearly eliminated in the current investigation, one might expect that even more complex objects, such as faces, might show little to no evidence of representation when they receive OSM. Indeed, OSM has been shown to eliminate face-specific processing as measured by EEG (Reiss & Hoffman, 2007). This gradual decay across the visual hierarchy seems to be superficially at odds with the fact that some complex object discriminations appear to be possible based on feedforward activity alone (Rousselet, Macé, & Fabre-Thorpe, 2003; VanRullen & Koch, 2003; VanRullen & Thorpe, 2001). However, it is important to remember that these so-called ultrarapid categorizations were not made in the context of masking; thus, direct comparisons must be made with caution. It may be the case that, just as with metacontrast masking (Haynes, Driver, & Rees, 2005), sandwich masking (Harris, Wu, & Woldorff, 2011), and backward masking (Rodríguez et al., 2011), neural activation during OSM tends to be tightly coupled with an object’s visibility in late stages of the ventral pathway, at least more than in earlier or intermediate stages of analysis.

This investigation converges with a few recent studies to illustrate the importance of carefully considering distinct types of phenomenology that can occur during OSM (Gellatly et al., 2006; Harris et al., 2011; Harrison et al., 2016; Kahan & Enns, 2010; Prime et al., 2011). Masking is typically defined simply as interference in the perception of a stimulus, and it is classically measured as a reduction in the ability to correctly report details about a visual feature (Breitmeyer & Ogmen, 2006). Elimination of visual awareness often coincides with masking, so much so that the two phenomena are sometimes conflated. Yet masking need not eliminate detection of an object altogether in order to disrupt perception of its features. For example, crowded objects often are still visible despite being difficult to discriminate (Whitney & Levi, 2011), and backward-masked objects can be detected without being identified (Mack, Gauthier, Sadr, & Palmeri, 2008). Similar dissociations have been considered in the study of OSM, both in terms of measuring detection and discrimination separately (Gellatly et al., 2006) and treating consciousness as a continuous variable (Harrison et al., 2016). By sorting masking data based on each observer’s visual experience on a trial-by-trial basis (Prime et al., 2011), we were able to refine our analyses, measuring the extent to which interference from OSM may or may not have depended on accompanying changes in visual awareness. We found that a shape’s ability to induce perceptual attraction was substantially reduced when OSM was powerful enough to render it invisible. Nevertheless, we also found a trend for this same reduction when that shape was still visible. This interference may reflect a more general effect of visual crowding from the presence of the masking dots (Kahan & Enns, 2010).
We tentatively interpret this general pattern as indicating that disruptions of visual awareness tend to co-occur when OSM strongly disrupts visual processing, but disrupted processing by no means guarantees a loss of awareness. In any case, it is reasonable to wonder how the strength of well-known and important effects that emerge during object-substitution masking (e.g., automatic attraction of attention, Woodman & Luck, 2003) also might vary as a function of phenomenology.

OSM has proven to be important in vision science, both for practical purposes like disrupting perception and for advancing understanding of the algorithms and mechanisms that underlie everyday visual experience. Any investigation that clarifies the kinds of processes that can and cannot occur during OSM thus has the potential to reveal more general insights about vision and awareness. We showed that complex visual analyses central to the perception of shapes persist, albeit with weakened strength, despite object-substitution masking and disruptions of visual awareness. These results add to growing evidence that visual processing and awareness become more tightly coupled at successive stages of the visual hierarchy (Haynes et al., 2005).