In standard models, visual masking is understood as a consequence of inhibition or interference associated with the mask’s spatially overlapping or adjacent edges with the target, or with the transients associated with the mask’s delayed onset (Kahneman, 1968; Breitmeyer, Hoar, Randall, & Conte, 1984). The phenomenon of object substitution masking (OSM), first reported by Enns and Di Lollo (1997), has been argued to pose a challenge to standard models. In OSM, a mask consisting of just four surrounding dots is sufficient to prevent awareness of the target when the mask lingers after the target offset, the duration of the trailing mask being associated with the strength of masking (Di Lollo, Enns, & Rensink, 2000). In OSM, the mask, in being comprised of just four dots, contains no significant overlapping or adjacent edges; the onset of the mask does not seem to play any special role (OSM occurs irrespective of whether the mask onsets simultaneously with the target or with a delayed onset). Instead it has been suggested that the processes in OSM are object based, masking being a reflection of the process by which mask and target compete with each other as separate perceptual objects for conscious representation (Enns & Di Lollo, 1997; Di Lollo et al., 2000).

The original descriptions of OSM strongly emphasised the importance of attention as a factor in masking (Enns & Di Lollo, 1997; Di Lollo et al., 2000; Di Lollo, Enns, & Rensink, 2002; Enns, 2004). The reason for this was that initial empirical studies of OSM seemed to indicate that masking only occurred when the target and mask were presented in the context of multielement displays; with just the target and mask alone, OSM—as indexed by the difference in performance between simultaneous and delayed mask offset conditions—seemed to be absent (Enns & Di Lollo, 1997). Later studies found what seemed to be a systematic relationship between set size (i.e. the number of display items) and the magnitude of OSM (Di Lollo et al., 2000; Kotsoni, Csibra, Mareschal, & Johnson, 2007).

More recently, however, a number of studies have reported results which challenge the status of attention in OSM (Argyropoulos, Gellatly, Pilling, & Carter, 2013; Filmer, Mattingley, & Dux, 2014, 2015; Filmer, Wells-Peris, & Dux, 2017; Goodhew & Edwards, 2016; Pilling, Gellatly, Argyropoulos, & Skarratt, 2014). For instance, both Argyropoulos et al. (2013) and Filmer et al. (2014) failed to observe a Set Size × Mask Duration interaction in their data. Both sets of authors claim that the interactions reported in the original experiments of Di Lollo et al. (2000) were artefactual, the product of ceiling effects in the smaller set-size conditions (particularly when set size = 1). When, as in these later studies, the discrimination task was made more difficult to bring performance in the smaller set-size conditions into a measurable range, a masking effect in these conditions became apparent. Under such conditions set size had a clear main effect on performance; however, the interaction with mask duration was no longer found. More recently, Filmer et al. (2015) showed that OSM can even occur under conditions where the target is the sole focus of attention and is presented at fixation.

Together these findings suggest that the original claims regarding the status of attention as a variable in OSM were, at best, overstated. It seems that attention has, if at all, only a small effect on OSM. Certainly, at the very least, the role of attention cannot be considered a signature aspect of the OSM phenomenon as was originally claimed.

Though the role of attention is ostensibly small and though the presence of distractors has been demonstrated to be unnecessary for OSM to occur, recent research has suggested that distractors, where present, can influence OSM at least under some circumstances. Camp, Pilling, Argyropoulos, and Gellatly (2015), in contrast to the earlier described findings of Argyropoulos et al. (2013) and Filmer et al. (2014), found a reliable effect of set size on masking. Although OSM occurred without distractors, adding distractors to the display reliably increased the size of OSM. However, a further experiment showed that this effect was not a consequence of the changes in set size as Di Lollo et al. (2000) had earlier assumed. Rather, this effect was explained by the relative position of the distractors in the display with respect to the target. Where distractors were positioned to closely flank the target location, OSM was stronger than when the distractors flanked a location opposite the target. This effect was found irrespective of overall set size. Camp et al. attributed the increased OSM which occurred with flanking distractors (hereafter ‘flankers’) to an effect of crowding on OSM.

Crowding is a well-established visual phenomenon (Levi, 2008; Whitney & Levi, 2011). One widely held theory of crowding treats it as a consequence of neural pooling or signal averaging. On this account the features of a target and those of sufficiently closely located flankers become mingled together, the result being that the visual system is unable to bind only the appropriate features to the token representation of the target (Parkes, Lund, Angelucci, Solomon, & Morgan, 2001; Levi & Carney, 2009; Greenwood, Bex, & Dakin, 2009). The interaction between OSM and crowding is interesting because it suggests the two phenomena, though distinct, share common mechanisms.

Camp et al. (2015) argued that crowding the target degraded the initial target percept and in doing so rendered it more susceptible to the trailing mask. They argued that the converse possibility, that OSM influenced crowding, was ruled out as an explanation of the interaction. This was argued on the basis of previous empirical findings and theoretical claims which suggest that OSM occurs as a later stage process than crowding within the visual processing hierarchy (Breitmeyer, 2014; Chakravarthi & Cavanagh, 2009).

Aside from OSM, some other forms of masking can influence crowding (Vickery, Shim, Chakravarthi, Jiang, & Luedeman, 2009). Vickery et al. (2009) presented a brief target in a location directly below the observer’s fixation. On unmasked trials no mask was present; on masked trials a surround ring (in a later experiment, a surround square) was presented around the target and onset and coterminated with it. Flankers were located at each of the four cardinal positions around the target at one of three increasing distances from the target. This flanker position manipulation was done on both unmasked and masked trials. On unmasked trials, a classic crowding effect was observed: Accuracy was low when flankers were closest to the target and much higher when at the middle and furthest distances given. With these outer two distances accuracy was the same as a baseline unmasked condition in which no flankers were present. On masked trials with flankers at the nearest position, accuracy was similarly low to that found on unmasked trials. However, unlike for unmasked trials, accuracy remained low for the middle and furthest flanker distances compared against a no flanker masked baseline. Thus, when the target was masked the flankers continued to have a deleterious effect on performance across a broader spatial range than they did under unmasked conditions. This spatially extended crowding effect the authors dubbed ‘supercrowding’. This effect occurred despite the fact that masking individually had only a marginal effect on performance.

The current study had two aims. The first was to attempt to replicate the finding of Camp et al. (2015) that crowding and OSM interact. In Camp et al. crowding was only specifically manipulated in one of the four experiments. Given this, it is important to demonstrate that this interaction is a replicable one. The study’s second aim was to more thoroughly explore the nature of the interaction. Specifically, the aim was to determine if the interaction is better understood as an effect of crowding on OSM (Camp et al., 2015) or some other process, such as OSM affecting crowding (Vickery et al., 2009). Camp et al. (2015) manipulated crowding only in a coarse way; the spatial character of crowding under masked and unmasked conditions was not determined in their experiment. These limitations make it difficult to determine what the relationship between crowding and OSM actually is. The current set of experiments aimed to provide a clearer picture on this relationship by presenting a greater number of crowding conditions, ones which allowed the spatial profile of the crowding effect to be determined under masked and unmasked conditions.

Crowding is strongly sensitive to the spatial distance between the flankers and the target; indeed, crowding is typically operationalised in terms of this variable (Bouma, 1970; Whitney & Levi, 2011; Pelli & Tillman, 2008). Crowding is typically maximal when the flankers are nearest to the target, and the effect declines monotonically as the distance is increased. The critical spacing for crowding depends on target eccentricity, increasing proportionally with the distance of the target from fixation. The effective distance for crowding tends to be approximately half the target’s distance from fixation, though the range of the effect does depend on several other factors such as the position of the target and flankers with respect to fixation (Pelli & Tillman, 2008).
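The half-eccentricity rule of thumb described above can be sketched numerically. The following is an illustrative sketch only: the factor of 0.5 is the approximate value given in the text, the function names are hypothetical, and the eccentricities are those of the displays used in the experiments reported here. As noted above, the true range also depends on other factors, so these values should be read as rough guides rather than thresholds.

```python
# Rule-of-thumb sketch: critical spacing is roughly 0.5 x target eccentricity.
# The 0.5 factor is approximate; function names are hypothetical.
BOUMA_FACTOR = 0.5


def critical_spacing(eccentricity_deg, factor=BOUMA_FACTOR):
    """Approximate target-flanker spacing (deg) below which crowding is expected."""
    return factor * eccentricity_deg


def is_within_crowding_range(spacing_deg, eccentricity_deg, factor=BOUMA_FACTOR):
    """True if the given spacing falls inside the approximate crowding range."""
    return spacing_deg < critical_spacing(eccentricity_deg, factor)


# Eccentricities of the stimulus displays in Experiments 1-2, 3a, and 3b.
for ecc in (3.9, 4.75, 5.4):
    print(f"eccentricity {ecc:.2f} deg: critical spacing approx. "
          f"{critical_spacing(ecc):.2f} deg")
```

On this rough estimate, only the nearest flanker distances used in the experiments would be expected to fall inside the classic crowding range.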

If crowding interacts with OSM because crowding makes a target more susceptible to OSM then we should find a certain data pattern with respect to manipulations of target-flanker distance. Specifically, OSM should be strongest at the smallest target-flanker distance, where crowding itself is strongest; OSM should then decrease to an asymptote as target-flanker distance is increased and crowding is correspondingly diminished. If this pattern of OSM decline does not occur with respect to target-flanker distance then it would challenge the explanation offered by Camp et al. (2015) regarding the relationship between crowding and OSM. Experiment 1 assessed this possibility.

Experiment 1

In Experiment 1, four target-flanker distance conditions were given, each of which was compared against an uncrowded condition in which the flankers surrounded a nontarget item at the same distance. A digit identification task was given [Footnote 1]. The target, surrounded by a four-dot mask (4DM), was presented at a random location on a virtual circle. On unmasked trials the 4DM coterminated with the target; on masked trials it lingered on-screen for a period after the target offset.

In the task on some trials two flanker digits flanked the target on either side (designated flanked-target trials). On other trials the two flankers flanked a nontarget digit located directly opposite the target on the virtual circle (designated unflanked-target trials). The distance between the flankers and the flanked item (i.e. target or nontarget) was also manipulated. Four flanker distance positions were given across both the flanked-target trials and the unflanked-target trials. This flanker distance manipulation, it was assumed, would give us a measure of the spatial profile of the flanker effect on OSM. The inclusion of the unflanked-target trials conditions reflected the same basic design given in Camp et al. (2015). These trials were included for two reasons. First, their inclusion made the experiment design a symmetrical one: For each target-flanker distance there was an equivalent control condition. Second, because of this symmetry, flankers did not potentially serve as a spatial cue to the target location as they would have done had only flanked-target trials been given.

It was predicted that OSM would be greater on the flanked target trials than on the unflanked target trials (i.e. trials where the flankers surround the nontarget), replicating the finding reported by Camp et al. (2015). A further prediction was made based on the claim stated in Camp et al. regarding the relationship between OSM and crowding. If Camp et al. are correct then OSM should be greatest when flankers were positioned closest to the target (where the crowding effect on the target was strongest) and diminish as the distance between the flankers and flanked target was increased. If this pattern is not found, then it would be evidence against their interpretation of the relationship between OSM and crowding.

Method

Participants

Thirty-five first-year Oxford Brookes Psychology students (27 female) took part in the experiment. All gave informed consent and received course credits for completing the experiment. All reported normal or corrected-to-normal visual acuity. This and all other experiments in this study received full approval by the Oxford Brookes University ethics panel.

Design

The experiment had three factors, all repeated measures: mask duration (0 ms, 180 ms), target condition (flanked target, unflanked target), and flanker distance (0.63°; 0.89°; 1.15°; 1.41°). The dependent variable was identification accuracy, measured by the percentage of correct responses.

Stimuli and procedure

The experiment was conducted in a darkened and sound-deadened room with back lighting. Stimuli were presented on a 20-inch Sony Trinitron CRT computer monitor (resolution = 1024 × 768; refresh rate = 100 Hz). The monitor was controlled by an Intel Pentium 4 (2.66 GHz) PC fitted with an NVIDIA GeForce 4 graphics card. The monitor was viewed by the participant from a distance of approximately 110 cm. Bespoke software written in the BlitzMax programming language (BlitzMax V. 1.5; Sibley, 2011) controlled all aspects of stimulus presentation, randomisation and response recording. All stimuli were black (0.03 cd/m²) on a white (97 cd/m²) background. The stimulus array always consisted of four digits (0–9) positioned on the circumference of a virtual circle around a central fixation point. Each digit was in Arial font 32 type size (a subtended visual angle of 0.47° in height). The virtual circle itself had a radius subtending 3.9° from the centre of the fixation cross to the centre of each digit. One of the four digits was designated as the target, one as the nontarget and the other two as flankers. The target was presented at a point, randomly determined on each trial, on the virtual circle. The nontarget was always presented diametrically opposite the target on the virtual circle. The target was identified in the stimulus array by the surrounding 4DM. The 4DM was arranged in a virtual square (subtending 0.89° in height/width) around the target. The dots comprising the mask were each 0.10° of visual angle in width/height.

On flanked-target trials the flankers surrounded the target location at one of four distances: 0.63°; 0.89°; 1.15°; or 1.41° (distances are expressed in units of subtended visual angle of the circumferential distances between the midpoints of the surrounded item and the flanker digits on the virtual circle) [Footnote 2]. On unflanked-target trials the flankers surrounded the nontarget location, again at one of four distances: 0.63°; 0.89°; 1.15°; or 1.41° (Fig. 1 gives an example of a flanked and unflanked trial for the nearest of the four flanker distances; 0.63°).

Fig. 1. Schematic depiction of the trial sequence in Experiment 1. Bottom shows the trial sequence for flanked-target trials. In the given example, the flankers are at the closest given position (0.63°) with respect to the target, which is indicated by the surrounding four dots. Top gives the equivalent sequence for an unflanked-target trial. Here, the flankers closely surround the nontarget.

The identity of the target digit was randomly determined on each trial with the constraint that each of the 10 digits appeared with equal frequency for all trial types. The identity of the nontarget and flanker digits on each trial was determined randomly with replacement. A schematic depiction of an example trial sequence is shown in Fig. 1. All trials started with the onset of a blank white screen presented for 500 ms. A frame was then shown in which the fixation cross alone was presented for 250 ms. The onset of this frame was accompanied by a brief alerting tone. The stimulus array was presented with the 4DM surrounding the target digit. The stimulus array frame was shown for 40 ms. Then both the stimulus array and mask disappeared from screen (0-ms trailing mask), or the stimulus array disappeared but the mask remained for a further 180 ms (180-ms trailing mask). The fixation cross was present on-screen throughout these frames and remained visible until the participant responded. The task was to identify the target digit. Participants responded by pressing the corresponding key (0–9) on a standard computer keyboard. Immediate aural error feedback was given following an incorrect response. The participant’s response instigated the start of a new trial.

There were 640 trials in total, 40 trials for each combination of mask duration, target condition, and flanker distance. Trials were presented in 10 blocks of 64 trials. The computer prompted the participant to take a brief break after each 64 trial increment. Five demonstration trials presented at a slowed speed and 30 practice trials given at the real speed of the experiment were undertaken prior to the start of the experiment. Participants were instructed to emphasise accuracy in responding. The total session lasted approximately 30 minutes.
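The factorial structure described above can be sketched as a trial list. This is a minimal illustration, not the BlitzMax code used in the study: the dictionary field names are hypothetical, and shuffling the whole session before splitting it into blocks is an assumption, since the text does not say how trials were distributed across blocks.

```python
import itertools
import random

MASK_DURATIONS_MS = (0, 180)
TARGET_CONDITIONS = ("flanked", "unflanked")
FLANKER_DISTANCES_DEG = (0.63, 0.89, 1.15, 1.41)
TRIALS_PER_CELL = 40
N_BLOCKS = 10


def build_trials(seed=None):
    """Return a shuffled list of trial dicts covering the full 2 x 2 x 4 design."""
    rng = random.Random(seed)
    trials = []
    cells = itertools.product(MASK_DURATIONS_MS, TARGET_CONDITIONS,
                              FLANKER_DISTANCES_DEG)
    for mask_ms, condition, distance in cells:
        # Each digit 0-9 serves as the target equally often within a cell.
        for digit in list(range(10)) * (TRIALS_PER_CELL // 10):
            trials.append({"mask_ms": mask_ms, "condition": condition,
                           "distance_deg": distance, "target_digit": digit})
    rng.shuffle(trials)  # assumption: session-wide randomisation
    return trials


def split_into_blocks(trials, n_blocks=N_BLOCKS):
    """Split the trial list into equally sized consecutive blocks."""
    size = len(trials) // n_blocks
    return [trials[i * size:(i + 1) * size] for i in range(n_blocks)]


trials = build_trials(seed=1)
blocks = split_into_blocks(trials)
print(len(trials), "trials in", len(blocks), "blocks of", len(blocks[0]))
```

With 16 design cells and 40 trials per cell, the list contains the 640 trials reported above, divided into 10 blocks of 64.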

Results

Figure 2a gives the mean percentage correct responses for all conditions; Fig. 2b shows the masking strength in the different target conditions (masking strength is calculated by subtracting performance in the 180-ms mask duration trials from the corresponding 0-ms trials). A three-way repeated-measures ANOVA was performed to analyse the data. The three factors were mask duration (0 ms, 180 ms), target condition (flanked-target, unflanked-target), and flanker distance (0.63°; 0.89°; 1.15°; 1.41°). Significant main effects were found for all three factors: mask duration, F(1, 34) = 212.77, MSerror = 50.15, p < .001, ηp² = .86; target condition, F(1, 34) = 174.56, MSerror = 220.14, p < .001, ηp² = .84; and flanker distance, F(3, 102) = 7.08, MSerror = 44.46, p < .001, ηp² = .17.
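The masking-strength measure defined above is a simple difference score. A minimal sketch follows; the function name and data layout are hypothetical, and the accuracy values are invented for illustration and are not the reported data.

```python
def masking_strength(accuracy):
    """Masking strength per condition: 0-ms accuracy minus 180-ms accuracy.

    accuracy maps (target_condition, flanker_distance_deg, mask_ms) to the
    percentage of correct responses.
    """
    strength = {}
    for (condition, distance, mask_ms), pct in accuracy.items():
        if mask_ms == 0:
            strength[(condition, distance)] = pct - accuracy[(condition, distance, 180)]
    return strength


# Hypothetical values for illustration only (not the data from Experiment 1).
example = {
    ("flanked", 0.63, 0): 60.0,
    ("flanked", 0.63, 180): 48.0,
    ("unflanked", 0.63, 0): 85.0,
    ("unflanked", 0.63, 180): 77.0,
}
print(masking_strength(example))
```

Positive values indicate that the trailing mask reduced accuracy relative to the common-offset baseline.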

Fig. 2. Performance in Experiment 1. a Accuracy (% correct) for the four flanker distances (0.63°, 0.89°, 1.15°, 1.41°) for each of the two mask durations (0 ms, 180 ms) for flanked and unflanked target conditions. b Masking effect for each combination of flanker distance and target condition. Masking is calculated as the difference in accuracy between the respective 0-ms and 180-ms mask duration conditions.

A significant two-way Mask Duration × Target Condition interaction was observed, F(1, 34) = 5.44, MSerror = 50.54, p = .026, ηp² = .14. This reflects the fact that masking was stronger when the flankers surrounded the target compared to when they surrounded the nontarget. This interaction supports our first prediction; it replicates the finding reported by Camp et al. (2015). The two-way Target Condition × Flanker Distance interaction was also significant, F(3, 102) = 11.72, MSerror = 47.26, p < .001, ηp² = .26. This interaction simply reflects the fact that variation in flanker distance has a greater effect on accuracy on flanked-target trials than on unflanked-target trials. The two-way Mask Duration × Flanker Distance interaction was not significant, F(3, 102) = 1.47, MSerror = 40.63, p = .226. The three-way Mask Duration × Target Condition × Flanker Distance interaction did not approach significance, F(3, 102) = 0.61, MSerror = 50.40, p = .609.

Discussion

Our first prediction, of an interaction between target condition and mask duration, was supported. The interaction reflects the fact that masking tended to be stronger when flankers surrounded the target location compared to when they surrounded the nontarget. This finding replicates that reported by Camp et al. (2015).

The second prediction was that OSM would be greatest when the flankers were located nearest to the target and diminish as flanker distance was increased. The data did not support this. In fact, the trend was in the opposite direction. For instance, for flanked-target trials slightly more masking was observed at the largest (1.41°) than at the smallest (0.63°) flanker distance. In addition, and unexpectedly, flanker distance had at least as much of an effect on unflanked-target trials as it did on flanked ones (see Fig. 2b). We shall refrain from making any further interpretation of these results at this stage other than to state that the pattern of data obtained was inconsistent with the crowding-on-OSM hypothesis proposed by Camp et al. (2015).

Given the pattern of the data obtained in Experiment 1, Experiment 2 looked at the effect of flanker distance on OSM over a much larger spatial range. This was done to obtain a clearer picture of the relationship between these variables. In Experiment 1 a distinction was made between flanked-target trials and unflanked-target trials. It should be noted that the distinction was somewhat arbitrary given that all stimuli are positioned on the same virtual circle. This arbitrariness becomes more palpable when the distances of the flankers from the target (or nontarget) are larger as they are for Experiment 2. Consequently for Experiment 2 it was deemed more appropriate to consider flanker distance as a single continuous variable.

Experiment 2

The aim of Experiment 2 was to explore the effect of flanker distance on OSM over a larger distance range than in Experiment 1. This distance covered the range of the entire arc of the virtual circle on which the stimuli were presented. Methods were the same as in Experiment 1, except for the differences described below. The aim was to get a clearer indication of the relationship between flanker position and mask duration than was apparent from Experiment 1.

Method

Participants

Thirty-two undergraduate and postgraduate Oxford Brookes Psychology students (27 female) were recruited for the experiment. All participants reported normal or corrected-to-normal visual acuity. None had taken part in Experiment 1. Participants received course credits for taking part in the experiment.

Stimuli and procedure

The basic procedure was the same as in Experiment 1. A target digit was presented with a nontarget located directly opposite it on a virtual circle. The 4DM surrounded the target and denoted it as being the target item within the array. The mask either offset with the target or trailed it by 180 ms. The dimensions of the digits and of the virtual circle were the same as in Experiment 1. Two flankers were presented on the virtual circle at one of the six target-flanker (circumferential) distances (0.63°; 3.02°; 4.9°; 7.35°; 9.23°; 11.62°) [Footnote 3]. Note that the symmetrical nature of the flanker positions is maintained in Experiment 2 as it was in Experiment 1. For instance, the condition in which flankers are nearest to the target (0.63°) has a corresponding condition in which the flankers are the same distance from the nontarget (11.62°).
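The symmetry of the flanker positions follows from the geometry of the display: because the nontarget sits diametrically opposite the target, a flanker at circumferential distance d from the target lies at half the circumference minus d from the nontarget. The sketch below checks this for the six distances, using the 3.9° radius given for Experiment 1; the function names are hypothetical.

```python
import math

RADIUS_DEG = 3.9  # radius of the virtual circle (same as Experiment 1)
TARGET_FLANKER_DISTANCES_DEG = (0.63, 3.02, 4.9, 7.35, 9.23, 11.62)


def half_circumference(radius_deg):
    """Arc length (deg of visual angle) between target and nontarget."""
    return math.pi * radius_deg


def distance_from_nontarget(target_flanker_deg, radius_deg=RADIUS_DEG):
    """Circumferential distance of a flanker from the nontarget."""
    return half_circumference(radius_deg) - target_flanker_deg


for d in TARGET_FLANKER_DISTANCES_DEG:
    print(f"{d:5.2f} deg from target -> "
          f"{distance_from_nontarget(d):5.2f} deg from nontarget")
```

Each target-flanker distance therefore pairs with another distance in the set (0.63° with 11.62°, 3.02° with 9.23°, 4.9° with 7.35°), up to rounding.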

There were 480 trials, 40 trials for each factorial combination of mask duration and target-flanker distance. The trials were presented in 10 blocks of 48 trials each. Participants were given a short break after completion of each block.

Results

The mean percentage correct responses are given in Fig. 3a. The data were analysed using a two-way repeated-measures ANOVA. The factors were mask duration (0 ms, 180 ms) and target-flanker distance (0.63°; 3.02°; 4.9°; 7.35°; 9.23°; 11.62°). This analysis showed significant main effects for both mask duration, F(1, 31) = 130.53, MSerror = 53.70, p < .001, ηp² = .81, and target-flanker distance, F(5, 155) = 97.93, MSerror = 41.85, p < .001, ηp² = .69. There was also a significant Mask Duration × Target-Flanker Distance interaction, one which displayed a quadratic trend, F(1, 31) = 10.42, MSerror = 26.57, p = .003, ηp² = .25. The quadratic nature of this interaction reflected the fact that masking (defined as the performance difference between masked and unmasked trials) exhibits an inverted U-shaped function with respect to target-flanker distance (see Fig. 3b). That is, masking was greatest not at the nearest target-flanker distance (0.63°), but at an intermediate distance (3.02°); it was lower when flankers were placed closer to or further from the target than this [Footnote 4].
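One way to make the quadratic interaction trend concrete is to compute, for each participant, a quadratic contrast over the masking effect (0-ms minus 180-ms accuracy) at the six distances, and then test whether the mean contrast score differs from zero. The sketch below is an assumption-laden illustration rather than the analysis actually run: it uses the standard orthogonal polynomial weights for six equally spaced levels (the distances here are only roughly equally spaced), the function name is hypothetical, and the data are simulated.

```python
import numpy as np

# Quadratic orthogonal-polynomial weights for six levels.
# ASSUMPTION: levels are treated as equally spaced.
QUADRATIC_WEIGHTS = np.array([5.0, -1.0, -4.0, -4.0, -1.0, 5.0])


def quadratic_trend_test(masking):
    """Test the quadratic trend in a (participants x 6) array of masking effects.

    Returns (F, df, mean_score). F equals the squared one-sample t statistic
    of the per-participant contrast scores, with df = (1, n - 1). A negative
    mean score indicates an inverted-U shape.
    """
    scores = np.asarray(masking, dtype=float) @ QUADRATIC_WEIGHTS
    n = scores.size
    t = scores.mean() / (scores.std(ddof=1) / np.sqrt(n))
    return t ** 2, (1, n - 1), scores.mean()


# Simulated masking effects (% points) for 32 hypothetical participants:
# an inverted-U profile plus noise. These are NOT the reported data.
rng = np.random.default_rng(0)
profile = np.array([8.0, 14.0, 12.0, 10.0, 9.0, 8.0])
simulated = profile + rng.normal(0.0, 3.0, size=(32, 6))

F, df, mean_score = quadratic_trend_test(simulated)
print(f"F{df} = {F:.2f}, mean contrast score = {mean_score:.2f}")
```

The degrees of freedom, (1, n − 1), match the form of the quadratic interaction test reported above, where n = 32.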

Fig. 3. Performance in Experiment 2. a Accuracy (% correct) for each of the six target-flanker distances shown separately for each of the two mask durations (0 ms, 180 ms). b Masking effect (difference between 0-ms and 180-ms mask duration conditions) for each target-flanker distance.

A further analysis was performed in the form of a piecewise linear regression. Line fits are often used to characterise the range of spatial crowding (Pelli, Palomares, & Majaj, 2004; Yeshurun & Rashal, 2010). The piecewise regression fits were performed on the raw accuracy data separately for the two mask duration conditions. The fitting was done using a least squares method. The fit was constrained to a two-line solution; the single hinge point (k) between the two line segments was implemented as a free parameter. In this analysis, the linear distances, rather than the circumferential distances, were used. The resulting k values along with the slopes for the first and second lines (b1, b2) are presented in Table 1. The line fits are somewhat different for the unmasked and masked conditions, as might be expected given the interaction. The breakpoint k is similar for the masked and unmasked conditions, though it occurs at a slightly greater target-flanker distance under masked conditions. More notably, the slope of the first line segment is distinctly shallower under masked conditions. Thus, the quadratic effect in the masking data can be characterised as largely a consequence of the difference in the steepness of the crowding function under masked and unmasked conditions.
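A two-segment fit with a free hinge point can be sketched as follows. This is not the fitting code used in the study, and several choices are assumptions: the two lines are forced to meet at k (a continuous hinge), k is found by a simple grid search, and "linear distance" is taken to mean the straight-line chord between two points on the virtual circle. The function names and accuracy values are hypothetical.

```python
import numpy as np


def chord_distance(arc_deg, radius_deg):
    """Straight-line distance between two points separated by a given arc length."""
    return 2.0 * radius_deg * np.sin(np.asarray(arc_deg) / (2.0 * radius_deg))


def fit_two_segment(x, y, n_grid=200):
    """Least-squares fit of y = a + b1*x + (b2 - b1)*max(x - k, 0).

    x must be sorted in ascending order. Candidate knots k are searched on a
    grid between the second and second-to-last x values.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for k in np.linspace(x[1], x[-2], n_grid):
        design = np.column_stack([np.ones_like(x), x, np.maximum(x - k, 0.0)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((y - design @ coef) ** 2))
        if best is None or sse < best["sse"]:
            best = {"k": float(k), "b1": float(coef[1]),
                    "b2": float(coef[1] + coef[2]), "sse": sse}
    return best


# Arc distances from Experiment 2 converted to chords (radius 3.9 deg).
arcs = np.array([0.63, 3.02, 4.9, 7.35, 9.23, 11.62])
x = chord_distance(arcs, 3.9)
# Hypothetical accuracy values (% correct), for illustration only.
y = np.array([45.0, 70.0, 78.0, 80.0, 81.0, 82.0])
print(fit_two_segment(x, y))
```

With only six distances per condition, the estimated knot is sensitive to individual data points, which is one reason to read fitted k values cautiously.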

Table 1 Knot-point (k) and slope (b) values resulting from piecewise linear regression of Experiment 2 data*

Discussion

Experiment 2 revealed more of the spatial character of the relationship between target-flanker distance and OSM than was apparent from Experiment 1. Again, contrary to the prediction OSM did not decline and then asymptote as flanker distance was increased. Instead the relationship between target-flanker distance and OSM was quadratic (inverted-U shaped) in nature.

Before discussing this further, it should be noted that the interpretation of a quadratic masking function arguably rests on the position of a single data point. If accuracy in the condition with a 0.63° target-flanker distance and trailing mask had been rather lower than observed then the masking function would have appeared monotonic rather than quadratic and would have then supported our original prediction of an effect of crowding on masking. Given this fact, Experiment 3 was conducted to further test the seeming quadratic relationship between OSM and flanker distance found in Experiment 2.

Experiment 3

Experiment 3 consisted of two parts (3a, 3b). Both experiments had the same two factor design as Experiment 2. In both cases, the factors were mask duration and target-flanker distance. Three mask durations were presented in these experiments (0 ms, 60 ms, 180 ms). The additional masking condition presented additional data points on which to assess the nature of the masking function with respect to flanker distance. A further change from Experiment 2 was also implemented. The eccentricity of the stimuli with respect to fixation was increased from that in the previous two experiments. This was done to amplify the overall crowding effect on the target (Gurnsey, Roddy & Chanab, 2011; Pelli et al., 2004).

As in Experiment 2, target-flanker distance was manipulated to sample across the entire available range on the virtual circle. In Experiment 3a, the stimulus array was presented at an eccentricity of 4.75°; in Experiment 3b, the stimulus array was presented at an eccentricity of 5.4° (compared with an eccentricity of 3.9° in Experiments 1 and 2). The same target-flanker distances were given in Experiments 3a and 3b, with the exception that Experiment 3b had an additional target-flanker distance condition, which was allowed for by the larger circumference of the virtual circle in the 5.4° display. The aim of Experiment 3 was to confirm whether the interaction between masking and crowding has an inverted U-shape.

Method

Participants

Forty-four undergraduate and postgraduate Oxford Brookes Psychology students (35 female) took part in the experiment. Half the participants were allocated to Experiment 3a, half to Experiment 3b by a random process. All participants gave informed consent and received course credits (undergraduate students) or financial remuneration (£7 GBP) for completing the experiment. All had normal or corrected-to-normal visual acuity.

Stimuli and procedure

The stimuli and procedure were identical to those of Experiment 2 except where stated. For Experiment 3a, the radius of the display was 4.75° of visual angle and for Experiment 3b was 5.4° of visual angle. For both experiments trailing mask duration was one of three conditions (0 ms, 60 ms, or 180 ms). In Experiment 3a there were seven target-flanker distances (1°; 3°; 5°; 7°; 9°; 11°; 13°) [Footnote 5]. In Experiment 3b the target-flanker distances were 1°, 3°, 5°, 7°, 9°, 11°, 13°, and 15° of circumferential visual angle [Footnote 6].
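The extra distance condition in Experiment 3b follows from the larger display: the greatest possible circumferential distance from the target is half the circumference, π × radius. The sketch below shows which distances fit on each display. The 1° start and 2° step are read off the listed distances and are an assumption about how the set could be generated, not a rule stated by the authors; the function names are hypothetical.

```python
import math


def max_arc_distance(radius_deg):
    """Largest circumferential distance from the target (half the circumference)."""
    return math.pi * radius_deg


def available_distances(radius_deg, start=1.0, step=2.0):
    """Distances start, start+step, ... that fit within half the circumference."""
    distances = []
    d = start
    while d < max_arc_distance(radius_deg):
        distances.append(d)
        d += step
    return distances


for label, radius in (("3a", 4.75), ("3b", 5.4)):
    print(f"Experiment {label}: half circumference = "
          f"{max_arc_distance(radius):.2f} deg, "
          f"distances = {available_distances(radius)}")
```

A 15° distance exceeds the half circumference of the 4.75° display (about 14.92°) but fits within that of the 5.4° display (about 16.96°).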

For both experiments there were 30 trials for each factorial combination of mask duration and target-flanker distance. This resulted in a total of 630 trials in Experiment 3a and 720 trials in Experiment 3b. Trials were presented in 10 blocks of equal length. Participants were asked to take a short break at the end of each block. Participants were shown a demonstration and given practice trials before undertaking the main experiment.

Results

Experiment 3a

The average percentage correct responses in each factorial condition of mask duration and target-flanker distance are shown in Fig. 4a. These data were analysed using a 3 × 7 repeated-measures ANOVA with mask duration (0 ms, 60 ms, 180 ms) and target-flanker distance (1°; 3°; 5°; 7°; 9°; 11°; 13°) as the two factors. A significant main effect was found for mask duration, F(2, 42) = 31.48, MSerror = 85.31, p < .001, ηp² = .60, and, with a Greenhouse–Geisser correction, for target-flanker distance, F(3.11, 65.40) = 23.52, MSerror = 97.77, p < .001, ηp² = .53. A significant quadratic Mask Duration × Target-Flanker Distance interaction was found, F(1, 21) = 10.98, MSerror = 43.57, p = .003, ηp² = .34 (see Fig. 4a). The quadratic nature of the masking effect with respect to target-flanker distance can be seen in the masking function in Fig. 4b [Footnote 7].

Fig. 4. Performance in Experiment 3a. a Accuracy (% correct) for each of the seven target-flanker distances shown separately for each of the three mask durations. b Masking effect (difference between the 60-ms and 180-ms mask duration conditions each from the 0-ms mask duration baseline).

Experiment 3b

The average percentage correct responses in each factorial condition of mask duration and target-flanker distance are shown in Fig. 5a. These data were analysed using a 3 × 8 repeated-measures ANOVA with mask duration (0 ms, 60 ms, 180 ms) and target-flanker distance (1°; 3°; 5°; 7°; 9°; 11°; 13°; 15°) as the two factors. Significant main effects were found for mask duration and target-flanker distance, F(2, 42) = 20.96, MSerror = 102.75, p < .001, ηp² = .50, and F(7, 147) = 51.12, MSerror = 67.01, p < .001, ηp² = .71, respectively. A significant quadratic Mask Duration × Target-Flanker Distance interaction was found, F(1, 21) = 10.75, MSerror = 64.09, p = .004, ηp² = .34. The quadratic nature of the masking effect with respect to target-flanker distance can be seen in the masking function in Fig. 5b [Footnote 8].

Fig. 5 Performance in Experiment 3b. (a) Accuracy (% correct) for each of the eight target-flanker distances, shown separately for each of the three mask durations. (b) Masking effect (difference between each of the 60-ms and 180-ms mask duration conditions and the 0-ms mask duration baseline)

Piecewise regression

The same piecewise regression described for Experiment 2 was also applied to the data from Experiments 3a and 3b. The resulting knot points and slopes are presented in Tables 2 and 3 for the two respective experiments. Unlike in Experiment 2, the knot points of the line fits occurred at a nearer target-flanker distance for the masked conditions than for the unmasked condition. For the slopes, however, the same basic pattern was found as in Experiment 2: In all cases, masked conditions resulted in shallower slopes for the first line segments.

Table 2 Knot-point (k) and slope (b) values resulting from piecewise linear regression of Experiment 3a data
Table 3 Knot-point (k) and slope (b) values resulting from piecewise linear regression of Experiment 3b data
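To make the knot-point (k) and slope (b) values concrete, the following is a minimal sketch of a two-segment continuous piecewise linear fit. It is an illustration under stated assumptions, not the procedure used in these experiments: the grid search over candidate knots, the function name `fit_piecewise`, and the noise-free accuracy values are all hypothetical.

```python
import numpy as np

def fit_piecewise(x, y, candidate_knots):
    """Fit y = a + b1*x + c*max(x - k, 0) for each candidate knot k.

    Returns the knot with the smallest squared error, together with the
    slope of the first segment (b1) and of the second segment (b1 + c).
    """
    best = None
    for k in candidate_knots:
        hinge = np.maximum(x - k, 0.0)  # zero before the knot, linear after it
        design = np.column_stack([np.ones_like(x), x, hinge])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((design @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(k), float(coef[1]), float(coef[1] + coef[2]))
    _, knot, slope1, slope2 = best
    return knot, slope1, slope2

# Hypothetical accuracy values: steep recovery up to 5 deg, shallow thereafter.
distance = np.array([1, 3, 5, 7, 9, 11, 13], dtype=float)
accuracy = np.array([30, 50, 70, 72, 74, 76, 78], dtype=float)

knot, slope1, slope2 = fit_piecewise(distance, accuracy, np.linspace(2, 12, 101))
print(f"knot = {knot:.1f} deg, first slope = {slope1:.1f}, second slope = {slope2:.1f}")
```

In a fit of this form, a shallower first-segment slope corresponds to slower recovery of accuracy as the flankers move away from the target.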

Discussion

Both Experiments 3a and 3b found an interaction between flanker position and mask duration. Importantly, in both cases, masking exhibited an unambiguous quadratic function, and this function was just as evident for the short as for the long trailing mask condition. The data pattern cannot simply be attributed to ceiling and/or floor effects: Performance was well within a measurable range for most participants, and the quadratic function was obtained even when further analysis removed participants performing close to ceiling or floor in any condition.

General discussion

Camp et al. (2015) claimed that crowding and OSM interact because crowding increases a target’s vulnerability to object substitution processes. If this were the case, then masking should have been observed to be strongest with flankers closest to the target and then decline to asymptote as target-flanker distance was increased. No such data pattern was found in any of our experiments. Instead, target-flanker distance and OSM showed a robust inverted U-shaped relationship.
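The difference between the two candidate shapes can be expressed with a quadratic trend contrast over equally spaced distances. The following is a minimal sketch, not the analysis reported here: the function name `quadratic_contrast` and both masking profiles are hypothetical values chosen only to illustrate an inverted U and a decline to asymptote.

```python
def quadratic_contrast(n_levels):
    """Quadratic trend weights for n equally spaced levels (weights sum to zero)."""
    centre = (n_levels - 1) / 2
    squared = [(i - centre) ** 2 for i in range(n_levels)]
    mean_squared = sum(squared) / n_levels
    return [s - mean_squared for s in squared]

def contrast_score(profile):
    """Weighted sum of a masking profile with the quadratic weights."""
    weights = quadratic_contrast(len(profile))
    return sum(w * value for w, value in zip(weights, profile))

# Hypothetical masking effects (percentage points) at seven flanker distances.
inverted_u = [2, 8, 12, 14, 12, 8, 2]        # masking peaks at intermediate distances
decline_to_asymptote = [14, 8, 4, 2, 1, 1, 1]  # masking strongest at nearest flankers

print(quadratic_contrast(7))
print(contrast_score(inverted_u))
print(contrast_score(decline_to_asymptote))
```

With these weights an inverted U yields a negative score, whereas the convex decline in this example yields a positive one, so the sign of the contrast separates the two patterns.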

Our data do support Camp et al.’s (2015) general claim of an interaction between OSM and crowding. They do not, however, support Camp et al.’s explanation of that interaction. We argue instead that the best interpretation of the data pattern is that it reflects a process in which OSM affects crowding. Before discussing this issue any further, however, we first present, and then argue against, some alternative interpretations of our results.

First, one might argue that the findings we reported are accounted for by spatial attention. In this explanation, the differences in masking strength across target-flanker distances are associated with differences in how attention is spread across the display in the different conditions. It can be argued that the greater effect of mask duration at intermediate target-flanker distances reflects attention being most diffuse in those particular conditions.

There are good reasons for rejecting this attentional interpretation. To begin with, it is unclear why the particular target-flanker distances at which masking was most evident should be ones at which attention would be most diffuse. Presumably, attention would tend to be most spread out when the display elements (the target and flankers) were furthest apart from each other. If we take Experiment 2 as an example, the display elements are most broadly distributed in the 4.9° and 7.35° conditions, yet it is the 3.02° target-flanker condition in which the effect of mask duration is most evident (see Fig. 3).

Furthermore, the pattern of accuracy we obtained, when looked at in detail, also fails to support the attention account. One way to evaluate the effect of attention in the different target-flanker conditions is to look at the unmasked trials. It is on these trials that we can most easily evaluate the effect of attention on target perceptibility independent of any effect on OSM. Were attention more diffuse at intermediate target-flanker distances, then we should presumably observe that accuracy in reporting the target on these unmasked trials was also reduced. To evaluate this possibility, the most reasonable comparison is between the intermediate flanker positions and those where the flankers are furthest from the target (judging the intermediate flanker positions against the conditions where the flankers are nearest to the target would potentially conflate any putative attention effects with those that arise from crowding). However, in neither Experiment 2 nor Experiment 3 was this data pattern found; for example, in Experiment 3b accuracy in unmasked trials was slightly higher in intermediate target-flanker conditions than in the largest target-flanker conditions. This suggests that attention was not more diffuse in the conditions in which we observed the most masking.

Second, one might argue that our observed interaction was an indirect consequence of the circumferential organisation of the stimuli used in our experiments. In this interpretation, masking is greater at larger target-flanker distances because the circular stimulus arrangement makes it more likely that the target and flankers are presented in different visual fields. Under such circumstances, greater masking is found because of the induced interhemispheric competition between the target and flanker objects (Geng et al., 2006; Szczepanski & Kastner, 2013): Processing of the irrelevant distractor(s) in the other hemifield draws attention away from the hemisphere containing the target, and masking is amplified as a result. However, again, such an explanation would not properly account for the data we obtained. Were this explanation correct, then the most masking should have been found in the trials in which the target-flanker distance was greater than a quarter circumference away. Under these circumstances at least one of the distractors would always be in the opposite hemisphere to the target (for Experiment 2 this would be all conditions ≥ 7.31°; in Experiment 3a, conditions ≥ 9°; in Experiment 3b, conditions ≥ 11°). However, in many cases little masking was found in these trials. In particular, the largest target-flanker distances (ones where both flankers would be in the opposite hemisphere to the target on almost all trials) tended to produce very little masking (see especially the 15° condition in Experiment 3b). Thus, this hemispheric competition interpretation fits poorly with our data.

A third account that might be proposed is that the interaction is simply a consequence of OSM limiting the maximum achievable performance level. In this explanation, OSM and crowding have no specific interactive effect on each other as cognitive processes; the observed ‘interaction’ is instead a consequence of the different performance limits for masked and unmasked conditions. This account can explain why masking initially increases as the flankers are moved away from the closest distance to the target: The recovery from crowding towards the maximum achievable performance level is greater in the unmasked condition than in the masked condition. However, the account does not explain why masking then declines with further increases in target-flanker distance. If the initial increase in masking were a consequence of a performance constraint, then we should have observed a monotonic increase in masking as flanker distance was increased, followed by a plateau once the putative performance limit was reached for the masked trials. There is no obvious plateau beyond peak masking in the masking functions of any of our experiments; instead, masking always shows a general trend to reduce with further increases in flanker distance. Thus, the performance limit explanation does not account for our data.
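The prediction of the performance limit account can be illustrated with a toy simulation. This is a sketch under stated assumptions rather than a model fitted to our data: the exponential recovery curve, the function name `toy_accuracy`, and the ceiling values of 90% and 70% are all hypothetical. Both conditions share the same recovery from crowding and differ only in their maximum achievable accuracy.

```python
import math

def toy_accuracy(distance, ceiling, floor=30.0, rate=0.5):
    """Toy recovery from crowding: accuracy rises with flanker distance, capped at a ceiling."""
    recovered = floor + (100.0 - floor) * (1.0 - math.exp(-rate * distance))
    return min(recovered, ceiling)

distances = [1, 3, 5, 7, 9, 11, 13, 15]  # target-flanker distances in degrees

# Hypothetical performance limits: masking lowers the ceiling and nothing else.
unmasked = [toy_accuracy(d, ceiling=90.0) for d in distances]
masked = [toy_accuracy(d, ceiling=70.0) for d in distances]

masking_effect = [u - m for u, m in zip(unmasked, masked)]
is_monotonic = all(b >= a for a, b in zip(masking_effect, masking_effect[1:]))

print([round(m, 1) for m in masking_effect])
print("never decreases with distance:", is_monotonic)
```

Under these assumptions the masking effect rises and then stays flat at the difference between the two ceilings; it never declines, which is the feature that distinguishes this account from the inverted U-shaped functions observed.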

In summary, there was no empirical support for the possibility that the quadratic interaction we observed was mediated by spatial attention or some other form of competition related to the spatial organisation of the stimuli. Nor was there support for the possibility that the interaction was a consequence of some form of induced performance constraint.

OSM, flanker distance, and the window of crowding

The main finding of our experiments was that crowding and OSM interact not because crowding affects OSM but because OSM affects crowding. In this respect the data have parallels with the ‘supercrowding’ effect reported by Vickery et al. (2009). Vickery et al. argued that masking a target results in an increase in the interference zone of crowding compared with unmasked conditions. Our results, like those of Vickery et al., indicated that masking influences the character of crowding as expressed by manipulations of target-flanker distance. However, our results suggested that masking increased the magnitude of crowding at intermediate target-flanker distances near the edge of the interference zone, whereas the size of the crowding interference zone itself was unaffected by masking (see Footnote 9).

These differences between our data and those of Vickery et al. (2009) may reflect differences in the effect of OSM masks and contour masks on crowding. The contour masks used by Vickery et al. would produce surround suppression on the target contours (Meese & Baker, 2009); OSM masks involve higher level, object-level processes (Di Lollo et al., 2000; Moore & Lleras, 2005). This may well explain the more limited effect of OSM on crowding that we observed. However, there are a number of other differences between our paradigm and that of Vickery et al. which make any direct comparison of the effect of masking on crowding difficult. It would be useful for future research to compare the effect of OSM masks and other types of mask on crowding within the same experimental paradigm in order to quantify these differences more precisely.

Why might OSM influence crowding in the manner we observed? Vickery et al. (2009) suggested a model in which masking affects the types of feature integrators that are pooled when processing the target. In target identification, the feature integrators whose integration fields overlap the target location are selected for analysis. Integrators within this field will vary in a number of properties, such as the size of their spatiotemporal receptive fields and their selectivity to certain feature properties. The addition of a mask to the target stimulus might lead to adjustments in the types of feature integrators that are recruited. For instance, the presence of a trailing mask might mean that integrators with coarse temporal receptive fields tend not to be recruited in the pooling operation because they would tend to summate both the target and mask elements. By rejecting integrators with larger temporal fields, the visual system is then only able to draw upon a more limited number of other integrators in selecting the target for analysis. As a consequence, it may become more likely that flankers are pooled together with the target, resulting in further degradation of the target percept. This may in turn mean that the visual system becomes more reliant upon integrators whose spatial receptive fields extend towards the limits of the crowding window. The result is that the combined effects of OSM and flankers on target perceptibility become most apparent when the flankers are present towards the outer limits of the crowding interference zone.

It should be noted that Vickery et al.’s (2009) model of mask-crowding interactions and the version of it offered above are speculative. The model is also incomplete with respect to certain details, in particular regarding how integrators are utilised or excluded in the pooling process (for example, whether this process is bottom-up or top-down directed). Thus, further research is still needed to fully understand the mechanics behind the effects of masking on crowding, and in particular the effects of OSM on crowding.

OSM and the object processing hierarchy

Our claim that OSM influences crowding seems, at first glance, to conflict with accounts that claim OSM to be a process which occurs after crowding is completed (Breitmeyer, 2014, 2015; Chakravarthi & Cavanagh, 2009). The main evidence for the claim of OSM as a postcrowding visual process comes from a study reported by Chakravarthi and Cavanagh (2009). In this study the authors presented a target Landolt square in the visual periphery which was flanked at all cardinal positions by four other Landolt squares, each in a random orientation. The presence of these flankers resulted in a standard crowding effect on the target: Accuracy in reporting the target was significantly reduced against a no-flanker baseline. On certain trials these flankers were masked by one of three types of backward mask: a noise mask, a metacontrast mask, or a 4DM. Both the noise and metacontrast masks applied to the flankers resulted in recovery from crowding. For the 4DM, no recovery was obtained: Crowding was the same as in the no-mask baseline condition. The ineffectiveness of the 4DM in reducing crowding was not explained by it being a weaker form of masking. A later experiment showed that when observers had to report the identity of one of the masked flankers rather than the target, the perceptibility of the flankers was just as reduced by the 4DM as by the other two mask types. The authors argued instead that the differential effectiveness of the masks in reducing crowding reflected the nature of the underlying masking processes. It was argued that the noise and metacontrast masks disrupted processing of the flankers at an early stage, one which occurred prior to the crowding process in which the flanker and target signals become pooled together. By contrast, it was argued that the OSM processes that underlie the 4DM effect were late stage, occurring subsequent to this pooling operation, and therefore rendering the mask ineffective in modulating crowding generated by the flankers.

Thus, our results in comparison with those of Chakravarthi and Cavanagh (2009) seem to present an enigma: When a 4DM is used to mask flankers it has no discernible effect on crowding, yet when it is used to mask the target it has a reliable effect on crowding. If OSM occurs at a postcrowding stage—as Chakravarthi and Cavanagh claim—then masking of the target should not be able to affect crowding. However, our findings show that OSM masking of the target does influence crowding. How do we reconcile these different findings? We suspect that the findings of the Chakravarthi and Cavanagh (2009) paradigm concern different aspects of visual processing to those in the current experiments. The results of Chakravarthi and Cavanagh (2009) reflect the extent to which a mask suppresses interference from task-irrelevant spatially proximal flankers. Our results, however, reflect the consequences of how the visual system adapts to competition from a temporally defined mask when engaged in target identification.

There is no reason to assume that the different processes we describe operate within the same time frame or are susceptible or immune to the same manipulations. The pooling which occurs in the standard crowding effect between a target and spatially proximal flankers may, as Chakravarthi and Cavanagh (2009) claim, be rapid and feedforward. Consequently, such pooling from task-irrelevant flankers may only be prevented by the rapid inhibitory effects produced by low-level masking (Rolls & Tovee, 1994). By contrast, the assumed process, described earlier, by which the visual system adjusts itself when processing a task-relevant target may be one which occurs over a longer time course. The process by which feature integrators are included or excluded in the pooling operation may not be achieved in a rapid and feedforward manner. Instead, it may be a longer process dependent on cyclical exchanges between different levels of the visual system (Di Lollo et al., 2000; Fahrenfort, Scholte, & Lamme, 2008; Scholte, Jolij, Fahrenfort, & Lamme, 2008). If this is the case, then manipulations involving masking of the target may be more sensitive to OSM-by-crowding interactions than those involving the task-irrelevant flankers.

Conclusion

Fundamentally, our results show that OSM (or at least OSM operationally defined as the effect of mask duration; Di Lollo et al., 2000) is affected by the spatial configuration of the display. The presence and position of distractors, or other display elements, can modulate the intensity with which OSM occurs. The OSM effect can be argued to be one which essentially reflects the extent to which the visual system can resolve a briefly presented target from competing visual stimulus information that is present in both the temporal and spatial domains. Given this, the masking effect observed tends to reflect both the temporal properties of the mask and the spatial locus of distractor elements. In this respect our results are consistent with other claims regarding the close interdependence of spatial and temporal factors in masking, and in visual processing more generally (Enns, 2004; Ghose, Hermens, & Herzog, 2012; Hermens, Luksys, Gerstner, Herzog, & Ernst, 2008; Herzog, 2007; Lev & Polat, 2015; Lleras & Moore, 2003; Yeshurun, Rashal, & Tkacz-Domb, 2015).