Introduction

Visual scenes are complex, containing information on both a local scale (i.e., an individual object) and a global scale (i.e., the whole configuration). The question of whether the visual system prioritizes local or global elements has been debated for over a century (Coren, Ward, & Enns, 1994; Kimchi, 1992; Neisser, 1967; Titchener, 1909; Uttal, 1994; Wertheimer, 1955). A prominent phenomenon was identified by Navon (1977), who presented a compound stimulus with small local objects making up a large global object. Participants responded to either the local or the global object, and responded faster to the global object, suggesting a processing advantage for the global form. While this finding has received widespread support (Badcock, Whitworth, Badcock, & Lovegrove, 1990; Broadbent, 1977; Miller, 1981; Mills & Dodd, 2014), other researchers have argued for an advantage for local elements (Grice, Canham, & Boroughs, 1983; Lindsay & Norman, 1972; Pomerantz & Sager, 1975; Rumelhart & Siple, 1974). To explain these opposing views, several factors have been identified, including visual angle (Lamb & Robertson, 1990), stimulus sparsity (Martin, 1979) and size (Kinchla & Wolfe, 1979), grouping mechanism (Enns & Kingstone, 1995), attention allocation (Ward, 1982), and goodness of form (Hoffman, 1980).

These existing explanations have typically focused on the spatial features of the compound stimulus which consists of identical local objects, but an important factor has been overlooked. Since visual scenes often contain distinct individual objects which tend to co-occur in the real world (e.g., a chair often appears next to a table), we propose a new factor: Statistical relationships between individual objects can prioritize local or global processing. This account is supported by recent evidence that regularities between objects are automatically extracted by the visual system and can guide attention (Chalk, Seitz, & Series, 2010; Chun & Jiang, 1998; Fiser & Aslin, 2002; Summerfield, Lepsien, Gitelman, Mesulam, & Nobre, 2006; Turk-Browne, Junge, & Scholl, 2005; Zhao, Al-Aidroos, & Turk-Browne, 2013). Specifically, the reliable co-occurrences between individual objects can bias attention to the objects themselves (Yu & Zhao, 2015; Zhao et al., 2013).

The current study aimed to examine how statistical regularities guide the spatial scale of processing as measured by visual attention. We constructed Navon-like figures (Navon, 1977) consisting of small local objects making up a global object. The figures contained either local regularities (e.g., A always appeared next to B) or global regularities (e.g., A and B co-occurred globally). We hypothesized that local regularities bias attention to the local scale, prioritizing the processing of individual objects, while global regularities bias attention to the global scale, prioritizing the processing of the global form.

Experiment 1

This experiment examined whether local regularities draw attention to the local scale.

Participants

One hundred undergraduates (69 female, mean age = 20.2 years, SD = 3.2) from the University of British Columbia (UBC) participated for course credit. Participants in both experiments had normal or corrected-to-normal vision, and provided informed consent. The experiments were approved by the UBC Behavioral Research Ethics Board.

Apparatus

Participants in both experiments were seated 60 cm from a computer monitor (refresh rate = 60 Hz). Stimuli were presented using MATLAB (Mathworks) and Psychophysics Toolbox (http://psychtoolbox.org).

Stimuli

The stimuli consisted of nine shapes in nine distinct colors (color name = R/G/B values: red = 255/0/0; green = 0/255/0; blue = 0/0/255; yellow = 255/255/0; magenta = 255/0/255; cyan = 0/255/255; gray = 185/185/185; brown = 103/29/0; black = 0/0/0). The nine colored shapes were randomly assigned into three “color triplets” for each participant (Fig. 1b). Each shape was a square (subtending 1.1°) or a diamond (a square rotated 45°). The nine shapes were presented in a 3 × 3 matrix which was either a global square or diamond. In each matrix, the three triplets were assigned into either the three columns or the three rows. The triplets were local regularities because the three colors within a triplet always appeared immediately next to each other in a column or row (Fig. 1b). A congruent matrix was either a global square (subtending 5.3°) consisting of local squares, or a global diamond (subtending 7.1°) consisting of local diamonds. An incongruent matrix was either a global square (subtending 5.8°) consisting of local diamonds, or a global diamond (subtending 6.8°) consisting of local squares.
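As an illustration of how such a matrix could be assembled, the following Python sketch partitions the nine colors into three triplets and places each triplet in one row or one column. It is not the MATLAB code used in the experiment. The function names, the fixed color order within a triplet, the per-trial shuffling of triplet positions, and the equal probability of row and column layouts are all assumptions made for the example.

```python
import random

COLORS = ["red", "green", "blue", "yellow", "magenta",
          "cyan", "gray", "brown", "black"]

def make_triplets(rng):
    """Randomly partition the nine colors into three triplets (once per participant)."""
    shuffled = COLORS[:]
    rng.shuffle(shuffled)
    return [shuffled[i:i + 3] for i in range(0, 9, 3)]

def make_matrix(triplets, rng):
    """Return a 3 x 3 grid (list of rows) in which each triplet fills one row or one column."""
    order = triplets[:]
    rng.shuffle(order)  # assumption: the row/column a triplet occupies varies across trials
    rows = [list(t) for t in order]
    if rng.random() < 0.5:  # assumption: row and column layouts are equally likely
        return rows
    # Transpose so that each triplet occupies a column instead of a row.
    return [list(col) for col in zip(*rows)]

rng = random.Random(0)
triplets = make_triplets(rng)
grid = make_matrix(triplets, rng)
```

In the random condition, the same nine colors would simply be shuffled into the nine cells, so no subset of colors is reliably adjacent.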

Fig. 1

Experiment 1. (a) In each trial, a matrix appeared in one quadrant on the screen and participants performed either a local task to identify the shape of a local object, or a global task to identify the global shape of the matrix. (b) The nine colors were assigned into three triplets, each of which was presented as one column or one row in the matrix. Congruent matrices were either a global square consisting of local squares, or a global diamond consisting of local diamonds. Incongruent matrices were either a global square consisting of local diamonds, or a global diamond consisting of local squares. (c) Response times of shape identification were analyzed using a 2 (condition: local regularities vs. random; between-subjects) × 2 (task: local vs. global; between-subjects) × 2 (matrix: congruent vs. incongruent; within-subjects) ANOVA. Error bars reflect ±1 SEM; **p < .01; ***p < .001

Procedure

The experiment consisted of two conditions: In the local regularities condition the matrix contained three triplets, and in the random condition the nine colored shapes were randomly assigned into the nine cells in the matrix. Thus, the only difference between the two conditions was the presence or absence of triplets. In the local regularities condition, participants first performed a shape identification task, and then a test phase. In the shape identification task, participants performed a local or a global task (Fig. 1a) where they identified whether the local or the global shape was a square or a diamond (by pressing a “1” or “0” key) as quickly and accurately as possible. There were four groups (2 conditions × 2 tasks) with 25 participants in each.

In each trial, one matrix appeared randomly in one of the four quadrants on the screen for 1,000 ms (Fig. 1a). This prevented participants from basing their judgments on a single object in a fixed location. The center of each matrix was 8.9° away from central fixation. If participants did not respond within 1,000 ms, the screen remained blank until response. The inter-trial interval was 1,000 ms. Participants were asked to maintain fixation throughout the experiment. There were 100 trials for each type of matrix, resulting in 400 trials in total (order randomized). Each participant received eight trials for practice before starting the experiment.
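The visual angles reported for the stimuli can be converted to physical distances on the screen using the 60 cm viewing distance. The sketch below shows the standard trigonometric conversion in Python. The function names are hypothetical, the screen is assumed to be flat and perpendicular to the line of sight, and the size formula is exact only for a stimulus centered on the line of sight, so it is an approximation for the eccentric matrices used here.

```python
import math

VIEWING_DISTANCE_CM = 60.0  # viewing distance reported in the Apparatus section

def size_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical extent of a stimulus that subtends angle_deg, centered on the line of sight."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)

def eccentricity_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Distance on the screen from fixation to a point angle_deg away from fixation."""
    return distance_cm * math.tan(math.radians(angle_deg))

local_shape = size_cm(1.1)             # a 1.1 degree local shape, roughly 1.15 cm
matrix_offset = eccentricity_cm(8.9)   # the 8.9 degree matrix center, roughly 9.4 cm
```

Converting these distances to pixels would additionally require the monitor's pixel density, which is not reported.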

After the shape identification task, participants in the local regularities condition completed a test phase. In each trial, two sets of shapes were presented for 1,000 ms, one on the left and one on the right side of the screen. Participants pressed a key to indicate whether the left (“1” key) or the right (“0” key) set looked more familiar. One set was a triplet and the other “foil” set contained three colors from three different triplets that never co-occurred before. Participants could only choose the triplet as more familiar if they had learned color co-occurrences within a column or a row in the matrix.

After the test phase, a debriefing session was conducted, where participants were asked if they noticed any colored shapes that appeared with one another. For those who responded yes, we further asked them to specify which colors co-occurred.

Results and discussion

At test, triplets were chosen over foils 62.0% of the time in the local task condition, which was reliably above chance (50%) [t(24) = 3.46, p = .002, d = 0.69]. In the global task condition, however, triplets were chosen over foils 48.4% of the time, which did not differ reliably from chance [t(24) = 0.60, p = .55, d = 0.12]. Thus, learning of the triplets was successful when participants identified the local shape, but not the global shape. During debriefing, seven participants reported noticing the triplets, but none correctly reported which specific colors co-occurred. This suggests that participants had no explicit awareness of the triplets.
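The comparisons against chance are consistent with one-sample t tests on per-participant choice proportions (n = 25 per group, hence 24 degrees of freedom). The sketch below, using only the Python standard library, shows how t and Cohen's d are related in that design. The example proportions are hypothetical, and the assumption that d was computed as the mean difference divided by the sample SD is ours, although it matches the reported values.

```python
import math
import statistics

def one_sample_t(scores, chance=0.5):
    """Return (t, df, d) for a one-sample t test of scores against chance."""
    n = len(scores)
    diff = statistics.mean(scores) - chance
    sd = statistics.stdev(scores)        # sample SD (n - 1 denominator)
    t = diff / (sd / math.sqrt(n))
    d = diff / sd                        # Cohen's d for a one-sample design
    return t, n - 1, d

def d_from_t(t, n):
    """Recover Cohen's d from a one-sample t statistic and the sample size."""
    return t / math.sqrt(n)

# Hypothetical choice proportions, for illustration only.
example = [0.6, 0.7, 0.5, 0.8]
t, df, d = one_sample_t(example)
```

Under this assumption, `d_from_t(3.46, 25)` and `d_from_t(0.60, 25)` round to the reported effect sizes of 0.69 and 0.12.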

The critical test of our hypothesis was whether local regularities draw attention to the local scale. If so, identification of individual shapes in the local regularities condition should be facilitated. Since the overall accuracy was high (93.5%) and the only significant effect was that congruent trials (95.0%) were more accurate than incongruent trials (92.0%) [F(1,96) = 12.10, p < .001, ηp² = .11], we used response times (RTs) as a more sensitive measure. Only correct trials were included and RTs greater than 3 SDs from the mean were removed (0.9% of all trials).
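The exclusion rule can be sketched as follows in Python. The text does not state whether the mean and SD were computed per participant, per condition, or over all trials, nor whether the cutoff was applied in both directions. The sketch assumes the statistics are computed over whatever set of correct trials is passed in and that deviations in either direction are excluded. The function name is hypothetical.

```python
import statistics

def trim_rts(rts, correct, n_sd=3.0):
    """Keep correct-trial RTs that lie within n_sd sample SDs of their mean."""
    kept = [rt for rt, ok in zip(rts, correct) if ok]   # drop error trials first
    mean = statistics.mean(kept)
    sd = statistics.stdev(kept)                         # sample SD (n - 1 denominator)
    # Assumption: the cutoff is symmetric around the mean.
    return [rt for rt in kept if abs(rt - mean) <= n_sd * sd]

# Hypothetical RTs in ms: thirty typical trials, one extreme trial, one error trial.
rts = [500] * 30 + [5000, 400]
correct = [True] * 31 + [False]
trimmed = trim_rts(rts, correct)
```

Note that the cutoff is computed from a mean and SD that include the extreme values themselves, so a single pass of this rule is relatively conservative.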

The RTs of the shape identification task were analyzed with a 2 (condition: local regularities vs. random; between-subjects) × 2 (task: local vs. global; between-subjects) × 2 (matrix: congruent vs. incongruent; within-subjects) mixed-effects ANOVA. There was a main effect of condition [F(1,96) = 4.69, p = .03, ηp² = .05], task [F(1,96) = 5.83, p = .02, ηp² = .06], and matrix [F(1,96) = 68.84, p < .001, ηp² = .42]. The only significant interaction was between condition and task [F(1,96) = 8.63, p = .004, ηp² = .08], suggesting a greater RT difference between the local regularities condition and the random condition in the local task than in the global task (Fig. 1c).
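The partial eta squared values can be recovered from each F statistic and its degrees of freedom, because partial eta squared equals (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies this identity to the F values reported for this ANOVA. The function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F values reported above, each with df = (1, 96).
reported_f = {
    "condition": 4.69,
    "task": 5.83,
    "matrix": 68.84,
    "condition x task": 8.63,
}
effect_sizes = {
    effect: round(partial_eta_squared(f_value, 1, 96), 2)
    for effect, f_value in reported_f.items()
}
```

The computed values (.05, .06, .42, and .08) agree with the effect sizes reported in the text.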

To further explore this interaction, a 2 (condition: local regularities vs. random; between-subjects) × 2 (matrix: congruent vs. incongruent; within-subjects) mixed-effects ANOVA was run in the local task condition, revealing a main effect of condition [F(1,48) = 14.53, p < .001, ηp² = .23] and matrix [F(1,48) = 25.67, p < .001, ηp² = .35], but no interaction [F(1,48) = 0.01, p = .95, ηp² < .001]. Thus, local shape identification was faster in the local regularities condition than in the random condition. For the global task, however, there was no RT difference between the local regularities and the random conditions [F(1,48) = 0.27, p = .61, ηp² = .006]. The RT advantage for local shape identification was also present within the local regularities condition, as local shape identification was faster than global shape identification [F(1,48) = 16.68, p < .001, ηp² = .26]. Thus, local shape identification was facilitated by the triplets, suggesting that local regularities draw attention to the local scale.

Experiment 2

This experiment examined whether global regularities draw attention to the global scale.

Participants

One hundred new undergraduate students (76 female, mean age = 19.6 years, SD = 2.2) from UBC participated for course credit.

Stimuli

As in Experiment 1, the stimuli consisted of the same nine colored squares or diamonds presented in a global square or diamond matrix. To form global regularities, four colors were randomly assigned into one “quadruple” for each participant, and these colors occupied the four corners of the matrix (Fig. 2a). The colors in a quadruple always followed the same spatial order within a matrix. To ensure that participants did not base their judgments on one specific shape in the quadruple, the quadruple could rotate in a clockwise order across trials (Fig. 2a). The other five colors were randomly assigned to the remaining five shapes in the matrix (Fig. 2b). The quadruple constituted a global regularity because the four colors were not presented immediately next to each other, but rather were distributed to the corners of the matrix, indicating the global shape.
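One way to picture the quadruple is as a fixed cyclic order of four colors laid over the corners of the 3 × 3 matrix, shifted clockwise by some number of steps on each trial. The Python sketch below illustrates this reading. It is not the experiment code. The function name, the corner ordering, and the way the rotation step is supplied per trial are assumptions, and the example colors are arbitrary.

```python
import random

# Corner cells as (row, column), listed clockwise starting from the top-left.
CORNERS = [(0, 0), (0, 2), (2, 2), (2, 0)]

def make_global_matrix(quad, others, rotation, rng):
    """Place the quadruple on the corners, rotated clockwise by `rotation` steps,
    and fill the five remaining cells with the other colors in random order."""
    grid = [[None] * 3 for _ in range(3)]
    for i, (r, c) in enumerate(CORNERS):
        grid[r][c] = quad[(i - rotation) % 4]   # cyclic order is preserved
    fillers = others[:]
    rng.shuffle(fillers)                        # non-quadruple colors vary freely
    free = [(r, c) for r in range(3) for c in range(3) if (r, c) not in CORNERS]
    for (r, c), color in zip(free, fillers):
        grid[r][c] = color
    return grid

quad = ["red", "green", "blue", "yellow"]                # hypothetical quadruple
others = ["magenta", "cyan", "gray", "brown", "black"]
grid = make_global_matrix(quad, others, rotation=1, rng=random.Random(0))
```

With a rotation of one step, the first color of the quadruple moves from the top-left to the top-right corner while its clockwise neighbors stay the same, so no color is tied to a fixed location.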

Fig. 2

Experiment 2. (a) In a quadruple, the four colors at the corners of the matrix always appeared in the same spatial order. The quadruple rotated clockwise across trials so that no color always appeared in one fixed location. The five remaining colors were randomly assigned to the five remaining shapes in the matrix. (b) There were two types of congruent matrices and two types of incongruent matrices, as in Experiment 1. (c) Response times of shape identification were analyzed using a 2 (condition: global regularities vs. random; between-subjects) × 2 (task: local vs. global; between-subjects) × 2 (matrix: congruent vs. incongruent; within-subjects) ANOVA. Error bars reflect ±1 SEM; *p < .05; ***p < .001

Procedure

The procedure was identical to that in Experiment 1, except for two changes. First, to enhance learning of the quadruple which only contained four regular colors, the number of trials was increased to 600 trials in total (each matrix type now contained 150 trials). Second, at test each quadruple was presented against a foil where the four colors had never co-occurred before, and participants chose whether the quadruple or the foil looked more familiar.

Results and discussion

At test, quadruples were chosen over foils 52.8% of the time in the local task condition, not reliably above chance (50%) [t(24) = 1.10, p = .28, d = 0.22], and 50.8% of the time in the global task condition, again not reliably above chance [t(24) = 0.32, p = .75, d = 0.06]. This suggests that statistical learning of the quadruples was not successfully expressed at test. During debriefing, no participants reported noticing color co-occurrences.

Since the overall accuracy was high (95.3%) and the only significant effect was that congruent trials (95.8%) were more accurate than incongruent trials (94.9%) [F(1,96) = 14.21, p < .001, ηp² = .13], we used RTs as a more sensitive measure as in Experiment 1. RTs greater than 3 SDs from the mean were removed (1.0% of all trials).

The RTs of the shape identification task were analyzed with a 2 (condition: global regularities vs. random; between-subjects) × 2 (task: local vs. global; between-subjects) × 2 (matrix: congruent vs. incongruent; within-subjects) mixed-effects ANOVA. There was a main effect of condition [F(1,96) = 12.26, p < .001, ηp² = .11], task [F(1,96) = 4.88, p = .03, ηp² = .05], and matrix [F(1,96) = 177.18, p < .001, ηp² = .65]. The interaction between condition and task was significant [F(1,96) = 6.42, p = .01, ηp² = .06], suggesting a greater RT difference between the global regularities condition and the random condition in the global task than in the local task (Fig. 2c).

To further explore this interaction, a 2 (condition: global regularities vs. random; between-subjects) × 2 (matrix: congruent vs. incongruent; within-subjects) mixed-effects ANOVA was run in the global task condition, revealing a main effect of condition [F(1,48) = 26.19, p < .001, ηp² = .35], matrix [F(1,48) = 119.45, p < .001, ηp² = .71], and an interaction [F(1,48) = 11.01, p = .002, ηp² = .19]. Thus, global shape identification was faster in the global regularities condition than in the random condition. For the local task, however, there was no RT difference between the global regularities and the random conditions [F(1,48) = 0.36, p = .55, ηp² = .007]. The RT advantage for global shape identification was also present within the global regularities condition, as global shape identification was faster than local shape identification [F(1,48) = 20.44, p < .001, ηp² = .30]. Thus, global shape identification was facilitated by quadruples, suggesting that global regularities draw attention to the global scale.

To investigate whether the task itself was drawing attention to the regularities, we compared the regularities conditions of the two experiments separately for each task, using a 2 (experiment: 1 vs. 2; between-subjects) × 2 (matrix: congruent vs. incongruent; within-subjects) mixed-effects ANOVA. For the local task, there was a main effect of experiment [F(1,48) = 9.26, p = .004, ηp² = .16], suggesting that local shape identification was faster in the local than in the global regularities condition. Likewise, for the global task, there was a main effect of experiment [F(1,48) = 31.38, p < .001, ηp² = .40], suggesting that global shape identification was faster in the global than in the local regularities condition. This result suggests that task performance critically depends on the type of regularities: Local and global shape identification was selectively facilitated when local and global regularities were present, respectively.

General discussion

The current study examined how regularities between individual objects influenced the spatial scale of attention. Identification of the local shape was faster when local regularities were present, and identification of the global shape was faster when global regularities were present, suggesting that regularities can guide the spatial scale of attention.

The findings can be explained by a positive feedback loop between learning of regularities and attention to regularities. Specifically, the local shape identification task may initially draw attention to local shapes, allowing learning of local regularities (Turk-Browne et al., 2005). Learning of local regularities can in turn draw attention to the local shapes, speeding up local shape identification (Zhao et al., 2013). Likewise, the global shape identification task may initially draw attention to the global shape, allowing learning of global regularities, which in turn draws more attention to the global shape and speeds up global shape identification. As a previous study has suggested that attention is biased toward regularities (Zhao et al., 2013), the current findings extend this bias by demonstrating that the type of regularities can bias the spatial scale of attention. In addition, the results can explain the previously observed interference between summary perception and statistical learning (Hall, Mattingley, & Dux, 2015; Zhao, Ngo, McKendrick, & Turk-Browne, 2011), where regularities defined over individual objects impede summary task performance, presumably because learning about local regularities directs attention to individual objects and impairs global processing.

A surprising finding in Experiment 2 was that the quadruples biased attention to the global scale, and yet, learning of these quadruples was not successfully expressed. This can be explained by the fact that statistical learning occurs without conscious intent, and knowledge about the regularities is often implicit (Brady & Oliva, 2008; Fiser & Aslin, 2001; Kim, Seitz, Feenstra, & Shams, 2009; Zhao et al., 2013). Given the implicit nature of statistical learning, one explanation is that learning of the quadruples was so weak that it was not robustly expressed in the explicit choice between the quadruple and the foil. This finding highlights the distinction between the online detection of regularities and the long-term retention of regularities in memory (Zhao & Yu, 2016; Zhao et al., 2013).

Despite the inherent differences between local and global regularities (i.e., local regularities contained three objects in a triplet, whereas global regularities contained four objects in a quadruple), the size of the precedence effect was similar. That is, the RT advantage for the local task compared to the global task in the presence of local regularities (in Experiment 1) was similar to the RT advantage for the global task compared to the local task in the presence of global regularities (Experiment 2). This could be due to the possibility that local and global regularities involved essentially the same learning mechanism, where the co-occurring objects required the participants to process the joint probabilities between these objects in the matrix. In other words, statistical learning drew similar amounts of attention to locally or globally co-occurring objects. However, it is currently unknown whether the size of the matrix influences the size of the precedence effect. Future studies are needed to elucidate how the spatial extent of regularities determines the degree of the precedence effect.

To conclude, the current study revealed a new factor in guiding spatial attention, namely, how individual objects co-occur in space. Local object co-occurrences prioritize local processing, and global object co-occurrences prioritize global processing.