Surprisingly inflexible: Statistically learned suppression of distractors generalizes across contexts

de Waard, Jasper; Bogaerts, Louisa; van Moorselaar, Dirk; Theeuwes, Jan

doi:10.3758/s13414-021-02387-x


Open access
Published: 03 December 2021

Volume 84, pages 459–473, (2022)

Attention, Perception, & Psychophysics

Abstract

The present study investigates the flexibility of statistically learned distractor suppression between different contexts. Participants performed the additional singleton task searching for a unique shape, while ignoring a uniquely colored distractor. Crucially, we created two contexts within the experiments, and each context was assigned its own high-probability distractor location, so that the location where the distractor was most likely to appear depended on the context. Experiment 1 signified context through the color of the background. In Experiment 2, we aimed to more strongly differentiate between the contexts using an auditory or visual cue to indicate the upcoming context. In Experiment 3, context determined the appropriate response ensuring that participants engaged the context in order to be able to perform the task. Across all experiments, participants learned to suppress both high-probability locations, even if they were not aware of these spatial regularities. However, these suppression effects occurred independent of context, as the pattern of suppression reflected a de-prioritization of both high-probability locations which did not change with the context. We employed Bayesian analyses to statistically quantify the absence of context-dependent suppression effects. We conclude that statistically learned distractor suppression is robust and generalizes across contexts.

Introduction

Most theories of attention posit that attentional selection takes place through a combination of top-down (voluntary, goal-driven) and bottom-up (automatic, stimulus-driven) factors (Corbetta & Shulman, 2002; Jonides, 1981; Posner & Petersen, 1990; Theeuwes, 2010). However, a growing body of literature points to attentional effects that can be explained by neither top-down nor bottom-up processes. To account for these effects, selection history was introduced, representing attentional biases that have been learned, often implicitly, from past experience (Awh et al., 2012). It is assumed that the three components of attentional selection (top-down, bottom-up, and selection history) are combined in an integrated priority map, where the input with the highest priority is selected in a winner-takes-all fashion (Theeuwes, 2018). Selection history effects are studied in paradigms such as contextual cueing (Chun & Jiang, 1998; Goujon et al., 2015), reward or punishment learning (Anderson et al., 2011; Della Libera & Chelazzi, 2009; Grégoire et al., 2020), and, more recently, statistical learning of distractor suppression (e.g., Ferrante et al., 2018; Wang & Theeuwes, 2018b). Of interest to the current study, contextual cueing, reward learning, and punishment learning have been shown to involve context-dependent learning. For example, stimulus features that have been rewarded in a particular context later only capture attention when presented in the same context (e.g., same background scene; Anderson, 2015). By contrast, a study by Britton and Anderson (2020) found that the implicitly learned spatial suppression of distractors was insensitive to context. Given the theoretical importance of this surprising finding, the current study investigated whether learned distractor suppression might become sensitive to context under circumstances where context is more prominent, or whether, alternatively, the modulation of attentional capture by spatial distractor regularities results in generalized suppression across contexts for different types of context manipulations.

Statistical learning concerns the extraction of regularities in space and time from sensory input (see Frost et al., 2019, for a recent review). Statistical learning research gained considerable momentum following the seminal discovery that infants can learn the transitional probabilities from one syllable to the next, facilitating word segmentation (Saffran et al., 1996). Since then, the focus has been extended to adults (Frost et al., 2019), and the statistical learning paradigm was ported to the visual domain by replacing syllables with shapes (Fiser & Aslin, 2002; Turk-Browne et al., 2005). Similarly, spatial relations between shapes and distributional regularities regarding the frequencies of shapes are readily picked up even during passive viewing (Fiser & Aslin, 2001, 2002; Growns et al., 2020). Particularly relevant to the present study is the extraction of regularities concerning distracting stimuli. In the context of visual search, learning the likely properties or location of a distractor can help to decrease distraction, thereby facilitating target detection. Adapting the classic additional singleton paradigm (Theeuwes, 1991), Wang and Theeuwes (2018b) introduced a statistical regularity in the location of the uniquely colored distractor, such that it was far more likely to appear in one location (the high-probability location) than in any of the seven other (low-probability) locations in the search display. As a result, participants learned to suppress the high-probability location. This was reflected in faster search times when the distractor appeared at the high-probability location and slower search times when the target did (see also Ferrante et al., 2018; van Moorselaar & Slagter, 2019). Crucially, an explicit knowledge test at the end of the experiment indicated that learning had taken place in the absence of awareness (but see Vadillo et al., 2016).

Context plays a major role in many theories of learning and memory (e.g., Godden & Baddeley, 1975), and history-based attentional biases have also been suggested to apply “when the relevant context is encountered” (Awh et al., 2012). Context-dependent learning offers a clear advantage. Context-independent learning can only be short-lived, requiring a constant re-learning of biases for contexts that have already been encountered, whereas context-dependent learning allows learned regularities to be stored while new ones are being learned or updated. It should come as no surprise, then, that reward learning (Anderson, 2015; Anderson & Kim, 2018), punishment-based learning (Grégoire et al., 2020), and contextual cueing (Brooks et al., 2010; Jiang & Song, 2005) have been shown to be context-dependent. In contextual cueing, hyperspecificity was reported, with no transfer at all between contexts that differed only in color (Jiang & Song, 2005). Furthermore, the application of different search modes is also context-dependent (Cosman & Vecera, 2013). Lastly, the finding that statistical learning of transitional regularities can be retained for up to 1 year (Arciuli & Simpson, 2012; Kim et al., 2009; Kóbor et al., 2017) is suggestive of context sensitivity; if learning were insensitive to context, those regularities would long since have been replaced by more recently learned ones. There are of course many differences between these paradigms and statistically learned suppression. Notably, studies on reward or punishment-based learning and contextual cueing all involved target-based as opposed to distractor-based learning, which might rely on different processes (Di Caro & Della Libera, 2021; Turatto & Pascucci, 2016; Won & Geng, 2020), and findings from other areas within statistical learning do not necessarily translate to distractor suppression. Nevertheless, the overall picture is one where the benefits of context-dependent learning are found across a range of implicit learning paradigms.

While the arguments presented above would lead to a prediction of context-dependent statistical learning of distractor suppression, a recent study by Britton and Anderson (2020) did not find evidence for these effects. Using an adapted version of the paradigm by Wang and Theeuwes (2018b), they reported suppression effects that generalized across contexts. In their study (Experiment 1), the context on each trial was determined by a grayscale background image of a forest or a city (as in the prior study by Anderson, 2015, on reward learning). Crucially, the high-probability distractor location depended on the context, so that the urban background predicted a different distractor location than the forest. The results indicated that learning had taken place: response times (RTs) were faster when the distractor was at a high-probability versus a low-probability location, even though participants had no awareness of the spatial regularities. However, this learning was insensitive to context. Between the two high-probability distractor locations, RTs were the same whether predicted by the context or not.

Given the discrepancy between advantages of context-dependent learning and the context-dependent effects in related paradigms on the one hand, and the context generalization in Britton and Anderson’s (2020) experiment on the other hand, we conducted three experiments to verify the conclusion that statistically learned distractor suppression generalizes across contexts within a task. Similar to Britton and Anderson’s (2020) experiment, in each of our experiments two contexts had their own high-probability distractor location, and these two locations were maximally distant. Experiment 1 conceptually replicated Britton and Anderson’s study, but signified context through the brightness of the background. To increase the subjective difference between the two contexts, in Experiment 2 we employed an auditory versus a visual cue. In Experiment 3, we coupled each context with a different response mapping, so that processing the context was a necessity for performing the task.

Experiment 1

Experiment 1 was a conceptual replication of Britton and Anderson’s (2020) study (Experiment 1). The background (visible throughout a trial) was light grey in one context, and dark grey in the other context, so that distinguishing between contexts would be effortless and fast. Each context had its own high-probability distractor location, which remained constant throughout the experiment. Following reports of a suppression gradient around the high-probability distractor location (e.g., Wang & Theeuwes, 2018b), the two high-probability locations were kept maximally distant. We used eight rather than six stimuli to increase the salience of the distractors and avoid any potential serial search effects.

The three distractor location conditions of interest were low-probability (a distractor on any of the less frequent locations), high-probability match (a distractor on the high-probability location of the current context), and high-probability mismatch (a distractor on the high-probability location of the other context). Figure 1 illustrates three possible outcomes. If learning is context-independent, each high-probability location should be suppressed equally, irrespective of whether it matches or mismatches with the current context (A). If learning is fully context-dependent, suppression should occur only for the high-probability match condition and RTs for the high-probability mismatch condition should roughly equal those for low-probability (C). Finally, if learning is partially context-dependent, we predict the fastest RTs for the high-probability match condition, but at the same time faster RTs for the high-probability mismatch condition as compared to the low-probability condition (B).

Methods

All experiments were approved by the Ethical Review Committee of the Faculty of Behavioral and Movement Sciences of the Vrije Universiteit Amsterdam.

Participants

Sixty-one adults (32 male, 27 female, one non-binary, one unknown, mean age = 30 years, age range: 20–46) participated in an online experiment through Prolific (Palan & Schitter, 2018). They all reported having normal or corrected-to-normal (color) vision and at least an undergraduate degree. Participation took approximately 30 min, and participants earned £3.75. Following Britton and Anderson (2020), an effect with d = 0.6 (taken from Failing et al., 2019) would require a sample size of 31 to achieve a power (1 − β) of 0.90 at α = 0.05. However, since they did not find a significant result, we aimed to detect a smaller effect size (d = 0.45), which required a sample size of 54 for the same power and α. The number of non-discarded participants exceeded this minimal sample size in all experiments.
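The sample-size reasoning above can be approximated with a short calculation. The sketch below is not the authors' procedure (the source does not name the software used); it assumes a two-sided paired comparison, reads the 0.90 target as statistical power, and uses a normal approximation with a small-sample correction term. The function name `approx_paired_n` is hypothetical. Results should land close to, but not necessarily exactly on, the reported sample sizes.

```python
from math import ceil
from statistics import NormalDist


def approx_paired_n(d, alpha=0.05, power=0.90):
    """Approximate sample size for a two-sided paired (one-sample) t-test.

    Assumption of this sketch: normal approximation plus a correction
    term of z_alpha**2 / 2 to account for using t rather than z.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    n = ((z_alpha + z_power) / d) ** 2 + z_alpha ** 2 / 2
    return ceil(n)


if __name__ == "__main__":
    # Effect sizes mentioned in the text for the planned comparisons
    for d in (0.6, 0.45, 0.35):
        print(f"d = {d}: approximately {approx_paired_n(d)} participants")
```

Because the required n scales with 1/d², moving from d = 0.6 to d = 0.45 raises the required sample by a factor of roughly (0.6/0.45)² ≈ 1.8, which matches the jump in the reported sample sizes.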

Apparatus and stimuli

Because the experiment took place online, some factors (e.g., lighting and seating conditions) could not be controlled. For replication purposes, item sizes and colors are reported in pixels and RGB values (red/green/blue). The experiment was created in OpenSesame (Mathôt et al., 2012) using OSweb, and run using JATOS (Lange et al., 2015).

The experimental display for the two contexts is illustrated in Fig. 2. It consisted of eight shapes (one circle and seven diamonds, or vice versa), presented on an imaginary circle with a radius of 224 px. Each shape contained a grey (128/128/128) vertical or horizontal line (49 × 7 px). The circles and diamonds were 108 and 134 px high, respectively, in red (255/0/0) or green (0/200/0). Depending on the context, the background was light (204/204/204) or dark (51/51/51) grey. The fixation dot was grey (153/153/153, radius: 7 px).
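As an illustration of the display geometry, the following sketch computes item positions on a circle with a 224-px radius. Equal angular spacing and the starting angle (`start_deg`) are assumptions; the source only states that eight shapes were placed on an imaginary circle. The function name is hypothetical.

```python
import math

RADIUS_PX = 224    # radius of the imaginary circle, from the text
N_LOCATIONS = 8    # number of search items, from the text


def stimulus_positions(radius=RADIUS_PX, n=N_LOCATIONS, start_deg=0.0):
    """Return (x, y) offsets from the display centre for n items.

    Assumptions: items are equally spaced, and the first item sits at
    start_deg degrees; neither detail is specified in the source.
    """
    positions = []
    for i in range(n):
        theta = math.radians(start_deg + i * 360.0 / n)
        positions.append((radius * math.cos(theta), radius * math.sin(theta)))
    return positions


if __name__ == "__main__":
    for index, (x, y) in enumerate(stimulus_positions()):
        print(f"location {index}: x = {x:7.1f} px, y = {y:7.1f} px")
```

With eight equally spaced items, locations i and i + 4 lie diametrically opposite each other, which is the largest separation available on the circle.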

Procedure and design

Figure 2 gives a schematic overview of a trial. The duration of the fixation period was randomly selected between 1,000 and 1,250 ms. The search display was visible until response or until a 3,000-ms limit was exceeded. Participants searched for the unique shape (i.e., a circle among diamonds or vice versa), and indicated the orientation of the line segment inside (horizontal/vertical) by pressing the up or left arrow key as quickly as possible. Subsequently, a smiley provided positive (250 ms) or negative (750 ms) feedback, followed by a blank screen (150 ms). The longer duration of negative feedback ensured that participants who aimed to finish the experiment quickly would benefit from providing correct responses. The background color, which distinguished the contexts, remained constant and visible throughout a trial.

The target was always the uniquely shaped item, while the distractor was the uniquely colored item. Each context occurred equally often. A target was present on each trial, containing a line that was vertical or horizontal at random. A uniquely colored distractor was present on 84% of the trials. The distractor could be present at any of the eight locations. However, within each context, one distractor location occurred more often (67%) than the other locations (4.7% per location). The two high-probability locations (one for each context) were determined randomly for each participant, with the high-probability location of one context always opposite to that of the other context. The target location was determined randomly on each trial. Participants completed 20 practice trials, followed by four blocks of 125 trials each. A break was included after every block, and trial order was randomized within blocks. Awareness of the spatial regularities was assessed after all trials were completed by asking participants whether the distractor appeared more frequently in one location, and secondly to indicate this location in four trials (context A/B × circle/diamond-shaped target) by typing in a location-based number (1–8).
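The design can be sketched as a trial generator. This is a minimal illustration, not the authors' OpenSesame implementation. Several details are assumptions: locations are drawn independently per trial rather than from fixed counts, the 67% and 4.7% figures are treated as shares of distractor-present trials (they sum to roughly 100%), the target is never placed on the distractor's location, and contexts alternate before shuffling so that they occur about equally often. All names are hypothetical.

```python
import random

N_LOCATIONS = 8
P_DISTRACTOR = 0.84   # proportion of distractor-present trials
P_HIGH = 0.67         # assumed share of distractor-present trials at the high-probability location


def assign_high_probability_locations(rng):
    """Pick one location per context, with the two locations opposite each other."""
    hp_a = rng.randrange(N_LOCATIONS)
    hp_b = (hp_a + N_LOCATIONS // 2) % N_LOCATIONS
    return {"A": hp_a, "B": hp_b}


def make_trial(context, hp_location, rng):
    """Draw one trial; the distractor may be absent (None)."""
    distractor = None
    if rng.random() < P_DISTRACTOR:
        if rng.random() < P_HIGH:
            distractor = hp_location
        else:
            # Remaining probability is spread evenly: (1 - 0.67) / 7, about 4.7% each
            others = [loc for loc in range(N_LOCATIONS) if loc != hp_location]
            distractor = rng.choice(others)
    # Assumption: target and distractor never share a location
    free = [loc for loc in range(N_LOCATIONS) if loc != distractor]
    return {
        "context": context,
        "distractor": distractor,
        "target": rng.choice(free),
        "line": rng.choice(["horizontal", "vertical"]),
    }


def make_block(hp_by_context, n_trials=125, rng=None):
    """Build one shuffled block with contexts occurring about equally often."""
    rng = rng or random.Random()
    contexts = sorted(hp_by_context)
    trials = []
    for i in range(n_trials):
        context = contexts[i % len(contexts)]
        trials.append(make_trial(context, hp_by_context[context], rng))
    rng.shuffle(trials)
    return trials


if __name__ == "__main__":
    rng = random.Random(2021)
    hp = assign_high_probability_locations(rng)
    blocks = [make_block(hp, rng=rng) for _ in range(4)]
    print(hp, sum(len(block) for block in blocks))
```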

Results

The data for one participant were discarded because the experiment was not fully completed. A further five participants were discarded because their accuracy scores were below 75%. Incorrect trials (4.2% of trials) and trials on which the RTs were slower than 2,000 ms (4.8% of trials) were excluded from all further analyses. As there was no evidence in support of a speed-accuracy trade-off (neither here, nor in Experiments 2 and 3), we only report RT results.

First, to test whether the statistical regularity in distractor location probability (across contexts) modulated suppression, we performed a repeated-measures ANOVA. Second, to test whether there was evidence for context-dependent statistical learning, we performed a t-test comparing high-probability match and high-probability mismatch trials. To quantify the strength of evidence for the null hypothesis, we performed a Bayesian t-test for the same comparison. Finally, we analyzed participants’ awareness of the regularities using Bayes factors.

ANOVAs and t-tests were performed using Jamovi (Sahin & Aybek, 2019). Note that due to a violation of the sphericity assumption, a Greenhouse-Geisser correction was applied to the ANOVA results. Bayesian analyses were performed in JASP (JASP Team, 2020), using the default Cauchy distribution (scale = 0.707) as the prior. Reported Bayes factors reflect the likelihood of the data under the null hypothesis H₀ relative to their likelihood under the alternative hypothesis H₁ (i.e., BF₀₁).

Statistical learning: Are search times modulated by distractor probability?

Figure 3A shows mean RTs for the high-probability (in either context), low-probability, and no-distractor conditions. A one-way repeated-measures ANOVA with the factor distractor condition showed a significant main effect on RTs, F(1.81, 97.76) = 177, p < .001, partial η² = 0.767. Planned comparisons showed that the distractor captured attention reliably for both the low-probability, t(54) = 16.41, p < .001, d = 2.21, and high-probability locations, t(54) = 11.22, p < .001, d = 1.51. Furthermore, they revealed a reliable difference between the high- and low-probability locations, t(54) = 9.10, p < .001, d = 1.23, indicating that participants learned the overall regularities in the experiment.
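For within-participant comparisons, a t value and its effect size are linked by d = t / √n when d is computed on the difference scores. The sketch below applies that relation to the t values reported above. It is an assumption that the reported d values are of this type, since the source does not say which variant of Cohen's d was used. The function name is hypothetical.

```python
from math import sqrt

# Participants retained: 61 enrolled, 1 incomplete, 5 below the accuracy criterion
N_RETAINED = 61 - 1 - 5   # 55, consistent with the reported df of 54


def dz_from_t(t, n=N_RETAINED):
    """Effect size on difference scores for a paired t-test: d = t / sqrt(n)."""
    return t / sqrt(n)


if __name__ == "__main__":
    reported_t = {
        "low-probability vs. no distractor": 16.41,
        "high-probability vs. no distractor": 11.22,
        "high- vs. low-probability": 9.10,
    }
    for label, t in reported_t.items():
        print(f"{label}: t = {t}, d = {dz_from_t(t):.2f}")
```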

Is the learned distractor suppression context-dependent?

Figure 3B shows the mean RTs for distractor location in greater detail. To investigate whether the distractor suppression was context-dependent, distractor location was coded as the distance from the high-probability location of the current context (high-probability match). As outlined in Fig. 1, the relevant comparison for investigating context-dependency is between high-probability match and high-probability mismatch trials. A paired t-test revealed that this difference was nonsignificant, t(54) = 0.77, p = .445, d = 0.10, BF₀₁ = 5.13. This Bayes factor indicates that the observed data are about five times more likely to have occurred under the null hypothesis, providing substantial (Jeffreys, 1998) evidence that the learned distractor suppression was not context-dependent. RTs for the high-probability mismatch location were faster than for the low-probability locations, t(54) = 3.94, p < .001, d = 0.53, indicating that the mismatch location was indeed suppressed.
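The distance coding described above can be sketched as follows. This is a hypothetical helper, assuming locations are indexed 0–7 around the circle and that distance is counted in steps along the shorter arc. Under that coding the high-probability match location has distance 0 and, because the two high-probability locations are opposite each other, the mismatch location has distance 4.

```python
N_LOCATIONS = 8


def circular_distance(a, b, n=N_LOCATIONS):
    """Number of steps between two locations along the shorter arc."""
    diff = abs(a - b) % n
    return min(diff, n - diff)


def distractor_condition(distractor, hp_current, hp_other):
    """Label a trial by where the distractor appeared (None = no distractor)."""
    if distractor is None:
        return "no distractor"
    if distractor == hp_current:
        return "high-probability match"
    if distractor == hp_other:
        return "high-probability mismatch"
    return "low-probability"


if __name__ == "__main__":
    hp_current, hp_other = 1, 5   # example assignment; chosen randomly per participant
    for loc in range(N_LOCATIONS):
        print(loc, circular_distance(loc, hp_current),
              distractor_condition(loc, hp_current, hp_other))
```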

Awareness of the regularities

About half of the participants (47%) answered “yes” to the question of whether the distractor occurred more often at some locations than others. An awareness score was computed for every participant by taking the average distance between the location indicated by the participant and the actual high-probability distractor location on the four awareness trials. The mean awareness score across participants was 2.09 (SD = 0.53). A one-sided Bayesian t-test comparing awareness scores against chance level (2.0) yielded a BF of 12.4 in favor of no awareness. Furthermore, we found no difference in awareness scores between participants who answered “yes” and those who answered “no” to the first question, t(53) = 0.95, p = .345, d = 0.26, BF = 2.52. We also repeated all previous RT analyses separately for the “yes” and the “no” group, and found no differences in the pattern of results. Most importantly, there was no difference in RTs between the high-probability match and mismatch conditions in either the “yes” group, t(25) = 1.21, p = .239, BF = 2.51, d = 0.24, or the “no” group, t(28) = 0.07, p = .942, BF = 5.05, d = 0.01. We conclude that participants had no or very little awareness of the learned regularities.
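The chance level of 2.0 follows from the geometry of eight locations on a circle: a uniformly random guess lies 0, 1, 2, 3, 4, 3, 2, or 1 steps from the true location, which averages to 16 / 8 = 2. The sketch below reproduces that value and shows how a participant's score could be computed. It assumes distance is counted along the shorter arc and that each awareness trial is scored against the high-probability location of its own context; the example responses are made up for illustration.

```python
from statistics import mean

N_LOCATIONS = 8


def circular_distance(a, b, n=N_LOCATIONS):
    """Number of steps between two locations along the shorter arc."""
    diff = abs(a - b) % n
    return min(diff, n - diff)


def chance_awareness_score(n=N_LOCATIONS):
    """Expected distance when the indicated location is a uniform random guess."""
    return mean(circular_distance(guess, 0, n) for guess in range(n))


def awareness_score(indicated, actual):
    """Average distance across awareness trials (lower = more accurate)."""
    return mean(circular_distance(i, a) for i, a in zip(indicated, actual))


if __name__ == "__main__":
    print("chance level:", chance_awareness_score())
    # Hypothetical participant: four responses (locations 1-8) and the
    # high-probability location of the context probed on each trial
    print("example score:", awareness_score([2, 7, 3, 6], [3, 7, 3, 7]))
```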

Discussion

The results of Experiment 1 indicate that statistical learning of distractor suppression is independent of context. Replicating Wang and Theeuwes (2018a, b), we observed that RTs were faster when the distractor was at a high-probability location than when it was at a low-probability location. This indicates that participants learned the overall regularities of the experiment. Crucially, however, there was no context-specific suppression effect; participants responded equally fast whether the distractor location matched the current context (high-probability match) or the other context (high-probability mismatch). This finding is in line with Britton and Anderson (2020). With regard to the predictions of Fig. 1, the results are consistent with what is displayed in panel A, indicating that the learned suppression of a probable distractor location generalized across contexts.

Analyses on awareness indicate that, at the group level, participants were likely unaware of the regularities (although caution in the interpretation is warranted, see Vadillo et al., 2016). We conclude that the suppression of distractor locations is most likely the consequence of implicit learning.

Experiment 2

In Experiment 2, we attempted to increase the chances of finding a significant context-dependent suppression effect. Since learned distractor suppression is proactive (Huang et al., 2021), context-specific suppression necessarily relies on an effective instantiation of context through the cue. In the domain of temporal preparation, Los et al. (2021) showed that a between-modalities cue is more effective in producing a temporal context than a within-modalities cue. We reasoned that this might also apply to spatial statistical learning. Therefore, we used a cue that was either a flashed ring or a tone, presented at the start of each trial. To further increase our chances of observing a significant effect, we decreased the smallest effect size that we would be able to detect to d = 0.35 by increasing the number of non-discarded participants to 95.