Introduction

Experimental psychologists apply several methods to understand the interplay between cognitive representations and cognitive processing. Among those methods, one particularly effective approach involves systematic variation of the attributes of two successively presented stimuli: a prime followed by a target (also described as a cue followed by a probe). Measuring the speed and accuracy of the observers’ responses to the target can then reveal facilitation of cognitive processing as a result of shared cognitive representations with the prime. This so-called priming approach to the human mind (Bargh & Chartrand, 2000; McNamara, 2005; Weingarten et al., 2016) has informed a number of research domains, ranging from fundamental cognitive capacity limitations to the representation and processing of spatial attention shifts and spatial meaning representation. The relationship between these latter two aspects of cognition defines the focus of the present study.

For the study of spatial attention shifts, Posner, Nissen, and Ogden (1978; Posner, 1980) developed the attentional priming paradigm in which either a peripheral visual onset or a centrally shown arrow primes the processing of the subsequently presented peripheral target. The typical result is faster target detection with spatially congruent compared to incongruent prime-target relations. This influential method has been widely replicated to reveal the time course of attention shifts (e.g., Klein, 2000).

For the study of spatial meaning representation, Logan (1995) and Hommel et al. (2001) found similar congruency effects when using words with explicit spatial meaning as primes (e.g., UP, DOWN, LEFT, RIGHT). These findings of conceptual cueing of attention were also repeatedly replicated (e.g., Gibson & Kingstone, 2006; Gibson & Sztybel, 2014; Pauszek & Gibson, 2018; see also Ostarek & Vigliocco, 2017). Thus, both directional arrow primes and explicitly spatial word primes induce spatial attention shifts that normally result in facilitated processing of congruent targets.

Further research has generalized this spatial priming effect to implicitly spatial words, such as religious expressions (e.g., GOD, SATAN; Chasteen et al., 2010), time-related words (BEFORE, AFTER; e.g., Beracci et al., 2022; Bonato et al., 2012) and single digits (e.g., 1, 2, 8, 9; Fischer et al., 2003), thereby characterizing details of a fundamental aspect of human symbolic cognition with spatially associated concepts.

However, Estes et al. (2008) repeatedly observed interference rather than facilitated processing for implicitly spatial prime words in congruent conditions. For example, the prime word ‘HAT’, which is associated with upper space, led to poorer discrimination performance between target letters ‘X’ and ‘O’ that were subsequently presented at the top compared to the bottom of the screen. The authors interpreted this result as evidence for a bi-phasic process: (a) the semantically induced shift of spatial attention to the primed location, followed by (b) perceptual simulation of the prime object that effectively masks the visual target and impairs target performance (for other interpretations, see Amer et al., 2018).

This counter-intuitive spatial interference effect with lexical primes has been replicated by Gozli et al. (2013), who independently assessed (a) the semantically induced spatial attention shift with a detection response on the space bar; and (b) the perceptual simulation penalty with a discrimination response on two lateralized keys. Testing both abstract and concrete spatially associated nouns (e.g., GOD and SKY, respectively), they observed spatial interference with short stimulus onset asynchronies (SOAs) between prime and target (200–400 ms SOAs) versus normal facilitation with long SOAs (800–1,200 ms).

While some other studies found no effect of spatially associated words on attention shifts (for review, see Petrova et al., 2018), others concluded that the attention shift might depend on processing depth of the prime (for review, see Estes & Barsalou, 2018). A similar argument was recently proposed to resolve the debate about spatial attention shifts induced by time words (von Sobbe et al., 2019) and by digits (Fischer et al., 2020). The present study aimed to shed new light on this ongoing debate by manipulating processing depth for explicit and implicit primes in both a bottom-up and a top-down manner. Like others (e.g., Amer et al., 2018; Klein, 2000), we manipulated attention bottom-up by varying the time available between centrally presented primes and peripherally presented targets (the SOA), thereby allowing different amounts of time to elapse before assessing the impact of our cues on the spatial distribution of attention, and we compared performance for explicit versus implicit primes. Extending previous work on the top-down manipulation of spatial attention (e.g., Lupyan, 2012), we also manipulated processing depth through the task instructions given to participants for responding in a go/no-go task. Our 'go' rules separately manipulated the processing depth for the prime and for the target, as shown in Table 1.

Table 1 Overview of task instructions

Specifically, we studied the attentional consequences of successively deeper spatial processing of either primes or targets. In Task 1 (detection), participants were told to ignore the centrally presented primes and to respond whenever a peripheral target was presented, regardless of its location or identity. This condition assessed the size of the well-documented effect of automatic attention shifts to peripheral target onsets in our paradigm (Posner, 1980; Posner et al., 1978).

In Task 2 (localization), participants also ignored the centrally presented primes but they only responded when the peripheral target appeared in the relevant location, resulting in two go rules (for targets in the upper vs. lower visual field). This instruction may result in spatial biases favouring one over the other location.

Finally, in Task 3 (semantic), responses were only allowed when the central primes had the task-relevant meaning (as stated in the go rule) and peripheral targets appeared at the same task-relevant locations; this conjunction resulted in four go rules (two congruent and two incongruent combinations) and required participants to compute the congruency relationship prior to responding in each trial.

A critical issue here is whether space-related prime words are processed differently when presented to an observer who is operating either without a spatial set (Task 1: detection) or with an instructed spatial set (Task 2: localization). We considered this to be an important difference because it addresses potential interactions between bottom-up and top-down control over spatial attention. Specifically, although the cue words were irrelevant under both task instructions, their central presentation made them hard to ignore, thus enabling potential congruency effects between their spatial meaning and the instructed spatial set in Task 2 (localization) to occur.

Our experimental tasks were generally designed to replicate previous work in order to clarify reasons for divergent results and to facilitate comparability of findings. There were two reasons for presenting two different targets despite requiring participants merely to detect those targets in the present set of tasks. First, previous studies used a target discrimination task, so they always presented two different objects as targets (cf. Estes et al., 2015; Petrova et al., 2018, Table 1). Second, and importantly, the mere variability of target features can affect the depth of attentional filtering (consider the textbook example of serial vs. parallel visual search slopes, i.e., different spatial processing resulting from target feature variability). Target feature variability might thus influence the spatial interactions we wanted to study in the present paradigm. Therefore, it was essential to replicate this important feature for comparability with previous work.

We expected that spatial priming effects, regardless of their direction, would be similar for explicit primes in all tasks, thus replicating the previous literature and establishing a results pattern against which to compare the processing of implicitly spatial primes. For those latter conditions, we expected the priming effect to become stronger with cognitively deeper spatial processing. Finally, consistent with observations in Estes and Barsalou’s (2018) meta-analysis for deep orthographies such as Hebrew, the (negative) spatial priming effect should be largest at the shortest SOA, thereby reflecting rapid influence of bottom-up processing on the cognitive mechanism responsible for the spatial priming effects in this paradigm. We note that the present notion of processing depth refers to the analysis of the cues’ meaning and does not constitute a manipulation of orthographic processing depth as identified by Estes and Barsalou (2018).

Experiment 1: Explicit prime words

Method

Participants

A total of 79 native Hebrew-reading Israeli adults were recruited from the student populations of Ariel, Israel. They were 14 males and 65 females with ages ranging from 19 to 33 years (mean: 22.9 years). Based on the available effect sizes from similar conditions in the meta-analysis by Estes and Barsalou (2018), and G*Power calculations (Version 3.1.9.2; Faul et al., 2007), the recommended number of participants was approximately 40. In light of frequent failures to replicate the spatial interference effect (Petrova et al., 2018), we aimed to test 80 participants in each of the main experiments (cf. Simonsohn, 2015).

Stimuli and apparatus

The stimulus set consisted of four up-associated spatial words (meaning: up, upper, high, and above) and four down-associated spatial words (meaning: down, lower, low, and below) presented visually in Hebrew. These words had also been used as directional primes in previous published work (see above). All Hebrew words were four to five letters long and shown in black Arial font with 35-pt size on white background. Four additional non-spatial words with similar length and frequency in Hebrew (meaning: door, banana, clock, key; cf. Frost & Plaut, 2005) were used as fillers. Primes were displayed at the centre of the screen, while the targets (letters X or O) appeared centred horizontally and positioned 8° vertically above or below the centre of the display (Estes et al., 2008). Responses were made by pressing the space bar of a QWERTY keyboard. All other keyboard keys were covered. The presentation of task instructions, stimuli, event timing and response recording was controlled by Experiment-Builder software (SR Research, 2011).

Design

In the Detection Task there were 102 trials, comprising 32 catch trials without target and 72 experimental trials (69% go trials). These latter trials reflected the complete crossing of three SOAs, two target locations, two target identities, and three prime types. These 36 trials were randomly presented twice with randomly chosen exemplars of each prime type.

In the Localization Task there were also 102 trials, comprising two blocks of 51 trials, one for each response rule (go if target location is up or go if target location is down). In each block, we combined 33 go trials with 18 no-go trials (65% go trials). These were drawn from combining three SOAs and three prime types and then randomly selecting exemplars of each prime type.

In the Semantic Task there were 132 trials due to four response rules, reflecting the crossing of two prime meanings and two target locations. In each counterbalanced response rule block there were 33 trials, of which 21 were go trials (64% go trials). These 21 go trials reflected three SOAs and seven random combinations of target identity and prime word exemplar. The no-go trials were random combinations of the experimental factors.
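The trial counts that follow directly from the factor structure can be checked with a few lines of arithmetic. The sketch below is only an illustration of the stated design, not the software used in the study; the variable names and level labels are assumptions introduced here.

```python
from itertools import product

# Factor levels as described in the Design section (label strings are illustrative).
SOAS_MS = (400, 600, 800)
TARGET_LOCATIONS = ("up", "down")
TARGET_IDENTITIES = ("X", "O")
PRIME_TYPES = ("up", "down", "neutral")

# Detection task: complete crossing of the four factors, presented twice.
detection_cells = list(
    product(SOAS_MS, TARGET_LOCATIONS, TARGET_IDENTITIES, PRIME_TYPES)
)
detection_go_trials = 2 * len(detection_cells)


def go_percentage(go_trials, block_trials):
    """Share of go trials in a block, rounded to a whole percent."""
    return round(100 * go_trials / block_trials)


# Localization task: two go-rule blocks of 33 go + 18 no-go trials each.
localization_total = 2 * (33 + 18)
# Semantic task: four go-rule blocks of 33 trials each, 21 of them go trials.
semantic_total = 4 * 33

print(len(detection_cells), detection_go_trials)
print(localization_total, go_percentage(33, 51))
print(semantic_total, go_percentage(21, 33))
```

The crossing yields 36 cells and 72 experimental detection trials, and the block sizes reproduce the reported totals and go rates for the localization (102 trials, 65%) and semantic (132 trials, 64%) tasks.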

Procedure

All trials consisted of two successive visual events: a lexical prime at fixation to which participants did not overtly react, and a target on which they made a decision that depended on the go rule in a go/no-go task (see Table 1). The SOA between these events was 400, 600 or 800 ms, reflecting the range of values that were effective in previous work (Estes & Barsalou, 2018).

Each trial was initiated by a central fixation dot presented for 250 ms. Then, the prime word was presented for 250 ms. Finally, the target letter (X or O) was presented after one of three randomly chosen SOAs (400, 600 or 800 ms after the onset of the prime word) and remained visible until the participant’s response or until 2,000 ms had elapsed (also in no-go trials). Reaction time (RT) was defined as the time from target onset until the participant’s response on the space bar. No feedback was given, regardless of whether the response was a hit, miss, correct rejection or false alarm.
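The event timing described above can be laid out as a simple schedule. This is a minimal sketch under two assumptions that the text does not state explicitly: the prime follows the fixation dot without a gap, and all times are counted from trial start. The function and key names are hypothetical.

```python
FIXATION_MS = 250        # duration of the fixation dot
PRIME_MS = 250           # duration of the prime word
TARGET_TIMEOUT_MS = 2000 # maximum target duration without a response
SOAS_MS = (400, 600, 800)


def trial_timeline(soa_ms):
    """Return event times (ms from trial start) for one trial.

    Assumes the prime appears immediately at fixation offset.
    """
    prime_onset = FIXATION_MS
    prime_offset = prime_onset + PRIME_MS
    target_onset = prime_onset + soa_ms  # SOA runs from prime onset
    return {
        "prime_onset": prime_onset,
        "prime_offset": prime_offset,
        "prime_to_target_gap": target_onset - prime_offset,
        "target_onset": target_onset,
        "latest_trial_end": target_onset + TARGET_TIMEOUT_MS,
    }


for soa in SOAS_MS:
    print(soa, trial_timeline(soa))
```

Under these assumptions, the interval between prime offset and target onset is 150, 350 and 550 ms for the three SOAs.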

Participants were instructed to quickly and accurately make a decision according to the response rule (see Table 1) by pressing the space bar. Verbatim instructions for each task appear in Appendix 1. All participants worked under all response rules in a counterbalanced order, always beginning with eight practice trials. For the sake of efficiency, as well as in order not to repeat each prime word many times, the prime word (with upper, lower or neutral connotation), target location (top, bottom), target letter (X or O), and SOAs (three levels) were not fully randomized within participants. As mentioned above, this means that a given participant saw randomly chosen exemplars of each prime word in each condition of our design, with twice as many spatial compared to neutral primes.

Analysis

A total of 79 participants were tested but eight did not complete all sessions and were excluded from analyses. Practice and filler trials were not analysed. Forty-one trials (1.8%) reflected anticipatory responses in catch trials and 49 trials (0.9%) reflected omitted responses in go trials. These error trials ranged only from 0% to 6% across the remaining 71 participants so that none was excluded. Finally, RTs for go trials that deviated by more than 2.5 standard deviations from the group mean were excluded (119 trials). The remaining trials were averaged across Target Identity (X, O) and analysed with a repeated-measures analysis of variance (ANOVA) that evaluated the effects of three Task Instructions (detection, localization, semantic; see Table 1), two Prime Directions (down, up), two Target Locations (down, up) and three SOAs (400, 600, and 800 ms) on RT. Results were computed with JASP Version 0.16.3 (JASP Team, 2022). Full results appear in Appendix 2.
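The 2.5 standard deviation criterion can be sketched as follows. This illustration pools all RTs into one list and uses the sample standard deviation; whether the study computed the criterion over all go trials or separately per task or condition is not specified, so the pooling is an assumption, and the example RTs are invented purely for demonstration.

```python
from statistics import mean, stdev


def trim_rts(rts_ms, criterion_sd=2.5):
    """Drop RTs further than criterion_sd standard deviations from the mean.

    Returns the retained RTs and the number of excluded trials.
    """
    centre = mean(rts_ms)
    spread = stdev(rts_ms)  # sample standard deviation (n - 1)
    lower = centre - criterion_sd * spread
    upper = centre + criterion_sd * spread
    kept = [rt for rt in rts_ms if lower <= rt <= upper]
    return kept, len(rts_ms) - len(kept)


# Hypothetical RTs in ms: twenty ordinary responses plus one very slow one.
example_rts = [380, 390, 400, 410, 420] * 4 + [1500]
kept, n_excluded = trim_rts(example_rts)
print(len(kept), n_excluded)
```

Because the mean and standard deviation are themselves inflated by extreme values, a single trimming pass of this kind is conservative; the example removes only the one very slow response.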

Results and discussion

There was a significant main effect of Task Instructions, F(2, 140) = 57.199, p < .001, ηp2 = .450. Average RTs for the detection, localization and semantic tasks were 450 ms, 392 ms and 392 ms, respectively. Post hoc t-tests showed that the detection task was significantly slower than the other two tasks (requiring deeper spatial processing), which did not differ. This disadvantage of the ‘shallow’ detection task compared to the other two tasks that were designed to induce ‘deeper’ spatial processing likely reflects more cautious responding in the presence of catch trials, which were only used in the first task (e.g., Grice et al., 1974; cf. Luce, 1986, p. 55f.).

The reliable main effect of SOAs, F(2, 140) = 21.794, p < .001, ηp2 = .237, was qualified by significant two-way interactions with Task Instruction, F(4, 280) = 26.486, p < .001, ηp2 = .275, and with Target Location, F(2, 140) = 3.600, p = .030, ηp2 = .049. The Task Instruction × SOA interaction reflected a general decrease of RT with longer SOAs in the localization and semantic tasks (a typical foreperiod effect, cf. Luce, 1986), compared to an increase in participants’ RT for the long SOA in the detection task, again indicating anticipation of catch trials. The Target Location × SOA interaction reflected a small processing advantage for upper targets with increasing SOAs (cf. Fischer et al., 1999; Heywood & Churcher, 1980; Previc, 1990).

More importantly, there was a significant interaction between Prime Direction and Target Direction, F(1, 70) = 26.203, p < .001, ηp2 = .272. This result shows the typical congruency benefit reported in the literature (e.g., Hommel et al., 2001; Logan, 1995). The congruency effect was not modulated by processing depth, as indicated by the complete absence of a triple interaction of Prime Direction and Target Direction with Task Instruction, F < 1. The upper panel of Figure 1 shows this result. All other main effects and interactions were non-significant, all p-values > .237.
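The reported effect sizes can be recovered from the F ratios and their degrees of freedom via the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies this identity to values reported above; the function name is introduced here for illustration only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the Results of Experiment 1.
reported = {
    "Task Instructions": (57.199, 2, 140),
    "SOA": (21.794, 2, 140),
    "Prime Direction x Target Direction": (26.203, 1, 70),
}

for effect, (f_value, df_effect, df_error) in reported.items():
    print(effect, round(partial_eta_squared(f_value, df_effect, df_error), 3))
```

Applied to these three effects, the identity returns .450, .237 and .272, matching the reported values.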

Fig. 1 Results of Experiment 1 (upper panels) and Experiment 2 (lower panels). Error bars reflect 1 SEM. Stars indicate significant interactions of Prime Direction and Target Direction. For Task descriptions see Table 1

Experiment 2: Implicit prime words

As expected, Experiment 1 found evidence for the typical spatial facilitation effect through the use of explicit spatial prime words. This priming benefit was independent of processing depth of either the primes or the targets, as indicated by lack of interactions with either task instruction or SOA. Experiment 2 asked whether this pattern of results would extend to implicit spatial prime words that were previously reported to either induce a spatial interference effect (Estes & Barsalou, 2018) or not (Petrova et al., 2018).

Method

Participants

We recruited 77 new participants from the same population for the second experiment. They were 12 males and 65 females with ages ranging from 18 to 30 years (mean: 22.7 years).

Stimuli and apparatus

The stimulus set consisted of four Hebrew words associated with upper locations (meaning: hat, roof, cloud, peak) and four Hebrew words associated with lower locations (meaning: carpet, floor, basement, roots). In addition, the four non-spatial object words from Experiment 1 served as fillers. The connotative word database was a translated subset of stimuli from the norming pre-tests of Estes et al. (2015) and Petrova et al. (2018), after being normed by Hebrew speakers. Three native Hebrew speakers ranked the words according to the strength of their spatial connotations, on a vertical scale from 1 (‘extremely low’) to 5 (‘extremely high’) (see Estes et al., 2015). Then we selected the most strongly associated words while controlling length and frequency. The upward (range = 4.00–4.66) and downward primes (range = 1.33–2.00) had non-overlapping ranges of spatial associations. In addition, the above-mentioned database served as frequency control, as well as for selecting the non-spatial words. All other characteristics of stimuli and apparatus were identical to Experiment 1.

Design and procedure

These were identical to Experiment 1.

Analysis

A total of 77 participants were tested but four did not complete all sessions and were excluded from analyses. Forty-three trials (1.9%) reflected anticipatory responses in catch trials and 81 trials (1.6%) reflected omitted responses in go trials. These error trials ranged only from 0% to 8.2% across the remaining 73 participants so that none was excluded. Finally, RTs for go trials that deviated by more than 2.5 standard deviations from the group mean were excluded (140 trials). The remaining trials were averaged across Target Identity (X, O) and analysed with the same ANOVA design as before. Full results appear in Appendix 3.

Results and discussion

There was a significant main effect of Task Instruction, F(2, 144) = 10.148, p < .001, ηp2 = .124. Average RTs for the detection, localization and semantic tasks were 462 ms, 416 ms and 433 ms, respectively. Post hoc t-tests showed again the processing cost associated with the use of catch trials: Shallow target detection was significantly slower than the other two tasks (requiring deeper spatial processing), which did not statistically differ.

The reliable main effect of SOA, F(2, 144) = 55.885, p < .001, ηp2 = .437, was qualified by a significant two-way interaction with Task Instruction, F(4, 288) = 22.737, p < .001, ηp2 = .240. This interaction again reflected a foreperiod effect of anticipating the target in the localization and semantic tasks, compared to anticipation of a catch trial resulting in increased RTs for the long SOA in the detection task (cf. Luce, 1986).

More importantly, there was a significant interaction between Prime Direction and Target Direction, F(1, 72) = 6.907, p = .010, ηp2 = .088. This result was modulated by processing depth, as indicated by the triple interaction with Task Instructions, F(2, 144) = 7.686, p < .001, ηp2 = .096. Post hoc t-tests on the priming benefits in the three tasks revealed that only the semantic task incurred a reliable priming benefit (19.3 ms, t(72) = 3.434, p < .001), while the detection and localization tasks showed negligible priming (4 ms and -4 ms, t(72) = 1.059 and t(72) = -1.198, respectively, both p-values > .23). The lower panel of Fig. 1 shows these results. All other main effects and interactions were non-significant, all p-values > .12.
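A priming benefit of this kind is commonly computed per participant as the mean RT in incongruent cells minus the mean RT in congruent cells, and then tested against zero. The sketch below illustrates that logic only; that the post hoc tests were one-sample t-tests of this difference score is an assumption (it is consistent with the reported 72 degrees of freedom for 73 participants), and all numbers in the example are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def priming_benefit(rt_by_cell):
    """Incongruent minus congruent mean RT (ms) for one participant.

    rt_by_cell maps (prime_direction, target_location) to a mean RT.
    Positive values indicate faster responses on congruent trials.
    """
    congruent = mean([rt_by_cell[("up", "up")], rt_by_cell[("down", "down")]])
    incongruent = mean([rt_by_cell[("up", "down")], rt_by_cell[("down", "up")]])
    return incongruent - congruent


def one_sample_t(values):
    """t statistic and degrees of freedom for a test of the mean against zero."""
    n = len(values)
    return mean(values) / (stdev(values) / sqrt(n)), n - 1


# Hypothetical cell means for a single participant.
participant = {
    ("up", "up"): 420, ("down", "down"): 430,
    ("up", "down"): 445, ("down", "up"): 440,
}
print(priming_benefit(participant))

# Hypothetical benefits (ms) for three participants.
print(one_sample_t([10, 20, 30]))
```

This difference score is the same contrast that the Prime Direction × Target Direction interaction evaluates in the ANOVA.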

Given an apparent advantage of upper over lower targets noted by the handling editor and a reviewer, we conducted a 2 (task: detection, localization) × 2 (prime: down, up) × 2 (target: down, up) × 3 (SOAs: 400, 600, 800 ms) ANOVA. Full results appear in Appendix 4. There was indeed a main effect of target location, F(1, 72) = 5.531, p = .021, ηp2 = .071, that also interacted with SOA, F(2, 144) = 3.104, p = .048, ηp2 = .041. The up advantage was not significant for the short and medium SOAs (5 and 4 ms, respectively), but became significant for the long SOA (17 ms, p = .009). We interpret this small advantage of upper over lower targets (434 vs. 443 ms) as reflecting the typical attentional bias in favour of the upper visual field for both overt (saccadic) and covert attention (cf. Fischer et al., 1999; Heywood & Churcher, 1980). Its origin is not well understood (see Previc, 1990, and commentators for discussion).

General discussion

In light of recent conflicting views and results pertaining to the efficiency of conceptually mediated spatial priming with words and other symbols, this study asked a basic question: How does spatial processing depth influence attention shifts to a target location? Two experiments addressed this question by systematically manipulating both top-down and bottom-up determinants of processing depth in a priming paradigm. Our top-down manipulation related to the spatial processing depth of primes and probes: A go rule determined under which condition participants were allowed to respond to the targets. Our bottom-up manipulation related to the time between primes and targets (the SOA).

In Experiment 1 we presented explicit vertical spatial prime words at fixation and found that subsequent target discrimination was better whenever prime meaning and target location were congruent. This result was independent of both manipulations of processing depth, i.e., Task Instruction and SOA. The findings replicate early work of Logan (1995), Hommel et al. (2001), and others on conceptual cueing; they confirm that the meaning of explicitly spatial words is rapidly and obligatorily extracted and then shifts the reader’s spatial attention in the denoted direction.

In Experiment 2 we replaced the explicit with implicit vertical spatial primes and found that the congruency benefit only emerged when participants had been instructed to determine a congruent relationship between the spatial connotation of the prime and the location of the subsequent target. This result reflects top-down control over the process of spatial attention allocation but was independent of SOA, thus signalling no effect of our bottom-up manipulation of processing depth. Thus, in spatial semantic priming, spatial processing depth matters only for implicit spatial primes and when it is cognitively controlled. In other words, explicit semantic analysis is a prerequisite for conceptual cueing.

Why was there no facilitation effect with implicit primes under the two shallow processing instructions of target detection and localization, even though words are generally processed automatically (e.g., Stroop, 1935)? This question helps to identify a key contribution of the present study: Comparing instructions across tasks (see Table 1) identifies semantic processing of the prime and the subsequent explicit computation of a spatial congruency relationship between prime and target as the critical ingredients for obtaining spatial attention shifts with implicitly spatial prime words. Only the go instructions in the semantic processing task enforced this cognitive operation while instructions for detection and localization did not. This result also illustrates the informativeness of our manipulation of processing depth across three distinct levels: By doing so we clarified that implicit spatial congruency between a top-down task set and a bottom-up location is not sufficient to modify attentional selectivity when compared to shallow target detection in the same paradigm. To further corroborate this interpretation of our results, a future study should examine a go instruction according to which only the prime meaning is relevant but not the target location.

We found here only facilitation but no spatial interference effect as reported by Estes and colleagues (2008; Estes & Barsalou, 2018). This outcome likely reflects our incomplete compliance with their recommendations for obtaining interference. Although our study was (1) conducted in a deep orthography and (2) included short SOAs, we did not (3) provide participants with contextual cue words, thereby omitting one of the three recommended ingredients for obtaining an interference effect. Moreover, depth of processing was here not manipulated in terms of orthographic depth but in terms of spatial processing, i.e., the degree of semantic processing of the spatial meaning of the cue.

We conclude our discussion by defending two methodological decisions. First: Why did we not add contextual words, as recommended by Estes and Barsalou (2018)? Here, we focused on processing depth of single spatially connoted object words and effectively asked our participants where these objects are. Our results suggest that spatial associations are then computed in a controlled fashion and become available without (sufficiently powerful) mental simulations that might interfere with the discrimination task. More task instructions should be examined with our new instructional approach to pinpoint the exact degree of spatial processing for implicit spatial words that is needed to shift spatial attention. This will contribute to a better understanding of how spatially associated concepts, such as directional expressions, metaphors and the names of spatially associated objects or symbols shape our cognitive experience.

Second: Why did we present two different targets even though participants performed only a detection task? As indicated in the Introduction, the range of target features is known to contribute to the efficiency of spatial selectivity and we therefore replicated in our design the use of different targets in order to facilitate comparison to previous work. Conversely, and more importantly, using different targets in the present study enables us and other researchers in future work to modify only the task instructions without changing the stimulus set. Preventing this confound will clarify the pure impact of task set on results obtained with this analytical approach and is a methodological strength of our design decision.