The attentional blink is the phenomenon whereby identification or detection of a second target is impaired when it is presented shortly after a first target. The attentional blink was first reported by Raymond et al. (1992; see also Broadbent & Broadbent, 1987; Weichselgartner & Sperling, 1987) using a rapid serial visual presentation (RSVP) task in which a series of letters was presented successively at a central location. Participants had to identify the white letter (i.e., the first target; T1) presented among the black letters (i.e., the distractors) and indicate whether a black “x” (i.e., the second target; T2) was presented in the subsequent letter stream. When T1 was accurately identified, T2 was detected less frequently (i.e., T2|T1 accuracy was lower) when presented shortly after T1 (i.e., 200–500 ms) but not immediately after T1 (i.e., <200 ms) or long after T1 (i.e., >500 ms). Accordingly, the attentional blink moniker is intended to reflect this short-lived lapse in attention produced by the acquisition of the first target’s identity.

A widely adopted theoretical account of the attentional blink posits that T1 processing depletes capacity-limited attentional resources such that they are unavailable for T2 processing (Chun & Potter, 1995; Jolicoeur & Dell’Acqua, 1998; Raymond et al., 1992; for a review, see Dux & Marois, 2009). A recent study by MacLellan et al. (2015; see also MacLellan et al., 2018) highlighted a particularly close link between selective attention resources and the attentional blink. They used a two-target procedure whereby a first target word was presented prior to a second target word following a variable stimulus onset asynchrony (SOA), and participants identified the first and second target words at the end of each trial (for a similar method, see Potter et al., 2005; Potter et al., 2002). To manipulate the attentional resources allocated for T1 processing, selective attention demands were varied such that the first target word was either spatially interleaved with a distractor word (i.e., high selective attention demand; see Fig. 1) or the first target word was presented on its own (i.e., low selective attention demand). The attentional blink was robust when selective attention demands were high but absent when selective attention demands were low. These results are consistent with the view that increasing attentional resources allocated to T1 identification depletes the resources needed to identify T2.

Fig. 1 This is an example of a trial when the cue was congruent with the first target word, which occurred on some of the trials in all experiments. (Color figure online)

Complementary to the notion that increasing the attentional resources allocated to T1 can impair T2 identification, decreasing the attentional resources allocated to T1 can improve T2 identification. A pair of studies supporting this notion was reported by Olivers and Nieuwenhuis (2005, 2006). Both studies used an RSVP task whereby T1 and T2 were numbers presented among a stream of distractor letters, and these numbers were to be identified at the end of each trial. In the study of Olivers and Nieuwenhuis (2005), participants in separate groups were asked to think about an irrelevant topic (e.g., their holiday plans) or to listen to music, while concurrently performing the RSVP task. Both groups showed improved T2 identification compared with a group of participants that simply performed the RSVP task. In the study of Olivers and Nieuwenhuis (2006), a random line pattern was presented at the beginning of each trial of the RSVP task, and participants had to indicate whether it was the same as or different from the line pattern presented at the beginning of the previous trial. In a separate experiment of Olivers and Nieuwenhuis (2006), positive, negative, or neutral emotion-laden images were presented to different groups of participants at the beginning of each trial of the RSVP task. The line pattern task led to improved T2 identification relative to a group of participants that observed the line pattern without making same/different judgments, and T2 identification was improved for the group of participants that observed the positive emotion-laden images relative to the groups of participants that observed the negative and neutral emotion-laden images. Together, the studies of Olivers and Nieuwenhuis indicate that reducing the investment of attentional resources in the RSVP task by having participants perform an irrelevant task concurrently can improve T2 identification.

While Olivers and Nieuwenhuis (2005, 2006) offer a set of methods for reducing the investment of attentional resources allocated to T1, another possible method is to improve the efficiency of the T1 selective attention process. A plausible way to improve the efficiency of selective attention is by cuing color imagery that is congruent with the color of an upcoming target. Several recent studies have demonstrated that cuing color imagery can influence performance on a following cognitive task. One such example is reported by Cochrane et al. (2019). They cued participants to generate color imagery prior to a singleton search task in which participants had to identify the differently colored object (i.e., the target) among homogeneously colored objects (i.e., the distractors). Responses were faster when the imagery cue was congruent with the upcoming target color than when the imagery cue was congruent with the distractor color. Other studies have utilized a neutral cue that had participants generate imagery that was incongruent with both the target and distractor colors; these studies demonstrated that responses were sped when the imagery cue was congruent with the target color, but also slowed when the imagery cue was congruent with the distractor color (Cochrane et al., 2021d; Moriya, 2018). Also, color imagery cues increased the amplitude of the electrophysiological marker known as the N2pc component when congruent with an upcoming target color, indicating that imagery can influence the processes involved in selective attention (Cochrane et al., 2021c). Congruent color imagery cues can also produce faster detection and discrimination responses for color targets presented centrally, as is typical in studies of the attentional blink (Cochrane & Milliken, 2020).

Accordingly, the primary purpose of the present study was to assess whether cuing color imagery can influence selective attention of T1 such that it influences T2 identification. That is, given that it has been demonstrated that cuing color imagery aids selective attention in visual search, the present study assessed whether color imagery could reduce the attentional resources required to identify T1, and thus, increase the available attentional resources to identify T2. To evaluate this issue, a cue was presented prior to the high attentional demand condition of the two-target attentional blink procedure of MacLellan et al. (2015). This procedure had participants identify a first target word presented in red that was spatially interleaved with a distractor word presented in green and then identify a second target word presented in white following a variable SOA (see Fig. 1). The cue preceding the first display signaled participants to generate color imagery that was congruent with the color of the first target word, to generate imagery that was congruent with the color of the distractor word, to generate imagery that was incongruent with the color of both the first target and distractor words (Experiment 2), or to withhold generating imagery (Experiment 3). If cuing color imagery can improve T2 identification by improving the efficiency of selective attention to T1, then T2|T1 accuracy (i.e., T2 accuracy given that T1 was accurately performed) should be highest when the color imagery cue is congruent with the color of the first target word compared with all other cue types. If cuing color imagery can impair T2 identification by decreasing the efficiency of selective attention to T1, then T2|T1 accuracy should be lowest when the imagery cue is congruent with the distractor word compared with all other cue types. 
The present study will provide insight into whether generating visual imagery can reduce the attentional resources necessary for object identification to help overcome bottlenecking of the information processing stream.

Experiment 1

As a first pass, the participants of Experiment 1 were cued to generate color imagery that was either congruent with the color of the first target word or congruent with the color of the distractor word prior to each trial of the two-target attentional blink procedure of MacLellan et al. (2015). It was predicted that when the cue was congruent with the first target word, T1 accuracy would be higher than when the cue was congruent with the distractor word. It was also predicted that when the cue was congruent with the first target word, the attentional blink would be reduced such that T2|T1 accuracy would be higher than when the cue was congruent with the distractor word.

Method

Participants

Twenty McMaster University undergraduates (18 female, ages 18–21 years, M = 18.89 years) participated in exchange for course credit or $10.00 CAD monetary compensation. All participants had normal or corrected-to-normal visual acuity and normal color vision. A sample size of 20 participants was selected a priori.

Apparatus and stimuli

The experiment was conducted using PsychoPy and displayed on a BenQ 24-inch LED monitor with a 60 Hz refresh rate connected to a Mac Mini computer. The target and distractor words were the five-letter medium frequency words “bread,” “chief,” “dream,” “flute,” “grain,” “place,” “right,” and “stick” in uppercase Lucida Console font. The height of each letter of the target and distractor words was approximately 1° of visual angle. The space between each letter was approximately 1° of visual angle. The first target word was displayed in red and was presented in the center of the screen. The distractor word was displayed in green and was presented such that each letter was positioned in the space between the letters of the first target word in a nonoverlapping manner. That is, the first letter of the target word was always the leftmost stimulus in the display. The distractor word was centered approximately 0.1° above or below the center of the screen. The second target word was white and was presented in the center of the screen. The cue was the first two letters of a color (i.e., “re” or “gr”) in uppercase white Lucida Console font, approximately 1° of visual angle in height, and presented in the center of the screen. The fixation cross was displayed in white, approximately 0.5° of visual angle in height and width, and presented in the center of the screen. The visual mask was composed of an overlaid ampersand and uppercase “x” in white Lucida Console font positioned at the same location as each letter of the target words. Each letter of the visual mask was approximately 1° of visual angle in height. Target prompts were the phrases “First word?” and “Second word?” in gray Lucida Console font, approximately 0.3° of visual angle in height, and positioned in the bottom left corner of the screen. All stimuli were presented on a black background.
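The stimulus sizes above are given in degrees of visual angle. As a rough illustration of what these sizes mean on the screen, the following sketch converts visual angle to physical size using the standard trigonometric relation. It assumes the approximately 60 cm viewing distance reported in the Procedure; the function and variable names are illustrative and are not taken from the experiment code.

```python
import math

def visual_angle_to_cm(angle_deg: float, distance_cm: float) -> float:
    """Physical extent (cm) that subtends `angle_deg` at `distance_cm`."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

# Assumption: the approximate viewing distance stated in the Procedure.
VIEWING_DISTANCE_CM = 60.0

letter_height_cm = visual_angle_to_cm(1.0, VIEWING_DISTANCE_CM)    # target letters
fixation_size_cm = visual_angle_to_cm(0.5, VIEWING_DISTANCE_CM)    # fixation cross
distractor_offset_cm = visual_angle_to_cm(0.1, VIEWING_DISTANCE_CM)  # vertical offset

print(f"letter height:     {letter_height_cm:.2f} cm")
print(f"fixation cross:    {fixation_size_cm:.2f} cm")
print(f"distractor offset: {distractor_offset_cm:.2f} cm")
```

Under this assumption, a 1° letter is roughly 1 cm tall on the screen, so the 0.1° vertical offset of the distractor word corresponds to about 1 mm.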

Procedure

Participants were seated approximately 60 cm in front of the computer monitor. Each trial began with the presentation of the cue for 750 ms followed by the fixation cross for 1,500 ms. At this time, participants were to generate color imagery indicated by the cue. Specifically, participants were to remain fixated on the fixation cross and imagine red if the “re” cue was displayed, and to imagine green if the “gr” cue was displayed. Each cue was equally likely and randomized on a trial-by-trial basis. The first target and distractor words were then displayed for 117 ms. The distractor word was displayed either above or below the first target word, and the location of the distractor word was equally likely and randomized on a trial-by-trial basis. The first target word was the word displayed in red and the distractor word was the word displayed in green. A blank display was then presented for either 117, 350, or 583 ms (producing T1–T2 SOAs of 234, 467, or 700 ms; see Footnote 1). Each interval was equally likely and randomized on a trial-by-trial basis. The second target word was then displayed for 100 ms, followed by the visual mask displayed for 100 ms. The first target, distractor, and second target words were three different words randomly selected on a trial-by-trial basis from the list of target and distractor words. The first target prompt was then displayed and participants were to indicate the word that appeared as the first target word. Participants responded using the number pad on a standard QWERTY keyboard with the following number–word correspondences: 1 = bread, 2 = place, 3 = chief, 4 = right, 5 = stick, 6 = dream, 7 = flute, and 8 = grain. Each word was indicated on the corresponding response key. Following a response, the second target prompt was displayed and participants were to indicate the second target word in the same manner as the first target word. A trial example is depicted in Fig. 1.
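The SOA values follow directly from the durations above: each SOA is the 117 ms first-target display plus the blank interval. The sketch below reproduces that arithmetic and also expresses each duration as a number of refresh cycles on the 60 Hz monitor. Frame-based timing is an assumption made here for illustration; the text does not state how durations were controlled in PsychoPy.

```python
REFRESH_HZ = 60                      # monitor refresh rate from the Apparatus section
T1_DURATION_MS = 117                 # first target + distractor display
BLANK_DURATIONS_MS = (117, 350, 583)

def ms_to_frames(duration_ms: float, refresh_hz: int = REFRESH_HZ) -> int:
    """Nearest whole number of refresh cycles for a nominal duration."""
    return round(duration_ms * refresh_hz / 1000)

# SOA = onset-to-onset interval between T1 and T2.
soas_ms = [T1_DURATION_MS + blank for blank in BLANK_DURATIONS_MS]

# Nominal durations of each display in the trial sequence, in frames.
frames = {ms: ms_to_frames(ms) for ms in (750, 1500, 117, 350, 583, 100)}

print("SOAs (ms):", soas_ms)
for ms, n in frames.items():
    print(f"{ms:>5} ms ~ {n} frames")
```

Under this assumption, 117 ms corresponds to 7 frames (about 116.7 ms) and 583 ms to 35 frames (about 583.3 ms), which would explain the otherwise unusual nominal durations.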

The experimental session began with 12 practice trials that were split into two phases of six trials each. During the first phase, participants were instructed to generate color imagery as indicated by the cue while fixating the central fixation cross. The target and distractor words were not displayed during this phase. During the second phase, participants performed trials similar to the experimental trials with the exception that the duration of the first and second target displays was lengthened to 200 ms, and that the duration of the interval between the first and second target displays was lengthened to 1,000 ms. At this time, participants were instructed that the first target word was always the word displayed in red and the distractor word was always the word displayed in green. Following completion of this second practice phase, participants then performed 178 experimental trials split across five blocks (see Footnote 2). Participants were permitted to take a break between each block of trials. Upon completion of the experimental trials, participants were instructed to estimate the percentage of trials on which they generated red (i.e., target congruent) and green (i.e., distractor congruent) color imagery when cued to do so.

Results

The primary dependent variables were first target word accuracy (T1), second target word accuracy (T2), and second target word accuracy given that the first target word was accurately identified (T2|T1). One participant was replaced for consistently reporting, as the first target word, the word whose color matched the color indicated by the cue. Mean T1, T2, and T2|T1 accuracy rates were submitted to separate within-subject analyses of variance (ANOVAs) that treated cue congruency (target/distractor) and SOA (234/467/700) as factors. An alpha criterion of .05 was used to determine statistical significance in these primary analyses, and the alpha criterion of all follow-up t tests was Bonferroni-corrected to account for multiple comparisons (α = .0167). T1 and T2|T1 accuracy rates are depicted in Fig. 2.
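To make the distinction between T2 and T2|T1 accuracy concrete, the following minimal sketch computes the three dependent variables from a list of trials. The `Trial` record and the example data are hypothetical and serve only to illustrate the definitions.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Trial:
    t1_correct: bool  # first target word reported correctly
    t2_correct: bool  # second target word reported correctly

def accuracy_rates(trials: List[Trial]) -> Tuple[float, float, float]:
    """Return (T1, T2, T2|T1) accuracy as proportions."""
    n = len(trials)
    t1_hits = [t for t in trials if t.t1_correct]
    t1 = len(t1_hits) / n
    t2 = sum(t.t2_correct for t in trials) / n
    # T2|T1 conditions on T1 being correct, so only T1-correct trials count.
    t2_given_t1 = (
        sum(t.t2_correct for t in t1_hits) / len(t1_hits)
        if t1_hits else float("nan")
    )
    return t1, t2, t2_given_t1

# Hypothetical data for one participant in one cue x SOA cell.
example = [Trial(True, True), Trial(True, False), Trial(False, True), Trial(True, True)]
t1_acc, t2_acc, t2_given_t1_acc = accuracy_rates(example)
print(t1_acc, t2_acc, round(t2_given_t1_acc, 3))
```

In this example T2 accuracy is 3 of 4 trials, whereas T2|T1 accuracy is 2 of the 3 trials on which T1 was correct, because the trial with an incorrect T1 report is excluded from the conditional measure.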

Fig. 2 T1 and T2|T1 mean percentage accuracy rates for Experiment 1 when the imagery cue was congruent with the color of the first target and distractor words. The error bars reflect the standard error of the mean corrected to remove between-subject variability (Cousineau, 2005; Morey, 2008)
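The error-bar correction cited in the caption can be sketched as follows. This is a generic implementation of the Cousineau (2005) normalization with the Morey (2008) correction factor as those methods are commonly described, not the authors' own analysis code; the function name and input layout are assumptions.

```python
import math
from statistics import mean, stdev
from typing import List

def within_subject_sem(data: List[List[float]]) -> List[float]:
    """Corrected SEM per condition.

    `data` holds one row per participant and one column per
    within-subject condition.
    """
    n_conditions = len(data[0])
    grand_mean = mean([mean(row) for row in data])
    # Cousineau: remove each participant's overall level, keep the grand mean.
    normalized = [[x - mean(row) + grand_mean for x in row] for row in data]
    # Morey: rescale to undo the bias introduced by the normalization.
    correction = math.sqrt(n_conditions / (n_conditions - 1))
    return [
        correction * stdev(column) / math.sqrt(len(column))
        for column in zip(*normalized)
    ]

# Hypothetical scores: participants differ a lot overall, but every
# participant shows the same 10-point condition difference.
consistent = within_subject_sem([[10, 20], [30, 40], [50, 60]])
mixed = within_subject_sem([[1, 3], [2, 2]])
print(consistent, mixed)
```

In the first example the corrected standard errors are zero because all of the variability lies between participants, which is exactly the variability the correction removes.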

T1 accuracy

The analysis of T1 accuracy revealed a significant main effect of cue congruency, F(1, 19) = 8.45, p = .009, ηp² = .31, reflecting higher accuracy rates when the cue was congruent with the target color than when the cue was congruent with the distractor color.

The analysis of T1 accuracy also revealed a significant main effect of SOA, F(2, 38) = 29.42, p < .001, ηp² = .61. This main effect was explored further by conducting paired t tests comparing each level of SOA. There were significant effects in the t tests comparing the 234 and 467 levels, and the 234 and 700 levels, of SOA (all ts ≥ 5.31, p < .001, d ≥ 1.04), reflecting lower accuracy rates when the SOA was 234 ms than when it was 467 ms or 700 ms. There was no effect in the t test comparing the 467 and 700 levels of SOA, t(19) = 0.87, p = .40, d = 0.12. There was no interaction of cue congruency and SOA, F(2, 38) = 1.39, p = .26, ηp² = .07.
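The partial eta squared values reported with each F test can be recovered from the F statistic and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies that identity to the three T1 accuracy tests above as a consistency check; the function name is illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Values taken from the T1 accuracy analysis of Experiment 1.
cue_effect = partial_eta_squared(8.45, 1, 19)     # main effect of cue congruency
soa_effect = partial_eta_squared(29.42, 2, 38)    # main effect of SOA
interaction = partial_eta_squared(1.39, 2, 38)    # cue congruency x SOA

print(round(cue_effect, 2), round(soa_effect, 2), round(interaction, 2))
```

The recomputed values round to the reported effect sizes.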

T2 accuracy

The analysis of T2 accuracy revealed a significant main effect of cue congruency, F(1, 19) = 11.89, p = .003, ηp² = .38, reflecting higher accuracy rates when the cue was congruent with the target color than when the cue was congruent with the distractor color.

The analysis of T2 accuracy also revealed a significant main effect of SOA, F(2, 38) = 73.37, p < .001, ηp² = .79. This main effect was explored further by conducting paired t tests comparing each level of SOA. There were significant effects in all t tests of SOA (all ts ≥ 4.56, p < .001, d ≥ 0.70), reflecting an increase in accuracy with increased SOA.

There was no interaction of cue congruency and SOA, F(2, 38) = 1.87, p = .17, ηp² = .09.

T2|T1 accuracy

The analysis of T2|T1 accuracy revealed a significant main effect of cue congruency, F(1, 19) = 5.78, p = .027, ηp² = .23, reflecting higher accuracy rates when the cue was congruent with the target color than when the cue was congruent with the distractor color.

The analysis of T2|T1 accuracy also revealed a significant main effect of SOA, F(2, 38) = 36.50, p < .001, ηp² = .66. This main effect was explored further by conducting paired t tests comparing each level of SOA. There were significant effects in all t tests of SOA (all ts ≥ 4.48, p < .001, d ≥ 1.16), reflecting an increase in accuracy with increased SOA. There was no interaction of cue congruency and SOA, F(2, 38) = 0.57, p = .57, ηp² = .03.

Cue use estimates

The estimates of the percentage of trials that participants generated color imagery indicated by the cue were submitted to a paired t test that treated cue congruency (target/distractor) as a factor. This analysis revealed a significant effect of cue congruency, t(19) = 2.30, p = .033, d = 0.60, reflecting that participants reported generating color imagery more often when the cue was congruent with the target (M = 71.13%, SD = 23.12%) than the distractor (M = 56.75%, SD = 24.62%) color.

Split-half of cue estimates

The participants were split into high imagery and low imagery groups based on their estimates of the percentage of trials on which they generated imagery. Averaged across the cue congruency conditions, the mean percentage of imagery use was 49.12% for the low group and 78.75% for the high group. T1, T2, and T2|T1 accuracy analyses like those reported above were conducted on these two groups independently. For the high group, the significant and nonsignificant outcomes of the analyses were the same as those reported above. For the low group, the significant and nonsignificant outcomes of the analyses were the same as those reported above except that there was no effect of cue congruency in any of the analyses (all Fs ≤ 2.25, p ≥ .17, ηp² ≤ .20), reflecting similar accuracy rates for T1, T2, and T2|T1 measures when the imagery cue was congruent with the target word and when it was congruent with the distractor word. T1, T2, and T2|T1 accuracy rates were further submitted to mixed-factor ANOVAs that treated imagery cue congruency (target/distractor) and SOA (234/467/700) as within-subjects factors and imagery group (high/low) as a between-subjects factor to evaluate whether the magnitude of the cue congruency finding differed across the high and low imagery groups. Neither the three-way interaction nor the pertinent two-way interaction of cue congruency and imagery group was significant in these analyses (all Fs ≤ 0.77, p ≥ .39, ηp² ≤ .04), indicating that, while the cue congruency effect was only present in the high imagery group, its magnitude did not significantly differ from the nonsignificant pattern of results in the low imagery group.
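One way such a split could be carried out is sketched below. It assumes a median split on each participant's imagery-use estimate averaged across cue conditions; the text does not specify the exact splitting rule, and the participant identifiers and estimates shown are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple

def split_half(estimates: Dict[str, List[float]]) -> Tuple[List[str], List[str]]:
    """Return (low_group, high_group) participant IDs.

    `estimates` maps a participant ID to that participant's self-reported
    imagery-use percentages, one per cue condition.
    """
    ranked = sorted(estimates, key=lambda pid: mean(estimates[pid]))
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]

# Hypothetical estimates (target-congruent cue, distractor-congruent cue).
example_estimates = {
    "p1": [80, 70],
    "p2": [40, 30],
    "p3": [90, 95],
    "p4": [55, 50],
}
low_group, high_group = split_half(example_estimates)
print(low_group, high_group)
```

With 20 participants this rule would yield two groups of 10, which is consistent with the error degrees of freedom implied by the reported effect sizes, though that correspondence is an inference rather than something stated in the text.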

T1 and T2 errors

The error percentages when the distractor word was reported as T1, the T2 word was reported as T1, the distractor word was reported as T2, and the T1 word was reported as T2 were submitted to separate paired t tests that treated cue congruency as a factor. The distractor word was more frequently reported as the T1 word when the cue matched the distractor word than when it matched the target word, t(19) = 4.24, p < .001, d = 0.90. None of the other analyses produced significant effects (all ts ≤ 1.66, p ≥ .11, d ≤ 0.49). These error percentages are depicted in Table 1.

Table 1 The error percentages (%) of Experiment 1 when T1 and T2 were reported as the distractor word, each other, or one of the other possible words
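The error categories in this analysis can be expressed as a simple classification of each response. The sketch below is illustrative only; the function name and category labels are assumptions, and the example words are drawn from the stimulus list.

```python
def classify_response(response: str, target: str, distractor: str, other_target: str) -> str:
    """Classify a report for one target position.

    For a T1 report, `target` is the T1 word and `other_target` is the T2 word;
    for a T2 report, the roles are swapped.
    """
    if response == target:
        return "correct"
    if response == distractor:
        return "distractor"
    if response == other_target:
        return "other target"
    return "other word"

# Hypothetical trial: T1 = "bread" (red), distractor = "chief" (green), T2 = "dream".
t1_report = classify_response("chief", target="bread", distractor="chief", other_target="dream")
t2_report = classify_response("bread", target="dream", distractor="chief", other_target="bread")
unrelated = classify_response("stick", target="bread", distractor="chief", other_target="dream")
print(t1_report, t2_report, unrelated)
```

Here the T1 report is a distractor intrusion and the T2 report is the T1 word, corresponding to two of the error types submitted to the paired t tests.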

Discussion

First, Experiment 1 showed that T1, T2, and T2|T1 accuracy increased as SOA increased. This finding indicates that the pertinent findings of the two-target attentional blink procedure of MacLellan et al. (2015) were reproduced, with the T2|T1 accuracy finding reflecting a successful demonstration of the attentional blink. Second, T1 accuracy was higher when the cue was congruent with the first target word than when the cue was congruent with the distractor word. This finding indicates that cuing color imagery influenced performance on a subsequent task featuring congruent perceptual colors. Third, there was higher T2|T1 accuracy when the cue was congruent with the first target word than when the cue was congruent with the distractor word. This finding indicates that the cue modulated T2 identification, although the basis of this finding remains unclear. Fourth, participants were less likely to report generating color imagery when it was congruent with the distractor word than the first target word.

Experiment 2

A limitation of Experiment 1 was that it could not reveal why T2 identification differed between trials on which the cue was congruent with the first target word and trials on which the cue was congruent with the distractor word. This result could be due either to facilitation of selective attention when the cue was congruent with the target, to interference with selective attention when the cue was congruent with the distractor, or to both facilitation and interference. To evaluate this issue in Experiment 2, the cue could now be congruent with the color of the first target word, congruent with the color of the distractor word, or neutral (i.e., incongruent with the first target and distractor words). If the cue being congruent with the first target word improves T1 and T2|T1 accuracy by virtue of facilitated T1 selective attention, then T1 and T2|T1 accuracy should be higher when the cue is congruent with the first target word than when the cue is neutral. In contrast, if the cue being congruent with the distractor word reduces T1 and T2|T1 accuracy by virtue of interference with T1 selective attention, then T1 and T2|T1 accuracy should be higher when the cue is neutral than when the cue is congruent with the distractor word.

Method

Participants

Twenty McMaster University undergraduates (15 female, ages 18–24 years, M = 19.75 years) participated in exchange for course credit or $10.00 CAD monetary compensation. All participants had normal or corrected-to-normal visual acuity and normal color vision.

Apparatus and stimuli

The apparatus was identical to that used in Experiment 1. The stimuli were identical to those of Experiment 1 except for the addition of the neutral cue, which was the two letters “bl.”

Procedure

The procedure was identical to that of Experiment 1 with the following exceptions. The neutral cue (i.e., “bl”) could be presented instead of the “re” and “gr” cues. Each cue was equally likely and randomized on a trial-by-trial basis. Participants were instructed to imagine blue on trials on which the neutral cue was presented. Participants now performed 270 experimental trials evenly distributed across five blocks. At the end of the experimental session, participants were also instructed to estimate the percentage of neutral-cue trials on which they generated blue imagery.
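A trial list with this structure could be generated as in the following sketch. It assumes that the three cues and three SOAs were fully crossed and balanced within each block (9 cells, 6 repetitions, 54 trials per block); the text states only that each cue and interval was equally likely, so exact balancing, the seed, and all names here are assumptions.

```python
import itertools
import random

WORDS = ["bread", "chief", "dream", "flute", "grain", "place", "right", "stick"]
CUES = ["re", "gr", "bl"]        # red (target), green (distractor), blue (neutral)
SOAS_MS = [234, 467, 700]

def build_block(reps_per_cell: int, rng: random.Random) -> list:
    """One shuffled block with every cue x SOA cell repeated equally often."""
    cells = list(itertools.product(CUES, SOAS_MS)) * reps_per_cell
    rng.shuffle(cells)
    block = []
    for cue, soa_ms in cells:
        # Three different words per trial, as described in the Procedure.
        t1_word, distractor_word, t2_word = rng.sample(WORDS, 3)
        block.append({
            "cue": cue,
            "soa_ms": soa_ms,
            "t1_word": t1_word,
            "distractor_word": distractor_word,
            "t2_word": t2_word,
            "distractor_position": rng.choice(["above", "below"]),
        })
    return block

rng = random.Random(2024)        # hypothetical seed, for reproducibility only
blocks = [build_block(6, rng) for _ in range(5)]
print(sum(len(block) for block in blocks), "trials in", len(blocks), "blocks")
```

If the cues and intervals were instead sampled independently on each trial, the cell counts would vary slightly from block to block rather than being exactly equal.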

Results

The primary dependent variables were the same as in Experiment 1. Mean T1, T2, and T2|T1 accuracy rates were submitted to separate within-subject ANOVAs that treated cue congruency (target/distractor/neutral) and SOA (234/467/700) as factors. An alpha criterion of .05 was used to determine statistical significance in all primary analyses, and the alpha criterion of all follow-up t tests was Bonferroni-corrected to account for multiple comparisons (α = .0167). T1 and T2|T1 accuracy rates are depicted in Fig. 3.

Fig. 3 T1 and T2|T1 mean percentage accuracy rates for Experiment 2 when the imagery cue was congruent with the color of the first target word, congruent with the color of the distractor word, and of a neutral color. The error bars reflect the standard error of the mean corrected to remove between-subject variability (Cousineau, 2005; Morey, 2008)

T1 accuracy

The analysis of T1 accuracy revealed a significant main effect of cue congruency, F(2, 38) = 8.33, p < .001, ηp² = .30. This main effect was explored further by conducting paired t tests comparing each level of cue congruency. There were significant effects in the t tests comparing the target and distractor levels, and the distractor and neutral levels, of cue congruency (all ts ≥ 2.97, p ≤ .010, d ≥ 0.32), reflecting higher accuracy rates when the cue was congruent with the target color or neutral than when the cue was congruent with the distractor color. There was no effect in the t test comparing the target and neutral levels of cue congruency, t(19) = 1.38, p = .18, d = 0.13.

The analysis of T1 accuracy also revealed a significant main effect of SOA, F(2, 38) = 39.47, p < .001, ηp² = .68. This main effect was explored further by conducting paired t tests comparing each level of SOA. There were significant effects in all t tests of SOA (all ts ≥ 3.02, p ≤ .007, d ≥ 0.26), reflecting an increase in accuracy with increased SOA. There was no interaction of cue congruency and SOA in the analysis of T1 accuracy, F(4, 76) = 0.23, p = .92, ηp² = .01.

T2 accuracy

The analysis of T2 accuracy revealed a significant main effect of cue congruency, F(2, 38) = 12.87, p < .001, ηp² = .40. This main effect was explored further by conducting paired t tests comparing each level of cue congruency. There were significant effects in the t tests comparing the target and distractor levels, and the distractor and neutral levels, of cue congruency (all ts ≥ 3.15, p ≤ .005, d ≥ 0.38), reflecting higher accuracy rates when the cue was congruent with the target color or neutral than when the cue was congruent with the distractor color. There was no effect in the t test comparing the target and neutral levels of cue congruency, t(19) = 1.88, p = .076, d = 0.23.

The analysis of T2 accuracy also revealed a significant main effect of SOA, F(2, 38) = 101.22, p < .001, ηp² = .84. This main effect was explored further by conducting paired t tests comparing each level of SOA. There were significant effects in all t tests of SOA (all ts ≥ 4.24, p < .001, d ≥ 0.47), reflecting an increase in accuracy with increased SOA. There was no interaction of cue congruency and SOA, F(4, 76) = 0.96, p = .44, ηp² = .05.

T2|T1 accuracy

The analysis of T2|T1 accuracy revealed a significant main effect of cue congruency, F(2, 38) = 8.81, p < .001, ηp² = .32. This main effect was explored further by conducting paired t tests comparing each level of cue congruency. There was a significant effect in the t test comparing the target and distractor levels of cue congruency, t(19) = 3.89, p < .001, d = 0.73, reflecting higher accuracy rates when the cue was congruent with the target color than when the cue was congruent with the distractor color. There was no effect in the t tests comparing the distractor and neutral levels, and the target and neutral levels, of cue congruency (all ts ≤ 2.08, p ≥ .051, d ≤ 0.35).

The analysis of T2|T1 accuracy also revealed a significant main effect of SOA, F(2, 38) = 75.28, p < .001, ηp² = .80. This main effect was explored further by conducting paired t tests comparing each level of SOA. There were significant effects in all t tests of SOA (all ts ≥ 3.39, p ≤ .003, d ≥ 0.62), reflecting an increase in accuracy with increased SOA. There was no interaction of cue congruency and SOA in the analysis of T2|T1 accuracy, F(4, 76) = 0.87, p = .49, ηp² = .04.

Cue use estimates

The estimates of the percentage of trials on which participants generated the color imagery indicated by the cue were submitted to a within-subjects ANOVA that treated cue congruency (target/distractor/neutral) as a factor. This analysis revealed a significant effect of cue congruency, F(2, 38) = 6.53, p = .003, ηp² = .26. This effect was explored further by conducting paired t tests comparing each level of cue congruency. There was a significant effect in the analysis comparing the target and distractor levels of cue congruency, t(19) = 3.70, p < .001, d = 0.78, reflecting that participants reported generating color imagery more often when the cue was congruent with the target (M = 78.85%, SD = 13.18%) than the distractor (M = 67.73%, SD = 15.19%) color. After correcting for multiple comparisons, there was no effect in the analyses comparing the distractor and neutral levels, and the target and neutral levels, of cue congruency (all ts ≤ 2.40, p ≥ .027, d ≤ 0.55; neutral cue: M = 70.48%, SD = 17.16%).
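The role of the correction here can be illustrated with a short sketch. With three levels of cue congruency there are three pairwise comparisons, so the Bonferroni-corrected criterion is .05 / 3 ≈ .0167, and a comparison with p = .027 falls below the uncorrected criterion but not the corrected one. The function name is illustrative.

```python
from itertools import combinations

def bonferroni_alpha(levels, family_alpha: float = 0.05) -> float:
    """Per-comparison alpha for all pairwise tests among `levels`."""
    n_comparisons = len(list(combinations(levels, 2)))
    return family_alpha / n_comparisons

alpha = bonferroni_alpha(["target", "distractor", "neutral"])

# p = .027 is the smallest p value reported for the two nonsignificant comparisons.
uncorrected_significant = 0.027 < 0.05
corrected_significant = 0.027 < alpha

print(round(alpha, 4), uncorrected_significant, corrected_significant)
```

Any comparison reported as p < .001, such as the target versus distractor comparison, remains significant under the corrected criterion.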

Split-half of cue estimates

The participants were split into high imagery and low imagery groups based on their estimates of the percentage of trials on which they generated imagery. Averaged across the cue congruency conditions, the mean percentage of imagery use was 63.17% for the low group and 81.53% for the high group. T1, T2, and T2|T1 accuracy analyses like those reported above were conducted on these two groups independently. For the high group, the significant and nonsignificant outcomes of the analyses were the same as those reported above. For the low group, the significant and nonsignificant outcomes of the analyses were the same as those reported above except that there was no effect of cue congruency in the T1 and T2|T1 analyses (all Fs ≤ 3.12, p ≥ .069, ηp² ≤ .26), reflecting similar accuracy rates when the imagery cue was congruent with the target word, congruent with the distractor word, and neutral. T1, T2, and T2|T1 accuracy rates were submitted to mixed-factor ANOVAs that treated imagery cue congruency (target/distractor) and SOA (234/467/700) as within-subjects factors and imagery group (high/low) as a between-subjects factor to evaluate whether the magnitude of the cue congruency finding differed across the high and low imagery groups. Neither the three-way interaction nor the pertinent two-way interaction of cue congruency and imagery group was significant in these analyses (all Fs ≤ 1.91, p ≥ .12, ηp² ≤ .10). As in Experiment 1, while the cue congruency effect was only present in the high imagery group, its magnitude did not significantly differ from the nonsignificant pattern of results in the low imagery group.

T1 and T2 errors

The error percentages when the distractor word was reported as T1, the T2 word was reported as T1, the distractor word was reported as T2, and the T1 word was reported as T2 were submitted to separate paired t tests that compared each level of cue congruency. None of these analyses produced significant effects (all ts ≤ 1.88, p ≥ .075, d ≤ 0.56). These error percentages are depicted in Table 2.

Table 2 The error percentages (%) of Experiment 2 when T1 and T2 were reported as the distractor word, each other, or one of the other possible words

Discussion

First, Experiment 2 showed that T1, T2, and T2|T1 accuracy increased as SOA increased. This finding indicates that the pertinent findings of MacLellan et al. (2015) and Experiment 1 were reproduced. Second, T1, T2, and T2|T1 accuracy was higher when the cue was congruent with the first target word than when the cue was congruent with the distractor word. These findings indicate that the pertinent findings of Experiment 1 were reproduced in that cuing color imagery influenced performance on a subsequent task featuring congruent perceptual colors, and that T2 identification was impaired when the cue was congruent with the distractor word relative to when the cue was congruent with the first target word. Third, T1 accuracy was higher when the cue was neutral than when the cue was congruent with the distractor word, and T1 accuracy was equivalent when the cue was neutral and when the cue was congruent with the first target word. This finding indicates that identification of the first target word was impaired when the cue was congruent with the distractor word, and that there was no indication that identification of the first target word was facilitated when the cue was congruent with it. Fourth, T2|T1 accuracy when the cue was neutral was numerically lower than when the cue was congruent with the first target word and numerically higher than when the cue was congruent with the distractor word with neither of the relevant comparisons reaching statistical significance. These findings do not clearly indicate whether T2 identification was facilitated when the cue was congruent with the first target word or impaired when the cue was congruent with the distractor word relative to when neutral imagery was cued. Fifth, participants were less likely to report generating color imagery when it was congruent with the distractor word than the first target word.

Experiment 3

The results of Experiment 2 offered preliminary evidence that cued imagery that is congruent with a following distractor interferes with selective attention, and surprisingly little evidence that imagery congruent with a target facilitates selective attention. To examine this issue further with a converging method, Experiment 3 employed a different neutral condition. Participants now withheld generating imagery when the neutral cue was presented. If cuing color imagery that matches the first target word facilitates selective attention, then T1, T2, and T2|T1 accuracy should be higher when the cue is congruent with the first target word than when participants withhold color imagery in response to the neutral cue. If cuing color imagery that matches the distractor word interferes with selective attention, then T1, T2, and T2|T1 accuracy should be higher when participants withhold color imagery in response to the neutral cue than when the cue is congruent with the distractor word. That is, it could be that color imagery does not produce a facilitatory benefit above and beyond the knowledge of the upcoming target color.

Method

Participants

During data collection, it was determined that some participants failed to follow the task instructions, and accordingly, the sample size was increased to ensure that the data of at least 20 participants were usable. Ultimately, the data from three participants were excluded for failure to follow the task instructions (see the Results section). The final sample included 22 McMaster University undergraduates (15 female, ages 17–23 years, M = 19.27 years) who participated in exchange for course credit. All participants had normal or corrected-to-normal visual acuity and normal color vision.

Apparatus and stimuli

The apparatus was identical to that of the previous experiments. The stimuli were identical to those of the previous experiments except that the neutral cue was now two number signs (i.e., “##”).

Procedure

The procedure was identical to that of Experiment 2 with the following exceptions. Participants were instructed not to generate imagery on trials on which the neutral cue was presented. At the end of the experimental session, participants were instructed to estimate the percentage of trials on which they withheld generating imagery for the neutral cue.

Results

The primary dependent variables were the same as in the previous experiments. Three participants were excluded from analyses for consistently indicating that the first target word was the word congruent with the color indicated by the “re”/“gr” cue. The statistical analyses were identical to those of Experiment 2. T1 and T2|T1 accuracy rates are depicted in Fig. 4.
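As a brief illustration of how the three dependent variables relate to one another, the following minimal sketch computes T1, T2, and T2|T1 accuracy from scored trials. The `Trial` record, the `accuracy_rates` function, and the demo trials are hypothetical and are not taken from the study's analysis scripts; the only assumption carried over from the text is that T2|T1 accuracy is T2 accuracy restricted to trials on which T1 was correctly identified.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """Scoring of one trial (hypothetical record layout)."""
    t1_correct: bool
    t2_correct: bool


def accuracy_rates(trials):
    """Return (T1, T2, T2|T1) accuracy as percentages."""
    n = len(trials)
    t1_hits = [t for t in trials if t.t1_correct]
    t1 = 100.0 * len(t1_hits) / n
    t2 = 100.0 * sum(t.t2_correct for t in trials) / n
    # T2|T1 conditions on trials with a correct T1 report.
    t2_given_t1 = (
        100.0 * sum(t.t2_correct for t in t1_hits) / len(t1_hits)
        if t1_hits
        else float("nan")
    )
    return t1, t2, t2_given_t1


# Made-up trials for illustration only (not data from the experiment).
demo = [
    Trial(True, True),
    Trial(True, False),
    Trial(False, True),
    Trial(True, True),
]
t1, t2, t2_given_t1 = accuracy_rates(demo)
```

In this made-up example, T1 and T2 accuracy are both 75%, whereas T2|T1 accuracy is about 66.7%, because the one trial with an incorrect T1 report is dropped from the conditional measure.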

Fig. 4

T1 and T2|T1 mean percentage accuracy rates for Experiment 3 when the imagery cue was congruent with the color of the first target word, congruent with the color of the distractor word, and neutral (i.e., color imagery was withheld). The error bars reflect the standard error of the mean corrected to remove between-subject variability (Cousineau, 2005; Morey, 2008)
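The corrected error bars can be sketched as follows, assuming the usual two-step procedure: each participant's scores are first normalized by subtracting that participant's mean and adding the grand mean (Cousineau, 2005), and the resulting standard errors are then scaled by the square root of C/(C − 1), where C is the number of within-subject conditions (Morey, 2008). The function name and the data layout are hypothetical, and which cells the authors treated as conditions is not stated here.

```python
import math
from statistics import mean, stdev


def within_subject_sem(data):
    """Cousineau-Morey standard error for each within-subject condition.

    data: list of per-participant lists, one score per condition
    (hypothetical layout: rows are participants, columns are conditions).
    """
    n_subj = len(data)
    n_cond = len(data[0])
    grand = mean(x for row in data for x in row)
    # Cousineau step: remove each participant's overall level.
    normed = [[x - mean(row) + grand for x in row] for row in data]
    # Morey step: correct the bias introduced by the normalization.
    correction = math.sqrt(n_cond / (n_cond - 1))
    return [
        correction * stdev(column) / math.sqrt(n_subj)
        for column in zip(*normed)
    ]


# Participants who differ only by a constant offset yield zero error,
# because all between-subject variability has been removed.
offsets_only = [[10, 20, 30], [20, 30, 40], [30, 40, 50]]
sem_offsets = within_subject_sem(offsets_only)
```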

T1 accuracy

The analysis of T1 accuracy revealed a significant main effect of cue congruency, F(2, 42) = 13.42, p < .001, ηp2 = .39. This main effect was explored further by conducting paired t tests comparing each level of cue congruency. There were significant effects in t tests comparing the target and distractor levels, and distractor and neutral levels of cue congruency (all ts ≥ 3.50, p ≤ .002, d ≥ 0.64), reflecting higher accuracy rates when the cue was congruent with the target or neutral than when the cue was congruent with the distractor color. There was no effect in the t test comparing the target and neutral levels of cue congruency, t(21) = 1.30, p = .21, d = 0.19.

The analysis of T1 accuracy also revealed a significant main effect of SOA, F(2, 42) = 41.83, p < .001, ηp2 = .67. This main effect was explored further by conducting paired t tests comparing each level of SOA. There were significant effects in all t tests of SOA (all ts ≥ 3.99, p < .001, d ≥ 0.36), reflecting an increase in accuracy with increased SOA. There was no interaction of cue congruency and SOA in the analysis of T1 accuracy, F(4, 84) = 1.07, p = .37, ηp2 = .05.
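As a consistency check on the reported effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom with the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The sketch below applies this identity to the three T1 accuracy effects reported above; the function name is hypothetical, and whether the authors obtained their values this way or directly from the sums of squares is not stated.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error) as reported for the T1 accuracy analysis.
reported = {
    "cue congruency": (13.42, 2, 42),
    "SOA": (41.83, 2, 42),
    "cue congruency x SOA": (1.07, 4, 84),
}
recomputed = {
    name: round(partial_eta_squared(*stats), 2)
    for name, stats in reported.items()
}
# recomputed -> {'cue congruency': 0.39, 'SOA': 0.67, 'cue congruency x SOA': 0.05}
```

The recomputed values agree with the reported effect sizes of .39, .67, and .05.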

T2 accuracy

The analysis of T2 accuracy revealed a significant main effect of cue congruency, F(2, 42) = 17.59, p < .001, ηp2 = .46. This main effect was explored further by conducting paired t tests comparing each level of cue congruency. There were significant effects in t tests comparing the target and distractor levels, and distractor and neutral levels of cue congruency (all ts ≥ 4.03, p < .001, d ≥ 0.78), reflecting higher accuracy rates when the cue was congruent with the target or neutral than when the cue was congruent with the distractor color. There was no effect in the t test comparing the target and neutral levels of cue congruency, t(19) = 0.64, p = .53, d = 0.07.

The analysis of T2 accuracy also revealed a significant main effect of SOA, F(2, 42) = 76.90, p < .001, ηp2 = .79. This main effect was explored further by conducting paired t tests comparing each level of SOA. There were significant effects in all t tests of SOA (all ts ≥ 4.14, p < .001, d ≥ 0.50) reflecting an increase in accuracy with increased SOA. There was no interaction of cue congruency and SOA, F(4, 84) = 0.60, p = .67, ηp2 = .03.

T2|T1 accuracy

The analysis of T2|T1 accuracy revealed a significant main effect of cue congruency, F(2, 42) = 11.35, p < .001, ηp2 = .35. This main effect was explored further by conducting paired t tests comparing each level of cue congruency. There were significant effects in the t tests comparing the target and distractor levels and distractor and neutral levels of cue congruency (all ts ≥ 3.47, p ≤ .002, d ≥ 0.63), reflecting higher accuracy rates when the cue was congruent with the target or neutral than when the cue was congruent with the distractor color. There was no effect in the analysis comparing the target and neutral levels of cue congruency, t(21) = 0.76, p = .46, d = 0.09.

The analysis of T2|T1 accuracy also revealed a significant main effect of SOA, F(2, 42) = 37.50, p < .001, ηp2 = .64. This main effect was explored further by conducting paired t tests comparing each level of SOA. There were significant effects in all t tests of SOA (all ts ≥ 2.70, p < .013, d ≥ 0.55), reflecting an increase in accuracy with increased SOA. There was no interaction of cue congruency and SOA in the analysis of T2|T1 accuracy, F(4, 84) = 1.21, p = .31, ηp2 = .05.

Cue use estimates

The estimates of the percentage of trials on which participants generated the color imagery indicated by the imagery cues and withheld imagery for the neutral cue were submitted to a within-subject ANOVA that treated cue congruency (target/distractor/neutral) as a factor. There was no effect of cue congruency in this analysis, F(2, 42) = 2.79, p = .073, ηp2 = .12. Given a priori interests, paired t tests comparing each level of cue congruency were conducted. There was a significant effect in the analysis comparing the target and distractor levels of cue congruency, t(21) = 3.09, p = .006, d = 0.40, reflecting that participants reported generating color imagery more often when the cue was congruent with the target (M = 76.36%, SD = 21.50%) than the distractor (M = 68.50%, SD = 18.07%) color. After correcting for multiple comparisons, there was no effect in the analyses comparing the distractor and neutral levels, and the target and neutral levels, of cue congruency (all ts ≤ 2.11, p ≥ .047, d ≤ 0.42; neutral cue: M = 78.64%, SD = 28.92%).
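The correction procedure is not named in the text. As one possible reading, the sketch below assumes a Bonferroni correction across the three pairwise comparisons, which gives a per-comparison threshold of .05/3 ≈ .0167. Under that assumption, the reported p = .006 survives correction, whereas the smallest of the remaining p values (.047) does not, which matches the pattern described above. The function name and the dictionary labels are hypothetical.

```python
def bonferroni_decisions(p_values, family_size, alpha=0.05):
    """Flag which p values fall below a Bonferroni-adjusted threshold.

    Assumption: the correction is Bonferroni; the text does not say so.
    """
    threshold = alpha / family_size
    return {label: p < threshold for label, p in p_values.items()}


# Only the p values actually reported in the text are used here.
reported_p = {
    "target vs. distractor": 0.006,
    "smallest remaining comparison": 0.047,
}
decisions = bonferroni_decisions(reported_p, family_size=3)
```

Other procedures (e.g., Holm) would lead to the same decisions for these two values, so the sketch should not be read as identifying the method that was actually used.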

Split-half of cue estimates

The participants were split into high imagery and low imagery groups based on their estimates of the percentage of trials on which they generated imagery. Averaged across the target and distractor congruency conditions, the mean percentage of imagery use was 59.09% for the low group and 85.77% for the high group. T1, T2, and T2|T1 accuracy analyses like those reported above were conducted on these two groups independently. For the high group, the significant and nonsignificant outcomes of the analyses were the same as those reported above. For the low group, the significant and nonsignificant outcomes of the analyses were the same as those reported above except that there was no effect of cue congruency in the T1 and T2|T1 analyses (all Fs ≤ 3.02, p ≥ .071, ηp2 ≤ .23), reflecting similar accuracy rates when the imagery cue was congruent with the target word, congruent with the distractor word, and neutral. T1, T2, and T2|T1 accuracy rates were submitted to mixed-factor ANOVAs that treated imagery cue congruency (target/distractor) and SOA (234/467/700) as within-subject factors and imagery group (high/low) as a between-subjects factor to evaluate whether the magnitude of the cue congruency finding differed across the high and low imagery groups. Neither the three-way interaction nor the pertinent two-way interaction of cue congruency and imagery group was significant in these analyses (all Fs ≤ 1.77, p ≥ .14, ηp2 ≤ .08). As in the previous experiments, while the cue congruency effect was only present in the high imagery group, its magnitude did not significantly differ from the nonsignificant pattern of results in the low imagery group.
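The grouping step can be sketched as a simple rank-based split, assuming that each participant contributes one imagery-use estimate (the mean of the target and distractor congruency estimates) and that the lower-ranked half forms the low imagery group. The function name, the participant labels, and the estimates below are hypothetical, and how tied estimates were assigned is not stated in the text.

```python
def split_half(estimates):
    """Split participants into low and high imagery groups by rank.

    estimates: dict mapping a participant label to an imagery-use
    estimate in percent (hypothetical layout).
    Ties are broken by sort order here, which is an assumption.
    """
    ranked = sorted(estimates, key=estimates.get)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]


# Made-up estimates for illustration only.
demo_estimates = {"p01": 90.0, "p02": 55.0, "p03": 70.0, "p04": 85.0}
low_group, high_group = split_half(demo_estimates)
# low_group -> ['p02', 'p03'], high_group -> ['p04', 'p01']
```

With the 22 participants of Experiment 3, such a split would place 11 participants in each group.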

T1 and T2 errors

The error percentages when the distractor word was reported as T1, the T2 word was reported as T1, the distractor word was reported as T2, and the T1 word was reported as T2 were submitted to separate paired t tests that compared each level of cue congruency. It was revealed that the distractor word was more frequently reported as T2 when the cue was congruent with the target word color, t(21) = 3.09, p = .006, d = 0.73. All other analyses did not produce significant effects (all ts ≤ 1.94, p ≥ .065, d ≤ 0.39). These error percentages are depicted in Table 3.

Table 3 The error percentages (%) of Experiment 3 when T1 and T2 were reported as the distractor word, each other, or one of the other possible words

Discussion

First, Experiment 3 showed that T1, T2, and T2|T1 accuracy increased as SOA increased. This finding indicates that the pertinent findings of MacLellan et al. (2015) and the previous two experiments were reproduced. Second, T1, T2, and T2|T1 accuracy were higher when the cue was congruent with the first target word than when the cue was congruent with the distractor word. These findings indicate that the pertinent findings of the previous two experiments were reproduced in that cuing color imagery influenced performance on a subsequent task featuring the congruent perceptual colors, and that T2 identification was impaired when the cue was congruent with the distractor word relative to when the cue was congruent with the first target word. Third, T1 accuracy was higher when the cue was neutral than when the cue was congruent with the distractor word, and T1 accuracy was equivalent when the cue was neutral and when the cue was congruent with the first target word. This finding indicates that the pertinent finding of Experiment 2 was reproduced in that identification of the first target word was impaired when the cue was congruent with the distractor word, and there was no indication that identification of the first target word was facilitated when the cue was congruent with it. Fourth, T2|T1 accuracy was higher when the imagery cue required the withholding of imagery than when the imagery cue was congruent with the distractor word, and T2|T1 accuracy was equivalent when the imagery cue required the withholding of imagery and when the imagery cue was congruent with the first target word. These findings indicate that cuing color imagery interfered with T2 identification when congruent with the distractor word, and there was no indication that cuing color imagery facilitated T2 identification when congruent with the first target word. 
Fifth, participants were less likely to report generating color imagery when it was congruent with the distractor word than the first target word.

General discussion

The primary purpose of the present study was to evaluate whether cued imagery could influence T2 identification. To do so, a cue was presented prior to the two-target attentional blink procedure of MacLellan et al. (2015). Importantly, the cue could be congruent with the color of the first target word, the distractor word, or neutral (i.e., a cue to generate imagery that was incongruent with the first target and distractor words in Experiment 2, and a cue to withhold imagery in Experiment 3). First, all experiments showed that T1, T2, and T2|T1 accuracy increased as SOA increased. Second, all experiments showed that T1, T2, and T2|T1 accuracy were higher when the cue was congruent with the first target word than when the cue was congruent with the distractor word. Third, Experiments 2 and 3 showed that T1 accuracy was higher when the cue was neutral than when the cue was congruent with the distractor word, and T1 accuracy did not differ when the cue was neutral and when the cue was congruent with the first target word. Fourth, Experiment 3 showed that T2|T1 accuracy was higher when the imagery cue required the withholding of imagery than when the imagery cue was congruent with the distractor word, and T2|T1 accuracy did not differ when the imagery cue required withholding imagery and when the imagery cue was congruent with the first target word. Fifth, all experiments showed that participants were less likely to report generating color imagery when it was congruent with the distractor word than with the first target word. Overall, the present study provides evidence that there was an increase in attentional resources required to identify T1 when cued color imagery was congruent with the distractor, such that T1 and T2 identification were impaired.
At the same time, there was little evidence that cuing color imagery that was congruent with the first target could facilitate selective attention and identification of T1, as neither T1 nor T2 identification benefited from this cuing manipulation.

It is worth considering how the present study relates to theories concerning the attentional blink. As stated at the outset of the manuscript, a popular theory is that the attentional blink reflects the unavailability of attentional resources to T2 processing when independent T1 processing is ongoing (Chun & Potter, 1995; Jolicoeur & Dell’Acqua, 1998; Raymond et al., 1992). Another popular theory is that the attentional blink reflects a temporary loss of cognitive control (Di Lollo et al., 2005; see also Dehaene & Naccache, 2001). According to this theory, the attentional blink is produced when the input filters of T1 and T2 differ, leading to T2 items being inefficiently processed such that they are vulnerable to backward masking. With respect to the present study, it seems that the cognitive control theory is an insufficient explanation. For one, this theory indicates that the effect is due to processing that occurs at T2, and given that the cuing manipulation directly targets T1 processing in the present study, it does not seem tenable that this would affect the input filter responsible for T2. Regarding the two-target attentional blink procedure generally, it does not appear that an input filter process used to detect the presence of T1 and T2 among a stream of distractors could be a viable explanation. That is, while selective attention at T1 may be viewed as a form of “filtering,” it is also phenomenologically distinct as it functions to localize the target, not to detect its presence. While we do not question whether the cognitive control theory is viable, we believe that the present findings are most parsimoniously explained by a T1 resource depletion interpretation, complementing research exploring this issue (Lagroix et al., 2012; Tang et al., 2020).

In addition, the present study provides insight into the cognitive resources that were depleted at T1. It has been proposed that a variety of different cognitive processes can deplete resources to produce the attentional blink. There is evidence that the resources depleted can be sourced from working memory (Akyürek et al., 2007; Colzato et al., 2007; Ouimet & Jolicoeur, 2007), episodic memory (Chun & Potter, 1995; Wyble et al., 2009), and response selection (Jolicoeur, 1998, 1999) functions (for a review, see Dux & Marois, 2009). It is further proposed that another type of cognitive resource that can be depleted to produce the attentional blink is that concerning selective attention functions (Hommel et al., 2006; Marois & Ivanoff, 2005). It is our position that the findings of the present study support this latter proposal: that the depletion of selective attention resources can affect T2 processing. That is, it is our interpretation that spatially interleaving the target and distractor words increases the selective attention resources required to identify T1. Indeed, previous work using this experimental procedure has demonstrated that alleviating this selective attention demand (by presenting the target word without the distractor word) eliminates the attentional blink (MacLellan et al., 2015, 2018). We suspect that the imagery cue served a similar function: when imagery was generated in the color of the distractor word, attention was initially pulled to the distractor word before shifting to the target word, thus increasing the selective attention resources required to perform the task.

It is then an open question why only imagery congruent with the distractor interfered with T2 identification, whereas cuing imagery congruent with the target did not facilitate T2 identification. One possible reason owes to the first target word always being presented in red, meaning that the color of the first target word was fully predictable. Accordingly, cuing imagery that was congruent with this red target word may not have improved participants’ preparation to attend selectively to red beyond the usual state of preparation one has when preparing to attend selectively to a red target. By this view, although participants may have withheld generating imagery when cued to do so in the neutral cuing condition, they may have continued to prepare to identify the first target word based on its color. In turn, imagery that was congruent with the target color may not have had a meaningful influence on the selective attention and identification of T1 beyond ensuring that participants were prepared for the target color red. This interpretation can also explain why imagery influenced the attentional blink when congruent with the distractor word, as it would have produced a counterproductive preparatory state by having participants prepare for the wrong color. In summary, it seems that generating imagery that was congruent with the first target word may have been superfluous since an effective preparatory state was established based on the task demands in a seemingly automatic manner; however, this effective preparatory state could be thwarted by imagery congruent with the distractor word.

As noted at the outset of the article, we hypothesized that imagery ought to reduce the attentional resources required for selective attention when congruent with the target; however, it seems we were incorrect. It is worth considering why imagery congruency benefits have been observed elsewhere, but not in our attentional blink procedure. Color imagery when used as an attentional template has a profound influence on attentional guidance during visual search (Cochrane et al., 2018a, b, 2019, 2021a, b, c, d; Moriya, 2018). That is, when color imagery is congruent with a perceptual target in visual search, attention is directed to the location of the perceptual color. We suspect that these processes were only minimally influential in the present study since the target was always presented in the same approximate location and, as noted above, since generating color imagery and preparing for a perceptual color target were not sufficiently different. Color imagery can produce congruency benefits when perceptual targets are consistently presented at the same location (Cochrane & Milliken, 2019, 2020; Wantz et al., 2015). It has been demonstrated, however, that these congruency benefits are mostly due to response biases when imagery and the basis of the response are not orthogonal (Cochrane et al., 2019), and accordingly, they cannot explain the present results. There is some evidence that imagery can aid in the perceptual processing of color, but this influence appears to be weak (Cochrane et al., 2019; Cochrane & Milliken, 2020). An important insight comes from the recent findings of Cochrane, Wang, Pratt, et al. (2021d), who evaluated eye movements when color imagery was generated prior to a visual search task.
While it was revealed that imagery had a profound influence on search, there were substantial disengagement costs, such that it took participants more time to initiate the first eye movement when they generated color imagery relative to a group of participants who were shown perceptual colored templates instead. That is, it seems that cognitive resources were required to disengage from imagery to perform the search task, which were not required when imagery was not generated. Further, it is typically observed that performance is slower when participants generate imagery prior to the task than when they do not generate imagery (Cochrane & Milliken, 2019; Cochrane et al., 2018a, 2021b, d). Overall, we suspect that generating imagery depletes cognitive resources to a greater degree than not generating it, and if T2 performance is impaired by cognitive resources used at T1, this is a possible explanation as to why imagery that is congruent with the target did not produce a benefit here but can produce a benefit in tasks where resource depletion is inconsequential to performance.

The estimates of the frequency with which participants reported implementing the instruction indicated by the cue merit further discussion. Across all experiments, imagery was generated less frequently when it was congruent with the distractor (64.33%) than the target (75.45%). There are several possible explanations for this finding. One possibility is that, because the target was always presented in red and the distractor in green, red imagery was easier to generate than green imagery. Another possibility is that when the target is always presented in the same color, it aids generating congruent imagery; in other words, attention to a perceptual colored target may prime congruent color imagery. Another possibility is that the distractor congruent condition was more difficult than the target congruent condition, and this task difficulty difference biased the estimates of imagery. That is, it could be that imagery was generated equally for the target and distractor congruent conditions, but participants assumed they generated imagery less frequently when congruent with the distractor because it was the more difficult of the two conditions. While there are several possibilities, we suspect that the reason for this finding was that participants were aware that generating imagery that was congruent with the distractor word hindered identification of the first target word, and, therefore, they chose not to generate it. That is, we have observed in our previous work that when imagery does not aid task performance, participants are less likely to generate imagery (Cochrane et al., 2018a, 2021a). Further, these imagery effects appear to be phenomenologically perceptible, as it is often reported by participants that their attention was pulled to the perceptual representation congruent with imagery.
In summary, we suspect that participants had conscious access to the influence imagery had on the task and they chose to generate imagery in accordance with whether it hindered performance or not.

On a related note, it is worth discussing the findings of the split-half of cue estimates analyses. Here, participants were grouped based on their percentage estimates of imagery use such that the 50% of participants with the highest estimates constituted the high imagery group, and the 50% of participants with the lowest estimates constituted the low imagery group. In all experiments, the cue congruency effect was significant for the high imagery group, and it was absent for the low imagery group, suggesting that imagery only affected performance for the participants who frequently reported generating it. In further analyses, however, it was revealed that the magnitude of the imagery congruency effect did not differ across the imagery groups. One possible reason for this latter finding is that the study was insufficiently powered to adequately assess this issue. Another is that our split-half method is not a particularly robust method for assessing this issue since it relies on a one-time estimate that depends on the participant accurately assessing and reporting their performance across the experimental session. While we caution the reader against drawing strong conclusions from these analyses, if we are for the moment to presume that there was a difference between high and low imaginers, it is worth pondering what this difference might be. One possibility is that these differences were due to differences in dispositional imagery ability. While there appear to be dispositional differences in imagery ability (Kosslyn et al., 1984), we suspect that this is not the reason for this finding. In unpublished work conducted in the lab, we have assessed whether dispositional differences in imagery ability may contribute to the magnitude of simple color imagery effects like those of the present study.
While there were dispositional differences in the ability to generate complex imagery (as measured by the VVIQ-2; Marks, 1995), participants consistently reported being able to generate high-quality simple color imagery (i.e., imagine the redness of a ripe tomato). That is, 29 of our 32 participants gave a rating of 5 on our 5-point Likert scale, which indicated “imagery that was clear and vivid almost like perception,” with the other three participants giving ratings of 4. Further, it seemed that all participants of the present study had the capacity for imagery as there were no participants who reported never generating it. While it is possible that dispositional imagery ability played a role in the ease with which imagery was generated, we suspect that the participants’ willingness to implement the imagery instruction was the primary reason for the difference across the high and low imaginers.

In summary, the present study revealed that cuing imagery that was congruent with the color of a distractor prior to each trial of the two-target attentional blink procedure increased the selective attention demands necessary for T1 identification, such that T2 identification was impaired. Further, cuing imagery that was congruent with the color of the first target did not decrease the selective attention demands necessary for T1 identification, such that T2 identification accuracy was statistically equivalent to when imagery was withheld.