Introduction

When a viewer is required to analyze and encode complex, rapidly changing sensory inputs, two attentional states help to achieve better performance. One is being generally alert, trying to process as much of the information as possible. The other is searching for specific targets while ignoring or suppressing irrelevant distractors. Both these states of attention are important and therefore have been studied intensively. However, less is known about how these two states interact when viewers are searching for targets while trying to remember all stimuli. Here, we investigate how target detection affects memory for nontarget items in a rapid serial visual presentation (RSVP).

When viewers search for targets in an RSVP stream, the first of two targets may be easy to spot but the second may be missed if it arrives within about 500 ms of the first one: an attentional blink (e.g., Raymond, Shapiro, & Arnell, 1992). Interpretation of this phenomenon has been heavily debated (e.g., Bowman & Wyble, 2007; Chun & Potter, 1995; Di Lollo, Kawahara, Ghorashi, & Enns, 2005; Nieuwenhuis, Gilzenrat, Holmes, & Cohen, 2005; Olivers, 2007; Shapiro, Arnell, & Raymond, 1997; Vogel & Luck, 2002; Wyble, Bowman, & Nieuwenstein, 2009; see a review by Dux & Marois, 2009). Most models agree, however, that attentional selection facilitates processing of the first target at a cost to subsequent targets, with the exception of a target that follows the first target at a stimulus onset asynchrony (SOA) of 150 ms or less (lag 1). At lag 1, the second target benefits from the attention generated by detection of the first target, allowing both targets to be processed together.

The attentional blink phenomenon is intriguing because it may reflect a general selective mechanism of attention that applies to simultaneous spatial search as well as temporal search. In the biased competition model of selective attention in space (Desimone & Duncan, 1995), attention enhances the target and suppresses nearby nontargets. Similarly, in the normalization model of attention (Reynolds & Heeger, 2009), attention reduces the processing of nontargets, especially when the attention field is narrow. These models have been very successful in interpreting neurophysiological and psychophysical data in spatial attention studies. Could a similar model, extended to temporal rather than spatial proximity, account for the attentional blink of a second target? That is, does detection of the first target suppress processing of subsequent items, both nontargets and targets? As we will see, however, few studies have looked at processing of nontarget items.

A related question is whether simply remembering rapidly presented items produces an attentional blink. Nieuwenstein and Potter (2006; see also Potter, Nieuwenstein, & Strohminger, 2008) compared memory encoding of items in RSVP when subjects were asked to report all of them (whole report) or to selectively report a subset of them (partial report, the standard task that produces an attentional blink). In the partial-report condition, there was an attentional blink for the second target. Interestingly, recall of the corresponding item in the whole-report condition was significantly more accurate than report of the second target, even though subjects needed to remember more items in whole report. These results suggest that an attentional blink does not occur when subjects are instructed to pay equal attention to all items in RSVP. This is consistent with the normalization model of attention (Reynolds & Heeger, 2009), were that model extended to temporal proximity: the attention span across time is large in whole-report conditions, and therefore there is no suppression.

The question addressed here is whether target detection interferes with encoding of nontarget items that the participant is trying to remember. We employed a novel dual-task procedure in which participants searched a sequence of words for a target specified by semantic category (e.g., “a four-footed animal”) and were then tested for recognition of nontarget words. Our hypothesis was that detection of a target would impair memory for subsequent nontarget words, and perhaps for the immediately preceding nontarget word, relative to trials on which no target was presented.

The effect of target detection on the processing of distractors has previously been studied using indirect methods such as priming from a distractor to a target or subsequent probe word. Using words as stimuli, Maki, Frigen, and Paulson (1997) found that a distractor that was semantically associated with a subsequent target increased the probability of reporting that target, even when the distractor prime appeared at an SOA that typically produces an attentional blink. That is, detection of the first target did not appear to interfere with semantic processing of following distractors. Loach and Mari-Beffa (2003), in contrast, found that the reaction time to report a probe letter at the end of an RSVP trial was longer when the same letter had appeared as a distractor at lag 1 or 3 after a red target letter in the trial, suggesting that the proximity of the distractor to the target caused the distractor to be inhibited, producing negative priming of the subsequent probe. Their design, however, was complicated by the fact that targets, critical distractors, and probes were limited to four letters, X, H, S, O, any of which could appear as a target, critical distractor, or probe; the remaining distractors in the sequence were other letters that were never targets or probes. In the present experiments, the nontarget stimuli were distractors in relation to the search task, but were to be attended and remembered to the extent possible, while searching for a target word in the specified category. All the words were new on every trial, so no target ever appeared as a nontarget.

Experiment 1

Method

Participants

The 18 participants were from the Massachusetts Institute of Technology community and received payment. All participants reported English as their first language and were naïve to the purpose of the experiment. They all reported normal or corrected-to-normal visual acuity.

Apparatus and procedure

The experiment combined a target detection task and a recognition memory task for nontargets. Each trial began with a written description of the target category (e.g., “a four-footed animal”) followed by a sequence of 8 nouns. The stimuli were presented in the center of a 17-inch (approx. 43.2-cm) CRT monitor (refresh rate = 75 Hz, resolution = 1,024 × 768 pixels). Following the sequence, the participant pressed a key labeled “YES” or “NO” on the keyboard to indicate whether he/she had seen a word in that target category. In a random half of the trials there was a target word, presented at serial position 3, 4, 5, or 6 (i.e., it was never one of the first two or last two words in the sequence). In the other half of the trials, a nontarget noun belonging to another category was presented in place of the target noun. Whether a given trial included a target or not was counterbalanced between participants. The category name, the target nouns, and the replacing nontarget nouns were chosen from the category norms of Battig and Montague (1969). All other words were nouns chosen from the Penn Treebank corpus (Marcus, Santorini, & Marcinkiewicz, 1993) with a frequency higher than 1 per million. The word length was 4 letters in half the trials and 5 letters in the other half. No word was repeated in the experiment, except in the recognition test.

In addition to the detection task, participants were also instructed to try to remember the other words in the sequence. After a fixed 2-s interval (Footnote 1) given to make the detection task response, two words were presented side by side. One of them had just been presented among the sequence of 8 nouns, either immediately before or immediately after the detection target (or in the target-absent condition, at the corresponding serial positions). The other word was a new noun that had not been presented. Participants were instructed to press the key labeled “Left” or “Right” to indicate which one was the old word that he/she remembered seeing in the sequence. To control for possible interactions between serial position and memory performance, the tested word was always one that was presented in the middle of the sequence (i.e., position 4 or 5). That is, when the detection target position was 3 or 4, the tested word was the one presented immediately after the detection target position (i.e., 4 or 5); when the detection target position was 5 or 6, the tested word was the one presented immediately before the detection target position (again, 4 or 5).
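The rule for choosing the word probed in the first memory test can be written out as a small function. This is only an illustrative sketch of the mapping described above, not the original experiment code (which was written in MATLAB); the function name `first_test_position` is hypothetical.

```python
def first_test_position(target_position: int) -> int:
    """Return the serial position (1-8) of the word probed in the first memory test.

    Hypothetical helper: it restates the rule in the text, not the authors' code.
    """
    if target_position not in (3, 4, 5, 6):
        raise ValueError("target (or control) position must be 3, 4, 5, or 6")
    # Early targets (positions 3, 4): probe the word immediately after.
    # Late targets (positions 5, 6): probe the word immediately before.
    return target_position + 1 if target_position <= 4 else target_position - 1


mapping = {t: first_test_position(t) for t in (3, 4, 5, 6)}
print(mapping)  # {3: 4, 4: 5, 5: 4, 6: 5}
```

Under this rule the probed word always sits at serial position 4 or 5, so the before/after comparison is not confounded with position in the sequence.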

To avoid the possibility that participants might build up strategies such as concentrating particularly on items immediately before and after the detection target, we added a second memory test after the first one. The testing procedure was the same as in the first memory test, except that the tested word was randomly chosen from the rest of the sequence, excluding the first and the last word.

Stimulus presentation was controlled using MATLAB and the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997) driven by an Apple Macintosh G4 computer. All stimuli were black characters presented on a white background in Courier font (size 14). In each trial, the category name that designated the target was presented for 1.2 s, followed by a fixation cross for 500 ms and a blank of 200 ms, and then the RSVP sequence. Three presentation durations were used (SOA = 120, 240, and 360 ms/item) to cover a range in which target detection was below ceiling and nontarget recognition was above chance. Altogether, there were 96 trials and 8 practice trials. Between subjects, a given trial was presented equally often in the no-target and target conditions, and equally often with each of the three SOAs. Participants were allowed to rest between trials at any point.
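As a quick arithmetic check, each of the three SOAs corresponds to a whole number of refresh frames on a 75-Hz display. The sketch below assumes that each item was shown for an integer number of frames; the text does not state how durations were implemented, and the function name `soa_to_frames` is hypothetical.

```python
REFRESH_HZ = 75  # monitor refresh rate reported in the text


def soa_to_frames(soa_ms: int, refresh_hz: int = REFRESH_HZ) -> int:
    """Number of whole refresh frames in one SOA; raises if it is not a whole number."""
    frames, remainder = divmod(soa_ms * refresh_hz, 1000)
    if remainder:
        raise ValueError(
            f"{soa_ms} ms is not a whole number of frames at {refresh_hz} Hz"
        )
    return frames


frames_per_item = {soa: soa_to_frames(soa) for soa in (120, 240, 360)}
print(frames_per_item)  # {120: 9, 240: 18, 360: 27}
```

One frame at 75 Hz lasts about 13.3 ms, so 120, 240, and 360 ms correspond to 9, 18, and 27 frames.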

Results

In the detection task, overall accuracy was high at all rates of presentation: .80 (.67 correct Yeses and .080 false Yeses) at an SOA of 120 ms, .86 (.79 correct Yeses and .073 false Yeses) at 240 ms, and .88 (.82 correct Yeses and .062 false Yeses) at 360 ms. The effect of SOA was statistically significant, F(2, 51) = 4.46, p < .02.
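The overall accuracies above agree with the reported hit and false-alarm rates if overall accuracy is taken as the mean of the hit rate and the correct-rejection rate. That formula is an assumption on our part, justified only by the equal numbers of target-present and target-absent trials; the text does not state how overall accuracy was computed. Because the reported rates are rounded, agreement is expected only to within rounding error.

```python
def overall_accuracy(hit_rate: float, false_alarm_rate: float) -> float:
    """Mean of hit rate and correct-rejection rate.

    Assumes equal numbers of target-present and target-absent trials.
    """
    return (hit_rate + (1.0 - false_alarm_rate)) / 2.0


# SOA (ms) -> (correct Yeses, false Yeses, reported overall accuracy)
reported = {
    120: (0.67, 0.080, 0.80),
    240: (0.79, 0.073, 0.86),
    360: (0.82, 0.062, 0.88),
}

computed = {
    soa: overall_accuracy(hits, false_alarms)
    for soa, (hits, false_alarms, _) in reported.items()
}
for soa, value in computed.items():
    print(f"SOA {soa} ms: computed {value:.3f}, reported {reported[soa][2]:.2f}")
```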

The results in the memory task were analyzed only for correct target detection trials (i.e., correct Yes and correct No). The main results are shown in Figs. 1 and 2. Separate ANOVAs were carried out on the first test pair, words adjacent to the target or control (Fig. 1a and b), and the second test pair, other words in the sequence (Fig. 2a and b). In the analysis of words adjacent to the target or control, the main effect of whether the target was present or absent was significant, with performance significantly worse when the target was present, F(1, 17) = 8.097, p < .02. The main effect of whether the tested word was before or after the target was also significant, F(1, 17) = 7.409, p < .02, with the word following the target remembered better than the one just before. The interaction between present/absent target and before/after position was not significant (F < 1); separate planned analyses confirmed that the effect of target present/absent was significant in each position: immediately after, F(1, 17) = 4.955, p < .05, and immediately before, F(1, 17) = 5.945, p < .05. The main effect of SOA was significant, F(2, 34) = 14.79, p < .0001. The interaction between SOA and before/after position was also significant, F(2, 34) = 3.296, p < .05. Participants were more accurate as SOA increased, particularly for the word following the target (after: F(2, 34) = 14.53, p < .0001; before: F(2, 34) = 3.515, p < .05). Neither the interaction between present/absent target and SOA, F(2, 34) = 2.346, p = .11, nor the triple interaction, F(2, 34) = 1.508, p = .24, was statistically significant. In a planned analysis, the interaction between target present/absent and SOA was marginally significant for words presented immediately before the detection target, F(2, 34) = 2.641, p = .086. Inspection shows that the decrement produced by a following target was found only at the 360-ms SOA (Fig. 1a). For words presented immediately after the detection target, the interaction between target presence and SOA was not significant, F(2, 34) = 0.603, p = .553. However, inspection of Fig. 1b again shows that there was no decrement at the 120-ms SOA, consistent with evidence from a later experiment that there is lag 1 sparing at the 120-ms SOA.

Fig. 1. Experiment 1, first memory test: two-alternative forced-choice recognition of the word immediately before the target or control (left) and immediately after the target or control (right). Error bars show ±1 SEM.

Fig. 2. Experiment 1, second memory test: two-alternative forced-choice recognition of a word before (left) or after (right) the target or control, excluding the two words adjacent to the target. Error bars show ±1 SEM.

Figure 2 shows results of the second memory task, in which the tested word was randomly chosen from the three words that remained after excluding the first and last words of the sequence, the target or control, and the two words immediately before and after the target or control. The tested word could have appeared 2, 3, or 4 items either before or after the target, depending on the serial position of the target. In an omnibus analysis with before/after, target present/absent, and SOA as variables, the main effect of whether the target was present or absent was significant, with performance significantly worse when the target was present, F(1, 17) = 21.78, p < .001. The main effect of SOA was also significant, F(2, 34) = 12.63, p < .0001. However, the main effect of whether the tested word was before or after the target was not significant, F(1, 17) = 3.029, p = .100. There was a significant interaction between present/absent target and before/after position, F(1, 17) = 6.988, p < .02. To explore that interaction, words before and after the target were analyzed separately. The effect of target presence/absence was marginally significant for memory encoding of words appearing before the target, F(1, 17) = 4.053, p = .060, and highly significant for memory encoding of words appearing after the target, F(1, 17) = 30.56, p < .001. No other 2-way or 3-way interaction was statistically significant (F < 2).

To sum up, words that appeared immediately before or after a target tended to be less well remembered than the same words on target-absent trials, particularly at longer SOAs. Words that were at least two serial positions away from a target showed this negative effect even more strongly, and at all SOAs. Strikingly, the word immediately following the target (lag 1) was not affected at the shortest SOA (120 ms), whereas a word that followed at lags 2–4 (240–480 ms) was markedly impaired, suggesting lag 1 sparing and an attentional blink. However, because the order of testing and the exact lags were not counterbalanced, this result cannot be interpreted with confidence. Experiment 2 addressed that problem.

Experiment 2

The results of Experiment 1 revealed that target detection negatively affects memory encoding of preceding as well as subsequent nontargets in RSVP, suggesting a selective mechanism that suppresses processing of other items when a target is detected in a temporal sequence. Experiment 1 focused on testing items immediately preceding and following the target in RSVP. Longer lags were tested in the second memory test only, and were sorted into only two bins: before or after the target. In Experiment 2, nontarget words at lags −3 to +3 were tested in a counterbalanced design, with three forced-choice tests on each trial.

We were particularly interested in lags 1 and 2, because of previous evidence from studies of the attentional blink that at lag 1 there is sparing, with a blink occurring at lag 2 and later (Potter, Chun, Banks, & Muckenhoupt, 1998; cf. Visser, Bischof, & Di Lollo, 1999). There was a hint of this pattern in Experiment 1. Lag 1 sparing has been thought to reflect an episode of transient attention that is induced by detection of a target, lasting for about 150 ms and including not only the target but also the following item (Bowman & Wyble, 2007; Wyble et al., 2009). Therefore, we expected that memory for the word at lag 1 would be relatively good on trials in which the target is present and reported, compared to target-absent trials. Similarly, memory for the lag 2 and lag 3 words would be expected to be impaired, as in the attentional blink. However, if transient attention is responsible for lag 1 sparing, the effect should be greater at a presentation duration of 120 ms than at one of 240 ms.

Method

Participants

The 36 new participants, drawn from the same pool as in Experiment 1, received payment for participation. They were randomly assigned to one of two SOAs, 120 ms or 240 ms, with an equal number in each group.

Apparatus and procedure

The apparatus and procedure were the same as in Experiment 1 except as follows. The two SOAs (120 ms and 240 ms) were presented between subjects. The memory task consisted of three two-alternative forced-choice recognition tests on each trial. Tested words were chosen from items presented 1–3 positions preceding or following the detection target in RSVP. The serial positions of the targets were the same as in Experiment 1, but the tested nontarget words were equally likely to be from any position from 3 words before the target (or control) to 3 words following the target (or control), excluding the first and last word in the sequence. Of the three words tested on a given trial, at least one word came before the target and one word after the target. Across trials, words presented at each serial position relative to the target were tested an equal number of times. Inevitably, the lags were confounded with the serial position in the trial: e.g., lag −3 could only have been presented at serial positions 2 and 3; lag −2 could have been presented at serial positions 2, 3, and 4; and so on. The order of testing the different lags on a given trial was counterbalanced between participants, as was the presence or absence of a target.
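The confound between lag and serial position follows directly from the design constraints, and can be enumerated in a few lines. The sketch below uses only the constraints stated in the text (8-word sequences, targets at positions 3–6, lags −3 to +3, first and last words never tested); the variable names are our own.

```python
from collections import defaultdict

SEQUENCE_LENGTH = 8
TARGET_POSITIONS = (3, 4, 5, 6)
LAGS = (-3, -2, -1, 1, 2, 3)

positions_by_lag = defaultdict(set)
for target in TARGET_POSITIONS:
    for lag in LAGS:
        position = target + lag
        # The first and last words of the sequence were never tested.
        if 2 <= position <= SEQUENCE_LENGTH - 1:
            positions_by_lag[lag].add(position)

for lag in LAGS:
    print(f"lag {lag:+d}: serial positions {sorted(positions_by_lag[lag])}")
```

The enumeration reproduces the examples in the text (lag −3 at positions 2 and 3; lag −2 at positions 2, 3, and 4) and shows the mirror-image restriction for positive lags: lag +3 can occur only at positions 6 and 7.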

Results

As in Experiment 1, detection was quite accurate; .78 correct (.72 correct Yeses and .16 false Yeses) with an SOA of 120 ms, and .88 correct (.84 correct Yeses and .084 false Yeses) at 240 ms. The effect of SOA was significant, F(1, 34) = 17.41, p < .001.

Results of the memory task were analyzed only on correct target detection trials (correct Yeses and correct Nos). Figure 3 shows the results of the memory task. An ANOVA with six lags and presence/absence of the target as within-subject variables and SOA (120 vs. 240 ms) as a between-subjects variable showed significant main effects of all three variables, F > 4, p < .01. Moreover, the interaction between lag and presence/absence of the target was significant, F(5, 170) = 6.689, p < .001. No other interaction was significant (F < 2). We carried out a series of planned comparisons to explore this interaction. For SOA = 120 ms, target detection significantly impaired memory encoding of the immediately preceding word, t(17) = 2.164, p < .05. (No such effect was observed in Experiment 1 at 120 ms, however.) Target detection also significantly impaired memory encoding of the word presented 2 positions after the target, t(17) = 3.243, p < .01, and the word presented 3 positions after the target, t(17) = 4.463, p < .001, reflecting an attentional blink effect. However, target detection did not significantly affect memory encoding of the immediately following word, t(17) = −0.206, n.s., suggesting lag 1 sparing. For SOA = 240 ms, target detection did not significantly affect memory encoding of any preceding word (p > .30). For words after a detected target, an attentional blink effect and a suggestion of lag 1 sparing were found [lag 1: t(17) = 1.162, n.s.; lag 2: t(17) = 4.068, p < .001; lag 3: t(17) = 3.461, p < .01].

Fig. 3. Experiment 2, memory test: accuracy as a function of lag relative to target or control position in RSVP. Error bars show ±1 SEM.

The results for memory of words following the target are generally consistent with the results of Experiment 1 for the same two presentation durations: lag 1 sparing at an SOA of 120 ms, followed by a substantial blink at longer lags, and at an SOA of 240 ms, a small, nonsignificant blink at lag 1 followed by a larger, significant blink at longer lags. The results for words preceding the target were less consistent across experiments, however.

Discussion

We used a novel dual-task design to test whether and how detection of a target affects memory for nontargets. The detection of a target significantly impaired memory encoding of nontargets that were presented after the target, especially at lag-2 and lag-3 positions in the RSVP. Similar to lag 1 sparing (Potter et al., 1998), target detection did not impair memory encoding of the immediately following item when the SOA was short. Interestingly, in Experiment 1 at 360 ms and in Experiment 2 at 120 ms (but not in Experiment 1 at 120 ms) target detection also negatively affected memory for the immediately preceding nontarget. These results are consistent with the biased competition model of selective attention (Desimone & Duncan, 1995) and the newer normalization model of attention (Reynolds & Heeger, 2009), were those models extended to temporal proximity. While attentional selection enhances the neural representation of the target, it may suppress the representation and encoding of nontargets arriving shortly before or after the target in RSVP.

More specifically, it has been suggested that target detection may induce an episode of transient attention (Muller & Rabbitt, 1989; Nakayama & Mackeben, 1989; Wyble et al., 2009; Yeshurun & Carrasco, 1999). Nontargets that are presented within spatio-temporal limits of this episode (i.e., within 150 ms) may receive processing along with the target, accounting for the lag 1 sparing seen in the present experiments for the word directly following the target, at an SOA of 120 ms. By contrast, memory encoding of nontargets that are presented outside of the transient attention episode may be suppressed. This effect is most evident at lag-2 and lag-3 positions in the RSVP.

It is important to note that target detection required that the participant comprehend each word, at least to determine whether it fell in the specified category. After a target was detected, one might imagine that the viewer would give added attention to the remaining nontarget words, but there is no suggestion of that in the results. Except at lag 1, memory was significantly worse for the remaining nontarget words when the target had been detected than when there was no target and the participant had to keep looking for it. This was the case even though the viewer had only to remember that the target had appeared, not its identity. We can speculate that memory for the nontargets would have improved at still longer lags, if the sequence had been longer, in line with recovery from an attentional blink.

Alternatively, could impaired nontarget recognition on target-present trials be explained by the switch from a dual task to a memory-only task once a target is detected? We think this is unlikely because the item that immediately followed the switch (i.e., the moment right after a target is presented) should show the greatest switch cost. Instead, encoding of the lag 1 item showed little or no interference from target detection, at faster presentation rates.

To sum up, we have shown that an attentional blink-like effect on retention of nontarget words is observed shortly after detection of a semantically defined target word, indicating that the attentional blink affects processing of subsequent nontarget items in the sequence, not just targets. More generally, we propose that future studies may adopt our paradigm to further investigate whether models of spatial attention (e.g., Desimone & Duncan, 1995; Reynolds & Heeger, 2009) can be extended to temporal proximity.