Emotion plays an important role in guiding social interactions, motivational behavior, decision-making, memory, attention and perception (Dolan, 2002). From an evolutionary point of view, one might expect that affective visual stimuli should be subject to preferential perceptual analysis, in order to promote adaptive behavior in situations that are relevant for survival or reproduction (LeDoux, 1995). Facial emotion, for example, has been shown to influence behavioral performance in various experimental paradigms, ranging from simple perceptual tasks (e.g., Bocanegra, Huijding, & Zeelenberg, 2012; Phelps, Ling, & Carrasco, 2006) to more complex attentional tasks (e.g., Fox, Russo, & Georgiou, 2005).

Many previous studies have compared the perception of emotional and neutral stimuli using paradigms that measure reaction times (RTs) to a single stimulus, whereas other studies have measured accuracy (ACC) using multistimulus presentations. Throughout this article, I will refer to the former paradigms as measuring single-stimulus RTs, and I will refer to the latter paradigms as measuring multistimulus accuracy. Although of course both speed and accuracy can be measured in either type of paradigm, I choose to contrast them in terms of speed versus accuracy due to the different task requirements of these paradigms. The rationale is that in single-stimulus RT paradigms, accuracy is not the main factor determining RT (i.e., identification is very easy: Accuracy is very close to ceiling, and participants are instructed to respond as quickly as possible), whereas in multistimulus ACC paradigms, RT is not the main factor determining accuracy (i.e., identification is very difficult: Accuracy is usually halfway between chance performance and ceiling, and participants are therefore not put under any time pressure to respond).

Interestingly, a seeming contradiction in the literature concerns the effect of emotion in these two types of paradigms. In single-stimulus paradigms, it has been shown that emotional expressions slow down RTs when participants have to detect or discriminate aspects of facial identity, such as gender, person, shape, or fine-grained features (e.g., Gilboa-Schechtman, Ben-Artzi, Jeczemien, Marom, & Hermesh, 2004; Holmes, Nielsen, & Green, 2008; Kolassa & Miltner, 2006; Passamonti et al., 2008; Sagaspe, Schwartz, & Vuilleumier, 2011; Van Dillen, Lakens, & den Bos, 2011; Winston, Vuilleumier, & Dolan, 2003). However, in multistimulus paradigms it has been shown that emotional expressions improve accuracy when participants have to detect or discriminate aspects of facial identity when presented with multiple competing stimuli (e.g., De Jong, Koster, Van Wees, & Martens, 2009; Fox et al., 2005; Maratos, Mogg, & Bradley, 2008; Milders, Sahraie, Logan, & Donnellon, 2006; Roesch, Sander, Mumenthaler, Kerzel, & Scherer, 2010).

Why do emotional expressions on the one hand slow down speed in single-stimulus RT paradigms, but at the same time improve accuracy in multistimulus ACC paradigms? Importantly, it seems unlikely that this difference can be attributed to different task-relevant features being used in both types of paradigms. For example, when gender is the task-relevant feature, emotion both improves accuracy (e.g., Milders et al., 2006) and slows down RT (e.g., Gilboa-Schechtman et al., 2004). Slower RTs are usually explained by the claim that the task-irrelevant emotional significance of the stimulus distracts attention away from the identity of the stimulus (van Honk, Tuiten, de Haan, van de Hout, & Stam, 2001; Williams, Mathews, & MacLeod, 1996), whereas improved accuracy is usually explained by the claim that the task-irrelevant emotional significance attracts attention toward the identity of the stimulus (Anderson, 2005; Arnell, Killman, & Fijavz, 2007; Bocanegra & Zeelenberg, 2009b). Although it has been suggested that the allocation of attentional resources is responsible for both the affective slowdown in speed in single-stimulus RT paradigms and the affective improvement in multistimulus ACC paradigms, no coherent account currently explains how both modulations can occur at the same time.

In the present study, I propose an account of affective modulations in single-stimulus RT paradigms and multistimulus ACC paradigms (ASAP: Affecting Speed and Accuracy in Perception). This account gives a straightforward explanation for the otherwise counterintuitive emotional effects in RT and ACC paradigms. First, I will present the theoretical assumptions and explain how these can account for the apparent contradiction in the literature described before. Next, I will describe the type of experimental paradigms used in the present study and explain the experimental predictions that follow from the theoretical assumptions. Finally, three series of experiments are presented that tested these predictions and controlled for alternative explanations.

Theoretical assumptions

Recent findings suggest that emotion modulates interactions between parallel channels in the visual system (Bocanegra et al., 2012; Bocanegra & Zeelenberg, 2009a, 2011a, 2011b; Borst & Kosslyn, 2010; Nicol, Perrotta, Caliciuri, & Wachowiak, 2013; Song & Keil, 2013; Vuilleumier, Armony, Driver, & Dolan, 2003). Traditionally, these parallel channels are differentiated in terms of their spatial properties (see Fig. 1): Parvocellular-type (P-type) channels are sensitive to fine-grained spatial information (high spatial frequencies: HSFs), and magnocellular-type (M-type) channels are sensitive to coarse-grained spatial information (low spatial frequencies: LSFs) (Callaway, 1998). Apart from their spatial properties, these visual channels are also differentiated in terms of their temporal properties (see the left panels in Fig. 2). P-type channels have relatively slower onset latencies and temporally sustained response durations, as compared to M-type channels, which have relatively faster onset latencies and temporally transient response durations (Maunsell et al., 1999).

Fig. 1. Examples of stimuli used in the experiments. LSF = low spatial frequency, HSF = high spatial frequency

Fig. 2. (Left) Schematic illustration of the hypothesized affective modulations according to the ASAP theory. Solid lines represent responses for neutral stimuli, and dotted lines represent responses for emotional stimuli. (Right) Predicted effects of emotional versus neutral faces in RTs and accuracy for the localization and identification tasks. LSF = low spatial frequency, HSF = high spatial frequency

Interestingly, recent findings have suggested that emotion induces an interaction between these channels, such that activation of the P-type channel is inhibited and the activation of the M-type channel is potentiated, relative to neutral stimuli (Bocanegra et al., 2012; Bocanegra & Zeelenberg, 2009a, 2011b; Borst & Kosslyn, 2010; Nicol et al., 2013; Song & Keil, 2013). This was first shown for the spatial response properties of M-type and P-type channels (Bocanegra & Zeelenberg, 2009a). However, subsequent findings indicate that this emotion-induced interaction also affects the temporal response properties of these channels (Bocanegra & Zeelenberg, 2011a).

On the basis of these findings, the present account assumes that the temporal response profiles of the M-type and P-type channels are modulated in opposite directions (see the schematic illustration in the left panels of Fig. 2). Specifically, the temporal onset latencies of the P-type and M-type channels are predicted to shift, such that M-type cells are accelerated, whereas P-type cells are decelerated in their onset. Critically, this mechanism also predicts that the overall sustained durations of temporal response would be modulated in opposite directions: M-type cells become more transient, whereas P-type cells become more sustained in their responses.

Independently of emotion, this type of mechanism has been postulated to explain a wide variety of experimental findings in visual perception. For example, interchannel inhibition has been used to account for pattern-masking effects in contrast sensitivity (Itti, Koch, & Braun, 2000), metacontrast masking effects in contour and brightness perception (Breitmeyer & Ogmen, 2000), saccadic suppression effects in contrast sensitivity (Burr, Morrone, & Ross, 1994), and attentional cuing effects in texture segmentation and temporal resolution (Yeshurun & Carrasco, 2000; Yeshurun & Levy, 2003).

It is important to note that this mechanism is not explicit as to the specific level within the visual system at which this interchannel inhibition is supposed to occur. The assumption is that emotional stimuli quickly and automatically activate the amygdala (a medial temporal lobe structure involved in emotion processing), which in turn modulates ongoing visual processing (Vuilleumier, 2005). One possible route for this modulation is subcortical input to the amygdala from the pulvinar and superior colliculus (Vuilleumier et al., 2003). An alternative possibility is the “multiple-waves” model, which posits that the amygdala receives fast coarse-grained cortical input that can modulate subsequent waves of activation in later processing stages in the visual cortex (Pessoa & Adolphs, 2010). In both cases, a rapid and automatic emotional modulation in the amygdala may influence the activation of M-type and P-type circuits. Importantly, this mechanism concerns systems that receive their dominant input from subcortical magno- and parvocellular systems, but that may be operating at V1 or beyond (for a detailed explanation of this point within the domain of metacontrast masking, see Öğmen, Purushothaman, & Breitmeyer, 2008).

Due to the differences in their spatial and temporal response properties, M-type and P-type channels are inherently specialized for conflicting, though complementary, functions in visual information processing. Broadly speaking, ASAP assumes that stimulus identification relies relatively more on the slower and more sustained activation of fine-grained HSFs in the P-type channel, whereas stimulus localization relies more on the faster and more transient activation of coarse-grained LSFs in the M-type channel (for similar assumptions, see Lamme & Roelfsema, 2000; Ungerleider & Haxby, 1994).

Furthermore, bottom-up visuo-motor processing will depend on the initial feed-forward sweep through the visual system, and thus will be influenced by the temporal onset latency of a channel: The earlier the onset of the visual signal, the faster the cascade of processing leading up to the motor response (e.g., Schmidt, Niehaus, & Nagel, 2006). Also, top-down visuo-attentional processing will depend on the recurrent processing between higher and lower areas in the visual system, and thus will be influenced by the temporal response duration of a channel: The more sustained the visual signal, the higher the probability of successful attentional selection (e.g., Fahrenfort, Scholte, & Lamme, 2007). These general assumptions are consistent with many models of bottom-up and top-down processing in the visual system (e.g., Chun & Potter, 1995; Lamme & Roelfsema, 2000; Treisman & Gelade, 1980).

Now, with these assumptions in place, one can explain the empirical contradiction in the literature between emotional modulations in single-stimulus RT paradigms and multistimulus ACC paradigms: Interchannel inhibition will both decelerate the onset latency and increase the response duration of the P-type channel. This shift in the response profile of the P-type channel predicts that emotion will slow down the speed of identification in single-stimulus RTs (i.e., the later the visual onset, the slower the motor response), but at the same time improve the accuracy of identification in multistimulus ACC (i.e., the more sustained the visual signal, the higher the probability of attentional selection).
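To make these dynamics concrete, the minimal simulation below sketches the hypothesized channel behavior. It is an illustration under stated assumptions, not a fitted model: channel responses are modeled as gamma-shaped impulse responses, the onset and duration parameters are arbitrary, the time at which the response first exceeds a threshold serves as a proxy for the RT-relevant onset latency, and the time spent above threshold serves as a proxy for the selection-relevant response duration.

```python
import numpy as np
from scipy.stats import gamma

t = np.linspace(0, 500, 5001)  # time axis in ms

def channel_response(onset_ms, duration_ms, shape=2.0):
    """Gamma-shaped temporal response beginning at onset_ms; the scale
    parameter controls how sustained (vs. transient) the response is."""
    return gamma.pdf(t - onset_ms, a=shape, scale=duration_ms / shape)

# Illustrative parameter values (assumptions, not estimates from the data):
# emotion accelerates and shortens the M-type response, and decelerates
# and lengthens the P-type response, as ASAP proposes.
channels = {
    ("M-type", "neutral"):   channel_response(onset_ms=60,  duration_ms=80),
    ("M-type", "emotional"): channel_response(onset_ms=45,  duration_ms=60),
    ("P-type", "neutral"):   channel_response(onset_ms=90,  duration_ms=150),
    ("P-type", "emotional"): channel_response(onset_ms=110, duration_ms=220),
}

dt = t[1] - t[0]
for (chan, emo), resp in channels.items():
    above = resp > 0.1 * resp.max()
    onset = t[np.argmax(above)]   # proxy for visuo-motor speed (single-stimulus RT)
    duration = above.sum() * dt   # proxy for attentional selection (multistimulus ACC)
    print(f"{chan} {emo:9s} onset ≈ {onset:5.1f} ms, duration ≈ {duration:6.1f} ms")
```

Under these assumed parameters, the emotional M-type response starts earlier but is briefer, whereas the emotional P-type response starts later but persists longer, which is exactly the pattern that yields faster localization RTs together with more accurate identification under attentional competition.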

Experimental paradigms and theoretical predictions

In order to test this account, the present experiments were designed to manipulate the type of task (single-stimulus RT vs. multistimulus ACC), the type of perceptual judgment (identification vs. localization), the visual content of the stimuli (LSFs vs. HSFs), and the emotional significance of the stimuli (emotional vs. neutral).

In the single-stimulus RT tasks, participants were presented with a face stimulus and performed either a speeded identification judgment (was the face male or female?) or a speeded localization judgment (was the face presented to the left or the right?). Importantly, the single-stimulus presentation was meant to ensure that the attentional demands of the RT task would be low enough to keep accuracy close to ceiling.

In the multistimulus ACC tasks, participants were first presented with a sample array of four faces for 500 ms. Then, after a 1-s interstimulus interval (ISI), they were presented with a probe array and asked to perform a nonspeeded identification judgment (which face was present in the previous array?) or a nonspeeded localization judgment (what was the location of the face in the previous array?). Importantly, the short presentation of the sample array, combined with the multiple simultaneously presented faces, ensured that the attentional demands of the task would be high enough to keep accuracy off ceiling. Also, the 1-s ISI between the sample and probe arrays was meant to ensure that performance would reflect limited-capacity attentional selection instead of high-capacity iconic visual persistence.

In general, an interchannel inhibition account predicts that emotion will have opposite effects on performance depending on (a) the spatial-frequency (SF) content of the perceptual stimulus (LSF vs. HSF; due to the assumption that LSF information is predominantly processed by M-type channels, whereas HSF information is predominantly processed by P-type channels), (b) the type of perceptual judgment performed by the participant (localization vs. identification; due to the assumption that localization relies predominantly on the M-type channel, whereas identification relies predominantly on the P-type channel), and (c) the type of task (RT vs. ACC; due to the assumption that single-stimulus RTs depend on the onset latency of a channel, whereas multistimulus ACC depends on the response duration of a channel).

Overall, interchannel inhibition predicts the following six experimental effects of emotion (see the right panels in Fig. 2): (1) an LSF bias in single-stimulus RTs (Negative Emotion × SF interactions; i.e., a relative emotional benefit for LSFs and deficit for HSFs), and (2) an HSF bias in multistimulus ACC (Positive Emotion × SF interactions; i.e., a relative emotional deficit for LSFs and benefit for HSFs). Also, the performance differences between emotional and neutral stimuli are specifically predicted to be most pronounced when the stimulus activates the same channel that is being used for the perceptual judgment (i.e., during the identification of an HSF stimulus, and the localization of an LSF stimulus; see the right panels in Fig. 2). In other words, emotion will (3) speed up the localization of LSFs and (4) slow down the identification of HSFs in a single-stimulus paradigm, whereas emotion will (5) impair the localization of LSFs and (6) improve the identification of HSFs in a multistimulus paradigm.
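The combinatorial logic behind these six predictions can be restated as a small decision rule. The sketch below is simply a paraphrase of the right panels of Fig. 2, not code from the study: an effect is pronounced only when the stimulus SF matches the channel that the judgment relies on; its sign in RTs follows the channel's onset shift, and its sign in accuracy follows the channel's duration shift.

```python
def predicted_emotion_effect(task: str, judgment: str, sf: str) -> str:
    """Direction of the emotion effect implied by interchannel inhibition.
    task: 'RT' (single-stimulus) or 'ACC' (multistimulus)
    judgment: 'localization' (M-type) or 'identification' (P-type)
    sf: 'LSF' (M-type) or 'HSF' (P-type)
    """
    judgment_channel = "M" if judgment == "localization" else "P"
    stimulus_channel = "M" if sf == "LSF" else "P"
    if judgment_channel != stimulus_channel:
        return "little or no effect"
    if task == "RT":  # onset latency drives single-stimulus RTs
        return ("faster (accelerated M onset)" if stimulus_channel == "M"
                else "slower (decelerated P onset)")
    # response duration drives multistimulus attentional selection
    return ("impaired (more transient M response)" if stimulus_channel == "M"
            else "improved (more sustained P response)")

# The four specific predictions, (3) through (6):
for task in ("RT", "ACC"):
    for judgment, sf in (("localization", "LSF"), ("identification", "HSF")):
        print(task, judgment, sf, "->", predicted_emotion_effect(task, judgment, sf))
```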

As was described previously, Predictions 4 and 6 are partially supported in the literature. Importantly, however, these studies did not manipulate the SF content of the stimuli. To my knowledge, the novel Predictions 1, 2, 3, and 5 have not been tested before. All six predictions were tested in three series of experiments (see Figs. 3, 5, and 7 below).

Fig. 3. Illustrations of the trials in Experiments 1a–1d. Faces were presented until response

Experiments 1a–1d: Localization and identification of LSF and HSF emotional stimuli in single-stimulus RTs

In the first series of experiments, the effects of emotion on the localization and identification of LSF and HSF faces were investigated in single-stimulus RT tasks. In each of the four experiments (1a, 1b, 1c, and 1d), the general Prediction 1 was tested, that emotion would induce an LSF bias in RTs.

Additionally, Experiments 1a and 1b tested the specific Prediction 3, that emotion would speed up the localization of an LSF stimulus. In Experiment 1a, participants performed a localization task on fearful or neutral faces (see Fig. 3). In Experiment 1b, the same localization task was used with angry, happy, and neutral faces.

Experiments 1c and 1d also tested the specific Prediction 4, that emotion would slow down the identification of an HSF stimulus. Experiments 1c and 1d were identical to Experiments 1a and 1b, respectively, except that participants performed an identification task on the gender of faces presented at the center of the screen (see Fig. 3).

The use of fearful faces in Experiments 1a and 1c as an emotional manipulation was based on previous findings indicating that fearful faces induce an interchannel inhibition in perception (e.g., Bocanegra & Zeelenberg, 2009a; Borst & Kosslyn, 2010). Although the account does not specifically predict effects of emotional valence (i.e., performance differences between angry and happy faces), many single-stimulus RT tasks have used angry and/or happy faces instead of fearful faces (e.g., Kolassa & Miltner, 2006). Happy and angry faces were used in Experiments 1b and 1d to test whether any effects observed in Experiments 1a and 1c with fearful faces would generalize to these emotional expressions.

Method

Participants

The participants were recruited using the Amazon Mechanical Turk (https://www.mturk.com); 81 participated in Experiment 1a, 78 in Experiment 1b, 85 in Experiment 1c, and 81 in Experiment 1d. All participants completed an informed consent form prior to the start of the experiment, were from the United States, and were paid $0.80 for approximately 15 min of their time (see Buhrmester, Kwang, & Gosling, 2011).

Stimuli and procedure

Experiments were programmed using Qualtrics software and were presented online. A set of ten facial photographs portraying fearful, angry, happy, and neutral expressions was selected from the Pictures of Facial Affect series (Ekman & Friesen, 1976). To generate the HSF and LSF faces, low-pass and high-pass two-dimensional Gaussian filters were applied (see Fig. 1). A low-pass filter cutoff was chosen that would target M-type LSF channels (<12 cycles per face, or approximately <3 cycles per degree) and a high-pass filter cutoff that would target P-type HSF channels (>12 cycles per face, or approximately >3 cycles per degree; De Valois, Albrecht, & Thorell, 1982; Vuilleumier et al., 2003). It is important to note that the variability in stimulus presentation size due to the use of Web-based experimentation makes it difficult to draw inferences concerning the absolute SF differences between the faces (i.e., cycles per degree). However, the relative SF differences between the faces (i.e., cycles per face) can be interpreted unambiguously.

First, a blank screen was presented for 500 ms (uniform mid-gray background on a 256 gray-level scale). Next, a face (250 × 250 pixels) was presented until response. In Experiments 1a and 1b, participants performed a speeded localization task on a face presented to the left or the right of the screen center (Fig. 3, left). If the face was presented on the left, they pressed the “A” key on the keyboard; if it was presented on the right, they pressed the “L” key. In Experiments 1c and 1d, participants performed a speeded identification task on a centrally presented face (Fig. 3, right). If the presented face was female, they pressed the “A” key; if the presented face was male, they pressed the “L” key (both speed and accuracy were stressed in all experiments). All experiments consisted of 120 trials, and all variables varied randomly from trial to trial.
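For illustration, the filtering step described above might look as follows in code. This is a minimal sketch, not the original stimulus-generation script: the file name is hypothetical, and the sigma value is an assumption chosen so that the Gaussian half-amplitude cutoff lands near 12 cycles per face for a 250-pixel-wide image.

```python
import numpy as np
from PIL import Image
from scipy import ndimage

# Hypothetical 250 x 250 grayscale face photograph
face = np.asarray(Image.open("face.png").convert("L"), dtype=float)

# For a Gaussian low-pass filter, the half-amplitude cutoff is roughly
# 0.187 / sigma cycles per pixel; sigma = 4 px therefore puts the cutoff
# near 0.047 cycles/pixel, i.e., ~12 cycles per face at 250 pixels.
sigma = 4.0
lsf_face = ndimage.gaussian_filter(face, sigma=sigma)  # coarse-grained LSF face
hsf_face = face - lsf_face                             # fine-grained HSF residual
hsf_face_display = hsf_face + face.mean()              # re-add mean luminance for display
```

Note that, as stated above, only the cycles-per-face cutoff is interpretable online; the cycles-per-degree values depend on each participant's unknown viewing distance and display size.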

Data analysis

Incorrect responses were excluded from the analyses (<5% in all experimental conditions; no significant main effects or interaction effects were observed in the errors, Fs < 1.5, ps > .25). Mean RTs were calculated for correct responses, removing trials with RTs of less than 200 ms or more than 3,000 ms (<4% of correct trials).
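These exclusion rules translate directly into a few lines of preprocessing. The sketch below assumes a hypothetical long-format trial file with columns subject, emotion, sf, rt (in ms), and correct; the 2 × 2 repeated measures ANOVA at the end corresponds to the Emotion × SF analyses reported in the next section.

```python
import pandas as pd
import pingouin as pg

trials = pd.read_csv("exp1a_trials.csv")  # hypothetical long-format trial data

# Keep correct responses with RTs inside the 200-3,000 ms window
clean = trials[(trials["correct"] == 1) & (trials["rt"].between(200, 3000))]

# Mean RT per participant and condition, then a 2 x 2 repeated measures ANOVA
cell_means = clean.groupby(["subject", "emotion", "sf"], as_index=False)["rt"].mean()
anova = pg.rm_anova(data=cell_means, dv="rt", within=["emotion", "sf"],
                    subject="subject")
print(anova)  # the Emotion x SF row carries the interaction test
```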

Results and discussion

In the localization Experiments 1a and 1b, significant interactions were observed between emotion and spatial frequency, F(1, 80) = 7.38, p < .01, ηp² = .08, and F(1, 77) = 4.82, p < .05, ηp² = .06, respectively, indicating LSF biases in RTs due to emotion (see Fig. 4, left panels). Planned contrasts for the LSFs indicated faster responses for fearful than for neutral faces, t(80) = 2.39, p < .05, Cohen’s d = 0.27; for angry than for neutral faces, t(77) = 2.92, p < .01, Cohen’s d = 0.33; and for happy than for neutral faces, t(77) = 2.13, p < .05, Cohen’s d = 0.24. For the HSFs, no significant contrasts were observed, ps > .08.

Fig. 4. Reaction times (RTs) for each of the conditions in Experiments 1a (top left), 1b (bottom left), 1c (top right), and 1d (bottom right). Error bars represent within-subjects standard errors (Loftus & Masson, 1994). * p < .05, ** p < .01 (interaction)

In the identification Experiments 1c and 1d, significant interactions were found, F(1, 84) = 4.38, p < .05, ηp² = .05, and F(1, 80) = 7.28, p < .01, ηp² = .08, respectively, again indicating LSF biases in RTs due to emotion (see Fig. 4, right panels). Planned comparisons for the HSFs indicated slower responses for fearful than for neutral faces, t(84) = 3.90, p < .01, Cohen’s d = 0.42; for angry than for neutral faces, t(80) = 2.93, p < .01, Cohen’s d = 0.33; and for happy than for neutral faces, t(80) = 2.81, p < .01, Cohen’s d = 0.31. For the LSFs, the only significant contrast was that responses were slower for fearful than for neutral faces, t(84) = 2.79, p < .01, Cohen’s d = 0.30; all other ps > .25.

Experiments 1a–1d confirmed Prediction 1 for single-stimulus RTs, that emotion induces an overall LSF bias in speed. They also confirmed the specific Predictions 3 and 4, that emotion speeds up the localization of LSFs and slows down the identification of HSFs.

Experiments 2a–2d: Localization and identification of LSF and HSF emotional stimuli in multistimulus ACC

In the second series of experiments, the effects of emotion on the localization and identification of LSF and HSF faces were investigated in multistimulus ACC tasks. In each of Experiments 2a, 2b, and 2d, the general Prediction 2 was tested, that emotion would induce an HSF bias in accuracy.

Additionally, Experiments 2a and 2b tested the specific Prediction 5, that emotion would impair the localization of an LSF stimulus. In Experiment 2a, participants performed a localization task on angry or neutral faces (see Fig. 5). In Experiment 2b, the same task was used, but with fearful and neutral faces.

Fig. 5. Illustrations of the trials in Experiments 2a–2d. Note that the sample array was presented for 1,500 ms in Experiment 2c. Test arrays were presented until response

Experiment 2d also tested the specific Prediction 6, that emotion would improve the identification of an HSF stimulus. This experiment was identical to Experiment 2a, except that participants performed an identification task (see Fig. 5).

Experiment 2c tested whether any of the interactions observed in Experiments 2a and 2b were attributable to a genuine competition for attentional resources. Experiment 2c was identical to Experiment 2a, except for the long sample-array duration (1,500 ms). The hypothesis was that no HSF bias should be observed if participants were given enough time to process all four faces.

The use of angry faces in Experiment 2a as an emotional manipulation was based on previous findings indicating that this facial expression reliably influences performance in attentional tasks (e.g., Maratos et al., 2008). Considering that the effect of emotional valence was not of primary interest in the present study, I decided only to replicate the HSF bias with fearful faces. Experiment 2b therefore tested whether any effects observed in Experiment 2a would generalize to fearful faces, given previous perceptual findings using this emotional expression (cf. Bocanegra & Zeelenberg, 2009a).

Method

Participants, stimuli, and procedure

Additional participants were recruited: 71 participated in Experiment 2a, 76 in Experiment 2b, 74 in Experiment 2c, and 72 in Experiment 2d. Five facial photographs of fearful, angry, and neutral expressions were selected (Ekman & Friesen, 1976). As in the previous experiments, low-pass and high-pass filters were applied to generate the HSF and LSF faces.

First, a blank screen was presented for 500 ms. Next, a sample array of four faces was presented (each 250 × 250 pixels; presented for 500 ms in Exps. 2a, 2b, and 2d, and for 1,500 ms in Exp. 2c). Then, a blank ISI was presented for 1,000 ms. Finally, a test array was presented until response (a single face in Exps. 2a, 2b, and 2c; two faces in Exp. 2d). In Experiments 2a, 2b, and 2c, participants performed a localization task on the centrally presented face (Fig. 5, left): They indicated the face’s previous location in the sample array by clicking one of four response buttons (labeled “top left,” “bottom left,” “top right,” and “bottom right”) with the mouse. The four response buttons were presented below the test array in a rectangular spatial layout. In Experiment 2d, participants performed an identification task on two faces (Fig. 5, right): They indicated which of the two faces had been present in the previous sample array by clicking one of two response buttons (“left” or “right”) with the mouse (only accuracy was stressed in all experiments). All experiments consisted of 96 trials, and all variables varied randomly from trial to trial. Reaction times were not recorded.

Results and discussion

In the localization Experiments 2a and 2b, significant interactions were observed between emotion and spatial frequency, F(1, 70) = 4.49, p < .05, ηp² = .06, and F(1, 75) = 4.68, p < .05, ηp² = .05, respectively, indicating HSF biases in accuracy due to emotion (see Fig. 6, left panels). Planned contrasts for the LSFs indicated impaired accuracy for angry relative to neutral faces, t(70) = 4.92, p < .01, Cohen’s d = 0.59, and for fearful relative to neutral faces, t(75) = 4.59, p < .01, Cohen’s d = 0.52. For the HSFs, contrasts indicated that angry faces were less accurate than neutral faces, t(70) = 2.63, p < .05, Cohen’s d = 0.31, but no difference emerged between fearful and neutral faces, p = .09.

Fig. 6. Accuracies for each of the conditions in Experiments 2a (top left), 2b (bottom left), 2c (top right), and 2d (bottom right). Note that location was varied in all except Experiment 2d (dotted lines), where identity was varied instead. Error bars represent within-subjects standard errors (Loftus & Masson, 1994). * p < .05 (interaction)

In the localization Experiment 2c, no interaction was observed when the sample array was presented for 1,500 ms, F < 1, p > .80 (see Fig. 6, top right). In order to interpret this absence of an interaction, the JZS Bayes factor was calculated. The Bayes factor can be used to provide confirmatory evidence for a null effect by estimating how much more likely the null hypothesis is, given the data, relative to the alternative hypothesis (Rouder, Speckman, Sun, Morey, & Iverson, 2009). The JZS Bayes factor indicated that the null hypothesis was more than ten times more likely than the alternative hypothesis (JZS-BF = 10.81), which is typically considered strong evidence for the null hypothesis.
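For readers who wish to reproduce this type of computation, below is a minimal numerical sketch of the JZS Bayes factor for a one-sample t test (Rouder et al., 2009) with the default Cauchy prior scale r = 1. A 2 × 2 within-subjects interaction F(1, n − 1) can be converted to this form by treating it as a one-sample t test on each participant's difference-of-differences score, with t = √F; the F value plugged in below is hypothetical, since the exact F for Experiment 2c is reported only as F < 1.

```python
import numpy as np
from scipy import integrate

def jzs_bf01(t, n):
    """JZS Bayes factor in favor of the null (Rouder et al., 2009) for a
    one-sample t test with n observations, Cauchy prior scale r = 1."""
    v = n - 1  # degrees of freedom
    null_marginal = (1 + t**2 / v) ** (-(v + 1) / 2)

    def integrand(g):  # marginal likelihood under H1, mixing over g
        return ((1 + n * g) ** -0.5
                * (1 + t**2 / ((1 + n * g) * v)) ** (-(v + 1) / 2)
                * (2 * np.pi) ** -0.5 * g ** -1.5 * np.exp(-1 / (2 * g)))

    alt_marginal, _ = integrate.quad(integrand, 0, np.inf)
    return null_marginal / alt_marginal

# Hypothetical example: an interaction F(1, 73) = 0.05 with n = 74 participants
print(jzs_bf01(t=np.sqrt(0.05), n=74))
```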

In the identification Experiment 2d, a significant interaction was observed, F(1, 71) = 6.01, p < .05, ηp² = .08, again indicating an emotion-induced HSF bias in accuracy (see Fig. 6, bottom right). A planned comparison for the HSF condition indicated improved accuracy for angry relative to neutral faces, t(71) = 2.93, p < .01, Cohen’s d = 0.34. For the LSF condition, no difference was found, p = .34.

Experiments 2a–2d confirmed Prediction 2 for multistimulus ACC, that emotion induces an overall HSF bias in accuracy. They also confirmed the specific Predictions 5 and 6, that emotion impairs the localization of LSFs and improves the identification of HSFs. Interestingly, when the array was presented for 1,500 ms, the Emotion × SF interaction was abolished, suggesting that the HSF bias depends critically on a competition for attentional selection.

Experiments 3a–3d: An emotion-induced trade-off between speed and accuracy of perception?

The previous experiments confirmed the predictions using single-stimulus RT tasks and multistimulus ACC tasks. However, one might argue that the different emotional effects observed in these paradigms were due to the different perceptual judgments used in the single-stimulus RT tasks and the multistimulus ACC tasks (e.g., gender identification in the RT tasks vs. person identification in the ACC tasks).

The third series of experiments tested the predictions in a single-stimulus RT task and a multistimulus ACC task using the same perceptual judgment: an identity change-detection task (see Fig. 7). In this paradigm, participants were first presented with a sample array consisting of one or more faces. Then, they were presented with a probe array and were instructed to detect an identity change within the face(s). If the interchannel inhibition account is correct, then the predicted emotional effects in RTs and ACC should again be observed (i.e., a trade-off between perceptual speed and attentional accuracy) using the same identity change-detection task. I specifically chose an identification paradigm because most previous studies have investigated identification instead of localization (see the introduction).

Fig. 7. Illustrations of the trials in Experiments 3a–3d. Note that the sample array was presented for 500 ms in Experiment 3a, 200 ms in Experiments 3b and 3c, and 1,500 ms in Experiment 3d. Also, no ISI was presented in Experiments 3b and 3c. Test arrays were presented until response

Experiments 3a and 3b tested whether an LSF bias in RTs (Prediction 1) and an HSF bias in ACC (Prediction 2) could be observed within the same paradigm. Additionally, these two experiments tested the specific Prediction 4, that emotion would slow down the identification of an HSF stimulus, and Prediction 6, that emotion would improve the identification of an HSF stimulus.

Although Experiments 3a and 3b were identical in terms of perceptual judgments, Experiment 3a included an additional 1-s ISI between the sample and test arrays. In Experiment 3c, this ISI was removed in order to exclude the possibility that any differences between Experiments 3a and 3b could be due to differential decay of the facial features in visual short-term memory.

Experiment 3d further verified whether an interaction in Experiment 3a would reflect a genuine competition for attentional resources. As in Experiment 2c, a long sample-array duration was used (1,500 ms; see Fig. 7). Here, I similarly hypothesized that the HSF bias should disappear if participants were given enough time to process all of the faces.

Method

Participants, stimuli, and procedure

Additional participants were recruited: 85 participated in Experiment 3a, 79 in Experiment 3b, 81 in Experiment 3c, and 84 in Experiment 3d. Five facial photographs of angry and neutral expressions were selected (Ekman & Friesen, 1976). As in the previous experiments, low-pass and high-pass filters were applied to generate the HSF and LSF faces.

First, a blank screen was presented for 500 ms. Next, a sample array (four faces in Exps. 3a, 3c, and 3d; one face in Exp. 3b) was presented (for 500 ms in Exp. 3a, 200 ms in Exps. 3b and 3c, and 1,500 ms in Exp. 3d). Then, a 1,000-ms blank ISI (Exps. 3a and 3d) or no ISI (Exps. 3b and 3c) was presented. Finally, a test array was presented until response (four faces in Exps. 3a, 3c, and 3d; one face in Exp. 3b). The faces in the sample array were intact (HSFs and LSFs), and those in the test array consisted of either HSFs or LSFs. In Experiments 3a, 3c, and 3d, participants performed a nonspeeded identity change-detection judgment in a multistimulus task (Fig. 7, right): They indicated which face had changed identity by clicking one of four response buttons with the mouse (only accuracy was stressed). In Experiment 3b, participants performed a speeded identity change-detection judgment in a single-stimulus task (Fig. 7, left): If the face identity changed, they pressed the “A” key; if the face identity stayed the same, they pressed the “L” key (both speed and accuracy were stressed). In Experiments 3a, 3c, and 3d, RTs were not recorded. In Experiment 3b, incorrect responses were excluded from the analyses (<7% in all experimental conditions; no significant effects were observed in the errors, Fs < 1, ps > .35), and mean RTs were calculated for correct responses, removing trials with RTs of less than 200 ms or more than 3,000 ms (<3% of correct trials). All experiments consisted of 112 trials, and all of the variables varied randomly from trial to trial.

Results and discussion

In the multistimulus Experiments 3a and 3c, significant interactions between emotion and spatial frequency were observed, F(1, 84) = 7.15, p < .01, ηp² = .08, and F(1, 80) = 11.34, p < .01, ηp² = .12, respectively, indicating HSF biases in accuracy due to emotion (see Fig. 8, top panels). Planned contrasts for the HSFs indicated improved accuracy for angry relative to neutral faces, t(84) = 9.95, p < .01, Cohen’s d = 1.08, and t(80) = 8.91, p < .01, Cohen’s d = 0.93, for Experiments 3a and 3c, respectively. For the LSFs, contrasts indicated that angry faces were more accurate than neutral faces, t(84) = 5.72, p < .01, Cohen’s d = 0.60, and t(80) = 5.63, p < .01, Cohen’s d = 0.63, for Experiments 3a and 3c, respectively.

Fig. 8. Accuracies or reaction times (RTs, indicated by dotted lines around panel) for each of the conditions in Experiments 3a (top left), 3b (bottom left), 3c (top right), and 3d (bottom right). Error bars represent within-subjects standard errors (Loftus & Masson, 1994). * p < .05, ** p < .01 (interaction)

In the multistimulus Experiment 3d, no interaction was found when the sample array was presented for 1,500 ms, F < 1, p > .90 (see Fig. 8, bottom right). The JZS Bayes factor indicated that the null hypothesis was more than ten times more likely than the alternative hypothesis (JZS-BF = 11.53), again providing strong evidence for the null hypothesis.

In the single-stimulus Experiment 3b, a significant interaction was found, F(1, 78) = 4.53, p < .05, ηp² = .06, indicating an LSF bias in RTs (see Fig. 8, bottom left). A planned contrast for the HSF condition indicated slower responses for angry than for neutral faces, t(78) = 3.14, p < .01, Cohen’s d = 0.35. For the LSF condition, no difference was observed, p = .68.

Experiments 3a, 3c, and 3d further confirmed Prediction 2, that emotion induces an overall HSF bias in accuracy, and Prediction 6, that emotion improves the identification of HSFs. Importantly, however, using the same identity change-detection paradigm, an LSF bias was found when the task was changed from a multistimulus ACC task to a single-stimulus RT task in Experiment 3b: Here, emotion slowed down the identification of HSFs, confirming Predictions 1 and 4. Overall, these findings suggest that emotion induces a trade-off in the speed versus the accuracy of perception. Furthermore, the absence of an ISI and the short array presentation in Experiment 3c suggest that the HSF bias was not due to memory decay of the faces. Instead, the disappearance of the HSF bias with the 1,500-ms array in Experiment 3d suggests that it depends critically on a competition for attentional selection.

General discussion

The present study proposes a unified account of affective modulations in the speed and accuracy of perception (ASAP), which assumes that emotion induces an inhibitory interaction between parallel M-type and P-type channels in the visual system. This mechanism predicts various novel effects of emotion. For example, ASAP predicts LSF biases in RTs, due to visual channel onset latencies (an accelerated onset in the M-type channel, and a decelerated onset in the P-type channel). However, ASAP predicts HSF biases in accuracy, due to visual channel response durations (more transient responses in the M-type channel, and more sustained responses in the P-type channel; see Fig. 2). These predictions were confirmed by the experiments: LSF biases were consistently observed in the single-stimulus RT tasks (Exps. 1a–1d and 3b), and HSF biases in the multistimulus ACC tasks (Exps. 2a–2b, 2d, 3a, and 3c) (see Fig. 9, top panel).

Fig. 9. Aggregated summary of the experimental findings. Spatial-frequency biases for reaction times (RTs) were calculated as [(HSF_neutral − HSF_emotion) − (LSF_neutral − LSF_emotion)], and those for accuracies as [(HSF_emotion − HSF_neutral) − (LSF_emotion − LSF_neutral)]. Error bars represent standard errors of the means. Experiments 1a, 1b, 1c, 1d, and 3b were included in the results for single-stimulus RTs, and Experiments 2a, 2b, 2d, 3a, and 3c in those for multistimulus accuracies
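Spelled out in code, the two summary indices from the caption are as follows (the cell means in the usage example are hypothetical placeholders, not values from the experiments):

```python
def rt_bias(hsf_neutral, hsf_emotion, lsf_neutral, lsf_emotion):
    """Emotional RT benefit at HSF minus emotional RT benefit at LSF;
    under this convention, an emotion-induced LSF bias in speed is negative."""
    return (hsf_neutral - hsf_emotion) - (lsf_neutral - lsf_emotion)

def acc_bias(hsf_neutral, hsf_emotion, lsf_neutral, lsf_emotion):
    """Emotional accuracy benefit at HSF minus at LSF;
    an emotion-induced HSF bias in accuracy is positive."""
    return (hsf_emotion - hsf_neutral) - (lsf_emotion - lsf_neutral)

print(rt_bias(620, 650, 610, 590))       # -50 ms: an LSF bias in speed
print(acc_bias(0.62, 0.70, 0.64, 0.60))  # +0.12: an HSF bias in accuracy
```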

Although LSF biases in RTs have been reported recently (e.g., Bocanegra et al., 2012), the present study provides the first demonstrations of emotion-induced HSF biases in accuracy. Importantly, an HSF bias was still observed when the ISI between the sample and test arrays was removed (Exp. 3c). This excludes the possibility that it was due to the decay of the features in visual short-term memory, and suggests that the multistimulus ACC tasks were tapping into a genuine competition for attentional resources.

ASAP assumes that emotion modulates the response duration of a visual channel, which in turn increases the probability of attentional selection. Critically, this mechanism predicts that HSF biases should only be observed when multiple stimuli are competing for attention. For example, when time is not a limiting factor, emotional modulations in the response duration of visual channels should not influence attentional selection. Indeed, two experiments failed to reveal SF biases for long sample-array presentations (Exps. 2c and 3d), which provides further support for the hypothesis that emotional modulations of the temporal response properties of visual channels influence the probability of successful attentional selection.

Can the emotion-induced HSF bias in the multistimulus ACC tasks be explained by visual differences between the stimuli? In other words, were the emotional faces more confusable in the LSFs and more discriminable in the HSFs? In Experiments 3a–3c, both the LSF bias in RTs and the HSF bias in accuracy were observed using the same change-detection paradigm. If the emotional face changes had been less salient in the LSFs and more salient in the HSFs, one would expect to find exactly the same SF bias in both the perceptual and attentional tasks, which was not the case.

In general, the opposite SF biases in the single-stimulus tasks and the multistimulus tasks are inconsistent with the general explanation that the emotional faces were more confusable in the LSFs and more discriminable in the HSFs. Also, in Experiments 2a–2b the same HSF biases were found for structurally very different expressions (fearful vs. angry faces), whose diagnostic subfeatures are in fact inversely correlated (Smith, Cottrell, Gosselin, & Schyns, 2005). Moreover, an account based on the confusability or discriminability of facial features would have difficulty explaining the absent interactions in Experiments 2c and 3d. Instead, it appears that the HSF biases in the multistimulus tasks (as well as the LSF biases in the single-stimulus tasks and the absent biases in the control experiments) are more easily explained by an affective modulation of the response properties of visual channels.

The present account postulates that stimulus localization and identification are predominantly subserved by the M-type and P-type channels, respectively (see Lamme & Roelfsema, 2000; Ungerleider & Haxby, 1994). This generates two specific predictions that had already been partially confirmed in the literature: an emotional slowdown in single-stimulus identification, and an emotional improvement in multistimulus identification. Experiments 1c–1d and 3b and Experiments 2d, 3a, and 3c further confirmed these two effects. Importantly, two novel counterintuitive effects of emotion were predicted and confirmed in stimulus localization: an emotional speedup in single-stimulus localization (Exps. 1a–1b), and an emotional impairment in multistimulus localization (Exps. 2a–2b) (see Fig. 9, bottom panel). To my knowledge, these findings have never been reported before. An interesting question for future research will be whether emotion may have opposite effects on the perception of a broadband visual stimulus, depending on the type of perceptual judgment that is performed (localization vs. identification). If so, this would suggest that emotion makes optimal use of the visual system’s capacities in order to react adaptively to threat: enhancing localization in visuo-motor action, and enhancing identification in visuo-attentional selection.

Do the SF biases in perception depend on emotional valence or arousal? In Experiments 1a–1d, similar effects were found for fearful, angry, and happy faces in RTs, suggesting a critical role for emotional arousal. Indeed, a recent study has shown similar LSF biases in perceptual processing for both positive and negative arousing IAPS pictures (Song & Keil, 2013). Also, another recent study has observed an LSF bias in contrast sensitivity due to aversively conditioned auditory tones, suggesting that this interaction is not restricted to visual emotional stimuli (Lee, Baek, Lu, & Mather, in press). An interesting open question is whether the same would be the case for the HSF biases in multistimulus attentional tasks. Consistent with the findings in perception, previous studies have suggested that emotional modulations in attentional identification depend critically on emotional arousal, instead of valence (Arnell et al., 2007; De Jong et al., 2009).

Although ASAP proposes a general neural mechanism, the exact neural substrates underlying affective modulations in perception are currently unknown. Importantly, interchannel inhibition as a mechanism may be implemented at various levels throughout the visual system (see Öğmen et al., 2008). Overall, ASAP is consistent with the idea that emotional stimuli may quickly and automatically activate the amygdala, which in turn modulates ongoing processing in the visual system (see Vuilleumier, 2005). However, detailed structural knowledge is lacking as to the amygdala’s visual input, its interlaminar connectivity, and how its output projections feed back onto the magnocellular and parvocellular dominant circuits of V1 and beyond. Therefore, an important question for future research will be how and at what level in the visual system emotion-induced interchannel inhibition is instantiated.

According to the influential biased-competition account of attention and emotion, emotional stimuli bias the competition for attentional resources in such a way that they are at a competitive advantage relative to neutral stimuli (Pessoa, Kastner, & Ungerleider, 2002; Pessoa & Ungerleider, 2004). Within this perspective, emotional modulations in perception are essentially due to increases in the amount of attention that is allocated to a stimulus. The present account departs somewhat from this perspective, by postulating that emotion does not alter the workings of attention per se, but rather modulates the visual input signal to the attentional system (for a related claim, see Most & Wang, 2011): Emotion modulates the temporal onset and response duration of visual signals. When an emotional stimulus is in competition with neutral stimuli, this modulation may influence performance in various ways. For example, an increased duration of the visual signal may increase the likelihood that the emotional stimulus will be selected, and assuming that attentional selection is a capacity-limited process, this may occur at the relative expense of the neutral distractors. Also, an accelerated onset of a visual signal may influence how fast spatial attention may be directed toward the location of an emotional stimulus (see also West, Anderson, & Pratt, 2009), which may also occur at the expense of neutral distractors if they are in spatial competition with each other.

The present study suggests that affective influences in perceptuo-motor speed and the accuracy of attentional selection may be due to a common underlying visual mechanism. Also, ASAP is based on the novel proposition that emotion trades off speed and accuracy between visual channels. In this manner, emotion achieves the “best of both worlds” in terms of evolutionary advantage: (a) fast visuo-motor responding to coarse-grained information, and (b) accurate visuo-attentional selection of fine-grained information. In sum, ASAP provides a functional account of otherwise counterintuitive findings, which may be useful for explaining affective influences in both featural-level single-stimulus tasks and object-level multistimulus tasks.