Cognitive processes are affected by emotion. Many studies have shown preferential perceptual processing of emotional information, especially for threatening content (see Dolan, 2002, for a review). People pay attention more readily to, and are slower in disengaging from, emotional than neutral stimuli (Fox, Russo, Bowles, & Dutton, 2001; Hajcak & Olvet, 2008; Vuilleumier, 2005). These types of effects have been observed for visual stimulation, such as pictures (e.g., Foti, Hajcak, & Dien, 2009) and faces (Adolphs, 2002), and also for symbolic and learned stimuli, such as words (Kissler, Assadollahi, & Herbert, 2006). Abnormalities in the processing of emotional information have been described in numerous psychiatric pathologies (Phillips, Drevets, Rauch, & Lane, 2003).

In this study, we aimed to investigate the time course of the processing of affective words by using event-related potentials (ERPs). The ERP technique is a suitable tool for studying the interaction between emotion and attention, since it is sensitive to both emotional (e.g., Chapman, McCrary, Chapman, & Martin, 1980) and attentional manipulations (e.g., Kissler, Herbert, Winkler, & Junghöfer, 2009), and it provides information about the time course and extent of cerebral processing beyond behavioral data.

For visual stimuli like faces and pictures, ERP differences between emotional and neutral information usually appear from the first 100 ms after stimulus onset (Eimer & Holmes, 2002; Frühholz, Jellinghaus, & Herrmann, 2011). However, these early effects are not so evident for words. Although the traditional view suggests that the emotional content of words is extracted at around 400 ms (Kutas & Federmeier, 2000; Posner, Abdullaev, McCandliss, & Sereno, 1999), several studies have reported larger amplitudes to emotional than to neutral words in ERP components at around 100 ms, even when participants do not explicitly attend to the affective content of the stimuli (Bernat, Bunce, & Shevrin, 2001; Ortigue et al., 2004; Sass et al., 2010; Skrandies, 1998). ERP effects have been more consistently reported in components such as the early posterior negativity (EPN), a negative wave generated over the extra-striate cortex peaking at around 250 ms after stimulus onset, and the late positive complex (LPC), a positivity belonging to the P300 family; both components usually show larger amplitudes for emotional than for neutral words (Carretié et al., 2008; Herbert, Junghöfer, & Kissler, 2008; Hinojosa, Carretié, Valcárcel, Méndez-Bértolo, & Pozo, 2009; Kanske & Kotz, 2007; Kissler et al., 2009; for a review, see Citron, 2012, and Kissler et al., 2006).

Nevertheless, some inconsistencies are apparent concerning the emotional modulation of these ERP components when the level of attention to the emotional stimuli is considered. The EPN has been found to be enhanced for emotional words even when emotion is irrelevant to the task, suggesting that this component may reflect early automatic emotional processing (Kissler et al., 2009). However, these results were refuted in later studies, which showed that EPN amplitudes were sensitive to the emotional content of words only when they were deeply processed (i.e., in lexico-semantic but not in perceptual discrimination tasks), and it was concluded that the emotional modulation of EPN is not as automatic as previously suggested (Bayer, Sommer, & Schacht, 2012; Hinojosa, Méndez-Bértolo, & Pozo, 2010). Hinojosa et al. (2010) also found that the enhanced amplitudes of LPC for emotional relative to neutral words only appear during deeper processing of words. Therefore, ERPs seem to be more emotionally modulated as the level of attention to the valence increases, although it seems necessary to confirm this idea and also to test the hypothesis of early automatic processing of the emotional content of words.

One way to shed light on this controversy may be to compare two paradigms that differ in the degree of attention to emotional words: the emotional Stroop task and the emotional categorization task.

The emotional Stroop task (EST) has been used extensively to study the processing of emotional information when this dimension is unnecessary for task performance (Wells & Matthews, 1996; Williams, Mathews, & MacLeod, 1996). The EST is a modified version of the classical Stroop task (Stroop, 1935), in which participants are required to name the color of printed words with different emotional content while ignoring their meaning. Given that extensive practice makes reading an involuntary act, words and their emotional content are processed even though they are task-irrelevant, as demonstrated by the longer reaction times (interference) when responding to the color of emotional words. Two effects were proposed to be at the root of this interference: a “fast effect”—which refers to an increased capture of attention by emotional information (mainly threatening), leaving a reduced amount of processing resources for naming the colors—and a “slow” or “carryover” effect—related to the greater difficulty in disengaging attention from emotional words than from neutral ones, thus causing a slowing down in naming the color of the subsequent words in the series. The emotional interference increases when words are presented in blocks of the same valence, since both fast and slow effects are summed together (McKenna & Sharma, 2004; Phaf & Kan, 2007; Waters, Sayette, Franken, & Schwartz, 2005).

Emotional Stroop interference has mainly been observed in clinical populations in response to pathology-related stimuli (see Williams et al., 1996, for a review). In nonclinical populations, the size of the effect seems to be diminished and differences between emotional categories are not always significant. On combining the EST with the ERP technique in healthy populations, the most consistent effects are also observed in components such as the early posterior negativity (Franken, Gootjes, & van Strien, 2009) and the P3 and LPC (Franken et al., 2009; Gootjes, Coppens, Zwaan, Franken, & Van Strien, 2011), suggesting sustained attention to irrelevant affective words. Nevertheless, some authors have found that emotional words induce increased amplitudes of early components like P2 (Thomas, Johnstone, & Gonsalvez, 2007), indicating early and automatic processing of irrelevant emotional information.

When the emotional content of the words is task relevant, that is, in emotional categorization tasks (ECT), categorization of the emotional words is assumed to be faster than categorization of neutral words, because the stimulus valence is relevant and disengagement of attention from the emotional content is not necessary (Estes & Verges, 2008; Fischler & Bradley, 2006). It has also been suggested that facilitation of categorization will be more important for negative than for positive words because of the greater biological relevance of aversive than of appetitive stimuli (Estes & Verges, 2008). Larger P2 and P3 or LPC amplitudes have been observed for emotional than for neutral words in ERP studies (Fischler & Bradley, 2006; Herbert, Kissler, Junghöfer, Peyk, & Rockstroh, 2006; Naumann, Bartussek, Diedrich, & Laufer, 1992; Naumann, Maier, Diedrich, Becker, & Bartussek, 1997; Schapkin, Gusev, & Kuhl, 2000). Differences in ERPs associated with the categorization of positive versus negative words have also been observed in components such as P2 and LPC (Herbert et al., 2006; Schapkin et al., 2000), although these differences are not very consistent.

Two previous studies have directly compared the EST with the ECT, and have obtained discrepant results. Thomas et al. (2007) observed that threatening words were accompanied by larger P2 (in the right hemisphere), only in the EST, and larger P3 amplitudes, particularly in the ECT. Emotion did not appear to affect the amplitude of the N2 component. Frühholz et al. (2011) reported effects of emotion on EPN (larger amplitudes for negative words in the left hemisphere), only in the ECT, and also enhanced LPC amplitudes for negative relative to neutral words across tasks. Both studies found task-related effects for reaction times (longer for the ECT than the EST), but no emotional effects on this behavioral index.

Thus, although there is evidence for preferential processing of affective rather than neutral words, the large variability in results (concerning RTs, the ERP components that are sensitive to emotional content, and the effects of attentional manipulations) makes it difficult to reach clear conclusions about the processing of emotional words. In this context, it appears necessary to study how the relevance of the emotional content of words affects RTs and ERPs and also to clarify whether the emotional effects are due to the valence or arousal dimensions. Moreover, the application of the EST plus ERP recording in a healthy population remains limited, and usually the positive category is omitted. The inclusion of positive words could provide interesting insights into the processing of emotional valence. Most previous studies have used undergraduate students as participants, which may produce bias and prevent generalization of the results obtained (Peterson, 2001).

In an attempt to resolve these questions, we recorded ERPs and RTs in a large sample of healthy middle-aged women during processing of negative, neutral and positive words when the emotional content of the words was task-irrelevant (EST) and task-relevant (ECT).

Regarding RTs, we hypothesized faster responses in the EST than in the ECT. Given the reported interference from emotional content, we expected faster RTs for neutral than for emotional words in the EST. We also expected that negative nouns would create more interference (longer RTs) than positive nouns because of the biological relevance of negative information. However, given that in the ECT the emotional content is task relevant, facilitated categorization should be reflected in faster RTs for negative words, followed by positive words, and finally by neutral words.

Regarding ERPs, and taking into account previous results, we did not expect emotional modulation in ERP components earlier than 250 ms after stimulus onset, such as N1 or P2. For later components (the EPN and components belonging to the P3 family), we expected larger amplitudes for negative than for positive and neutral words in the EST and larger amplitudes for emotional than for neutral information in the ECT. We also expected more pronounced emotional effects when words are deeply processed (i.e., in the ECT).

Materials and method

Participants

A group of 57 healthy middle-aged women (mean age = 45.3 years old; range = 25–65 years, SD = 9.4) were recruited by placing adverts in community centers in the city of Santiago de Compostela.

Participants were asked to abstain from consuming alcohol, coffee, tea, or tobacco for at least 4 h before the experiment. The mean years of education completed by the participants was 14.8 (SD = 4.5). All had normal or corrected-to-normal vision, and 93 % of them were right-handed (mean score = 85, SD = 40), according to a handedness questionnaire (Oldfield, 1971). At the end of the session, the participants completed the Spanish-validated version of the Beck Depression Inventory (Beck, Rush, Shaw, & Emery, 1979; Sanz & Vázquez, 1998); the mean score was 5.7 (SD = 5.7), which is interpreted as minimal depression.

Participants signed an informed consent form prior to the experiment and were paid €30 for their participation. The University of Santiago de Compostela Ethics Committee approved this research project prior to the study.

Stimuli

We used the circumplex model (Russell, 1980) to select and classify 60 words into three groups of 20 words each, according to their characteristics of valence and arousal, as follows:

  • Negative words: ratings between 1.6–3.0 in valence and 4.7–7.6 in arousal

  • Neutral words: ratings between 4.5–5.3 in valence and 4.2–5.1 in arousal

  • Positive words: ratings between 6.8–8.5 in valence and 5.6–7.6 in arousal
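The selection windows listed above can be expressed as a simple lookup. The following is a minimal sketch, not the procedure actually used to build the stimulus set; the names `RANGES` and `classify_word` are hypothetical, and the ratings are assumed to be on the 1 to 9 scales of the normative database.

```python
from typing import Optional

# Rating windows copied from the list above; the structure and names are illustrative.
RANGES = {
    "negative": {"valence": (1.6, 3.0), "arousal": (4.7, 7.6)},
    "neutral": {"valence": (4.5, 5.3), "arousal": (4.2, 5.1)},
    "positive": {"valence": (6.8, 8.5), "arousal": (5.6, 7.6)},
}


def classify_word(valence: float, arousal: float) -> Optional[str]:
    """Return the category whose valence and arousal windows both contain the ratings."""
    for category, window in RANGES.items():
        v_lo, v_hi = window["valence"]
        a_lo, a_hi = window["arousal"]
        if v_lo <= valence <= v_hi and a_lo <= arousal <= a_hi:
            return category
    return None  # the word falls outside all three windows
```

Because the three valence windows do not overlap, a word can match at most one category; a word that falls between the windows (for example, a valence of 4.0) is rejected.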

Most of the words were obtained from the Spanish adaptation of the Affective Norms for English Words (ANEW; Redondo, Fraga, Padrón, & Comesaña, 2007). Given that some negative words (related to emotional and physical distress) were not included in the ANEW, we performed a pilot study with undergraduate students to obtain valence and arousal values for these words. The words used were nouns, and for all of them several objective indices such as grammatical category, frequency of usage, length, imageability, familiarity, number of syllables, and concreteness were controlled using the Spanish LEXESP and BPAL databases (Davis & Perea, 2005; Sebastián-Gallés, Martí, Carreiras, & Cuetos, 2000). One-way analysis of variance (ANOVA) for the three categories revealed differences only in valence [F(2, 59) = 1,017.9, p < .001] and arousal [F(2, 59) = 71.47, p < .001] from among all of the indices considered. Post-hoc comparisons (corrected with the Games–Howell procedure) revealed differences in valence between the three categories (p < .001 in all the cases) and in arousal between the neutral and positive (p < .001) and the neutral and negative (p < .001) words (see Table 1). The words are listed in the Appendix in Spanish, along with English translations.

Table 1 Values of the controlled variables in the words used

Procedure

Participants sat in a comfortable armchair, at a distance of 1 m from the computer screen, in a darkened room. Words were presented in upper case type (Chicago 100-point font) centered on a black background on a 17-in. screen. The stimulus duration was 500 ms and the interstimulus interval (ISI) was 1,500 ms.

The participants first performed the EST. They were required to press one of three colored buttons according to the color of the words, without paying attention to the emotional content. Each word was presented three times, each time in a different color (red, blue or green), so that a total of 180 trials were presented, in blocks of words of the same valence. The order of the blocks (negative–neutral–positive or positive–neutral–negative) was counterbalanced among participants, whereas the color of the word was randomly distributed. This block design was used because it has been reported that the emotional interference is greater in blocked than in intermixed designs (Waters et al., 2005).
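The blocked trial structure described above (20 words per valence, each shown in three colors, giving 180 trials) can be sketched as follows. This is an illustration only: the function name `build_est_trials`, the placeholder word labels, and the within-block shuffling of word order are assumptions, since the text states only that color was randomly distributed.

```python
import random

COLORS = ("red", "blue", "green")


def build_est_trials(words_by_valence, block_order, seed=0):
    """Return a list of (word, color, valence) trials, blocked by valence."""
    rng = random.Random(seed)
    trials = []
    for valence in block_order:
        # Every word appears once in each color within its valence block.
        block = [(word, color, valence)
                 for word in words_by_valence[valence]
                 for color in COLORS]
        rng.shuffle(block)  # assumed: random word/color order inside a block
        trials.extend(block)
    return trials


# Placeholder labels; the real stimuli are listed in the Appendix.
words = {v: [f"{v}_{i:02d}" for i in range(20)]
         for v in ("negative", "neutral", "positive")}
trials = build_est_trials(words, ("negative", "neutral", "positive"))
```

Counterbalancing across participants would then amount to passing the reversed `block_order` for half of the sample.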

The participants then performed the ECT. In this case, the 60 words printed in the three different colors were presented at random, so that a total of 180 trials were presented in the same order for all the participants. The participants were asked to press a button (labeled “+,” “0,” and “−” for positive, neutral, and negative content, respectively) classifying the emotional valence of the word. To ensure that the participants understood the requirements of the task, they were asked to complete ten training trials prior to each task, with different words from those used in the experiment. Responses were emitted with the dominant hand.

At the end of the session, participants filled in Self-Assessment Manikins (Bradley & Lang, 1994), rating each word from 1 to 9 in valence (where 1 = completely sad and 9 = completely happy) and activation (where 1 = completely calm and 9 = completely aroused).

EEG recording and offline processing

The electroencephalogram (EEG) was recorded at 28 scalp electrodes placed at FP1, FPz, FP2, F7, F3, Fz, F4, F8, FC3, FCz, FC4, T3, C3, Cz, C4, T4, CP3, CPz, CP4, T5, P3, Pz, P4, T6, FT7, FT8, O1, and O2 (according to the 10–20 International System), using tin electrodes inserted in an electrode cap (ElectroCap International). The nose tip was set as reference, and Fz as ground. Three additional Ag/AgCl electrodes placed 1 cm above, below, and in the outer canthus of the right eye were used to monitor ocular movements. The electrode impedances were maintained below 5 kΩ. EEG signals were continuously amplified 10,000 times, digitized at 500 Hz, and filtered online with a 0.1- to 100-Hz bandpass filter and a 50-Hz notch filter (using a Synamp Neuroscan amplifier).

The EEG data were analyzed with Brain Vision Analyzer 1.05 software. Bad channels were replaced by linear interpolation. The EEG data were then digitally filtered with a linear 0.1- to 30-Hz bandpass filter. For correctly responded trials only, 1,500-ms segments were extracted from 200 ms prestimulus to 1,300 ms poststimulus. Baseline correction was applied. EEG epochs showing significant eye movements or muscular artifacts were manually excluded from the data. To correct for minor vertical and horizontal ocular artifacts, an eye movement artifact correction procedure was applied to the EEG segments (Gratton, Coles, & Donchin, 1983). The accepted segments were averaged; in the EST, the mean numbers of segments were 45.1 (SD = 11.17) for negative words, 48.5 (SD = 11.7) for neutral words, and 46.6 (SD = 11.9) for positive words, and in the ECT, the mean numbers of segments were 44.4 (SD = 9.2) for negative words, 41.8 (SD = 10) for neutral words, and 43.8 (SD = 10.7) for positive words. A minimum of 20 epochs was required to obtain the average for each emotional category in each participant. We found no significant differences in the numbers of accepted epochs among the valence categories.
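The segmentation step can be illustrated with a short NumPy sketch. It is not the Brain Vision Analyzer pipeline: the function names are hypothetical, and the baseline is assumed to be the mean of the 200-ms prestimulus interval, which the text does not specify. At 500 Hz, a 1,500-ms segment corresponds to 750 samples.

```python
import numpy as np

SFREQ = 500                  # sampling rate in Hz, from the recording description
PRE_MS, POST_MS = 200, 1300  # segment limits relative to stimulus onset
MIN_EPOCHS = 20              # minimum number of accepted epochs per average


def ms_to_samples(ms, sfreq=SFREQ):
    return int(round(ms * sfreq / 1000))


def extract_epochs(eeg, onsets):
    """Cut baseline-corrected epochs from continuous data.

    eeg: array of shape (n_channels, n_samples).
    onsets: stimulus-onset sample indices.
    Returns an array of shape (n_epochs, n_channels, n_times).
    """
    pre, post = ms_to_samples(PRE_MS), ms_to_samples(POST_MS)
    epochs = []
    for onset in onsets:
        if onset - pre < 0 or onset + post > eeg.shape[1]:
            continue  # skip events too close to the edges of the recording
        seg = eeg[:, onset - pre:onset + post].astype(float)
        # Assumed baseline: subtract the mean of the prestimulus interval.
        seg -= seg[:, :pre].mean(axis=1, keepdims=True)
        epochs.append(seg)
    if not epochs:
        return np.empty((0, eeg.shape[0], pre + post))
    return np.stack(epochs)


def average_if_enough(epochs, min_epochs=MIN_EPOCHS):
    """Average the epochs, or return None if fewer than min_epochs are available."""
    return epochs.mean(axis=0) if len(epochs) >= min_epochs else None
```

Artifact rejection and ocular correction are deliberately left out of this sketch; in practice they would be applied before averaging.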

Nine participants had noisy EEGs in the EST (mainly caused by ocular artifacts) and were excluded from the subsequent ERP analyses. The ECT was more difficult for the participants, and several of them did not respond to a sufficient number of trials (between 300 and 1,300 ms after stimulus presentation), or the ERP waveforms were contaminated by a large number of ocular artifacts. For these reasons, 18 participants with poor signal/noise ratios were excluded from the subsequent ERP analyses. Finally, 37 participants completed both tasks, and the corresponding data were included in the ERP analyses. Figure 1 shows the grand-mean ERP waveforms for the three emotional categories in each task.

Principal components analysis

To disentangle temporally overlapping components, the averaged ERPs were submitted to a sequential temporospatial principal components analysis (PCA). The PCA is used in data exploration to detect features that might be overlooked on visual inspection of the ERPs. The information is decomposed, thus simplifying the analysis and description of complex data (Dien & Frishkoff, 2005). This technique yields a number of extracted temporospatial factors free from the influences of nearby or underlying components and enables elimination of subjective influences in the identification of ERP peaks.

Temporal PCA (tPCA) based on the covariance matrix was first applied to the data. The tPCA used all time points as variables, and each participant, task, condition and electrode as an observation. This approach provides one matrix with the factor loadings and another with the factor scores; the former describes the temporal course of the factor, and the latter provides a value for each observation (Foti et al., 2009). Ten temporal factors that explained most of the variance (92 %) were retained using the scree test (Cattell, 1966), and these were then submitted to Promax rotation (Dien, Beal, & Berg, 2005; see Fig. 2). The tPCA reduced the number of temporal dimensions from 750 to 10. Spatial PCA (sPCA) was then performed for each of the temporal factors, using the covariance matrix. The sPCA used the electrode sites as variables, and each participant, task, and condition as an observation. After averaging the resulting scree plots, three spatial factors were extracted for each temporal factor and submitted to Infomax rotation. The sPCA procedure reduced the number of spatial dimensions from 28 (the number of scalp electrodes) to 3. A single factor score was then obtained for each participant, task, condition, and temporospatial factor. The factor scores are transformed measures of the original voltage and can be used as a measure of the amplitude; higher negative or positive factor scores represent larger negative or positive amplitudes. The factor scores were used in the statistical analysis of the electrophysiological data.
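The two-step decomposition can be sketched with plain NumPy on synthetic data. This is a simplified illustration rather than the ERP PCA Toolkit procedure: it uses unrotated covariance-based PCA (the Promax and Infomax rotations are omitted), the function name `covariance_pca` is hypothetical, and the spatial step is assumed to treat the 28 electrodes as variables, consistent with the reduction from 28 to 3 spatial dimensions.

```python
import numpy as np


def covariance_pca(data, n_factors):
    """Unrotated PCA on the covariance matrix.

    data: array of shape (n_observations, n_variables).
    Returns eigenvector loadings (n_variables, n_factors), scores
    (n_observations, n_factors), and the proportion of variance per factor.
    """
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_factors]  # keep the largest ones
    loadings = eigvecs[:, order]
    scores = centered @ loadings
    explained = eigvals[order] / eigvals.sum()
    return loadings, scores, explained


# Synthetic stand-in for the averaged ERPs: (cells, electrodes, time points),
# where a cell is one participant x task x condition combination (37 x 2 x 3).
rng = np.random.default_rng(0)
n_cells, n_elec, n_times = 222, 28, 750
erps = rng.standard_normal((n_cells, n_elec, n_times))

# Temporal step: time points are variables; each cell x electrode is an observation.
t_load, t_scores, t_var = covariance_pca(erps.reshape(-1, n_times), n_factors=10)
t_scores = t_scores.reshape(n_cells, n_elec, 10)

# Spatial step, run separately for each temporal factor: electrodes are variables.
ts_scores = np.stack(
    [covariance_pca(t_scores[:, :, k], n_factors=3)[1] for k in range(10)],
    axis=1,
)  # shape: (cells, temporal factors, spatial factors)
```

Each entry of `ts_scores` plays the role of the single factor score per participant, task, condition, and temporospatial factor that enters the statistical analysis.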

Fig. 1 Grand-mean event-related potential (ERP) plots comparing the waveforms of the three emotional categories in each task (upper row), and comparing each emotional category in both tasks (lower row)

We looked for any correspondence between the temporospatial factors and the deflections in the grand-average waveforms and selected the factors that corresponded to a peak and accounted for at least 2 % of the variance. Temporal Factor 6 Spatial Factor 1 (TF6SF1), a factor peaking at 136 ms, was identified as N1 on the basis of its latency and occipital topography. TF9SF1, a positive factor peaking at 204 ms with a parietal topography, was identified as P2. TF8SF1 was a negative factor peaking at 290 ms with a centro-parietal topography; although this factor had characteristics similar to those of the EPN, given its centro-parietal topography, it was labeled N2. Finally, TF1SF1, TF5SF1, and TF2SF1 were considered together as part of the LPC, on the basis of their positive polarity, parietal topography, and peak latencies (378, 536, and 702 ms, respectively).

The PCA was performed using the ERP PCA Toolkit (Dien, 2010).

Statistics

To analyze behavioral data, repeated measures ANOVAs for reaction times (RTs) and for the numbers of correct responses were applied to the whole sample (n = 57), with Valence (positive, neutral, negative) and Task (EST and ECT) as within-subjects factors. Correlation analyses were also used to determine any relationships between subjective ratings of valence and activation (obtained in the SAM) and RTs.

For the electrophysiological data, repeated measures ANOVAs with Valence (positive, neutral, negative) and Task (EST and ECT) as within-subjects factors were performed for each temporospatial factor (only for the 37 participants that produced good ERP data in both tasks).

Greenhouse–Geisser correction was applied to adjust the degrees of freedom of the F ratios for violations of the sphericity assumption. Uncorrected degrees of freedom are presented together with the corrected p and ε values for cases in which the assumption of sphericity was violated. Bonferroni correction was applied to all pairwise comparisons in the post-hoc tests. Partial eta-squared values (ηp²) were computed to estimate the effect sizes of the significant differences.
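As a reading aid, partial eta-squared can be recovered from any reported F ratio and its degrees of freedom through the identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below also shows the Bonferroni adjustment in its simplest form; the function names are hypothetical, and this is not the SPSS implementation.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta-squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

For example, `partial_eta_squared(196.16, 1, 56)` rounds to .778, which matches the value reported for the main effect of task on RTs.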

The statistical analyses were performed using IBM SPSS Statistics 20 software.

Results

Behavioral data and self-assessment of words

The ANOVA revealed significant main effects of task [F(1, 56) = 196.16, p < .001, ηp² = .778], and valence [F(2, 112) = 6.44, p = .002, ηp² = .103], as well as a Task × Valence interaction [F(2, 112) = 15.37, p < .001, ηp² = .215] for the RT values.

The EST was associated with shorter RTs than was the ECT (Table 2). Although valence had a main effect, the Task × Valence interaction revealed important differences between the two tasks. Post-hoc comparisons showed that in the EST, neutral words had shorter RTs than did emotionally negative (p < .001) and positive (p = .001) words; in the ECT, positive words had shorter RTs than did negative (p = .001) and neutral (p < .001) words, with no difference between the negative and neutral categories. Identical RT results were obtained for the subsample of 37 participants.

Table 2 Percentages of correct responses and reaction times (RTs) for both tasks, and mean ratings on the Self-Assessment Manikin (SAM) in valence and arousal for each emotional category

For the number of correct responses, the ANOVA revealed a significant main effect of task [F(1, 56) = 36.66, p < .001, ηp² = .395] and a Task × Valence interaction [F(2, 112) = 6.29, p = .003, ηp² = .101]. Post-hoc comparisons showed a larger number of correct responses for neutral than for negative (p = .048) and positive (p = .028) words in the EST, and more correct responses for negative than for positive (p = .04) and neutral (p = .002) words in the ECT. The ECT yielded fewer correct responses, mainly in the neutral category.

The subjective ratings of valence and arousal for each word category obtained by the SAM (Table 2) were similar to the values obtained in the pilot study (see Table 1). When the SAM ratings were correlated with RTs, the Pearson’s r coefficients were significant for subjectively assessed arousal and RTs in the EST (r = .282, p < .05) and for valence and RTs in the ECT (r = –.285, p < .05). Thus, the most arousing words were associated with longer RTs in the Stroop task, and the most positive words had the shortest RTs in the categorization task.

Electrophysiological data

N1 (TF6SF1) No significant main effect of task or valence, nor a Task × Valence interaction, was observed for the factor identified as N1 (see Fig. 3a).

Fig. 2 Factor loadings, in microvolt scaling, for each temporal factor (TF). The polarity of the factor loadings of components with negative maximum peaks was inverted to simplify the interpretation. Bold lines represent the TFs selected for the statistical analyses

Fig. 3 Topographies and factor loadings, in microvolt scaling, for each of the selected temporospatial principal component analysis factors, in each emotional category (negative, neutral, and positive) and/or task (emotional Stroop task [EST] and emotional categorization task [ECT])

P2 (TF9SF1) A main effect of valence [F(2, 72) = 6.16, p = .003, ηp² = .146] was observed: Negative words had significantly larger P2 factor scores than did positive (p = .028) and neutral (p = .006) words (see Fig. 3b). No other significant effects or interactions emerged.

N2 (TF8SF1) The ANOVA revealed a main effect of valence [F(2, 72) = 3.96, p = .023, ηp² = .099], although this effect was modulated by a significant Task × Valence interaction [F(2, 72) = 5.32, p = .007, ηp² = .128]. Post-hoc comparisons showed less-negative N2 mean factor scores for neutral than for positive (p = .007) and negative (p = .001) words, only in the ECT (see Fig. 3c).

LPC The following results were obtained for the three temporospatial factors identified with the LPC.

First, for TF1SF1, task had a significant main effect [F(1, 36) = 42.27, p < .001, ηp² = .540], with enhanced factor scores in the EST relative to the ECT (see Fig. 3d).

In addition, for TF5SF1, valence had a main effect [F(2, 72) = 6.19, p = .003, ηp² = .147], and the Task × Valence interaction was also significant [F(2, 72) = 8.07, p = .001, ηp² = .183]. Post-hoc comparisons revealed emotional differences only in the ECT, with lower factor scores for neutral than for negative (p = .002) and positive (p < .001) words (see Fig. 3e).

Finally, for TF2SF1, task had a main effect [F(1, 36) = 8.4, p = .003, ηp² = .189], with enhanced factor scores in the ECT (see Fig. 3f).

Discussion

In this study, we attempted to investigate how the emotional content of words modulates cerebral functioning and behavior when the content is task-irrelevant or task-relevant. To this end, we recorded ERPs and RTs from a sample of healthy middle-aged women while they performed an emotional Stroop task (EST) and an emotional categorization task (ECT) with positive, negative, and neutral words. The electrophysiological and behavioral results suggest that emotional words are processed preferentially even when the emotional content is task-irrelevant. In the EST, early automatic processing of emotional nouns was accompanied by an interference effect on performance. When the affective content of words was task-relevant, early and late ERPs revealed greater allocation of attentional resources to the processing of affective words, which resulted in shorter classification times, especially for positive words.

The reaction time data obtained from the EST showed that participants took longer to respond to the color of emotional words than to the color of neutral words. Although this is a classical finding in clinical populations, especially for negative words in samples with affective disorders (Williams et al., 1996), it has been reported less frequently in nonclinical populations (Pérez-Edgar & Fox, 2003). In this study, we observed emotional interference effects in a large sample of depression-free women. The blocked stimulus presentation used in this task may have enhanced the interference, since the valence of a word may slow down the color-naming reaction times of subsequent words (Waters et al., 2005). Since gender differences have been described in the processing of emotional information (e.g., Montagne, Kessels, Frigerio, de Haan, & Perrett, 2005), the use of a gender-homogeneous sample may also be a relevant factor in the observation of the emotional Stroop effect. We also observed emotional interference for positive information. As far as we know, this has not been found in previous studies, although these have chiefly involved clinical samples and personally threatening words, and have overlooked the positive category. This emotional interference in positive words can be explained by the verbal nature of the stimuli, for which bias toward positive information has been described (Kissler et al., 2006).

It has been suggested that the so-called “emotional interference” in previous research using the EST may be partly due to differences between emotional categories in lexical variables other than valence or arousal, such as frequency of use or length (Larsen, Mercer, & Balota, 2006). To overcome this criticism, we used well-controlled stimuli, which were matched on lexical features known to influence word recognition and which only differed in valence and arousal. Since more arousing words (negative and positive) yielded the longest reaction times in the EST, our data suggest that emotional interference can be partly explained by the arousal dimension, as previously suggested (Dresler, Mériau, Heekeren, & van der Meer, 2009). To shed some light on this question, we studied the relationship between the subjective ratings of valence and arousal (by SAM) and reaction times, and we found that the more arousing words were associated with longer response times in the EST. Clearly, further research is needed to explain the interference effect found in the emotional Stroop.

When the emotional dimension was task-relevant (that is, in the ECT), the RTs were longer for neutral and negative words than for positive words. Moreover, subjective ratings of valence were negatively correlated with reaction times (i.e., more positive words had shorter RTs). The shorter mean RTs for emotional versus neutral words suggest that the affective load facilitates their processing, speeding up the categorization. The classification of affective words was also accompanied by a higher number of correct responses. It must be noted that, depending on personal experiences, participants may classify some words in a different emotional category from that established by the reference values of valence. Although this is not a categorization error, these disparities were not considered as correct responses in our study.

Following Estes and Verges (2008), we hypothesized that negative words should have shorter RTs than neutral and positive words when they are evaluated in the affective dimension, because negative information is more relevant and participants do not have to disengage their attention from the valence to respond appropriately. Nevertheless, the fact that negative words were associated with longer RTs than positive words does not support this hypothesis. We propose two explanations for these discrepancies, which are based on the sample and the type of negative stimuli used. The slowing down in the categorization of negative words may be understood considering that women usually present less dynamic, defensive responses to threatening stimuli than men do (Taylor et al., 2000). Moreover, Estes and Verges used words like “grenade,” “shark,” and “spider,” which may produce a fight-or-flight response, whereas we used words like “fatigue,” “dizziness,” and “strain” (see the Appendix), which describe situations in which fight-or-flight defensive responses are less appropriate. Our findings may be consistent with an explanation of the slowing down in terms of a “freezing” response or temporary inhibition of motor activity (Algom, Chajut, & Lev, 2004). We suggest that future studies should take into account the type of negative stimuli used, because the stimuli may cause different defensive responses. Also, it could be interesting to consider the effects of stimulus repetition. Although there is no clear evidence about these possible effects, it seems that they are more evident with more than six repetitions (see Kissler et al., 2006), that is, with a higher number of repetitions than that used in the present report.

When behavioral performance was compared between the two tasks, the RTs for the color categorization were shorter than those for the emotional categorization, reinforcing the idea that perceptual characteristics are easier to classify than semantic characteristics (Frühholz et al., 2011; Thomas et al., 2007). This conclusion is consistent with the lower number of correct responses observed in the ECT.

In the ERP data, and contrary to our hypothesis, the first observable difference among valence categories appeared in P2, a parietal positivity peaking at around 200 ms. This difference indicates that stimuli with negative content activate more neural resources at relatively low-level processing stages, especially in comparison with neutral stimuli. Although the literature contains discrepancies regarding the early modulation of ERPs by emotional words, some previous studies also suggest that affective content may be automatically discriminated at relatively early stages of processing, showing emotional modulation of the P2 component (Gootjes et al., 2011; Herbert et al., 2006; Kanske, Plitschka, & Kotz, 2011; Thomas et al., 2007; Trauer, Andersen, Kotz, & Muller, 2012). We found that this early brain modulation by emotion was independent of the degree of attention directed to the word content.

Although previous research on the processing of emotional words has described an occipital negativity termed the early posterior negativity (EPN; Frühholz et al., 2011; Herbert et al., 2008; Hinojosa et al., 2010; Kissler et al., 2009; Schacht & Sommer, 2009; Scott, O’Donnell, Leuthold, & Sereno, 2009), the temporospatial PCA performed in this study identified a negative factor (N2) with its maximum over centro-parietal regions (see Fig. 3c). Despite this topographical discrepancy (possibly due to the paradigm used), other characteristics indicate that this negativity is similar to the EPN: Both have a similar latency range and are similarly affected by manipulations of attention and emotional content. We found that N2 was enhanced for negative and positive words relative to neutral words, suggesting that the modulation may be explained by the arousal level of the stimuli, and not only by valence, as has been described for the EPN (Bayer et al., 2012; Herbert et al., 2008; Schacht & Sommer, 2009). We also found that the emotional modulation of N2 appeared only with deep processing of words (in the ECT), as reported for the EPN (Bayer et al., 2012; Hinojosa et al., 2010). This negativity seems to reflect rudimentary semantic stimulus classification (Kissler et al., 2009), as the N2 amplitude for neutral words is flattened relative to the amplitude for emotional words. Contrary to our hypothesis, we did not find any emotional modulation of N2 when participants were not required to attend to the emotional content of the words (in the EST).

Finally, three of the temporospatial factors extracted with the PCA procedure were identified as the LPC, taking into account their topography, polarity, and latency range. The earliest of these factors (TF1SF1; see Fig. 3d) was observable only in the EST. Since late positivities are evoked once the stimulus has been evaluated (Kok, 2001), the presence of this factor, peaking at 378 ms, only in the EST supports the idea that the evaluation of color occurs faster than the evaluation of emotional content, as was also found for RTs. Because this factor did not differentiate between emotional categories, and because the LPC, as part of the P3 family, has been associated with resource allocation (e.g., Luck, 2005), the data suggest that individuals do not invest additional controlled resources in processing emotion when this dimension is task-irrelevant. Thus, contrary to our expectations, we did not find emotional modulation of late ERP components in the EST.

The other two factors identified as the LPC (peaking at 536 and 702 ms, respectively) were observable only in the ECT. Only the first of these factors differentiated the emotional categories, with enhanced factor scores for negative and positive words relative to neutral words. This is consistent with the findings of previous studies that observed larger amplitudes at similar latencies in the categorization of emotional versus neutral words (Fischler & Bradley, 2006; Herbert et al., 2006). This finding also constitutes evidence of increased allocation of resources to affective words, both positive and negative, which would facilitate processing of the words. The ERP results are partly consistent with the RT data. Positive nouns were categorized faster than neutral nouns. The slowing of RTs when categorizing negative words may be explained by the motor inhibition hypothesis (see above). It would be interesting to use response-locked ERPs, such as the lateralized readiness potential, to study motor preparation and selection in response to negative information. Given that participants in the present study had to respond only with their dominant hand, we could not analyze this component.

In summary, the data suggest automatic processing of emotional (mainly negative) information, reflected in enhanced P2 factor scores in both tasks. When emotional content was task-irrelevant, positive and negative words produced an interference effect (longer RTs). When the affective content was task-relevant, the N2 and LPC factor scores were enhanced for emotional words, although RTs were shorter only for positive words relative to neutral words. Thus, ERPs were more sensitive to the emotional content when it was deeply processed. It seems that after early automatic processing, the emotional content of written stimuli boosts processing at later stages, but only when attention is explicitly directed to this content.