Studies that have investigated the processing of emotional language with event-related potentials (ERPs) have been mainly concerned with the impact of several lexical and/or semantic word features on word comprehension (Bernat, Bunce, & Shevrin, 2001; Herbert, Junghofer, & Kissler, 2008; Hinojosa, Carretié, Valcárcel, Méndez-Bértolo, & Pozo, 2009b; Hinojosa, Méndez-Bértolo, & Pozo, 2010; Hofmann, Kuchinke, Tamm, Võ, & Jacobs, 2009; Kanske & Kotz, 2007; Kissler, Herbert, Peyk, & Junghofer, 2007; Méndez-Bértolo, Pozo, & Hinojosa, 2011; Schacht & Sommer, 2009; Scott, O’Donnell, Leuthold, & Sereno, 2009). An alternative approach has been to explore semantic expectation effects in both neutral (Jiménez-Ortega et al., 2012) and emotional (Delaney-Busch & Kuperberg, 2013; León, Díaz, de Vega, & Hernández, 2010; Moreno & Vázquez, 2011) discourse contexts. At least three different effects have been reported: the early posterior negativity (EPN), the N400, and the late posterior component (LPC), which represent different aspects of semantic and attentional processes (see Citron, 2012, for a review).

A central issue in the domain of language comprehension addresses how the parser establishes agreement relations between sentence constituents by means of several sources of information. Agreement, defined as the “covariation of the inflectional morphology between related words” (Molinaro, Barber, & Carreiras, 2011, p. 908), involves the variation of four main features: number, gender, person, and case (Corbett, 2006; Wechsler, 2009). Notably, few efforts have been devoted to studying how emotional content modulates the processing of the features involved in agreement. The present study aimed to contribute to this effort by exploring the impact of negative content on the processing of gender agreement relations.

Gender distinctions are important for many living beings. Therefore, it is not surprising that languages have developed linguistic gender distinctions that parallel biological distinctions (Corbett, 2001). Corbett (2001) postulated a difference between languages with a semantic gender system and languages with a formal gender system. In the first category are languages such as English or Chinese, in which gender is encoded in linguistic elements only for referents having biological sex. In the second category are, for example, Romance languages such as French, Italian, or Spanish. In these languages, all nouns are marked for gender, either masculine or feminine. Gender thus has important communicative functions, such as the disambiguation of anaphoric structures or deictic reference, which increase sentence cohesion, thereby facilitating discourse comprehension (Urrutia, Domínguez, & Álvarez, 2009; van Berkum, Brown, Zwitserlood, Kooijman, & Hagoort, 2005). Two main sources of information have been proposed as relevant during the gender assignment process: sublexical and lexico-syntactic information (Afonso, Domínguez, Álvarez, & Morales, 2014; Gollan & Frost, 2002). The question remains, however, whether the processing of gender is driven purely by syntactic form (Clifton, Speer, & Abney, 1991; Frazier, 1987) or is influenced by conceptual/semantic factors (MacDonald, Pearlmutter, & Seidenberg, 1994; Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995).

In the ERP domain, the processing of gender information has been studied mainly through gender mismatches. Thus, effects are mainly observed when control conditions are compared to violation conditions. Most studies have focused on either determiner-noun or noun-adjective gender mismatches (see Molinaro et al., 2011, for a review). The main ERP components that have been found for gender agreement errors are the left anterior negativity (LAN) and the P600. The LAN is a negative wave between 250 and 500 ms post-stimulus-onset, which shows larger amplitudes to gender agreement mismatches than to correct sentences. This effect is typically observed at left anterior electrodes (Barber & Carreiras, 2005; Barber, Salillas, & Carreiras, 2004; Deutsch & Bentin, 2001; Gunter, Friederici, & Schriefers, 2000; Hagoort & Brown, 1999), although anterior bilateral effects have been reported in number agreement studies (Hinojosa, Martín-Loeches, Casado, Muñoz, & Rubia, 2003; Kaan, 2002; Leinonen, Brattico, Järvenpää, & Krause, 2008). Two main proposals have been put forward regarding the functional significance of the LAN. A widely accepted view suggests that LAN effects reflect first-pass processes involved in the early detection of agreement mismatches (Friederici, 2002; Molinaro et al., 2011). Under a different account, the LAN reflects some aspects of working memory operations (Kluender & Kutas, 1993; Vos, Gunter, Kolk, & Mulder, 2001). However, several studies have failed to identify LAN effects for gender agreement violations (e.g., Hagoort, 2003; Martín-Loeches, Nigbur, Casado, Hohlfeld, & Sommer, 2006; Wicha, Moreno, & Kutas, 2004), which reveals the need for a more fine-grained account of its functional significance and a better description of the conditions under which it is elicited (see Alemán-Bañón, Fiorentino, & Gabriele, 2012, and Molinaro et al., 2011, for discussions on this issue).

The component that is most consistently reported for agreement mismatches is the P600, a positive shift that typically emerges around 500 ms after stimulus onset and lasts until approximately 900 ms over posterior electrodes (Friederici, 2002; Kuperberg, 2007). It usually shows enhanced amplitudes to grammatical agreement violations as compared to correct agreement relations. However, P600 effects have also been observed during the processing of temporary ambiguities in which syntactic information may help to select between the different options (e.g., Carreiras, Salillas, & Barber, 2004; Kaan, Harris, Gibson, & Holcomb, 2000). Several explanations have been proposed to interpret the P600 component (see Kuperberg, 2007, for a review). According to a serial conception of language processing, the P600 is thought to reflect a reanalysis and revision of the syntactic structure of the sentence (Friederici, 1995). Alternatively, constraint-based models assume that the P600 is an index of the difficulty of syntactic integration once syntactic frames have already been activated and modulated by multiple sources of information (Fiebach, Schlesewsky, & Friederici, 2002; Kaan et al., 2000). Finally, the P600 has been interpreted as a manifestation of the P3b, an ERP wave that is sensitive to the probability and salience of the stimuli of interest (Coulson, King, & Kutas, 1998). Nonetheless, studies that have focused on the processing of gender agreement relations have generally considered that the P600 represents difficulties in integrating the processed constituent with the previous sentence fragment, and that these difficulties invoke sentence reanalysis and repair processes (Alemán-Bañón et al., 2012; Barber & Carreiras, 2005; Molinaro et al., 2011; Xu, Jiang, & Zhou, 2013).

Only a few studies on gender processing have reported behavioral data. The main finding of these studies was that participants were faster and especially more accurate when detecting gender mismatches than when judging correct agreement relations (Hagoort, 2003; Hagoort & Brown, 1999; Martín-Loeches et al., 2006; Xu et al., 2013). Taken together, the behavioral and ERP findings may indicate that the accurate detection of gender agreement errors carries processing costs, both at the early detection of such errors (as reflected in larger LAN amplitudes to agreement violations) and at the later reanalysis of agreement relations (as reflected in larger P600 amplitudes to gender agreement mismatches).

There are some reasons to expect that emotional content may impact the processing of gender agreement relations, even though this question has not been addressed in prior research. Evidence comes from the results of a study by Martín-Loeches and colleagues (2012, Exp. 1). These authors showed that the amplitude of the LAN to violations of number agreement increased during the processing of negative as compared to neutral adjectives, which was interpreted in terms of a disruption of morphosyntactic processing by negative content. In contrast, number agreement errors in positive adjectives elicited decreased LAN amplitudes relative to neutral words, indicating a facilitation of morphosyntactic processing by positive content. Also, number mismatches were associated with enhanced P600 amplitudes, as compared with correct number agreement relations, for both emotional and neutral adjectives. Similarly, incorrect constructions were detected more accurately than correct constructions. The results of this study make plausible the assumption that the processing of other agreement features, including gender, might be affected by emotional information.

Therefore, in the present study we compared behavioral and ERP effects of gender agreement violations in negative and neutral adjectives embedded in short phrases, while participants performed a syntactic judgment task. On the basis of many previous studies (Alemán-Bañón et al., 2012; Barber & Carreiras, 2005; Hagoort, 2003; Molinaro et al., 2011; Xu et al., 2013), we focused on the LAN and P600 components. In particular, we predicted that gender agreement violations should elicit larger LAN and P600 effects than would correct phrases, as well as more accurate responses. Our main aim, however, was to compare the processing of gender agreement violations in negative and neutral words. Since no previous study has directly addressed this question, predictions should be based on the indirect evidence provided by the study of Martín-Loeches et al. (2012). Thus, we would expect that negative content would convey additional processing costs associated with the detection of a gender agreement violation. This would predict larger LAN effects for gender agreement violations in negative as compared to neutral words. No differences in the processing of gender information between negative and neutral words were expected in the P600 or at the behavioral level. However, evidence has suggested that number and gender behave differently. For example, the proportions of number agreement errors in Spanish and English are higher than the proportions of gender agreement errors in both language production and comprehension (Antón-Méndez, Nicol, & Garrett, 2002; Igoa, García-Albea, & Sánchez-Casas, 1999; Nicol & O’Donnell, 1999). These findings have led some authors to argue that differences emerge because number projects its own syntactic phrase, whereas gender, in contrast, is a lexical property of the nouns (Carstens, 2000; Ritter, 1993; but see Picallo, 1991). 
Under this view, the possibility that gender agreement mismatches in negative words would modulate LAN and/or P600 effects in different ways than in the case of number agreement errors cannot be totally ruled out.

Additionally, we also explored the processing of emotional words when the phrasal contexts that preceded emotional target words were neutral, a question that remains controversial. The results of prior studies have suggested that emotional ERP effects are attenuated when emotional words are presented following neutral phrase contexts. In particular, enhanced LPC amplitudes for emotional words have been found whenever the task demanded either a semantic judgment (Bayer, Sommer, & Schacht, 2010) or an explicit emotional categorization (Holt, Lynn, & Kuperberg, 2009). However, emotional ERP modulations were not observed when participants performed a syntactic judgment task (Martín-Loeches et al., 2012). Thus, on the basis of the specific features of our experimental design—in which participants judged the syntactic correctness of phrases—and previous findings, a noticeable reduction of emotion ERP effects would be expected.

Method

Participants

A total of 48 native speakers of Spanish (28 females, 20 males), with ages ranging from 18 to 42 (M = 20.9 years, SD = 5.16), were recruited from the student population of the Complutense University of Madrid. They had no history of neurological or psychiatric impairment. All had normal or corrected-to-normal vision. Participants were right-handed, as assessed with the Edinburgh Handedness Inventory (Oldfield, 1971): LQ > +60. They were volunteers and received course credit for their participation. The data from three participants were excluded from the analyses because of excessive EEG artifacts, as will be explained later.

Stimuli

In the present experiment, the processing of gender agreement was studied in Spanish adjectives that could either agree or disagree in gender with the previous neutral phrase context. A norming procedure was conducted with 240 participants (128 females, 112 males; ages ranging from 17 to 38 years, M = 20.2 years, SD = 4.83) to determine the valence, arousal, and concreteness values of the adjectives used as target stimuli. On a 9-point Likert scale (with 9 being highly pleasant, highly arousing, and highly concrete, respectively), participants rated 262 adjectives in both their feminine and masculine versions (thus, a total of 524 adjectives were assessed). The participants completed one of four lists, each of which comprised 131 words (half of the adjectives were feminine and half were masculine; orthogonally, half of the adjectives were negative and half were neutral). Equal numbers of negative and neutral adjectives were selected according to the following criteria and were contrasted via a 2 × 2 (factors: Emotion and Word Gender) analysis of variance (ANOVA; see Table 1): (a) Negative and neutral adjectives differed in valence and arousal ratings; (b) all adjectives had similar concreteness, word length, and frequency of use (according to LEXESP; Sebastián-Gallés, Martí, Carreiras, & Cuetos, 2000); and (c) feminine negative words and their masculine versions, on the one hand, and feminine neutral adjectives and their masculine versions, on the other, were equated in valence and arousal, concreteness, word length, and frequency of use. Since masculine adjectives tended to show a higher frequency of use than feminine adjectives, only 64 feminine adjectives (32 negative and 32 neutral) and their masculine versions met these restrictive criteria. Table 1 summarizes the mean values for arousal, valence, and concreteness for the adjectives, as well as their mean word frequency and word length.

Table 1 Means and standard deviations (in parentheses) of valence (1 = highly unpleasant, 9 = highly pleasant), arousal (1 = highly calming, 9 = highly arousing), concreteness (1 = highly abstract, 9 = highly concrete), frequency of use (per one million), number of syllables, and number of letters

A total of 128 noun phrases, formed by a determiner (e.g., El [The]), a noun (e.g., camarero [waiter]), and an adjective (e.g., furioso [furious]), were generated. The nouns were always neutral, and the adjectives were the critical words that carried the emotional manipulation. Thus, of the 128 experimental phrases, 64 had negative adjectives and 64 had neutral adjectives. In Spanish, morphological gender can be marked by several suffixes. The /-a/ suffix is mainly associated with the feminine gender and the /-o/ suffix mostly with the masculine gender, although there are exceptions. The experimental phrases were constructed so that all nouns and adjectives ended with the canonical suffixes. All nouns and adjectives had animate referents.

Phrases with a short, three-word length were intentionally chosen for two reasons. First, since the focus of our study was on morphosyntactic processing, we aimed to keep contextual constraints at a minimum. Previous evidence had shown that the amplitude of the P600 is modulated by highly semantically expected words that violated gender agreement relations (Gunter et al., 2000). For this reason, cloze probabilities were measured to make sure that none of the critical adjectives could be predicted on the basis of the preceding context (0 % values). Forty-one participants were asked to read each phrase fragment (determiner and noun) and write down the word that they would generally expect to find completing the fragment. Second, the short length might contribute to minimizing possible confounding effects associated with working memory demands, which have been shown to modulate the LAN component (Kluender & Kutas, 1993).

A second set of incorrect phrases was generated. The gender of the adjectives was manipulated in order to produce disagreement with the nouns, so that feminine nouns were followed by masculine adjectives, and vice versa. In short, the adjectives were manipulated to create four experimental conditions:

  • Gender agreement in negative adjectives

  • Gender agreement in neutral adjectives

  • Gender disagreement in negative adjectives

  • Gender disagreement in neutral adjectives

Four experimental lists were constructed, such that each phrase version was assigned to a list in order to avoid repetition effects. Thus, every list included 32 phrases belonging to each of the four experimental conditions. Half of the adjectives were feminine and half were masculine. Each list was randomly assigned to 12 participants. Phrases were randomized within each list. The assignment of phrases to conditions in each list was counterbalanced across participants.

In addition, a list of 96 filler phrases was introduced. Thirty-two of these fillers included nouns and adjectives with opaque gender (e.g., the word actriz [actress] lacks any explicit morphological mark), and thirty-two fillers included irregular words (e.g., cura [priest] ends with the letter “-a” but is masculine). These fillers were included to prevent participants from using a superficial strategy for solving the task, such as attending just to the suffixes. Finally, thirty-two filler phrases included common adjectives with a neuter suffix, which applies equally to masculine and feminine nouns (e.g., “-e”). This type of filler was necessary because, in the experimental conditions, when a word pair agreed in gender the noun and adjective ended with the same letter (“-a” or “-o”), whereas in the disagreement condition the final letters differed. Thus, in these fillers, nouns and adjectives (e.g., niña triste [girl, feminine; sad, common gender]) agreed in gender without orthographic overlap. Half of the 96 filler phrases had a gender agreement violation, and half had no gender agreement violation. Also, half of the adjectives were negative and half were neutral. Each participant received 224 phrases, half of which agreed and the other half disagreed. Within conditions, masculine, feminine, negative, and neutral words appeared in equal numbers in all possible combinations.

Procedure

Participants were seated comfortably in a darkened, sound-attenuated chamber. The stimuli were presented on a computer monitor that was positioned at eye level about 65 cm in front of the participant. The words were displayed in black lowercase letters (except the first word of each phrase, which began with a capital letter) against a white background.

Participants performed a syntactic judgment task. They had to indicate whether the phrase was well-formed or not by pressing one of two buttons with the middle and the index fingers. The assignment of correct/incorrect buttons to the responding fingers and hands was counterbalanced across participants.

The sequence of events in each trial is described as follows: First, a fixation cross appeared in the center of the screen and remained there for 1,000 ms. This fixation cross was followed by a blank screen interval of 100 ms, and then the phrase was displayed word by word. Each word appeared for 300 ms and was followed by a 300-ms blank interval (SOA = 600 ms). One second after the offset of the last word, a question mark was presented and remained until the participant’s response. During this interval, participants had to indicate the correctness of the just-presented phrase. As in previous research with ERPs (e.g., Martín-Loeches et al., 2012; Taylor-Clarke, Kennett, & Haggard, 2002), this delayed responding procedure was used to avoid contaminating ERPs with motor-related responses. The intertrial interval was 500 ms.
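The timing described above can be laid out as a simple event schedule. The following is a minimal sketch, not the authors' presentation script: the durations come from the text, whereas the function name, the constant names, and the convention that time 0 is fixation onset are assumptions made for illustration.

```python
# Durations in ms, taken from the trial description; all names are illustrative.
FIXATION_MS = 1000
POST_FIXATION_BLANK_MS = 100
WORD_MS = 300
INTER_WORD_BLANK_MS = 300      # word + blank gives the 600-ms SOA
RESPONSE_DELAY_MS = 1000       # from offset of the last word to the question mark


def trial_timeline(words):
    """Return (label, onset_ms, offset_ms) events, with 0 = fixation onset."""
    events = [("fixation", 0, FIXATION_MS)]
    t = FIXATION_MS + POST_FIXATION_BLANK_MS
    for word in words:
        events.append((word, t, t + WORD_MS))
        t += WORD_MS + INTER_WORD_BLANK_MS
    last_word_offset = events[-1][2]
    # The question mark stays on screen until the response, so it has no fixed offset.
    events.append(("?", last_word_offset + RESPONSE_DELAY_MS, None))
    return events


timeline = trial_timeline(["El", "camarero", "furioso"])
for label, onset, offset in timeline:
    print(label, onset, offset)
```

Under these assumptions the critical adjective appears 2,300 ms after fixation onset and the response prompt 1,000 ms after the adjective disappears, which keeps the response outside the 800-ms analysis epoch.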

A training block of ten phrases (other than those presented in the experimental session, five correct and five incorrect) was provided at the beginning of the session. Participants were asked to avoid eye movements and blinks during the reading of the phrases. Each session lasted approximately 1 h.

EEG recording, preprocessing and analysis

Electroencephalogram (EEG) activity was recorded from 62 electrode locations mounted in an electrode cap (Quick-Cap, Neuroscan, Inc., USA), arranged according to the International 10–20 system (American Electroencephalographic Society, 1991). All electrodes were referenced to the linked mastoids. Bipolar horizontal and vertical electrooculograms (EOGs) were also recorded to monitor eye movements and blinks. Electrode impedances were kept below 5 kΩ. Recordings were amplified using Neuroscan SynAmps amplifiers, continuously digitized at a sample rate of 1000 Hz, and filtered online with a frequency band-pass of 0.01–100 Hz.

Data were processed offline using the EEGLAB v.12.01 toolbox (Delorme & Makeig, 2004) implemented in MATLAB (Mathworks, Inc.). Recordings were down-sampled to 500 Hz and filtered between 0.3 and 25 Hz using a basic FIR filter (12 dB/oct. roll-off). The continuous EEG was epoched from 200 ms before to 800 ms after the presentation of the adjectives, which were the critical words in this study. Baseline correction was made using the 200-ms period prior to stimulus onset. Only correct-response epochs were further analyzed. Prior to artifact correction procedures, epochs in which recordings at any channel exceeded ±150 μV were discarded. Independent component analysis (ICA) was then used to remove ocular and other artifacts from individual EEG data sets (see a description of this procedure and its advantages over traditional regression/covariance methods in Jung et al., 2000). Artifact-related independent components (ICs) were first identified using ADJUST (Mognon, Jovicich, Bruzzone, & Buiatti, 2011) and then carefully visually inspected. ADJUST is an automatic, validated method for detecting artifacted ICs on the basis of the simultaneous use of spatial and temporal features (for further details, see Mognon et al., 2011). After the ICA-based removal process, visual inspection of individual EEG epochs was also conducted. If any further artifact was present, the corresponding trial was discarded. This artifact rejection procedure led to the average admission of 27.93 (2.51) negative–correct trials, 29.11 (2.24) negative–incorrect trials, 28.56 (2.6) neutral–correct trials, and 28.38 (2.71) neutral–incorrect trials. The minimum number of trials accepted for averaging was 20 per participant per condition. The data from three participants were eliminated from further analyses since they did not meet this criterion. Thus, grand averages were computed over 45 participants separately for each condition and electrode location.
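Two of the steps above, baseline correction and the amplitude-based rejection criterion, are simple enough to illustrate in a few lines. The sketch below is plain Python rather than the EEGLAB/MATLAB pipeline the authors used. The data layout (a dictionary mapping channel names to lists of microvolt samples), the function names, and the choice to apply the threshold to baseline-corrected data are all assumptions for illustration.

```python
SRATE_HZ = 500          # sampling rate after down-sampling
EPOCH_START_MS = -200   # epochs run from -200 to 800 ms around adjective onset
REJECT_UV = 150.0       # absolute amplitude criterion in microvolts


def ms_to_sample(t_ms):
    """Index of the sample at latency t_ms within the epoch."""
    return int(round((t_ms - EPOCH_START_MS) * SRATE_HZ / 1000))


def baseline_correct(epoch):
    """Subtract each channel's mean over the 200-ms prestimulus interval."""
    n_base = ms_to_sample(0)  # number of prestimulus samples
    corrected = {}
    for channel, samples in epoch.items():
        base = sum(samples[:n_base]) / n_base
        corrected[channel] = [v - base for v in samples]
    return corrected


def exceeds_threshold(epoch, limit_uv=REJECT_UV):
    """True if any sample at any channel lies outside +/- limit_uv."""
    return any(abs(v) > limit_uv for samples in epoch.values() for v in samples)


# Toy single-channel epoch: flat 2 uV baseline, then a 5 uV plateau.
toy_epoch = {"F7": [2.0] * 100 + [5.0] * 400}
corrected = baseline_correct(toy_epoch)
keep = not exceeds_threshold(corrected)
print(corrected["F7"][0], corrected["F7"][-1], keep)
```

With a 500-Hz sampling rate, the 200-ms baseline spans 100 samples, so the toy epoch is shifted down by 2 μV and is retained because no sample exceeds the criterion.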

On the basis of previous findings on the LAN and P600 (Alemán-Bañón et al., 2012; Barber & Carreiras, 2005; Molinaro et al., 2011; Xu et al., 2013), and after visual inspection of the grand-average waveforms, two time windows were selected for further analysis: 250–450 and 500–800 ms after the onset of the adjectives. As can be seen in Fig. 1 below, the effects consisted of a frontal negativity (LAN) followed by a posterior positivity (P600). Thus, the mean amplitudes in these windows were measured over a different subset of electrodes for each component. These regions of interest (ROIs) were calculated by averaging together neighboring electrode sites. For the LAN, two scalp ROIs were defined: left anterior (comprising Fp1, AF3, AF7, F5, and F7) and right anterior (comprising Fp2, AF4, AF8, F6, and F8). For the P600, the following scalp ROIs were defined: left posterior (comprising CP1, CP3, P1, P3, and PO3) and right posterior (comprising CP2, CP4, P2, P4, and PO4). Mean amplitude voltage values were analyzed separately for each component using a three-way repeated measures ANOVA with Hemisphere (left vs. right), Emotion (negative vs. neutral), and Correctness (incorrect vs. correct) as within-subjects factors.
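The ROI measure described above amounts to a mean over the electrodes in a region and the samples in a time window. The following sketch shows one way to compute it in plain Python. The electrode groupings and time windows are those given in the text; the data layout, the function name, and the treatment of each window as half-open (start included, end excluded) are assumptions.

```python
SRATE_HZ = 500
EPOCH_START_MS = -200

ROIS = {
    "LAN_left": ["Fp1", "AF3", "AF7", "F5", "F7"],
    "LAN_right": ["Fp2", "AF4", "AF8", "F6", "F8"],
    "P600_left": ["CP1", "CP3", "P1", "P3", "PO3"],
    "P600_right": ["CP2", "CP4", "P2", "P4", "PO4"],
}
WINDOWS_MS = {"LAN": (250, 450), "P600": (500, 800)}


def roi_mean_amplitude(erp, roi, window_ms):
    """Mean voltage over the ROI's electrodes and the samples in window_ms.

    erp maps channel name -> list of averaged samples for one condition.
    """
    start, stop = (
        int(round((t - EPOCH_START_MS) * SRATE_HZ / 1000)) for t in window_ms
    )
    per_channel = [
        sum(erp[ch][start:stop]) / (stop - start) for ch in ROIS[roi]
    ]
    return sum(per_channel) / len(per_channel)


# Toy ERP: each left anterior channel is constant at 1, 2, ..., 5 microvolts.
toy_erp = {ch: [float(i + 1)] * 500 for i, ch in enumerate(ROIS["LAN_left"])}
lan_left = roi_mean_amplitude(toy_erp, "LAN_left", WINDOWS_MS["LAN"])
print(lan_left)
```

One such value per participant, condition, and hemisphere would then enter the Hemisphere × Emotion × Correctness ANOVA.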

Fig. 1 Event-related potentials (ERPs) to gender agreement errors and correct adjectives as a function of negative and neutral emotional valence. Left: ERP waveforms at selected electrodes. Right: Topographic voltage maps showing the difference between incongruent and congruent items for the LAN (absent for negative adjectives) and P600 effects

Standard effects of emotion were explored in three posteriorly distributed ROIs (left centro-parietal: CP1, CP3, P1, P3, and PO3; right centro-parietal: CP2, CP4, P2, P4, and PO4; occipital: O1, Oz, and O2), in which the EPN, the N400, and the LPC components have previously been found to be more evident (Citron, 2012; Herbert et al., 2008; Hofmann et al., 2009; Moreno & Vázquez, 2011; Schacht & Sommer, 2009). Mean amplitude voltages of consecutive 50-ms time windows were computed starting from 0 ms throughout the entire epoch (800 ms). The within-subject factors Emotion (negative vs. neutral) and ROI (three levels) were included in the repeated measures ANOVAs.
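For concreteness, the consecutive windows used in this exploratory analysis can be enumerated as follows. This is only a sketch: the function name is hypothetical, and representing each window as a (start, end) pair in milliseconds is an assumption.

```python
def consecutive_windows(start_ms=0, end_ms=800, width_ms=50):
    """Non-overlapping (start, end) windows covering start_ms to end_ms."""
    return [(t, t + width_ms) for t in range(start_ms, end_ms, width_ms)]


windows = consecutive_windows()
print(len(windows), windows[0], windows[-1])
```

Covering 0 to 800 ms in 50-ms steps yields 16 windows, so each effect in the Emotion × ROI design is tested 16 times across the epoch.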

Regarding analyses of our behavioral data, error rates and reaction times were submitted to a repeated measures ANOVA that included the within-subjects factors Emotion (negative vs. neutral) and Correctness (incorrect vs. correct).

For all ANOVAs, significant interactions were further evaluated using simple effects with Bonferroni correction for multiple comparisons. The Greenhouse–Geisser (GG) epsilon correction was applied to adjust the degrees of freedom of the F ratios where necessary. Effect sizes were computed using the partial eta-squared (ηp²) method. All statistical analyses were carried out using IBM SPSS Statistics (version 20).
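Partial eta-squared is the ratio of the effect sum of squares to the sum of the effect and error sums of squares, which can be rewritten in terms of an F ratio and its degrees of freedom as F × df_effect / (F × df_effect + df_error). The short sketch below applies this identity; the function name is illustrative, and the input is one of the F ratios reported in the Results.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of correctness on P600 amplitude: F(1, 44) = 15.14.
eta_p600 = partial_eta_squared(15.14, 1, 44)
print(round(eta_p600, 2))
```

The same conversion gives values that round to the effect sizes reported alongside the other F(1, 44) ratios in the Results.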

Results

Performance

Analysis of the accuracy data revealed a main effect of correctness, F(1, 44) = 5.06, p < .05, ηp² = .10, indicating a lower percentage of errors to incorrect items (mean ± standard error: 1.32 ± 0.28) than to correct ones (2.4 ± 0.46). Given the delay of the established response period relative to the occurrence of mismatches, reaction times should be considered with caution. With that caveat, we found that the detection of agreement mismatches (444.06 ms ± 19.68) was associated with faster responses than was the identification of correct agreement relations (468.99 ms ± 23.51), as reflected in a significant main effect of correctness, F(1, 44) = 5.5, p < .05, ηp² = .11. Means and standard deviations for the behavioral measures are shown in Table 2.

Table 2 Means and standard deviations (in parentheses) of our behavioral and electrophysiological recordings

ERP data

Effects of gender agreement (LAN and P600)

Figure 1 displays the ERP patterns elicited by correct and incorrect words, separately for negative and neutral adjectives. These waves correspond to frontal and parietal electrode sites, where the critical ERP components (LAN and P600) are clearly visible. Table 2 shows the means and standard deviations of the LAN and P600 amplitudes.

For neutral adjectives, the contrast between incorrect and correct conditions clearly showed an anterior negativity at left frontal electrodes (LAN). Interestingly, however, no such effect seems to have occurred for negative adjectives. A repeated measures ANOVA substantiated this impression, revealing a significant interaction between hemisphere, emotion, and correctness, F(1, 44) = 6.03, p < .05, ηp² = .12. No other interactions or main effects were significant. Tests of simple effects, with Bonferroni correction for multiple comparisons, were performed on the correctness effect (incorrect vs. correct) within each emotion and hemisphere. The results of these analyses revealed that, in the left hemisphere, this effect was significant for neutral items (p = .02), but not for negative ones (p = .83). In the right hemisphere, however, we found no correctness effect for either neutral (p = .14) or negative (p = .51) items. Overall, we observed the typical LAN effect only with neutral items.

The P600 components appeared to be similar in amplitude for negative and neutral adjectives. An ANOVA over posterior ROIs supported this assertion, since a main effect of correctness, F(1, 44) = 15.14, p < .001, ηp² = .26, was found, with larger amplitudes to incongruent (1.86 ± 0.21) than to congruent (1.09 ± 0.17) words. No two-way or three-way interactions reached significance.

Effects of emotion

Figure 2 displays the ERPs elicited by negative and neutral adjectives, collapsed across correct and incorrect items. A thorough exploration of the whole epoch using consecutive 50-ms windows from 0 to 800 ms revealed neither main effects of emotion nor Emotion × ROI interactions (all p values > .05, uncorrected for multiple comparisons; Fs(1, 44) = 0.01 to 3.26 for emotion; Fs(2, 88) = 0.02 to 2.43 for Emotion × ROI). Thus, in the time windows in which the EPN (200–400 ms), the N400 (300–500 ms), and the LPC (500–800 ms) are typically observed, no emotional modulations were found.

Fig. 2 Event-related potential effects elicited by negative and neutral adjectives, collapsed across correct and incorrect items at posterior electrode locations

Additionally, the ERPs to negative and neutral items were submitted to repeated measures, two-tailed t tests at all time points between 0 and 800 ms at all 62 electrodes (i.e., 24,800 total comparisons) using the Mass Univariate ERP toolbox written in MATLAB (Groppe, Urbach, & Kutas, 2011a, 2011b). The Benjamini and Yekutieli (2001) procedure for the control of the false discovery rate (FDR) was applied to assess the significance of each test, using an FDR level of 5 %. This analysis also failed to reveal any significant differences between ERPs to negative and to neutral items (all FDR-corrected p values > .05).
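The Benjamini and Yekutieli (2001) step-up rule used above can be sketched in a few lines. This is a plain-Python illustration of the procedure itself, not the code of the Mass Univariate ERP toolbox (which is written in MATLAB). The function name and the toy p values are assumptions, and the count of 400 time points assumes that the 0–800 ms interval was sampled at 500 Hz.

```python
def benjamini_yekutieli(p_values, q=0.05):
    """Return booleans (True = significant) in input order, controlling FDR at q.

    Step-up rule: find the largest rank k such that
    p_(k) <= k * q / (m * c(m)), where c(m) = 1 + 1/2 + ... + 1/m,
    and declare the k smallest p values significant.
    """
    m = len(p_values)
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / (m * c_m):
            k_max = rank
    significant = [False] * m
    for idx in order[:k_max]:
        significant[idx] = True
    return significant


# Size of the test family in the analysis above: 62 electrodes x 400 time points.
N_TESTS = 62 * 400

toy_p = [0.001, 0.02, 0.04, 0.5]  # made-up p values, for illustration only
flags = benjamini_yekutieli(toy_p)
print(N_TESTS, flags)
```

In the toy example only the smallest p value survives, even though two others fall below .05 uncorrected, which illustrates how conservative the correction becomes as the number of tests grows.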

Discussion

The aim of the present study was to explore how negative content impacts the processing of gender agreement dependencies. As an additional goal, we aimed to shed some light on the processing of negative words presented in the context of neutral phrases. Gender agreement errors were detected more quickly and accurately than correct agreement relations. They also elicited enhanced P600 amplitudes. Importantly, LAN effects were noticeable only for gender agreement errors in neutral adjectives, which seems to indicate that gender agreement relations in negative words are given processing priority, as compared to neutral words. We found no other standard ERP emotional effects. The implications of this pattern of results are discussed below.

Effects of negative content on the processing of gender agreement information

In the present study, we replicated the results of prior work, which had shown that participants give more accurate and faster responses when they detect gender mismatches rather than correct agreement relations (Hagoort, 2003; Martín-Loeches et al., 2012; Martín-Loeches et al., 2006; Xu et al., 2013). Also, the processing of gender agreement information in neutral words was evident in both the early detection of morphosyntactic errors and in later stages of reanalysis. This was reflected in enhanced LAN and P600 effects for incorrect relative to correct critical adjectives. Interestingly, we observed an absence of LAN effects to gender agreement mismatches in negative adjectives.

In the gender-processing domain, the LAN component has been proposed to reflect first-pass morphosyntactic analyses of gender agreement relations (Friederici, 2002; Molinaro et al., 2011). In our study, the finding of more accurate and faster responses to incorrect gender agreement relations indicates that enhanced LAN amplitudes could be interpreted as a reflection of the processing costs associated with the correct detection of inappropriate gender agreement relations. According to this view and in light of previous proposals (e.g., Friederici, 2002; Martín-Loeches et al., 2012), the lack of LAN effects during the processing of gender information in negative adjectives suggests that the parser had detected the presence of information that may be of potential biological relevance for the reader, which resulted in prioritized processing of gender agreement anomalies. The postulates of the affective-primacy hypothesis may account for the mechanisms that explain why the presence of negative content results in a more efficient identification of gender agreement errors. According to this view, emotional processing dominates over other types of cognitive processing. Because of its privileged status in promoting survival, under some circumstances the access and evaluation of affective information is facilitated over access to and evaluation of nonaffective information. This would reduce the costs associated with extensive processing and promote quick adaptive emotional responses (Delaney-Busch & Kuperberg, 2013; Storbeck & Clore, 2007; Zajonc, 1980). Indeed, facilitated processing of emotional information has been observed at other stages of language comprehension. In this respect, evidence has suggested that the semantic processing of emotional words is facilitated when the valence of the word is incongruous with the preceding context, which results in diminished N400 amplitudes for emotional as compared to neutral words (Delaney-Busch & Kuperberg, 2013). 
Also, the results of some studies in which words were presented in isolation showed that less effort was needed to access lexical information in negative words (Hofmann et al., 2009; Méndez-Bértolo et al., 2011; Scott et al., 2009). We argue that, consistent with this view, negative content facilitates the early detection of gender agreement anomalies, allowing readers to move straight on to a further analysis of the words. Subsequently, however, the larger P600 amplitudes found in our study to the critical adjectives in incorrect as compared to correct phrases seem to reflect an attempt to reanalyze and repair syntactic structure (Alemán-Bañón et al., 2012; Friederici, 1995; Molinaro et al., 2011). Interestingly, such processes were equally triggered during the processing of gender mismatches in both negative and neutral adjectives.

Our findings highlight the importance of exploring how emotional content modulates the processing of those features that are involved in the establishment of structural relations between the words in an utterance. This is especially relevant in light of the results reported by Martín-Loeches and collaborators (2012). These authors found that negative content increased the costs of processing number agreement errors, as reflected in larger LAN amplitudes for negative than for neutral words. Contrary to our hypothesis, we observed that the processing of gender agreement mismatches in negative words is facilitated at early stages of morphosyntactic processing. This discrepancy might not be surprising, given previous research that has reported differences in the processing of number and gender information (e.g., Alemán-Bañón et al., 2012; Antón-Méndez et al., 2002; Barber & Carreiras, 2003, 2005; Igoa et al., 1999). Also, some psycholinguistic models, such as the feature hierarchy hypothesis (Carminati, 2005; Silverstein, 1985), predict that different features will have different statuses, in the sense that each feature could potentially trigger qualitatively different processing demands, depending on its intrinsic nature. Therefore, a number of reasons could account for the differences between the results observed in Martín-Loeches et al.’s (2012) study and our own findings. On a theoretical level, it has been argued that gender is a feature of the lexical representation (the lemma, that is, the part of a word’s representation that contains the syntactic and semantic information), whereas number is considered to be a pure morphological feature. This distinction is relevant, since previous evidence has shown facilitated processing of lexical information in negative words (Hofmann et al., 2009; Scott et al., 2009). Differences in the properties of the negative words used in the two studies may also explain the discrepant results. 
The negative adjectives in Martín-Loeches et al.’s (2012) study had arousal and valence values of 3.27 and 3.03, respectively, whereas in our study the negative words had arousal and valence values of 6.88 and 2.12, on average. It has been found that the level of arousal affects early lexical processing of negative words presented in isolation. In particular, Hofmann et al. (2009) reported that arousal determined whether the processing of lexical information was facilitated in high-arousal negative words or was hindered in low-arousal negative words, in comparison with that of neutral words. Thus, given these previous findings, the processing of a lexically based feature in high-arousal negative adjectives may be facilitated, on the basis of the reduced LAN effects found in our study. In contrast, the analysis of pure morphologically based features in low-arousal negative adjectives during the early detection of agreement errors seems to be more demanding.

On another level, our results are likely to contribute to the more general debate regarding the independence of morphosyntactic processing. Syntax-first models assume a modular view during early stages of analyses—that is, morphosyntactic processing does not interact with other types of linguistic information. This theoretical proposal predicts that the interaction between semantic and syntactic information would occur at a later stage of processing (Clifton et al., 1991; Frazier, 1987). An alternative view postulates mutual influence between the morphosyntactic domain and multiple sources of linguistic constraints throughout the processing of the sentence (e.g., MacDonald et al., 1994; Marslen-Wilson & Tyler, 1980; Tanenhaus et al., 1995). In the case of agreement patterns, the proponents of the modular view argue that agreement processing is governed only by formal principles, which are mainly based on the checking of inflectional features. In accordance with this view, Friederici (2002, 2011) has proposed an electrophysiological model of language processing. In this account, the timing of the analysis of agreement dependencies, which is reflected by the LAN, should be independent of semantic processes. However, both levels of linguistic analysis would interact during a later processing phase, which would be reflected in modulations of the P600 by semantic and syntactic information. In contrast, according to interactive multiple-constraint-based models, agreement analysis is influenced by the semantic properties of a word as well as its form. This view predicts that semantic cues would influence the LAN effect. In agreement with previous findings (Martín-Loeches et al., 2012), we observed that early morphosyntactic processing is modulated by negative content, which is considered a semantic property of the words (Kissler, Assadollahi, & Herbert, 2006; Schacht & Sommer, 2009). 
Thus, our data do not support the predictions of those models that postulate a modular view of early morphosyntactic processing, which would expect effects of negative content to occur in the P600 rather than in the LAN (Friederici, 2002, 2011). The finding of facilitated processing during the early detection rather than the later reanalysis of gender agreement errors may reflect a special status of emotional information within the semantic system due to its relevance for individuals. Nonetheless, the results of the present study are in accordance with the predictions of those models that assume a simultaneous multiple-constraint process for sentence comprehension. They also contribute to the previous literature that has shown conceptual effects in the processing of gender agreement dependencies, especially in highly inflected languages such as Spanish, French, or Italian (e.g., Deutsch & Bentin, 2001; Vigliocco, Butterworth, & Garrett, 1996; Vigliocco & Franck, 1999).

Lack of emotional effects

We did not observe modulations in any of those components that have been typically associated with emotional processing in isolated words (i.e., the EPN, the N400, and the LPC). This held in a large sample of 45 participants even when we analyzed every 50-ms time window without any correction for multiple comparisons and ran t tests at all time points throughout the entire epoch (mass univariate statistical analysis). Prior studies had reported weak ERP effects, if any, for the processing of emotional relative to neutral words embedded in neutral contexts when semantic tasks were required. In this sense, Holt et al. (2009) reported larger LPC amplitudes for negative than for positive words in an emotional categorization task, whereas differences were observed between emotional and neutral words. In another study, Bayer et al. (2010) observed enhanced LPC amplitudes for negative words in comparison to neutral words in a semantic decision task. Also, effects on the N400 have been found when the emotional words were presented in the context of emotional sentences. Highly expected positive final words have been found to elicit larger N400 amplitudes than do highly expected negative words in negatively and positively biased sentence frames (Moreno & Vázquez, 2011). In this study, participants read each sentence for comprehension, and questions about the sentences were asked at the end of the recording session.

The results of a different set of studies suggest that standard emotional effects might be relatively unstable when the emotional word is embedded in a sentence. León et al. (2010) presented stories describing positive and negative episodes that were followed by emotionally consistent, emotionally inconsistent, semantically anomalous, and neutral sentences. The authors measured ERPs to the last words in the sentences and found N400 effects for words that were emotionally inconsistent with the story. Importantly, no main effects of emotion were reported in this study. In addition, Martín-Loeches et al. (2012), who also studied agreement relations with a syntactic judgment task in their Exp. 1, failed to report standard emotional effects to critical adjectives presented in neutral sentence contexts. Interestingly, when the same sentences were presented in the frame of a semantic task in Exp. 2, emotional words elicited larger N400 amplitudes than did neutral adjectives.

Some authors have proposed that the prevailing requirements of the task performed by the participants determine the type of main ERP effects associated with the processing of affective information (Bayer et al., 2010; Martín-Loeches et al., 2012). This proposal fits well not only with the results of the studies summarized above, but also with the high sensitivity to task effects shown by emotional words when they are presented in isolation (Fischler & Bradley, 2006; Hinojosa, Albert, López-Martín, & Carretié, 2014; Hinojosa et al., 2009a, 2010; Schacht & Sommer, 2009). However, as has been pointed out, the lack of emotional ERP effects does not necessarily imply that the processes reflected in these components also vanish; it is more likely that they are hardly visible under these circumstances (Martín-Loeches et al., 2012). The results of prior studies have suggested that some degree of analysis at a semantic level is crucial to observing reliable standard ERP effects of emotion when emotional words are embedded in sentences. In contrast, directing participants’ attention to the analysis of grammatical aspects seems to minimize the emergence of emotion-related ERP activity. Plausibly, the straightforwardness of morphosyntactic errors—especially in Spanish, where the canonical inflectional morphemes /-o/ for masculine and /-a/ for feminine mark most adjectives—may have contributed to a strategy that minimized the necessity of deep semantic analyses of the adjectives. Therefore, our data highlight the importance of considering task demands when presenting emotional words in phrasal contexts (Holt et al., 2009).

Our results also shed some light on the study of the interactions between cognitive and affective processes with ERPs, by showing that a focus on a given process may undermine the emergence of brain activity associated with the other process. The timing of several emotional ERP effects (EPN, LPC), which appear within the same latency range as those related to morphosyntactic ERP effects (LAN, P600), could also account for the lack of emotional effects in our study. In fact, the difference between the LPC and the P600 is only functional, since these potentials show similar latencies and distributions: Whereas the LPC effect may be defined as an amplitude difference between the processing of emotional and neutral stimuli, the P600 reflects an amplitude difference between morphosyntactically correct and incorrect words. Focusing on the processing of morphosyntactic cues in our study, which resulted in solid LAN and P600 effects, may have contributed to obscuring the presence of emotion-related ERP components that mainly appear in similar time windows.

Limitations and open questions

The present study is the first approximation to the study of emotional influences on the processing of gender information. However, the gender morphological representation of a concept presents a complex pattern across languages. For instance, in Spanish, nouns can be masculine or feminine, whereas in German and Dutch, nouns can also be neuter (Molinaro et al., 2011). Thus, it will be important for future studies to determine how emotion modulates gender processing in languages with different gender features. Also, some languages represent gender either as a conceptual characteristic (biological gender) or as a formal property of words (grammatical gender). Another important way to explore the relationships between emotion and gender would be to compare the processing of emotional words with gender values that refer to animate entities and those in which gender becomes a purely arbitrary formal feature.

The influence of positive content during the processing of gender agreement relations may be another source of potential interest for future research. The finding of a similar processing advantage in the detection of gender agreement errors in positive words would provide additional support for the claims of the affective primacy hypothesis. However, the greater biological relevance of negative than of positive information leaves open the possibility that the less effortful detection of gender agreement mismatches in negative words during early stages of morphosyntactic processing does not generalize to the processing of positive words.

A potential confound that we considered was the measurement of ERPs in sentence-final positions. It has been claimed that apart from local effects, sentence-final words are often strong attractors of global processing factors related to sentence wrap-up effects (Hagoort & Brown, 1999; Molinaro et al., 2011). Sentence wrap-up refers to the fact that readers tend to spend longer when reading sentence-final words. This phenomenon has traditionally been thought to be due to integrative processing that occurs at sentence-final constituents, such as the processes involved in relating sentences or clauses and updating a discourse model (Just & Carpenter, 1980; Rayner, Kambe, & Duffy, 2000; Warren, White, & Reichle, 2009). In our study, the use of three-word phrases minimized the impact of global integrative processes associated with wrap-up effects that might have obscured differences due to local morphosyntactic analysis. In fact, the neutral words in our study elicited the LAN–P600 pattern that has typically been found in the gender-processing literature with both word-pair (Barber & Carreiras, 2005; Münte & Heinze, 1994) and sentence (Alemán-Bañón et al., 2012; Gunter et al., 2000; Xu et al., 2013) materials. Nonetheless, future research can address this issue by comparing the influences of emotion on gender agreement processing in sentence-internal and sentence-final positions.

Another potential confound of the present design could be related to the presence of affective congruency effects. In our study, it might be argued that negative words had to be integrated in an emotionally incongruent neutral context, whereas the affective context was congruent in the case of neutral adjectives. According to this view, the absence of LAN effects in gender agreement errors for negative words would rather reflect difficulties in the integration of a negative word in an incongruent neutral context. We have several reasons, however, to disregard this possibility. In most previous studies that have explicitly explored emotional congruency effects in either sentence contexts (e.g., Delaney-Busch & Kuperberg, 2013; León et al., 2010; Moreno & Vázquez, 2011) or evaluative/affective priming paradigms (e.g., Herring, Taylor, White, & Crites, 2011; Hinojosa et al., 2009a; Zhang, Lawson, Guo, & Jiang, 2006), incongruent words modulated the activity of the N400 and/or the LPC. We did not find emotional effects in the time windows in which these components usually appear, which suggests that neutral contexts were not so emotionally constraining in our study. Furthermore, enhanced LAN amplitudes—not reduced amplitudes, as in the present study—have been related to indexing the costs associated with processing difficulties (Deutsch & Bentin, 2001; Gunter et al., 2000; Hagoort & Brown, 1999). Therefore, it seems questionable that negative words that supposedly would be more difficult to integrate in a neutral affective context would elicit diminished instead of increased LAN effects.

Conclusions

In sum, negative content appears to facilitate the early detection of mismatches in gender agreement relations. Similar effects have been described at several processing stages during language comprehension, such as lexical access or the integration of context information (Delaney-Busch & Kuperberg, 2013; Scott et al., 2009). Thus, it seems that the early detection of gender agreement errors is especially relevant when the message conveys information about a threat or a danger. Nonetheless, this processing advantage disappears at later stages of processing, where attempts to reanalyze and repair syntactic structure seem to be equally triggered by negative and neutral words.