Introduction

To deal with the profusion and complexity of visual input, the brain has developed mechanisms to select relevant and important information. For example, emotional stimuli might convey information critical for survival, such as the presence of a threat. Indeed, the processing of emotional stimuli appears to be facilitated by triggering attentional processes quickly after stimulus onset (see Pourtois, Schettino, & Vuilleumier, 2013 for review).

The earliest visually evoked EEG component is the C1, which occurs within 100 ms of stimulus onset. The C1 generators are thought to lie in primary visual cortex V1. Observation of the C1 is difficult, because its topography depends on the exact stimulus location in the visual field (Clark, Fan, & Hillyard, 1995; Jeffreys & Axford, 1972a, 1972b; Kelly, Gomez-Ramirez, & Foxe, 2008; Proverbio, Del Zotto, & Zani, 2007). In particular, stimuli presented along the horizontal midline generally do not elicit a C1. For this reason, C1 studies present stimuli in the lower or upper visual hemifield. In addition, the sensitivity of the C1 to low-level stimulus features, such as contrast (Foxe et al., 2008), requires strict stimulus control.

While earlier research failed to find endogenous modulation of the C1 (Clark et al., 1995; Gomez Gonzalez, Clark, Fan, Luck, & Hillyard, 1994; Hillyard & Anllo-Vento, 1998), more recent research has reported C1 modulation by spatial attention (Fu, Fedota, Greenwood, & Parasuraman, 2010b; Fu et al., 2009; Kelly et al., 2008; Proverbio et al., 2007) and attentional load (Fu et al., 2009; Fu, Fedota, Greenwood, & Parasuraman, 2010a; Rauss, Pourtois, Vuilleumier, & Schwartz, 2009; Rossi & Pourtois, 2012, 2017; see Rauss, Schwartz, & Pourtois, 2011 for review).

A large body of EEG and MEG literature has focused on very early processing of emotional facial stimuli. Pizzagalli et al. (1999) were the first to report very early effects of emotional faces with EEG, observing a difference in scalp topography at 80 ms between liked and disliked faces. Eger et al. (2003) found early differences between positive, negative, and neutral dichoptic facial stimuli. Pourtois et al. (2004) designed a study specifically to generate a large C1 component: pairs of emotional faces were presented bilaterally in the upper hemifield as cues in a dot-probe task, and C1 amplitude was increased for fearful-neutral compared with happy-neutral stimuli. Moreover, the C1 amplitude in the fearful-neutral condition correlated with the amplitude of the later P1 component—typically indexing spatial attentional orienting—generated by the subsequent valid target, possibly indicating early capture of attention by the emotional face.

Two attempts to replicate the study by Pourtois et al. (2004), however, did not succeed. Eldar et al. (2010) found a larger C1 for anxious participants presented with the angry-neutral pair, compared with the nonanxious population, but found no difference between the happy-neutral and angry-neutral conditions and no P1 validity effect as reported by Pourtois and colleagues. Santesso et al. (2008) likewise failed to find an early C1 effect. However, West et al. (2011) reported changes in the C1 consistent with an increase of activity in V1 neurons retinotopically corresponding to the location of the fearful faces. Zhu & Luo (2012) found a larger C1 for fearful than for happy faces in an emotional face-word Stroop task. Finally, Rossi & Pourtois (2017) found an increased C1 for peripherally presented, task-irrelevant fearful eyes, but only under high task demand.

These Event-Related Potential (ERP) studies are complemented by Event-Related Field (ERF) studies reporting early modulation of MEG signals by facial expression (Bayle & Taylor, 2010; Halgren, Raij, Marinkovic, Jousmäki, & Hari, 2000; Liu & Ioannides, 2010; Morel, Ponz, Mercier, Vuilleumier, & George, 2009). In particular, Bayle & Taylor (2010) presented faces with neutral, fearful, and happy facial expressions under different attentional conditions: attend-to-emotion or attend-to-identity. Interestingly, they found that early frontal activity (≈90 ms) was increased for fearful faces in the attend-to-identity condition, but not in the attend-to-emotion condition, suggesting that the task-relevance of the emotional information may be important for the early processing of facial expression and may explain discrepant results.

The current paper presents three experiments aimed at better understanding the factors that affect very early emotional face processing. Because spatial attention affects the C1 (Fu et al., 2009; Fu, Fedota, Greenwood, & Parasuraman, 2010b; Kelly et al., 2008; Proverbio et al., 2007) and task-relevance may be critical for early modulation by facial expression (Bayle & Taylor, 2010), we manipulated these two factors across three EEG experiments.

Traditional ERP analysis averages potentials over electrodes and subjects. However, when analysing subtle changes in C1 responses, averaging across participants can be problematic because, due to anatomical differences, the location of the effect on the scalp may vary across participants (Kelly et al., 2008). For instance, if the modulation is positive in one subject and negative in another, no net effect will be observed. Similarly, averaging over electrodes within a subject might dilute effects. We therefore used multivariate pattern analysis, training classifiers to capture more subtle differences than classical potential averaging may be able to uncover. To date, only a few EEG studies have specifically analysed the C1 using classifiers (but see Tzovara et al., 2012).

In Experiment 1, the sensitivity of the C1 component to fearful facial expression was tested by presenting a face unilaterally in the upper visual field while participants fixated and reported whether the fixation point changed colour (Fig. 1a). The design of this experiment was inspired by Eimer & Kiss (2006). The task was undemanding, so as to allow processing of the peripheral stimuli. To control for low-level features of the stimuli, to which the C1 is known to be sensitive, faces also were presented upside-down. Based on past literature, we expected upright fearful faces to generate a larger C1 component than neutral faces.

Fig. 1

Stimuli and procedure used in the three experiments. (a) Two example trials from Experiment 1: a target neutral right upright trial followed by a non-target neutral left inverted trial. Neutral and fearful faces were presented upright or upside down in the left or right upper hemifield. In target trials, the grey fixation point turned red (dark grey on the figure) during facial stimulus display; participants were instructed to press a button when detecting the red fixation point. (b) Example trial (cued left jug) from Experiment 2. A cue pointing left or right appeared near the fixation point, followed by a stimulus in the upper left or right hemifield. Half of the participants were instructed to press a button when detecting a jug at the cued location, the other half were instructed to detect cued kettles. (c) Example trial (uncued, fearful, right) from Experiment 3. The task was to press a button when detecting a happy face at the cued location, while maintaining fixation on the dot. (d) Two examples of stimuli for each condition used in the three experiments. Experiment 1 used neutral and fearful faces, upright and inverted; Experiment 2 used neutral and fearful faces, as well as jugs and kettles; Experiment 3 used neutral, fearful and happy faces

However, because we did not observe an effect of fearful faces on the C1 in Experiment 1, we conducted Experiment 2, in which participants were asked to attend to different locations in the visual field (Fig. 1b). We expected to find C1 modulation by facial expression when spatial attention was oriented towards the location of the faces. Participants were cued to the left or right side of their upper visual field by a central arrow, after which four possible classes of stimuli could be presented: neutral faces, fearful faces, jugs, and kettles. Participants had to respond to jugs (or kettles) while ignoring faces. We found that cueing and facial expression affected C1 activity. Using classifiers, we found differences between cued and uncued faces but solely for neutral faces.

Finally, because faces were not task-relevant in either of the previous experiments, we conducted Experiment 3 (Fig. 1c), manipulating spatial attention as in Experiment 2 but giving participants a task explicitly related to the expression of the faces. We expected emotional effects to be present when the faces were task-relevant. For this purpose, happy faces were introduced, and participants had to respond to happy faces at the cued location only. However, no effect of cueing or facial expression was found.

Methods

Participants

The experiments were approved by the Psychology Department Ethical Committee at the University of Edinburgh. Participants provided written, informed consent and reported right-handedness and normal or corrected-to-normal vision; they were compensated at a rate of £6/hour. In both Experiments 1 and 2, one participant’s data were removed because of excessive ERP artefacts. In Experiment 3, two participants’ data were removed: one due to poor behavioural results, and the other due to a technical issue. We therefore report data from 20 participants in Experiment 1 (age range: 19–30 [mean: 23] years, 10 females), 26 in Experiment 2 (age range: 18–50 [mean: 23] years, 16 females), and 24 in Experiment 3 (age range: 18–33 [mean: 22] years, 16 females). There was no significant age difference between the three groups [F2,67 = 0.036, p = 0.96]. Before the experiment, participants completed the State-Trait Anxiety Inventory (STAI) questionnaire (Spielberger et al., 1983). There were no significant differences in state [F2,67 = 1.376, p = 0.26] or trait [F2,67 = 2.360, p = 0.10] anxiety across the three experiments.

Stimuli

In Experiment 1, stimuli were displayed on a 40.5- × 30-cm ViewSonic P227f CRT monitor at a resolution of 1,280 × 1,024 pixels, with a refresh rate of 75 Hz. In Experiments 2 and 3, stimuli were displayed on a 47.5- × 29.5-cm Samsung SyncMaster LCD monitor at a resolution of 1,680 × 1,050 pixels, with a refresh rate of 100 Hz. In all experiments, the screen was placed at a viewing distance of 70 cm from the chin-rested participant. Stimuli were presented using Psychtoolbox 3 (Brainard, 1997) under Matlab (MathWorks, Inc., Natick, MA).

Example stimuli are shown in Fig. 1d. The faces with the neutral, fearful, and happy (the latter being used in Exp. 3 only) expressions were taken from the same 6 male and 6 female models of the NimStim face database (Tottenham et al., 2009). All nonfacial parts of the images were removed, including the shoulders, neck, and hair. Nonfacial stimuli consisted of 12 metal kettles and 12 glass jugs, selected from the Internet. These objects were chosen because kettles and jugs have a height-to-width ratio comparable to faces, can easily be discriminated from faces and from each other, and are presumably emotionally neutral. All stimuli were converted to greyscale, resized to the bounding box size (height: 15.3 cm; width: 11.5 cm), normalised for mean pixel value and RMS contrast, and presented on a black background. The mirror image of each stimulus was presented an equal number of times for each experimental condition.
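For illustration, the luminance and contrast normalisation can be sketched in a few lines of Matlab. The file name and the target mean and RMS-contrast values below are placeholders rather than the values used in the experiments, and the actual stimulus-preparation scripts may have differed:

```matlab
% Normalise a greyscale stimulus to a common mean pixel value and RMS contrast.
img = im2double(imread('face_01.png'));      % hypothetical file, assumed already greyscale
targetMean = 0.5;                            % target mean pixel value (assumption)
targetRMS  = 0.15;                           % target RMS contrast, i.e., SD of pixel values (assumption)

z   = (img - mean(img(:))) ./ std(img(:));   % zero mean, unit RMS contrast
out = z .* targetRMS + targetMean;           % impose the target statistics
out = min(max(out, 0), 1);                   % clip to the valid intensity range
imwrite(out, 'face_01_normalised.png');
```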

Stimuli were presented in the upper visual hemifield to elicit a reliable C1 component characterised by a widespread centroparietal negativity (Jeffreys & Axford, 1972a, 1972b). For this purpose, the light grey fixation point (diameter: 0.5 cm) was placed at the bottom of the screen on its vertical midline. The horizontal distance from the centre of the fixation point to the inner vertical side of the stimulus bounding box was 13.75 cm, while its vertical distance to the lower horizontal edge of the stimulus bounding box was 4.3 cm. Cue arrows consisted of two perpendicular 0.5-cm long light-grey segments, placed at a distance of 0.5 cm from the fixation point, and pointed towards the upper left or upper right.

Experimental procedure

Experiment 1 consisted of 960 trials, divided into 5 blocks of 192 trials. Neutral and fearful faces were presented upright or upside-down (inverted condition). Faces were presented for 300 ms, and the interstimulus interval was randomised between 1,200 and 2,100 ms. One sixth of the trials (160) were GO trials. In these trials, the fixation point turned red for the 300 ms during which the face was displayed. Participants had to respond by pressing the space bar of the keyboard with the index finger of their right hand. The participants were given 1,300 ms to respond after stimulus onset.

Experiment 2 consisted of 1,152 trials divided into 3 blocks. The stimuli consisted of facial (neutral and fearful faces) and nonfacial (jugs and kettles) stimuli. Before the experiment, participants were told that the stimuli would consist of faces, kettles, and jugs. The kettles and the jugs were shown to the participant before the experiment. Cues were presented for 200 ms and stimuli for 300 ms, separated by an interval of 750 ms. Time between stimulus offset and cue onset of the next trial was randomised and ranged from 1,400 to 2,300 ms. Half of the stimuli were presented on the left-hand side, and half of them were cued. Cueing (cued, uncued) and stimulus presentation side (left, right) were counter-balanced for each stimulus type within each block. One eighth of the trials were therefore GO trials. Half of the participants were instructed to press the space bar of the keyboard when detecting a jug at the cued location and the other half when detecting a kettle at the cued location. Participants were given 1,300 ms poststimulus onset to press the spacebar.

Experiment 3 consisted of 1,120 trials divided into 4 blocks. Following a cue pointing left or right, neutral, fearful, and happy faces were presented in the upper left or right visual field. Half of the stimuli were presented on the left-hand side, and half of them were cued. Happy faces were presented on 352 trials, whereas the 768 remaining trials were equally divided between neutral and fearful facial expressions. Participants were instructed to press the space bar of the keyboard when they detected a happy face at the cued location (176 trials, i.e., 11/70 of all trials). The happy faces were shown to the participants before the experiment. The timing was the same as in Experiment 2.

For Experiments 2 and 3, the cue indicated the location of the task-relevant stimulus. Although participants were instructed to keep their eyes on the fixation point, the stimulus onset asynchrony between the cue and the stimulus (950 ms) provided sufficient time for them to saccade to the cued location. To prevent such saccades, eye movements were monitored online, and participants who nevertheless made saccades were reminded during the experiment to maintain fixation on the dot. The eye tracker was calibrated before starting each block. Trials with eye movements were subsequently removed offline by visual inspection of the electro-oculography (EOG) signals, alongside trials with other artefacts.

Participants were given a number of practice trials before any data were recorded. The experiments were paused approximately every 6 minutes to give the participant an opportunity to rest. Correct detection rate and average reaction time (RT) were given as feedback to the participant at the end of each block.

On average, participants performed the tasks very well. For Experiment 1, average correct detection rate of target trials was 99.69% (mean RT: 419 ms), and false-alarm rate was 0.06%. For Experiment 2, correct detection rate was 94.1% (mean RT: 584 ms), and false-alarm rate was 0.5%. For Experiment 3, correct detection rate was 95.4% (mean RT: 623 ms), and false-alarm rate was 1.1%. The mean RTs were obtained after trimming at two standard deviations above the participant’s mean.
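For reference, the RT trimming can be expressed in a couple of lines of Matlab; rt is a hypothetical vector of GO-trial reaction times (in ms) for one participant, not actual data:

```matlab
% Mean RT after discarding responses more than two standard deviations above the mean.
rt     = [402 388 455 397 1020 410];                 % hypothetical single-participant RTs (ms)
meanRT = mean(rt(rt <= mean(rt) + 2*std(rt)));       % mean of the trimmed RTs
```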

Data acquisition

EEG was recorded using a BioSemi ActiveTwo system (BioSemi BV, Amsterdam, The Netherlands). The activity at 64 Ag-AgCl scalp electrodes, positioned and labelled according to the extended 10–20 system (Jasper, 1958), along with 4 EOG electrodes (above and below the right eye, and on the outer canthi) and 2 mastoid electrodes, was digitised at 24-bit resolution with a sampling rate of 1,024 Hz.

Processing was performed using the EEGLAB (Delorme & Makeig, 2004) and Fieldtrip (Oostenveld, Fries, Maris, & Schoffelen, 2011) toolboxes with Matlab and custom scripts. The EEG signal was low-pass filtered using a basic finite impulse response filter with a cutoff frequency of 40 Hz. No high-pass filter was used, because this can introduce artefacts (Acunzo, MacKenzie, & van Rossum, 2012). Data were down-sampled to 512 Hz, epoched using stimulus onset time as time origin, and referenced to the average of the mastoids. Epochs containing artefacts were removed with a semiautomatic procedure: epochs in which the EOG signal exceeded 100 μV in absolute value were flagged, and the data were then visually inspected to remove any remaining artefactual epochs. On average, 723 trials per participant remained for Experiment 1, 918 for Experiment 2, and 962 for Experiment 3. We performed a three-way ANOVA on trial numbers for each experiment, with factors (1) Emotion, (2) Orientation (Exp. 1) or Cueing (Exps. 2 and 3), and (3) Side. We found no significant difference between conditions in any of the experiments. Supplementary Table 1 summarizes the number of trials for each condition and experiment.

Finally, the data were baseline-corrected by subtracting from each electrode its average value in the 100-ms time window preceding stimulus onset. Because the detection of the target could attenuate modulations by facial expression (Eimer & Kiss, 2006), the target trials were not included in the analysis. For all experiments, only correct NO-GO trials were analysed.
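A condensed EEGLAB-style sketch of the preprocessing steps described above is given below, assuming EEGLAB (with the BIOSIG import plugin) is on the Matlab path. The file name, event code, epoch window, and EOG/mastoid channel indices are placeholders, and the exact filter routine and epoch limits used in the study may have differed:

```matlab
% Preprocessing sketch (EEGLAB). Placeholders: file name, event code 'stim',
% epoch window [-0.1 0.5] s, mastoid channels [69 70], EOG channels 65:68.
EEG = pop_biosig('subject01.bdf');             % load the BioSemi recording (hypothetical file)
EEG = pop_eegfiltnew(EEG, [], 40);             % low-pass FIR filter, 40-Hz cutoff; no high-pass
EEG = pop_resample(EEG, 512);                  % down-sample from 1,024 to 512 Hz
EEG = pop_epoch(EEG, {'stim'}, [-0.1 0.5]);    % epoch around stimulus onset (window is an assumption)
EEG = pop_reref(EEG, [69 70]);                 % re-reference to the average of the two mastoids

% Semiautomatic artefact rejection: flag epochs in which any EOG channel
% exceeds +/-100 microvolts, then confirm by visual inspection before removal.
eogChans = 65:68;
bad = squeeze(any(any(abs(EEG.data(eogChans, :, :)) > 100, 1), 2));
EEG = pop_rejepoch(EEG, find(bad), 0);

EEG = pop_rmbase(EEG, [-100 0]);               % baseline: 100-ms window preceding stimulus onset
```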

The signals of visually identified noisy channels were reconstructed with spherical interpolation using EEGLAB. On average, 0.85 electrodes per participant were interpolated for Experiment 1, 2.30 for Experiment 2, and 0.75 for Experiment 3.

Data analysis

Conventional ERP analysis

For conventional ERP analyses, the signal was referenced to the average of the mastoid electrodes, as is common in C1 studies (Pourtois et al., 2004; West et al., 2011; Zhu & Luo, 2012). For statistical analysis of the C1, voltage was averaged over two electrodes on each hemisphere (CP1, C1, CP2, C2) and over the 60–100-ms time interval. The P1 component also was studied, using electrodes PO7 and PO8, in the 100–130-ms time window. The analysis of the P1 is particularly relevant for Experiments 2 and 3, as this component is known to be modulated by spatial attention (Luck et al., 2000). A P1 increase contralateral to the cue is indicative of a successful manipulation of spatial attention. Fig. 2 shows the locations of the electrodes as well as the topographies in the time ranges of interest, averaged across all conditions, for each experiment.
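As an illustration, the window averaging can be sketched as follows, assuming an epoched data array data (channels × time × trials), a vector times of sample latencies in ms, a cell array labels of channel labels, and a logical index condTrials selecting the trials of one condition. All names are placeholders, and the two hemispheres are collapsed here for brevity, whereas the actual analysis kept left- and right-hemisphere electrodes separate:

```matlab
% Mean amplitude in the C1 (60-100 ms) and P1 (100-130 ms) windows for one condition.
c1Chans = ismember(labels, {'CP1', 'C1', 'CP2', 'C2'});   % centroparietal electrodes
p1Chans = ismember(labels, {'PO7', 'PO8'});               % parieto-occipital electrodes
c1Win   = times >= 60  & times <= 100;                    % C1 window (ms)
p1Win   = times >= 100 & times <= 130;                    % P1 window (ms)

erp   = mean(data(:, :, condTrials), 3);                  % condition ERP: channels x time
c1Amp = mean(mean(erp(c1Chans, c1Win)));                  % C1 amplitude entered into the ANOVA
p1Amp = mean(mean(erp(p1Chans, p1Win)));                  % P1 amplitude entered into the ANOVA
```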

Fig. 2

Average potential across all conditions for Experiments 1, 2, and 3. Topographies indicate the average potential across all electrodes during the time intervals of interest: 60–100 ms (C1) and 100–130 ms (P1). The position of the electrodes used for statistical analyses is highlighted in orange for the C1 (CP1, CP2, C1, C2) and in blue for the P1 (PO7, PO8). The right column shows the time course of the average potential for the two sets of electrodes. Consistent with the literature, the C1 is characterised by a widespread negativity over the centroparietal region peaking before 100 ms, and the P1 time window by positive potentials in the parieto-occipital region peaking after 100 ms.

Repeated-measures analyses of variance (ANOVAs) were performed on the average amplitudes using R. The factors of the repeated-measures ANOVA for Experiment 1 were Emotion (Neutral, Fearful), Orientation (Upright, Inverted), Position (Left, Right), and Hemisphere (Left, Right). For Experiments 2 and 3, the factors were Emotion (Neutral, Fearful), Cueing (Cued, Uncued), Position (Left, Right), and Hemisphere (Left, Right).

A negative, linear prestimulus trend can be observed on the C1 electrodes in Experiments 2 and 3 (Fig. 2). This may be an effect of temporal expectation and preparation, as the delay between cue onset and the subsequent stimulus was constant (950 ms). Anticipation of stimuli or responses can generate a slowly evolving potential, the contingent negative variation (CNV), which originates from motor-related areas and has a topography overlapping that of the C1 (Cravo, Rohenkohl, Wyart, & Nobre, 2011; Los & Heslenfeld, 2005; Miniussi, Wilding, Coull, & Nobre, 1999; Pfeuty, Ragot, & Pouthas, 2005; Trillenberg, Verleger, Wascher, Wauschkuhn, & Wessel, 2000; Walter, Cooper, Aldridge, McCallum, & Winter, 1964). To ensure that this linear trend did not confound the results, we ran an ANOVA on the slope of the signal averaged over each group of electrodes of interest in the [−100, 0] ms interval. The factors were the same as for the C1 and P1 analyses: Emotion (Neutral, Fearful), Cueing (Cued, Uncued), Position (Left, Right), and Hemisphere (Left, Right) as within-participant factors, and Experiment (Exp. 2, Exp. 3) as a between-participant factor. For both the C1 and P1 electrodes, we found a significant Cueing × Position × Hemisphere interaction (C1: F1,48 = 4.05, p = 0.050, ηp2 = 0.08; P1: F1,48 = 9.29, p = 0.004, ηp2 = 0.16). This is not surprising, as attentional orienting following the cue generates lateralized differences in potentials even before stimulus onset (Nobre, Sebestyen, & Miniussi, 2000). However, no significant main effect or interaction involving Emotion or Experiment was found, indicating that these conditions were not confounded by differences in the CNV.
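The slope entering this control ANOVA can be obtained with a first-order polynomial fit over the prestimulus window; a minimal sketch using the same placeholder variables as above:

```matlab
% Linear prestimulus trend (microvolts per ms) over the electrodes of interest.
preWin = times >= -100 & times <= 0;                            % prestimulus window (ms)
sig    = mean(mean(data(c1Chans, preWin, condTrials), 1), 3);   % average over electrodes and trials
coeffs = polyfit(times(preWin), sig, 1);                        % first-order (linear) fit
slope  = coeffs(1);                                             % slope entered into the ANOVA
```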

Classification analyses

We also used a multivariate approach in order to better deal with the large C1 variations across participants (Kelly et al., 2008) and to be more sensitive to differences in EEG responses. While for some modalities and signals it might be possible to train a single classifier for all participants (De Lucia et al., 2012; Chouiter et al., 2015), the C1 signal is highly variable across subjects. Therefore, we trained classifiers for each participant independently to discriminate between conditions and compared the resulting classifier accuracy (averaged over participants) to chance level.

For the multivariate analyses, which can be compared to topographical analyses, the average reference was used, as justified by Bertrand et al. (1985). The number of trials was equalized between conditions for each participant using random sampling. For each single time point, linear discriminant analysis classifiers were trained and tested using all 64 channels and k-fold cross-validation with k = 15. Classifiers were trained and tested for the contrasts of interest only, i.e., Emotion (for each Orientation or Cueing condition) and Cueing. To minimise within-class signal variability, which may hinder the ability to train the classifiers, trials from different subconditions were not grouped before training and testing. In particular, the topography in the C1 time range depends on the stimulated visual hemifield. To remove this possible source of variability, we trained and tested classifiers exclusively on signals generated by left (or right) visual hemifield presentations. For instance, the classification accuracy between Fearful Cued and Neutral Cued faces is the average accuracy of two sets of classifiers, trained and tested on stimuli presented in the Left hemifield and in the Right hemifield, respectively.
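A simplified sketch of the time-resolved classification for one participant and one hemifield is given below, using fitcdiscr from Matlab's Statistics and Machine Learning Toolbox rather than the CoSMoMVPA routines used in the actual analysis. X (trials × channels × time) and the label vector y are placeholders, and trial numbers are assumed to have been equalized beforehand:

```matlab
% Per-time-point linear discriminant analysis with 15-fold cross-validation.
[nTrials, nChans, nTimes] = size(X);
cvp = cvpartition(y, 'KFold', 15);                    % stratified 15-fold partition
accuracy = zeros(1, nTimes);
for t = 1:nTimes
    Xt  = squeeze(X(:, :, t));                        % trials x 64 channels at this sample
    mdl = fitcdiscr(Xt, y, 'DiscrimType', 'pseudolinear', ... % LDA tolerant of rank-deficient covariance
                    'CVPartition', cvp);
    accuracy(t) = 1 - kfoldLoss(mdl);                 % cross-validated proportion correct
end

% Accuracies are computed separately for left- and right-hemifield presentations
% and then averaged; window averages are taken over the C1 and P1 ranges.
c1Acc = mean(accuracy(times >= 60  & times <= 100));
p1Acc = mean(accuracy(times >= 100 & times <= 130));
```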

Statistical analysis of the time course of accuracy was performed with t-tests against chance level on the samples within the C1 (60–100 ms) and P1 (100–130 ms) time ranges. We corrected for multiple comparisons across time using threshold-free cluster enhancement (TFCE; Smith & Nichols, 2009). This method seeks to keep the sensitivity benefits of cluster-based thresholding while avoiding the arbitrary choice of a single cluster-forming threshold. The TFCE parameters used were dh = 0.1, E = 0.5, H = 2, and 10,000 iterations. We report time points with significant z-scores at alpha = 0.05, one-tailed. The analyses were performed using the Matlab toolbox CoSMoMVPA (Oosterhof, Connolly, & Haxby, 2016).

In addition to training classifiers for each data point, resulting in time-resolved accuracies that could be averaged over the C1 and P1 time windows, we trained classifiers using all samples of each time window, resulting in a single accuracy value per participant and time window. At the cost of temporal resolution, this method has the advantage of feeding more data to the classifiers and is thus more likely to uncover small effects.
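For this whole-window variant, all samples in the window are supplied to a single classifier; a brief sketch continuing from the placeholder variables of the previous sketch:

```matlab
% Whole-window classification: one feature vector per trial containing every
% channel x sample in the C1 window, cross-validated once per participant.
winIdx = find(times >= 60 & times <= 100);
Xwin   = reshape(X(:, :, winIdx), nTrials, nChans * numel(winIdx));
mdl    = fitcdiscr(Xwin, y, 'DiscrimType', 'pseudolinear', 'CVPartition', cvp);
c1AccWindow = 1 - kfoldLoss(mdl);                     % single accuracy value for this window
```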

Because the prestimulus trend also could have confounded the classification results, we sought to assess whether there were systematic differences between the conditions for which we found an effect (see Results). For Experiments 2 and 3, we therefore compared the slopes between the Cued and Uncued conditions, for each stimulus Position and Emotion separately (as was done with the classifiers). The average prestimulus slope was calculated for each condition and participant. An overall slope-difference index was then calculated for each participant as the squared difference in slope between the two conditions, summed across all electrodes. This index equals zero only if the slopes are identical at every electrode in the two conditions. To assess statistical significance, a null distribution of 1,000 average slope-difference indices was calculated by permuting trials between the contrasted conditions. None of the slope-difference indices were significant, suggesting the absence of systematic slope differences between the Cued and Uncued conditions for each Position and Emotion condition (Table S2).
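A sketch of this permutation test, assuming the prestimulus slope has already been computed for every trial and electrode (slopes, a trials × channels matrix) together with a logical vector isCued marking the Cued trials; both variables are placeholders:

```matlab
% Observed slope-difference index: squared Cued-minus-Uncued difference in mean
% prestimulus slope, summed over all electrodes.
obsIdx = sum((mean(slopes(isCued, :), 1) - mean(slopes(~isCued, :), 1)).^2);

% Null distribution obtained by shuffling the Cued/Uncued labels across trials.
nPerm   = 1000;
nullIdx = zeros(nPerm, 1);
for iPerm = 1:nPerm
    shuffled = isCued(randperm(numel(isCued)));
    nullIdx(iPerm) = sum((mean(slopes(shuffled, :), 1) - mean(slopes(~shuffled, :), 1)).^2);
end
pValue = mean(nullIdx >= obsIdx);   % proportion of permuted indices at least as large as observed
```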

Results

Experiment 1: fixation point task

Experiment 1 explored the effect of emotional stimuli on the C1 response in a nondemanding task while attention was diverted away from the stimuli. Upright and inverted neutral and fearful faces were presented in the upper visual hemifield while participants were asked to detect a change in the colour of the fixation point. A larger C1 amplitude was expected for upright fearful than for neutral faces, as in Pourtois et al. (2004). ERPs and classifier accuracy are shown in Fig. 3.

Fig. 3

Potentials and classifiers’ accuracies in Experiment 1. (a) Time course of the average potential (± standard error) over electrodes CP1, CP2, C1, and C2 for Neutral (pink) and Fearful (green) faces in the Upright (left) and Inverted (centre) conditions. The bar plot indicates the average potential (± standard error) over the C1 time range, indicated by the grey shaded area. (b) Same as (a), but over electrodes PO7 and PO8, and the P1 time range. (c) Neutral versus Fearful faces classifier: Time course of the classifiers’ accuracy for the Upright (left) and Inverted (centre) faces. The bar plot indicates average accuracies (± standard error) over the C1 (light grey) and P1 (dark grey) time ranges.

Row (a) of Fig. 3 shows the C1 ERP time course (left and middle) and its average over the time window (right). Importantly, the Emotion × Orientation interaction was not significant (F < 1), and there was no main effect of Emotion (F < 1) or Orientation (F1,19 = 2.8, p = 0.11, ηp2 = 0.13). The only significant effect was the Position × Hemisphere interaction (F1,19 = 4.72, p = 0.043, ηp2 = 0.20), reflecting the sensitivity of the C1 to stimulus position and explained by a larger potential (in absolute value) at electrodes ipsilateral to the stimulus location.

No significant effect on the P1 amplitude relevant to our factors of interest was found either (Fig. 3b). In particular, the main effect of Emotion and the Emotion × Orientation interaction were nonsignificant (all Fs < 1). Only a main effect of Hemisphere was found (F1,19 = 5.46, p = 0.03, ηp2 = 0.22).

Next, two sets of classifiers were trained to discriminate between Neutral and Fearful faces for the Upright and Inverted conditions, respectively. No significant cluster was found using TFCE. The time courses of classifier accuracy are shown in Fig. 3c. When training and testing classifiers using all data points within each time range, we surprisingly found above-chance Neutral versus Fearful classification in the C1 time range for the Inverted (t19 = 2.19, p = 0.02, d = 0.49), but not the Upright faces (t19 = −0.57, p = 0.71, d = −0.13; Supplementary Table 3). This could be due to low-level features of the facial stimuli, such as the eyes, which are closer to the fovea in the Inverted condition and more salient in the Fearful condition, generating specific neural patterns that the classifiers could pick up. However, it should be noted that the effect is subtle, as it appears neither in the ANOVA on the average C1 signal nor in the time-resolved classification. No effect was found in the P1 time range (Supplementary Tables 3 and 4).

We hypothesized that the absence of an effect in the C1 time range for Upright faces could be explained by a lack of attention towards the stimuli. Indeed, participants had to pay attention exclusively to the fixation point to perform the task correctly. By contrast, the task used by Pourtois et al. (2004) involved stimuli presented at fixation but also at the location of the facial stimuli. We therefore set up Experiment 2, in which participants were asked to orient their attention to the location of the face, or to the opposite hemifield, using a foveal cue.

Experiment 2: object discrimination task

To test the hypothesis that spatial attention plays a role in the very early processing of emotional faces, we asked participants to pay attention to their left or right visual hemifield using a cue near the fixation point prior to stimulus display. The faces were distractors, as participants were asked to detect cued kettles or jugs. In this case, a larger C1 was expected for cued-fearful faces compared to cued-neutral faces. Additionally, larger C1 and P1 amplitudes for cued trials, compared with uncued trials, were expected.

ERPs and classifier accuracies for Experiment 2 are shown in Fig. 4. We found no main effect of Emotion or Cueing on C1 amplitude (Fig. 4a). However, the C1 amplitude generated by the faces showed a significant Emotion × Cueing × Hemisphere interaction (F1,25 = 7.91, p = 9.4×10-3, ηp2 = 0.24), as well as a significant Position × Hemisphere interaction (F1,25 = 23.2, p = 6.05×10-5, ηp2 = 0.48). The three-way interaction involving Emotion reflects a more pronounced effect of Emotion at the Right-hemisphere electrodes, characterized by a larger potential (in absolute value) for Fearful than for Neutral faces in the Uncued condition, and a smaller one for Fearful than for Neutral faces in the Cued condition (Fig. 4a, right panel). The direction of the effect thus appears opposite to the one expected.

Fig. 4

Potentials and classifiers’ accuracies for Experiment 2. (a) Time course of the average potential (± standard error) over electrodes CP1, CP2, C1, and C2 for Neutral (pink) and Fearful (green) faces in the Cued (far-left panel) and Uncued (middle-left panel) conditions. The bar plot (middle-right) indicates the average potential (± standard error) over the C1 time range (grey shaded area). The far-right panel displays the nature of the significant Emotion × Electrode hemisphere × Stimulus position interaction. (b) Same as (a), but over electrodes PO7 and PO8, and the P1 time range. The P1 amplitude was significantly larger for Cued faces. The far-right panel displays the significant Electrode hemisphere × Cueing × Stimulus position interaction. (c) Neutral versus Fearful faces classifier: Time course of the classifiers’ accuracy for Cued (left), Uncued (centre), and all faces (right). (d) Cued versus Uncued classifier: Time course of the classifiers’ accuracy for Neutral (left), Fearful (centre), and All (right) faces. The bar plot (bottom) indicates average accuracies (± standard error) over the C1 (light grey) and P1 (dark grey) time ranges for neutral (left), fearful (centre), and all (right) faces. Significant departures from chance level are indicated by dots for the time course plots, and stars for the bar plots.

Analysis of the P1 amplitude confirmed the hypothesis concerning Cueing: a main effect of Cueing was found (F1,25 = 6.76, p = 0.015, ηp2 = 0.21), reflecting a larger P1 for Cued than for Uncued trials. There was also a significant Cueing × Position × Hemisphere interaction (F1,25 = 15.2, p = 6.5×10-4, ηp2 = 0.38), reflecting the fact that the P1 effect was observed over the contralateral electrodes only (Fig. 4b, right panel). This indicates that our manipulation of spatial attention was successful. The Emotion effect and the Emotion × Cueing interaction were nonsignificant (Fs < 1; Fig. 4b).

Two sets of classifiers were trained to discriminate between Neutral and Fearful faces for the Cued and Uncued conditions, respectively. No time point in the C1 or P1 range was found to be significant. The overall accuracy for all faces (Cued and Uncued) was at chance level (Fig. 4c). In addition, a set of classifiers was trained to discriminate between the Cued and Uncued conditions and performed significantly above chance level in both the C1 and P1 time ranges (Fig. 4d): C1: t25 = 1.75, p = 0.047, d = 0.34; P1: t25 = 3.58, p = 7×10-4, d = 0.70.

To see if these effects came specifically from Neutral or Fearful faces, we looked at the corresponding classifiers. We only found a significant effect for Neutral faces in the C1 time range: Neutral: t25 = 1.77, p = 0.044, d = 0.35; Fearful: t25 = 0.22, p = 0.41, d = 0.04 (light grey bars in Fig. 4d). Training classifiers combining all samples of the C1 time range yielded the same pattern of results (Supplementary Table 5).

Average classification accuracy between the Cued and Uncued conditions was highly significant in the P1 time range (t25 = 3.37, p = 0.001, d = 0.66), although it again did not reach significance for Fearful faces, despite the presence of a significant cluster in this time window (t25 = 1.61, p = 0.06, d = 0.31). Indeed, the departure of the classifiers’ accuracy from chance level for the Fearful faces appears towards the end of the P1 time range (Fig. 4d). However, classifiers using the whole P1 time period performed significantly better than chance, including for Fearful faces (t25 = 2.73, p = 5.7×10-3, d = 0.54; Supplementary Table S6).

Experiment 3: emotion discrimination task

Experiment 3 was intended to test whether attention to facial expression (task-relevance of facial expression), in addition to spatial attention, would enhance C1 component changes. Neutral, Fearful, and Happy faces were presented to the participants in a cueing paradigm. Participants had to press a button when detecting a Happy face at the cued location while maintaining fixation on the fixation point. Again, according to past literature, a larger C1 was expected for fearful faces, compared to neutral faces (Pourtois, Grandjean, Sander, & Vuilleumier, 2004). ERPs and classifier accuracies for Experiment 3 are shown in Fig. 5.

Fig. 5

Potentials and classifiers’ accuracies for Experiment 3. (a) Time course of the average potential (± standard error) over electrodes CP1, CP2, C1, and C2 for Neutral (pink) and Fearful (green) faces in the Cued (left) and Uncued (centre) conditions. The bar plot indicates the average potential (± standard error) over the C1 time range, indicated by the grey shaded area. (b) Same as (a), but over electrodes PO7 and PO8, and the P1 time range. The P1 amplitude was significantly larger for Cued faces. (c) Neutral versus Fearful faces classifier: Time course of the classifiers’ accuracy for Cued (left), Uncued (centre), and all faces (right). (d) Cued versus Uncued classifiers. Time course of the classifiers’ accuracy for Neutral (left), Fearful (centre), and All (right) faces. The bar plot (bottom) indicates average accuracies (± standard error) over the C1 (light grey) and P1 (dark grey) time ranges for neutral (left), fearful (centre), and all (right) faces. Significant departures from chance level are indicated by dots for the time course plots and stars for the bar plots

The C1 average potential showed no main effect of Emotion (F1,23 = 1.52, p = 0.23, ηp2 = 0.06) or Cueing, and no interaction involving these two factors (Fig. 5a). There was only a main effect of Hemisphere (F1,23 = 9.73, p = 4.8×10-3, ηp2 = 0.30), with a larger C1 at the Right-hemisphere electrodes, and a Hemisphere × Position interaction (F1,23 = 9.8, p = 4.7×10-3, ηp2 = 0.30), reflecting the fact that this main effect was only present for Right-hemifield stimulus presentations.

The P1 analysis showed a significant Cueing effect (F1,23 = 15.58, p = 6.41×10-4, ηp2 = 0.40), as well as a significant Cueing × Position × Hemisphere interaction (F1,23 = 45.66, p = 6.8×10-7, ηp2 = 0.66), reflecting the fact that the Cueing effect was present only on the contralateral electrodes. The Emotion contrast and the Emotion × Cueing interaction were nonsignificant (F < 1, and F1,22 = 1.12, p = 0.30, ηp2 = 0.046, respectively; Fig. 5b).

These analyses were complemented by an ANOVA including the data of both Experiments 2 and 3, with Experiment as an additional between-subject factor. We found a significant Experiment × Hemisphere interaction (F1,48 = 5.13, p = 2.81×10-2, ηp2 = 0.10), as the potential was smaller for Left hemisphere electrodes in Experiment 3, which suggests that regardless of Cueing, the C1 is affected by the nature of the task. In addition, we found an overall Emotion × Cueing × Hemisphere interaction (F1,48 = 4.59, p = 3.73×10-2, ηp2 = 0.087), reflecting a larger C1 for Uncued Fearful faces, in particular on the Right Hemisphere electrodes. Interestingly, this interaction was dependent on the Experiment, as indicated by a significant Experiment × Emotion × Cueing × Hemisphere interaction (F1,48 = 4.28, p = 4.39×10-2, ηp2 = 0.082). As reported above, the Emotion × Cueing × Hemisphere interaction was indeed significant for Experiment 2 but not for Experiment 3. This is an important result, because it further supports the idea that the paired Cueing and Emotion effect found in Experiment 2 is dependent on the task. Finally, the 5-way Experiment × Emotion × Cueing × Position × Hemisphere interaction was also significant (F1,48 = 4.09, p = 4.9×10-2, ηp2 = 0.079), indicating that this differential effect between experiments is also affected by stimulus Position. The same analysis was performed for the P1 electrodes but did not provide additional information to the ANOVAs performed for each experiment individually.

Classifiers could not discriminate between Neutral and Fearful faces in either the Cued or Uncued conditions, nor when both conditions were pooled together (Fig. 5c). Classifiers trained over the whole C1 time period did not perform above chance either (Supplementary Table S7). However, consistent with the P1 analysis, classifiers could reliably discriminate between Cued and Uncued faces in the P1 time range (Fig. 5d, dark bars; t23 = 2.54, p = 9×10-3, d = 0.52). The effect was also present for Neutral faces alone (t23 = 2.51, p = 0.01, d = 0.51) but, as in Experiment 2, did not reach significance for Fearful faces alone, despite the existence of a significant cluster (t23 = 1.35, p = 0.09, d = 0.27). However, classifiers trained over the whole P1 time period performed significantly above chance, including for Fearful faces alone (t23 = 2.40, p = 0.012, d = 0.49; Supplementary Table 8).

Discussion

We conducted three experiments aimed at finding differences in early activity, more specifically in the C1 time range, between neutral and fearful faces displayed in the left or right visual hemifield. In Experiment 1, participants were asked to detect a change in the colour of the fixation point (faces were not task-relevant, and displayed at unattended locations). In Experiment 2, participants were instructed to respond to a cued object (kettle or jug) while ignoring the faces (the emotion of the faces was not task-relevant, but the facial stimuli were displayed at spatially attended or unattended locations). In Experiment 3, participants were instructed to respond to cued happy faces (the emotion of the faces, displayed at spatially attended or unattended locations, was task-relevant).

The analysis of the EEG signal using both conventional and classification analyses yielded complex effects involving Emotion, Cueing, and Task. We found that when faces were not task-relevant (Experiment 2), the neural pattern generated in the C1 time range differed between Neutral and Fearful faces, depending on whether they were spatially cued or not, with a stronger C1 change for Fearful faces. This effect appears robust, as the between-experiments interaction was significant. Furthermore, it is also supported by classification analyses, which uncovered different patterns between Cued and Uncued faces in Experiment 2, but not in Experiment 3, in which the faces were task-relevant. In addition, when breaking this cueing classification down by facial expression, classifiers performed better than chance only for Neutral faces.

Analysis of the P1, using both ERP and classifiers, found reliable cueing effects in both cueing experiments (Exp. 2 and 3). This effect was relatively independent of facial expression, although classification analyses appear to show that the patterns were less pronounced, or appeared later, for Fearful compared to Neutral faces, in both Experiments 2 and 3.

The importance of task-irrelevance

Most studies finding very early facial expression effects involved tasks in which the facial expression was irrelevant (“implicit” emotion processing). For instance, in Pourtois et al. (2004), participants focused on the orientation of bars presented following face stimulus presentation. In West et al. (2011), participants were given a colour-matching task, and in Morel et al. (2009), participants had to detect stimulus repetitions.

Bayle & Taylor (2010) noted that studies using tasks involving implicit emotion processing reported modulation by facial expression before 130 ms (Eger et al., 2003; Eimer, Holmes, & McGlone, 2003; Halgren et al., 2000; Holmes, Vuilleumier, & Eimer, 2003; Kawasaki et al., 2001; Pizzagalli et al., 1999; Streit et al., 2003), while those using explicit emotion processing tasks reported modulation only after 250 ms (Krolak-Salmon, Fischer, Vighetto, & Mauguière, 2001; Krolak-Salmon et al., 2003). Indeed, in their own MEG study, Bayle and Taylor found differential early processing between neutral and fearful faces only when attention was diverted away from facial expression.

In Experiment 3 only, participants were asked to pay attention to the emotional facial expression. According to the pattern described by Bayle & Taylor (2010), we should have found an emotion effect in both Experiments 1 and 2. Our results are partially consistent with this pattern. It is only in Experiment 2 that we found a reliable effect of Emotion, but this was a complex effect involving spatial Cueing, and classification analyses suggest that the effect is mainly driven by Neutral faces generating a different pattern depending on Cueing.

Attentional load as a critical parameter?

The C1 also has been found to be sensitive to attentional load (Rauss et al., 2009; Rossi & Pourtois, 2012, 2014, 2017). Our results parallel those of Rossi & Pourtois (2017), who found a difference in C1 amplitude between task-irrelevant fearful and neutral eyes, but only when attention was diverted from the eyes by a highly demanding task (“high load”). In addition, they did not find any effect of load for fearful eye stimuli. They interpreted this pattern as follows: under high load, task-irrelevant information is filtered out at early stages of processing, but emotional stimuli are filtered out less, owing to a bias towards processing these stimuli despite the unrelated task. Our findings can be interpreted in light of attentional load. Our tasks were not very demanding (participants performed extremely well); thus, our failure to find C1 time-range modulation by emotion in Experiments 1 and 3 could be a result of low attentional load. Performance in Experiment 2, which required detecting similarly shaped jugs and kettles, necessitated filtering out faces, which is arguably more demanding than discriminating emotions or detecting a colour change. Indeed, facial stimuli draw overt and covert attention (see, e.g., Bindemann, Burton, Langton, Schweinberger, & Doherty, 2007; Langton, Law, Burton, & Schweinberger, 2008; Theeuwes & Van der Stigchel, 2006) and do so within about 100 ms (Crouzet, Kirchner, & Thorpe, 2010). The presence of an effect in Experiment 2 could therefore be explained by the task being sufficiently demanding to yield an emotional effect. We therefore hypothesize that stronger effects of emotion in the C1 time range will be observed when load is increased.

Faces as cues versus cued faces

While we highlighted the idea of ‘implicit processing’ as a potentially important parameter for uncovering an effect in the C1 time range, it also should be noted that the design first presented in Pourtois et al. (2004) consisted of pairs of faces presented bilaterally, with the fearful faces serving as cues to trigger an attention-orienting effect. By contrast, in our experiments, none of the faces were used as cues. Instead, attention was manipulated endogenously before stimulus onset, while faces were presented unilaterally, within or outside the focus of attention. As Pourtois et al. (2004) found a relationship between the C1 amplitude generated by the faces (used as cues) and the P1 locked to the subsequent probe stimulus, the C1 may partly reflect the initiation of attentional processes triggered by the presence of an emotional face. It therefore could be argued that attentional capture can only be measured when a change in the focus of attention occurs. If there is no need to restrict a broad focus of attention, because the peripheral locations are always task-irrelevant (Experiment 1), or if nonspatial attention is already focused on faces (Experiment 3), attentional capture may be inhibited or absent. However, in Experiment 2, where participants had to orient their attention peripherally and attend to nonfacial stimuli, it could be speculated that the faces triggered attentional processes.

Relatedly, manipulating the informativeness of a probabilistic cue (i.e., the proportion of cued versus uncued stimuli) may help to better understand how the C1 generated by the facial stimuli depends on endogenous spatial orienting. A noninformative cue combined with a task independent of cueing may yield results similar to those reported by Pourtois et al. (2004), as attention would be more broadly distributed across the visual field, thus facilitating attentional capture by emotional faces.

Expectation effects

One limitation of our study is the constant delay between the cue and the stimuli in Experiments 2 and 3. This may have generated temporal expectations and preparation, as well as a CNV potential. Although we argue that the linear trend in the potentials at the time of stimulus onset was unlikely to have generated confounds, it is possible that the overlap with the CNV reduced potential differences in the C1 component and thus our likelihood of finding an effect. Future experiments should add jitter between the cue and the stimulus to avoid this potential issue.

Classification analyses

Compared with classical ERP analyses, single-trial classification analysis has the advantage of taking into account patterns of activity occurring across sensors or time. In addition, when classifiers are trained and tested within each participant, they circumvent individual differences that may cancel each other out when averaging ERPs across participants. In our case, classification analyses strengthened and clarified the results found using the conventional signal-averaging method, in particular for Experiment 2 (Fig. 4; Supplementary Tables 5 and 6).

For each contrast tested, we used classification in two complementary ways: First, we trained and tested classifiers on each data point in time to uncover a time course of classification accuracy. Accuracies were then averaged over the time windows of interest after running t-tests across time, corrected for multiple comparisons using TFCE (Figs. 3, 4, and 5). Second, we trained and tested classifiers using data points from the whole time windows of interest (Supplementary tables). We found that both approaches performed similarly. However, to evaluate the reliability and size of an effect in a given time window, we found the latter approach superior, as the former was less sensitive when the clusters were very localised in time (e.g., the P1 time-range effect for Fearful faces; Figs. 4d and 5d; Supplementary Tables S6 and S8).

Conclusions

Using classifiers and conventional ERP analyses, we found that task-irrelevant faces were processed differently in the C1 time range when spatially attended versus unattended. Breaking the analysis down by facial expression, we found that the effect was present for neutral faces, but not for fearful faces. Our study adds to those reporting effects of emotional faces (Pourtois et al., 2004; Rossi & Pourtois, 2017; West, Anderson, Ferber, & Pratt, 2011; Zhu & Luo, 2012) and of attention (Fu et al., 2010a, b; Fu et al., 2009; Kelly, Gomez-Ramirez, & Foxe, 2008) on the C1 component. It is, in particular, in line with Bayle & Taylor (2010), who noted that studies using tasks involving implicit emotion processing reported modulation by facial expression before 130 ms. Future research should further explore how the C1 may index attentional capture by emotional faces, by manipulating task difficulty as well as the spatial focus of attention at face stimulus onset.