Theories of coding facial identity and emotion expression

The human face conveys complex information that contributes to the recognition of both emotional expression and individual identity. Facial identity and emotion are conveyed by overlapping physical features and can be extracted and discriminated equally quickly (Blau, Maurer, Tottenham, & McCandliss, 2007; Bruce & Young, 1986; Calder, Burton, Miller, Young, & Akamatsu, 2001; Ganel & Goshen-Gottstein, 2004; Holmes, Vuilleumier, & Eimer, 2003; Pegna, Khateb, Michel, & Landis, 2004; Schweinberger & Soukup, 1998). Views differ, however, on whether information about identity and emotion is processed independently or in an integral fashion.

One of the most influential models of face coding over the past 25 years, proposed by Bruce and Young (1986), holds that there is independent, parallel processing of identity and expression information from faces. A primary motivation for this argument comes from neuropsychological double dissociations showing that patients can have impaired recognition of face identity but not emotion (Campbell, Landis, & Regard, 1986) or impaired discrimination of face expression but not identity (Bruyer et al., 1983). These neuropsychological data have been supported by studies in normal observers (Young, McWeeny, Hay, & Ellis, 1986). For example, Young et al. had participants make identity judgments (are paired faces the same person?) or emotion judgments (do paired faces express the same emotion?) to faces. For identity matching, reaction times (RTs) to familiar faces were faster than those to unfamiliar faces, but there was no difference between familiar and unfamiliar faces for expression matching. These authors suggested that the analysis of facial expressions proceeds independently of the processes involved in recognizing a person’s identity.

Calder, Young, Keane, and Dean (2000) examined the same issue using the composite-face paradigm, in which the top half of one digitized photograph of a face is combined with the bottom half of another to create a test image, either aligned or misaligned (the top half slightly offset from the bottom half). When two different faces are aligned, responses to one component (e.g., the top half) are slowed relative to when the faces are misaligned, presumably because the aligned components form a new “gestalt” (the “face composite effect”). Calder et al. (2000) reported that the composite effects for identity and expression judgments operated independently of one another. For example, identity judgments were slowed by the aligning of two different face identities but not of two different expressions, with the reverse occurring for expression judgments. Furthermore, Calder, Burton, Miller, Young, and Akamatsu (2001), using principal components analysis (PCA), found that facial expression and identity were coded by largely different components. These authors argued that the functional dissociation between facial expression and identity relates directly to the fact that these two facial characteristics load on different dimensions of the stimulus. These arguments for the fractionated processing of structural information about face identity and emotion have recently been bolstered by computational work suggesting that independent processing of these two types of information is a natural consequence of the statistical independence between the image features for structural identity and emotion (Tromans, Harris, & Stringer, 2011).

Contrasting arguments to this—that is, for the nonindependent processing of structural identity and emotion—have been made from studies using face adaptation (Ellamil, Susskind, & Anderson, 2008; Fox & Barton, 2007). For example, Ellamil et al. found that adaptation to the basic emotions of anger, surprise, disgust, and fear resulted in biased perception away from the adapting expression. However, when the adapting and test images belonged to different people, the aftereffect decreased. This suggests that there is at least partly overlapping neural processing of identity and of facial expression (Ellamil et al., 2008).

The most straightforward evidence about possible interactions between facial identity and the perception of facial expressions has come from studies employing the selective-attention paradigm originally introduced by Garner (1974). Schweinberger and Soukup (1998) had participants classify unfamiliar faces along one dimension while disregarding an irrelevant dimension. The faces were presented in three different conditions: a control condition (in this case, the task-irrelevant dimension was held constant while the relevant dimension varied), an orthogonal condition (both the irrelevant and relevant dimensions were varied), and a correlated condition (changes in the irrelevant dimension covaried with changes in the relevant dimension). RTs for identity judgments were not influenced by variations in expression, but expression recognition was affected by variation in identity. Similar results were obtained by Atkinson, Tipples, Burt, and Young (2005), who investigated the relationship between facial emotion expression and facial gender, and by Baudouin, Martin, Tiberghien, Verlut, and Franck (2002), who evaluated attention to facial identity and expression in both healthy individuals and individuals with schizophrenia. The results pointed to an asymmetric interaction between facial identity and the discrimination of facial expressions: expression judgments were more affected by variations in the other dimension than were identity and gender judgments.

Contrasting data have been reported by Ganel and Goshen-Gottstein (2004), who explored face processing using the Garner task, which tests for interference effects from one stimulus property on responses to another. Participants categorized photographs according to personal identity information or to emotional expressions, and the effects of variation along the other dimension were explored. The stimuli were selected photographs of two people shown with two different emotional expressions and from two different views. Task-irrelevant information from the other dimension influenced participants’ judgments equally in the identity and emotional expression categorization tasks. The authors argued that the systems involved in processing identity and expression are interconnected and that facial identity can serve as a reference from which different expressions can be more easily derived (Ganel & Goshen-Gottstein, 2004). On the other hand, Etcoff (1984), also using Garner (1974) interference, showed that participants could selectively attend to either unfamiliar identity or emotional expression without interference from the irrelevant stimulus dimension, suggesting relatively independent processing of facial identity and expression.

Although the majority of the studies that have employed Garner interference have suggested some independence in the processing of facial identity and emotional expression, the inconsistency across studies also limits any conclusions. One limitation is that often a small stimulus set was used, with only two individuals shown displaying one of two emotions (see, e.g., Schweinberger & Soukup, 1998). When this limited set of stimuli is repeated across trials, it is possible that participants respond to local image details (e.g., variations in lighting and photographic grain) rather than to expression and identity, limiting any interference from one dimension on the other. Another important issue is that different picture-based strategies may be used for the identity and the emotion decision tasks in the Garner paradigm. In the identity decision task, pictorial strategies might be used to discriminate individuals on the basis of the shape of a face and of nonfacial cues such as hair style (see, e.g., Ganel & Goshen-Gottstein, 2004; Schweinberger & Soukup, 1998). For the expression decision task, in which participants are required to attend to internal facial features, this strategy may be inappropriate. This, in turn, can lead to differences in difficulty, and possibly asymmetric interference effects, between identity and emotional expression judgments. Although the latter concern was overcome in the Ganel and Goshen-Gottstein study, the issue of increased variability along the relevant stimulus dimension in the orthogonal condition, relative to the irrelevant dimension, remained (see also Kaufmann, 2004, for a detailed discussion of the effects of increasing variability along the relevant stimulus dimension within the orthogonal condition in the Garner paradigm).

In addition, even when effects of one stimulus dimension are found on responses to the other, the means by which these effects occur are not clear. For example, in selective-attention experiments, effects of an unattended stimulus dimension may arise through trial-by-trial fluctuations in attention that lead to the irrelevant dimension sometimes being attended (Lavie & Tsal, 1994; Weissman, Warner, & Woldorff, 2009). On these occasions, performance will be affected by variation in the irrelevant dimension even if the two dimensions are processed independently of one another. Such effects therefore do not, by themselves, demonstrate nonindependent processing.

A different way to address the issue of independent processing of facial identity and emotional expression is to have both dimensions be potentially relevant to the task and to examine how the dimensions combine to influence performance. For example, consider a task in which observers are required to detect three targets: (1) Person A depicted with a neutral expression, (2) Person B with a sad expression, and (3) Person A with a sad expression. Here, the third target has a redundant combination of the identity properties and the emotion properties that define Targets 1 and 2. We can ask whether the combination of identity (Person A) and emotional expression (sad) would lead to an improvement in performance: a redundancy gain in responding to Target 3 relative to Targets 1 and 2. Moreover, by examining the nature of this redundancy effect, we can learn new details about how facial identity and emotional expression modulate information processing, since redundancy gains can occur that are above and beyond the effects that can be accounted for in any model assuming independent processing of facial dimensions. To understand this, we need to outline the logic of redundancy gains in information processing.

Redundancy gains in information processing

Considerable evidence has indicated that, when a visual display contains two targets that require the same response, RTs are faster than when only one target appears (Krummenacher, Müller, & Heller, 2001; Miller, 1982, 1986; Miller, Ulrich, & Lamarre, 2001; Mordkoff & Miller, 1993; Raab, 1962; Reinholz & Pollmann, 2007; Wenger & Townsend, 2006). For example, in Mordkoff and Miller’s study, participants were required to divide their attention between the separable dimensions of color and shape, with all stimulus features being attributes of a single object. Participants were asked to press a button if the target color (green), the target shape (X), or both target features (a green X) were displayed, and to withhold a response if neither target was present. Single-target displays thus included a purple X or a green O, and redundant-target displays always comprised a green X. The mean RT on redundant-target trials was significantly shorter than the mean RTs on single-target trials (Mordkoff & Miller, 1993).

Different explanations have been developed to account for the redundant-target effect, the most relevant of these being the independent-race model (Raab, 1962) and the coactivation model (Miller, 1982). According to the independent-race model, redundancy gains are explained by means of “statistical facilitation” (Raab, 1962): Whenever two targets (in our case, facial identity and emotional expression) are presented simultaneously, the faster signal determines the “target present” response (i.e., this signal wins the race). As long as the processing time distributions for the two signals overlap, RTs will be speeded when two targets are present, since the winning signal can always be used for the response (Raab, 1962). Note that which signal finishes “first” may depend on whether it is attended. For example, emotional expression or identity may be computed first, if there are fluctuations in attention to each independent dimension.
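To make statistical facilitation concrete, the following minimal simulation (ours, with arbitrary distribution parameters; it is an illustration, not part of the original study) draws finishing times for two independent channels and takes the minimum on redundant trials:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Arbitrary, illustrative finishing-time distributions (in ms) for two
# independent channels, e.g., identity and emotional expression.
identity_rt = rng.normal(loc=660, scale=90, size=n)
emotion_rt = rng.normal(loc=660, scale=90, size=n)

# Race model: on redundant trials, the faster channel triggers the response.
redundant_rt = np.minimum(identity_rt, emotion_rt)

print(f"mean identity-only RT: {identity_rt.mean():.0f} ms")
print(f"mean emotion-only RT:  {emotion_rt.mean():.0f} ms")
print(f"mean redundant RT:     {redundant_rt.mean():.0f} ms")
# The minimum of two overlapping distributions is faster on average even
# though neither channel is speeded: statistical facilitation.
```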

The independent-race model contrasts with a coactivation account (Miller, 1982), which states that two signals combine in activating a response. According to the coactivation view, the information supporting a “target present” response is pooled across the features defining the targets prior to response execution (Miller, 1982, 1986). When, in this case, both target identity and target emotional expression contribute activation toward the same decision threshold, the response is activated more rapidly than when only one attribute contributes activation.
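Coactivation can be sketched in the same style. Here we caricature pooling as a single accumulator whose evidence rates sum on redundant trials; this is a simple rate-summation sketch under our own arbitrary parameters, and the coactivation account itself does not commit to this particular form:

```python
import numpy as np

rng = np.random.default_rng(1)
n, threshold = 100_000, 100.0

# Trial-to-trial variability in each channel's evidence rate (arbitrary units).
rate_identity = rng.normal(0.15, 0.03, n).clip(min=0.01)
rate_emotion = rng.normal(0.15, 0.03, n).clip(min=0.01)

single_rt = threshold / rate_identity                     # one channel alone
race_rt = np.minimum(threshold / rate_identity,
                     threshold / rate_emotion)            # independent race
coactive_rt = threshold / (rate_identity + rate_emotion)  # pooled evidence

print(f"mean single RT:   {single_rt.mean():.0f}")
print(f"mean race RT:     {race_rt.mean():.0f}")
print(f"mean coactive RT: {coactive_rt.mean():.0f}")
# Pooling roughly halves the time to threshold, so coactivation predicts
# redundant-trial responses faster than even the winner of a race.
```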

The critical contrast between the two models compares the probabilities of the RTs obtained on redundant-target trials against the sum of the probabilities for responses to the two types of single-target trials. The independent-race model holds that at no point in the cumulative distribution functions should the probability of a response to redundant targets exceed the sum of the probabilities for responses to the single targets. In contrast, according to the coactivation account, responses to redundant targets can be made before either single target alone generates enough activation to produce a response. Thus, the fastest responses to a face containing both the target identity and the target emotional expression should be faster than the fastest responses to either the target facial identity or the target expression alone.
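This contrast can be checked on simulated data. In the sketch below (illustrative parameters of our own choosing; coactivation is caricatured simply as a finishing-time distribution with more fast-tail mass), the race model’s CDF never exceeds the summed single-target CDFs, whereas the coactive distribution exceeds them at the fast tail:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
t_grid = np.linspace(350, 900, 12)  # time points (ms) at which to evaluate G(t)

def cdf(rts):
    """Empirical G(t): the probability of a response by each time point."""
    return (rts[:, None] <= t_grid).mean(axis=0)

g_i = cdf(rng.normal(660, 90, n))                    # identity-only trials
g_e = cdf(rng.normal(660, 90, n))                    # emotion-only trials
g_race = cdf(np.minimum(rng.normal(660, 90, n),
                        rng.normal(660, 90, n)))     # independent race
g_coactive = cdf(rng.normal(560, 90, n))             # caricatured coactivation

bound = g_i + g_e                                    # race-model bound: G_I(t) + G_E(t)
print(np.all(g_race <= bound))     # True: the race model obeys the bound
print(np.any(g_coactive > bound))  # True: coactivation violates it at fast t
```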

Mordkoff and Yantis (1991) proposed a conceptual compromise between the independent-race and coactivation models. Their account, the interactive-race model, assumes a race between parallel processes on redundant-target trials, but holds that the two target features may exchange information prior to a response being made. Two mechanisms have been proposed for information exchange: interchannel crosstalk and non-target-driven decision bias (Mordkoff & Yantis, 1991). Crosstalk occurs when identification of one signal is influenced by another signal. For example, take the case in which one photograph contains both targets, say Person A with a sad expression. If participants associate the identity with the expression, processing face identity could reduce the threshold to detect the target expression, speeding responses when both the target identity and the emotional expression are present, relative to when the expression is present in a face not bearing the target identity.

Non-target-driven decision bias concerns the possible effects that the nontarget attributes may have on “target present” decisions (Mordkoff & Miller, 1993; Mordkoff & Yantis, 1991). In contrast to the independent-race and coactivation models, both of which hold that only target signals activate a response, the interactive-race model proposes that nontarget signals are correlated with “target not present” decisions. For instance, if a display of a face contains both the target identity and a nontarget emotional expression, the expression could activate an “absent” response. This could slow RTs on trials when just one target is present, relative to when both attributes are present. Thus, the interactive-race model explains the redundant-target effect in terms of the influence of nontarget signals on “target present” responses, rather than of interactive processing between the target signals.

Experimental design and hypotheses of the present study

In the present study, we examined the presence of redundancy gains when people respond to target face identities and emotional expressions. If identity and emotion signals are integrated, then RTs to a face containing both the target identity and the target emotional expression will be shorter than RTs when either the target emotion or the target face identity appears alone.

Specifically, if the probability of responses on a trial with redundant face and emotion targets is greater than the summed probabilities of responses on single-target trials at any part of the cumulative response distribution, then the independent-race model is refuted. We examined this question in Experiments 1–3. Given violation of the independent-race model, in Experiment 4 we tested whether facial identity and emotion are processed coactively or interactively. In Experiment 5, we assessed whether the apparent coactivation effect was based on pictorial properties of the image or depended on the discrimination of structural features from faces, by testing redundancy gains with inverted images. Redundancy gains from pictorial properties should be found with inverted as well as with upright faces.

The strongest test of the different models of information accumulation requires that the RT distributions for the single targets be as close as possible, in order to obtain a maximal redundancy gain. To ensure that this was the case here, an initial set of experiments was conducted in which participants made perceptual match decisions to pairs of faces differing in identity and emotional expression. On the basis of the speed with which “different” decisions were made, face identities and emotional expressions were chosen such that the time to discriminate a potential redundant target (target identity + target emotional expression) from a potential single-attribute target (target identity + neutral emotion, or nontarget identity + target emotion) was the same for both attributes; this eliminated the possibility that the target identity was discriminated faster than the target emotional expression, or vice versa. The pre-experiments are reported in the supplementary materials.

A divided-attention task was employed. This task has commonly been used with written words and auditory stimuli (Grice, Canham, & Boroughs, 1984; Grice & Reed, 1992; Krummenacher et al., 2001; Miller, 1986; Mordkoff & Miller, 1993), but not in studies of face perception. In a divided-attention task, participants are required to monitor two sources of information simultaneously for a target, and then to make a decision about the presence or absence of the target. There are two main advantages to employing the divided-attention task. First, a divided-attention task in which people are required to attend simultaneously to identity and emotional expression in unfamiliar faces closely resembles the demands of daily life. Second, in contrast to the selective-attention task, the divided-attention task allows control over performance in the single-target conditions by including a double-target display (a control for target identification) as well as nontarget displays containing the irrelevant dimensions that also accompany either single target (a control for a strategy of making a “target present” response to either single target). To the extent that facial identity and emotion are independent, the time for encoding a face containing both targets will not differ reliably from the time for encoding faces containing either single target (assuming equal perceptual discriminability of identity and emotional expression). Here, participants were presented with a set of selected photographs of faces that varied in both identity and emotion, and they were instructed to respond “target present” as quickly as possible when they saw a target person and/or a target emotional expression. When they saw neither the target person nor the target emotional expression, they were required to respond “target absent.” The experimental design equated the probabilities of “target present” and “target absent” responses; the stimulus–response mapping is sketched below.
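As a concrete illustration, the mapping for the stimulus set of Experiment 1 can be written out as follows. The image labels are ours (hypothetical; the actual photographs are shown in the Appendix), but the target/nontarget assignments follow the stimulus descriptions given for Experiments 1 and 4 below, and the trial counts follow the 600-trial design described in the Design and Procedure section.

```python
# Hypothetical labels for the six photographs; the target/nontarget
# assignments follow the stimulus set described for Experiments 1 and 4.
stimuli = {
    "person1_sad":     "present",  # redundant target: identity + emotion
    "person1_happy":   "present",  # identity target, nontarget expression
    "person2_sad":     "present",  # emotion target, nontarget identity
    "person2_happy":   "absent",   # nontarget face NT1
    "person3_neutral": "absent",   # nontarget face NT2
    "person4_neutral": "absent",   # nontarget face NT3
}

# With 100 trials per image (600 in total), the probabilities of
# "target present" and "target absent" responses are equated.
trials_per_image = 100
n_present = sum(trials_per_image for r in stimuli.values() if r == "present")
n_absent = sum(trials_per_image for r in stimuli.values() if r == "absent")
assert n_present == n_absent == 300
```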

Experiments 1–3

Coactive versus independent processing of facial identity and emotional expression

Three separate experiments (Exps. 1–3) were conducted to test whether the processing of face identity and emotional expression takes place in an independent or coactive manner. The aim of these experiments was to examine whether a redundancy gain would occur when a face image contained both the target identity and expression, relative to when it contained only the identity or the emotional expression. All three experiments featured the same experimental design but varied in the target identity that was present (using different actors) and the facial emotional expression (sad, angry, and neutral expressions in Exps. 1, 2, and 3, respectively). The image set with neutral faces as targets was tested in Experiment 3 so as to assess whether redundancy gains in responses required a definite emotion to be present. Emotions such as sadness and anger are likely conveyed by a universal set of muscle movements (Ekman, 1990). In contrast, neutral facial expressions are likely to be more idiosyncratic and also to reflect the absence of one configuration of muscles rather than the presence of a distinct and detectable configuration. This may mean that identity is less likely to be integrated with a neutral expression than with a universal one such as anger or sadness.

In all tasks, the participants had to detect target identities and target emotional expressions among six photographs presented successively in random order. Three of these photographs contained targets: Stimulus 1 had both the target identity and the target emotion; Stimulus 2 contained the target identity and a nontarget emotional expression; and Stimulus 3 contained the target emotional expression and a nontarget identity. The three nontarget faces were photographs of three different people and expressed emotions different from those of the target faces.

Evidence for redundancy gains greater than can be predicted by an independent-processing model would provide strong constraints on accounts in which emotional expression and identity are processed independently of each other.

General method

Participants

Three groups of 12 undergraduate students participated in Experiments 1–3 (ten males total). The participants were between 20 and 28 years of age and received credits for participation. All individuals reported normal or corrected-to-normal eyesight.

Stimuli and apparatus

All face images were sourced from the NimStim Face Stimulus Set (Tottenham, Borsheid, Ellertsen, Marcus, & Nelson, 2002). Recognition of facial expression in all of the photographs used in the present study was rated as 80 % or more (Tottenham et al., 2002). Six photographs of faces were used in each of the experiments (see the Appendix).

The photographs were cropped around the hairline to eliminate the possibility of target judgments being based on hairstyle. Any visible background was colored black. The faces were approximately 10 × 13 cm when displayed on a 17-in. monitor. The presentation of stimuli was controlled using E-Prime. The stimuli were presented on the monitor at a viewing distance of 0.8 m. The angular width subtended by each stimulus was approximately 10°.

Design and procedure

A present–absent RT task was employed. Half of the trials contained images with at least one target (“present” trials), and half had nontarget faces (“absent” trials).

The participants were asked to decide whether either the target person or the target expression was present and to respond as quickly and accurately as possible by pressing “target present” whenever the identity and/or emotion appeared in a display. If no target signals were presented, participants were required to press the “target absent” button (Fig. 1). The instructions were displayed on the monitor and then repeated orally by the experimenter.

Fig. 1

An example experimental sequence from Experiments 1–3

The stimuli were presented successively in random order in one block of 600 trials (100 trials for each of the six images). In Experiment 1, the first ten trials served as warm-up and were not included in further analyses. In Experiments 2 and 3, the test trials followed 60 training trials. The main reason for this was that preliminary analysis of the RT distribution in Experiment 1 showed slower average responses during the first 100 trials than during later trials. Although this effect was not dramatic, the subsequent experiments were designed to avoid any such practice effect.

Each trial started with a central fixation cross for 500 ms; following offset of the cross, an image display appeared and remained until the participant had responded. The approximate time for each experiment was 20 min.

Analysis of the data

Two analyses were conducted for each experiment. First, each individual’s correct RTs to target faces were examined to see whether there was general evidence for the redundancy effect. The mean RT for redundant targets was subtracted from the mean RT across the two single targets (i.e., emotion or identity only) for each participant, and a positive value following this subtraction was considered a redundancy gain. Subsequently, the size of each individual’s redundancy effect was corrected using the fixed-favored-dimension test (Biederman & Checkosky, 1970). It has been shown that, when some observers favor one dimension over another, the mean RT redundancy gain is overestimated relative to the fastest single-dimension condition for each observer (Biederman & Checkosky, 1970; Mordkoff & Yantis, 1991). The fixed-favored-dimension test involves comparing the two single-target conditions for each observer against each other. When the two conditions differ, the faster mean RT is retained as the conservative estimate of the single-target mean RT; when the two conditions do not differ, the overall mean from both single-target conditions is used. In the Results sections below, we present only the corrected redundant-target effects, as being most relevant. A sketch of this correction is given below.
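In code, the correction might look as follows for a single observer. This is a sketch of the procedure as described, not the original analysis script; in particular, the use of an independent-samples t test at α = .05 as the criterion for whether the two single-target conditions differ is our assumption, since the text does not specify the test used.

```python
import numpy as np
from scipy import stats

def corrected_redundancy_gain(rt_ie, rt_i, rt_e, alpha=0.05):
    """Fixed-favored-dimension-corrected redundancy gain for one observer.

    rt_ie, rt_i, rt_e: arrays of correct RTs from the observer's redundant
    (IE), identity-only (I), and emotion-only (E) target trials.
    """
    rt_ie, rt_i, rt_e = map(np.asarray, (rt_ie, rt_i, rt_e))
    # Compare the two single-target conditions against each other.
    # (Assumption: an independent-samples t test at alpha = .05.)
    _, p = stats.ttest_ind(rt_i, rt_e)
    if p < alpha:
        # The observer favors one dimension: take the faster single-target
        # mean as the conservative baseline.
        baseline = min(rt_i.mean(), rt_e.mean())
    else:
        # No reliable difference: use the overall single-target mean.
        baseline = np.concatenate([rt_i, rt_e]).mean()
    return baseline - rt_ie.mean()  # positive values indicate a gain
```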

Our second analysis tested whether the independent-race model inequality was violated (Miller, 1982). This test makes use of the cumulative distribution functions (CDFs) of the latencies obtained for the redundant targets and for each of the single targets, and it can be expressed as follows:

$$ G_{\text{IE}}(t) \le G_{\text{I}}(t) + G_{\text{E}}(t), $$
(1)

where G(t) is the probability that a response has been made by time t, “I” and “E” refer to a target defined by identity and a target defined by emotional expression, respectively, and “IE” refers to redundant targets.

The right-hand side of Inequality 1 sets an upper bound on the cumulative probability of a correct response at any time t given redundant targets (IE). According to the independent-race model, the redundant-target function cannot exceed this bound, because the probability that the faster of two racing processes has finished by time t can never exceed the sum of the probabilities that each process alone has finished by that time. In contrast, the coactivation model holds that the bound should be violated, because pooled activation allows some responses to redundant targets to be made before either single target alone would have generated enough activation to produce a response (Miller, 1982).
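The bound is simply Boole’s inequality applied to the channel finishing times. Writing $T_{\text{I}}$ and $T_{\text{E}}$ for the times at which the identity and emotion channels finish on a redundant trial, and assuming (as the race model does) that their marginal distributions match those on single-target trials, we have

$$ G_{\text{IE}}(t) = P\bigl(\min(T_{\text{I}}, T_{\text{E}}) \le t\bigr) = P\bigl(\{T_{\text{I}} \le t\} \cup \{T_{\text{E}} \le t\}\bigr) \le P(T_{\text{I}} \le t) + P(T_{\text{E}} \le t) = G_{\text{I}}(t) + G_{\text{E}}(t). $$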

To conduct these tests of the Miller (1982) inequality, empirical CDFs were estimated for every participant and every target condition. All calculations followed the algorithm for testing the independent-race model inequality (Ulrich, Miller, & Schröter, 2007). First, the 100 RTs generated by each participant in each target condition were sorted in ascending order to estimate 19 percentiles (the 5th through the 95th, in five-percentage-point steps). These values were then averaged across participants to produce the composite CDFs for the redundant-target and each single-target condition. To produce the sum of the CDFs for the I and E trials, the RTs from these trials were pooled, and the 19 quantiles were estimated on the basis of only the fastest 100 of the 200 pooled trials. All calculations were conducted using a MATLAB script for computing the independent-race model test (Ulrich et al., 2007).

The 19 percentiles and CDFs were calculated for each participant and then averaged. Paired two-tailed t tests were used to assess the reliability of the difference between $G_{\text{IE}}$ and the sum of $G_{\text{I}}$ and $G_{\text{E}}$ at each percentile point.
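For readers who wish to replicate the procedure, the steps described above can be sketched in Python as follows. This is a re-implementation sketch of the procedure as described in the text, not the published MATLAB program of Ulrich et al. (2007); the function names are ours.

```python
import numpy as np
from scipy import stats

PROBS = np.arange(5, 100, 5) / 100.0  # 19 percentile points: .05, .10, ..., .95

def percentile_points(rts):
    """Percentile points (5th-95th) of one participant's RT sample."""
    return np.quantile(np.asarray(rts), PROBS)

def race_model_test(rt_ie, rt_i, rt_e):
    """Miller-inequality test across participants.

    Each argument is a list with one array of correct RTs per participant:
    redundant (IE), identity-only (I), and emotion-only (E) trials.
    Returns the group quantiles and the t/p values at each percentile.
    """
    g_ie, g_bound = [], []
    for ie, i, e in zip(rt_ie, rt_i, rt_e):
        g_ie.append(percentile_points(ie))
        # Quantiles of the bound G_I(t) + G_E(t): pool the single-target
        # RTs and use only the fastest half (100 of the 200 trials here).
        pooled = np.sort(np.concatenate([i, e]))
        g_bound.append(percentile_points(pooled[: len(pooled) // 2]))
    g_ie, g_bound = np.vstack(g_ie), np.vstack(g_bound)
    # Paired two-tailed t tests at each percentile; the inequality is
    # violated wherever redundant-target quantiles are reliably *faster*
    # (i.e., smaller) than the bound quantiles.
    t, p = stats.ttest_rel(g_ie, g_bound, axis=0)
    return g_ie.mean(axis=0), g_bound.mean(axis=0), t, p
```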

Graphic representations of the distributions were constructed using group RT distributions obtained by averaging the individual RT distributions (Ulrich et al., 2007). When the CDFs are plotted, the independent-race model requires that the CDF of the redundant-target trials be below and to the right of the summed CDF.

Results

Experiment 1: Identity and sad expressions

The accuracy performance is displayed in Table 1.

Table 1 Percentages of errors for redundant targets (IE), identity targets (I), emotional expression targets (E), and the three nontarget faces (NTs) in Experiments 1–3

A one-way repeated measures ANOVA showed that the differences between the errors for redundant targets and for either single target were reliable [F(2, 22) = 6.2, p < .05, η² = .32; t(11) = 3.5, p < .05, d = 0.37; t(11) = 4.1, p < .05, d = 0.41]. Participants showed high sensitivity to images containing target signals (d′ = 3.64).
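The text reports sensitivity (d′) and, in later experiments, response bias (β), but not the formulas used. Under the standard equal-variance Gaussian signal detection model, which we assume here, these would be computed as in the following sketch (the hit and false-alarm rates shown are illustrative placeholders, not reported values):

```python
from scipy.stats import norm

def sdt_measures(hit_rate, fa_rate):
    """d' and beta under the equal-variance Gaussian SDT model (assumed;
    the text does not state how sensitivity and bias were computed)."""
    z_hit, z_fa = norm.ppf(hit_rate), norm.ppf(fa_rate)
    d_prime = z_hit - z_fa
    beta = norm.pdf(z_hit) / norm.pdf(z_fa)  # likelihood ratio at the criterion
    return d_prime, beta

# Illustrative rates only; the experiments report d' and beta directly.
d_prime, beta = sdt_measures(hit_rate=0.97, fa_rate=0.02)
print(f"d' = {d_prime:.2f}, beta = {beta:.2f}")
```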

The results of the redundant-target effect analysis showed that overall the redundancy effect did occur: The redundant condition was faster (M = 536.3, SD = 85.8) than the fastest single-target condition (in this case, the emotional expression target; M = 664.7, SD = 85.6). A one-way repeated measures ANOVA with Bonferroni correction for multiple comparisons showed a reliable difference between the mean RTs for redundant targets and both the identity-defined target (M = 669.9, SD = 99.4) and the emotional expression target [F(2, 22) = 75.03, p < .001, η² = .69; t(11) = 8.8, p < .001, d = 0.55; t(11) = 11.6, p < .001, d = 0.72].

The CDF for redundant targets exceeded the sum of the CDFs for the emotional expression and identity targets at the first nine quantiles (all ps < .05; Fig. 2).

Fig. 2

Cumulative distribution functions (CDFs) for single targets (I and E), redundant targets (IE), and the sum of the distributions of the identity and emotional expression targets (I + E) in Experiment 1

Experiment 2: Identity and angry expressions

The overall percentage of errors was 4.51 (Table 1). Errors were greatest with the emotional expression target. Participants showed a conservative response bias (β = 1.54) but good discrimination between images containing target information and those in which the target was absent (d′ = 3.22). A one-way repeated measures ANOVA revealed a reliable difference between the error rates for identity targets relative to the emotional expression targets and the redundant-target stimulus [F(2, 22) = 7.1, p < .05, η² = .58; t(11) = 4.3, p < .05, d = 0.37; t(11) = 3.8, p < .05, d = 0.36].

The redundant condition was faster (M = 520.9, SD = 69.7) than the conditions with either an emotional expression target (M = 683.8, SD = 132.6) or an identity-defined target (M = 648, SD = 105.3). A one-way repeated measures ANOVA with Bonferroni correction for multiple comparisons revealed reliable RT differences between redundant-target trials and trials with either the emotional expression or the identity target [F(2, 22) = 50.4, p < .001, η² = .81; t(11) = 7.4, p < .001, d = 0.64; t(11) = 10.2, p < .001, d = 0.76].

The CDF for redundant targets exceeded the sum of the CDFs for the identity and emotional expression targets at the first eight quantiles (all ps < .05; Fig. 3).

Fig. 3

Cumulative distribution functions (CDFs) for single targets (I and E), redundant targets (IE), and the sum of the distributions of the identity and emotional expression targets (I + E) in Experiment 2

Experiment 3: Identity and neutral expressions

The overall percentage of errors was 2.09 (Table 1). Participants showed a conservative response bias (β = 1.4) and good discrimination (d′ = 3.94). A one-way repeated measures ANOVA showed that the differences between the errors for redundant targets and for each single target were not reliable [F(2, 22) = 1.24, p > .05, η² = .1; t(11) = 1.45, p > .05, d = 0.06 (for target identity); t(11) = 1.12, p > .05, d = 0.1 (for target emotional expression)].

For RTs, a one-way repeated measures ANOVA failed to reveal a significant difference between the different targets [redundant, identity, and emotional expression; F(2, 22) = 1.67, p > .05, η² = .08]. The RTs are displayed in Table 2. The CDF for redundant targets failed to exceed the sum of the CDFs for the two single targets at any quantile (Fig. 4).

Table 2 Mean reaction times (RTs) of responses to redundant (IE), identity (I), and emotional expression (E) targets in Experiment 3
Fig. 4

Cumulative distribution functions (CDFs) for single targets (I and E), redundant targets (IE), and the sum of the distributions of the identity and emotional expression targets (I + E) in Experiment 3

Comparisons across Experiments 1–3

A one-way repeated measures ANOVA compared the sizes of the redundancy gains across Experiments 1, 2, and 3. The sizes of the redundancy gains differed significantly across experiments [F(2, 34) = 37.75, p < .001, η² = .72]. The redundancy gain in Experiment 3 (M = –16.9, SD = 44.9) was reliably smaller than those in Experiments 1 (M = 88.57, SD = 34.74) and 2 (M = 124.8, SD = 44.09) (both ps < .001, Bonferroni corrected). There was no difference between the sizes of the redundancy gains in Experiments 1 and 2 (p > .09).

Discussion

In Experiments 1 and 2, responses to redundant targets were faster than responses to the targets defined by identity or emotional expression alone. This is consistent with findings from prior experiments using simple stimuli, in which performance was facilitated when two targets rather than one were present in the display (Grice & Reed, 1992; Miller, 1986; Mordkoff & Yantis, 1991). The present results show for the first time, though, that identity and emotional expression can combine to facilitate discrimination performance. Particularly striking is our finding that violations of the Miller (1982) inequality occurred when structural identity was combined with a specific, universal emotional expression in a single target. This test provides a strict assessment of whether discrimination performance can be accounted for by independent processing of the critical, target-defining properties. Our evidence indicates that it cannot.

Violation of the Miller inequality occurred for combinations of identity and a sad expression (Exp. 1) and identity and an angry expression (Exp. 2), but not for the combination of identity and a neutral expression (Exp. 3). Indeed, in the last case there was no evidence for any overall redundancy gain. This result suggests that viewing a distinct emotional expression (e.g., sad or angry) paired with the target identity may benefit recognition because these emotions are conveyed by distinct visual features. In contrast, unfamiliar faces bearing a neutral expression do not carry expression-contingent features, and a neutral expression may be defined by the absence of a universal emotional expression, making it more idiosyncratic to the particular face. For these reasons, there may be no redundancy gain when the neutral expression of one face combines with the structural identity of another target face.

Our data can at least partly explain why emotional expression may help in identity recognition, if the two dimensions combine to form a unitary identity–expression code (e.g., D’Argembeau, Van der Linden, Etienne, & Comblain, 2003). Positive effects of emotional expressions on face identification performance have also been demonstrated by de Gelder, Frissen, Barton, and Hadjikhani (2003). These authors reported that, for patients with impaired configural processing, the matching of face identities was improved dramatically when the faces had emotional rather than neutral expressions. De Gelder et al. suggested that their result arose because emotional expressions provided the patients with additional facial cues that made recognition of the person more efficient.

Experiment 4: Tests of feature overlap

Coactive versus interactive processing of facial identity and emotional expression

Having established that superadditive redundancy gains occur between face identity and emotional expression, at least when facial expressions convey distinct emotions, in Experiment 4 we went on to test whether identity and emotional expression information from faces are processed coactively or interactively. The interactive-race model (Mordkoff & Yantis, 1991) assumes that the processing of one target can be made dependent on the occurrence of the second target and of nontarget information at different stages of target identification. One factor is that greater predictability of one stimulus from another should speed RTs (the interstimulus contingency, or ISC, effect). A second factor is a nontarget response contingency bias (NRCB), which refers to the possible use of attributes of nontargets to cue responses to targets. For example, in Experiment 1, participants might have associated the target identity with a sad expression in the redundant face and used emotional expression cues alone for “target present” responses (or vice versa, for the face identities). In this case, increasing the probability of the combination of a sad expression with the target identity would shorten RTs on redundant trials. Another factor that might benefit redundant-target trials over the identity and emotional expression target trials is that each single-target trial included nontarget information. For instance, in Experiment 1, the identity-defined target face contained a nontarget happy expression, and the emotional expression target face contained nontarget identity information (see the Appendix). According to the interactive-race model (Mordkoff & Yantis, 1991), including such stimuli in the experimental design slows RTs on the single identity and emotional expression target trials, because responses on these trials are biased by the nontarget properties that are present.

The interactive-race model (Mordkoff & Yantis, 1991) holds that (a) if both the ISC and the NRCB are zero (i.e., there are equal numbers of trials for all stimuli), the independent-race model inequality expressed in Inequality 1 will always be satisfied, whereas (b) if the ISC and NRCB are positive (i.e., the probability of redundant-target or nontarget trials is higher than the probability of either type of single-target trial), the inequality will be violated. In contrast, the coactivation model (Miller, 1982) assumes that variation in the ISC and NRCB does not affect the redundancy gain, and violation of the independent-race model should remain relatively constant across these conditions.

To test the effects of ISC and NRCB on RTs, three settings (Exps. 4a, 4b, and 4c) were designed using go/no-go tasks. These experiments were based on the same stimuli, timing parameters, trial order, and response demands, differing only in the probabilities with which the stimuli were displayed. In Experiment 4a, both ISC and NRCB were zero; in Experiment 4b, ISC was positive; and in Experiment 4c, NRCB was positive (see Table 3 in the Design and Procedure section).

Table 3 Numbers of trials for the redundant target (IE), the identity target (I), the emotional expression target (E), and the three nontarget faces (NT1, NT2, and NT3) in Experiments 4a, 4b, and 4c

Method

Participants

Three groups of 15 undergraduate students (13 males total) were recruited. The participants were between 19 and 26 years of age, and all reported normal or corrected-to-normal eyesight.

Stimuli and apparatus

The stimuli from Experiment 1 were used (see the Appendix).

Design and procedure

A go/no-go task was employed to examine the effects of interstimulus contingency and nontarget response bias on identity and emotional expression judgments. Half of the trials used stimuli containing at least one target attribute (target identity, target emotional expression, or both targets; “go” trials). On the other half of the trials, the stimuli did not convey any target attribute (“no-go” trials).

Participants were randomly assigned to one of the three experiments. They were asked to respond as quickly and accurately as possible when the target identity and/or emotional expression was displayed by pressing a “target present” button on the keyboard. The targets were Person 1 expressing a sad emotion (redundant targets); Person 1 with a happy expression (target identity and nontarget expression); and Person 2 with a sad expression (target expression and nontarget identity). Faces of Person 2 with a happy expression, Person 3 with a neutral expression, and Person 4 with a neutral expression (nontarget identity and nontarget emotional expression) were employed as three nontargets.

Participants completed an initial practice block of 60 trials, during which they were given feedback on their accuracy after each trial. After completing the practice block, participants in Experiments 4b and 4c were informed which images would be displayed more often. After a short break, the participants performed a test block of 600 trials. The stimulus contingencies for each experiment are displayed in Table 3.

Each trial started with the presentation of a fixation cross at the center of the screen for 500 ms. Images were presented successively in random order. On “go” trials, the image was displayed until a response was made. On “no-go” trials, the stimulus was displayed for 1,500 ms.

Analysis of data

For each experiment, two analyses were conducted. The first analysis determined whether redundant-target trials were responded to more quickly than single-target trials, using the fixed-favored-dimension test (Biederman & Checkosky, 1970). The second analysis assessed whether the independent-race model inequality was violated (Miller, 1982). Both analyses were conducted in the same manner as in Experiment 1. To examine the effects of the ISC and NRCB on violations of the independent-race model inequality, a one-way ANOVA was performed.

Results

In all three experiments, participants produced more errors in response to the single targets than to redundant targets. The highest error rate was for “no-go” trials to the nontarget face (NT1) containing both identity and emotion distractors (Table 4). Participants tended to show a liberal response bias (β = 0.19, 0.3, and 0.54, respectively, in the three experiments) and good discrimination (d′ = 3.54, 3.31, and 3.41, respectively). The differences in error rates between the experiments were not significant [F(2, 42) = 1.94, p > .12, η² = .06]. There was a significant main effect of target [F(2, 84) = 11.6, p < .05, η² = .79]. Pairwise Bonferroni-corrected comparisons showed that errors for nontarget face NT1 were reliably higher than those for all other stimuli (p < .05). There was no interaction between target and experiment [F(4, 84) = 1.92, p > .05, η² = .1].

Table 4 Error rates (as percentages) for redundant (IE), identity (I), and emotional expression (E) targets and for nontarget faces NT1, NT2, and NT3 in Experiments 4a, 4b, and 4c

A redundant-target effect was found in all three experiments (Table 5).

Table 5 Mean reaction times (RTs) to redundant (IE), emotional expression (E), and identity (I) targets in Experiments 4a, 4b, and 4c

A mixed-design ANOVA with Experiment as a between-subjects factor and Target (redundant, identity, and emotional expression) as a within-subjects factor was carried out to examine whether RTs for redundant targets were shorter than those on single-target identity and emotional expression trials. We found a main effect of target [F(2, 84) = 18.8, p < .001, η² = .83]. Pairwise Bonferroni-corrected comparisons showed that RTs were faster for redundant targets than for either type of single target (Table 5) in all experiments (ps < .001). There was no main effect of experiment [F(2, 42) = 1.55, p > .2, η² = .12] and no interaction between experiment and target [F(4, 84) = 1.75, p > .05, η² = .07].

All three experiments showed significant violations of the independent-race model inequality. In Experiments 4a and 4b, the independent-race model inequality was violated at the 5th to the 35th percentiles (all ps < .05). In Experiment 4c, violations were found at the 5th to 45th percentiles (all ps < .05). The group CDFs for redundant targets and for the sum of target identity and emotional expression targets in Experiments 4a–4c are displayed in Fig. 5.

Fig. 5

Cumulative distribution functions (CDFs) for single targets (I and E), redundant targets (IE), and the sum of the distributions of the identity and emotional expression targets (I + E) in Experiments 4a (top left), 4b (top right), and 4c (bottom)

A univariate one-way ANOVA with Experiment as the between-subjects factor was used to test whether there were differences in the sizes of the redundancy gains across Experiments 4a–4c (Table 6). The sizes of the redundancy gains were calculated by subtracting the RTs for redundant targets from those for the faster of the single targets at each percentile. There was no effect of experiment on the size of the redundancy gain [F(2, 44) = 0.46].

Table 6 The size of the redundancy gain and standard deviation (in parentheses) in Experiments 4a, 4b, and 4c

Discussion

Experiments 4a–4c demonstrated that manipulations of interstimulus and nontarget contingencies did not affect the redundancy gains between facial identity and emotional expression. Experiment 4a showed a reliable violation of the race model inequality using a design that lacked biased contingencies. A similar result was obtained when the contingency was biased in favor of redundant-target trials (Exp. 4b) and of nontarget trials (Exp. 4c). Moreover, there were no differences between the sizes of the violations in Experiments 4a–4c. These results contradict the interactive-race model (Mordkoff & Yantis, 1991). The maintenance of significant violations of the independent-race model across all the subexperiments is consistent with a coactivation account.

Notably, participants in Experiment 4b—in which the redundant targets had a higher probability of occurrence than did either single target—were less accurate than those in Experiment 4a. According to the interactive-race model (Mordkoff & Yantis, 1991), the positive interstimulus contingency should improve accuracy in response to redundant targets. However, this was not the case. This provides additional support for the coactivation account and again counters the interstimulus contingency proposal.

In all subexperiments, a nontarget face containing both distractors (NT1) elicited more false alarms than did the other stimuli. Although the percentages of errors were not very high, this finding suggests that participants could not ignore the task-irrelevant information completely. On the other hand, this effect was observed in a go/no-go task, but not in a two-choice task (Exp. 1). Given that participants showed different response biases in Experiments 1 and 4, this might partly reflect a difference in nondecision processes in these tasks (Grice & Reed, 1992; Perea, Rosa, & Gómez, 2002).

Experiment 5: Inverted faces

Pictorial versus structural coding of facial identity and emotional expression

Although Experiments 1, 2, and 4 demonstrated significant redundancy gains when participants responded to both structural identity and emotional expression in faces, it remains unclear what information was used for the task. It is possible, for example, that participants remembered the pictorial properties of specific targets and distinguished the faces on the basis of these cues (Bruce & Young, 1986). It is not necessarily the case that responses were based on the true extraction of facial identity and emotion information. Many previous studies of face perception have shown that recognition of the structural properties of faces can be dramatically disrupted when faces are displayed upside down, in comparison with upright presentation (Freire, Lee, & Symons, 2000; Leder & Bruce, 2000; Searcy & Bartlett, 1996). The specific effect of inversion on identity processing has been attributed to the disruption of coding of the configural relations between facial features (e.g., the distances between the eyes, nose, and mouth; Searcy & Bartlett, 1996; but see Sekuler, Gaspar, Gold, & Bennett, 2004, for an alternative view), and a similar argument can be made for emotional expression processing. For instance, McKelvie (1995) reported that inversion reduced accuracy for discriminating sadness, fear, anger, and disgust, with sad expressions being identified as neutral.

In Experiment 5, we tested for redundancy gains with inverted faces. If the redundancy gains depend on structural encoding of facial configurations, then the gains should be eliminated here. On the other hand, since the pictorial cues remain the same whether faces are upright or inverted, if the redundancy gains in Experiment 5 are based on pictorial cues, they should match those we have observed earlier.

Method

Participants

A group of 12 undergraduate students (three males total) between 21 and 26 years of age participated in this study, and received credits for participation. All participants reported normal or corrected-to-normal vision.

Stimuli and apparatus

A set of inverted images from Experiment 1 was employed. This set included sad and happy faces, which should give the maximum opportunity for expression information to be processed in the inverted faces.

Design and procedure

The design and procedure were identical to those of Experiment 1, except that the faces were inverted.

Results and discussion

The overall percentage of errors was 24.5. The participants were less accurate in responding to emotional expression than to redundant and identity targets (Fig. 6). False alarms to one of the inverted nontargets, NT3 (see the Appendix), occurred on 50.2 % of all trials. Thus, participants overall showed low sensitivity to images containing targets (d′ = 1.31).

Fig. 6

Error rates (as percentages) for redundant (IE), identity (I), and emotional expression (E) targets and for nontarget faces NT1, NT2, and NT3 in Experiment 5

A one-way repeated measures ANOVA with Condition (IE, I, E, NT1, NT2, or NT3) as a within-subjects factor was conducted to assess whether accuracy differed across the conditions. There was a main effect of condition on accuracy [F(5, 55) = 6.55, p = .001, η² = .61]. Pairwise comparisons within the main effect of condition (corrected using a Bonferroni adjustment) indicated reliable differences between redundant targets (IE) and both nontarget NT3 (p < .05) and the emotional expression target (E) (p < .05). The mean RTs for all of the conditions in Experiment 5 are displayed in Table 7.

Table 7 Mean reaction times (RTs) for redundant (IE), identity (I), and emotional expression (E) targets and for nontarget faces NT1, NT2, and NT3 in Experiment 5

A one-way repeated measures ANOVA with Target (redundant, identity, and emotional expression targets) as a within-subjects factor was carried out for the RTs. There was a main effect of target on RTs [F(2, 20) = 5.1, p < .05, η² = .57]. Pairwise Bonferroni-corrected comparisons showed that participants were faster in responding to the identity target than to either the redundant target (p = .001) or the emotional expression target (p < .05) (Table 7).

Two mixed-design ANOVAs were conducted with Experiment (Exps. 1 and 5) as a between-subjects factor and Condition (redundant, identity, and emotional expression targets and the three nontargets) as a within-subjects factor, for accuracy and RTs. For accuracy, we found main effects of experiment [F(1, 22) = 48.5, p < .001, η² = .91] and condition [F(5, 110) = 5.4, p < .001, η² = .74] and a reliable interaction [F(5, 110) = 5.2, p < .05, η² = .77]. For RTs, there were main effects of experiment [F(1, 22) = 19.98, p < .001, η² = .86] and condition [F(5, 110) = 2.54, p < .05, η² = .42], as well as an interaction [F(5, 110) = 12.8, p < .001, η² = .67]. Pairwise comparisons using Bonferroni adjustments showed that the RTs for redundant targets in Experiment 1 (M = 536.3, SD = 85.8) were faster than those for the same targets in Experiment 5 (M = 961.5, SD = 133.6; p < .05).

The manipulation of inversion in Experiment 5 produced longer RTs and reduced response accuracy, consistent with there being a decreased sensitivity for target signals. This finding is in line with previous studies on inverted faces (Freire et al., 2000; Searcy & Bartlett, 1996). Notably, the image containing redundant targets and image NT3 had similarly shaped faces (see the Appendix). When the faces were inverted, this similarity might have been a cause of the poor accuracy on NT3, because inversion impairs the recognition of internal facial features (Sekuler et al., 2004).

It could be argued that the low performance on the discrimination task here minimized any opportunity for redundancy gains to arise. For example, if participants could not discriminate facial emotion, then naturally the emotion would not facilitate responses to target identity. However, while accuracy did decrease here, it remained considerably higher than would be expected in terms of chance responses to one of the six stimuli (16.7 %). Hence, there was some opportunity for facial emotion still to affect responses to face identity, but we found no evidence for this.

Taken together, the poor discrimination of target signals and the higher error rates in response to both targets and nontargets suggest that structural encoding (which is sensitive to face inversion) contributed to the redundancy gains observed with upright faces, and that the effects were not solely dependent on pictorial encoding (which is common to upright and inverted faces).

General discussion

The experiments reported here demonstrated redundancy gains in the processing of facial identity and emotional expression. In Experiments 1, 2, and 4, there was evidence for violation of the Miller (1982) inequality, consistent with coactivation of identity and emotion information in faces. These violations occurred with different face identities and with the emotional expressions for sadness and anger. These data contradict independent-processing models for identity and emotion. Experiment 4 further showed that the effects were not dependent on interstimulus contingencies and nontarget associations, going against the interactive-race model (Mordkoff & Yantis, 1991) account of the data. We concluded that facial identity and emotion information are processed together and that both contributed here, in a nonindependent manner, to target detection.

Experiment 5 tested whether performance was dependent on pictorial or structural coding of faces by examining target detection when faces were inverted. The effects of redundancy on RTs were eliminated in this case. The effects were also eliminated in Experiment 3, where face identity was combined with a neutral facial emotion to create the redundant target. These last results suggest that the redundancy gains were not due to memory of the pictorial properties of the stimuli, but that there needs to be a specific expressed emotion in order for facial information to be processed coactively. In contrast to facially expressed emotions such as sadness and anger, neutral facial expressions may vary across individuals and may be difficult to extract as a common category from different faces—and as a consequence, redundancy gains are difficult to find.

An important aspect of our findings is that the redundancy effect was robust across different facial identities and emotional expressions. At least two features might have contributed to this result. First, we used a task in which both the structural identity and the expressed emotion were integrated in a single stimulus. Previously, similar results have been obtained in studies examining the relation between processing the color and shape of a single stimulus (Mordkoff & Yantis, 1991). In a task requiring participants to detect two targets (e.g., the color green and the shape X), the redundant-target display (a green X) was processed faster than either single target, and violations of the independent-race model were observed (Exps. 1–3; Mordkoff & Yantis, 1991). In contrast, when participants performed a task requiring the detection of a shape and a color belonging to different objects, the data supported independent processing (Exps. 4 and 5; Mordkoff & Yantis, 1991). Second, in the present study the effect of differences in the discriminability of identity and emotional expression was controlled. The effects of discriminability on the processing of identity and emotional expression have previously been demonstrated in studies of Garner interference (Ganel & Goshen-Gottstein, 2004; Melara, Rao, & Tong, 2002). For instance, Ganel and Goshen-Gottstein showed that when the discriminability of identity and expression judgments was equated, Garner interference occurred in both directions. In contrast, in studies in which discriminability was not controlled, either no interference (Etcoff, 1984) or asymmetric interference (Schweinberger & Soukup, 1998) occurred.

The present results suggest that the redundant-target effect is caused by an interaction between facial identity and emotional expression. This raises the question of the level of processing at which this interaction occurs. The coactivation model (Miller, 1982) proposes that the interaction between stimuli leading to a redundancy gain occurs prior to a decision about target presence but after identity and emotional expression have been separately coded. In contrast, the interactive-race model (Mordkoff & Yantis, 1991) suggests that information about facial identity and emotional expression may be exchanged at early perceptual levels (interstimulus crosstalk) or at a decisional stage (nontarget response bias). There are also suggestions that coactivation for redundant targets occurs at late, motor-related stages (Giray & Ulrich, 1993; Li, Wu, & Touge, 2010). Electroencephalographic studies (e.g., Schröger & Widmann, 1998) examining the processing of bimodal (audio and visual) stimuli have indicated that RT gains for redundant targets are located neither at early, sensory-specific stages nor at motor stages, but at an intermediate, central stage of processing, consistent with the coactivation view. It is interesting to note here that the redundancy effects in Experiments 1 and 2, which used a task with two responses, were not different from those in Experiment 4, which required only a single response. This suggests that the interaction is unlikely to have occurred at a late, motor stage. Whether the effect arises at early or at more intermediate processing stages remains to be tested.

The present results make a theoretical contribution to the field by contravening a strictly modular account of processing facial identities and emotions, as has been suggested in both psychological (Bruce & Young, 1986) and neural-level models (Haxby, Hoffman, & Gobbini, 2000). According to the functional model of face recognition proposed by Bruce and Young (1986), the recognition of facial expressions and of identity are assumed to be independent. Our data refute this idea, since they show that an independent-processing model cannot account for the magnitude of the redundancy gain that we observed in the very fastest responses that were produced on a trial. The data showed that, at some point along the processing stream, facial identity and expression interact. This presents a strong constraint on models of face processing.

The neural basis of the present effects remains to be explored. Haxby et al. (2000) proposed that the occipital face area (OFA) and fusiform face area (FFA) contribute to the processing of facial identities, while the superior temporal sulcus and amygdala contribute to the processing of emotional expression. However, there is also evidence that the FFA, traditionally associated with identity processing, is activated more by fearful than by neutral faces (Vuilleumier, Armony, Driver, & Dolan, 2001), and this area has been linked to the processing of emotional expression (Ganel, Valyear, Goshen-Gottstein, & Goodale, 2005). Furthermore, several functional neuroimaging studies have reported functional connectivity between brain areas that process the identities of individuals (the OFA and FFA, along with the inferior frontal gyrus, which has been associated with stored face representations; Li et al., 2010) and emotion-processing regions (the superior temporal sulcus and amygdala; Kleinhans et al., 2008). Recently, Ishai (2008) proposed a working model of neural coupling that postulates that bidirectional connections between visual, limbic, and prefrontal regions mediate the processing of faces. Taken together, these studies indicate functional connectivity between the processing of identity and emotional expression that might have contributed to the observed redundancy gains. This question needs to be examined in future research.

In conclusion, the present study provides strong evidence for the interactive processing of identity and emotional expression in faces. However, a number of caveats need to be noted. First, although the present study overcomes some of the limitations of previous studies employing the Garner task (e.g., by matching the efficiency of discriminating identity and emotion targets), there remains the issue of the small number of stimuli used. A modification of the task is needed that minimizes repetition effects while retaining the large number of trials required to test the race model. Second, further work needs to be done to establish what information in faces is perceived as redundant. This will help us understand the stages of processing at which the coding of the structural identity of a face interacts with, or bifurcates from, the coding of emotional expression. Third, the present study examined only a limited number of emotions (angry, sad, and neutral). The evidence points to interactive processing of identity and the universal emotions of sadness and anger, but there was no evidence for interactivity involving a possibly more idiosyncratic “neutral” expression. Whether the universality of the expression is critical requires further studies exploring a wider range of facial expressions.