Introduction

There is evidence that face processing is modulated by the observer’s emotional states (Attwood et al., 2017; Curby, Johnson, & Tyson, 2012; Johnson & Fredrickson, 2005). However, since existing studies have only demonstrated this with single faces, the influence of emotional states on processing of multiple faces remains largely unknown. In this study, we set out to explore the effects of emotional states on ensemble perception of multiple faces. Ensemble perception refers to our visual system’s ability to extract summary statistics along various stimulus dimensions. This can range from low-level features to mid- and high-level information (Whitney & Yamanashi Leib, 2018). For example, observers can not only extract an average size from a set of dots of varying sizes (Ariely, 2001), they can also extract an average facial expression (Haberman & Whitney, 2007) or an average face identity (de Fockert & Wolfenstein, 2009) from a set of face images.

Prior research has shown that positive emotional states facilitate global processing, whereas negative emotional states promote local processing (Derryberry & Reed, 1998; Fredrickson & Branigan, 2005). When participants were induced with positive, negative, or neutral emotions via film clips before a holistic face-processing task, an increase in positive emotional state resulted in an increase in holistic processing, whereas a negative emotional state resulted in a decrease in holistic processing (Curby et al., 2012). Although the processing of multiple faces may differ from the processing of a single face, it is possible that a modulation of global versus local processing via emotional state also affects the averaging of multiple faces.

This prediction is based on extensive evidence that adopting a global processing orientation tends to facilitate ensemble coding (Chong & Treisman, 2005). The greater focus of East Asian culture on global information might be responsible for the stronger tendency of Easterners, relative to Westerners, to average facial expressions in a set of faces (Im et al., 2017). A recent study by Peng et al. (2020) has shown that this default tendency of Easterners to focus on global information could be temporarily suppressed by a Navon letter task that primed a local processing orientation. When a group of Chinese participants in that study were told to report the local letters in the priming task, they subsequently showed a reduced tendency to perceive the average face identity of a face set in a face identity-matching task, in which they were briefly shown a set of four faces and then indicated whether a subsequent probe face had been presented in the set.

A similar temporary shift of global or local processing could also be induced by modulating the emotional states of the observers. The purpose of this study was to explore whether induced emotional states would modulate ensemble coding of face identities. To this end, we induced participants with positive, negative, or neutral emotions via film clips and measured the effects of these on their performance in an implicit ensemble-coding task developed by de Fockert and Wolfenstein (2009). The task measured the tendency of averaging multiple face identities. It briefly presented a set of faces, and the task was to decide whether a subsequently presented probe face was present in the preceding set. The probe face could be (a) a member of the preceding face set, (b) a distractor face not presented in the set, (c) a morphed average of the preceding face set, or (d) a morphed average face of another face set. In the current study, participants first performed two blocks of this ensemble-coding task. They then viewed positive, negative, or neutral film clips. After the emotion induction, they performed another two blocks of the ensemble coding task.

Method

Participants

Ninety-six university students (66 females, 19.7 ± 1.2 years old) participated in this study. All participants were randomly assigned to one of three emotion induction conditions, with each condition containing 32 participants (Positive: Mage = 19.8 years old, SD = 1.1, Negative: Mage = 19.7 years old, SD = 1.3, Neutral: Mage = 19.6 years old, SD = 1.3). The sample size followed a similar study by Curby et al. (2012), who used 27–33 participants for each of their three conditions. Written consent was obtained from all participants. The study was approved by the local Institutional Review Board.

Materials

Film clips

These were obtained from the standardized database of Chinese emotional film clips developed by Ge et al. (2019). We used two clips for each emotion-induction condition. Just Another Pandora’s Box (67 s) and Love on Delivery (83 s) were used to induce positive emotion; City of Life and Death (73 s) and The Tokyo Trial (81 s) were used to induce negative emotion; and Raise the Red Lantern (62 s) and Black Coal, Thin Ice (65 s) were used to induce neutral emotion. The mean valence and arousal scores were, respectively, 7.37 and 7.23 for the positive clips, 1.73 and 7.05 for the negative clips, and 3.19 and 3.26 for the neutral clips. Further details can be found in Ge et al. (2019).

Face stimuli

These were derived from Peng et al. (2019). A total of 56 Asian Chinese face images with neutral expressions were randomly grouped into 14 sets of four gender-matched faces. Based on these face sets, 14 morphed average faces were created using Abrosoft FantaMorph 5.

Positive and Negative Affect Scale (PANAS)

We used the Chinese version of the PANAS (Huang, Yang, & Ji, 2003). It consists of two subscales (one for positive emotion and one for negative emotion), each containing ten words (e.g., positive: interested, excited; negative: irritable, distressed). Each word was rated on a 5-point scale, where 1 represented “not at all” and 5 “extremely.”
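The scoring of the two subscales can be illustrated with a minimal sketch. It assumes that each subscale score is the mean of its item ratings (a sum is equally common; the text does not state which was used), and the function name `score_panas` and the shortened two-item word lists in the usage line are hypothetical, chosen only for illustration.

```python
from statistics import mean


def score_panas(ratings, positive_items, negative_items):
    """Return (positive, negative) subscale scores for one participant.

    ratings: dict mapping each rated word to a value on the 5-point scale.
    ASSUMPTION: a subscale score is the mean of its item ratings.
    """
    for word, value in ratings.items():
        if not 1 <= value <= 5:
            raise ValueError(f"rating for {word!r} is outside the 1-5 scale")
    positive = mean([ratings[word] for word in positive_items])
    negative = mean([ratings[word] for word in negative_items])
    return positive, negative


# Hypothetical ratings using only the example words named in the text;
# the full scale has ten words per subscale.
example = {"interested": 4, "excited": 2, "irritable": 1, "distressed": 3}
pos, neg = score_panas(example, ["interested", "excited"], ["irritable", "distressed"])
```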

Procedure

Figure 1 illustrates the experimental procedure. It began with the first PANAS-rating task. This was followed by the first ensemble-coding task. There were five practice trials at the beginning of the task. Each trial began with a 500-ms fixation cross at the center of the display, followed by a set of four faces shown for 2,000 ms. A probe face was shown at the center of the screen immediately after this. Following de Fockert and Wolfenstein (2009), the probe could be any of four image types: matching member, matching average, nonmatching member, or nonmatching average. The participant was instructed to decide whether the probe had been “present” or “absent” in the preceding set. Pressing the “F” key indicated “present” and pressing the “J” key indicated “absent.” There was no time limit for this response. After two blocks of 56 trials, participants completed the second PANAS and watched two film clips of the same type successively. Immediately after this, they completed the third PANAS and performed another two blocks of the ensemble-coding task.
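The trial structure described above can be sketched as follows. The sketch assumes that each block crosses the 14 face sets with the four probe types once (14 × 4 = 56 trials) and presents them in random order; this arrangement is consistent with the reported block length but is not stated in the text. The names `build_block`, `PROBE_TYPES`, and `KEY_TO_RESPONSE` are hypothetical.

```python
import random
from itertools import product

PROBE_TYPES = (
    "matching member",
    "matching average",
    "nonmatching member",
    "nonmatching average",
)

# Response mapping as described in the procedure.
KEY_TO_RESPONSE = {"f": "present", "j": "absent"}


def build_block(n_sets=14, seed=None):
    """Return one shuffled block of trials.

    ASSUMPTION: every face set is paired with every probe type exactly once.
    """
    rng = random.Random(seed)
    trials = [
        {
            "face_set": face_set,
            "probe_type": probe_type,
            "fixation_ms": 500,       # fixation cross duration
            "set_duration_ms": 2000,  # four-face display duration
        }
        for face_set, probe_type in product(range(n_sets), PROBE_TYPES)
    ]
    rng.shuffle(trials)
    return trials


block = build_block(seed=1)
```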

Fig. 1 Schematic representation of the experimental procedure

Results

Emotion-induction check

Table 1 shows the mean ratings of all three PANAS. Analyses of the first PANAS data showed no difference among the three groups, either on positive emotions, F (2, 93) = .47, p = .63, or on negative emotions, F (2, 93) = 1.73, p = .18. Similarly, the second PANAS data showed no difference among the three groups, either for positive emotions, F (2, 93) = .23, p = .79, or for negative emotions, F (2, 93) = .06, p = .94. However, the third PANAS results, collected after viewing the emotion-induction film clips, showed a significant effect of induction on both positive emotions, F (2, 93) = 107.38, p < .001, and negative emotions, F (2, 93) = 462.81, p < .001. The group viewing positive film clips showed more positive feelings than the groups viewing neutral clips (Diff = .52, 95% CI = [.29, .75]), t = 4.50, p < .001, or negative clips (Diff = 1.66, 95% CI = [1.43, 1.89]), t = 14.32, p < .001. Likewise, the group viewing negative film clips showed more negative feelings than the groups viewing neutral clips (Diff = .95, 95% CI = [.88, 1.03]), t = 25.91, p < .001, or positive clips (Diff = .98, 95% CI = [.91, 1.06]), t = 26.76, p < .001.
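For readers less familiar with the notation, the degrees of freedom in F (2, 93) follow from a one-way comparison of three groups of 32 participants (3 − 1 = 2 and 96 − 3 = 93). The sketch below computes a standard one-way between-groups F ratio in plain Python; it is a generic illustration, not the analysis script used in the study, and the function name `one_way_anova` and the toy data are hypothetical.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way between-groups ANOVA.

    groups: list of lists, one list of scores per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variability of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variability of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Toy data only (not the study's ratings): three groups of three scores.
f_ratio, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```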

Table 1 Mean PANAS ratings under each emotion induction condition. Values in parentheses represent standard deviations

Ensemble coding performance

The descriptive statistics of “present” responses as a function of induction and probe type are shown in Table 2. To measure the strength of visual averaging, we followed Rhodes et al. (2015) and calculated an unbiased index of endorsement for averages and for members by subtracting the proportion of “present” responses on nonmatching trials from the proportion of “present” responses on matching trials in the table.
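The endorsement index described above can be sketched in a few lines. The sketch assumes that each trial is recorded as a (probe type, response) pair; the function names and the example responses are hypothetical and are not data from the study.

```python
def present_rate(responses):
    """Proportion of 'present' responses in a list of responses."""
    return sum(r == "present" for r in responses) / len(responses)


def endorsement_scores(trials):
    """Return endorsement scores for set averages and set members.

    trials: iterable of (probe_type, response) pairs, where response is
    'present' or 'absent'. Each score is the 'present' rate on matching
    trials minus the 'present' rate on nonmatching trials.
    """
    by_type = {}
    for probe_type, response in trials:
        by_type.setdefault(probe_type, []).append(response)
    rate = {ptype: present_rate(resp) for ptype, resp in by_type.items()}
    return {
        "average": rate["matching average"] - rate["nonmatching average"],
        "member": rate["matching member"] - rate["nonmatching member"],
    }


# Hypothetical responses for one participant (four trials per probe type).
example_trials = (
    [("matching average", r) for r in ["present", "present", "present", "absent"]]
    + [("nonmatching average", r) for r in ["present", "absent", "absent", "absent"]]
    + [("matching member", r) for r in ["present"] * 4]
    + [("nonmatching member", r) for r in ["absent"] * 4]
)
scores = endorsement_scores(example_trials)
```

A score near zero indicates that a probe type was endorsed as often when it did not match the preceding set as when it did, so the subtraction removes any general bias toward responding “present.”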

Table 2 The mean proportion of “present” responses and endorsement scores as a function of induction and test conditions. Values in parentheses represent standard deviations

We next combined the results of both ensemble tasks in a three-way ANOVA, adding Induction Status (Pre- vs. Post-Induction) as the third variable. The results revealed significant main effects of Group, F (2, 93) = 10.00, p < .001, η² = .10, Probe Type, F (1, 93) = 47.90, p < .001, η² = .21, and Induction Status, F (1, 93) = 3.92, p = .049, η² = .02. These were qualified by both two-way and three-way interactions: Group × Induction Status, F (2, 93) = 11.58, p < .001, η² = .11, Group × Probe Type, F (2, 93) = 31.69, p < .001, η² = .25, Probe Type × Induction Status, F (1, 93) = 6.88, p = .009, η² = .04, and Group × Probe Type × Induction Status, F (2, 93) = 21.37, p < .001, η² = .19. Two major findings were obtained from the simple effect analyses. First, before emotion induction, there was no group difference for either set average or set member (see Fig. 2A). In contrast, after emotion induction, group differences were found for endorsements of set average (Fig. 2B). Specifically, after viewing negative film clips, participants showed decreased endorsements of set average (M = .22) relative to those who responded to the same set average after viewing either positive film clips (M = .48), p < .001, 95% CI = [-.32, -.20], or neutral ones (M = .43), p < .001, 95% CI = [-.27, -.15]. Second, when endorsement scores were compared across the pre- and post-emotion induction conditions, it was evident that only the groups that received positive or negative emotion induction changed their endorsement level for set average. The group receiving neutral emotion induction showed no change in these scores (Mpre = .45, Mpost = .43), but the group receiving positive emotion induction showed increased endorsement scores for set average (Mpre = .41, Mpost = .48), p = .005, 95% CI = [.02, .12], whereas the group receiving negative emotion induction showed decreased endorsement scores for set average after emotion induction (Mpre = .41, Mpost = .22), p < .001, 95% CI = [-.24, -.14].

Fig. 2 Endorsement scores as a function of induction group and probe type. (A) Pre-emotion induction. (B) Post-emotion induction. Error bars represent one standard error of the mean. * p < .05, ** p < .01, *** p < .001

Taken together, the group-level data suggested a change of ensemble coding performance as a result of induced emotional state. A positively induced emotion increased whereas a negatively induced emotion decreased ensemble coding of multiple face identities.

Correlation analyses

To further explore the relationship between changes in emotional states and ensemble coding performance at an individual level, we conducted correlation analyses. We excluded the data of the neutral emotion group from these analyses because participants’ PANAS ratings were unchanged before and after they were shown neutral film clips. The changes in emotional state after viewing the film clips were calculated as the difference between the second and the third PANAS scores. The data from the positive and negative emotion-induction conditions were pooled to keep the full range of values of changes both in ensemble coding performance and in reported emotion. As shown in Fig. 3, changes in emotional states after emotion induction were correlated with changes in ensemble coding. An increase of positive emotional state after viewing positive film clips was accompanied by increased endorsement for the set average, r (62) = .70, p < .001 (Fig. 3A), whereas an increase of negative emotional state was accompanied by decreased endorsement for the set average, r (62) = -.75, p < .001 (Fig. 3B). We performed the same analysis for endorsement scores of the set member, which were uncorrelated with the changes of either positive or negative emotional state, rs (62) ≤ -.15, ps ≥ .22.
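The individual-level analysis amounts to correlating two change scores per participant. The sketch below assumes that each change score is the post-induction value minus the pre-induction value and uses the standard Pearson formula; with the 64 pooled participants (two groups of 32), the degrees of freedom are 64 − 2 = 62, matching the reported r (62). The function names and the toy numbers are hypothetical.

```python
from math import sqrt


def change_scores(pre, post):
    """Per-participant change. ASSUMPTION: change = post minus pre."""
    return [after - before for before, after in zip(pre, post)]


def pearson_r(x, y):
    """Return (r, df) for two equal-length lists, with df = n - 2."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy), n - 2


# Toy values only (not the study's data): emotion change vs. endorsement change.
emotion_change = change_scores([2.0, 2.5, 3.0, 3.5], [3.0, 4.5, 6.0, 7.5])
endorsement_change = [2.0, 4.0, 6.0, 8.0]
r, df = pearson_r(emotion_change, endorsement_change)
```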

Fig. 3 Relationship between changes in endorsement scores of set average and changes in reported emotional state. (Upper, A) After viewing positive film clips. (Lower, B) After viewing negative film clips. Black dots represent positive emotion induction in A and black triangles represent negative emotion induction in B

Discussion

The present study explored the effects of emotional states on the visual averaging of multiple face identities. Emotional states were induced with positive, negative, or neutral film clips. Emotional states were measured by PANAS and visual averaging was examined in an ensemble perception task before and after the induction. The results showed increased averaging of face identities after being induced with positive emotion, but reduced averaging of these after being induced with negative emotion. Results at the individual level also revealed that averaging was positively correlated with the increase of positive emotion, but negatively correlated with the increase of negative emotion. These findings provide initial evidence that emotional states can modulate ensemble perception.

Our results also revealed that although both positive and negative emotion modulated ensemble face perception, the size of their effects was not equal. Viewing positive film clips only slightly increased the endorsement of the set average in the ensemble perception task. In contrast, viewing negative film clips substantially reduced the endorsements of set average. The effect of positive emotion was thus considerably smaller than the effect of negative emotion. The asymmetry was similar to a finding in Peng et al. (2019), where little or no increase of endorsements of the set average was observed after the participants were primed to process relational/holistic information, yet these endorsements clearly decreased when they were primed to process non-relational/individual information. The unequal effects could in part be due to the cultural background of the samples in these studies. Both tested Chinese participants, who were likely to be influenced by an East Asian collectivist cultural tradition. It is known that Eastern culture cultivates a more relational and holistic processing orientation. If this were already their default processing strategy, it should be hard to strengthen it even further by priming. On the other hand, it could be easier to discourage the use of the default processing orientation by priming a local processing orientation.

Similar unequal effects of positive and negative emotion were reported by Nobata, Hakoda, and Ninose (2010), who showed that the functional field of view became narrower when participants were shown negative emotional stimuli, but unchanged when participants were shown positive or neutral emotional stimuli. Future research could verify whether these common findings could be explained by cultural influences.

The effects in our study demonstrate that the same participants could undergo a fairly minor induction and subsequently display different ensemble processing. This malleable aspect of ensemble perception has also been demonstrated previously through priming an interdependent or independent self (Peng et al., 2019), or through priming a global or local processing orientation (Peng et al., 2020). In addition, the fluid nature of ensemble perception is also reflected by the fact that different observers (e.g., from different ethnic groups) can employ various levels of ensemble coding when looking at the same display (Im et al., 2017; Peng et al., 2020, 2021; Thornton et al., 2019). Developmental and pathological studies have also presented evidence that ensemble perception is not fixed or uniform among individuals (Rhodes et al., 2015, 2018). The correlation analyses in our study are consistent with this literature, showing that ensemble processing is not context invariant and can vary substantially across individuals.

Overall, our results lend further support to the broaden-and-build theory (Fredrickson & Levenson, 1998). According to this theory, positive emotion broadens the scope of attention, facilitating holistic processing, whereas negative emotion narrows the scope of attention, leading to more local processing. This has been verified in various tasks (Curby et al., 2012; Fredrickson & Branigan, 2005; Johnson, Waugh, & Fredrickson, 2010; Vanlessen et al., 2013). Our findings extend such verification to the realm of ensemble perception.