Communicating pain enables individuals to receive physical and social support and may help them obtain protection against potential and subsequent dangers (Prkachin et al., 1983). However, for pain to be communicated efficiently, the behavior of the person in pain must correspond, at least to a certain degree, to the observer’s expectations of how a person in pain looks (Gosselin & Schyns, 2002). In the present study, we used two data-driven methods to investigate how culture may modulate an observer’s expectations about the intensity with which pain should typically be facially conveyed and how it may modulate the visual strategies deployed to process the intensity of pain facial expressions.

Although pain can be communicated in various ways, such as verbal reports, vocalization, and body movements, facial expressions are considered to be the most specific pain behavior in humans (Williams, 2002). Many studies have shown that cultural differences decrease the correspondence between the facial configurations individuals produce when expressing emotions and observers’ expectations. However, most of these studies only involved facial expressions of basic emotions, such as anger, fear, sadness, happiness, disgust, and surprise (e.g., Elfenbein & Ambady, 2002; Jack et al., 2009; Jack, Caldara et al., 2012; Jack, Garrod et al., 2012), leaving a very sparse body of research regarding the possible impact of culture on the configuration of pain facial expressions (Williams, 2002). The few studies addressing that question showed similar configurations for pain expressions across Western and East Asian cultures (Chen et al., 2018; Cordaro et al., 2018; Cowen et al., 2021). These configurations involved the same set of traits previously reported, in studies involving only Western samples, as being frequently observed during a painful experience: brow lowering, eyes narrowing, and nose wrinkling with upper lip raising (Kunz et al., 2019; Prkachin, 1992; Prkachin & Salomon, 2008).

Beyond facial configuration, culture may also modulate the intensity of facial expressions, as cultural differences in emotion regulation have been reported (Matsumoto et al., 2008). Variations in the typical intensity with which pain is facially conveyed in a given cultural group are important to study, because they may impact the ability to decode pain expressions in another group. For instance, an observer may expect an individual in pain to display pain with intense facial expressions, but if the individual has learned to display subtle expressions, the observer may fail to detect their pain or inaccurately conclude that it is of low intensity. It is easy to imagine the consequences that could result from an inability to adequately communicate pain to another. Hence, understanding how pain expressions vary, in terms of expected intensity, across different cultures is important and may help promote equal treatment for all. To gain a better understanding of the potential cultural variability in both the expectations regarding pain facial expression intensity and the use of visual information during pain intensity decoding, we compared Chinese and Canadians, two groups that are respectively part of the broader East Asian and Western cultures. These two groups were chosen because their geographic distance decreases the likelihood of cultural overlap. Moreover, the aforementioned studies comparing the configuration of pain facial expressions across cultures also involved these two groups (Chen et al., 2018; Cordaro et al., 2018).

In experiments 1 and 2, expectations regarding the typical intensity of pain expressions were measured by comparing the mental representations of East Asian and Western observers. Mental representations reflect the types of facial expressions one has encountered in their social environment and therefore make it possible to visualize an individual’s expectations regarding the outer world (Jack, Garrod et al., 2012). In experiment 3, the decoding of pain expressions was measured by comparing the visual information utilization strategy and the amount of signal needed by East Asian and Western observers during a pain intensity discrimination task. Two data-driven techniques were used to measure mental representations and the visual information utilization strategy: reverse correlation (Ahumada, 1996; Ahumada & Lovell, 1971; Mangini & Biederman, 2004) and bubbles (Gosselin & Schyns, 2001). The data-driven approach used in the present study has the advantage of not relying on a priori hypotheses regarding what visual features, in the mental representations and information utilization strategies, might differ between the two groups (Jack et al., 2018). Nevertheless, we hypothesized that East Asians and Westerners differ in the intensity with which they expect pain to be expressed and that these differences in expectations, in turn, affect the decoding efficiency.

Method

Language of Instructions

In all the experiments, the instructions were given in French to French-Canadian participants, in English to English-Canadian participants, and in Mandarin to Chinese participants. To make sure the instructions were as similar as possible in all languages, they were first translated from French to English by the last author of this study and from English to Mandarin by the fifth author. Then, they were back-translated from Mandarin to English by the sixth author and from English to French by a bilingual member of the last author’s lab. Finally, all the French-speaking authors compared the original instructions with the back-translated ones to make sure of their equivalence.

Sample Composition and Size

In all three main experiments, Canadian and Chinese participants were recruited. All participants had normal or corrected-to-normal visual acuity. The number of participants was selected a priori to reach a minimum statistical power of .8, assuming an effect size of 0.8, i.e., comparable to those obtained in previous studies assessing the differences between East Asians and Westerners in face perception (Blais et al., 2008; Tardif et al., 2017). The sample size criterion was also based on the high number of trials per participant required by both methods (reverse correlation and bubbles), which have been shown to yield reliable data (Dotsch & Todorov, 2012; Jack et al., 2018; Royer et al., 2015). The required sample size was computed using G*Power 3.1 (Faul et al., 2007), taking into account the main test performed across the three experiments—a bilateral two-sample t-test (power = .8; d = 0.8; α = .05). The software indicated that a sample size of at least 26 participants per group was required; thus, we recruited 30 participants per group for each of the three experiments.
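For readers without access to G*Power, the same calculation can be reproduced in MATLAB, the language used for the experimental programs. The sketch below is ours and assumes the Statistics and Machine Learning Toolbox; it is not part of the original analysis.

```matlab
% A priori power analysis for a two-tailed, two-sample t-test. The effect size
% d = 0.8 is expressed as group means of 0 and 0.8 with a common SD of 1;
% alpha defaults to .05 and the test defaults to two-tailed.
nPerGroup = sampsizepwr('t2', [0 1], 0.8, 0.80);
fprintf('Required sample size per group: %d\n', nPerGroup);   % 26
```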

Experimental Design

In all three experiments, two faces were presented on each trial, and participants were asked to indicate which of the two expressed the most pain. This task, called “two-alternative forced choice,” is considered to be mostly criterion-free (Kingdom & Prins, 2010; Lu & Dosher, 2013); it reduces the likelihood of decisional biases in the results. Reducing the likelihood of decisional biases was particularly important in the context of a cross-cultural study since culture has been shown to modulate decisions such as how to use a rating scale (Lee et al., 2002).

Catastrophizing Scale

Catastrophic thinking about pain reflects a tendency to magnify or exaggerate the threat value or severity of experienced pain (Sullivan et al., 1995; 2001; 2004). It has also been shown that higher levels of pain catastrophizing are associated with pain expressions of greater intensity (Weissman-Fogel et al., 2008) and sometimes of longer duration (Sullivan et al., 2004). Most importantly, an association between the tendency to catastrophize pain and the perception of facial expressions has been found (Sullivan et al., 2006a, 2006b), and cultural differences in the tendency to catastrophize pain have been reported (Fabian et al., 2011; Forsythe et al., 2011; Hsieh et al., 2010; Rhudy et al., 2019). Thus, participants from experiments 1 and 3 completed “The Pain Catastrophizing Scale” questionnaire (Sullivan et al., 1995; 2001; 2004). This scale measures an individual’s propensity to amplify the consequences of the experience of pain, which results in a set of negative thoughts (Sullivan et al., 1995). There are some cultural differences in the use of rating scales (Austin et al., 2006; Lee et al., 2002); however, this questionnaire has been validated among Chinese participants (Yap et al., 2008). The analysis and results are provided in section 3 of the Supplementary Material. No difference was found between groups.

Experiment 1—Extracting Mental Representations of Pain Expressions (with an Expressive Background)

One technique aimed at depicting mental representations is reverse correlation (Ahumada, 1996; Ahumada & Lovell, 1971; Mangini & Biederman, 2004). In this study, we used a version of this technique in which noise is added to an image—in the context of the present study, a picture of a face—in order to slightly and randomly modify its appearance. The background face, called “base face,” is constant, while only the added noise varies across stimuli. The underlying hypothesis is that on some trials, the noise added to the base face may make the modified images’ appearance closer to, or farther from, the participant’s mental representation of a target category—in this case, a facial expression of pain. Thus, by correlating the properties of the added noise on each trial with the corresponding percept of the participant, the properties of each mental representation can be inferred. An illustration of this technique is provided in Fig. 1.

Fig. 1
figure 1

a During each trial, two stimuli were presented. The stimuli were constructed using the same background face (base face displayed in first row) on which two inversely correlated patterns of visual noise were added (second and third rows). The patterns of noise varied randomly across trials. In experiment 1, the base face consisted of a morph of White and East Asian male avatars, in which the brow lowering, eyes narrowing, and nose wrinkling/upper lip raising features were slightly and equally activated. In experiment 2, the base face consisted of a morph of White and East Asian male avatars displaying a neutral expression. b Examples of stimuli on different trials

Participants

Thirty Canadian participants (19 females, Mage=22.86; SDage=4.50) and 30 Chinese participants (15 females, Mage=21.23; SDage=1.19) took part in this experiment. Canadian participants were recruited on the campus of the University of Quebec in Outaouais, in the province of Quebec, Canada. They all identified as being of White-European descent, were all born in Canada, and had never lived in an Asian country. Chinese participants were recruited on the campus of the Hangzhou Normal University of China, in the province of Zhejiang. They all identified as being of East-Asian descent, were all born in China, and had never lived in a Western country.

Material and Stimuli

In Canada, stimuli were displayed on a calibrated high-resolution LCD monitor with a refresh rate of 60 Hz. In China, they were displayed on a calibrated high-resolution CRT monitor with the same refresh rate. East Asian participants were tested with both LCD and CRT monitors in a previous study on how culture influences spatial frequency use, and the results were not affected by the type of monitor used. The experimental program was written in MATLAB, using functions from the Psychophysics Toolbox (Brainard, 1997; Kleiner et al., 2007; Pelli, 1997).

The procedure to create a stimulus is illustrated in Fig. 1, along with a few stimulus examples. As is typically done in a reverse correlation task (Brinkman et al., 2017), the same base face was used across all trials. It consisted of the grayscale picture of a morph composed of avatars of a White and an East Asian male. The avatars were produced using FaceGen (Singular Inversions Inc., Vancouver, Canada) and FACSGen (Roesch et al., 2011). The White and East Asian male identities were randomly generated using FaceGen. Only male faces were included because including a female face as a second base face would have doubled the required number of trials. Indeed, the physical properties of the background face may have an impact on the mental representation extracted. Using the same base face throughout the experiment allowed us to keep the length of the task reasonable while focusing on the main variable of interest in the present study, i.e., the culture of the observer. We chose to use a male, rather than a female, base face because studies have shown that pain is, in general, better processed in male than in female faces (Coll et al., 2012; Pronina & Rule, 2014; Simon et al., 2006). We decided to use a base face containing 50% of each ethnicity instead of empirically deriving the point of subjective equality because the latter would likely have differed between Westerners and East Asians. Indeed, studies have shown the existence of a robust “other-race advantage,” whereby other-race faces are more quickly and more accurately perceived than own-race faces (Ge et al., 2009; Levin, 1996; Levin & Beale, 2000; Valentine & Endo, 1992). With morphs like the ones we used, this means that, in order to be perceived as completely ambiguous with regard to ethnicity, Westerners would need a morph in which a higher weight is given to the White than to the East Asian face, and the reverse would be true for East Asian participants (Benton & Skinner, 2015). However, using different base faces would have made it difficult to compare the mental representations of the two groups, which is the focus of the current study.

Using FACSGen, brow lowering (AU4), eyes narrowing (AU6 and AU7), and nose wrinkling/upper lip raising (AU9 and AU10) features were activated at an intensity of 0.1, 1, and 0.6, respectively. The resulting expression can be seen on the top left panel of Fig. 1a. Explanations of the procedure to produce an expression with FACSGen are provided in Supplementary Material. The selection of the brow lowering, eyes narrowing, and nose wrinkling/upper lip raising features was based on multiple studies showing that they are the core of pain facial expressions (Kunz & Lautenbacher, 2014; Kunz et al., 2019; Prkachin, 1992; Prkachin & Salomon, 2008). The decision to partially activate facial traits in the base face followed the recommendations of Brinkman et al. (2017). It reduces the number of trials necessary to obtain a good signal-to-noise ratio in the results. Nevertheless, despite the presence of these facial traits in the base face, this method allows for the emergence of other traits if they are part of the participants’ mental representations (Dotsch et al., 2008; Murray & Gold, 2004). Moreover, although avatars may have the downside of having an artificial appearance, they offer the significant advantage of facilitating control of the intensity with which the different emotional facial traits are set. In the context of the present study, because we sought to measure processes inherent to the observers, it was particularly important to carefully control the input signal they received. The face presented to the participants subtended a width of 6° of visual angle and a height of 7.32° of visual angle (distance between the participants’ eyes and screen of 50 cm).

Procedure

Each participant completed five blocks of 100 trials. On each trial, two stimuli were created by generating a random patch of sinusoidal white noise and either adding it to, or subtracting it from, the base face (for a description of the noise creation, see Mangini & Biederman, 2004). The two noisy faces were then displayed in the middle of the computer screen and remained on the screen until a response was given. Participants were asked to decide which of the two faces expressed the most pain. Figure 2 illustrates the sequence of events in a trial for all three experiments.
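As a concrete illustration, a minimal MATLAB sketch of one trial’s stimulus pair is given below. It is ours, not the authors’ code: the variable baseFace and the file name are assumptions, and smoothed white noise stands in for the sinusoidal noise of Mangini and Biederman (2004).

```matlab
% One reverse correlation trial, assuming baseFace is the grayscale base image
% with values in [0 1]. Smoothed white noise is a simplified stand-in for the
% sinusoidal noise actually used (Mangini & Biederman, 2004).
baseFace = im2double(imread('base_face.png'));       % hypothetical file name
noise    = imgaussfilt(randn(size(baseFace)), 2);    % smooth random patch
noise    = 0.15 * noise / max(abs(noise(:)));        % scale the noise amplitude
stimA    = min(max(baseFace + noise, 0), 1);         % base face plus noise
stimB    = min(max(baseFace - noise, 0), 1);         % inversely correlated pair
% stimA and stimB are shown side by side; the signed noise of the stimulus
% chosen as expressing the most pain is stored for the later analysis.
```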

Fig. 2
figure 2

Sequence of events on each trial for experiments 1 to 3. In all three experiments, two faces were presented on each trial, and participants were asked to indicate which of the two expressed the most pain. The two faces were either embedded in visual noise (experiments 1 and 2) or randomly sampled through space and spatial frequencies with bubbles (experiment 3)

Results

To reveal the mental representation of pain expressions, we employed the analysis procedure typically used with reverse correlation, which consists in averaging the noise patches composing the stimuli selected as expressing the most pain (Mangini & Biederman, 2004). The resulting average is called a classification image (CI), and it serves to determine what properties of the visual noise increase the probability of perceiving pain in a stimulus. For visualization purposes, the CI can be added to the base face, thereby revealing the mental representation of pain expressions. The average mental representation of East Asian and Western participants is presented in Fig. 3a.
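A minimal sketch of this averaging step is shown below, assuming the signed noise of each chosen stimulus has been stored in a hypothetical array chosenNoise (height × width × number of trials) and that baseFace is the base image used during the task.

```matlab
% Classification image (CI) computed as the average of the noise patches of
% the stimuli chosen as expressing the most pain.
CI        = mean(chosenNoise, 3);                    % average the selected noise patches
mentalRep = min(max(baseFace + CI, 0), 1);           % overlay the CI on the base face
imshow(mentalRep);                                   % visualize the mental representation
```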

Fig. 3
figure 3

a The first column depicts the average classification images of Western and East Asian participants overlaid on the base face in experiment 1. The second column depicts the average classification images of Western and East Asian participants overlaid on the base face in experiment 2. b Results of the one-sample t-tests in the objective, image-based analysis. The colors, ranging from blue to yellow, represent the T scores. Positive T scores (in yellow) represent brighter areas, and negative T scores (in blue) represent darker areas. The areas significantly associated with the perception of pain intensity are highlighted with a white contour. The significant clusters corresponding to a common area were each attributed a number, and those numbers are used in Table 1. c Results of the two-sample t-tests in the objective, image-based analysis. The colors, ranging from blue to yellow, represent the T scores. Positive T scores (in yellow) represent areas that were brighter in the East Asians’ than Westerners’ CIs, and negative T scores (in blue) represent areas that were darker in the East Asians’ than Westerners’ CIs. The areas that significantly differ between the East Asians’ and Westerners’ CIs are highlighted with a white contour. The significant clusters were each attributed a number, and those numbers are used in Table 1

Table 1 Average Z score, standard deviations, Cohen’s d, and confidence intervals for each significant area associated with the perception of pain facial expressions for Westerners and East Asians participants

First, we conducted an objective—image-based—analysis on the obtained CIs. One CI was generated for each participant. To verify which facial areas were significantly associated with the percept of pain, and thus characterized the mental representation of pain, we conducted pixel-by-pixel bilateral one-sample t-tests on the individual CIs. Moreover, in order to verify if the mental representations of the two groups differed, we conducted pixel-by-pixel bilateral two-sample t-tests. This image-based analysis presents the advantage of taking into account the inter-individual variability in the individual participants’ CIs, as was recommended by Cone et al. (2021). Another important advantage of this test is that it reveals, in a data-driven manner, which facial areas are associated with the perception of facial expressions of pain. In other words, because each pixel of the CI is independently tested, we do not have to arbitrarily select facial areas on which to perform the statistical test. Because of the very high number of t-tests performed, the statistical threshold was adjusted to avoid type-I errors, using the Cluster test from the Stat4Ci toolbox (Chauvin et al., 2005). This test compensates for the multiple comparisons across pixels while taking into account the fact that in structured images—like faces—each pixel is not independent of the others.

Prior to calculating the t-tests, the individual CIs were slightly smoothed, using a Gaussian window with a full width at half maximum of 7 pixels. They were then transformed into Z scores using the mean and standard deviation of the null hypothesis, estimated from the background containing no face pixels. Once transformed into Z scores, pixels with highly negative values corresponded to the ones that were systematically dark in the stimuli selected as expressing the most pain, and highly positive values corresponded to the ones that were systematically bright in the stimuli selected as expressing the most pain. The individual CIs, transformed into Z score values, were then submitted to one-sample t-tests to assess, for each pixel of the CIs, whether their values were systematically (i.e., in more participants than expected by chance) negative (leading to negative T scores) or positive (leading to positive T scores). Thus, an area of the CIs composed of negative T scores indicates that when this area was dark, participants perceived the stimuli as expressing the most pain, and an area composed of positive T scores indicates that when this area was bright, participants perceived the stimuli as expressing the most pain.
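The sketch below illustrates these two steps in MATLAB under stated assumptions (the array indCIs of individual CIs and the logical background mask bgMask are hypothetical names); the cluster correction itself is performed with the Stat4Ci toolbox and is not reproduced here.

```matlab
% Image-based analysis, assuming indCIs holds the individual CIs stacked as
% height x width x nSubjects and bgMask is a logical mask of background pixels
% containing no face. Each CI is smoothed, Z-scored against the background,
% and a one-sample t-test is then run at every pixel.
nSub = size(indCIs, 3);
zCIs = zeros(size(indCIs));
for s = 1:nSub
    ci = imgaussfilt(indCIs(:, :, s), 7 / 2.355);              % FWHM of 7 px -> Gaussian SD
    zCIs(:, :, s) = (ci - mean(ci(bgMask))) / std(ci(bgMask)); % Z-score vs background
end
[~, ~, ~, stats] = ttest(permute(zCIs, [3 1 2]));              % t-test across participants
Tmap = squeeze(stats.tstat);                                   % pixel-wise T scores
% Tmap is then thresholded with the Stat4Ci cluster test (Chauvin et al., 2005)
% to correct for the multiple, spatially correlated comparisons.
```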

In Fig. 3b, the maps of T scores resulting from the pixel-by-pixel one-sample t-tests are displayed, and the facial areas significantly associated with the percept of pain are highlighted with a white contour (tcrit=2.7, minimum cluster size=288, p<.0125 after the Bonferroni correction for East Asian and Westerner groups). Interestingly, facial areas corresponding to the location of the facial features previously proposed as constituting the core of pain expressions (i.e., brow lowering, eyes narrowing, and nose wrinkling with upper lip raising) were associated with the perception of pain expressions in both groups. One exception was the area corresponding to the brow lowering feature, which did not reach the significance threshold for Westerners. The face contour was also significant for both groups, replicating previous findings (Blais et al., 2019). Although this is speculative, we think that a brighter face contour helped increase the contrast of the inner features, thus making the areas corresponding to the brow lowering, eyes narrowing, and nose wrinkling with upper lip raising features more visible. The average Z score, standard deviation, Cohen’s d, and 95% confidence intervals for each significant area are reported in Table 1. Note that when multiple clusters represented the same area, those statistics were calculated across all clusters. The clusters representing a common area were also given the same identifying number in Fig. 3. In both groups’ classification images, there were significant clusters outside of the face. Those are likely false positives and may cast doubt on whether the significant clusters located in the face area are truly associated with the perception of pain facial expressions. However, the fact that the significant areas located in the face corresponded to the ones that were found in previous studies (Blais et al., 2019) and were also found in experiment 2 supports the idea that they indeed modulated the percept.

Next, we conducted pixel-by-pixel bilateral two-sample t-tests to verify if the mental representations of the two groups differed (tcrit=2.7, minimum cluster size =308, p<.025). The results of this analysis are presented in Fig. 3c and Table 1. The facial areas that significantly differed between Westerners’ and East Asians’ mental representations are highlighted with a white contour. The mouth area was significantly brighter in East Asians’ than in Westerners’ mental representations.

As mentioned above, one important strength of this image-based analysis is that it takes into account the inter-individual variations in the mental representations. However, it only allows the identification of facial areas in which the manipulation of contrast is associated with changes in the percept of pain facial expression. It does not allow us to draw any conclusions about the specific action units involved in the mental representations or the perceptual impact of those brighter or darker areas in the classification image. One way to gain more information about how those contrast variations affect the appearance of the facial expression is to conduct a subjective analysis of the average mental representations. This is what we did next.

To better understand if the intensity of mental representations of facial expressions of pain differed between Westerners and East Asians, a second group of Canadian participants who did not take part in the reverse correlation task (N=20, all female; Mage=31.06; SDage=9.04) was asked to compare the intensity of pain expressed in the average mental representation of both groups. All participants identified as being of White-European descent and were born in Canada. The sample size was chosen to be smaller than the group from which the mental representations were extracted, since an equal or larger group may increase the probability of a false positive (Cone et al., 2021). Because the task was very short, we simply recruited the first 20 participants who came to our lab to take part in other experiments unrelated to emotion recognition. The first 20 participants to come were female, which explains why there are no males in the sample. The two images depicting the average mental representations (i.e., those surrounded by yellow squares in Fig. 3a) were printed side-by-side with the following instruction: “Which of these two faces expresses the most pain?” Participants indicated whether it was the left or right image. The position of East Asians’ and Westerners’ mental representations was counterbalanced across participants. Participants did not know what the two images represented. Ultimately, all the participants indicated that the East Asians’ mental representation appeared more intense than the Westerners’. Although, in this case, the data speak for themselves, we conducted a chi-square test that confirmed the statistical significance of this finding, χ2(1)=20, p<.001.
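The reported statistic can be recovered with a simple goodness-of-fit computation against a 50/50 split; the short MATLAB sketch below is ours and only restates that arithmetic.

```matlab
% Chi-square goodness-of-fit test against a 50/50 split (20 of 20 raters chose
% the East Asians' representation as the more intense).
observed = [20 0];
expected = [10 10];
chi2 = sum((observed - expected).^2 ./ expected);   % = 20
p    = 1 - chi2cdf(chi2, 1);                        % df = 1, p < .001
% The same computation with 51 of 63 raters (experiment 2) gives chi2 = 24.14.
```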

Finally, although the main aim of the present study was to verify the impact of culture on pain expression perception, the results of pixel-by-pixel ANOVAs verifying the effect of participants’ sex and the interaction between sex and culture are reported in Supplementary Material (section 1).

Experiment 2—Extracting Mental Representations of Pain Expressions (with a Neutral Background)

In experiment 1, we chose to activate, in the background face used in the reverse correlation task, three facial features previously reported as being frequently observed in both Westerners’ and East Asians’ pain facial expressions. This decision was made because our primary goal was to verify if the intensity of expected pain expressions would differ in the two groups. However, this decision may have made it harder to find differences in terms of the configuration of expected pain facial expressions. Moreover, the appearance of the background face may interact with the percept during a reverse correlation task, and we wanted to assess the generalizability of the pattern of results obtained in experiment 1. Thus, in experiment 2, participants took part in a reverse correlation task using a different background face than in experiment 1: this new background face was composed of different avatars than in experiment 1, displaying a neutral facial expression (see Fig. 1). Because this experiment was conducted during the COVID-19 pandemic, labs were not accessible and traveling was not possible. The experiment was therefore conducted online in order to have access to both Chinese and Canadian participants.

Participants

Participants were recruited using Prolific’s participant pool (www.prolific.co). We aimed to reach sample sizes similar to the ones in experiment 1, based on the same criteria as described before. We anticipated that a significant proportion of participants would not pass the attention check, so we tried to recruit 60 Western participants and 60 East Asian participants. The Prolific participant pool is not available in China, so we had to loosen our inclusion criteria compared with experiment 1. More specifically, to be included in the East Asian sample, the participants needed to identify as being of East Asian ethnicity, hold Chinese nationality, and speak Mandarin as their first language. We applied similar criteria for the Western sample: participants needed to identify as being of White/Caucasian ethnicity, hold Canadian nationality, and speak either French or English as their first language. We left the experiment online for a week. In total, 60 Canadian participants and 50 Chinese participants took part in the task. The smaller number of participants in the Chinese sample likely reflects the smaller number of participants meeting the inclusion criteria: around 200 Chinese participants compared with over 2,000 Canadian participants in Prolific’s participant pool. To be included in the analyses, participants needed to meet the following criteria: (1) having completed at least 90% of the total number of trials and (2) reaching at least 95% accuracy on the attention check trials (see procedure for details). Those criteria were decided a priori, before starting the analyses. Thus, the final samples included 40 Canadians (19 females; M=28.3 years old; SD=10.9) and 34 East Asians (18 females; M=27.9 years old; SD=6.3).

Material and Stimuli

When recruited, participants were asked to complete the task on a desktop or laptop computer; smartphones and tablets were not allowed. Participants were asked to keep a distance of about 2 ft between their eyes and the computer screen. However, since the screen size and resolution likely differed across monitors, the exact stimulus size on the retina was not controlled. The experimental program was written in MATLAB, using the functions from the Psychophysics Toolbox (Brainard, 1997; Kleiner et al., 2007; Pelli, 1997). The experiment was run online using LabMaestro Pack&Go, a platform developed by VPixx Technologies (https://vpixx.com/products/labmaestro-packngo/). The stimuli were created in the exact same way as in experiment 1, except that the base face was different. As in experiment 1, it consisted of the grayscale picture of a morph composed of avatars of a White and an East Asian male. However, the White and East Asian male faces each represented the preset classification provided by FaceGen and were labeled as “European” and “East Asian.” Those presets are based on statistical algorithms developed from 3D laser scans of several real human faces. Thus, the identities were not the same as in experiment 1. Moreover, the base face displayed a neutral facial expression. Finally, the hair, ears, and neck were not visible on the base face. This decision was made to maximize the space occupied by the face within the noise sampling space. It potentially made it possible to reveal the contribution of smaller features to the percept of pain facial expressions.

Procedure

The procedure was the same as in experiment 1, except that 20 catch trials were added. During the catch trials, two cars were presented instead of the faces, and participants were asked to press the “t” key.

Results

To reveal the mental representation of pain expressions, we employed the same analysis procedure as in experiment 1. The average mental representation of East Asian and Western participants is presented in Fig. 3a. As was done in experiment 1, objective—image-based—measures of the CIs were first obtained. The same analysis procedure as described in experiment 1 was followed. The results of the pixel-by-pixel bilateral one-sample t-tests are displayed in Fig. 3b and Table 2. The facial areas significantly associated with the perception of pain are highlighted with a white contour (tcrit=2.7, minimum cluster size=288, p<.0125 after the Bonferroni correction for East Asian and Westerner groups). To increase the perception of pain facial expressions, areas colored in dark blue needed to be darker, and areas colored in yellow needed to be brighter. More specifically, areas corresponding to the cheeks, the mouth, some parts of the face contour, and the area in between reached the significance threshold for Westerners. For East Asians, areas corresponding to the left eye and eyebrow, the area between the eyebrows, and some parts of the face contour reached significance. The mouth area did not come out as significant in the East Asians’ mental representation. However, looking at the maps of T scores, it is possible to see that the percept of pain increased, although below the significance threshold, when the inner part of the mouth was brighter and the outer parts were darker, making the mouth appear down-turned. The polarity of the contrast in the inner part of the mouth was reversed compared with Westerners. The average Z score, standard deviation, Cohen’s d, and 95% confidence intervals for each significant area are reported in Table 2. As for experiment 1, when multiple clusters represented the same area, those statistics were calculated across all clusters.

Table 2 Average Z score, standard deviations, Cohen’s d, and confidence intervals for each significant area associated with the perception of pain facial expression for Westerner and East Asian participants

The results of the pixel-by-pixel bilateral two-sample t-tests are displayed in Fig. 3c. The facial areas that significantly differed between Westerners’ and East Asians’ mental representations are highlighted with a white contour (tcrit=2.7, minimum cluster size=308, p<.025). Importantly, the finding of a significantly brighter mouth area in the East Asians’ than in the Westerners’ mental representation was replicated in experiment 2. However, more areas came out as significant: areas corresponding to the left eyebrow, the mouth, and some parts of the face contour were brighter in the East Asians’ than in the Westerners’ mental representations, and the bridge of the nose area was darker in the East Asians’ than in the Westerners’ mental representations. The average Z score, standard deviation, Cohen’s d, and 95% confidence intervals for each significant area are reported in Table 2.

Next, as was done in experiment 1, a subjective analysis of the average mental representations extracted in experiment 2 was conducted. A second group of participants who did not take part in the reverse correlation task (N=63, 38 females, Mage=33.8; SDage=14.8) was asked to compare the intensity of pain expressed in the average mental representation of both groups. The sample was composed of Canadian participants recruited on the campus of the University of Quebec in Outaouais, in the province of Quebec, Canada, and Swiss participants recruited on the campus of the University of Fribourg, in Switzerland. All participants identified as being of White-European descent, were all born in a Western country, and had never lived in an Asian country. The sample size was chosen to be smaller than the group from which the mental representations were extracted, since an equal or larger group may increase the probability of a false positive (Cone et al., 2021). The two images depicting these mental representations (i.e., those surrounded by blue squares in Fig. 3a) were printed side-by-side with the following instruction: “Which of these two faces expresses the most pain?” Participants indicated whether it was the left or right image. The position of East Asians’ and Westerners’ mental representation was counterbalanced across participants. Participants did not know what the two images represented. A total of 51 out of 63 participants indicated that the East Asians’ mental representation appeared more intense than the Westerners’. A chi-square test confirmed the statistical significance of this finding, χ2(1)=24.14, p<.001.

Experiment 3—Comparing Ability and Visual Strategies During Pain Intensity Discrimination

In experiment 3, we moved on from the mental representations to measure the visual information utilization strategy used during a pain intensity discrimination task. To that end, the “bubbles technique” was used (Gosselin & Schyns, 2001). This data-driven technique involves presenting pictures of partially masked pain expressions, with available and unavailable facial information varying randomly across trials. The visual information is sampled through space (“x, y” coordinates) and spatial frequencies. Spatial frequencies code different granularity levels of visual information, with higher spatial frequencies coding finer information such as an iris shape or wrinkles and lower spatial frequencies coding coarser information such as the shape of the mouth or the outline of the face. By correlating every participant’s accuracy on each trial with the random masks, it becomes possible to infer which facial information (i.e., facial features and spatial frequencies) participants of both groups rely on to discriminate pain expressions of varying intensities. Figure 4 illustrates the bubbles technique.

Fig. 4
figure 4

a Illustration of the procedure to create a stimulus with the “bubbles method.” The original face is first decomposed into five spatial frequency bands (first row), after which a mask of randomly positioned Gaussian apertures, called bubbles, is created for each band (second row). Each of the five filtered images is then multiplied pixel-by-pixel with its corresponding bubbles mask. The five resulting stimuli (third row) are finally fused to create the final stimulus, called bubblized stimulus. Thus, in the bubblized stimuli, random facial parts are displayed in different spatial frequencies, making it possible to draw inferences about the facial features and spatial frequencies underlying pain intensity discrimination. b During each trial, two faces were presented, and participants were asked to indicate which had the most intense pain expression. The two faces’ expressions always differed in terms of intensity, with the intensity being manipulated by morphing different percentages of a neutral expression and a pain expression. The stimulus was either completely neutral, 33% in pain, 66% in pain, or 100% in pain. Three conditions of difficulty were created, where the two stimuli displayed either differed by 33%, by 66%, or by 100%

Participants

Thirty Canadian participants (15 females; M=21.3; SD=4.09) and 30 Chinese participants (15 females, M=21.23; SD=1.19) took part in this experiment. All of the Chinese participants and a subset of 17 Canadian participants had taken part in experiment 1. Canadian participants were recruited on the campus of the University of Quebec in Outaouais, in the province of Quebec, Canada. They all identified as being of White-European descent, were all born in Canada, and had never lived in an Asian country. Chinese participants were recruited on the campus of the Hangzhou Normal University of China, in the province of Zhejiang. They all identified as being of East-Asian descent, were all born in China, and had never lived in a Western country.

Material and Stimuli

The material was the same as in experiment 1. Stimuli consisted of pictures of avatars created with the FaceGen and FACSGen software programs. In total, 8 avatars were created: two genders, in two ethnicities (White and East Asian), each in two emotional states (neutral and pain). In contrast with the reverse correlation task, using avatars of varying identities and gender does not impact the required number of trials, since those variables can be collapsed during the analyses. We thus included two genders in our face stimulus set to increase diversity. In the pain expressions, the same three facial traits as in experiment 1 were activated, i.e., brow lowering (AU4), eyes narrowing (AU6 and AU7), and nose wrinkling/upper lip raising (AU9 and AU10). The 100% intensity involved an activation of 0.1, 1, and 0.6 for the three features, respectively. Neutral avatars were then morphed in pairs with the corresponding identities’ pain avatars using FantaMorph (Abrosoft Co, 2002). FantaMorph is a digital morphing software program used to systematically create equally spaced intermediates along a continuum between two emotions (in this case neutral and pain). Using this technique, two intermediate pain states were created, 33% or 66%, which corresponded to setting the action units mentioned above at 33% and 66% of the maximum selected in FACSGen for the 100% intensity. This resulted in four facial pain levels: no pain, 33%, 66%, and 100% of pain. A total of 16 stimuli were subsequently used during the experiment (2 genders × 2 ethnicities × 4 levels of pain intensity). All the faces selected for the experiment were transformed into grayscale images with a homogeneous gray background. Their luminance was normalized using the SHINE toolbox (Willenbockel et al., 2010).

To create “bubblized” stimuli, the following procedure was used on each trial. First, the face picture was decomposed into five spatial frequency bands using the Laplacian pyramid included in the pyramid toolbox for MATLAB (Burt & Adelson, 1983; Simoncelli, 1999). These five spatial frequency bands were: 29.5–59.0, 14.8–29.5, 7.4–14.8, 3.7–7.4, and 1.8–3.7 cycles per face (cpf), and the frequencies below 1.8 cpf served as a constant background. An example of this first step is displayed in Fig. 4a, top row. Second, to randomly sample facial information in each of these spatial frequency bands, locations over which bubbles were placed were randomly selected. Bubbles are Gaussian apertures through which the information is visible. An example of this second step is displayed in Fig. 4a, middle row. The size of the bubbles was adjusted as a function of the frequency band so that each bubble revealed 1.5 cycles of spatial information. Specifically, the full width at half maximum of a bubble was 14.1, 28.3, 56.5, 113.0, and 226.1 pixels from the highest to the lowest spatial frequency band. Because the size of the bubbles increased as the spatial scale became coarser, the number of bubbles differed across scales to keep the size of the sampled area constant across frequency bands. A pointwise multiplication was then performed between the bubble masks and the filtered image in the corresponding spatial frequency bands. An example of this third step is provided in Fig. 4a, bottom row. Finally, the information revealed by the bubbles was fused across the five frequency bands to produce one experimental stimulus. This final step is illustrated in the image at the right of the bottom row of Fig. 4a. Since the bubbles’ locations vary randomly across trials, it is possible—after many trials—to statistically verify the link between the visibility of a pixel or group of pixels and the probability of correctly discriminating the intensity of pain expressions. The face width of stimuli subtended approximately 6° of visual angle.
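To make the sampling step concrete, the following MATLAB sketch builds the bubble mask for a single frequency band and applies it to that band. It is a simplified illustration under stated assumptions: bandImg, fwhm, and nBub are hypothetical names, and the Laplacian pyramid decomposition itself is not reproduced here.

```matlab
% Bubble sampling for one spatial frequency band, assuming bandImg already
% contains the face filtered into that band, fwhm is the bubble size for this
% band (e.g., 56.5 px), and nBub is the number of bubbles in this band.
[h, w] = size(bandImg);
sigma  = fwhm / 2.355;                               % convert FWHM to Gaussian SD
[X, Y] = meshgrid(1:w, 1:h);
mask   = zeros(h, w);
for b = 1:nBub                                       % sum randomly placed Gaussian apertures
    cx = randi(w);  cy = randi(h);
    mask = mask + exp(-((X - cx).^2 + (Y - cy).^2) / (2 * sigma^2));
end
mask        = min(mask, 1);                          % cap overlapping bubbles at 1
sampledBand = bandImg .* mask;                       % pointwise multiplication
% Repeating this in each of the five bands (with band-specific fwhm and nBub)
% and summing the five sampled bands yields the final bubblized stimulus.
```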

Procedure

Participants were asked to complete a total of 3,024 trials divided into 21 experimental blocks of 144 trials. The total duration of the task was approximately 4.5 h, divided between two sessions occurring on different days. Different ethnicities and pain intensities were interleaved in each block and were presented in a random order.

At the beginning of each block, instructions were displayed on the monitor. Then, on each trial, a fixation cross first appeared in the center of the computer screen for a duration of 500 ms. It was immediately replaced by two bubblized faces of the same identity expressing different pain intensities. The two faces could display a difference of either 33%, 66%, or 100% in pain (see Fig. 4b for examples). Stimuli were displayed on the right and left sides of the screen center. They remained visible until the participant’s response. The task was to identify which of the two faces expressed the most pain by pressing the corresponding keyboard key. No feedback regarding the accuracy of participants’ responses was given. The number of bubbles was adjusted online with QUEST to maintain an accuracy of approximately 75% (Watson & Pelli, 1983). The number of bubbles was adjusted separately for the three conditions of pain intensities (i.e., intensity difference of 33%, 66%, or 100%). Within each intensity condition, an equal number of bubbles was applied for the two face ethnicities. This decision was made to ensure that the performance did not drop below the level of chance in the most difficult condition or approach ceiling in the easiest condition.
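A hedged sketch of how such an adaptive adjustment can be implemented with the Psychtoolbox QUEST routines is shown below; the staircase runs on log10 of the number of bubbles, one independent staircase per intensity condition, and the starting values, nTrials, and runBubblesTrial are illustrative assumptions rather than the authors' code.

```matlab
% Adaptive adjustment of the number of bubbles with QUEST (Watson & Pelli, 1983).
% gamma = .5 reflects the two-alternative forced-choice guessing rate.
q = QuestCreate(log10(45), 1, 0.75, 3.5, 0.01, 0.5);   % target approx. 75% correct
for t = 1:nTrials
    nBubbles = max(1, round(10 ^ QuestQuantile(q)));    % number of bubbles for this trial
    correct  = runBubblesTrial(nBubbles);               % hypothetical function running one trial
    q = QuestUpdate(q, log10(nBubbles), correct);       % update the staircase with the outcome
end
```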

Results

To compare the amount of signal needed by Western and East Asian observers to discriminate pain expression intensity, the average number of bubbles across all trials was used. As explained in the “Procedure” section, the number of bubbles was manipulated on a trial-by-trial basis to maintain a performance of approximately 75%. Previous studies have shown that the number of bubbles represents a good index of an individual’s ability in a task (Royer et al., 2015; 2018). In the bubbles method, a bubble reveals facial information; the higher the number of bubbles, the more facial information is available to the participant to perform the task. At the extreme, with a sufficiently high number of bubbles, the face would be completely revealed to the participant. Here, we calculated the number of bubbles needed across all frequency bands. One East Asian participant was not included in the subsequent analyses because the number of bubbles they needed (222 bubbles) was more than 2.5 standard deviations above their group average. After removing this outlier, a 2 (East Asians vs Westerners) × 3 (conditions of pain intensity) mixed ANOVA indicated a significant main effect of culture, F(1, 57)=20.2, p<.001, η2=0.26; a significant main effect of the conditions of pain intensity, F(2, 114)=216.7, p<.001, η2=0.79; and a significant interaction between culture and conditions of pain intensity, F(2, 114)=9.1, p=.003, η2=0.19. Two-sample t-tests comparing the two cultures in each condition of pain intensity confirmed that East Asian participants needed a significantly higher number of bubbles than Westerner participants in all conditions of pain intensity (see Table 3 for the descriptive statistics and the results of the t-tests). In other words, those results suggest that Westerners needed less signal than East Asians to discriminate between subtle differences in pain intensities.
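A hedged MATLAB sketch of this analysis, using the repeated-measures tools of the Statistics and Machine Learning Toolbox, is given below. The variables culture, b33, b66, and b100 (one row per participant, one column of bubble counts per intensity condition) are illustrative assumptions, and the exclusion criterion is applied here across the whole sample for brevity, whereas the paper applied it within each cultural group.

```matlab
% Outlier exclusion (2.5 SD rule) and 2 (culture) x 3 (intensity condition)
% mixed ANOVA on the mean number of bubbles.
meanBub = mean([b33 b66 b100], 2);
keep    = abs(meanBub - mean(meanBub)) <= 2.5 * std(meanBub);   % 2.5 SD criterion
tbl     = table(culture(keep), b33(keep), b66(keep), b100(keep), ...
                'VariableNames', {'culture', 'b33', 'b66', 'b100'});
within  = table(categorical({'d33'; 'd66'; 'd100'}), 'VariableNames', {'Condition'});
rm      = fitrm(tbl, 'b33-b100 ~ culture', 'WithinDesign', within);
ranova(rm)   % within-subject effect of condition and the culture x condition interaction
anova(rm)    % between-subject effect of culture
```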

Table 3 Two-sample t-tests comparing the number of bubbles needed by Westerners and East Asians in each condition of pain intensity

To compare the visual information used by East Asian and Western participants to discriminate pain expression intensities, classification images (CIs) were computed using the following procedure. A weighted sum of all the bubbles masks used during the experiment was calculated, using the accuracy at discriminating pain expression intensities on each trial transformed into Z scores as weights. This resulted in one CI per spatial frequency band for each participant. In those CIs, facial information increasing the probability of a correct response had positive values, whereas information decreasing the probability of a correct response had negative values. CIs were then smoothed using Gaussian kernels of the same size as the ones used during the experiment. Simultaneously, random CIs were computed using a permutation procedure. This procedure consisted in calculating a weighted sum of all the bubble masks that were used during the experiment, with permuted accuracies transformed into Z scores as weights. These random CIs made it possible to estimate the average value and standard deviation expected under the null hypothesis. They were thus used to transform the CIs into Z scores, where the Z-values indicated the number of standard deviations from chance.
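Under stated assumptions (hypothetical arrays masks and acc, a smoothing parameter sigmaBand, and an illustrative number of permutations), this computation can be sketched in MATLAB as follows; it is not the authors' code.

```matlab
% Classification image for one spatial frequency band, assuming masks holds the
% bubble masks stacked as height x width x nTrials, acc the trial accuracies
% (nTrials x 1), and sigmaBand the Gaussian smoothing SD used for that band.
zw = (acc - mean(acc)) / std(acc);                               % Z-scored accuracies as weights
CI = imgaussfilt(sum(masks .* reshape(zw, 1, 1, []), 3), sigmaBand);
nPerm   = 500;                                                   % illustrative permutation count
permCIs = zeros([size(CI), nPerm]);
for p = 1:nPerm                                                  % null CIs from permuted weights
    wp = zw(randperm(numel(zw)));
    permCIs(:, :, p) = imgaussfilt(sum(masks .* reshape(wp, 1, 1, []), 3), sigmaBand);
end
ZCI = (CI - mean(permCIs(:))) / std(permCIs(:));                 % CI expressed in Z scores
```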

To verify which facial areas significantly increased the probability of correctly discriminating pain in each participant group, pixel-by-pixel one-sample t-tests were performed separately for each spatial frequency band. A cluster test from the Stat4Ci toolbox was applied in order to control for type-I error inflation associated with multiple tests (tcrit=2.7, minimum cluster size=174, 530, 1,493, 3,595, 5,359 from the highest to the lowest spatial frequency band, p<.05). The significant areas are highlighted in color in Fig. 5, and the statistics are reported in Table 4 (see also Figure S1 for the pixel-by-pixel maps of T scores, with the significant areas highlighted with a white contour). Importantly, the fact that many areas come out as significant does not mean that all of those areas were used simultaneously by participants or needed to be available simultaneously on a given trial for a correct response to occur. Rather, this may reflect a flexible use of information, whereby any subset of these areas may be needed to correctly discriminate between pain intensities.

Fig. 5
figure 5

The first two rows illustrate the visual information used by Western and East Asian participants to correctly discriminate between two intensities of pain. Significant areas are depicted in color ranging from blue to yellow, representing T scores ranging from 2.7 to 10. The third row depicts the difference between the visual information used by Westerners and East Asians. In all images, the background face represents a White female, but the analysis was made by combining all trials with no regard to stimulus sex or ethnicity

Table 4 Average Z score, standard deviations, Cohen’s d, and confidence intervals for each significant facial area which increased the probability of correctly discriminating pain intensities among Westerner and East Asian participants

For Westerners, the eyes and a part of the nose in medium-to-high spatial frequencies ranging between 14.8 and 59.0 cpf were associated with a higher accuracy rate. Moreover, a large area comprising the eyes, eyebrows, nose, and mouth in spatial frequencies ranging between 7.4 and 14.8 cpf was also associated with a higher accuracy. For East Asians, both eyes in medium-to-high spatial frequencies ranging between 14.8 and 59.0 cpf were associated with a higher accuracy rate. A large area comprising the eyes, eyebrows, and upper part of the nose in spatial frequencies ranging between 7.4 and 14.8 cpf was also associated with a higher accuracy rate.

Two-sample t-tests (tcrit=2.7, minimum cluster size =159, 487, 1,365, 3,187, 3,882 from the highest to the lowest band, p<.025) confirmed that compared with East Asians, Westerners relied significantly more on the nose wrinkling/upper lip raising feature in spatial frequencies ranging between 30 and 59 cycles per face, on the brow lowering and nose wrinkling features in spatial frequencies ranging between 15 and 30 cpf, and on the eye narrowing, nose wrinkling, and upper lip raising features in spatial frequencies ranging between 7 and 15 cpf. There was, however, no area on which East Asians relied significantly more than Westerners. As for experiment 1, the result of pixel-by-pixel sex × culture ANOVAs is reported in section 1 of the Supplementary Material.

Control Experiment

The stimuli used in experiment 3 were artificial faces in which three facial features (brow lowering, eyes narrowing, and nose wrinkling with upper lip raising) considered as the core of pain expressions were activated. However, the studies that have proposed that these features compose the core of pain expression were conducted on Westerners (Dildine & Atlas, 2019; Kunz & Lautenbacher, 2019). Thus, it is possible that the stimuli in experiment 3 did not accurately represent pain for East Asians. To confirm that the stimuli equally represented pain for both groups, two groups of observers (30 Westerners and 30 East Asians who did not take part in experiments 1 to 3) were asked to judge the degree to which they perceived the six basic emotions (anger, happiness, sadness, disgust, surprise, and fear) and the facial expression of pain in each of the stimuli displaying the highest pain intensity. Although current theoretical frameworks (e.g., Barrett et al., 2019) advocate for the inclusion of as many emotional categories as possible, or even for the use of open-ended questions in emotion recognition tasks, we decided to include only the six basic emotions to keep in line with previous studies examining the recognition of pain facial expressions in Westerner samples (Dildine & Atlas, 2019; Kappesser & Williams, 2002). Judgments were made on a scale from one (emotion not present) to seven (emotion extremely present). The average rating, across participants, was calculated separately for each emotion scale. Participants of the two cultures did not differ in the degree to which they perceived pain (MWesterners=3.75, SDWesterners=1.08, MEast Asians=4.05, SDEast Asians=0.87, t(58)=1.18, p=.24, Cohen’s d =0.31, 95% CI [−0.21, 0.81]). Analyses of the other emotions are provided in section 2 of the Supplementary Material.

Discussion

The present study revealed differences in both mental representations and visual information utilization as a function of culture, suggesting that evaluating another person’s pain based on their facial expression may become an even more complex issue when multicultural settings are involved.

The few studies that have compared the configuration of pain facial expressions across cultures consistently revealed similar sets of facial features, suggesting the existence of universal pain expressions. The results of experiments 1 and 2 corroborate the notion that East Asians’ and Westerners’ expectations about pain expressions share some facial features. The facial area corresponding to the brow lowering feature was associated with the changes in the perception of pain facial expressions of both groups, in both experiments. Moreover, a change in the mouth area was expected by both groups in both experiments. This change mostly occurred in the upper part of the mouth in experiment 1 and the inner and outer parts of the mouth in experiment 2. When looking at the average mental representations, these changes seem to be related to the upper lip raising feature (experiment 1) and to a down-turned mouth shape (experiment 2). Experiment 2 also showed that East Asians expected a change in the outer part of the left eyebrow, a feature that was not expected by Westerners. Contrary to experiment 1, a neutral base face was used in experiment 2. This methodological difference may have increased the chances of revealing other features than the typical ones. Moreover, the hair, ears, and neck were not visible on the base face of experiment 2. Given that facial adornments may have an impact on facial perception (Jack & Schyns, 2015; but see Butler et al., 2010, for face identification), this methodological difference may also have influenced the appearance of the mental representations. Overall, the results of experiment 2 may suggest some cultural specificities in the expectations regarding pain facial expressions (Jack et al., 2016), but further research will be needed to confirm this finding.

Most importantly, the results of experiments 1 and 2 showed that the average mental representation of East Asians was perceived as expressing more pain than that of Westerners. The fact that this same pattern of results was obtained in both experiments, despite using different stimuli, different participant samples, and even different experimental settings (in the laboratory vs. online), reinforces this finding and argues against the possibility of a type-I error (Cone et al., 2021). Moreover, image-based analysis on the individual CIs corroborated the presence of differences in the mental representations that were systematic across the participants in both groups. Theoretical knowledge about how to interpret a more intense mental representation, when revealed with reverse correlation, is still lacking. On the one hand, the mental representation could be more intense because the participants are better at detecting pain-related signals, thereby leading to a high signal/low noise classification image. On the other hand, the mental representation could be more intense because the participants expect a pain facial expression to be displayed intensely and are therefore poorer at detecting subtle pain-related signals. We think that the latter explanation is more likely, especially given the results obtained in experiment 3: East Asians needed more bubbles than Westerners to reach an equivalent accuracy rate, which has been shown to reflect a lower ability in past studies of face recognition (Royer et al., 2015; 2018; Tardif et al., 2019). Moreover, it has been proposed that mental representations reflect the types of facial expressions one has encountered in their social environment (Jack, Garrod et al., 2012). Within this framework, the present results suggest that East Asians expect more intense facial expressions of pain. Such differences in expectations could reflect a general tendency of East Asians to display more intense facial expressions when in pain, but they could also reflect a tendency to display facial expressions of pain only when experiencing very intense pain—thus expressing intensely. The latter scenario is consistent with the reports suggesting that East Asians have a tendency to express pain only when it becomes unbearable (Jongudomkarn et al., 2006; Tung & Li, 2015). It has been proposed that while facial displays of pain might be an inborn behavior, the ability to encode different intensities of pain through facial expressions is learned (Kunz et al., 2012). The present finding of similar facial features, but different expected intensities, associated with pain across cultures is congruent with this idea.

The present study also adds to previous knowledge by revealing the presence of two kinds of cultural differences in the visual extraction processes involved in pain intensity discrimination. First, East Asians needed more bubbles in order to discriminate between different intensity levels of pain expression. We interpret this finding as evidence that East Asians needed more signal than Westerners to succeed at this task. Such a difference could also be expected if the facial configuration of pain included in experiment 3 was not equally representative of pain for both groups. A control experiment was conducted to assess this possibility and suggests that this is not the case. Moreover, previous studies comparing the configuration of pain facial expressions across East Asians and Westerners have also revealed very similar expressions for both groups. Nevertheless, the results of experiment 2 cast some doubt on the idea that the stimuli were equally representative of pain for both groups by showing that East Asians expected changes in the left eyebrow area that were not expected by the Westerners. To better understand the sensory mechanisms at play in the finding that more signal was needed by East Asians than by Westerners, future studies should include different configurations of pain expressions (e.g., see Kunz & Lautenbacher, 2014) as well as a more precise measure of sensitivity to pain intensity differences, for instance by measuring the participants’ psychophysical curve.

Second, East Asians generally made less use than Westerners of the lower part of the face when attempting to discriminate the intensity of pain expressions. This is consistent with the results of experiments 1 and 2, which showed that East Asians expect pronounced changes in this area when a person is in pain. Indeed, if such pronounced changes are expected, East Asians may have a harder time extracting the visual information associated with more subtle changes like the ones present in the experiment 3 stimuli. Moreover, our finding that East Asians made less use of the lower part of faces is congruent with previous results showing that East Asians make fewer eye fixations on the lower part of faces during facial expression categorization (Jack et al., 2009).

The present results must be interpreted while acknowledging certain limitations. First, although the findings are understood as reflecting the differences between East Asians and Westerners, the samples only included Canadian and Chinese participants. The mental representations and visual strategies used to decode pain expressions are likely to vary from one individual to the other and may also vary within Western and East Asian countries. To gain a better understanding of the variability in pain mental representations and decoding processes, future work should include samples from various Western and East Asian countries, as well as from other cultures. Future studies should also consider how the sex of participants, the gender profile of the stimuli, and culture interact. It is possible that social norms pertaining to the intensity with which one expresses pain differ between women and men.

In conclusion, the present study revealed that, compared with Westerners, East Asians expect more intense pain expressions, suggesting that differences exist in the intensity with which they express pain. Moreover, East Asians need more signal to discriminate between subtle differences in the intensity of pain expressions, and they are less efficient at using the information conveyed by the facial features located in the lower part of the face. Together, those findings suggest that different social norms regarding the intensity with which pain should be expressed may lead to important differences in pain communication. Overall, our findings highlight the complexity of emotional facial expressions and the importance of studying pain communication in multicultural settings.