Introduction

Inferring other people’s feelings and motives is a complex process that requires integrating cues from multiple channels (Ambady and Rosenthal 1992), including faces and their expressions. A brief look at the face allows observers to make inferences about a person’s emotion, social motives, and even personal traits (Kappas et al. 2013; Parkinson 2005). However, facial expressions can be ambiguous and nuanced. Some of them are shown in order to be polite, prevent conflicts, or strategically mask one’s true intention to obtain resources that could otherwise be denied (Namba et al. 2016; Zloteanu et al. 2018). This is particularly true for smile expressions which occur frequently (Chapell 1997), are perceptually salient (Smith and Schyns 2009), and easy to produce on demand (Ekman and Friesen 1982; Ekman et al. 1988).

Smiles have many positive social consequences, making the displayer seem more attractive, sociable, and trustworthy (Harker and Keltner 2001; Krumhuber et al. 2007b; see also Rychlowska et al. 2019 for a review). People tend to share more resources with smiling than non-smiling individuals (Scharlemann et al. 2001), and this effect is particularly marked in case of genuine rather than false smiles (Johnston et al. 2010; Krumhuber et al. 2007a; Shore and Heerey 2011). Thus, interpreting smiles is an important social task in our daily lives. It may also be challenging as both genuine and posed smiles involve a bilateral activation of the zygomaticus major muscle (Action Unit (AU) 12; Facial Action Coding System (FACS), Ekman et al. 2002). One of the criteria for classifying smiles is the so-called Duchenne marker, or the presence of eye constriction and crow’s feet accompanying a genuine smile (AU6, Duchenne, 1862/1990; Ekman and Friesen 1982; Frank and Ekman 1993). Although the Duchenne marker allows observers to efficiently detect enjoyment smiles (Gunnery and Ruben 2016; Krumhuber and Manstead 2009; Malek et al. 2019), perceptual characteristics alone may not be sufficient to correctly interpret the meaning of a smile. This is because we rarely encounter facial expressions in isolation. Instead, they are typically embedded in a wider context provided by the expresser’s face and body as well as the situation in which a given expression occurs.

Taking contextual information into account is critical for accurately interpreting the meaning of emotion displays. Knowing who produces a facial expression, towards whom the expression is directed, and why it is displayed can influence our understanding of this communicative gesture (Aviezer et al. 2017; Greenaway et al. 2018). Such contextual cues can be communicated through different channels. Early studies by Goodenough and Tinker (1931) highlight the importance of verbal descriptions of specific social situations in the perception of emotion. They found that situational information in the form of short stories influenced which emotion was attributed to a facial expression. In later work by Carroll and Russell (1996), observers tended to judge emotion displays in line with contextual vignettes rather than the information provided by the face. For example, pairing a prototypical fearful expression with an anger-provoking story resulted in participants miscategorising the expression as angry. Verbal vignettes can also affect the perception of real life, naturally elicited facial expressions. When judging the emotional state of Olympic athletes, Kayyal et al. (2015) found that the same face was perceived as positive when participants were told that the athlete had won, but as negative when the athlete ostensibly lost.

Visual scenes are another way of conveying contextual information. Here, facial expressions are typically presented in isolation or within a surrounding. Early experiments using this paradigm (Munn 1940; Spignesi and Shor 1981) revealed the importance of social context in emotion perception. More recent studies suggest that information from visual scenes is rapidly integrated in the encoding of expressions (Righart and de Gelder 2008), such that identical facial displays can be interpreted differently depending on the type of background. For example, Aviezer et al. (2012) showed that bodily context cues were influential in discriminating between positive and negative emotions. Other findings suggest that judgments of facial expressions in everyday situations largely depend on hand movements and bodily postures (Abramson et al. 2017; for a review see Wieser and Brosch 2012). In a study by Koji and Fernandes (2010), valence attributions of facial expressions significantly changed with the type of visual background. Specifically, faces displayed in a positive scene were rated as more positive than the same faces accompanied by a neutral or negative context.

While there is convincing evidence pointing toward the influential role of social context in emotion perception, most research to date has been concerned with how contextual information affects judgments of prototypical and intense emotion displays. For example, Maringer et al. (2011) showed that, when observers were unable to imitate the perceived facial movements, their ratings of smile genuineness varied depending on the situation in which these smiles were ostensibly produced. In a study by Mui et al. (2020), participants judged smiles paired with positive context as more genuine than the same expressions presented with negative vignettes. Moreover, Ito et al. (2013) showed that, when rating the positivity of smiles, observers are influenced by background scenes. All three studies used prototypical, genuine-looking smiles, which are likely clear in meaning and easy to interpret. However, contextual effects may be strongest when expressive information is subtle or ambiguous (Wallbott 1988; Van den Stock et al. 2007).

Beyond the immediate situational context, facial expressions may be judged differently depending on the cultural background of the perceiver (Fang et al. 2019; Pogosyan and Engelmann 2011). For example, Chinese and Japanese observers tend to focus on the eyes when judging others’ emotions (Mai et al. 2011; Yuki et al. 2007), and the reliance on the Duchenne marker to distinguish between posed and genuine smiles may not extend to some non-Western cultures, such as Gabon (Thibault et al. 2012). The extent to which judgments of facial expressions are affected by contextual information can also vary across cultures. In Western cultures, emotion perception is thought to be more analytic and based on independent objects and category attributions (Markus and Kitayama 1991). By contrast, people from Eastern countries tend to perceive scenes in a more holistic manner in which different objects are interrelated (Masuda and Nisbett 2001; Nisbett and Masuda 2003), with a target person being thought to express a specific emotion if the neighbouring people display the same emotion (Hess et al. 2016). In line with this notion, neuroimaging evidence showed that East Asian (vs. Western) observers tend to pay greater attention to relationships between target visual objects and the surrounding scenes (Goto et al. 2010).

Masuda and colleagues (Masuda et al. 2008, 2012) showed that facial expressions displayed by nearby people influenced Japanese but not American participants' perception of the central person. Subsequent eye-tracking data revealed that Japanese participants looked at the surrounding persons more than did Americans (Masuda et al. 2008). In a study by Ito et al. (2013), intensity ratings of target facial emotion significantly varied with the emotion displayed by the people in the background. However, such contextual influence was observed among Japanese but not European Canadian participants. When testing for the effects of verbal context information, Matsumoto et al. (2010) found strong cultural differences in how facial expressions were judged in combination with vignettes describing emotion-eliciting situations. Specifically, Japanese and Korean participants relied more on context descriptions than did Americans. Together, these findings suggest that social context differentially influences emotion perception in Western and Eastern cultures.

The present research

Whilst existing evidence points toward the role of contextual information and related cross-cultural variations in emotion perception, most of the research to date has focused on the recognition of basic emotions, such as sadness, happiness, fear or disgust. Here, we explore whether similar effects can be observed within one expression, namely a smile. We focus on the interpretation of smiles given the frequency of these expressions (e.g., Chapell 1997), their importance in social decision-making (e.g., Krumhuber et al. 2007a, b), and the cultural differences in their interpretation (e.g., Rychlowska et al. 2015). Specifically, we investigate the extent to which social context impacts observers’ judgments of posed non-Duchenne smiles, which are more ambiguous and harder to interpret than genuine smiles (e.g., Owren and Bachorowski 2001; Johnston et al. 2010).

In two laboratory experiments, contextual influences were examined using distinct types of context information. Study 1 tested how visual context affects smile interpretation. Participants viewed images of non-Duchenne smiles, presented without any background or embedded in scenes associated with happiness or politeness. We predicted that smiles accompanied by happy, pleasurable visual scenes would be rated as more genuine (i.e., expressing positive feelings) than smiles presented on their own. By contrast, smiles accompanied by scenes depicting polite contexts should be rated as less genuine (i.e., expressing polite intentions) than smiles presented without context. Study 2 investigated whether these effects generalise to verbal vignettes and whether culture plays a moderating role. To this end, British and Japanese participants viewed photographs of non-Duchenne smiles accompanied by verbal labels describing happy events or situations suggesting the need to be polite. As before, we predicted that smiles associated with contexts implying happiness would be perceived as more genuine than smiles without context and those that impose politeness. We also hypothesised that contextual influences would be more pronounced among Japanese than British observers.

Study 1

Method

Participants

One hundred nine participants (90 females; Mage = 22.79 years, SD = 6.38) from the United Kingdom, mostly students at University College London, took part in the study without remuneration. Post-hoc sensitivity power analysis using the simr package (Green and MacLeod 2016) in the R environment (R4.0.2. GUI 1.70) indicated that this sample size was sufficient to detect a standardized regression coefficient of |b| = 0.30 for the context condition with a significance level of α = 0.05 and 83% power. Because the target faces were White, we only recruited White/Caucasian participants to eliminate cross-race effects (Lindsay et al. 1991; Krumhuber and Manstead 2011). Ethical approval was granted by the departmental ethics committee. Subjects provided written informed consent prior to participation.

Materials

The facial stimuli consisted of 12 frontal images (size: 360 × 480 pixels) of White female faces displaying a posed non-Duchenne smile (see Fig. 1a), retrieved from validated sets of smile expressions developed by Johnston et al. (2010) and McLellan et al. (2010). All models were presented with direct gaze and on a plain background. They displayed deliberately posed non-enjoyment smiles, unrelated to positive emotional experience and produced upon instructions to smile (Johnston et al. 2010; McLellan et al. 2010). All expressions corresponded to the description of a non-Duchenne smile (Frank et al. 1993) in the sense that they involved the Lip Corner Puller (AU12) without the Cheek Raiser (AU6; FACS, Ekman et al. 2002). In the face-only condition, these stimuli were presented on their own and displayed in greyscale.

Fig. 1

Examples of face-only (a) and face-context (b) stimuli implying a happy (left) and polite (right) situation in Study 1. All facial expressions represent a posed (non-Duchenne) smile. Image courtesy of Tracey McLellan

For the face-context images (mean size: 600 × 480 pixels), the same facial expressions were embedded into a context, such that the stimuli depicted a person with a background scene that implied either a happy (e.g., eating ice-cream on the beach) or polite situation (e.g., serving customers in a pharmacy; see Fig. 1b). All context images originated from a pool of 51 pictures and were validated in a pilot study with a separate group of participants (n = 30). For this, the facial region was blurred to remove information from the targets’ faces. Pre-testing results showed that happy context images were rated as more indicative of a situation in which a genuine smile would occur than polite context images, t(29) = 3.66, p < 0.01, d = 0.67. For the main study, we selected 6 polite (M = 36.84, SD = 16.84, 0–100 scale) and 6 happy context images (M = 58.10, SD = 19.38) that were most representative of each category (i.e., received lowest and highest genuineness ratings, respectively). Facial expressions and context images were converted to greyscale and realistically combined using Photoshop CC.
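The pilot effect size can be checked against its test statistic. The sketch below assumes that the pilot comparison was a paired-samples t-test (each of the 30 raters judged both context types, which is consistent with df = 29) and that d was obtained as t divided by the square root of n. The text does not state which formula was used, and the function name is purely illustrative.

```python
import math


def cohens_d_paired(t_value: float, n: int) -> float:
    """Effect size for a paired t-test, assuming d = t / sqrt(n).

    This is one common convention (often written d_z); it is an
    assumption here, not a formula confirmed by the paper.
    """
    return t_value / math.sqrt(n)


# Values reported for the pilot study: t(29) = 3.66 with n = 30 raters.
d = cohens_d_paired(3.66, 30)
print(round(d, 2))  # 0.67, in line with the reported d
```

Under this assumption the reported t and d are mutually consistent; a different convention (for example, one that corrects for the correlation between the paired ratings) would give a different value.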

Procedure

Participants were tested individually using the Qualtrics software (Provo, UT). Upon arrival, they were randomly assigned to two groups (face-context group, n = 54; face-only group, n = 55) and informed that the study aimed to test the extent to which different smiles are likely to occur in various social situations. They were then provided with brief definitions of each smile type: (a) genuine smile: "a smile displayed when someone is happy or amused and is truly feeling the emotion", and (b) posed smile: "a smile that is intentional in the sense that someone wants to be nice and express positive intentions but does not feel the respective emotion". Participants in the face-only group saw smiles presented on their own, and participants in the face-context group saw faces presented with happy and polite background scenes.

The smile evaluation task involved the random presentation of 12 face-only or face-context images which were displayed for 3 or 4 s, respectively. For each stimulus, participants rated the extent to which the smile would communicate that the person wanted “to be nice and to express positive intentions” or was “feeling happy and content”, using scales ranging from 0 to 100%. Ratings across both response categories were complementary and had to add up to 100%. For the sake of simplicity, we focus in all analyses on the “feeling happy and content” dimension, with higher scores reflecting greater perceived smile genuineness. Participants could take as much time as needed to respond. At the end of the experiment, they indicated their age, gender and ethnicity, were thanked and debriefed.

Results and discussion

We tested whether the presence of context information accompanying smiles altered participants’ perception compared to the control face-only condition. For this, we compared ratings of genuineness in the face-only condition with the same ratings in trials in which smiles were accompanied by happy versus polite background scenes. The analyses were conducted in the R environment (R4.0.2. GUI 1.70) using the lme4 package (Bates et al. 2015). In order to control for non-independence due to multiple observations per participant and stimulus face, we estimated two linear mixed regression models including by-subject and by-item random intercepts. Each model estimated genuineness ratings as a function of context condition (face-only vs. happy context, face-only vs. polite context). The models thus compared genuineness scores for the two types of background scenes with the baseline face-only condition. Results indicated that genuineness ratings of smiles seen in a happy context were significantly higher (M = 53.18, SD = 17.55) than ratings of the same smiles presented without context (M = 43.18, SD = 11.45), β = 10.40, SE = 2.85, t(119) = 3.66, p = 0.001. By contrast, genuineness ratings of smiles seen in a polite context were significantly lower (M = 36.62, SD = 14.03) than ratings of these smiles presented without context, β = − 6.98, SE = 2.56, t(136) = − 2.73, p = 0.007. Thus, context information influenced observer ratings such that the same smiles were perceived as more genuine in happy contexts and more posed in polite contexts.

The results corroborate and extend previous findings (e.g., Aviezer et al. 2012; Carroll and Russell 1996; Chen and Whitney 2019) by showing that similar effects of social context can be observed for the interpretation of smile expressions. Specifically, participants rated smiles embedded in negative visual scenes as less genuine than smiles presented on their own and smiles in positive visual scenes as more genuine than smiles presented in isolation. Interestingly, an opposite trend was observed in a study by Ito et al. (2013), where participants rated smiles with positive background scenes as similar or lower in positivity than smiles presented on their own. This could be due to the specificity of stimuli being used, comprising prototypical expressions of happiness. As such, it is possible that intense facial expressions presented without any contextual information were more salient to participants and, as a result, appeared more positive. The present research used smiles produced on demand which did not directly reflect a positive emotional state (Johnston et al. 2010; McLellan et al. 2010). Such posed smiles could have been more challenging to interpret and thus more sensitive to contextual influences. It is also important to note that Ito et al. (2013) used a within-subjects design, whereas the present study compared ratings from two separate groups of participants. This allowed us to compare trials with different types of background scenes to the face-only control condition. Given the nested structure of face-context pairings, however, the experimental design was not fully balanced.

Study 2 was designed to address this limitation and to replicate the findings from the first study using verbal vignettes as context manipulation. Furthermore, we tested whether culture moderates the effects of social context on smile interpretation. For this, British and Japanese participants viewed non-Duchenne smiles that were either presented on their own (face-only group) or in combination with verbal context labels describing a happy, pleasurable situation (happy context) or a situation that imposes the need to be polite (polite context). In line with the findings from Study 1, we predicted that smiles associated with contexts implying happiness would be rated as more genuine than smiles without context and smiles paired with situations that impose politeness. Based on past evidence revealing cultural variations in emotion perception (e.g., Masuda et al. 2012; Matsumoto et al. 2010), we also hypothesised that Japanese participants would be more influenced by context information than participants in the UK.

Study 2

Method

Participants and design

One hundred ninety-two White participants from the United Kingdom, mostly students at University College London (137 females, Mage = 21.69 years, SD = 4.48), and 186 participants from Japan, mostly students at Hiroshima University (142 females, Mage = 21.23 years, SD = 2.55) took part in the study without remuneration or in exchange for payment (UK: £4; Japan: ¥400). Post-hoc sensitivity power analysis using the simr package (Green and MacLeod 2016) in the R environment (R4.0.2. GUI 1.70) indicated that this sample size was sufficient to detect a standardized regression coefficient of |b| = 0.30 for the interaction between culture and context in mixed models with a significance level of α = 0.05 and 83% power. Ethical approval was granted by the departmental ethics committees in each country, and subjects provided written informed consent prior to participation.

The study had a full between-subjects experimental design. Participants in each country were randomly assigned to one of three conditions, featuring approximately the same number of participants in each group (happy context: n = 125; polite context: n = 125; face-only: n = 128).

Materials

For the British sample, facial stimuli were exactly the same as those used in Study 1. For the subjects in Japan, 12 frontal images (size: 370 × 486 pixels) of Japanese female faces displaying a smile expression were selected from several databases (e.g., Matsumoto and Ekman 1988; Fujimura and Umemura 2018). All models were presented with direct gaze and on a plain background. Because the majority of smiles included the Duchenne marker, we edited the images by cropping and pasting the eyes from the same person’s neutral expression into the smiling photograph using Photoshop CC. This allowed us to remove visible signs of the Duchenne marker (i.e., crow’s feet and wrinkles under the eyes), whilst keeping the eyebrows in their original form. After modification, smiles in both stimulus sets (UK, Japan) corresponded to the description of a non-Duchenne smile (Frank et al. 1993) in that they involved the Lip Corner Puller (AU12) without the Cheek Raiser (AU6). The two stimulus sets did not differ in AU12 intensity, t(21) = 0.74, p = 0.47, d = 0.30, as measured by OpenFace (Amos et al. 2016), a software for automated facial expression analysis. All stimuli were displayed in colour.

Context information consisted of 12 verbal context labels, describing either polite situations (e.g., smiling when asking a stranger for the time) or happy situations (e.g., smiling when a person’s boyfriend returns from a long trip). Pilot-testing with separate groups of British (n = 49) and Japanese participants (n = 57) confirmed that the 6 happy labels (M = 75.52, SD = 12.22; 0–100 scale) were rated as more indicative of a situation in which a genuine smile would occur than the 6 polite labels (M = 23.07, SD = 12.80), F(1, 104) = 849.09, p < 0.001, ηp2 = 0.89. Ratings did not significantly differ between countries, F(1, 104) = 0.17, p = 0.680, ηp2 = 0.00, and the interaction of country and context type did not reach conventional levels of significance, F(1, 104) = 3.80, p = 0.054, ηp2 = 0.03.
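The partial eta squared values in this pilot analysis can be related to the F statistics with the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The following is a minimal sketch of that arithmetic, not the authors' analysis code; the function name and dictionary labels are illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics reported for the label pilot study, each with df = (1, 104).
reported_f = {
    "context type": 849.09,
    "country": 0.17,
    "country x context type": 3.80,
}

for effect, f_value in reported_f.items():
    eta = partial_eta_squared(f_value, 1, 104)
    print(f"{effect}: {eta:.3f}")
```

The first two results round to the reported 0.89 and 0.00. The interaction comes out at roughly 0.035, which is within rounding distance of the reported 0.03, presumably because the published F value is itself rounded.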

Procedure

The study instructions and procedure were identical to those from Study 1, except that participants in the happy and polite context groups saw verbal context labels together with the facial stimuli. To allow sufficient time to process the verbal information, the context appeared first for 4 s and was shown at the top of the screen, followed by the picture of the face displayed for 3 s together with the context. Presentation order and face-context pairings were randomised, with each context label being displayed twice. For each stimulus, participants rated the extent to which the smile communicated that the person wanted “to be nice and to express positive intentions” or was “feeling happy and content”, using scales ranging from 0 to 100%. Ratings across both response categories were complementary and had to add up to 100%. Similar to Study 1, all analyses focus on the “feeling happy and content” dimension, with higher scores reflecting greater perceived smile genuineness. Participants could take as much time as needed to respond. At the end of the experiment, participants indicated their age, gender and ethnicity. Afterwards, they were thanked, paid, and debriefed.

Participants in Japan completed the study in Japanese. To ensure the conceptual equivalence of the two versions, we used a back-translation procedure. For this purpose, all task instructions were translated into Japanese by a bilingual psychologist and then translated back to English by an independent translator, who was a native English speaker unaware of the purpose of the study (Van de Vijver and Leung 1997).

Results and discussion

To explore whether participants’ ratings of genuineness vary as a function of context condition (face-happy context, face-polite context, face-only) and culture (UK, Japan), linear mixed model analyses were conducted in the R environment (R4.0.2. GUI 1.70) using the lme4 package. We predicted that smiles would be perceived as significantly more genuine when accompanied by happy vignettes than no context information (control condition). This control condition, in turn, should be rated as more genuine than smiles accompanied by polite vignettes. We hypothesised that the effects of context would be moderated by culture with larger contextual influences among Japanese than British participants. We thus expected an interaction between culture and context.

To test these predictions, we computed two linear mixed models with by-subject and by-item random intercepts separately for British and Japanese participants. Each model regressed ratings of genuineness on context condition. For participants in the UK, results indicated that genuineness ratings of smiles seen in a happy context were significantly higher (M = 51.83, SD = 12.99) than ratings of the same smiles presented without context (M = 38.93, SD = 10.10), β = 12.90, SE = 2.12, t(189) = 6.09, p < 0.001. By contrast, genuineness ratings of smiles seen in a polite context were significantly lower (M = 33.06, SD = 12.87) than ratings of these smiles presented without context, β = − 5.87, SE = 2.12, t(189) = − 2.77, p < 0.01. For Japanese participants, a similar pattern of results occurred such that smiles were rated as more genuine in the happy (M = 53.91, SD = 13.57) than no-context condition (M = 42.16, SD = 11.91), β = 11.76, SE = 2.30, t(183) = 5.11, p < 0.001. By contrast, smiles were rated as less genuine in the polite (M = 27.96, SD = 12.87) than no-context condition, β = − 14.20, SE = 2.30, t(183) = − 6.17, p < 0.001 (see Fig. 2).
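The relationship between the condition means, the coefficients, and the t statistics can be made explicit with a small arithmetic sketch. It assumes that the face-only condition served as the reference level, so that each coefficient approximates the difference between a context condition mean and the face-only mean, and that each t statistic is the coefficient divided by its standard error. The data structure and function names are illustrative and do not reproduce the mixed models themselves.

```python
def mean_difference(condition_mean: float, baseline_mean: float) -> float:
    """Difference between a context condition and the face-only baseline."""
    return condition_mean - baseline_mean


def t_ratio(beta: float, se: float) -> float:
    """t statistic as the ratio of a coefficient to its standard error."""
    return beta / se


# (country, condition): condition mean, face-only mean, reported beta, reported SE
study2 = {
    ("UK", "happy"): (51.83, 38.93, 12.90, 2.12),
    ("UK", "polite"): (33.06, 38.93, -5.87, 2.12),
    ("Japan", "happy"): (53.91, 42.16, 11.76, 2.30),
    ("Japan", "polite"): (27.96, 42.16, -14.20, 2.30),
}

for (country, condition), (mean, baseline, beta, se) in study2.items():
    diff = mean_difference(mean, baseline)
    t_value = t_ratio(beta, se)
    print(f"{country:5s} {condition:6s} diff = {diff:7.2f}  t = {t_value:6.2f}")
```

The mean differences agree with the reported coefficients to within 0.01, and the ratios agree in magnitude with the reported t values to within rounding. Because the polite-context coefficients are negative, their t ratios are negative as well.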

Fig. 2

Mean ratings of genuineness as a function of condition in the United Kingdom and Japan (Study 2)

In order to examine cross-cultural differences between British and Japanese participants, we tested whether the effects of context interact with culture using hierarchical models with by-subject and by-item random intercepts. The analysis regressing genuineness scores as a function of country (UK, Japan) and context condition (happy context, polite context, face-only), using Satterthwaite's method for computing degrees of freedom, revealed a significant interaction effect, F(2, 372) = 4.15, p = 0.02, ηp2 = 0.02. Overall, there was no significant difference between the UK and Japan for smile ratings in the face-only condition, β = − 3.22, SE = 5.33, p = 0.55. To assess the relative contribution of social context in the two countries, we used Z-scores to compare the effects of the context condition among Japanese participants to those of British participants. Results showed that the difference between ratings of genuineness in the face-only and the polite context condition was significantly larger in Japan than the UK (Z = 1.96, p = 0.05). No such cross-cultural difference occurred when comparing the difference between the face-only and happy context condition (Z = 0.28, p = 0.78).

The findings of Study 2 replicate and extend the results of Study 1 by showing that not only background scenes, but also verbal labels influence the interpretation of smiles. The presence of brief descriptions of situations associated with happiness or the need to be polite made the smiles appear more or less genuine compared to smiles presented without any contextual information. Whereas existing research suggests that observers from Eastern cultures may generally rely on the situational context more than Western observers (Masuda 2017; Matsumoto et al. 2010), we obtained a similar pattern of findings among British and Japanese participants. The main difference between the two samples, accounting for a significant interaction effect, was due to a stronger influence of the polite vignettes in the Japanese sample than the British sample.

General discussion

This research investigated the effects of contextual information on smile interpretation in two laboratory experiments. In Study 1, we focused on the visual context and investigated how background scenes associated with happiness vs. politeness affect smile judgments. As expected, smiles accompanied by happy background scenes were rated as more genuine than smiles presented on their own. By contrast, smiles accompanied by scenes depicting polite contexts were rated as less genuine than smiles presented without context. Study 2 replicated this result using verbal vignettes. By showing that smile judgements changed with happy vs. polite context labels, the present findings support increasing evidence that the meaning of facial expressions is inherently flexible and context-specific (Aviezer et al. 2017; Barrett et al. 2019; Brambilla et al. 2018).

Importantly, Study 2 revealed cross-cultural variations in how social context impacts smile interpretation. Although context-dependent judgments were made in both countries, the influence of verbal vignettes describing polite situations was stronger in Japan than the UK, with the effect of vignettes describing happy contexts being comparable in the two countries. This result is generally consistent with the notion of analytic versus holistic thinking styles (Masuda 2017). While the former (dominant in the Western world) is selectively focused on central objects or targets rather than the surroundings, holistic thinking (dominant among East Asians) emphasises a context-oriented focus of perception (Ma-Kellams 2014). Interestingly, in the present research the enhanced influence of context among Japanese participants was only observed in the polite context condition and not in the happy context condition. Japan is a homogeneous, socially cohesive society (Ballas et al. 2016; Putterman and Weil 2010), often experienced as collectivistic and thus concerned with group harmony (e.g., Suh et al. 1998). Sensitivity to polite gestures and situations is a crucial factor for facilitating affiliation and social bonding. Hence, it is possible that being nice and expressing prosocial intentions in situations implying civility is more expected in Eastern than in Western cultures. Such differences could be especially marked in the presence of contextual cues suggesting the need to be courteous and respectful and implying the presence of other people, rather than in happy contexts which may be more culturally universal and less affected by social norms. Eastern (vs. Western) cultures may also find expressions of intense positive emotion more disruptive and less appropriate in social situations that imply politeness. Supportive evidence comes from research showing that collectivistic countries, like Japan, endorse expressing happiness to a lesser extent than individualistic countries (Matsumoto et al. 2008). Moreover, people are more likely to consider emotion suppression an acceptable (i.e., less negative) regulation strategy when they endorse Asian rather than Western-European values (Butler et al. 2007). In addition, Asian (vs. Western) observers value low-intensity, calm smiles more than excited smiles in official contexts (Tsai et al. 2016, 2019).

Besides top-down factors, bottom-up processes might play an important role in explaining cross-cultural differences. For example, Ozono et al. (2010) found that Americans tend to focus more on the mouth than the eye region of smile expressions when making trustworthiness judgments, whereas the opposite applied to Japanese participants. Other empirical findings suggest that East Asians primarily fixate on the eye region while neglecting the mouth area (e.g., Jack et al. 2009; Yuki et al. 2007). Given that all stimuli featured non-Duchenne smiles, the lack of additional information in the eye region (i.e., absence of the Cheek Raiser or AU6) could have prompted Japanese participants to rely on the context in which the expressions were presented. In this work we focused on non-Duchenne expressions because they are more ambiguous in their emotional meaning, which makes them harder to interpret than Duchenne smiles (e.g., Owren and Bachorowski 2001; Johnston et al. 2010). In future research, it would be interesting to test whether the present findings replicate with Duchenne smiles or dynamic videos of smiles, given that the manner in which smiles unfold provides important information about their authenticity (Krumhuber and Kappas 2005, for a review see Krumhuber et al. 2013). Such an investigation would require databases of dynamic posed and genuine smiles displayed by expressers from Western and non-Western cultures—tools that are much needed in cross-cultural research.

While the present findings generally align with those of previous research (Ito et al. 2013; Matsumoto et al. 2010; Masuda et al. 2008), Mui et al. (2020) obtained a different pattern, in that American participants were more sensitive to situational context than Chinese participants. Specifically, negative context information led smiles to be rated as less genuine, and this effect was stronger for American observers. In their study, smiles paired with a positive vignette were also rated similarly to smiles presented in isolation. These differences in findings may be attributable to several reasons. First, the two studies used different sets of smile stimuli. Our research used non-Duchenne posed smiles, whereas Mui and colleagues employed photographs of spontaneous smiles extracted from the UvA-NEMO database (Dibeklioglu et al. 2012). While this database is a promising source of dynamic genuine and false smiles, it is yet to be established which facial movements distinguish these two groups of facial expressions and to what extent these features can be conveyed in still photographs rather than dynamic videos.

The second reason relates to differences in the dependent measures. In the study by Mui et al. (2020), participants rated the extent to which they perceived smiles as genuine without any further instruction. In the present research, participants rated how much smiles expressed true happiness and amusement as opposed to positive intentions. It is thus possible that in Mui et al. (2020), Chinese participants interpreted smiling in the negative situation (i.e., in response to someone whom the expresser does not like) as more genuine than did American participants because genuineness involves being sincere in communicating prosocial intentions. In the present research, the lack of genuineness could have been interpreted as greater politeness. It is important to note that, in the present manuscript, the term “genuineness” is used as a proxy for participants’ ratings of positive feelings versus polite intentions. This dimension was treated as a continuum and, even though contextual information implying happiness increased the perceived genuineness of non-Duchenne smiles, these smiles were still perceived as moderately genuine (i.e., expressing happiness and contentment) at best.

In addition to investigating cross-cultural variation in contextual influences on dynamic smiles, it would be useful to test whether the effects observed in the present research generalize to other emotion displays, such as genuine and posed displays of surprise (Zloteanu et al. 2018) or pain (Bartlett et al. 2014). Arguably, judgments of facial expressions provide only a partial explanation of real-life behaviours and decisions. Future experiments could complement ratings of smiles with measures of decision-making in economic games or with physiological measures including reaction times (Ito et al. 2012), eye-tracking (Jack et al. 2009) or neuroimaging (Masuda et al. 2014). Such triangulation will be useful for a fine-grained understanding of the mechanisms underlying contextual influences on smile interpretation, including the integration of information from multiple channels.

The present research shows that contextual information in the form of visual scenes or verbal vignettes affects how people interpret facial expressions. Whereas happy contexts made smiles appear more genuine, polite contexts made smiles appear less genuine. This pattern was obtained in both a British and a Japanese sample, with the effects of the polite context being more marked among Japanese participants. Together, these results demonstrate the importance of situational and cultural contexts, showing that both are critical to how we perceive and interpret emotions.