Introduction

Psychotherapy is a complex process that involves numerous active ingredients that researchers continue to elucidate. Therapy outcomes have been found to be strongly associated with the quality of the therapeutic alliance, defined as “…the collaborative and affective bond between therapist and patient…” (Martin et al., 2000; see also Horvath et al., 2011). Therapeutic alliance, which is partially dependent on empathy, is a critical common factor in positive treatment outcome. It forms rapidly during the first few encounters with the provider and is believed to determine whether a patient returns for a second session (Wampold, 2015). Communication, including nonverbal communication, is critical for development of a strong therapeutic relationship. These nonverbal communications are interpreted within a context that consists of interpersonal space, body posture, eye gaze, various elements pertaining to the quality of speech (e.g., prosody and volume), and various biases of the perceiver (see Wieser & Brosch, 2012 for a review). Within the nonverbal domain, facial expressions are integral to successful decoding of emotion, particularly in clinical interactions (Foley & Gentile, 2010). This process is of particular importance in healthcare settings, as it allows for communication of empathy and the building of a therapeutic alliance.

During the SARS-CoV-2 pandemic, the Centers for Medicare & Medicaid Services (CMS) for the first time in history approved the use of both telephone and telemedicine for patient care. The challenges of conducting psychotherapy via telephone are documented and include lack of control over the patient’s environment, potential loss of privacy and confidentiality, and difficulties developing a therapeutic alliance without face-to-face contact (i.e., loss of many components of nonverbal communication; Brenes et al., 2011). While telemedicine via video has the significant advantage of allowing for increased nonverbal communication, the issues of patient environment, privacy/confidentiality, and the need to adjust treatment protocols and assessment procedures remain, as well as the need to adjust communication style (e.g., replacement or supplementation of body language with additional direct questions, and increased behavior suggesting active listening such as exaggerated nods; Gros et al., 2013; Henry et al., 2017).

For most of 2020, it was safest to conduct visits through telemedicine. The increasing availability of vaccines and lifting of mask mandates in 2021 made a resumption of in-person services possible; however, with the emergence of variants, mask mandates returned in some states and some individuals may choose to continue masking for in-person services, regardless of mandates. Mandated masking may recur with seasonal outbreaks of SARS-CoV-2 (Liu et al., 2021) or another pandemic. Initially, the literature offered very little empirical or theoretical guidance on how wearing masks affects social interaction. However, much has been learned from clinical experience during the pandemic, and new research is emerging. In our experience as clinicians, conducting in-person psychotherapy with both patient and provider wearing masks presents unique challenges to the therapeutic process.

In this paper, we will provide a brief review of the literature on facial emotion expression and facial emotion recognition. For more detailed treatment of these topics, the reader is referred to comprehensive reviews (e.g., Keltner & Cordaro, 2017). We will focus primarily on aspects of this research that have relevance for the wearing of masks in a psychotherapeutic setting. It will be important to keep in mind that the impact of masks on facial emotion expression and recognition pertains not just to the emotional expression of the patient and the therapist’s ability to recognize that emotion, but also to the therapist’s emotional expression and how a patient’s clinical concerns may affect their perception of that expression. Our goal is to raise awareness of potential challenges when conducting psychotherapy with masks and to propose some suggestions for how these may be overcome. Although this topic is approached within the framework of psychotherapy, these issues and challenges will be pertinent to any healthcare provider delivering clinical services where patient and/or provider are masked.

Facial Emotion Expression

Basic Emotion Theory

Much of the work on facial emotion expression has been conducted within a framework of basic emotion theory (BET; see Keltner & Cordaro, 2017 for review). BET proposes that nonverbal expressions of emotion covary with distinctive subjective experiences and emotion-related physiological responses, signal the current emotional state, and are to some degree similar across cultures with regard to both expression and recognition. According to Ekman’s oft-used model, the six cardinal emotions are happiness, sadness, fear, surprise, disgust, and anger, although many other emotional states can be expressed in the face, such as pride or embarrassment (Ekman & Cordaro, 2011; Keltner & Cordaro, 2017; Russell, 1994). Compound emotions (i.e., blends of the basic six emotions, such as “happily surprised”) have also been identified and shown to be distinct from the basic six (Du et al., 2014).

The Facial Action Coding System (FACS; Ekman & Friesen, 1978 as cited in Ekman & Rosenberg, 2005) is often used to study emotional expression and is an anatomically based system for measuring perceivable facial movements, as well as aspects of head and eye movements. Observable facial movements are described in terms of action units (AUs), which are unique and reflect the movement of a single muscle or combination of muscles (Ekman & Rosenberg, 2005). Action units of importance in the upper face include those that raise and tighten the eyelids and raise and lower the eyebrows. Action units of importance in the lower face include those that pull the corners of the lips, part the lips, and lift the lips (Wegryzn et al., 2017).

Upper Versus Lower Face

For the purposes of this paper, “lower face” will refer to the part of the face that will be covered by a mask, thus the area below the eyes. There are more AUs in the lower face than the upper face (Ekman & Rosenberg, 2005) and the upper and lower face are differentially involved in facial emotion expressions. Regarding AUs involved in the expression of the basic six emotions, inspection of Table 1 in Gosselin et al.’s (2010) paper on voluntary control of facial action units reveals that happiness and disgust involve exclusively lower face AUs. Anger and sadness involve more lower AUs than upper AUs, while fear and surprise involve equal numbers of upper and lower AUs. Thus, loss of lower face information due to mask wearing will result in loss of information that may be crucial for distinguishing between the basic six emotions.

It is outside the scope of this paper to delve deeply into the variety of possible facial emotional expressions beyond the basic six; however, complex expressions that have particular relevance to health care include empathy and compassion. Empathy in particular has been shown to be moderately related to patient ratings of the therapeutic alliance (Nienhuis et al., 2018). Empathy involves awareness, understanding, sensitivity to, and the vicarious experience of the feelings, thoughts, or experiences of another. A related construct, compassion, involves not only empathetic awareness and perspective-taking, but also the motivation to provide help and support. Falconer et al. (2019) further divide compassion into kind-compassion and empathic-compassion, with kind-compassion reflecting sympathy and kindness and empathic-compassion reflecting empathy, each with a distinct facial presentation (Condliffe & Maratos, 2020). The two forms were found to be perceived differently, with kind-compassion appearing happier and more contented and empathic-compassion more often appearing sad and concerned (Condliffe & Maratos, 2020; Falconer et al., 2019). Consistent with these perceptions, the expression of kind-compassion depends on the lower face, particularly the smile, while empathic-compassion expression was thought to involve primarily the upper face (Falconer et al., 2019). Thus, mask wearing will result in loss of information regarding the expression of kind-compassion. These distinctions illustrate the nuances of emotion expression and recognition and how wearing a mask might impact a psychotherapeutic interaction.

Motor Control of Facial Expressions

In addition to being differentially involved in facial expressions of emotion, the upper and lower face differ with regard to motor control. Facial musculature is innervated by the facial nerve, cranial nerve VII, whose cell bodies in the pons receive a variety of descending inputs allowing for both voluntary and involuntary facial expression. There are at least two key differences in motor control of the upper and lower face. First, the lower face is innervated exclusively by the contralateral hemisphere, while the upper face is innervated bilaterally. This is true for both direct pathways from the cerebral cortex and indirect pathways from the brainstem (Müri, 2016; Rinn, 1984). This means that muscles in each side of the lower face can act independently; independent movement of upper face muscles is much more difficult because those muscles are yoked by their bilateral input (Gosselin et al., 2010; Mehu et al., 2012; Rinn, 1984). Second, voluntary movements of the face use different motor pathways than emotionally evoked movements (Müri, 2016; Rinn, 1984). Specifically, voluntary movements are controlled via the cortical motor strip, i.e., the pyramidal motor system. In contrast, emotionally evoked facial movements are controlled by areas outside the pyramidal system, i.e., the extrapyramidal system [see also Gothard (2014) for more detailed review of motor control of facial expressions]. What can be concluded from this brief review of the motor control of facial expression is that facial expressions visible while wearing a mask are largely emotion-evoked and are under less voluntary control.

Facial Emotion Recognition

The topic of facial emotion recognition is vast and has been approached from a variety of perspectives and techniques, with stimuli varying with regard to whether they are static or dynamic, posed or spontaneous, subtle or intense. It is beyond the scope of this paper to systematically review the relative contributions of these variables. However, our brief review indicates that different facial features are more diagnostically useful for recognizing different emotions, with the eyes and mouth commonly identified as the most important areas (e.g., Eisenbarth & Alpers, 2011; Nusseck et al., 2008). Intensity of expression is an important factor, with decreased recognition accuracy noted for less intense expressions (Palermo & Coltheart, 2004).

Some facial emotional expressions are more easily identified than others (Palermo & Coltheart, 2004). For example, happiness is recognized more quickly and more accurately than other facial emotion expressions and is rarely confused with other emotions. This is likely due to the distinctive expression of happiness in the mouth area (Calder et al., 2000; Calvo et al., 2018; Eisenbarth & Alpers, 2011; Wegryzn et al., 2017). Other facial emotion expressions are more difficult to recognize and more easily confused (Palermo & Coltheart, 2004; Pochedly et al., 2012). This confusion can be, at least partly, attributed to the fact that they involve partially overlapping AUs in the upper face; anger and disgust both involve lowered brow, while fear and surprise both involve raised brow (Matsumoto & Ekman, 2008). The lower face can be important in reducing confusion, at least between anger and disgust. Additionally, Jack et al. (2009) found that fear and surprise are confused when information is not extracted from the mouth.

In an effort to clarify which parts of the face are most diagnostically useful for decoding emotional expressions, a number of studies have been carried out using the Bubbles technique. Blais et al. (2012) found a preference for the mouth region in both static and dynamic facial displays. However, subsequent work determined that additional facial information is needed for spontaneous expressions, likely due to increased ambiguity and decreased intensity in spontaneous expressions compared to posed (Saumure et al., 2018). That the mouth carries significant diagnostic weight in facial emotion recognition is independent of findings that the eyes are a frequent target of gaze. The nature of facial expression in the mouth is such that it can be detected using parafoveal vision. For example, Calvo et al. (2014) found that happy faces can be recognized in peripheral vision and Peterson and Eckstein (2012) determined that fixations just below the eyes are optimal for extracting important information about facial features.

Regardless of which part of the face is most diagnostically useful, individual differences have been found with regard to how people view faces. In a face identification study, Peterson and Eckstein (2013) found that different individuals had a preference for viewing the eyes, nose, or mouth area. Performance on the face identification task declined when subjects were forced to view their non-preferred face area. Yitzhak et al. (2020) used dynamic displays of the basic six emotions and analyzed the amount of time spent looking at the eyes, nose, and mouth. Although most participants were classified as “extreme” or “moderate” “eye-lookers”, a small number of participants were classified as mouth or nose lookers, and this held true across time and between different stimulus sets. The authors noted that these different styles were equally accurate and that patterns of scanning and emotion recognition were not related. The part of the face preferred by individuals also appears to vary with age. Older adults have been found to fixate the lower face more frequently than the upper (Circelli et al., 2013; Wong et al., 2005) and recognize the basic facial emotion expressions more easily when presented with the mouth region (Guarnera et al., 2018). Taking these findings together, it could be argued that older adults, and other individuals who tend to focus on the lower face, may experience increased difficulty when this preferred source of information is not available to them.

Overall, it can be concluded that the ability of an observer to perceive facial emotion expression accurately will be significantly hampered with the wearing of a mask. This likely would be particularly true for those emotions that are harder to detect without a mask, for the more subtle and less intense expressions likely seen in conversational interactions, and for individuals who tend to rely on the mouth more than other facial areas.

Cultural Considerations

Although there is considerable support for the concept that facial emotion expressions are culturally universal and that there are patterns of universality for as many as 22 emotions (Cordaro et al., 2018), this concept is debated (e.g., Chen & Jack, 2017; Jack et al., 2012b; Sato et al., 2019) and a number of studies have shown cultural differences in gaze patterns and facial emotion recognition.

Culture influences visual scan paths used when identifying faces. While the classic “T” pattern is a consistent finding in Western participants, with a particular focus on the eyes, East Asian participants tend to fixate more on the center of the face, i.e., on the nose region (Blais et al., 2008). However, when asked to recognize the emotional expression in a face, East Asian individuals tend to fixate the eye region more heavily, whereas Western Caucasian participants spread their attention across the face more evenly (Jack et al., 2009). Further, research has demonstrated that individuals from different cultures rely more heavily on facial areas that are more diagnostically valuable in their own culture. For example, emotion expression tends to be more overt in the USA and more subdued in Japan (Yuki et al., 2007). Accordingly, across two studies, individuals in the USA were shown to rely more heavily on the more expressive mouth area when interpreting facial expressions, whereas Japanese individuals were shown to rely more on the eye regions—consistent with their respective cultural norms (Yuki et al., 2007). In general, Japanese individuals are less accurate in identifying Ekman’s basic emotions of disgust, fear, sadness, anger, and contempt (Shioiri et al., 1999). They also are more likely to perceive neutral expressions as having emotional valence (Uono & Hietanen, 2015). For example, Akechi et al. (2013) found that Japanese individuals tend to interpret neutral expressions as unapproachable, unpleasant, and dominant. When interpreting emotion from faces, East Asian cultures heavily rely on direction of gaze, a finding much less pronounced in Western Europeans and North Americans (Akechi et al., 2013; Jack et al., 2012a).

A final cultural distinction that will be important to consider when wearing masks pertains to eye contact. Eye contact can convey different information to an individual depending on their cultural upbringing. For example, expressions with direct gaze are interpreted as angrier and sadder by Japanese individuals than by Finnish individuals (Akechi et al., 2013).

Although cultural differences certainly exist, a meta-analysis of 97 studies of emotion recognition in various cultures demonstrated that there does still appear to be a universal component (Elfenbein & Ambady, 2002). The researchers also found evidence of an in-group advantage to both emotion recognition and expression wherein accuracy was greater when recognizing emotion expressed by an individual from the same cultural group. Additionally, individuals from minority groups recognized emotion more accurately in individuals from majority groups, while the reverse was not true (Elfenbein & Ambady, 2002). The authors subsequently attributed this to the increased exposure to the majority group experienced by minority group members (Elfenbein & Ambady, 2003).

Overall, many variables can be important to consider when individuals from different cultures work together in psychotherapy. We have touched on only a few and it will be important to keep potential differences in mind when considering the impact of wearing masks.

Considerations for Patient Populations

Deficits in emotion expression associated with psychological disorders have been well-documented, particularly in patients with schizophrenia (see Trémeau, 2006 for review), although emotion expression deficits have also been identified in depression (Gehricke & Shapiro, 2000) and PTSD (Clapp et al., 2014). Since many clinicians already adapt to the difficulties with emotion expression seen in various disorders, we will focus on how masks may impact emotion recognition in patients with psychological disorders.

In the context of psychotherapy, the recognition of emotions is an important component of therapeutic engagement. People with mental health concerns show differences in perceptual, attentional, and cognitive biases when reading the faces of others. These differing capacities may result from the confluence of environmental influences (e.g., childhood maltreatment, Pfaltz et al., 2019), personality features (Perlman et al., 2009), and the effects of psychological disorders (e.g., schizophrenia: Kohler et al., 2010). In this section, we will touch on a few findings related to facial emotion recognition in various psychological disorders.

Depression

In general, accurate recognition of facial expressions of emotion is particularly difficult for adults with major depressive disorder (Demenescu et al., 2010). Negative attentional biases in depression and anxiety have been demonstrated in experimental designs that include stimuli of faces among other types of pictures (Armstrong & Olatunji, 2012). In a meta-analysis of eye-tracking studies, persons with depressive symptoms showed a bias toward extended viewing of negative stimuli and spent less time viewing positive stimuli (Armstrong & Olatunji, 2012). Additionally, persons with clinically significant symptoms of depression differ from those with non-clinical symptoms with regard to the typical “T”-shaped pattern of facial scanning for all emotions except happiness (Hunter et al., 2020). These authors found that, compared to those endorsing low symptoms of depression, individuals endorsing relatively higher depressive symptoms tended to focus more on the mouth and nose than the eyes and cheekbones when decoding fear, neutrality, anger, and sadness.

Anxiety

Anxiety is a ubiquitous symptom that is present in multiple disorders. In general, accurate recognition of facial expressions of emotion is less difficult for adults with anxiety compared to those with depression (Demenescu et al., 2010). A meta-analysis of free viewing tasks in studies of individuals with a variety of anxiety disorder diagnoses found that persons with high anxiety symptoms showed an orienting bias to potential threat stimuli (Armstrong & Olatunji, 2012); i.e., threatening images capture the attention of anxious individuals more frequently than non-anxious individuals. For example, Mogg et al. (2000) compared patients with generalized anxiety disorder (GAD) to a depressed group and to healthy controls. Results indicated that people with GAD tended to orient more frequently to threatening faces than to neutral faces when presented with both simultaneously. This effect appears to be general across anxiety disorders and independent of the type of threatening stimulus used (i.e., face versus picture; Armstrong & Olatunji, 2012).

Post-traumatic Stress Disorder

In a review of eye-tracking studies assessing attention to threat in patients with PTSD, results consistently showed that these individuals exhibit sustained attention to threatening stimuli compared to controls (Lazarov et al., 2019). Veterans with PTSD have been found to maintain attention longest on fearful and disgusted facial expressions specifically (Armstrong et al., 2013). Veterans with lifetime combat-related PTSD were found to have impaired emotion recognition accuracy for all seven emotions studied (Castro-Vale et al., 2020). Examining eye gaze specifically in women with PTSD due to childhood abuse, it has been found that direct eye gaze activates regions of the brain involved in an innate alarm system rather than regions more typically involved in processing social interactions in healthy controls—a finding consistent across emotion expression (Steuwe et al., 2014). This may explain the consistent finding of sustained attention to threat stimuli in individuals with PTSD.

Aggression

Early research suggested that aggressive and anger-prone individuals tend to respond negatively in socially ambiguous situations (Barefoot et al., 1989; Dodge, 1980); however, research into how these individuals process and interpret facial expressions is mixed (Chapman et al., 2018; Kuin et al., 2017; Mellentin et al., 2015; Smeijers et al., 2017). Theories for the role of facial expressions in provoking aggression in anger-prone individuals are split and differentially emphasize bottom-up perceptual deficits in facial processing versus top-down cognitive biases (e.g., hostile attribution bias, social information processing). From a developmental perspective, aggressive pre-adolescent children spent more time processing non-hostile social contextual information than non-aggressive controls, yet nonetheless erred in attributing hostile intent (Horsley et al., 2010). In a study with adults, incarcerated antisocial violent offenders were shown digitally blended pairs of pictures of happy, angry, and fearful faces. Offenders differed from controls only in their tendency to read hostile intent into faces that included any proportion of the angry face. Without anger in the mix, the incarcerated antisocial violent offenders performed similarly to controls (Schönenberg & Jusyte, 2014).

Schizophrenia

How persons with schizophrenia process the emotional aspect of faces is well-studied (Kohler et al., 2010; Martin et al., 2020) and represents a foundational component in multiple theories of social cognition in schizophrenia (e.g., Roberts & Penn, 2012). Multiple eye-tracking studies of attentional facial processing in persons with schizophrenia have shown a restricted facial scanning strategy, with fewer and briefer fixations that are closer together compared to the typical pattern, as well as a tendency to avoid emotionally salient facial features (see Toh et al., 2011 for review). Individuals with schizophrenia have also been shown to have difficulty decoding emotions, to require more visual information to identify them accurately, and to rely more heavily on the mouth region than on the eyes (Faghel-Soubeyrand et al., 2020; Lee et al., 2011). They have also been found to have particular difficulty with negative emotions (Namiki et al., 2007; Romero-Ferreiro et al., 2016). Difficulty in accurately identifying emotions has been shown to vary with symptom severity (Suárez-Salazar et al., 2020), which may be particularly true for severity of negative symptoms (Faghel-Soubeyrand et al., 2020).

Borderline Personality

Research on facial emotion recognition in borderline personality disorder (BPD) has been mixed, with evidence of both impairment (e.g., Unoka et al., 2011) and greater accuracy (e.g., Schulze et al., 2013). Females with BPD have been shown to be more likely to assign an emotion to neutral faces compared to psychiatric controls (Daros et al., 2014); however, they did not show a preference for misattributing that emotion as either positive or negative. In this study, BPD also was associated with more emotionally intense ratings of mildly sad faces. The authors suggested that persons with BPD may perceive low intensity emotional expressions as more intense than other people do. Along this thread, another study of self-reported BPD traits and facial emotion recognition in a non-clinical sample found that high levels of endorsed borderline features were associated with poorer performance in accurately identifying neutral facial expressions and improved accuracy in detecting low intensity expressions of negative emotions (Meehan et al., 2017).

Conclusions

Taken together, the above brief review suggests that individuals with a wide variety of psychological/psychiatric disorders may have increased difficulty with facial emotion recognition when their therapist is wearing a mask. This would be particularly true for patients who rely more on the mouth area than the eyes, as suggested for depressed persons and persons with schizophrenia. Cognitive processing and/or attentional biases can result in the misinterpretation of a neutral or ambiguous face as showing negative emotion.

Recent Literature on the Impact of Masks

Non-clinical Settings

In an opinion paper reviewing the burdens associated with wearing masks in educational settings, Spitzer (2020) concluded that a primary impact of mask wearing was on blocking communication of positive emotions like pleasure, joy, happiness, amusement, sociability, and friendliness, while amplifying negative emotions. A research study by Carbon (2020) investigated the accuracy and confidence of emotion recognition of the six basic emotions using faces virtually altered to include a face mask. Carbon found that all emotions except fear and neutrality became more difficult to decipher from masked faces. Despite the participants’ accuracy in detecting neutrality in masked faces, neutrality was also incorrectly detected when the true expression was anger, happiness, or sadness; disgust was frequently misidentified as anger. Stimuli used in this study were of high-intensity expressions, thus these difficulties with accurate facial emotion recognition may be more pronounced with less intense facial expression. Grundmann et al. (2021) also found that face masks significantly undermined participants’ ability to categorize emotional facial expressions accurately. This was particularly true for older adults, consistent with research cited above indicating the importance of the mouth area for older adults.

Clinical Settings

A randomized controlled trial in Hong Kong investigated the effects of physicians wearing masks on patients’ perceptions of empathy (Wong et al., 2013). A large sample of patients (N = 1030) were randomly assigned to a session with a masked or unmasked provider and were assessed for perceived empathy and satisfaction with care. Results indicated that mask wearing had a negative effect on perceived physician empathy especially when the physician and patient relationship was already well established. When patients did not know the physician well, wearing a mask had no significant negative effect on patients’ perception of physician empathy (Wong et al., 2013). This may tie into the distinction between empathic-compassion and kind-compassion (Falconer et al., 2019), and suggests that the type of compassion conveyed may depend, in part, on the relationship between the patient and the provider.

Theoretical Consequences of Wearing Masks in Psychotherapeutic Settings

It is likely that additional empirical studies about the impact of masks on facial emotion expression and recognition will be forthcoming. In the meantime, we believe the following may also be inferred from the literature reviewed above.

A properly worn mask will cover almost the entire face below the eye, including emotion-relevant action units of the lower face, which have been consistently shown to have high diagnostic weight in emotion recognition. This reduction in information available to the viewer increases ambiguity, which may be particularly true for lower intensity facial expressions. This can interfere with accurate facial emotion recognition, especially for individuals with psychological disorders or older individuals. The greatest impact may be seen for facial expressions with overlapping action units and those that rely more heavily on the lower face. For example, the brow is involved in multiple expressions, with the lower face providing additional context for making an accurate interpretation. Additionally, although not reviewed in this paper, the face expresses many things besides emotion. For example, the non-emotional facial expressions of concentration and active listening primarily involve the upper face (Rinn, 1984). In the absence of clarifying information from the lower face, these “cognitive” facial expressions may be misinterpreted as an emotional facial expression; this may be particularly true for individuals with borderline personality features, as reviewed above.

The absence of clarifying information from the lower face could result in several adaptations, which may or may not be accurate or useful. First, the brain is known to fill in missing information based on surrounding information (e.g., the physiological blind spot). Ehinger et al. (2017) showed that, when forced to choose, observers actually prefer this filled-in information over objectively reliable information presented outside the blind spot, perceiving it as more reliable. It remains to be seen whether a similar filling-in phenomenon occurs with regard to absent facial information. That this is possible is suggested by the personal experience of one of the authors, SM. During wellness rounding on a COVID-19 unit, SM met a physician for the first time who was very animated, friendly, and engaging. Although the physician was masked and only the eyes and brow were visible, SM’s visual memory for this individual is of the whole face, including an animated smile. Additionally, we have all had the experience of seeing a person unmasked for the first time. This “unmasked” appearance is sometimes a surprise, supporting the notion that lower face information had indeed been filled in, albeit incorrectly. It seems reasonable to infer that the way in which information is filled in could be colored by context (as in SM’s experience) or by an individual’s cognitive biases.

A second adaptation that could be initiated is disambiguation. Loss of information from the lower face increases ambiguity, leading to the search for additional information to disambiguate. In a one-on-one interpersonal interaction in an office setting, such as with psychotherapy, body language would be an important source of context in this process. Several studies have demonstrated that the assignment of a facial expression to an emotion category is strongly influenced by body context (Aviezer et al., 2008; Hassin et al., 2013; Van den Stock et al., 2007). The effect of context is automatic, occurs early in processing, and is a function of ambiguity in the facial expression (Van den Stock et al., 2007). The more similar facial expressions are, the greater the influence of context (Aviezer et al., 2008; Hassin et al., 2013; Karaaslan et al., 2020; Van den Stock et al., 2007). Arguably, facial expressions will become more similar with masks, increasing the importance of bodily context in disambiguating masked facial expressions.

A final adaptation to the loss of lower face information is that both patients and providers will likely focus more on the eye region when communicating. Increased eye contact could be seen either positively or negatively, depending on the patient’s culture. In addition, individuals who may avoid eye contact as a feature of a psychological disorder or who rely more on the lower face may have increased difficulty reading emotional expression in others. Increased focus on the eye region while extracting emotion from a masked face means that microexpressions, and their resulting messages, should be considered. According to Ekman (2003), single emotional expressions that do not need to be modified involve the whole face and can be referred to as macroexpressions. Macroexpressions can last between ½ and 4 s. In contrast, microexpressions are involuntary, last for a fraction of a second and are thought to reflect concealed emotion. Movements of muscles in the lower face, particularly a smile, occurring shortly after an authentic display can conceal the revealed authentic emotion (Ekman, 2009; Iwasaki & Noguchi, 2016). For example, the eyes may reveal disgust that may be successfully concealed with a rapidly occurring smile (Iwasaki & Noguchi, 2016). Thus, wearing a mask would greatly reduce the ability to cover a genuine microexpression.

Suggestions for Conducting Masked Psychotherapy

In the absence of important nonverbal information from the face, therapists are encouraged to be more intentional about discussing emotions verbally. More frequent reflections can help with verifying that they are reading a patient’s emotion correctly. Patients should also be encouraged to ask questions when they are uncertain about the therapist’s facial expression. The following sections highlight consequences of mask wearing that therapists should be aware of as well as suggestions for compensating (see Table 1).

Table 1 Suggestions and considerations for conducting psychotherapy with masks

The Face

Caution is recommended in attempting to compensate for the loss of lower facial expression by exaggerating upper facial expression. For example, attempting to show more overtly that one is actively listening or concentrating on what the patient is saying may give the impression of surprise or disgust, or may be interpreted as negative or hostile by those with cognitive biases in those directions. Therapists are encouraged to experiment with their masked facial expressions in front of a mirror, as some people have more expressive upper faces than others. Therapists should also be aware of aspects of their facial expression that are under less voluntary control, including the upper face in general, as well as microexpressions of emotion in the eyes. While a patient’s microexpressions may communicate useful information, a therapist’s microexpressions may be problematic because they can signal a reaction that the therapist would ordinarily work to conceal with the lower face in order to maintain the therapeutic alliance.

The Eyes

Absence of information from the lower face means that both patients and providers may focus more on the eye region when communicating. This could lead to increased eye contact, which may be perceived as threatening or dominating, depending on the patient’s culture or clinical diagnosis. On the other hand, patients for whom eye contact is uncomfortable, due to their culture or a psychological disorder, may be at a disadvantage in deciphering emotion in their therapist or made even more uncomfortable by having to increase their eye contact. It may be helpful to check in with patients more often as to what they are thinking or feeling in the moment.

The Voice

Face masks attenuate high-frequency sounds (Corey et al., 2020). Lowering voice pitch and slowing down the rate of speech slightly, as one would do with a hearing-impaired person, is beneficial. Lowered pitch may also have the added benefit of being perceived as empathetic, as suggested by Imel et al. (2014). While some providers, particularly those working with deaf or hearing-impaired persons, may choose to use transparent masks, these have poorer sound performance than medical or cloth masks, although they do not affect lapel microphones and are unlikely to affect other assistive devices (Atcherson et al., 2021; Corey et al., 2020; Goldin et al., 2020). Therapists can also make conscious attempts to compensate for the reduced clarity of speech by reducing background noise and speaking louder (Spitzer, 2020). Nonverbal communication of emotion can also be enhanced by being mindful of prosody and making sure it matches what is being expressed verbally.

The Body

Therapists are also encouraged to be more intentional with body language, including head nodding and leaning toward the patient. They can take extra care to match body language to facial expression because, when the two are discrepant, body language may take precedence.

Conclusions

Although mask mandates are currently in flux, it is likely that the need for them will continue in the future, whether seasonally for SARS-CoV-2, in response to new variants, or because of another viral outbreak. Future research on the impact of masks in clinical interactions will be helpful in minimizing potential negative impacts of masking on nonverbal communication. In the meantime, we hope the above suggestions will increase providers’ awareness of potential issues so that they may implement changes that will improve clinical practice when wearing masks.