1 Introduction

An attractively simple answer to the question of how we know the emotions of others is the direct perceptual model, which says that we know them through perception of those very emotions themselves.Footnote 1 Like ordinary objects and their properties, emotions are among the things in our environment that are perceptible.

This view finds support in the phenomenological traditionFootnote 2:

But that ‘experiences’ occur [in other people] is given for us in expressive phenomena – again, not by inference, but directly, as a sort of primary ‘perception’. It is in the blush that we perceive shame, in the laughter joy. (Scheler, 1954, p. 10)

The spirit and the soul shine through the human eye, through a man’s face, flesh, skin, through his whole figure…the inner shines in the outer and makes itself known through the outer. (Hegel, 1835, p. 20)

When I “see” shame “in” blushing, irritation in the furrowed brow, anger in the clenched fist, this is a still different phenomenon than when I look at the foreign living body’s level of sensation or perceive the other individual’s sensations and feelings of life with him. In the latter case, I comprehend one with the other. In the former case I see the one through the other. In the new phenomenon what is psychic is not only co-perceived with what is bodily but expressed through it. (Stein, 1964, pp. 75–76)

A recent challenge to the direct perception of emotion comes about through reflection on expressions of emotion. If we only ever perceive emotions by first perceiving expressions, then the perception of emotion is at best indirect. Here are some recent statements of the challenge:

What comes natural to us is to say that we see emotions but only in people’s expressions or behaviours. And this suggests a level of perceptual indirectness that does not intuitively hold between us and common objects or their colours. And it suggests a mediating role for people’s expressions and other behaviours for which there is no analogue in central cases of perceptual awareness or knowledge. (McNeill, 2019, p. 172)

And it seems to me that our ordinary ways of thinking take our knowledge of others’ minds to be mediated by other people’s expressive behaviour in a way that we do not take paradigmatic cases of perceptual knowledge to be mediated by the distinctive appearances of the objects of perceptual knowledge. (Gomes, 2019, p. 162)

In particular, [proponents of the perception of emotion] must claim that one can perceive an emotion in virtue of perceiving its expression, despite the fact that these are not identical. That is to say that there is a sense in which the perception of others’ emotions must be indirect. One sees someone’s fear in virtue of seeing their facial or other bodily expression. (Smith, 2017, p. 133)

So, the worry goes, while we may perceive emotions, such perception is disanalogous to paradigmatic cases of perception, since we perceive emotions via expressions. In comparison, we do not typically perceive a physical object by first perceiving something else. I call this the asymmetry objection.

This paper develops a response to the asymmetry objection. Specifically, I argue that it rests on an assumption about the epistemic role played by expressions in our perception of emotion. The objection can be taken as assuming one of the following two epistemological roles for expression. First, we can think of expressions as evidence – evidence for our taking other people to be in particular emotional states. We see the speaker’s shaking hands and come to know that they are nervous on the basis of this evidence. This may come about through, for instance, some sort of post-perceptual inference.

While perceptual accounts of our knowledge of others’ emotions do not always rule out the expressions-as-evidence view (see Cassam, 2007), they deny that we gain knowledge of emotions by perceiving emotions themselves. As such, the evidence view of expressions is incompatible with our target view.

The second option is implicit in the quotation above from Smith. Here, expressions are assumed to be perceptual intermediaries – things by which we perceive other things. ‘Other things’ can be understood in terms of something being non-identical with something else, rather than something being ontologically distinct from something else, though both may be true. As such, it is possible for a perceptual intermediary to be a part of that which it mediates (more on this later). Perceptual intermediaries mediate our awareness of other things by being that of which we are directly aware. This direct awareness of the perceptual intermediary somehow affords us indirect perceptual awareness of the object it mediates. The indirect realist with respect to other minds holds that emotions are only ever perceived by perceiving expressions, just as some claim that we are only ever aware of physical objects by being aware of sense-data.

But while seeing something by seeing something else can capture the intuition that our knowledge of others’ emotions is in some sense perceptual, it does not capture what’s going on in the phenomenological remarks made above. To see something by seeing something else is importantly different to seeing something in or through something else. In the former, we perceive two things: the perceptual intermediary and the object it affords us perceptual awareness of. In the latter, we directly perceive the object of our awareness in or through something that is transparent to us. As such, neither the evidence view nor the perceptual intermediaries view of expressions seems to capture the phenomenologically motivated idea that we directly perceive emotions.

I present a new solution to the asymmetry objection by considering a third option, inspired by Fritz Heider’s classic work The Psychology of Interpersonal Relations (1958). This is that we can understand expressions to play an analogous role to perceptual media. Perceiving emotions is not some two-step process in which we perceive emotions by first perceiving expressions, just as we do not perceive a fox by first perceiving the light around it. Rather, we perceive the fox through the light, and we perceive the emotion through the expression.

Illumination and sound are examples of perceptual media. They enable perception such that without illumination (or light), we would not be able to see the colour of the fox at the end of the road, and without sound, we would not be able to hear its activities. Crucially, we see and hear things through or in perceptual media. As will be spelled out later, in our perceptual experience of objects through media, media may contribute to our perceptual awareness, but they are not the objects of it.

Understood in this way, expressions can aid the perception of emotion in a way that is compatible with the direct perception model. In what follows, I draw an analogy between expressions and the paradigmatic examples of perceptual media: sound and illumination. By this account, we can capture the intuition that we see emotions in the expressions of others, without having to suggest an asymmetry between this and paradigmatic cases of perceptual knowledge.

The plan for the rest of the paper is as follows. In § 2 I clarify my account by distinguishing it from some nearby alternatives. In § 3 I introduce the phenomenon of perceptual media in more detail and draw out three key features of media: their variation, their role in perceptual constancy and their transparency. In § 4 I introduce expressions, and in § 5, § 6 and § 7 I draw an analogy between expressions and media with respect to expressive variation, emotion constancy and expressive transparency. I consider some objections in § 8 before concluding in § 9.

2 Parts, wholes, and other solutions

Before proceeding with a discussion of perceptual media, I want to distinguish this solution from some nearby alternatives. Firstly, it can be argued that the characterisation of perceptual intermediaries above is wrong. Perceiving x by perceiving something non-identical to x need not always be a case of indirect perception. Parts of objects are intermediaries which afford us direct awareness of their wholes.Footnote 3 We perceive the table by perceiving its facing surface and we perceive the fox when we only perceive its tail poking out from behind the fence.

This is relevant to emotions because a number of philosophers and psychologists of emotion characterise emotions as consisting of various components, expressions being one of them. For Scherer, emotions are episodes involving five key elements: the cognitive component (the appraisal), the neurophysiological component (bodily changes), the motivational component (certain action tendencies), the motor expression component (facial and vocal expression), and the subjective feeling component (Scherer, 2005; see also Goldie, 2011, pp. 12–13). Given this, a plausible way to characterise the relationship between expressions and emotions is as part to whole. This approach is adopted by Glazer (2017b, 2018); Green (2007, 2010); Hampshire (1972); Krueger and Overgaard (2012); and Tormey (1971), and is used to argue that we perceive emotions by perceiving expressions just as we perceive objects by perceiving their parts.

This proposal is discussed at length elsewhere (for a good overview, see Glazer (2018), and for critique, see Parrott (2017)), but it will be useful to briefly mention a few concerns to motivate our pursuit of an alternative solution. Firstly, that expressions are parts of emotions is thought to be incompatible with the demand that emotions cause expressions (Parrott, 2017, p. 1049). This demand arises from the fact that we often cite emotions as being responsible for their corresponding expressions (Soteriou, 2017, p. 77). We laugh because we are amused and cry because we are sad. This is a problem since it is generally understood that parts cannot cause their wholes and vice versa (Craver & Bechtel, 2007).

Secondly, the proposal has been objected to on the basis that emotions are not the sorts of things that can have expressions as parts. If emotions are mental states and expressions are occurrences, then emotions cannot have expressions as parts because states cannot have occurrences as parts. This is because states move through time by being wholly present at each moment of their existence. Occurrences unfold through time and, unlike states, have temporal parts. At some point during an occurrence, there is some temporal part that is yet to take place – but if the occurrence is a part of the state, then given transitivity, the state would have a temporal part that is yet to occur at that time. This is not possible for states, so the argument goes (Parrott, 2017; Smith, 2017). Therefore, to defend this picture of emotions as having expressive components, one would need to move away from an understanding of emotions as mental states.

Those who are not committed to a view of emotions as mental states will not be troubled by objections like this. A number of theories, including the dominant Basic Emotions Theory in psychology, understand emotions not as mental states but as complexes involving something like the components listed by Scherer above (Glazer, 2018). Nonetheless, it is unclear just how similar the kind of part-whole relation that obtains between an overall complex of co-ordinated components and the components themselves is to the kind of part-whole relation pertinent to ordinary cases of part-whole perception. We would usually understand the part-whole relation relevant to part-whole perception in terms of spatial location (Hornsby, 1988). Some x is a part of y at time t if x takes up some volume of space within y at t. This is true of the relation between tables and their facing surfaces and foxes and their tails. But it is not obviously true of emotion complexes and their various constituents.

Aside from the part-whole account, another important clarification to make is that the solution I present in this paper supports the direct perception of emotion by analogy with object perception. This is not the only way we could go in defending the direct perceptual model. For example, García Rodríguez has recently argued that we should interpret the direct perception theorist’s claim that we see emotions in expressions as a form of Gestalt perception (García Rodríguez, 2021). An example of perceiving a Gestalt is when we perceive either a duck’s beak or rabbit’s ears in the famous duck-rabbit ambiguous drawing. We cannot perceive either without perceiving the lines in front of us as a whole and in context; if we see a duck’s beak, our perception of it is direct and complete. Being aware of an emotion in an expression is like being aware of the duck’s beak in the drawing’s lines.

Another alternative perceptual model that could rescue the direct perception theorist is Richard Wollheim’s seeing-in.Footnote 4 Here, seeing-in describes a kind of perception appropriate to artistic representations, where we have a twofold experience of medium and object – the medium being the picture, the object being that which is represented in it (Wollheim, 2015, p. 142). The distinctive phenomenology of this kind of seeing is of a dual awareness of both things, where neither is more directly perceived than the other. If our perception of emotion is like this, we can explain the directness by analogy with representational seeing.

These last two solutions provide interesting avenues for the perception of emotion theorist to follow. They are distinct, however, from the solution put forth in this paper and the part-whole account, since they do not offer accounts of the direct perception of emotion on the model of our perception of ordinary objects.Footnote 5 There are two main reasons I provide a solution on the model of object-perception. Firstly, it is what the critics of the direct perception model have in mind. Take the McNeill quote above: ‘And this suggests a level of perceptual indirectness that does not intuitively hold between us and common objects or their colours.’ There is value, therefore, in meeting this objection on its own terms. Secondly, many of the historical motivators in phenomenology have analogies with ordinary objects in mind when they defend the perception of emotion. Take Nathalie Duddington’s opener in her paper on other minds: ‘our knowledge of other minds is as direct and immediate as our knowledge of physical things’ (Duddington, 1918, p. 147). The way I have framed the above objection is as a threat to this symmetry and it is worthwhile, therefore, exploring a solution that addresses this. The part-whole proposal does this, but as I am suggesting, the perceptual media proposal does it with fewer costs.

3 Perceptual media

Before turning to the analogy with expressions, we need to understand what perceptual media are and their core characteristics. Imagine seeing a fox on the road outside your window. We may ask how we can see the fox’s reddish colour, given that it lies some distance away from us. This is a question about how the fox can causally affect us. We may answer by positing various physical media: the particles in the air, the window, one’s eyes, and so on. In different situations, the physical media that enable one’s perception may change. Imagine snorkelling in the sea and seeing some reddish coral. In this case, the water mediates your perception of the coral’s colour where it didn’t with the fox’s.Footnote 6

Perceptual media are distinct from these physical media. Instead of answering a causal question, they answer the question of how it is that things are perceptually accessible to us. The propagation of light waves (the illumination) is what enables us to have a visual experience of the fox and the coral. In auditory perception, it is the patterned disturbance to the medium between us and a source (the sound) that enables us to hear the activities of objects. Without illumination, we would be unable to see most things; without sound, we would be unable to hear most things.Footnote 7

This is not yet enough to distinguish the phenomena fully. So far, perceptual media are understood as being enablers of perception such that, without them, things generally wouldn’t be seen or heard.Footnote 8 But this is also true of certain physical media like one’s eyes and ears. The following three features will help to refine the phenomenon further. They are important to have in view before the analogy with expressions is drawn.

3.1 Variation in media

We can distinguish differences in how illumination appears according to hue, saturation and brightness. Imagine watching a live concert. The spotlight on the singer is brighter than the surrounding light, which is atmospherically dim. This, of course, is useful in highlighting the singer. The backdrop is being lit with a slight blue hue, while the fire exits on either side of the stage are lit up in red so that they are easy to locate.

Sound, too, can vary according to volume, pitch, timbre and tone. This can sometimes be due to differences in the sources of sound; drums produce different sounds to violins. It can also be contingent on the surroundings. The density of the material in which the sound wave inheres will affect its form, and this is why the same object or event can sound different in different spaces. The same speech will be louder in a room with good acoustics. This raises an interesting feature of perceptual media: we can choose the way in which perceptual media present objects by adjusting our surrounding materials. We pick particular sources of light depending on how we want things to look, and we move to different rooms depending on how we want them to sound.Footnote 9

3.2 Perceptual constancy

Perceptual media play a particular role in the phenomenon of perceptual constancy.Footnote 10 Let’s take visual perception first. It is not the case that illumination remains constant throughout all our experiences. Imagine reading a paper, under a lamp, in a room dimly lit by an overhead light. Here, we have two illuminants, the lamp and overhead light, with the overall effect of variation in the illumination before us. The top of the paper closest to the lamp is brightest, whilst the other side falls under shadow.

The variation in the medium, however, is not necessarily mirrored in the perceived colour. There is a sense in which the whiteness of the paper remains fairly uniform, despite the change in brightness. As Heider puts it, ‘the color of an object appears surprisingly little influenced. In other words, perception of the object remains fairly constant in spite of the enormous variation in the proximal stimuli which mediate it’ (1958, p. 28).

Nonetheless, the colour appearance does admit of some change. The apparent whiteness of the paper appears in different shades as one’s eyes travel down the page. As such, there is some variation in the way the colour appears. We still take there to be one associated colour, but under the two sources of illumination, we have a dual experience of the paper’s colour as stable yet changing (Hilbert, 2005, p. 145).

What we can take from this is that a change to the perceptual medium sometimes gives rise to an experience of both change and stability, where the object’s colour looks the same, despite differences in colour appearance.

We get something similar in auditory perception:

So consider approaching a continuous source of sound, such as a waterfall. The waterfall, heard from different distances, sounds different. Heard from afar, the waterfall sounds quieter than it does when heard from nearby. As the perceiver approaches the waterfall, the sound of the waterfall increases in volume. But throughout the perceiver’s approach, the perceiver heard the constant flowing of the waterfall. The flowing of the waterfall is not experienced as getting louder so much as the perceiver is getting in a better position to hear just how loud the waterfall really is. (Kalderon, 2017, pp. 129–130)

Here we have variation in the sound as the perceiver’s position changes, whilst the object of perception remains stable. The waterfall seems to be flowing at a constant volume, and yet we have a sense of change with regard to its auditory appearance. Likewise, imagine that someone is shouting down the phone at you. This makes you hold the phone a little further away from your ear. Once you do this, you still hear them shouting just as loudly as they were before, but at the same time, it’s quieter from your adjusted position. This experience of change and stability in the way the shouting is heard is accompanied by a change to the perceptual medium; you are aware of the sound’s volume change.

In both of these instances of perceptual constancy, visual and auditory, we find we experience three things when perceiving an object. Firstly, we experience the constancy of the object; secondly, we experience variation in how that object appears or is heard; and thirdly, we experience this through a changing perceptual medium.

3.3 Transparency

Perceptual media are transparent. This means that they are perceptually penetrable, such that we see through or in them. Transparent things include air, water, glass and crystals. Recent discussions of media appeal to Aristotle’s notion of transparency (Kalderon, 2017; Mizrahi, 2019).

Now there is clearly something which is transparent, and by “transparent” I mean, what is visible, and yet not visible in itself, but rather owing its visibility to the colour of something else; of this character are air, water, and many solid bodies. (De Anima II, 7, transl. J.A. Smith)

Transparent things are in some sense perceived – but often, their visible character is owed to the things seen through them. If we look out of our window on a clear day, the window may be a part of our experience, but its blueness can be attributed to the sky that is seen through it. This is not to say that transparent things can never themselves be seen. I can look at a body of water and appreciate it in and of itself. But when I move towards it and look through it to the fish swimming below, its perceptible properties are in service of my seeing the fish.

It is this phenomenological aspect – the way in which a transparent medium contributes to the experience of objects seen through it – that is important for our discussion of the way in which enablers of perception afford us perceptual access to things. This is the sense of transparency that is shared by perceptual media, rather than the particular configuration of electrons that allows photons, and thereby light, to pass through materials such as glass and crystals.

A very transparent medium is phenomenologically significant in the following way. It contributes to the character of the experience by being the perspective – the way in which – the object is seen or heard. We are aware of the medium, but only in the background of our experience. Take, again, our waterfall:

Hearing the sound of the waterfall, from a given auditory perspective, may be implicit, it may be recessive and in the background, so that it does not compete for attentive resources directed towards the flowing of the waterfall, but it contributes to the conscious character of the perceiver’s auditory experience by being the way in which the distal process is presented in that experience. (Kalderon, 2017, p. 130)

Crucially, however, transparency comes in degrees. An immaculately clean window has a higher transparency than one covered in dirt. When the window is highly transparent, it features less in one’s experience. It may be so clean that one isn’t aware of it being there at all (and walks right into it on the way outside). The dirty window is less transparent and intrudes into one’s experience more.

So, too, for perceptual media. Sounds can be recessive and in the background of our experience. When we listen to someone talk we are often only aware of the sound of their voice in the background of our experience – for the most part, we’re just aware of what they’re saying. However, we may find the sound of their voice particularly grating and lose track of what they’re saying altogether as the medium intrudes into our experience. In this way, the medium becomes less transparent, and performs less well in its role.

Illumination, too, can be a better or worse medium depending on its level of transparency. We adjust the brightness on our computer so the illumination emanating from it contributes to our experience of what’s on the screen in the optimal way. If it is too bright, it can take too much of our attention and make it hard to focus; if it is too dim, we have to squint through it to discern the words in front of us.

These cases bring out a crucial feature of perceptual media, which is that as their transparency decreases, so does our perceptual access to objects perceived through them (Kalderon, 2017, p. 159).

4 Expressions of emotion

A problem we face in trying to determine the role that expressions play in our perception of emotion is understanding just what sorts of things we are talking about when we talk about expressions. We tend to think of behaviours like smiling, laughing, blushing, wincing, and frowning as expressive of emotion. And we tend not to think of behaviours like walking, craning one’s neck to see, closing the door, and yawning as expressive of emotion. To complicate things, sometimes these latter behaviours can be expressive, as when we walk with a spring in our step or close the door angrily behind us. As such, it may not be anything intrinsic to behaviours like smiling, blushing and walking that makes them expressive, but rather other characteristics of the expressive agent or their environment (Stein, 1964, p. 53; Tormey, 1971, pp. 44–45).

Different accounts pick out different characteristics as those which are essential to the expressivity of behaviour. For Green, what’s important is that the behaviour is designed for the purpose of communication (2007, p. 5) and, in particular, that it is designed to convey information about the emotions of an agent. We can communicate our emotions by showing them to others in three ways. Expressions can show emotions by making them perceptible to others, by providing evidence of emotions, or by showing-how a particular emotion feels (Green, 2007, p. 47). So, for Green, while expressions can enable the perception of emotion, this is not an essential function of expressive behaviour.

Like Green, Bar-On emphasises the communicative and voluntary aspect of expressive behaviour and the way in which it can enable the perception of mental phenomena (2004, pp. 270–274). Unlike Green, however, she does not take expressions’ role in communication always to be their distinguishing feature. What she calls ‘natural expressions’ like frowning and giggling can be expressive either in virtue of being intentional doings on the part of the agent or by merely being the culmination of a causal process beginning with the emotional state. In many cases, expressions will involve both of these things (Bar-On, 2004, pp. 248–249).

In contrast, some accounts take the function of enabling the perception of emotion as essential to expressions. Taylor argues that behaviour from which we can infer the presence of emotion is not necessarily expressive. What makes behaviour expressive is that it manifests the emotion in the sense that it puts it out in the public domain – it is available for others to directly see (Taylor, 1980, p. 283). Glazer argues that it is this feature of expressions, that they enable the perception of emotions, that determines which behaviours are expressive (Glazer, 2017b). While the former two accounts support a weak claim that expressions sometimes enable the perception of emotion, Taylor and Glazer support a stronger claim – that expressions essentially enable the perception of emotion (Glazer, 2017a, p. 193).

I will not argue one way or another here. If the weaker claim is true, then my proposal is that expressions sometimes behave as the media through which we perceive emotions. If the stronger claim is true, then my proposal is that expressions always behave as the media through which we perceive emotions. That is, on either side of the debate, there is still a question of how expressions make emotions directly perceptible. For some, the answer is as parts make wholes perceptible. As I have motivated above, in this paper I will look for an alternative solution.

While I can stay neutral with respect to the above debate over whether the function of enabling the perception of emotion is necessary for behaviour to be expressive, I cannot remain neutral about whether expressions are necessary for the perception of emotion. Insofar as the analogy between expressions and perceptual media goes, I am committed to the claim that emotions must be expressed to be perceptible. After all, the ability to make out one’s car in the dark depends on at least some degree of light and hearing the traffic depends on at least some degree of sound. Without perceptual media these things would go unseen and unheard. It is in the very nature of perceptual media that they are necessary for our perception of the objects they mediate.

That emotions must be expressed to be perceptible runs counter to Green’s account in which one can show (in the sense of making perceptible) an emotion without expressing it.Footnote 11 For Green, the range of things which reveal a person’s emotion is not the same as the range of things that express a person’s emotion. Blushing might perceptually reveal embarrassment, but it falls short of being an expression since it is not obviously designed for the purpose of communication (2007, p. 27).Footnote 12 This, Green thinks, is in accordance with our intuitions. Others disagree and argue that it is natural to include unintentional blushes within the range of expressive behaviours (Martin, 2010, p. 87).

If we assume that Green is right and the extension of ‘expression’ and the behaviours I have in mind can come apart, then the account I am presenting would just require a terminological shift. In what follows, I am suggesting that behaviours that enable the perception of emotion are analogous to perceptual media. I am working on the assumption that these behaviours are expressions. But if it turns out that some of these behaviours do not properly warrant the term, then I am happy to concede this point. The opponent of the direct perception of emotion wants to tell us that we cannot directly perceive that which is mediated by behaviour. Responding to this claim is the aim of this paper.

So, with respect to these perception-enabling behaviours at least, our first point of contact with perceptual media arises. We cannot see objects without visual media and we cannot hear objects without auditory media. Likewise, we cannot perceive emotions without perception-enabling behaviour. As the analogy is drawn in what follows, the above clarification aside, I call these behaviours expressions.

5 Expressive variation

Above I described illumination and sound as varying according to things like brightness, hue, volume and tone, resulting in a variety of different forms. The bright green light from a lamp and the sound of bird song are particular forms of illumination and sound.

Expressions are similarly varied. We express our discontent with a sigh, a huff, a frown, a squint, clasped hands, shaking heads, grimaces or grumbles. We do these things themselves in different ways; each person’s frown looks slightly different. As with illumination and sound, these differences are dependent on their sources. That is, different people produce different expressions. In part, this is because we have different faces. But, moreover, there are differences in the ways in which we choose to express ourselves. While one person is prone to a grimace, the other more often grumbles.

Additionally, we find expressive variation to be contingent upon our surroundings, in much the same way as illumination and sound. The same joke, told at work and at home, is likely to elicit two distinct expressive responses – we often need to emphasise our expressions in certain settings, as per certain social conventions, in ways we wouldn’t in our own homes. This emphasis on context when it comes to expressions is supported by a vast array of recent empirical research which highlights how context influences both the production of expression and its perception by others (Barrett et al., 2019).

We can also see the influence of choice when it comes to expression. Just as we choose particular sources of light in order to see things in a certain way, such as shifting the spotlight on stage so as to emphasise the actor, we can manipulate our expressions in order to emphasise some feelings over others.

This control shouldn’t be overstated. Our expressions often give us away despite our best efforts. We often find ourselves unable to exert full control over our expressions, and thus unable to determine the aspects of our emotional lives that others have access to. However, this limitation is also true of our control with regard to illumination and sound. While we can sometimes manipulate them for perceptual purposes, this isn’t always the case. For instance, despite the efforts of conference organisers, we sometimes find ourselves in rooms where the acoustics make it impossible to hear the speaker from the back of the room. And we certainly cannot do much about the light on a gloomy day as we try and fail to see our surroundings.

In sum, the above features of expressive variation demonstrate commonality with the variation in perceptual media. Both come in a variety of forms; such variety is dependent on differences in sources and surroundings; both are liable to manipulation; and such manipulation serves to alter what we have perceptual access to.

6 Emotion constancy

Earlier, we saw that variation in perceptual media is not always straightforwardly tracked in our identification of what is seen through them. Strikingly, our awareness of others’ emotions exhibits constancy in much the same way as colours do (McNeill, 2019, p. 176). Heider draws out this connection as follows:

The term constancy phenomenon is usually applied to the perception of color, brightness, size, and shape, but it is also applicable in the social perception of such crucial distal stimuli as wishes, needs, beliefs, abilities, affects, and personality traits. If we assert that “wish constancy” is possible just as there is a size, shape, or color constancy, that means we recognize a wish as being the same in spite of its being mediated by different cues. The same wish may be conveyed, for example, by an innumerable variety of word combinations, ranging from “I want that” to the lengthy and complicated reflections transmitted to the therapist in a psychoanalytic session. Or, the same wish may be conveyed by a colorful array of actions, as when a child, wanting a red wagon above all else, goes up and takes it, pushes a competing child from it, and even angrily kicks it in a fit of frustration. (1958, p. 28)

Heider emphasises how we can recognise the same underlying mental states despite a range of changes occurring in the medium; that is, differences in the way such mental states are conveyed. We can appreciate this, especially, when it comes to emotional expression. Imagine Michael and Robyn both apply for the same job, but only Robyn is successful. When Michael receives the news, he cycles through a series of expressions. To begin with, he just looks at the ground, then shakes his head, gives his friend a knowing look, and finally rests his head in his hands. When Robyn happens to enter the room, Michael’s expression changes again. He is gracious and congratulates her, but there’s a slight pinch in his voice and a tightness to his face. Despite such variation, it seems likely that an observer would take Michael to be disappointed throughout. This is the sense in which the emotion appears constant.

Nonetheless, as with perceptual constancy, the phenomenon is not one devoid of any change in appearance. The way the disappointment appears changes through the differences in expression. The disappointment clearly looks different when expressed through Michael’s head in his hands to when it’s expressed through a subtle tightness to his face. As such, we have a phenomenon involving variation and stability in emotion appearance, mediated by variation in expression.

As a capacity, our ability to recognise the constancy of emotions amid variation in expressions is imperfect. Sometimes, changes to expressions do alter the perception of their underlying mental states. Someone else might be totally convinced by Michael’s attempts to appear entirely happy for Robyn. Through his smile they take him to be perfectly content. This demonstrates that our capacity to recognise constancy can be stronger or weaker across perceivers. We capture this in our everyday language when we observe that people can be more or less ‘emotionally astute’.

Furthermore, our ability to spot steadfast emotions depends on the kinds of expressive changes in operation. We associate some expressions with some emotions more than others. Smiles and laughs usually indicate happiness, and grimaces are usually paired with disgust. These associations help us categorise what others feel. I would have little trouble tracking my friend’s happiness if they were to progress from a smile to a laugh. But, assuming they are indeed happy throughout, I might have more trouble tracking this if their smile becomes a grimace. There’s a sense in which our capacity for recognising emotion constancy can be thrown off when an unexpected expression enters the mix.

These limitations to emotion constancy are similarly found in ordinary colour constancy. Colour constancy is imperfect (Hilbert, 2005, p. 3). Imagine looking at a wooden table in a shop window, as the sunlight shines through, throwing half of the table into shade. You might initially be taken by the interesting design – to varnish the wood only on one side so that it takes on two different colours. However, as the sun passes behind a cloud, you come to realise there is no such colour difference. The apparent difference was merely a failure of colour constancy.

And as with emotion constancy, the phenomenon of colour constancy varies across perceivers (Hardin, 1988). A high degree of colour constancy is associated with naïve perceivers (such as children) who tend to focus on surfaces of objects (distal stimuli), whereas more developed perceivers tend to focus on light intensities (proximal stimuli). In explaining why this may be, Hardin writes: ‘A rather high degree of constancy is, in general, evolutionarily advantageous because it significantly assists the animal to reidentify objects; attention to proximal rather than distal stimulus is a sophisticated luxury’ (1988, p. 86).

A high degree of constancy is in general evolutionarily advantageous, and likewise, being emotionally astute is in general socially advantageous. But for some, it can be in their interests to be sensitive to the medium as well as the distal stimulus. Hardin picks out artists as a group for whom it is in their interests to focus on the proximal stimulus as well as the distal stimulus, so they can better manipulate their work to appeal to perceivers in general. Likewise, observers can study to a greater or lesser extent the expressions of others and how these serve the perception of emotion. An actor wishing to convey a particular emotion might be adept at noticing the expressions of others just as an artist is adept at noticing light intensities.

Finally, limitations on colour constancy can depend on differences in the particular perceptual medium in play. Just as we are used to certain expressions being associated with certain emotions, aiding our identification of them, we can be used to particular sources of light. I am used to the lamp emitting green light on my desk and I expect a certain change in the appearance of objects under its glow. I have no trouble seeing my purple pen as purple, despite the green hue it now appears to also have. But if I were to visit a desk which throws my purple pen under an unfamiliar blue light, I may have trouble identifying its purpleness and briefly mistake it for a different pen.

In these examples, we see that emotion constancy works in a similar way to our paradigm case of perceptual constancy. In particular, expressions occupy the same role as perceptual media: our discriminatory abilities transcend their variation and, as with perceptual media, this occurs to varying degrees.

7 Transparent expressions

In this section, I draw a further parallel between expressions and media. In order to play the same epistemological role in emotion perception as perceptual media play in paradigmatic perception, expressions need to be transparent. As discussed in § 3.3, perceptual media are transparent in the sense that their perceptible character comes from that which is perceived through them. They are in the background of our awareness and contribute to our experience by being the way in which something is perceived.

To motivate this, consider how we often invoke talk of transparency when discussing other people. What do we mean when we say, ‘he’s so transparent’? We usually mean that his real thoughts or feelings have been laid bare. More often than not, this is in spite of an intention on his part to conceal them. For instance, imagine asking the room who ate the last of the biscuits you’d been saving. Everyone adopts a nonplussed expression, but you can see through your brother’s expression to his evident guilt. This seems the sort of situation in which we’d call another ‘transparent’. What it’s natural to take this to mean is that we can see straight through their expressions to how they are truly feeling. The fact that our language can represent expressions as transparent may at least lend some intuitive support to the suggestion.

But the idea that this turn of phrase reflects any real kind of transparency is prima facie strange. After all, the vehicles of expressions seem to be people’s faces and bodies – expressions consist in opaque objects. And these we certainly seem to see.

However, this is also true of many uncontroversially transparent things. Windows are transparent. We see through, in, or out of them to what lies beyond. But this doesn’t mean they cannot sometimes be opaque. We sometimes look at them as windows. For example, we might admire the windows on another person’s home in the hope of adopting the same ones for our own. Likewise, having left a concert you might be left with a ringing in your ear. The sound of ringing signifies no object beyond it, but you are forced to attend to the sound as something in its own right. Expressions, similarly, can be the sole objects of our perceptual attention. They need not always be transparent, but this fact alone is not enough to rule out that they sometimes are.

Again, for something to be highly transparent, it is not straightforwardly that the thing isn’t seen, but rather that it isn’t seen in and of itself. When it is in the service of an object, any perceptual character it has is derived from the object seen through it. Furthermore, it is in the background of our awareness. Are expressions like this? Heider thinks they sometimes areFootnote 13:

In social perception, too, there are some instances in which the mediating factors are very obscure, and others in which we are or can be quite cognizant of the cues for the perception of o. For instance, we may see that a person is displeased, without being able to say just what about his appearance or behavior gave us that impression. This very often is true when the cues involve the interpretation of physiognomies, gestures, the tone of voice, and similar expressive features. They often mediate personality traits, wishes, or attitudes of persons without our being able to say what the materials upon which we base our perceptions. On the other hand, there are many occasions in which we can quite precisely elucidate the mediating conditions for our perceptions of other people. Often the raw material consists of actions and reactions of the person that can be perceived in their own right and can be separated from the terminal focus. (1958, p. 26)

Imagine looking at a friend and seeing that they are relieved. There’s a complex array of things happening on their face that contribute to their overall expression, and in this instance, one of them happens to be a smile. If you are pushed to explain why you saw them as relieved rather than happy (as smiles standardly display) it would be very difficult to explain without reference to the relief itself. Their smile gets its character in virtue of being a smile of relief. And this is just what it is to be transparent in the way perceptual media are. The perceptual character of the smile is owed to the object it enables us to be aware of.

Another reason it’s difficult to explain why the smile is one of relief is that, in many of our social interactions, we don’t attend to the expressions of others. We interact on the basis of how others think and feel. We are wary of someone because they are angry, not merely because they are scowling. While the scowl may be part of our experience of them, it is not our ‘terminal’ focus. Much like with our brightly lit phone screen, where we don’t so much see the brightness, but the text brightly lit, we don’t so much see expressions, but rather emotions expressed.

Heider is quite right, however, to point out that the extent to which expressions feature in our experience of another’s emotion varies. While I may not be attuned to another’s expression in some instances, there are times when expressions come into the foreground. When someone explodes in a tirade of anger, for example, their physical behaviour can be as much a part of my experience of them as their anger is. This, however, should not surprise us given what was said above regarding how transparency comes in degrees. Illumination and sound can similarly encroach on our perceptual experiences.

That said, in the discussion of transparency above, a particular phenomenon was identified. We saw that as the medium becomes less transparent, our perceptual access to that which it mediates is weakened. In other words, the more we attend to the medium, the less we are able to perceive things through it. Is this sort of see-saw phenomenon true of expressions and emotions? At first glance, no. One might think that the opposite is true; the more expressive one is, the more likely another is to perceive the underlying emotion. The bigger the smile, the more likely it is that you see the happiness. We tend to think of expressions as aids in our perception of emotion, rather than as distractions.

There are two lines of response here. Firstly, we might think that the see-saw phenomenon is overstated when it comes to perceptual media. It is not obvious that all cases of perceptual media being more obtrusive in our experience of an object render its perception worse off. Imagine you are trying out a new set of speakers at a friend’s house. You put on a song you know well in order to investigate whether the speakers really do improve sound quality. When listening, you’re more attuned than you normally would be to how it sounds. You pick up some of the subtleties that you wouldn’t normally hear and conclude that the new speakers really are very good. In this case, the sound of the song is more obtrusive in your experience than it has been, but you seem to have heard the song just as well.

Children with specific language impairments are sometimes encouraged to read books under coloured lighting, so that the pages appear, say, yellow or pink. This is said to make the words stand out better on the page and therefore easier to follow. Yet, the experience of viewing the page under these conditions can be one in which you are more aware of the light that you see through than you would otherwise be; you’re aware of its distinctive yellowness. There seem to be, therefore, cases in which the increased phenomenal presence of perceptual media can enhance perception in much the same way that exuberant expressions can enhance emotion perception.

Secondly, there are situations in which expressions and emotions do reflect this inverse relation: cases in which the more the expression features in one’s experience of another’s emotion, the less that emotion is in view. Imagine walking beside someone as they share details of an incident playing on their mind. While they express their remorse, you are struck by the beauty of the expression and, in the process, lose touch with their remorse in favour of attending to their expressive eyes. The aesthetics of expressions can often be a distraction from what underlies them. It is perhaps telling that people often say, ‘I’m an ugly crier’ as a way to alleviate the emotional weight of a situation. Likewise, part of why it must be so frustrating to be told ‘you’re cute when you’re angry’ is that the speaker has failed to focus on how you actually feel.

For another example, take the perception of someone exploding in a tirade of anger. At some point, the hail of fists coming towards you will take over your perceptual focus. In turn, you might lose sight of the anger altogether (as well as everything else in your vicinity) as you focus exclusively on the fists and how to avoid them. These cases demonstrate that, as with perceptual media, largely opaque expressions can make our perception of things worse off.

The fact that we can experience these shifts in our visual experience – between the emotion and expression of another – is better explained by the perceptual media account than the part-whole account. When we look through something transparent, the transparent material does not disappear – we can switch our attention back to it. When we switch our attention back to it, it is possible that we now no longer perceive what is behind it. The part-whole account of perception tells us that what it is to perceive the book in front of me (the whole) just is to perceive the facing surface that’s visually available to me. I cannot adjust my attention and thereby merely see the facing surface and not the book. Only on an understanding of expressions as transparent do we have room for expressions to feature in two kinds of visual experience – one in which we just look at the smile as a smile and one in which we experience the relief through it.

8 Objections

Emotions explain expressions. One disanalogy between emotional expressions and perceptual media is that we often take emotions to explain their expressions. Recent literature often highlights that expressions occur because of emotions (Smith, 2017, p. 134; Soteriou, 2017, p. 74). We scream and shout because we are angry, we smile because we are happy, and we sigh because we are relieved. For some, this demonstrates that a causal relation must obtain between emotions and expressions (Parrott, 2017, p. 1049).

But this kind of causal connection is not obviously present between perceptual media and the objects we perceive through them. When I perceive the reddish colour of the fox through the light outside, the light does not causally depend on the fox. It would still be light outside were there no fox there at all. Moreover, even when the fox is present, it wouldn’t make sense to ask why it’s light outside and provide an answer to do with the fox.

A first response to this is that when it comes to sound rather than illumination, it seems much more plausible to posit an explanatory relation. We talk of sounds being emitted by objects and often ask questions like ‘what’s making that sound?’ or ‘where’s that noise coming from?’. And with illumination, it’s not always inappropriate to ask questions about the choice of illuminant. We might ask why the lamp is on or why a torch is needed. Often we might ask these things in cases where it isn’t clear to perceivers just what the object of perception is. I might ask why a lamp is on because I have not clocked that you are reading or I might ask why you are shining a torch in the dark because I haven’t seen what it is you’re looking at.

Explanations we seek with respect to perceptual media tend not to be about the existence of an illuminant, but about its character. As we saw in the discussion of transparency, perceptual media adopt (to a greater or lesser extent) the character of that which is perceived through them. We do not so much ask ‘why is there light?’ but rather ‘why does the light have a pinkish hue?’ to which a non-technical answer would be to point to the setting sun. The same is true of the explanations involved in expressive phenomena. We do not just want to know why someone is expressing themselves, but why they are expressing themselves like this. We want to know why they are behaving in such a way, so that we can better identify their emotion and what might have caused it.

Relatedly, the extent to which we seek explanations for expressive behaviour should not be overstated. In fact, giving or asking for reasons for expressive behaviour only seems appropriate when such expressions are themselves inappropriate, shocking or unexpected. Screaming at one’s friends for no apparent reason might warrant explanation, but smiling upon receiving a gift does not.

Nonetheless, the point here might not be the prevalence (or lack thereof) of ascriptions of emotions as reasons for expressions, but rather that these instances indicate that there is in fact a causal relation underlying them. Let’s assume this to be true of expressions and emotions but not objects and perceptual media. There are two reasons this should not be of particular concern for the analogy being developed here. Firstly, this would not be a problem that is special to this solution. As discussed in § 2, it is often maintained that parts and their wholes cannot be causally related given that, since Hume, we generally take causal relata to be ontologically distinct. As such, if expressions are parts of emotions, then there is an in principle reason why emotions cannot cause expressions. In this way, the perceptual media solution outperforms the part-whole solution since I cannot see a reason for maintaining that perceptual media cannot be caused by the objects they mediate, even if in many cases they are not (take for instance the sound of birdsong, for which it is perfectly natural to take the bird to be the cause).

Secondly, in suggesting that expressions act as perceptual media in our experiences of others’ emotions, we need not maintain that expressions share in exactly the same features as illumination and sound. Inasmuch as illumination and sound differ from one another in important respects, so too for expressions. It is only when these differences are relevant to the way in which things are perceptually presented that we should be concerned. And we need not think that the causal properties of emotions and expressions affect the way emotions are perceptually presented through expressions. To emphasise this, one way we can think about the causal claim is in terms of the intentions of the agent – how emotions lead us to behave. In at least some cases, we intend to make others aware of how we feel by behaving in particular ways. This sort of agential control is perhaps typical of expression but not typical of sound and illumination. But it is not clear that it makes a difference from the perceiver’s perspective. Imagine a situation in which the illumination is under agential control. You walk into a dark room believing it to be empty and all of a sudden someone in the corner of the room turns on the overhead light. You can now see them and the fact that they were the cause of the illumination by which you can see them in no way affects how they look to you.Footnote 14 The way they look would have been no different had the situation been replicated except for the fact that it was you who switched on the light. As such, even if a causal relation is distinctive of emotions and expressions, this need not change the fact that when it comes to perceptual presentation, expressions behave as media do.

Differing percepts and perceptible properties. The above point is relevant to another potential worry. This is that the kinds of things that perceptual media enable us to perceive are different to the kinds of things that expressions enable us to perceive. Whether emotions are mental states, mental occurrences, feelings of bodily change, or some composite of a number of ontologically varied components, they are distinct from the kinds of objects we are aware of through sound and illumination that I have discussed so far – things like foxes, birds, pieces of paper.

As discussed in § 2, the defence of the direct perception of emotion that is sought here is of the structure of object-perception. The very question of how we perceive emotions is predicated on the assumption that they are different to material objects. The challenge is to explain how we can perceive, on the model of object-perception, things of a different nature.

But a related disanalogy between perceptual media and expressions is that when perceptual media make perceptible certain objects, we are able to attend to various perceptible properties of those objects. Footnote 15 The light enables me to see the fox and also to see the fox’s reddish-brown colour, the shape of its paws etc. But while the expression enables us to perceive emotions, it does not enable us to then attend to various perceptible properties of the emotion itself. That is, we do not attend to the colour, shape and size of the emotion.

One possible response to this worry is that not all of the things that perceptual media enable our perception of have perceptible properties. Perceptual media mediate our awareness of a range of perceptual ephemera such as absences (Farennikova, 2013) and causation (Siegel, 2009). Insofar as we perceive these things, we see them through illumination just as we see foxes through illumination. But we might be hard pressed to attribute a colour to these experiences. In short, emotions would not be the only things that flout certain paradigms of perception.

Furthermore, this challenge allows us to make a clarification. In my first example of a paradigmatic case of perceptual media, I said that illumination enables us to perceive the reddish-brown colour of the fox. In object-perception, we perceive both objects and properties of objects. I have so far remained neutral as to whether the emotions we directly perceive are themselves objects or properties of objects. But if we are pressed by this objection, one option is to say that emotions do not have perceptible properties because they themselves are such properties. What it is to perceive a person’s anger directly is to perceive a property of that person, akin to other perceptible properties like their colour, shape and size.

Multiple media. A final disanalogy is that while perceptual media are enablers of perception in general, expressions are enablers of perception only locally. It is only emotions and perhaps other mental phenomena that we perceive through expressions, unlike perceptual media. This in itself is not a problem for the analogy being developed – it need not affect the way in which expressions enable emotions to be perceptually manifest from the perceiver’s perspective. The phenomenal properties of the experience of emotion are not affected by the fact that the medium is emotion-specific. But what might affect them is the fact that in the emotion case, our experience is mediated twice over. We perceive emotions through expressions, on my account, but we also perceive them through illumination or sound. That is, when I look at someone across the room and perceive their sadness through their tears, this is possible only because of some degree of illumination. And when I hear their anger in their shouts, this is also only possible given some degree of sound.

While it might be the case that having two things enable one’s perception at once influences the character of the experience, such differences might not be unacceptable here. For one thing, we often see things through two kinds of illumination. In seeing the colour of the fox outside the window, one’s perception is mediated both by the light indoors and outdoors. When reading under a lamp and an overhead light, one’s experience is mediated by two illuminants. The same can be said for hearing things through multiple sounds. I hear the tennis match being played through the sound of shoes squeaking and the sound of the ball being hit. And not only do I perceive things through two different sounds or two different illuminants, I sometimes can only perceive them through a combination of sound and illumination. The light outside is so dim and the sound so quiet that, without both, I cannot perceptually distinguish my friend’s car pulling up outside my house.

One might still worry, however, that if expressions enable perception only in conjunction with other media, this just introduces another charge of indirectness. It makes emotion perception that bit more difficult than ordinary perception. But a greater degree of difficulty in discrimination should not be confused with indirect perception. Take Duddington’s remarks on our experience of other minds:

The degree of difficulty involved in the process of discovery will vary for different minds, and for one and the same mind at the different stages of its development. What one person “sees at a glance” another may take years to discern…it may be convenient to draw a further distinction between discovery attained through a single act of discriminating, and discovery attained through a series of such acts; but the important point is that the directness of knowledge has to do not with the means whereby the perception of any particular reality is attained, but with the circumstance that when the end is reached, the mind is in the presence of the object. (Duddington, 1918, pp. 151–152)

Adding a further perceptual medium to one’s perception does not make a difference to the kind of perceptual relation one stands in. And we should expect emotion perception to be a little trickier than our standard perception of material objects around us. While we usually have no trouble perceiving the tables and chairs in our vicinity, we are often mistaken about how other people feel and much of our emotional life goes unperceived. Illumination and sound are not sufficient to render emotions perceptible – we need expressions too.

9 Conclusion

At the start of this paper, I introduced three options for how expressions may operate in our coming to know another’s emotional state. The suggestions that they are evidence, from which we could infer the minds of others, or perceptual intermediaries, through which we could perceive the minds of others, are both incompatible with the idea that the direct object of our awareness is another’s emotion. I suggested a third option that is compatible. This is that we can think about expressions by analogy with illumination and sound, the perceptual media for vision and hearing.

There may be other objections to the theory that we directly perceive emotions and I have not presented a comprehensive defence of the account here. Rather, I have defended it from one criticism in particular. The asymmetry objection tells us that if expressions enable the perception of emotion, then such perception must be indirect. I have argued that this objection fails since it neglects the diverse ways in which things can enable perception. Not only does my opponent not consider the full range of intermediaries when constructing this objection, but I have shown that there are some striking similarities between perceptual media and expressions with respect to how they appear phenomenally in the service of that which they mediate. The burden of proof should be on the proponent of the asymmetry objection to demonstrate why expressions mediate emotion awareness in a way that screens off their direct perception, in light of the options discussed here and elsewhere.