A Critique of Embodied Simulation
- Cite this article as:
- Spaulding, S. Rev.Phil.Psych. (2011) 2: 579. doi:10.1007/s13164-011-0071-2
Social cognition is the capacity to understand and interact with others. The mainstream account of social cognition is mindreading, the view that we humans understand others by interpreting their behavior in terms of mental states. Recently theorists from philosophy, psychology, and neuroscience have challenged the mindreading account, arguing for a more deflationary account of social cognition. In this paper I examine a deflationary account of social cognition, embodied simulation, which is inspired by recent neuroscientific findings. I argue that embodied simulation fails to present an adequate alternative to mindreading accounts of social cognition. I defend a philosophically and empirically plausible two-systems account of social cognition, which holds that even very young children are capable of mindreading.
Social cognition is the capacity to understand and interact with others. In philosophy the mainstream account of social cognition is mindreading, the view that we humans understand others by interpreting their behavior in terms of mental states. We attribute beliefs and desires to others, and we explain and predict their behavior on the basis of those attributed mental states. Recently theorists from philosophy (Gallagher and Hutto 2007), psychology (Bruner 1990), and neuroscience (Gallese 2007) have challenged the mindreading account, arguing for a more deflationary account of social cognition. In this paper I shall examine a deflationary account of social cognition, embodied simulation, which is inspired by recent neuroscientific findings. I shall argue that embodied simulation fails to present an adequate alternative to mindreading accounts of social cognition.
2 Embodied Simulation
This alternative to mindreading has its inspiration in the developing science of mirror neurons. Mirror neurons are neurons that fire, or activate, when a subject acts, emotes or experiences a certain sensation, and also when a subject observes a target acting, emoting or experiencing a certain sensation. For example, a host of neurons in the premotor cortex and posterior parietal cortex fires when I grasp an object, and this same host of neurons fires when I observe another person grasping an object (Rizzolatti and Craighero 2004). There are similar mirror neuron systems for experiencing and observing emotion. When I experience disgust and when I observe another person experiencing disgust the same collection of neurons in the insula fires (Wicker et al. 2003). Similar findings hold for the experience and observation of fear (Adolphs et al. 1994), and sensations such as pain (Singer et al. 2004) and touch (Keysers and Perrett 2004). In each of these cases, groups of neurons are endogenously activated when the subject acts, emotes or feels a certain way, and these same groups of neurons are exogenously activated (at an attenuated level) when the subject observes another acting, emoting or feeling in those same ways.
Mirror neurons were originally discovered in the brains of Macaque monkeys by a group of Italian scientists who are known as the Parma group (because the discovery of mirror neurons occurred in Parma, Italy). The existence of mirror neuron systems has now been confirmed by a variety of brain scanning methods (fMRI, transcranial magnetic stimulation, single cell recordings). There is good evidence that there are mirror neuron systems in human brains, as well (Gallese et al. 2004; Keysers and Gazzola 2009; Rizzolatti and Craighero 2004).
There has been and continues to be much excitement about mirror neuron systems in both humans and non-human animals. But what is the importance of mirror neurons? With regard to social cognition, my focus in this paper, some theorists propose that mirror neurons are the basis for our ability to understand and interact successfully with other people (Gallese 2007; Hurley and Chater 2005; Gallese et al. 2004; Keysers and Gazzola 2009; Oberman and Ramachandran 2009). The argument goes like this. How is it that we understand what other people are doing, why they are doing it, and what they are going to do next? Mirror neuron studies demonstrate that parts of our brains fire in the same ways when we observe an action, emotion, or sensation and when we act, emote, or sense in the same way. An appealing tentative suggestion is that we understand what another person is doing, feeling and experiencing because when we observe the other person parts of our brains are activating as if we were doing what the other person is doing. Our brain activity mirrors the other person’s brain activity such that it is as if we are acting, feeling, or experiencing how the target is acting, feeling, or experiencing.
Vittorio Gallese, a member of the Parma group, has done the most to articulate and defend this idea. He calls his view embodied simulation (ES). ES is a functional mechanism that underpins our understanding of others by activating the same neural substrates when certain behaviors are executed and perceived. Our brain is reusing part of the same neural resources it would use if we were performing the same kind of action as we observe. We understand what it is like to act, emote and feel as our interactant does because parts of our brain that determine our own experiences mirror the activity in the other’s brain. In other words, in the observer’s brain is a neural mapping that is isomorphic to the neural mapping in the target’s brain.
According to my model, when we witness the intentional behavior of others, [ES] generates a specific phenomenal state of ‘intentional attunement.’ This phenomenal state in turn generates a peculiar quality of identification with other individuals, produced by establishing a dynamic relation of reciprocity between the ‘I’ and the ‘Thou.’ By means of [ES] we do not just ‘see’ an action, an emotion, or a sensation. Side by side with the sensory description of the observed social stimuli, internal representations of the body states associated with these actions, emotions, and sensations are evoked in the observer, ‘as if’ he or she were doing a similar action or experiencing a similar emotion or sensation. To see others’ behavior as an ‘action’ or as an experienced emotion or sensation specifically requires such behaviors to be mapped according to an isomorphic format. Such mapping is [ES] (Gallese 2009, p. 527).
ES is meant to contrast with the mainstream cognitivist way of modeling social interactions, which, according to Gallese, holds that in our understanding of others we start with the observation of mentally opaque behavior, and we interpret and explain the behavior in mental terms. We understand others by thinking about the contents of their minds, i.e., by mindreading. Gallese argues that although some social cognition is ‘social meta-cognition,’ most of the time we have a much more direct access to the inner world of others via ES (2007, p. 659). We do not have to reason about others’ mental states because we already directly understand them. “A direct form of understanding of others from within, as it were—intentional attunement—is achieved by the activation of neural systems underpinning what we and others do and feel” (Gallese 2009, p. 524). ES produces the phenomenal state of intentional attunement, which is a direct form of understanding others from within, and this obviates the need for social metacognition in our usual social interactions.
In order to prevent confusion, it is important to understand that Gallese has an alternative, deflationary account of intention. His notion of intention differs from the cognitivist understanding of intention according to which an intention is a contentful, representational mental state and intentional understanding is a meta-representational activity in which one represents the mental state of the target. For ES, an intention is merely a sequence of goal-directed behaviors, and intentional understanding consists in predicting the next action in a sequence. As Gallese puts it, “Ascribing intention would therefore consist in predicting a forthcoming new goal. According to this perspective, action prediction and the ascription of intentions are related phenomena, underpinned by the same functional mechanism, i.e., embodied simulation” (2007, p. 662). This definition of intentional understanding will be important later in the paper.
To be clear, Gallese is not arguing that ES is all there is to social cognition, but it does, he contends, play a major role in social cognition. Why should we believe that ES plays a major role in social cognition? There are several arguments, two of which I shall discuss here. The first argument is meant to cast doubt on the mainstream view, mindreading, and highlight the need for a deflationary account of social cognition. The second argument aims to establish that ES is generally sufficient for social cognition, so mindreading accounts are unparsimonious.
Gallese’s first argument challenges a widespread assumption of mindreading accounts, which I shall call the sharp distinction (Gallese 2007, p. 659). Mindreading accounts suggest a sharp distinction between the social cognitive skills of normal adult humans and non-human animals. Normal adult humans understand others by mindreading whereas non-human animals understand others behavioristically. There are two strands of argument against this sharp distinction, however. One concerns so-called sophisticated social cognition and the other focuses on the supposed behavior-based social cognition of non-human animals. Gallese argues that the data suggest that allegedly sophisticated social cognition is not so sophisticated after all, and the behavior-reading of non-human animals is not so crude, and hence there is no sharp distinction between mindreading and behavior-reading. I will address these sources of evidence in order.
For the last twenty or so years, the standard developmental picture of mindreading has been that at around 4 years of age, children undergo a fundamental shift in their mindreading abilities. As Heinz Wimmer and Josef Perner's experiments first revealed, and other experiments have since replicated, before the age of 4 children cannot pass standard false-belief tasks (Gopnik and Astington 1988; Wellman et al. 2001; Wimmer and Perner 1983). In one task commonly referred to as the Sally-Anne task, children listen to a story as it is enacted with dolls named Sally and Anne. In the scene, Sally hides a toy in one place and then she leaves the scene. Anne moves the toy from the original hiding place to a new hiding place. When children younger than 4 years old are asked where Sally will look for the toy, they answer incorrectly. They say she will look in the new hiding place. Children 4 years and older, however, typically answer correctly. They say Sally will look in the original place and explain why she will look there. This evidence has been taken to show that there is a significant developmental shift in mindreading abilities at around 4 years of age. At age 4 children shift from lacking proficiency with the concept of BELIEF to being able to appropriately apply the concept in a range of situations. That is, at age 4 children master the BELIEF concept. Given that the concept of BELIEF plays an important role in understanding others’ mental states, the standard false-belief task has been taken to be the measuring stick of mindreading abilities.
Kristine Onishi and Renée Baillargeon (2005) have objected to the standard false-belief tasks, arguing that these tasks are computationally and linguistically too taxing for children younger than 4 years old. The standard false-belief task requires children to remember the details of the story (who saw what and when), to interpret adults’ questions, and to give appropriate responses to these questions. Many of these task demands are unrelated to mindreading per se (Bloom and German 2000). Rather, the demands of the standard false-belief task reveal performance of executive functions, e.g., memory and response inhibition. In lieu of the standard measuring stick, Onishi and Baillargeon opt for a simplified non-linguistic false-belief task to measure mindreading abilities of younger children.
In their novel non-linguistic false-belief task, 15-month-old infants watch an actor put a toy watermelon slice in one of two adjacent boxes, a green box or yellow box. Next, the toy is moved. In half the trials the toy is moved halfway to the other box and then back to the original box, and in the other half of the trials the toy is moved to the other box. For both of these conditions the actor either does or does not see the movement of the toy. (In one variation she looks through an opening in the tops of the boxes and in another variation she does not.) Using the violation-of-expectation method, Onishi and Baillargeon found that 15-month-old infants looked longer in two cases: first, when the actor does not see that the toy's location has changed, but searches in the correct box anyway, and second, when the actor does see the toy being relocated but the actor reaches in the incorrect box.
Onishi and Baillargeon interpret these results as showing that the 15-month-old infants expect the actor to search for the toy on the basis of her belief about the toy's location. When the actor does not search for the toy on the basis of her belief, the infants' expectations are violated and they thus look longer at those events. Onishi and Baillargeon take this to be good evidence for the conclusion that 15-month-old infants already have mindreading abilities and that the ability to mindread, in at least a rudimentary form, is innate.
If Onishi and Baillargeon are correct, infants as young as 15 months old can understand false beliefs. This result—and other similar results from developmental psychology—should be puzzling if your theory commits you to the idea that understanding false belief requires, first, that one possess the concept of BELIEF, second, that one must attribute to the target a particular belief, and third, that one must understand that the content of the target’s belief does not correspond to reality. 15-month-old infants are pre-linguistic and likely not capable of this kind of sophisticated mentalizing. It seems, Gallese argues, that understanding false beliefs cannot be as cognitively sophisticated as mindreading approaches indicate. In particular, understanding false beliefs does not require attributing to the target a mental representation with propositional content. Thus, Gallese’s first way of rebutting the sharp distinction is to show that a kind of social cognition that others assume is very cognitively sophisticated, false-belief understanding, is not so sophisticated after all.
Gallese’s second way of attacking the sharp distinction between the social cognition of adult humans and that of non-human animals involves the alleged mere behavior-reading of non-human animals. Behavior-reading consists in observing a behavior and predicting the behavioral consequences of that behavior. Importantly, it does not in any way involve inferring anything about the internal mental life of the target.2 The clearest cases of behavior-reading involve technology. When typing keywords into Google’s search bar, Google takes as input the symbols you type and brings up results that are relevant to your search according to some algorithm. Although it seems that Google knows what you want and caters its search results accordingly, it is simply taking your behavior as input and responding in the way that some algorithm dictates. The Google example demonstrates what behavior-reading is, but it also shows that behavior-reading can be highly complex. Behavior-reading need not be a simplistic stimulus–response mechanism.
Gallese argues that there is good reason to think that non-human animals engage in more than simple behavior-reading. Some of this evidence comes from classic behavioral experiments in cognitive ethology, which test monkeys’ social cognitive skills. Although these experiments reveal clear limitations on monkeys’ social cognitive abilities (Povinelli and Vonk 2003), they also show that monkeys are not merely behavior-reading. Experiments in cognitive ethology reveal that some monkeys are sensitive to the gaze direction of conspecifics and humans, follow others’ gazes to out-of-view objects, and take into account opaque barriers (Tomasello et al. 2003). They can also adapt their food retrieval strategy based on whether a dominant competitor can see or has seen the food location, and they can even manipulate whether a competitor can see them to gain strategic advantage (Hare et al. 2000; Hare et al. 2001, 2006). These experiments show at a minimum that monkeys understand that seeing leads to knowing, and this is plausibly explicable in terms of intentional attunement rather than mere behavior-reading.
In addition to these behavioral studies, Gallese relies on the Parma group’s mirror neuron studies on Macaque monkeys. Mirror neuron studies indicate that monkeys are doing more than simple behavior-reading, that they are engaged in intention understanding. According to ES, the social cognition of non-human animals is not mere behavior-reading because for creatures with mirror neurons no behavior-reading is mere behavior-reading. The observation of goal-directed behavior is automatically imbued with social meaning.
Both the evidence from cognitive ethology and the studies on mirror neurons are controversial. The cognitive ethology experiments Gallese cites are part of a highly contentious ongoing debate in which there is no clear winner. Moreover, I am highly skeptical of the Parma group’s interpretation of mirror neuron activity as underlying intentional understanding.3 For now, though, let’s grant Gallese’s claim that cognitive ethology and mirror neuron studies provide good evidence against the claim that non-human animals are mere behavior-readers. I shall argue in the next section that even if this claim were true, Gallese’s conclusion would still not follow.
Gallese argues that there is strong evidence against a sharp distinction between mindreading and the social cognition of non-human animals, and because mindreading accounts are committed to such a sharp distinction, we should reject mindreading accounts. ES is an improvement over mindreading accounts because it explains why there is no such sharp distinction. First, mirror neuron-endowed creatures are equipped to grasp automatically the meaning of observed goal-directed behaviors, and so their social interactions automatically involve intentional attunement, a basic form of social understanding. Second, our ordinary interactions both as children and adults only require intentional attunement, the direct understanding realized by ES. Once we have intentional attunement, we typically do not need to engage in mindreading to explain the observed behavior. That is, the intentional attunement realized by ES is sufficient for our normal interactions. Thus, ES’s intentional attunement bridges the chasm between mindreading and mere behavior-reading.
3 Criticism of Embodied Simulation
The above arguments are meant to cast doubt on mindreading accounts and highlight the need for a more deflationary account of social cognition. In this section I shall critically evaluate the arguments. I shall argue that none of these arguments give us reason to doubt mindreading accounts of social cognition.
The Sharp Distinction
Gallese notes that the mindreading account draws a sharp distinction between mindreading and mere behavior-reading with the former characterizing adult human social cognition and the latter characterizing non-human social cognition (Gallese 2007, p. 659). Gallese finds this sharp distinction implausible. He says, “[I]t seems preposterous to claim that our capacity to reflect on the intentions, beliefs, and desires determining the behaviour of others is all there is in social cognition. It is even less obvious that, while understanding the intentions of others, we employ a cognitive strategy totally unrelated to predicting the consequences of their observed behaviours” (Gallese 2007, p. 659).
Reading the above quote literally, Gallese is attributing to mindreading accounts the claim that the kind of social cognition in which normal adult humans engage is entirely different from the kind of social cognition in which non-human animals engage. Adults engage only in mindreading, and this is entirely divorced from the behavior-reading in which non-human animals engage. This is a claim that, so far as I can tell, no mindreading theorist endorses. Sophisticated mindreading skills piggyback on more basic forms of social cognition. The basic forms of social cognition, like behavior-reading, that serve as the basis for the development of mindreading, do not disappear once high-level mindreading skills develop. They underlie sophisticated mindreading and are still at work in more basic social interactions. Furthermore, not all of adult social cognition consists in attributing mental states to others. Gallese argues against “the notion that the sole account of interpersonal understanding consists in explicitly attributing to others propositional attitudes like beliefs and desires, mapped as symbolic representations” (Gallese 2009, p. 524). But this too is an idea no one seriously endorses. If explicitly attributing propositional attitudes means consciously attributing propositional attitudes, then just about all mindreading theorists would reject this notion. And although one might think that mindreading theorists believe that all there is to social cognition is mindreading because often their focus is solely on the nature of mindreading, no one argues that mindreading is all there is to social cognition. Behavior-reading, among other things, is still an important component of social cognition. Thus, mindreading theorists are not committed to the literal interpretation of the claim Gallese attributes to them.4
A weaker interpretation of the sharp distinction claim is that the kind of social cognition in which normal human adults often engage and the kind in which non-human animals engage are very different, but not entirely different. Normal human adults often engage in sophisticated mindreading whereas non-human animals do not. There is debate about the “often” part and how to characterize the social cognitive skills of non-human animals, but this general idea is widespread in the mindreading literature. So what is Gallese’s argument against this idea?
Gallese relies on two kinds of arguments to establish that there is not a sharp distinction between mindreading and behavior-reading: data from developmental psychology, like Onishi and Baillargeon’s results which allegedly downgrade the cognitive sophistication required for high-level mindreading, and data that allegedly show that non-human animals are not mere behavior-readers. I will consider these arguments in order.
Gallese argues that studies such as Onishi and Baillargeon’s should lead us to conclude that understanding false beliefs is not so sophisticated as mindreading theorists argue. That may well be true, but the results do not unequivocally suggest that conclusion, and in fact I think there is a better interpretation of this and other such studies. In the literature the response to such studies has been split between two contrary views (Apperly and Butterfill 2009). There are those who think these studies show that even 15-month-old infants understand false beliefs, perhaps already possess the BELIEF concept, and that the reason children fail the standard false-belief task until age 4 is that up until age 4 the task is too demanding on younger children’s executive system, e.g., short-term memory and response-inhibition (Baillargeon et al. 2010; Leslie et al. 2004; Onishi and Baillargeon 2005). On the opposite side are those who think that these studies show that infants must be very clever behavior-readers because it is not possible for 15-month-olds to understand false beliefs, or possess the BELIEF concept, and the reason children fail the standard false-belief task until age 4 is because up until age 4 they do not fully grasp the BELIEF concept (Perner and Ruffman 2005). According to the first interpretation, the shift that children undergo when they start to pass standard false-belief tasks at age 4 reflects a gradual development of the executive system, whereas according to the second interpretation this shift in the ability to pass standard false-belief tasks represents a genuine conceptual change.
Gallese’s position falls in the first camp, which holds that these studies show that infants as young as 15 months old understand false beliefs. Gallese may seem like a strange bedfellow for those who offer mentalistic interpretations of these studies, but this alignment is perfectly consistent with his account. Gallese’s position on the interpretation of developmental psychology studies stems from his argument against the idea of mere behavior-reading. For Gallese, behavior-reading is a solipsistic process, devoid of social meaning for the observer. He thinks the social cognition of mirror neuron-endowed beings (monkeys, human infants, etc.) involves an appreciation of what it is like to perform the observed act. Mirror neuron-endowed creatures are capable of intentional attunement, so the behavior-reading interpretation of the developmental psychology studies must be wrong.5 Instead, the conclusion he draws from this is that understanding beliefs (and false beliefs) must not be so cognitively sophisticated if infants can do it. He takes this result to be evidence for his deflationary account of social cognition.
My own interpretation of these results differs from both of the above responses.6 In my view, the truth lies between these two extremes. On the one hand, the 15-month-olds’ understanding of the other agent made manifest in the violation-of-expectation looking times is not equivalent to the attribution of false beliefs. Reasoning about beliefs involves appreciating the intentionality of mental states, grasping complex causal relations among mental states, complex abductive reasoning about what the target believes and how she will act on those beliefs, understanding how beliefs, in conjunction with desires, rationalize behavior, and ascribing states with propositional content (Davidson 2001). The Onishi and Baillargeon study, or any other study for that matter, does not show that infants are capable of the foregoing reasoning. The evidence does not suggest that 15-month-olds are reasoning about beliefs in the sense described above. Reasoning about beliefs is very computationally demanding because the contents of beliefs are domain general and often unrelated to proximal observable stimuli. Moreover, we have little reason to believe that infants grasp the often-complex causal relations among (true and false) beliefs and other mental states, and appreciate the intentionality of mental states. After all, even 5- and 6-year-old children who have passed standard false-belief tasks fail dual identity tasks, which test whether or not children understand that beliefs about objects are held under certain descriptions, or intensions (Apperly and Robinson 2003). Even 5- and 6-year-olds fail to appreciate this aspect of belief, which is arguably an important part of understanding the concept of BELIEF.
However, I think the alternative interpretation, that they are merely very good behavior-readers, is wrong as well. First, the reason for endorsing the view that infants must be very clever behavior-readers is that reasoning about beliefs is too sophisticated for infants. But it is not clear why learning complex behavioral associations or rules would be much easier than reasoning about beliefs. According to the behavior-reading hypothesis, these infants form associations between particular agents, objects, agents’ orientation toward objects, objects’ locations, etc. With regard to the Onishi and Baillargeon results, the hypothesis is that infants look longer when the adult reaches for the toy in the green box when she last saw the toy in the yellow box because they associate the adult with the toy and the yellow box and this association is broken. Forming such behavioral associations would still involve cognitively costly abductive reasoning about the relations between agents, perceptions, objects, locations, and perhaps behavioral rules about how ignorant agents behave.
“[A]fter watching familiarization events in which an agent repeatedly grasps object-A, infants look longer at test events if the agent now grasps object-B, but only if object-B is both present and visible to the agent during the familiarization events, so that infants have evidence that the agent prefers object-A over object-B. These different looking-patterns indicate that infants do not merely form associations but consider (at the very least) the motivational and reality-incongruent informational states that underlie agents’ actions” (Baillargeon et al. 2010, p. 115).
I think that what this and dozens of other studies show is that it is highly unlikely that infants are merely very clever behavior-readers. It is one thing to explain one particular result by positing a behavioral rule, although even that is not always easy: what non-ad hoc behavioral rule or association would plausibly explain 5-month-olds exhibiting the looking pattern described in the above passage? But it is another thing altogether to look at the spectrum of findings in developmental psychology and claim that each of them is explained by some behavioral rule or association. The collection of behavioral associations and rules we would have to posit to explain infants’ behaviors in all of these studies would be massive, immensely complex, and suspiciously ad hoc. The number of behavioral associations and rules attributed to infants seems to correlate with the number of different sorts of experiments in developmental psychology. It is not clear what advantage is gained by maintaining that infants are very, very clever behavior-readers.7
In terms of computational complexity, understanding others’ behavior would be much simpler for infants if they interpreted behavior in terms of inner mental states, e.g., intentions, desires, preferences, etc., because they would not have to memorize and sift through numerous behavioral rules and associations in order to understand an observed behavior. We know that older children demonstrate the ability to interpret behavior in terms of mental states. Thus, the question is not whether children develop this ability, but when they develop it. The evidence suggests that from very early on infants interpret behavior mentalistically. But if both above accounts of infants’ social cognitive abilities are wrong, what is the right account? I shall advance a thesis that accommodates the idea that infants mentalistically interpret behavior but which is not committed to the idea that infants grasp the concept of BELIEF.
Both human infants and adults have beliefs. That is, stored in their minds are representational states, which have certain propositional content, and which play a causal role in the production of their behavior. But only adults are demonstrably capable of reasoning about and attributing beliefs to themselves and others. For reasons described above—namely, the computational complexity of belief attribution and the requisite sophisticated concept of BELIEF—infants are not capable of reasoning about and attributing beliefs. That is, infants are not capable of reasoning about what I shall call belief proper. I suggest, however, that they are capable of employing a more rudimentary mental state concept, a prototype of the belief concept. Infants understand others’ behavior in terms of what I shall cleverly call belief-like states.
Belief-like states are similar to beliefs proper in several respects: they are representational mental states that are shaped by perception and guide behavior.8 Belief-like states are different from beliefs proper in the following respects. The contents of belief-like states are always closely tied to proximal observable behavior: to how proximal stimuli affect others’ perceptions and, ultimately, their behavior. Beliefs proper, in contrast, can be about anything whatsoever, need have no connection to immediate, observable stimuli, and may bear complex relations to behavior.
Reasoning about beliefs proper and reasoning about belief-like states are similar in certain respects: both require inferring in a target a representational state, which has a mind-to-world direction of fit, and which plays a causal role in the production of behavior. However, reasoning about belief-like states is not nearly as computationally complex as reasoning about beliefs proper. The former involves reasoning about states that are closely tied to current and recent observation, whereas the latter requires inferring, and grasping the complex causal relations among, mental representations that are domain general and can have contents unrelated to proximal observable stimuli. Moreover, reasoning about beliefs requires grasping the concept of BELIEF, whereas reasoning about belief-like states does not. The precursor or prototype concept is significantly less sophisticated than the BELIEF concept.9 It does not require grasping the fact that beliefs are held under certain descriptions,10 the appearance/reality distinction,11 or any of the more nuanced aspects of belief proper, e.g., different types (occurrent, dispositional, implicit, etc.) and degrees of belief.12
My hypothesis is that adults can, and often do, understand others’ behavior in terms of beliefs proper, whereas infants are limited to understanding others’ behavior in terms of belief-like states. One philosophically and psychologically plausible framework for understanding this hypothesis is a two-systems account of social cognition. One system is innate, fast, and extremely limited, and the other system is late developing, slower, and highly flexible. Such a two-systems account of social cognition is analogous to Peter Carruthers’ two-systems approach to human reasoning. Carruthers argues that System 1 consists in a series of parallel systems, which are common to all members of the species, evolutionarily ancient, have their own virtually immutable trajectory of learning and change, operate swiftly and unconsciously, and are not subject to voluntary control. In contrast, System 2 operates linearly, is slower, is characteristically conscious in its operations, can override System 1 reasoning, is subject to explicit teaching and voluntary control, and is much more subject to individual variation (Carruthers 2006, p. 254). Virtually all theorists studying human reasoning have converged on some version of a two-systems account (Evans and Over 1996; Stanovich et al. 2008). The study of social cognition can benefit from adopting such a two-systems approach.
The following account, which is heavily influenced by Ian Apperly and Stephen Butterfill’s (2009) account, is a two systems account of human social cognition that I shall argue is well supported and offers a unified explanation of the findings from developmental psychology.13 System 1 is a fast, frugal, and inflexible system that processes belief-like states. Minimally, having System 1 reasoning implies that the cognizer can infer that a target has an internal cognitive state (a belief-like state) that is shaped by his perceptions and which—along with his goals—guides his behavior.
System 2 is a slower, cognitively costly, flexible system that processes beliefs proper. Facility with System 2 reasoning implies having the concept of BELIEF, whereas facility with System 1 reasoning does not. Inferring and reasoning about beliefs proper is slower and more cognitively costly than reasoning about belief-like states because the former can be about anything at all and need not have any immediate, obvious connection with behavior. Given that belief-like states are limited to perceptions and how these affect observable behavior, reasoning about belief-like states is much faster and easier than reasoning about beliefs. But for the same reason, reasoning about belief-like states is significantly limited and inflexible compared to reasoning about beliefs.14
System 1 is early developing and operates roughly the same for all normally developing human infants. System 2 develops gradually throughout the grade school years and is subject to cultural and individual variation. With regard to studies in developmental psychology, the hypothesis I am defending is that System 1 reasoning about belief-like states is the principal mechanism for infants’ anticipation of intentional behavior and interpretation of others’ perceptual perspectives. For example, the infants in Onishi and Baillargeon’s study expect that the experimenter will reach in a certain direction for the toy. The robust explanation of this result is that the expectation is based on a full-blooded belief attribution, whereas the minimalist explanation is that the expectation is based on a behavioral rule that people reach for objects where they were last oriented toward them. On the intermediate account I am advancing, infants understand the experimenter’s behavior in terms of what she sees and her intention to grasp the toy, and they attribute to her a belief-like state about the location of the toy. This sort of understanding is more limited than full-blooded belief understanding. It is limited to how current and recent perceptions affect the experimenter’s intentions and behavior. Understanding others’ behavior in terms of belief-like states suffices for the basic social interactions of which infants are capable.
Such an account is philosophically and psychologically plausible. It is on solid ground philosophically because it respects the established notion of what reasoning about beliefs proper amounts to, e.g., ascribing domain general propositional attitudes, understanding how distinct mental states interrelate, reasoning about what the target believes and how she will act on those beliefs, the extent to which those beliefs correspond to reality, how beliefs, in conjunction with desires, rationalize behavior, etc. (Davidson 2001). It respects the fact that beliefs are domain independent and bear complex relations to other propositional attitudes and to behavior. Thus, this account preserves the established notion of what it is to reason about beliefs, and it does so without being committed to the empirically disconfirmed claim that infants are mere behavior-readers.
The account is also psychologically plausible. Two-systems accounts of cognitive functions are well established in psychology. For example, there are two-systems approaches to number cognition, empathy, vision, learning, and reasoning, just to name a few. These approaches share a commitment to a fast, automatic, limited system and a slower, more cognitively costly system that is influenced by conceptual knowledge. It is plausible that there are two such systems with regard to social cognition, as well.15, 16
Let’s incorporate the two-systems approach into the discussion of ES. Recall that Gallese argues against the mindreading accounts by arguing against the sharp distinction assumption, the idea that normal adult human social cognition is very different from that of less cognitively sophisticated creatures. He relies on studies from developmental psychology that purport to show that even 15-month-olds can understand false beliefs in order to demonstrate that even false-belief understanding is not so sophisticated after all. But I have argued that we should dismiss the dichotomous interpretations of such results in developmental psychology, and that there is a middle ground between behavior-reading and full-blown belief understanding. If I am right about this, and something like System 1 better represents the sort of understanding infants exhibit in these studies, then Gallese’s argument fails. Gallese attempts to show that false-belief understanding (that is, understanding that beliefs proper can be false) is not so sophisticated because infants can do it. The best interpretation of these studies, however, is that infants have an understanding of belief-like states. Proper false-belief understanding emerges slowly and much later in development. Gallese has done nothing to refute the idea that proper false-belief understanding is in fact quite sophisticated and is a typical feature of normal adult human social cognition. That being the case, the first prong of his attack on the sharp distinction assumption fails. Consideration of experiments in developmental psychology does not, as he claims, warrant a preference for ES over mindreading.
The second part of Gallese’s argument against the sharp distinction between behavior-reading and mindreading involves behavioral studies from cognitive ethology and mirror neuron studies. The behavioral studies, Gallese argues, show that monkeys are capable of more than brute behavior-reading. They can understand intentional behavior (though again keep in mind that intention has a deflationary meaning here). The mirror neuron studies are meant to substantiate this conclusion. However, even if the claims about behavioral studies and mirror neurons were true, the motivating argument for ES fails. Even if Gallese is correct that non-human animals engage in more than mere behavior-reading, this is not a serious challenge to mindreading accounts.
In fact, I think Gallese is right that non-human animals are not all mere behavior-readers. But this idea in no way suggests a rejection of mindreading. The following two hypotheses are compatible with the general mindreading account: Some animals, especially primates, are capable of System 1 reasoning about belief-like states. That is, they reason about how stimuli affect conspecifics’ perceptions, intentions and behavior. A second possibility is that non-human animals understand conspecifics’ behavior in terms of perceptual states and goal-oriented behavior, not intentions and belief-like states. This sort of understanding would be less sophisticated than System 1 reasoning, but still it is not mere behavior-reading.17 These two suggestions accommodate the motivation for Gallese’s argument without in any way casting doubt on the claim that adult humans often engage in sophisticated (System 2) mindreading and non-human animals do not.
Gallese also objects that mindreading accounts are unparsimonious. He holds that the intentional attunement afforded by ES is necessary and sufficient to explain our normal social interactions. We can, of course, attribute mental states to others, but “[m]ost of the time, though, we do not need to do this” (Gallese 2007, p. 659). There is no need to posit mindreading over and above intentional attunement in order to explain our ordinary social interactions. We do not need to represent the propositional attitudes of others in order to successfully engage in social interactions. The direct understanding afforded by ES is sufficient for most of our ordinary interactions. Thus, mindreading accounts are unparsimonious.
I take this to be ES’s most important argument. If proponents of ES are correct about this, then the posits of mindreading accounts would seem profligate. However, if non-mentalistic embodied practices are not sufficient to explain our normal social interactions, then ES’s claims to parsimony are unjustified, and the need for something like mindreading is confirmed.
On the face of it, the claim that embodied simulational practices are sufficient seems patently false. There are many social activities in everyday life that seem to outstrip basic embodied practices. Examples from my own life include teaching philosophy, engaging with fiction, gossiping, debating politics, counseling friends, etc. These examples are not unique to my life. These and other such activities are widespread in human life, and it is difficult to see how ES alone could adequately explain them. That is not to discount the importance of embodied practices. They may certainly have an explanatory role to play here, but they could not be sufficient.
ES theorists could deny that such activities are widespread in everyday human life, but this seems just false. For example, gossip, which involves speculating about others’ motivations, feelings, beliefs and intentions, seems to make up a substantial part of adult socializing. Alternatively, they could emphasize all the activities in our social lives that do not go beyond intentional attunement. The purpose of a move like this would be to deemphasize the importance of the activities that outstrip embodied practices. This response simply changes the subject of discourse, for mindreading theorists are interested in explaining a certain part of social cognition, namely, folk psychological competence of the sort that is present in social interactions like gossip. Moreover, this move seems to admit implicitly that the deflationary account must cede sophisticated social cognition to the mindreading account.
I think it is more likely that ES theorists would argue that these activities do indeed require more than intentional attunement but deny that mindreading is required to explain these activities. For instance, the Narrative Practice Hypothesis (NPH), a hypothesis introduced by embodied cognition theorist Daniel Hutto (2008), offers a non-mindreading account of the source and nature of sophisticated social cognition. NPH holds that the source of our capacity for sophisticated social cognition is direct encounters with folk psychological narratives, and the nature of folk psychological competence consists in understanding others’ reasons for action in terms of folk psychological narratives. On this view, developing folk psychological competence consists in learning about the forms and norms of reason-giving explanations. NPH is meant to be completely independent from mindreading. Apparently it does not consist in, nor does it depend on, mindreading (Gallagher and Hutto 2007).
NPH seems like the most plausible candidate for a non-mindreading explanation of the more cognitively sophisticated social activities described above. Gallese endorses something like NPH when he says, “A possibility is that [ES] mechanisms might be crucial in the course of a long learning process required to become fully competent in how to use propositional attitudes, like during repetitive exposure of children to the narration of stories” (Gallese 2006, p. 20). Moreover, in a later article, Gallese cites Hutto and explicitly endorses NPH (2007, p. 667). Thus, ES theorists can, and do, legitimately endorse NPH as a non-mindreading explanation of sophisticated social cognition. I shall argue that if NPH is construed as completely independent from mindreading, as its originators argue, then it is not sufficient to explain the ontogeny of human social cognition. That is, I shall argue that NPH is an inadequate account of the source of sophisticated social cognition.
Most children’s worlds are populated with people acting for reasons. The question is how children come to acquire competency with folk psychology from this kind of environment. The critics of mindreading emphasize that embodied simulational capacities, such as intentional attunement, are responsible for our basic social interactions. These capacities allow us to directly perceive others’ intentions and emotions, but these do not involve representing others’ mental states. These embodied practices simply involve sensitivity to certain aspects of behavior. Intention understanding, recall, consists in anticipation of the next behavior in an action sequence.18 It is difficult to see how, from the basis of non-mentalistic embodied practices, children could learn folk psychology. Even if children are regularly exposed to narratives involving folk psychology, there is a substantial gap from intentional attunement to folk psychological competence.
In contrast, it is relatively easy to see how children could acquire folk psychological competence if children understand others’ behavior in virtue of developing mindreading abilities. If children start out with System 1 reasoning about belief-like states, rather than mere intentional attunement, then the development of System 2 reasoning is a smaller, more gradual step than the leap from detecting others’ sensitivity to worldly features to folk psychological competence.19 In fact, it is hard to see how children could acquire competency with folk psychology in such environments without some sort of mindreading abilities, that is, without the ability to attribute mental states to others.
Consider an analogy. Charles is thrust into a world in which everyone plays chess all the time. Charles, however, does not know anything about chess. He does not know that it is a game, the purpose of the game, the rules of the game, the strategies of the game, the game pieces, or anything at all about chess. Despite the fact that everyone around him plays chess, no one actually teaches Charles chess. What he learns about chess he learns from observing and interacting with those around him. How much can Charles learn about chess in this environment? Maybe he can learn the names of the game pieces (when players say, e.g., “I’ve got your rook”) and some of the typical moves of the game pieces, but it is unlikely that he would learn much about the rules and strategies of the game. These are not ostensively on display, so it would be more difficult to learn these without explicit guidance.
Of course, there are patterns in the way people play chess, and one could use these patterns to figure out the rules and strategies of chess. One might argue that all Charles has to do is detect these patterns and eventually with enough exposure to these chess-playing patterns, his pattern-completing, form-finding brain would pick up the rules and strategies of chess. Patterns are everywhere though, and some of these patterns signify the rules and strategies of chess and other patterns are meaningless. Charles is starting with no knowledge of chess, so even if he is predisposed to pay attention to chess-playing, he has no way of knowing whether a pattern is meaningful or meaningless with respect to the forms and norms, or rules and strategies, of chess.
Now, suppose we add to this story that Charles does not speak the language of the chess players; in fact, he does not speak any language at all yet. And in addition to learning chess, he is learning how to grab objects, to sit up, roll over, crawl, walk, speak, etc. Even if Charles were naturally very interested in chess-playing, it would be incredibly difficult for Charles to learn all about chess in this environment, not because he lacks information, but because there is so much information and no way to know what is important, why, and in what way.
There is much folk psychological information in the environment. There is also much non-folk psychological information in the environment. And these sources of information bear complex relations to each other. In such an informationally noisy environment, it is quite difficult to acquire folk psychological competence. It is difficult to see how, from non-mentalistic understanding, any child could learn folk psychology from folk psychological narratives.20
Moreover, if NPH were correct, that is, if children’s interactions were non-mentalistic and interacting with story-telling caregivers were how children come to acquire folk psychological competence, then it would seem miraculous that children across cultures, or even children within a culture, acquire folk psychological competence at roughly the same time21 because, as the empirical evidence shows, not all children are regularly exposed to story-telling caregivers.
I think there is a better, less miraculous account of children’s acquisition of folk psychological competence. All normally developing children are born with (or very quickly develop – the account does not depend on the truth of nativism) System 1 reasoning. This basic reasoning about belief-like states is the basis for non-human social cognition and is the stepping-stone to the more sophisticated System 2 reasoning about beliefs. Children are already engaging in System 1 rudimentary mindreading when they begin to be exposed to folk psychological narratives. And it is precisely because they are already engaging in System 1 reasoning that they can even start to understand folk psychological narratives.22 Thus, NPH has the story exactly backwards: it’s not the case that children understand how mental states cause behavior because they understand folk psychological narratives. Children understand folk psychological narratives because they understand how mental states cause behavior!23
This idea is independently well motivated. As I explained above, developmental psychology is full of experiments showing that from as early as 15 months, children understand others’ communicative intentions that are based on false perceptions, recognize others’ false perceptions, appreciate others’ visual perspectives, anticipate how others will act based on what the others see, attempt to alter others’ incongruent perspectives, etc. (Baillargeon et al. 2010; Csibra 2010; Gergely and Csibra 2003; Onishi and Baillargeon 2005; Southgate and Csibra 2009). The best explanation of these data is that infants automatically interpret behavior mentalistically, in the way characterized by System 1 reasoning about belief-like states.
From this basis, it is no mystery how children develop folk psychological competence. There is no huge leap to make. Infants develop an understanding of animacy very early on. They are able to understand emotions and basic intentional actions, and attribute behavior-guiding belief-like states. As their executive system matures, their attentional capacities increase and they are better able to manage more and various kinds of information. Their ability to understand communicative intentions improves, and they gradually acquire language, and eventually they come to appreciate folk psychological narratives. Children’s ability to understand behavior in terms of mental states improves with all of these developments, and their grasp of mental state concepts gradually develops. That is, throughout the first four to five years of life, System 2 gradually comes online, and it further develops and becomes more sophisticated throughout childhood. And all along this developmental trajectory children are interacting with caregivers and are as embodied and environmentally situated as ES theorists insist. This is a basic mindreading developmental story that I find very plausible. It does not deny the importance of embodiment or sociocultural narratives, nor does it mistakenly regard these as sufficient to explain human social cognition.
I have considered two main arguments that are meant to cast doubt on mindreading and highlight the need for ES: the sharp distinction argument and the parsimony argument. I argued that neither of these arguments casts doubt on mindreading accounts. In fact, examination of the parsimony argument reveals that mindreading plausibly is a necessary component of social cognition. I offered a philosophically and empirically plausible two-systems account of social cognition that better explains how children acquire social cognitive abilities. According to the two-systems account, young children have rudimentary mindreading abilities, which serve as the basis for the later development of more sophisticated mindreading abilities. This account is on sound philosophical ground, offers the best explanation of the developmental data, and is incompatible with the embodied simulation account.
Embodied simulation and mentalistic simulation are not mutually exclusive, e.g., in imaginatively simulating a target’s anger one’s brain may be also simulating the target’s neural states that are associated with anger.
It may be helpful to distinguish mere behavior reading—which Gallese argues against—from intentional attunement—which Gallese argues for. While both are non-metarepresentational according to ES, behavior reading is devoid of meaning for the observer, whereas intentional attunement is automatically imbued with meaning. With intentional attunement, the observer understands the target’s behavior—i.e., grasps the goal-directedness of the behavior and anticipates further behavior—from the inside. Perhaps one way to put it is that intention understanding additionally involves the understanding of what it is like to perform that act.
There is not enough space here to delve deeply into the reasons for my skepticism. Briefly, though, mirror neuron theorists seem to equivocate on the notion of intention. As I mentioned above, they explicitly employ a deflationary account of intention in explicating how mirror neurons function. But when drawing conclusions for social cognition, these theorists rely on a much richer notion of intention. By equivocating on intention, theorists conclude, for example, that mirror neurons solve the problem of other minds (Iacoboni 2009). My view is that if we adhere to the deflationary notion of intention, mirror neurons cause intentional understanding, but this has very limited implications for social cognition. If we adhere to the richer notion of intention, which plays a role in sophisticated social cognition, then mirror neurons are at best tenuously related to intentional understanding.
Shaun Gallagher, an EC proponent, makes a similar mistake when he argues that, “the idea that these capacities are precursors means that eventually and developmentally, they are not the capacities we employ in more sophisticated adult comprehension of others” (Gallagher, 2008, p. 166).
One might be skeptical about this distinction between solipsistic behavior-reading and phenomenologically rich intentional attunement. It probably begs the question against, e.g., Perner and Ruffman, who argue for a behavioral interpretation of these studies. I shall allow the distinction for now. Once I advance my own interpretation of these studies, the distinction between behavior-reading and intention understanding becomes moot.
My argument assumes that employing multiple behavioral rules is less parsimonious than attributing mental states. This is not a universally accepted assumption. In this issue of Review of Philosophy and Psychology, Low and Wang argue that such assumptions about parsimony are not so straightforward. Perner (2010) also offers a challenging response to the kind of parsimony argument I offer.
I am inclined to think that belief-like states are propositional attitudes, but my argument does not hinge on this claim.
There is a whole literature on the various theories of concepts that I wish to sidestep here. Whatever theory of concepts one accepts, my claim is that understanding belief-like states is less sophisticated than understanding beliefs.
For example, Lois Lane believes that Superman is hunky. Even though Superman is the same person as Clark Kent, Lois Lane does not believe that Clark Kent is hunky. Lois Lane’s belief that X is hunky is true under a certain description (X = the airborne caped superhero who wears his underwear on the outside of his pants) and false under other descriptions (X = the bespectacled reporter who works at the Daily Planet).
For example, suppose I have an apple-shaped candle. Upon first viewing it, you believe that it is an apple. After handling the object, you come to believe that it is really a candle that merely looks like an apple. When I show this object to Sally, what will she believe it is? Grasping the appearance/reality distinction requires understanding that Sally will believe that the object is an apple because the object looks to her like an apple despite the fact that it really is a candle. Mastery of the BELIEF concept involves appreciating that people can have false beliefs based on misleading appearances.
An appreciation of the types and degrees of belief is a feature of the folk psychological concept of BELIEF. Though the folk may not use philosophers’ terminology, such as occurrent and dispositional, they have no trouble understanding the gist of these ideas. Understanding the concept of BELIEF involves, among other things, appreciating that one can believe something that one is not consciously entertaining.
I want to make clear that my target here is the deflationary account of social cognition. In advancing this particular two-systems account I do not aim to refute other non-deflationary two-systems accounts. The account I advance may be more or less compatible with other two-systems accounts in the literature. My goal in advocating for this two-systems account is to offer an alternative to deflationary accounts of social cognition.
See Apperly and Butterfill (2009) for an excellent development of this sort of account. Several theorists offer two-systems accounts of social cognition that are, to varying degrees, compatible with this account. I do not intend to give a complete review of all two-systems accounts available, but I shall sketch a few prevalent alternatives and distinguish them from my preferred account. Nichols and Stich (2003) offer a two-systems account of social cognition that differs from the account I am defending in that their first system (the Desire and Plan System) does not allow for metarepresenting targets’ beliefs or perceptual states. On the view I defend, metarepresentation is important for explaining infants’ performance on mindreading tasks. Gopnik and Wellman (1994) describe the first stages of the development of social cognition in terms of a non-representational concept of BELIEF, which in later stages develops into a fully representational concept of BELIEF. I am not entirely comfortable with the idea of non-representational beliefs, but perhaps one way of understanding my preferred account is as an explication and clarification of this transition from “non-representational” to “representational” belief understanding. Leslie and colleagues (2004) offer a two-systems account of social cognition in terms of a Theory of Mind Mechanism (ToMM) and selection process (SP). On their account, infants innately possess a ToMM, a mechanism of special attention that deploys innate concepts such as BELIEF, DESIRE and PRETEND and predisposes the normally developing child to pay selective attention to mental states over other things. By default, ToMM attributes beliefs with contents that reflect reality. But effective reasoning about belief contents (including contents that do not reflect reality) depends on a process of selection by inhibition. This SP develops slowly through the preschool period and beyond.
Although the view I defend is largely compatible with Leslie’s ToMM-SP account, three differences are relevant. The view I defend is non-committal about the existence of a modular theory of mind mechanism, it is not committed to nativism about mental state concepts (in fact, it does not ascribe the BELIEF concept to infants at all), and it does not hold that prior to the onset of the later system infants attribute only reality-based mental states. It is possible for infants and young children to attribute reality-incongruent perceptions and reality-congruent belief-like states.
There remain open empirical questions about this hypothesis. For example, what is the relation between System 1 and System 2? Does System 2 develop from System 1, do they operate autonomously, or are they interrelated? These are questions that further research must aim to answer.
See Scott et al. (2010) for an argument against Apperly and Butterfill’s two-systems account. Their argument is that System 1 is allegedly sharply limited, but the experimental evidence suggests that infants’ abilities are not so sharply limited. However, Apperly and Butterfill’s account holds that System 1 is sharply limited in comparison to System 2. Pointing out the variety of reality-incongruent states and contextual information to which infants are sensitive does not establish that infants’ abilities are on par with adults’ abilities. Perhaps infants’ social cognitive abilities are not sharply limited in comparison to those of non-human animals. But for a variety of reasons and in a number of respects, infants’ social cognitive abilities are limited in comparison to adults’. I do not think that Scott et al.’s position needs to be cast as in competition with Apperly and Butterfill’s account. Properly understood, the accounts are compatible.
Behavior-reading hypotheses involve bodily orientation and behavioral rules (such as, if the dominant is oriented toward a food item, it will retrieve the food). They do not involve attributions of perceptual states and goals.
See footnote 2 above.
Though I find this a plausible story, this is not the only possible relation between System 1 and System 2. The two systems may operate autonomously.
The empirical data on how often children are exposed to narratives make this problem even more intractable. In America, for 9-month-olds, 32% are read to and 27% are told stories on a daily basis. For 2-year-olds, 45% are read to and 28% are told stories on a daily basis. For 4-year-olds, 39% are read to and 29% are told stories on a daily basis (Planty et al. 2009). Children often are not exposed to well-constructed exemplars of folk psychological narratives, a fact which makes learning folk psychology from mere exposure to folk psychological narratives seem even more improbable.
It is plausible that exposure to narratives influences System 2 reasoning, which is more flexible and subject to individual and cultural variation.
It should be noted that these arguments are specific to the ES-NPH combination. The particularly problematic element for the account under consideration is the insistence that intentional attunement is non-mentalistic. And though the embodied cognition and ES accounts bear interesting similarities, Hutto’s full account of social cognition differs in significant ways from ES’s account. Importantly, Hutto has a different notion of intention and intentional attitudes and different interpretations of studies from developmental psychology, and so his account may not face the problems described in this section.
I am grateful to Olle Blomberg, Josh Shepherd, Robert Thompson, and especially Larry Shapiro for reading many drafts of this paper and providing excellent feedback.