1 Introduction: The Puzzle of Pretend Play

Around the age of two children start to engage in a peculiar type of entertaining games; they organize tea-parties with stuffed animals as visitors or pretend to bathe a doll playing mother and father. These are instances of ‘pretend play’, a kind of acting as if something is the case while correctly perceiving it is not: While the child may be acting as if he is holding a wet baby – by for instance pretending to dry it – he recognizes that he is actually holding a dry doll.Footnote 1

Pretend play has sparked the interest of cognitive scientists because of its potential importance in children’s cognitive development. Pretend play has, for instance, been linked to emotional development (e.g. Jent et al. 2011), language development (e.g. Orr and Geva 2015) and the development of cognitive flexibility and creativity (e.g. Russ 2014).Footnote 2 The emergence of ‘cooperative pretend play’ (i.e. pretend play with playmates) at the age of two introduces, apart from much fun, an interesting conundrum for philosophers and cognitive scientists interested in so-called ‘Theory of Mind’ development. (Henceforth I use the term ‘pretend play’ to refer to cooperative pretend play and exclude other forms of pretend play.) In psychology, children are said to have acquired a Theory of Mind, or what I will refer to as ‘mindreading abilities’,Footnote 3 when they obtain the capacity to ascribe mental states to other people (e.g. “John believes there is beer in the fridge”) and use these ascriptions to explain and predict their behaviour (e.g. “John will go to the fridge”). It is commonly assumed that mindreading requires so-called ‘metarepresentational’ abilities, or the ability to represent someone’s belief-state. In line with a long tradition,Footnote 4 I use the term ‘representations’ roughly for things that (purport to) stand for, refer to or denote something else. This includes verbal representations (e.g. the proper name “Trump” refers to the president of the U.S. and the sentence “Trump is the president of the U.S.” refers to the fact or proposition that Trump is the president of the U.S.), pictorial representations (e.g. the Mona Lisa refers to Lisa Del Giocondo) or mental representations (e.g. my Eiffel tower-concept refers to the tower on the Champ de Mars in Paris). According to Perner’s (1991) widely adopted definition, a ‘metarepresentation’ is a “representation of a representation as a representation” (p.35). In other words, a metarepresentation is a representation that somehow makes explicit that it is representing a representation. Someone who mentally represents another’s mental representation as a mental representation, or in other words someone who holds a belief about someone else’s belief-state, is thus metarepresenting.

In the Theory of Mind debate it is often assumed, following Perner, that children typically acquire metarepresentational abilities at the age of four, when they pass so-called ‘explicit false belief tests’.Footnote 5 In the standard formulation of this test – the ‘Sally-Anne test’ (Baron-Cohen et al. 1985) – the examined child watches a puppet, Sally, placing a marble in a basket and leaving the room. Next, the marble is transferred by another puppet, Anne, to a box in the same room. Upon Sally’s return to the room the child is asked where Sally will look for the marble. If the child predicts that Sally will look in the basket – because she was out of the room when her marble was moved – he passes the false belief test. The idea is that in order to predict the behaviour of Sally – who holds a belief that the child itself recognizes as false – the child will need to represent Sally’s false belief-state (or metarepresent) and thus exhibits mindreading abilities.

Interestingly, when we consider the phenomenon of pretend play, we find that this activity typically requires children to anticipate behaviour of playmates that is also based on false (pretend-)beliefs (e.g. “The baby is wet”). Puzzlingly, whereas the ability to pass explicit false belief tests emerges at the age of four, pretend play emerges at the much earlier age of two (Harris 2000; Howes and Matheson 1992). This raises the question of what mental capacities are required for pretend play and mindreading and whether there is a developmental link between these abilities.

In this paper I develop an alternative viewpoint on the debate that has been sparked by the puzzle of pretend play.Footnote 6 I draw on a recent insight from Derek Matravers’ 2014 book Fiction and Narrative, which comes from a domain seemingly unrelated to developmental psychology: the philosophy of fiction. After a brief exposition of Matravers’ argument in section 2, I introduce the second theoretical element of my account in section 3: the account of social cognition development from embodied cognition theorists Daniel D. Hutto and Shaun Gallagher. In section 4 I combine these two elements to develop a hybrid account of the role of pretend play in mindreading development. The main contribution of my analysis is an account of pretend play as ‘interaction with fictions’ with a representational element. I will dub my alternative the ‘MEC-account’ after its two main theoretical components: Matravers’ theory and Embodied Cognition theory. Lastly, in section 5, I compare my account to the established theories in the debate: Alan M. Leslie’s metarepresentational account of pretend play, Josef Perner’s secondary representational account and Paul L. Harris’s simulation account. I will argue that the MEC-account offers a better explanation of empirical data on the development of children’s pretend play and mindreading abilities.

2 Matravers: Confrontation and Representation

In this section I introduce the first theoretical element of my account: Matravers’ theory of fiction. Matravers (2014) has recently argued against, and offered an alternative to, what he dubs the ‘consensus view’ within the philosophy of fiction literature that links fiction to the cognitive attitude of ‘imagination’. He argues that the characterizations of imagination on offer do not apply to fictional representations only. For instance, ‘simulationism’Footnote 7 – the most developed account of imagination according to Matravers – defines imagination as ‘running mental states offline’, which indicates an absence of direct perceptual inputs and of a motivation to act. So, when reading The Hobbit I use my imagination because I have no direct perceptual inputs of Bilbo and no incentives to act upon the described events. However, when I read about Donald Trump in The New York Times, I also run mental states offline – with no direct perceptual inputs of Trump or a direct motivation to act – while engaging with non-fiction. Definitions of imagination other than the simulationist definition (e.g. Meskin and Weinberg 2006) run into similar problems. Therefore, on the consensus view we imagine many more propositions than those that we intuitively think of as fictional, and hence no special link between fiction and imagination is established.

Matravers argues that the fundamental flaw in the consensus view is its confusion of the distinction between engaging with fiction and engaging with non-fiction, with the more fundamental and cognitively primary distinction between engaging with “confrontation situations” (Matravers 2014, p.45) and engaging with “representation situations” (p.45). In confrontation situations people have the direct possibility to act because their mental states are caused by perceptual inputs from objects in their immediate surroundings (e.g. a situation in which a tiger enters your house and you have the possibility to run and shout for help). In representation situations, by contrast, people have no direct possibility to act because their mental states are caused by mere representations of objects (e.g. a situation in which someone tells you about a tiger that entered your house yesterday). Matravers focuses on verbal representations of events, or ‘narratives’, so that he usually refers to the distinction between engaging with a confrontation situation and engaging with a narrative.

A consequence of drawing the primary cognitive distinction between confrontation and representation is that our engagement with narratives is neutral regarding fictionality; what goes on in the head of the reader of a fictional story does not differ from what goes on in the head of the reader of a non-fictional story. Both types of stories are essentially examples of narratives about objects towards which we have no direct possibility to act.

Matravers does not discard the distinction between fiction and non-fiction entirely. Although the distinction is not essential for understanding the cognitive basis of our engagement with narratives, it is still relevant for understanding the result of our engagement with different types of narratives. Matravers proposes a “‘two stage’ model” (p.90) according to which in the first stage of interpreting a narrative we build “a ‘mental model’ of the content of that narrative” (p.3), which involves the same cognitive processes whether it concerns a fictional or non-fictional narrative. In the second stage the distinction between what we consider to be fiction and non-fiction becomes apparent depending on whether there are presuppositions to believe the propositions that make up the content of the narrative we engage with – as in the case of non-fiction – or not – as in the case of fiction. So while reading The Hobbit may involve the same cognitive processes as reading a vivid biography, there are different presuppositions towards believing the contents of these different narratives. Readers can be guided by convention and contextual clues to judge whether a “cautious scepticism” (p.95) towards the propositions of a possibly fictional narrative is appropriate. Hence, we can preserve the distinction between fiction and non-fiction without postulating a different cognitive attitude for our engagement with fiction; we represent the content of a fictional or non-fictional narrative in a mental model and consequently do or do not adopt its propositions as beliefs depending on the context.

3 Embodied Cognition Theory

The second theoretical element of my account is the theory of social cognition development from embodied cognition theorists Hutto and Gallagher. Embodied cognitive science is a family of research programs that challenge traditional theories of mind by arguing that cognition arises in bodily interactions with the environment rather than merely in the brain. Gallagher and Hutto’s account of social cognition is likewise best understood as a reaction to traditional accounts of mindreading.Footnote 8 According to Gallagher and Hutto these accounts offer a contrived ‘observational’ picture of social interaction: If Lucy walks towards a loose board with hammer and nail we do not, as traditional accounts presuppose, represent her beliefs and desires in order to predict that she will hammer the nail. Rather, in such ordinary social interaction, we directly perceive Lucy’s intention to hammer the nail.Footnote 9 The distinction between these two approaches to other people coincides with what embodied cognition theorists call taking a second-personal stance or taking a third-personal stance – what I will refer to as respectively an interactive stance or a spectator stance. An interactive stance involves engaged interaction with others (e.g. directly perceiving, reacting to or enquiring after Lucy’s intentions) whereas a spectator stance involves an observational perspective of others (e.g. independently forming the attitude ascription “Lucy wants to hammer the nail” and reacting to this mental representation).

Gallagher and Hutto (2008) offer an account of three different social-cognitive competencies acquired throughout early childhood that typically involve an interactive stance. The first two competencies are based on Gallagher’s (2004) Interaction theory, which emphasizes embodied capacities for social interaction (section 3.1). The third competency is based on Hutto’s (2008) theory of folk psychological narrative competence (section 3.2).

3.1 Social Embodied Cognition

Inspired by Trevarthen’s (1979) notion of “primary intersubjectivity”, Gallagher’s Interaction theory holds that newborns already exhibit the capacity to directly perceive and react to basic intentions in another’s behaviour. They show an awareness of the other’s intentionality, or person-person awareness, as is suggested by for instance neonatal imitation (e.g. Meltzoff and Moore 1977). The second embodied social-cognitive skill that infants develop, at the age of approximately nine months, is based on “secondary intersubjectivity” (Trevarthen 1979). At this age infants develop the capacity for joint attention, or person-person-object awareness. Children will for instance be able to directly perceive that “Mother wants food” or “Mother is attending to the door”. Secondary intersubjectivity is evident from, for instance, interaction in which children make orientations to and handle objects that others have used, imitating their actions (Philips et al. 1992).Footnote 10

An interesting example of an early infant capacity – in which, according to Hutto and Gallagher’s account, children exhibit social embodied cognition – is the ability to pass nonverbal false belief tasks. Prompted by the criticism that something apart from mindreading ability is making explicit false belief tests difficult (e.g. Robinson and Mitchell 1995; Chandler et al. 1989; Surian and Leslie 1999), developmental psychologists have conducted false belief tests that are nonverbal, so that a lack of linguistic skills does not hinder children’s performance (e.g. Southgate et al. 2007; Surian et al. 2007).Footnote 11 For example, Onishi and Baillargeon (2005) conducted a nonverbal false belief task in which 15-month-old children watched actors perform a standard false belief scenario – in which someone holds a false belief concerning the location of a toy – and then watched this person reach for the toy in the correct or incorrect box. The children showed surprise (as exhibited by reliably longer looking times) when the location the actor searched was inconsistent with his false belief about the toy’s location. This suggests that children have false belief understanding long before they pass explicit false belief tasks at the age of four. In Hutto and Gallagher’s account this is because nonverbal false belief tasks involve an interactive setting in which children are tested on the ability to directly perceive the actor’s intention to look for the toy in a particular box. The emergence of the ability to pass nonverbal false belief tasks (around 13–15 months of age) likewise roughly coincides with the emergence of secondary intersubjectivity (around 9–15 months of age).

3.2 Narrative Competence

By introducing the notions of primary and secondary intersubjectivity, Gallagher and Hutto offer an account of two embodied social-cognitive competencies that allow us to directly perceive and react to another person’s intentions in an interactive stance. These capacities, being developmentally primary and therefore cognitively relatively ‘cheap’, continue to play an important role throughout adulthood, and are sufficient to explain the majority of social interaction. However, Gallagher and Hutto admit that social embodied cognition is “not sufficient to address what are clearly new developments around the ages 2, 3 and 4 years” (Gallagher and Hutto 2008, p.25) when children learn to explicitly refer to beliefs and desires. This is where Hutto’s theory of folk psychological narrative practice and his concept of folk psychological “narrative competence” (p.18) – the third social-cognitive competency acquired in early social-cognitive development – comes into play.

When patterns of interaction based on primary and secondary intersubjectivity are frustrated and behaviour surprises us, we resort to so-called ‘behaviour explanations’. Adequately explaining behaviour requires more than merely citing an agent’s beliefs and desires. If I were for instance asked why I ate an acorn and replied: “Because I believed it to be an acorn and I desired to eat an acorn”, I would not have produced the required explanation. Rather, “[t]o understand intentional action requires contextualizing [desires and beliefs], both in terms of cultural norms and the peculiarities of a particular person’s history or values” (p.27). An example of a proper behaviour explanation would be “Because eating acorns is a tradition in my hometown”, since it contextualizes my desires and beliefs. Such a behaviour explanation is essentially only a small part of a richer ‘story’ about my character and personal history that reveals my reasons for action. These full stories about people that act for reasons are dubbed “folk psychological narratives” (Hutto 2008). Folk psychological narrative competence (henceforth ‘narrative competence’) consists in the ability to produce and digest such folk psychological narratives. This ability manifests itself mainly in producing and interpreting behaviour explanations because shared background assumptions usually render a complete recapitulation of someone’s psychology and history unnecessary. We only highlight parts of folk psychological narratives “on a need-to-know basis” (p.36).

Children that pass explicit false belief tests (in the familiar Sally-Anne formulation) also exhibit (limited) narrative competence. In such tests children are asked to offer behavioural predictions prompted by the question “Where will Sally look for her marble?” which requires them to construct (part of) a primitive folk psychological narrative: “Sally will look in the basket because she put it there, did not see it was moved, and therefore falsely believes it is in the basket”. This is the reason why the ability to pass explicit false belief tests emerges later than the ability to pass nonverbal false belief tasks; acquiring the ability to pass explicit false belief tests is not a matter of acquiring an isolated capacity for false belief understanding, but rather a matter of developing the ability to produce and digest folk psychological narratives in which false belief understanding plays a part.

Similarly to the first two social-cognitive skills introduced by Hutto and Gallagher (i.e. primary and secondary intersubjectivity), whether someone possesses narrative competence is mostly apparent in whether he has the capacity to produce and digest folk psychological narratives – or behaviour explanations – from an interactive stance, rather than a spectator stance. Contrary to what traditional theories of mind assume, independently theorizing about behaviour explanations from a spectator stance (e.g. “John donated money because he wants to help fight cancer”) is “at best a peripheral […] use” (Hutto 2007, p.47) of our folk psychological abilities. Rather, behaviour explanations are commonly prompted by second-personal questions from an interactive stance (e.g. “Why did you donate money?”), and involve self-reports (e.g. “I donated money because my mother was watching”).

3.3 Narrative Practice Hypothesis

In Hutto and Gallagher’s framework, our mindreading abilities consist in narrative competence – the ability to produce and digest folk psychological narratives – rather than an isolated theoretical ability to represent mental states of others in order to predict and explain their behaviour. However, this competence does “rest on a quiet understanding of the way propositional attitudes [such as belief and desire] interrelate” (Hutto 2008, p. 27). Hutto and Gallagher thus shift the focus of the Theory of Mind debate from explaining how we acquire a theoretical understanding of concepts such as belief towards explaining how we develop the implicit practical skill or “skillful know-how” (p.32) to apply these concepts consistently in behaviour explanations. Hutto’s Narrative Practice Hypothesis is intended to address this latter project and entails that “[i]t is through scaffolded encounters with [folk psychological narratives] that children learn how the core propositional attitudes behave with respect to one another and other standard mental partners” (p.xii). Once children are able to represent propositional attitude reports (e.g. “Sally believes that her marble is in the basket”) and make relevant attributions to others, they can engage with uncomplicated elementary folk psychological narratives such as the one involved in the Sally-Anne test. According to Hutto primitive stories of this kind are common in children’s upbringing. An example is the tale of Little Red Riding Hood in which the protagonist “learns […] her grandmother is sick[,] wants to make her grandmother feel better [but] falsely believes that the wolf is her grandmother” (p.30). Such narratives function as exemplars in that they exemplify the relations between propositional attitudes and enable the child to learn how things such as beliefs, character traits and personal history, influence actions. The normal route to acquire full-fledged narrative competence (i.e. the ability to engage in all types of folk psychological narratives) is by starting out with engaging with primitive folk psychological narratives (e.g. Little Red Riding Hood) and gradually engaging with increasingly complex folk psychological narratives (e.g. Shakespeare’s plays).Footnote 12

In conclusion, combining Gallagher’s Interaction theory and Hutto’s theory of folk psychological narrative practice gives us an account of mindreading development based on three social-cognitive competencies acquired throughout early childhood that typically involve an interactive stance: 1) primary intersubjectivity, 2) secondary intersubjectivity and 3) narrative competence. Children’s mindreading abilities gradually develop from only being able to engage in social embodied cognition (i.e. primary and secondary intersubjectivity) – manifested in the ability to directly perceive and react to another’s intentions – towards cultivating narrative competence – manifested in the ability to produce and digest increasingly sophisticated folk psychological narratives – where these cognitive capacities are all available in fully developed adults.

4 The MEC-Account

Now that its two main theoretical elements have been discussed, in this section I present the basics of my account and discuss the novel viewpoint that it adds to the debate on the developmental link between pretend play and mindreading abilities, namely an analysis of pretend play as interaction with fictions. In section 5 I compare my account to the three established theories by Leslie, Perner and Harris.

4.1 Combining Matravers and Embodied Cognition Theory

In section 2 we saw how Matravers argued that there is no special link between the cognitive attitude of imagination and our engagement with fiction. Rather, the primary cognitive distinction is between engaging with a confrontation situation – in which there is a direct possibility to act – and engaging with a narrative – in which there is no direct possibility to act. The key starting point in developing my account is to relate Matravers’ distinction to Gallagher and Hutto’s distinction between social embodied cognition – which enables us to directly perceive and react to another’s intentions – and narrative competence – which enables us to produce and digest folk psychological narratives. In fact, social embodied cognition enables us to engage with a particular type of confrontation situation involving other people, and narrative competence enables us to engage with a particular type of narrative concerning people that act for reasons.

I use this insight to combine Matravers’ theory of fiction with Gallagher and Hutto’s account of social cognition development. According to Matravers engaging with fictional narratives is cognitively identical to engaging with non-fictional narratives. Since folk psychological narratives are just a particular type of narrative, engaging with fictional folk psychological narratives – henceforth ‘folk psychological fiction’ – will be cognitively identical to engaging with non-fictional folk psychological narratives – henceforth ‘folk psychological non-fiction’.Footnote 13 Hence, narrative competence is neutral regarding fictionality.Footnote 14 In a similar fashion, we can extend Hutto and Gallagher’s account (i.e. narrative competence is scaffolded in social embodied cognition) to an account of narrative competence as also being scaffolded in social embodied cognition with fictions. In other words, if we assume that the cognitive processes involved in narrative competence are neutral regarding fictionality and that narrative competence lies on a developmental continuum with social embodied cognition, the cognitive processes involved in social embodied cognition may also be neutral regarding fictionality.

Note that narrative interpretation in Matravers’ theory involves building a mental model of the content of the narrative. This creates an interesting tension with the strict embodied cognition approach in which such mental talk is highly problematic. I adopt the embodied cognitivist developmental continuum of social-cognitive competencies, but also incorporate some mental talk in order to maintain Matravers’ distinction between fiction and non-fiction interpretation (i.e. whether we accept the content of the narrative or not) and thereby commit to a hybrid account.

4.2 Pretend Play as Interaction with Fictions

In this framework the development of our engagement with fiction essentially lies on the same developmental continuum as that of engaging in social embodied cognition with non-fictions and non-fictional folk psychological narratives. Children start out by being merely able to engage in social embodied cognition – manifested in what I will refer to as ‘interaction with fictions and non-fictions’ – and gradually cultivate narrative competence, which enables them to also produce and digest sophisticated folk psychological fiction and non-fiction – in which children are able to verbally explain reasons for actions of other (possibly fictional) people.

This raises the question of what ‘interaction with fictions’ precisely amounts to. The main contribution of this paper is the idea that pretend play is (a type of) interaction in which the playmates, environment and props represent fictional entities (henceforth ‘interaction with fictions’). I argue that it thus involves, first and foremost, social embodied cognition, but also involves an additional representational ability.

First, pretend play involves social embodied cognition. As has been discussed, examples of interaction with real things and people (with non-fictions) are activities that involve social embodied cognition in which children directly perceive someone’s intentions (e.g. nonverbal false belief tasks in which a child directly perceives and reacts to someone’s intention to look for the marble in the basket). Pretend play – in which children cooperatively act as if something is the case while correctly perceiving it is not – also involves social embodied cognition. A child will for instance directly perceive and react to his playmate’s intention to pretend-bathe the doll. Pretend play is thus analyzed as interaction with fictions involving, at its roots, the same embodied social-cognitive capacities as interaction with non-fictions.

Second, pretend play also necessarily requires an additional representational ability,Footnote 15 whereas this is possible but not necessary for interaction with non-fictions. I argue that in pretend play certain real objects, persons or locations that children interact with have to symbolize or represent some other fictional objects, persons or locations in the pretend-scenarios. This representational element is necessary for the interaction to count as interaction with fictions; if no object, person or location involved represents something else, whatever happens within the pretence of the game of make-believe will automatically be what actually happens and hence non-fictional. This representational element is already apparent in very early primitive pretend play, in which a child may pretend to go to sleep using a piece of cloth, where the cloth represents a pillow. It is also obvious in sophisticated games of pretend play, in which children may pretend to be pirates, where the couch represents a ship and the floor one of the seven seas. In sum, pretend play is analysed as interaction with an additional representational element, or ‘representational interaction with fictions’, because it involves social embodied cognition but also the representation of fictional entities.

A child engaging in pretend play must thus, apart from engaging in social embodied cognition, simultaneously be able to attach symbolic or representational meaning to objects, persons or locations. In this sense we may identify in pretend play a very early developmental landmark in the acquisition of folk psychological narrative competence and narrative competence in a more general sense – (folk psychological) narratives being essentially verbal representations of events. Some developmental cognition studies that link pretend play to language development offer independent evidence in support of this idea. For instance, the first signs of solitary pretend play – around 11–12 months of age (Fein 1981; McCune 1995, 2010) – typically co-occur with the first production of single-word utterances – around 12 months of age (Huttenlocher et al. 2010; Osório et al. 2012).

Matravers argues that what distinguishes a fictional narrative from a non-fictional narrative is whether there is a presupposition to believe the content of the narrative as a result of our primary engagement with it. In line with Matravers’ theory, my account entails that what distinguishes interaction with fictions from interaction with non-fictions is whether there is a presupposition to believe propositions that express what happens. Importantly, in order to maintain this distinction I assume that children keep track of or register what happens in their interaction with fictions or non-fictions; in order to have a presupposition to believe what happens (or not) children require access to propositional content concerning what happens.Footnote 16 Pretend play is interaction with ‘fictions’ because the pretend-scenario, or what happens within the pretence of the game, is not real. So while the same type of embodied social-cognitive processes is involved when children pretend to bathe a teddy bear (e.g. by asking each other to turn on the pretend-tap and handing each other pretend-soap) as when children actually bathe a teddy bear, there are different presuppositions to believe propositions such as that Teddy is wet.

Note that the non-fictional parallel of pretend play is not simply interaction with non-fictions but ‘representational interaction with non-fictions’, or ‘re-enactment’. Re-enactment, like pretend play, involves social embodied cognition with an additional representational ability but, unlike pretend play, involves non-fictional representations. Children reporting on a goal in a football match, for instance by passing an imaginary ball to one another, are engaged in interaction with a representational element. However, the re-enactment is a reference to a particular event that actually occurred rather than a fictional one and hence there will be a presupposition to believe the propositions that express what happens in the re-enacted scenario.

To sum up, Table 1 lists the different types of interaction discussed and the social-cognitive abilities involved in them.

Table 1 Types of interaction

4.3 Folk Psychological Fiction

In Gallagher and Hutto’s account of social cognition development, the development of mindreading abilities is understood as the cultivation of a particular skill: narrative competence, or the ability to produce and digest folk psychological narratives. I analyse this skill as the capacity to produce and digest both fictional and non-fictional folk psychological narratives. Hutto’s Narrative Practice Hypothesis thus applies equally well to folk psychological fiction: through encounters with both folk psychological fiction and non-fiction, we gradually develop an implicit understanding of how propositional attitudes interrelate so that we can produce and digest sophisticated folk psychological fiction and non-fiction.

In this way, my account offers an interpretation of a recent trend in literary theory in which reading literary fiction is linked to mindreading development (e.g. Kidd and Castano 2013; Panero et al. 2016; Vermeule 2010; Zunshine 2006). I suggest considering Hutto’s Narrative Practice Hypothesis as a potential explanation for the empirical findings that imply that reading literary fiction improves mindreading abilities. There is no difference in our primary engagement with fictional or nonfictional narratives. Hence, it is not the fictionality per se but the folk psychological element of these narratives that triggers the development of mindreading abilities.Footnote 17 Folk psychological non-fiction – factual stories about people who act for reasons (e.g. a biography) – will also support this development, whereas non-folk psychological fiction – fiction that is not about people who act for reasons (e.g. a description of events taking place on a fictional planet devoid of any people or cognitively human-like creatures) – will not. In fact, studies that compare the effects on mindreading development of reading literary fiction with those of reading non-fiction (e.g. Kidd and Castano 2013; Panero et al. 2016) exclusively make use of non-fictional texts that “report facts about natural and historical topics and do not include biographical narratives about people” (Panero et al. 2016, p.48). I predict there will be no difference in the effect on mindreading development between reading fictional and non-fictional narratives that are about people (i.e. that are folk psychological).

Narrative competence does not necessarily involve metarepresentations but does presuppose the ability to explicitly metarepresent if required. A behaviour explanation may include any element of a narrative – concerning character traits, personal history or mental states – depending on its explanatory value for the addressee. Hence, it is perfectly acceptable to answer the question “Why did you not sign up for our school trip to Berlin?” by referring to and thereby representing your mental state (e.g. “Because I think Mary will go”) or to your personal history which involves no metarepresentations (e.g. “Because I go to Berlin every Christmas”). However, both types of behaviour explanation do imply a coherent underlying folk psychological narrative (e.g. “I think Mary will go and I do not like Mary” or “I go to Berlin every Christmas and do not want to visit a familiar city”) and thus require an implicit metarepresentational understanding of how the propositional attitudes involved interrelate. In other words, narrative competence involves the capacity to explicitly and correctly refer to the relevant desires and beliefs (i.e. to metarepresent) to supplement behaviour explanations if necessary.

In short, the developmental story offered by my account, henceforth the MEC-account, entails that children gradually develop from merely being able to engage in social embodied cognition – manifested in interaction with non-fictions and pretend play – towards cultivating narrative competence – through engagement with folk psychological fiction and folk psychological non-fiction – that enables them to also produce and digest more sophisticated folk psychological fiction and non-fiction. The development of this latter ability consists in the gradual acquisition of an implicit metarepresentational understanding of how propositional attitudes interrelate.

5 Comparison with Three Competing Accounts

In the previous sections I have presented the basics of my account. In this section I compare the MEC-account to Leslie’s (section 5.1), Perner’s (section 5.2) and Harris’ (section 5.3) theories by assessing how well they explain empirical data on the development of pretend play and mindreading abilities.

5.1 Leslie: Pretend Play and Explicit False Belief Tests

A central desideratum for any theory on the role of pretend play in mindreading development is to solve the puzzle of pretend play, or in other words, to explain the relatively late emergence of the ability to pass explicit false belief tests (around the age of four) compared to the emergence of pretend play (around the age of two). I argue that the MEC-account, unlike Leslie’s theory, adequately explains this developmental gap.

Leslie’s theory on the nature of pretend play (Leslie 1987, 1994; Friedman and Leslie 2007) is that the same metarepresentational mechanisms that underlie mindreading abilities are also involved in pretend play. When a child pretends that for instance the banana his mother is holding is a telephone, he maintains contradictory double knowledge (“That in mother’s hand is a banana” and “That in mother’s hand is a telephone”) which according to Leslie leads to so-called ‘representational abuse’ in that the child cannot use both representations to represent reality. Therefore the pretend-representation must be somehow quarantined, which children do by producing representations such as “Mother pretends that the banana is a telephone”. The child hereby mentally represents his mother as mentally representing the banana in a particular way and is thus metarepresenting, just like children do when passing explicit false belief tests of the familiar Sally-Anne type. Hence Leslie’s theory cannot, without additional assumptions or qualifications, explain the early emergence of pretend play relative to the ability to pass explicit false belief tests; if both activities require the same cognitive capacity, it seems that once children acquire metarepresentational abilities to engage in pretend play at the age of two, they should also be able to pass explicit false belief tests.

Leslie and others have proposed several hypotheses that purport to explain this developmental gap: although children younger than four have metarepresentational abilities, they lack other cognitive functions – such as executive resources (e.g. Baillargeon et al. 2010; Rubio-Fernández and Geurts 2016) or the ability to make the relevant pragmatic inferences (e.g. Helming et al. 2016) – that are required to pass explicit false belief tests but only mature around the age of four. In a similar way, Leslie argues that children under four have an underdeveloped ‘selection processor’ (i.e. a cognitive mechanism that selects the appropriate mental state that is to be represented in explicit false belief tasks) (Leslie and Thaiss 1992; Leslie and Roth 1993). However, there is no direct evidence for this additional cognitive mechanism and, as Doherty (1999) argues, it renders the original mechanism that computes the metarepresentations irrelevant, reducing the selection processor to a problematic extra bit of theoretical machinery.

Whatever the merits of these appeals to independent cognitive mechanisms, in the MEC-account a solution to the puzzle of pretend play comes for free. Pretend play is understood as involving a type of social embodied cognition. Its emergence is therefore placed relatively early on the developmental continuum. On the other hand, passing explicit false belief tests is, in line with Gallagher and Hutto’s account, analysed as requiring the child to construct a primitive folk psychological narrative. Hence, passing explicit false belief tests requires further developed narrative competence than pretend play and therefore this capacity emerges later.

A possible objection to this analysis of pretend play is that if we understand pretend play as involving social embodied cognition, its late emergence relative to interaction with non-fictions is left unexplained. If there really is no special cognitive attitude involved in pretend play, why then do children develop the capacity to engage in pretend play – around the age of two – later than the capacity to engage in interaction with non-fictions – which is fully developed in the form of primary and secondary intersubjectivity when children are 15 months of age (Trevarthen 1979)?

The MEC-account explains this additional developmental gap by analysing pretend play as requiring the development of an additional representational ability. As has been discussed in section 4.2, in this sense we may identify an early developmental landmark in the acquisition of narrative competence in pretend play. Pretend play hence involves more demanding cognitive processes than interaction with non-fictions and emerges later in early social-cognitive development.

5.2 Perner: Embellished False Belief Tests

As will be shown, the theories of both Perner and Harris share with the MEC-account the advantage over Leslie’s theory that they straightforwardly solve the puzzle of pretend play. However, as I will argue in this section, unlike the MEC-account, Perner’s theory cannot explain evidence for a further development of mindreading abilities after children pass explicit false belief tests.

Perner (1991) has proposed an account of the nature of pretend play that avoids the main problem with Leslie’s account. In Perner’s theory, although mindreading and hence passing explicit false belief tests (in the Sally-Anne formulation) requires metarepresentational abilities, engaging in pretend play does not. Perner recognizes Leslie’s concern that pretend play leads to representational abuse, but states that “although the need for quarantine is clear, it is not clear why quarantining requires metarepresentation” (p.61). Children may also avoid representational abuse by adopting a different representational attitude towards the pretend-content: whereas the child believes that his mother is holding a banana, he may suppose it is a telephone to make sense of his mother’s behaviour. Perner calls the latter a “secondary representation” (p.7). Children proceed through a necessary sequence of three developmental levels: First, children are only able to form ‘primary representations’, using a single mental model to represent reality. Second, around the age of two, children learn how to form ‘secondary representations’ – in which they represent hypothetical situations by constructing multiple models – and pretend play emerges. Third, around the age of four, children learn to metarepresent by making models of models and pass explicit false belief tests. Hence pretend play emerges earlier in children’s development than the ability to pass explicit false belief tests because the former relies on a cognitively primary mechanism. However, there remains a functional link between the two phenomena in that the ability to construct multiple models is a prerequisite for constructing a model of a model.

In Perner’s theory our metarepresentational ability (i.e. constructing models of models) makes up our fully developed mindreading ability. Perner’s theory therefore cannot explain evidence for any further development of mindreading abilities after the age of four.Footnote 18 However, as Apperly (2011) argues, our mindreading abilities continue to develop throughout later childhood (after children pass explicit false belief tests) and even adulthood. For instance, our ability to use the concept of belief in more complicated, or ‘embellished’, false belief tests is relatively late developing. Four year-olds that pass explicit false belief tests of the Sally-Anne type have difficulties with versions of the false belief test in which the hidden object is undesirable and the protagonist wishes to avoid finding it (Cassidy 1998; Leslie and Polizzi 1998; Friedman and Leslie 2004; Leslie et al. 2005). It seems that the ability to co-ordinate an understanding of false belief with a mental state other than desire is relatively late developing. Another example of an embellished false belief test is the so-called ‘Oedipus problem’ in which a puppet is presented as able to see a die in a box but not able to touch it so that it has not experienced that it is also an eraser. Four year-olds that pass explicit false belief tests will nonetheless when asked affirm that the puppet knows that there is an eraser in the box (Apperly and Robinson 1998, 2001, 2003). This suggests that the ability to understand that mental states such as belief apply to particular descriptions is also relatively late developing.

A possible reply for Perner would be to state that four year-olds’ understanding of belief is in fact qualitatively adult-like, but that children lack the finesse in applying it in all contexts. However, I agree with Apperly (2011) that this is implausible because embellished false belief tests show that four year-olds fail to understand core features of the concept of belief such as its interaction with different mental states and the fact that it applies to particular descriptions. It is difficult to see how an understanding of belief that lacks these crucial elements is qualitatively adult-like.

The MEC-account explains evidence of a further development of mindreading abilities after children pass explicit false belief tests by analysing this achievement as merely a developmental step in the acquisition of narrative competence. Explicit false belief tests require children to construct relatively primitive folk psychological narratives and thereby only test an understanding of false belief in a particular experimental context. Passing explicit false belief tests thus does not ensure that the child understands how to apply the concept of belief in other contexts, nor that the child has a fully developed implicit understanding of its interplay with other propositional attitudes. This is because developing mindreading abilities is understood as cultivating a skill rather than as acquiring some ‘final’ understanding of the concept of belief. In Hutto’s account (and hence in the MEC-account) “nuanced folk psychological skills only develop securely after ages four and five” (Hutto 2008, p.26) through multiple encounters with folk psychological narratives. After the age of four children gradually acquire more folk psychological narrative skills and learn how to apply the concept of belief in other contexts (e.g. when the belief concerns an undesirable object or when the belief applies to a particular description). In this way the MEC-account can account for the evidence for a further development of mindreading abilities after children pass explicit false belief tests.

5.3 Harris: Nonverbal False Belief Tasks

Contrary to Perner, Harris does not treat the acquisition of pretend play and mindreading abilities as involving distinctive developmental steps. Hence, Harris’ theory is similar to the MEC-account in that both can account for the evidence for a gradual development of mindreading abilities discussed in the previous section. However, some developmental processes remain unaccounted for in Harris’ view. In this section I argue that Harris’ theory, unlike the MEC-account, cannot explain the fact that children develop the capacity to pass nonverbal false belief tasks much earlier than the ability to pass explicit false belief tests.

Harris (1991, 2000) argues that neither mindreading nor pretend play typically involves metarepresentation but both activities require the capacity to simulate counterfactual situations. Mindreaders ‘mimic’ or imagine another’s mental state and project the output of their own cognitive processes upon the other person to predict their behaviour. In pretend play children likewise simulate having an alternative belief-state in order to behave appropriately relative to the pretend scenario. Children’s capacity to engage in pretend play emerges earlier in cognitive development than the ability to pass explicit false belief tests because “simulation is more or less difficult depending on the number of adjustments that have to be made to default settings” (Harris 1992, p.129). Whereas pretend play typically requires the child only to simulate beliefs alternative to his own (e.g. “That banana is a telephone”), passing explicit false belief tests requires a child to simulate another’s intentions in addition to his beliefs (e.g. the child must appreciate what Sally believes and desires concerning the marble) and hence is cognitively more challenging. Thus, the development from being able to engage in pretend play to being able to pass explicit false belief tests consists in an improvement of the child’s imaginative flexibility rather than a qualitative change in mindreading abilities.

The early emergence of the ability to pass nonverbal false belief tests poses a problem for Harris’ theory. Recall Onishi and Baillargeon’s nonverbal false belief task discussed in section 3.1, in which 15 month-olds watch actors perform a false belief scenario and exhibit surprise when the location in which an actor searches for some object is inconsistent with the actor’s supposed false belief. In Harris’ account, passing nonverbal false belief tasks seems to require an equally complicated simulation as passing explicit false belief tests; the child will need to simulate the actor’s intentions and beliefs in order to predict that the actor will search in one location rather than another. This raises the question of why children develop the ability to pass nonverbal false belief tasks so much earlier than the ability to pass explicit false belief tests.

A possible reply for Harris is to admit that the ability to simulate both the intentions and beliefs of another person is already present when children pass nonverbal false belief tasks, but that the linguistic competence needed to understand the questions asked in explicit false belief tests, and hence pass these tests, is lacking. However, this move is unhelpful because it weakens Harris’ explanation of the developmental gap between the emergence of pretend play and the ability to pass explicit false belief tests in terms of increasingly difficult simulations. If the capacity to simulate both another’s intentions and beliefs is already present at the age of approximately 15 months, the emergence of this capacity can no longer be used to explain the late emergence of the ability to pass explicit false belief tests at the age of four.

It appears that the only viable option for Harris is to assume that passing nonverbal false belief tasks does not involve simulation but a different kind of cognitive process. Harris seems to opt for this strategy when, in a brief discussion of Onishi and Baillargeon’s results, he claims to be “open to the idea that infants and toddlers [that pass nonverbal false belief tasks] have a basic understanding of knowledge versus ignorance” (Harris 2016, p. 802) but maintains that this understanding falls short of four-year-olds’ mindreading abilities (i.e. simulation of intentions and beliefs). In this way Harris’ account does not contradict the evidence for early nonverbal false belief understanding, but places it outside its explanatory scope.

In the MEC-account, passing nonverbal false belief tasks is analysed as a type of interaction which involves social embodied cognition: the child directly perceives the actor’s intention to look for the toy in a particular box and hence will be surprised if the actor does not look there. So, contrary to passing explicit false belief tests, passing nonverbal false belief tasks only involves social embodied cognition and does not require children to produce a folk psychological narrative (i.e. narrative competence). In this way the MEC-account explains the developmental gap between the emergence of each of these two abilities.

6 Conclusion

I have proposed an alternative account of the role of pretend play in mindreading development that is inspired by Matravers’ theory of fiction and Hutto and Gallagher’s account of social cognition development. In this framework, children start out by being merely able to engage in social embodied cognition – manifested in interaction with fictions and non-fictions in which children directly perceive and react to the (pretend-)intentions of others – and gradually cultivate narrative competence that enables them to also produce and digest sophisticated folk psychological fiction and non-fiction – in which they are able to verbally explain reasons for actions of other (possibly fictional) people. Pretend play is analysed as interaction with fictions with a representational element. Pretend play hence plays a twofold role: it is a type of interaction involving social embodied cognition – in which narrative competence is scaffolded – and it involves a representational ability – which is a developmental landmark in narrative competence acquisition. I have argued that the MEC-account has greater explanatory power concerning available empirical data on the development of children’s pretend play and mindreading abilities than the established theories of Leslie, Perner and Harris.

Although my survey of the data on pretend play and mindreading development is necessarily incomplete, a distinctive advantage of the MEC-account is that it explains a crucial trend in the empirical findings, namely the evidence for a gradual development of mindreading abilities. We can extend the comparison of the theories discussed and further test the MEC-account by incorporating additional data, for instance data concerning pretend play and mindreading development in children with Autism Spectrum Disorder (cf. Currie 1996).