For almost 40 years, the focus in research on the development of Theory of Mind (ToM) – that is, the ability to attribute mental states to oneself and others – has lain predominantly with the development of belief understanding. The False Belief Task (FBT) has taken a prominent position in this research as the definitive test of the development of ToM abilities. In this task, children are typically required to predict the behaviour of another person based on that person’s false belief, thereby requiring the child to take the perspective of the other person, which differs from reality. Due to this difference between the other person’s perspective and reality, the FBT has long been considered a particularly strong test of ToM abilities. It is a well-known finding that children do not pass explicit versions of the FBT until age 4 (Wellman, Cross, & Watson, 2001), although there is some evidence that they can pass implicit versions of the task as early as 15 months (Scott, 2017). These different findings regarding the age at which children are able to pass different versions of the FBT have led to a lively debate regarding the question of when children are able to attribute beliefs to others.

More recently, however, this focus on belief understanding and the FBT has come under criticism, with arguments being made that we need to make use of a wider variety of test paradigms and also focus on the ability to attribute other mental states.Footnote 1 Factive mental states, especially the mental state of knowledge, have risen to prominence in this regard. A number of researchers have argued for the importance of knowledge attributions within social cognition; it has been argued that factive knowledge attributions are more basic than non-factive belief attributions (Nagel, 2017; Phillips et al., 2021) and that the difference between them does not lie in any particular ToM skills, but rather in the general ability to generate and maintain a representation which is inconsistent with how you take the world to be (Phillips & Norby, 2019a).Footnote 2 It may therefore be that children have a factive ToM which allows them to attribute knowledge states to others, before they are able to deal with the inconsistent representations required for non-factive ToM.

In this paper, I will move beyond belief and knowledge attributions and argue that we should also consider children’s pretend play as this carries important implications for research on ToM. I will be considering both social and individual pretence, which have been argued to be yoked in development with both occurring around the age of 18 months (Leslie, 1987), although there is evidence of some more basic forms of individual pretence occurring even earlier (Liao & Szabó Gendler, 2011). Pretend play is an ability which has been considered to be closely related to belief attribution since the seminal work of Alan Leslie (1987, 2002), although how this relationship is to be understood remains controversial (Nichols & Stich, 2000; Perner, Baker, & Hutton, 1994; Stich & Tarzia, 2015). In order to determine the implications of pretend play for ToM research, the first question we face is whether pretence requires mental state attribution and can therefore provide evidence of ToM. I will argue that although pretend play itself does not provide evidence of ToM, the ability to deal with inconsistent representations evidenced in pretend play is highly relevant to research on ToM, especially regarding the issue of how knowledge attributions relate to belief attributions. I will not dispute that children may well be capable of knowledge attributions before (false) belief attributions, but I will argue that the evidence of children’s early pretend play indicates that the reason why this is the case is more complex than usually thought.

It should be noted that throughout this paper I will be assuming a representational theory of mind as well as a representational theory of pretence. While most accounts of pretence do assume that pretence depends on having representations, recently there have been moves made to argue that pretence need not be understood in representational terms (Hutto, 2022; Rucińska, 2016; Rucińska, 2019; Weichold & Rucińska, 2021). I will not argue in this paper that representational accounts of pretence are inherently better than their non-representational counterparts, but rather leave aside what I take to be the larger issue of mental representation. The reason for this is that the debate regarding the priority of a factive ToM is rooted in the context of a representational framework, the claim being that children are capable of factive representations but not non-factive representations. If one does not share this commitment to representationalism, the question of the priority of factive ToM may not arise in the same way and discussion of this would go beyond the scope of this paper. For this reason, I will only allude to these alternative non-representational accounts tangentially in my discussion.

In what follows, I will begin by discussing what is required in order to provide evidence of ToM and how these requirements relate to the requirements for belief understanding (Sect. 1). This builds on the work of Phillips & Norby (2019a) who, firstly, set up clear requirements for testing ToM, and secondly, argue that the difference between factive and non-factive ToM lies in the more general ability to deal with inconsistent representations. Based on this, I then unfold my argument for the significance of pretence for ToM research in two steps: firstly, I argue that pretend play fails to meet the requirements for ToM attribution (Sect. 2); secondly, I argue that children’s pretend play is nonetheless relevant for ToM research because it provides evidence that young children before the age of 4 seem to be able to deal with inconsistent representations (Sect. 3). The contribution of this paper is therefore twofold: firstly, in applying the criteria for ToM to pretend play, I aim to provide a more precise way of thinking about the relation between pretence and belief attribution and, secondly, I argue that pretend play should be seen as highly relevant for the ToM literature as it poses puzzles for our understanding of the development of ToM.

1 Testing Theory of Mind

In order to begin to assess the implications of pretend play for ToM research, it will be helpful to consider Phillips and Norby’s (2019a) discussion of what is needed in order to show that someone has ToM abilities. While the FBT has long occupied the central position in ToM research, Phillips and Norby point out that much of this has been unquestioning acceptance without much reflection on what exactly is required for something to be a ToM test. If we want to devise tests for ToM other than the FBT, then such requirements are needed. Phillips and Norby propose two requirements that must be met in order to demonstrate ToM abilities:

1. Tracking Requirement: the task must be such that it requires being able to track the mental states of the other person.

2. Separateness Requirement: the task must require being able to keep the mental states of the other person separate from one’s own mental states.

Why are these two requirements needed? The tracking requirement is presumably less controversial: if ToM requires being able to attribute mental states to others (as per definition), then at minimum this presupposes being able to track the content of other people’s mental states.Footnote 3 However, this is not sufficient for being credited with ToM abilities. Having a ToM requires attributing mental states to oneself and others, which means that the contents of the mental state being tracked must be associated either with oneself or others. Phillips & Norby (2019a) spell this out in more detail: The definition of ToM requires that one is able to attribute mental states to others in order to be able to predict and explain their behaviour. If I am merely able to predict your behaviour based on my own representation of the world or confuse your representation with my own, this would not count as ToM because it does not require attributing any mental state to you. It will only count as ToM if I predict your behaviour based on your representation of the world and not my own.

Identifying these two requirements presents an important development in the literature as they enable a principled assessment of any test of ToM. As Phillips & Norby (2019a) note, the traditional FBT does meet these requirements, but there could also be other tasks which meet these requirements.Footnote 4 Importantly, these requirements allow for moving away from the FBT as the standard for ToM in a principled and supported manner.

This is crucial for Phillips and Norby’s main aim in the paper, namely arguing that there can be a factive ToM and that we can test for a purely factive ToM. In other words, Phillips and Norby want to open up the possibility for a ToM where one can attribute factive mental states to others (e.g. knowledge states) without necessarily also being able to attribute non-factive mental states (e.g. (false) beliefs). If the determining factor of whether one has a ToM is whether one can pass the FBT, there is no room for a factive ToM, because having a ToM is contingent on being able to attribute a non-factive mental state to others. Moreover, Phillips and Norby argue that with a ‘diverse knowledge task’ we have such a conclusive test of factive ToM. The idea of this task is that the child has to recognise that the other person’s knowledge of the situation differs from their own (e.g. that the other person does not know something that the child knows), and that the lack of this knowledge will influence their behaviour. Importantly, while there is a difference between the perspective of the child and that of the other person in this task, what the child does not have to represent is the other person’s belief which is inconsistent with the child’s own perspective. So, for example, the child just has to represent that the other person does not know where the ball is, and not that the other person actually represents the ball as being in a false location. Being able to use a task like the diverse knowledge task means that we do not have to wait for the onset of a non-factive ToM in order to be able to test for ToM because we can provide evidence of ToM using only factive mental states.

It should be noted that often the claim made about children’s failure in the FBT is not that they completely lack a ToM but that they lack representational ToM (Wellman et al., 2001). This means that they allow that children have a ToM which enables, for example, the attribution of motivational mental states like desires (e.g. Repacholi & Gopnik, 1997) but that there is a particular type of ToM, namely representational ToM (requiring the attribution of a particular representation of the world to the other person), which they are not yet capable of. The view that children can have a factive ToM before passing the FBT goes beyond this, however, as factive ToM is supposed to be a form of representational ToM. Therefore, the claim is that there might be a purely factive but nonetheless representational ToM which precedes a non-factive ToM. Importantly, Phillips & Norby (2019a) argue that ultimately the difference between factive ToM and non-factive ToM is not a difference in ToM abilities, but rather a difference in the kinds of representations one is able to have. Specifically, the difference is in whether one is able to have only factive representations or whether one can also have non-factive representations. In other words, Phillips and Norby think that the difference amounts to whether I can represent someone else’s perspective when it is inconsistent with my own (which is required for non-factive ToM), or whether I can merely represent someone else’s perspective when it is different from my own but not inconsistent (which is all that is required for factive ToM). This means that the difference between someone who can pass a test of factive ToM, like the diverse knowledge task, and someone who can pass a test of non-factive ToM, like the FBT, does not lie in any additional ToM abilities that someone passing the non-factive task needs to have, but in a domain general ability to “construct and maintain a particular kind of representation: a non-factive representation. That is, one that is inconsistent with the way you take the world to actually be” (Phillips & Norby, 2019a, 16). In this sense then factive ToM is more basic than non-factive ToM, but not because factive ToM requires less advanced specific ToM abilities, but because non-factive ToM tasks have the additional demand of dealing with inconsistent representations. This point is especially important given that there is evidence that children are able to pass tests requiring the attribution of knowledge states before they are able to pass the FBT (see Phillips et al., 2021 for an overview; Wellman & Liu, 2004). For example, children pass tests requiring verbal knowledge or ignorance attributions at the age of 3 (Pillow, 1989), while only succeeding in verbal FBTs around the age of 4.Footnote 5 If Phillips and Norby are correct in their analysis of the difference between tests for knowledge attribution and the FBT, then we should interpret this as evidence that young children do have a ToM and that all they are missing is the general ability to deal with inconsistent representations.

The idea that the FBT is so challenging for children because it requires dealing with inconsistent perspectives is a very promising one, which is also supported by the literature showing that children struggle with perspective problems till age 4, even in domains where no mental state attribution is required. For example, Perner et al. (2002; see also Doherty & Perner, 2020) have pointed out that there is a strong correlation between performance on the FBT and the alternative naming task. The latter is a task in which children are, for example, shown an object and told that this can be called either “rabbit” or “bunny”. The experimenter then gives one name for the object, and it is the child’s task to give the alternative name. Children up to the age of 4 struggle with this type of task, repeating the name the experimenter gave rather than the alternative. Perner et al. hypothesise that this is because the alternative naming task, like the FBT, requires dealing with conflicting representations of one and the same thing. This ability to deal with perspective problems like this is something which they argue only develops around the age of 4, when children also pass the FBT.

Summing up, we therefore have two requirements needed in order to provide evidence of ToM abilities: Tracking and Separateness. While at least traditional explicit FBTs typically meet these requirements, they also pose a further demand, namely the general ability of being able to deal with inconsistent perspectives. Having clarified these requirements, I will now turn to pretend play and how this relates to ToM research.

2 Pretend play and theory of mind

Children begin engaging in pretend play around the age of 18 months, so over two years before they are able to pass the explicit FBT (Leslie, 1987; Liao & Szabó Gendler, 2011). The most discussed example of this in the literature is perhaps that of the banana phone, where a child watches her mother pretend that the banana is a telephone (Leslie, 1987). If the child was only capable of understanding this behaviour literally, they ought to be very confused by this behaviour and potentially come to think that you can talk to other people using a banana, in the same way as you can do with a telephone. Leslie terms this ‘the problem of representational abuse’: if children took pretence literally, then they should end up with confused ideas about the world. But this does not seem to happen. Instead, when a child sees her mother pretend that the banana is a telephone the child realises that this is in some sense not real and the pretence does not interfere with the child’s general understanding of bananas. What this means is that the child must understand that the banana does not actually function as a telephone and therefore the child’s regular representation of the banana which is used outside of the context of the pretence is not updated. The banana-phone is an example of an object-substitution pretence, but young children are also capable of other forms of pretence, for example pretending that an object has properties that it does not actually have (e.g. pretending that a cup is full and pretending to drink from it; Bosco, Friedman, & Leslie, 2006; see also Liao & Szabó Gendler 2011 for an overview of different types of pretence developed by young children).

It is generally agreed that there is a close relationship between pretend play and ToM, although how this relationship is to be understood is controversial. Nichols and Stich (2000), for example, have argued that pretend play is an important precursor of ToM, but falls short of mental state attribution. Alan Leslie (1987, 2002), on the other hand, defends the much stronger view that children’s early pretend play already provides evidence of “a major part of the specific innate basis for the development of theory of mind” (Leslie, 1988, 24). Leslie argues that children’s pretend play involves mental state attribution and therefore provides evidence that children already have the ToM abilities that are required for belief attribution at 18 months. Pretend play, according to Leslie at least, is supposed to be so significant because it involves representational mental state attribution in a way which is highly similar to belief attribution. In what follows I will argue that Leslie is wrong to think that children’s early pretend play provides evidence of mental state attribution, but that he is nonetheless right to think that pretend play shows that young children already have some of the important abilities underlying belief attribution: namely the ability to deal with inconsistent representations, and this in turn has implications for our understanding of the development of ToM abilities.

2.1 Pretence and Leslie’s mentalistic argument

Why should we think that pretence poses some of the same demands as the FBT or ToM in general? A first very important similarity that Leslie (1987) notes is that pretend play requires the child to deal with dual representations. Take the example of the banana phone: in order for the child to understand that the mother is pretending that the banana is a telephone, she must understand that the banana is being used as a telephone, while also remaining aware that the banana is not actually a telephone. In other words, the child must have a representation which they use for the pretence (pretend representation) of the banana as a telephone, and a regular representation of the banana as a banana. Importantly also, these two representations must be kept separate to avoid what Leslie terms ‘the problem of representational abuse’ – if the child did not keep these two representations separate, then the banana being used as a telephone should change children’s understanding of bananas and telephones; at worst, the child might come to think that the banana actually can be used as a telephone. Therefore, if the child was not able to keep pretence and reality separate, the result should be confusion. But even young children do not typically get confused between pretence and reality (Lillard, 1993, 1994).

This part of Leslie’s view of pretence is generally not controversial amongst those who assume a representational account of pretence. Footnote 6 Even those who argue that pretend play does not provide evidence of mental state attribution nonetheless agree that children can keep pretence and reality separate and that they are dealing with dual representations in one form or another (Lillard, 1994; Nichols & Stich, 2000; Perner, 1991). The issue is whether keeping these representations separate requires mental state attribution. While mental state attribution would be a means of keeping the representations separate, it is not the only way to do so. Therefore, it is not clear that mental state attribution would be required based on this reasoning.

Leslie (1987) gives a further reason for thinking that children’s pretend play provides evidence of mental state attribution, namely that children are also able to recognise pretence in others at the same time as when they first begin to engage in pretend play themselves. It is this ability to recognise pretence in others that Leslie thinks depends on being able to recognise the mental state of pretending in others. So, for example, when the child sees her mother hold the banana to her ear, she is able to interpret this as pretence because she realises that the mother has a mental state of “pretending that the banana is a telephone”, which allows them to interpret her behaviour without leading to confusion.

Much of the debate regarding the relationship between pretend play and ToM has revolved around this issue, with Leslie arguing that we cannot explain all of young children’s abilities to recognise pretence in others without acknowledging that they are capable of mental state attribution (Friedman, Neary, Burnstein, & Leslie, 2010; Leslie, 2002); and critics arguing that we can explain children’s abilities to recognise pretence in behavioural terms; i.e. that when the child recognises pretence in another person they understand this purely in terms of the other person behaving as if the banana was a telephone without attributing any mental states to the person (Nichols & Stich, 2000; Stich & Tarzia, 2015).Footnote 7

As it stands, I think it is an open question whether the behaviourist reading can give an account of all types of pretend play that children engage in. However, I do not think we need to give an answer to this in order to show that pretend play is not sufficient to show ToM abilities. As already mentioned, Phillips & Norby (2019a) argue that there are two requirements which need to be met in order to show ToM abilities: the Tracking Requirement and the Separateness Requirement. The debate thus far has revolved only around the question of whether children’s early pretend play meets the Tracking Requirement, i.e. whether children are tracking the mental states of others in pretence or not. But even if we were to grant this to Leslie, this would still fall short of providing evidence of ToM as the Separateness Requirement still needs to be met. As I will argue in the following section, I do not think that the current evidence of children’s pretend play meets this requirement. To be clear, the question of whether pretend play requires children to track the mental states of others is undoubtedly an important one. If we can show that pretend play requires being able to track the mental states of others, this would mark an important achievement for the child, albeit falling short of ToM. My aim in the following will rather be to highlight that, even if we were to show that pretend play requires tracking mental states, there is a further requirement to be met which, thus far, pretence has not been shown to meet.

2.2 Pretend play does not provide evidence of theory of mind – the separateness requirement

Phillips & Norby (2019a) point out that as well as being able to track the mental states of others, one also needs to be able to keep these separate from one’s own mental states in order to count as having a ToM ability. In this section I will argue that pretend play does not meet this requirement. To see why, we must again look at children’s ability to recognise pretence in others:Footnote 8 does recognising pretence in someone else require being able to keep separate one’s own perspective and that of another person?

One reason why one might think that it does is that, if we think of pretence as initiated by another person, the child would need to recognise this pretence which is different from their own view of reality. Therefore, one might argue that in recognising pretence in someone else, the child comes to track the perspective of another person which differs from their own. However, this view risks confusing two distinctions the child needs to draw: firstly, a distinction between pretence and reality and, secondly, a distinction between their own perspective and that of another person (see also Wolf 2021). While recognising pretence in another person clearly does require the distinction between pretence and reality, it does not also provide evidence that the child is keeping separate their own perspective and that of the other person. In particular, when recognising the pretence initiated by the other person, the child could join in the pretence and share the pretend perspective of the other person, without needing to keep this perspective separate from their own. The reason for this is that in order to recognise the pretence, the child needs to determine the content of the pretend representation (which does differ from the child’s knowledge of reality), but this pretend representation does not need to be attributed to anyone in particular. In other words, when recognising pretence in others, the child does not need to set apart their own pretend perspective from that of the other person. For example, suppose a child is invited to join a pretend tea party by the experimenter in which the experimenter pretends to pour tea into an empty mug on the table. In order to recognise the pretence, the child just needs to determine the contents of the pretence – i.e. that there is tea being poured into the cup – in order to be able to respond appropriately. This means that a pretend perspective must be generated which is kept separate from the child’s perspective of reality. What is not needed, however, is that the child attributes this pretend perspective to the experimenter and distinguishes between the experimenter’s (pretend) perspective and their own.Footnote 9 It could be, of course, that it so happens that when recognising pretence in another person, the child does attribute the pretence to the other person first and only then comes to adopt it for themselves, but we would need evidence in favour of this distinction being made.

Indeed, I would argue that as it stands the evidence indicates that children’s early pretence is joint pretence in which a distinction between the child’s perspective and that of the other person is not needed. For example, we know that joint pretence seems to be one of the earliest forms of pretence (Rakoczy, 2006), with children being able to engage in more advanced forms of pretending when playing with someone else, such as an older sibling (Dunn & Dale, 1984; Perner et al., 1994). The problem with such cases of pretence is precisely that they do not require the child to separate their own perspective from that of the other person. Looking at experimental evidence testing children’s ability to recognise pretence in others, we are also predominantly faced with paradigms in which the child can pass the task by adopting a pretend perspective which is shared with the experimenter. For example, Bosco et al. (2006) tested whether the children are able to understand pretence by seeing whether they reacted appropriately to the pretence initiated by the experimenter, for example by appropriately pretending to drink from the cup which the experimenter had pretended to pour water into.

What would be required in order to meet the Separateness Requirement would be for children to recognise a pretence in others which they themselves do not share. In other words, we would require a paradigm in which the child shows that they are able to track what the other person is pretending, while themselves not engaging in the pretence, or engaging in a different pretence.Footnote 10 This, however, is evidence which we currently do not have. To be clear, the claim here is not that children engaging in joint pretence are not able to keep their own perspective separate from that of the other person. Rather, the claim is that as it stands the evidence from children’s joint pretence taken alone does not provide positive evidence in favour of their being able to make this distinction.

To conclude this section, I have argued that the evidence from children’s pretend play fails the Separateness Requirement and therefore evidence of children’s early pretend play does not provide evidence of ToM abilities, regardless of whether we interpret children’s early pretence as mentalistic or not. Given this, we might think that Leslie was wrong to think that pretence is an ability closely related to belief attribution specifically or ToM in general. This, however, would be a mistake. Even though I do not think that children’s pretend play can be counted as evidence of ToM, I think Leslie is nonetheless right to think that pretend play can offer evidence that children are already capable of some of the important components underlying belief attribution, namely the ability to deal with inconsistent representations. This, in turn, has implications for ToM research.

3 Pretence is nonetheless relevant for Theory of Mind research

In the previous section I argued that pretend play does not provide evidence of ToM as it fails the Separateness Requirement (I left open whether it meets the Tracking Requirement). In this section, I will argue that pretend play is nevertheless highly relevant for ToM research because it provides evidence of young children being able to deal with inconsistent representations. It might be thought that, because pretend play only provides evidence of an ability to deal with inconsistent representations and not of ToM, pretend play is only really relevant for research on the development of belief understanding. However, I think that pretend play has implications going beyond belief understanding, especially given the factive ToM views discussed in Sect. 1. It was suggested there that the difference between factive ToM and non-factive ToM lies not in any special ToM abilities, but in the general ability to deal with inconsistent representations. As noted, there is some promise to the view that the FBT is so difficult for children because it requires dealing with inconsistent representations, rather than due to the ToM requirements. However, if there is evidence that children can deal with inconsistent representations in pretend play, this indicates that the situation may be more complex and that children’s difficulty cannot simply be an inability to deal with inconsistent representations. I will therefore consider implications of pretend play for research on ToM going beyond belief understanding and the FBT in the next section (Sect. 3.1), before looking specifically at implications for the development of belief understanding (Sect. 3.2).

3.1 Pretend play and inconsistent representations in Theory of Mind

In Sect. 1 we saw that in the discussion of factive ToM, Phillips & Norby (2019a) argued that the difference between being able to pass a test of factive ToM and being able to pass a non-factive test of ToM lies not in any specific ToM abilities, but rather in the ability to deal with inconsistent representations. So, for example, someone who is only capable of factive ToM would be able to deal with dual representations, as long as there are no inconsistencies involved. This allows them a limited ability to represent differing perspectives (i.e. they can represent someone as being ignorant of something, e.g. I know where the keys are, but Jane does not; or conversely represent someone as knowing something which they themselves do not know, e.g. mum knows where the keys are but I don’t), but falls short of being able to attribute different beliefs about a situation. Moreover, there is evidence that factive ToM may actually be prior to non-factive ToM in development (Nagel, 2017; Phillips et al., 2021; Wellman & Liu, 2004), and there is evidence that children struggle with inconsistent representations till age 4 even in areas where no ToM is involved (Perner et al., 2002). This all made it seem plausible that children might develop a factive ToM some time before the age of 4, while the development of non-factive ToM is only limited by their inability to deal with inconsistent representations.

Pretend play, however, appears to pose a problem for this view, as children’s early pretend play seems to provide evidence that they can – at least in the context of pretence – deal with inconsistent representations. Take, for example, a child who is pretending that their toothbrush is a magic wand: in order to do so, the child must maintain both the representation of the toothbrush as a toothbrush and the pretend representation of the toothbrush as a magic wand. Importantly, when pretending that the toothbrush is a wand, the child knows that the toothbrush is not actually a wand, which indicates that it is not the case that the child is only dealing with one representation, namely the pretend one. It is also important to note that these representations are inconsistent ones. Phillips & Norby (2019a) argue that we require dual representation also when dealing with knowledge, but that these representations are not inconsistent with each other. It is merely a matter of adding or subtracting some information from the representation of the perspective of the other person. Pretend representations are not like this, however. The child knows that the toothbrush is not actually a wand, and her pretence that the toothbrush is a wand is therefore inconsistent with her knowledge of reality.

It is important to note, however, that in saying that pretend play requires being able to deal with inconsistent representations, I remain neutral as to how these representations relate to each other. In particular, I am not committed to the idea that the pretend representation requires meta-representation. All I am committed to is that in pretend play children have dual inconsistent representations – a representation of reality and a pretend representation – which they are able to use in order to engage in pretend play.

Phillips & Norby (2019a) might be right that we do not need to be able to deal with non-factive representations in order to show ToM abilities, but what pretend play shows is that 18-month-old infants are already able to deal with non-factive representations, because they do so in pretend play. At the very least, what this indicates is that it cannot be that children fail the FBT till age 4 only because they are incapable of non-factive representations, even though they do have ToM skills otherwise.

There are a number of ways in which one might respond to this. Firstly, it might be alleged that the problem children face is not with ToM or non-factive representations separately; rather, it is only when these two demands are combined that children have problems. This might be plausible if, for example, both ToM reasoning and using non-factive representations are demanding individually, and while children can manage the demands separately, the combined cognitive load is too much. While there might be some plausibility to this idea, I do not think that things are quite that simple. The reason for this is that children also seem to show problems with inconsistent representations more generally. For example, in the alternative naming task children struggle with this type of perspective problem and do not pass the task till age 4 (Perner et al., 2002). Therefore, it seems both that children do not have a general problem with mental state attribution and that they struggle with at least some perspective problems that do not require mental state attribution. Instead, what we should ask is why some perspective problem tasks – like the alternative naming task or the FBT – are more difficult for children than other perspective taking tasks like pretend play.

Secondly, it might be argued that pretence is different to states like knowledge and belief because pretence has a different relation to the world. States like knowledge and belief have a mind-to-world fit, while states like desires, for example, have a world-to-mind fit. I agree that there is a difference between pretence and states like belief and knowledge. Pretence is somewhat difficult to classify as either having a mind-to-world or world-to-mind fit: while there is a component of desire involved, in the sense of making the world fit with what one wants it to be (world-to-mind), pretence also involves representing something as being a particular way (mind-to-world). Therefore, I think there is sufficient similarity between pretending and states like believing and knowing to allow for a relevant comparison. Moreover, even if we were to accept that pretending is fundamentally unlike belief or knowledge as it is not a representation of how the world is (or appears to be), we would still be left with the question of why this distinction is relevant and, in particular, why it would allow for dealing with inconsistent representations in a way which is not possible with more clearly mind-to-world states.

Thirdly, it might be objected that pretend play does not actually involve inconsistent perspectives. It could be argued that children never actually have to deal with the inconsistent perspectives as they merely have two separate perspectives which they switch between without putting these two perspectives together. In other words, this would mean that the child cannot understand the toothbrush as a toothbrush and as a magic wand simultaneously, but is merely able to switch from one to the other, perhaps depending on contextual cues. While I do think that contextual cues probably play an important role in triggering pretend play perspectives, I do not think that this switching objection undermines the important point that children are able to deal with inconsistent representations in pretend play. On the one hand, if the requirement were that someone demonstrate that they are making use of dual perspectives simultaneously, then this would set a very high threshold for showing the ability to deal with inconsistent perspectives, one which few, if any, of the tests used for testing belief attribution would meet. If we consider both the FBT and the diverse knowledge task, these usually do not test whether someone is making use of both perspectives simultaneously. Rather, it is first tested whether the child has correctly identified the false belief or knowledge state of the other person, and then they are asked about their own knowledge of the situation. This too could be explained in terms of switching between perspectives, rather than being able to simultaneously hold inconsistent representations. For this reason, I do not think that the possibility of switching between different perspectives should be seen as an objection; rather, it is part of what it means to be able to deal with different perspectives.
On the other hand, I also believe there is positive reason to think that when engaging in pretend play children are actually using inconsistent perspectives simultaneously: while engaging in pretence, even young children seem to remain aware that this is not real. It is not merely something which they can retrospectively report on once they have been jarred out of the pretend situation, but rather seems like an awareness which accompanies the pretence itself. Therefore, pretending does not merely require being able to switch between perspectives, but there is also a link between the perspectives which allows children to remain aware of reality while pretending.Footnote 11

If young children’s pretend play does show that they are able to deal with inconsistent representations at 18 months, then the question we are faced with is why children seem to be able to deal with inconsistent representations in some contexts and not others. My suggestion is that the problem lies not in generating inconsistent representations, but rather in being able to manage and use these appropriately (see also Wolf, 2021). This appropriate use consists especially in being able to consistently access the information or contents of these representations when needed. Pretend play is a strongly situationally supported and socially scaffolded ability, and it is notable that especially early pretence seems to be strongly dependent on props which initially are also required to have strong similarities with what is being pretended (Lillard, 1993). So, for example, a child might be able to pretend that a stick is a wand, but might not be able to pretend that a toy car is a wand. Although there is only limited evidence directly testing this, Fein (1979, 1981) reviews evidence indicating that especially young children’s pretence is facilitated when using realistic toys, as long as these correspond to the category being pretended. So, for example, children struggle when using a hairbrush as pretend food. This would make sense if children in their early pretence are strongly dependent on situational cueing in order to make use of the pretend perspective.

A further important aspect of early pretence which should be noted is that the pretend play context is often provided to children, which might help them use the pretend representations in the context of pretend play. There is evidence that while children of age 3 are quite good at joining in a pretence initiated by an adult and responding appropriately, they are considerably less good at starting a novel pretence of their own (Bijvoet-van den Berg & Hoicka, 2019). My claim therefore is that children at that age are able to handle inconsistent representations, but only if they have external support when generating and subsequently using these representations.

Regarding the question of why factive ToM might precede non-factive ToM, I would therefore argue that this is not because children are unable to deal with inconsistent representations – there are situations in which they can deal with inconsistent representations – but because they are dependent on situational support in order to be able to make use of a representation which is inconsistent with their own (representation of reality). Factive ToM might be something which children show earlier competence in because perspectives which differ from, but are not inconsistent with, the child’s own perspective on reality may be easier to use. This view would also cohere well with evidence from visual perspective taking indicating that children are able to take the perspective of another person as young as 2.5 years, but only develop the ability to contrast that perspective with their own at age 4 (Moll, Meltzoff, Merzsch, & Tomasello, 2013; Moll & Tomasello, 2012). Specifically, in these tasks children struggled if the task was such that it required them to switch from their own perspective to that of the other person, which may be because if the child’s own perspective is made salient, they then struggle to consistently use the perspective of the other person and access the relevant information (Wolf, 2021). Plausibly, it is easier to relate different perspectives with each other or even make use of them simultaneously when they are not directly in conflict or inconsistent with each other, thereby allowing children to demonstrate factive ToM before non-factive ToM.

What I suggest is that there is a change around the age of 4 in the cognitive organisation of representations which allows children to use inconsistent representations in a less situationally dependent manner and allows them to pass the explicit FBT. Importantly, the claim is not that younger children are unable to use these inconsistent perspectives, but that they are dependent on situational support in order to be able to do so, whereas using different but not inconsistent perspectives can be possible with less situational support. This leaves open the question of how to interpret the findings from the implicit FBT, which I turn to in the following section.

3.2 Pretend play and the implicit False Belief Task

In the previous section, I argued that pretend play indicates that children before the age of 4 are capable of dealing with inconsistent perspectives in at least some contexts, although they are limited in their ability to use these inconsistent perspectives. This is also relevant for interpreting the findings from the implicit FBT which, arguably, is a further area in which children have demonstrated the ability to deal with representations inconsistent with their own at 15 months of age already (Onishi & Baillargeon, 2005; see Scott, 2017 for an overview of implicit FBTs).

The question of how to interpret the findings from the implicit FBT remains one of the main controversies in research on the development of belief understanding. It is controversial whether these findings should be interpreted as evidence of belief understanding (see Heyes, 2014 and Scott & Baillargeon, 2017 for opposing views on this), and recent concerns about the replicability of implicit FBTs have cast some doubt on how robust these findings are (Kulke & Rakoczy, 2018; Kulke, von Duhn, Schneider, & Rakoczy, 2018; Poulin-Dubois et al., 2018).

Nonetheless, it might be interesting to compare pretend play and the implicit FBT.Footnote 12 While there are concerns about individual paradigms – for example that not all implicit FBTs require the child to deal with dual inconsistent perspectives (Phillips & Norby, 2019a, b) – there are some types of implicit FBT which do require dealing with two inconsistent perspectives and which children are able to pass at the age of around 18 months (D. Buttelmann, Carpenter, & Tomasello, 2009; F. Buttelmann, Suhrke, & Buttelmann, 2015; Knudsen & Liszkowski, 2012; Southgate, Chevallier, & Csibra, 2010). Why is it that children are able to deal with the inconsistent perspectives in this context?

From the discussion of pretend play and ToM, we have seen that the problem might lie not in being able to generate inconsistent representations, but in being able to use these subsequently. In pretend play there is substantial situational scaffolding which might trigger the pretend perspective and therefore help the child access the relevant information. Something similar may be the case in the implicit FBT, where children are able to use the perspective of the other person if this is highlighted by the task set-up and distractors are avoided (Newen & Wolf, 2020; Rubio-Fernández, 2013; Rubio-Fernández & Geurts, 2013)Footnote 13. The implicit FBTs might therefore be such that they provide the situational support needed in order to make use of the inconsistent perspective of the other person, while the explicit FBTs do not.

A question we are then confronted with is whether this early ability evidenced in the implicit FBT should be properly considered as evidence of ToM and belief understanding. As Phillips & Norby (2019a) rightly point out, some of the implicit FBTs fail the Separateness Requirement because they only test whether the child is able to track the perspective of the other person, and not whether the child is also keeping the perspective of the other person separate from their own.Footnote 14 However, they allow that some of the implicit FBTs do also require the child to make use of their own perspective. For example, in the active helping behaviour paradigms children need the perspective of the other person in order to correctly infer the person’s goal, but they then need to act on their own knowledge of reality in order to be able to appropriately help the other person. Does this mean that – provided these findings can be replicated – they provide evidence that children not only are able to deal with inconsistent perspectives but are also able to do so in a non-factive ToM context? One reason why we might be hesitant to concede this is that pretend play has highlighted that there can be situations which involve dual perspectives without requiring one of these perspectives to be attributed to another person; I have argued that this is what might happen in joint pretence but – less controversially – it is the case in individual pretence where pretence and reality perspectives must be kept separate, even though both are perspectives of the same person. One suggestion is that this could also be the case in the implicit FBTs: they require the child to deal with dual and inconsistent perspectives, but perhaps this alternative perspective is not attributed to anyone. After all, the child could pass the helping behaviour task by switching to their representation of reality from a perspective which is shared with the other person, but appreciated as being not real. 
This would mean that what counts as belief understanding in the implicit FBT is actually closer to what happens in pretence, in the sense that the child actually shares the other person’s perspective. It should be noted that this does not mean that they themselves come to hold a false belief about reality, as in pretending children also do not come to hold a false belief about reality; the child pretending that the banana is a phone does not think that the banana actually is a phone.

But what does this amount to? The difference between pretending and belief attribution is in how the perspective of the other person is related to the child’s own representation. The pretend relation is more closely attached to the child’s own representation as there is a sense in which this is how she too thinks about the object. To appreciate another’s belief properly, this belief representation must be completely separate from one’s own belief: to represent another’s belief is not to represent your own beliefs, even if your own beliefs and those of the other person happen to match up. Therefore, while I think that implicit FBTs indicate that infants are capable of dealing with inconsistent perspectives in a ToM task, there is a possibility that they are doing this without actually attributing this perspective to another person and clearly keeping it separate from their own perspective. While this is currently only a suggestion which would need to be explored and tested in further research, drawing such a parallel between pretence and the implicit FBT seems relevant given that both pose a perspective problem in which children succeed at a similar time prior to being able to pass the explicit false belief task and most other perspective problems.

4 Conclusion

I have argued that although children’s early pretend play fails to meet the requirements for showing ToM abilities, it nonetheless carries important implications for ToM research. This paper contributes to two debates. Firstly, it contributes to the debate over whether early pretend play requires ToM abilities. Regarding this, I argue that it falls short of ToM because it does not require the child to distinguish between their own perspective and that of the other person. Secondly, I argue that pretence nonetheless carries implications for research on the development of ToM, because even though it falls short of ToM, it requires some of the crucial abilities underlying belief understanding: namely the ability to make use of dual, inconsistent perspectives on an object. Pretend play indicates that children are capable of dealing with inconsistent perspectives in some situations and not others, and I argued that children’s difficulty might lie in using these inconsistent representations consistently. Before the age of 4, children are dependent on situational support in order to make use of inconsistent representations, while making use of representations which are not inconsistent can be easier. This would explain why factive ToM precedes non-factive ToM, even though young infants are already capable of generating and maintaining inconsistent representations. Furthermore, I highlighted that in pretence the pretend perspective can be shared, while appreciating that the pretence is not real. This is an interpretation which can also be applied to some of the findings from the implicit FBT. This, along with situationally supporting factors, might explain why infants are capable of passing some implicit FBTs before they pass the explicit FBTs at age 4.