The ability to take other people’s perspectives is integral to communication and effective interaction with other agents (Clark & Brennan, 1991; Sperber & Wilson, 1987). Both children and adults, however, have trouble appreciating that other agents see the world differently (Epley, Morewedge, & Keysar, 2004; Keysar, Lin, & Barr, 2003). Such difficulties have usually been attributed to the tendency to be biased by one’s own perspective when reasoning about others’, an effect known as the curse of knowledge (Birch & Bloom, 2004), the curse of expertise (Hinds, 1999), the false consensus effect (Ross, Greene, & House, 1977), and egocentrism or egocentric bias (Apperly et al., 2010; Epley et al., 2004; Keysar, Barr, Balin, & Brauner, 2000), among other terms.

This bias towards our own knowledge can be a hindrance when attempting to be objective about other people’s beliefs and experiences (Risen & Critcher, 2011). Typical means of measuring this bias are tasks in which participants are instructed to select a target that is not optimal or ‘true’ from their own perspective but appears to be true from the agent’s (Dennett, 1978; Keysar et al., 2000; Wimmer & Perner, 1983). Deviation towards the egocentrically correct distractor, or delays in processing the correct answer relative to when there is no egocentrically correct distractor present, are usually considered to index bias. For instance, in the director task (Keysar et al., 2003), participants are directed by the agent to select objects in an array. When the agent has a restricted view of the objects, the command from the agent to select the ‘top cup’ requires the participant to select a cup which, from their own perspective, is the middle cup, rather than the actual top cup, which is hidden from the agent’s view. It has been shown that adults make more errors and perform more slowly when there is a better match for the instruction from the participant’s own perspective (Apperly et al., 2010; Keysar et al., 2003; Legg, Olivier, Samuel, Lurz, & Clayton, 2017; Samuel, Roehr-Brackin, Jelbert, & Clayton, 2019b; Wu & Keysar, 2007).

Testing participants’ ability to reason about other perspectives in the presence of an egocentric distractor, which we here term the ‘right distractor’, is at the heart of the classic change-of-location/false belief task. In this task, participants are instructed to select the location where another agent falsely believes an object to be, contrary to the participant’s own knowledge of its true whereabouts (Baron-Cohen, Leslie, & Frith, 1985; Wellman, Cross, & Watson, 2001; Wimmer & Perner, 1983). Overall, much of our understanding of theory of mind, which is the ability to represent others’ unobservable mental states (Premack & Woodruff, 1978), as well as our understanding of our tendency to be egocentric more generally (Apperly et al., 2010; Birch & Bloom, 2007; Keysar et al., 2003), is predicated upon this ‘right distractor’ paradigm.

A problem at the core of these paradigms is that they conflate the difficulty of ignoring the egocentrically correct distractor with the difficulty of voluntarily selecting something that is egocentrically wrong. For example, in the director task, the highest cup in the grid is the right distractor and must be ignored, but the middle cup is a ‘wrong target’ and must be selected. In the classic false belief task, the participant must select the location they know the item not to be in. Indeed, unless both the right distractor problem and the wrong target problem are solved, either a ‘distractor error’ or no response at all (a time-out perhaps) will occur.

The reason for this conflation of two problems is that it is hard to design a task that can independently manipulate these two phenomena from the same perspective. Simply put, if there is a right distractor, then the target must be ‘wrong’, and vice-versa. However, until we can understand what interferes with participants’ correct choices on such tasks, we cannot know precisely what the difficulty in making judgments about other perspectives is. Is it the lure of what we think is correct, the desire to avoid error, or both?

One way to circumvent this issue is to use bivalent stimuli which change identity according to perspective, such as the way a 6 appears to be a 9 when it is viewed upside down. By doing so, it becomes possible to manipulate not only whether there is a right distractor but also whether there is a ‘wrong’ target. A recent visual-perspective-taking paradigm developed by Samuel, Legg, Manchester, Lurz, and Clayton (2019a) presents an opportunity to do this. In the top-left image of Fig. 1, the avatar (seen above the grid) says ‘four’, and the participant is required to locate the four from the perspective of the avatar. The correct answer is the bottom-left response key, corresponding to the bottom-left square. The original version of the task was concerned with the nature of participants’ responses, namely whether they would erroneously press the button consistent not with their own perspective but with the avatar’s. Additionally, the avatar could appear at any of the four edges of the grid, creating shared-perspective, left-perspective, right-perspective, and opposite-perspective trials. In the present study, we were interested in opposite perspective trials specifically. This is because difficulty on trials from this perspective can be caused by either the pull of the egocentrically correct ‘right distractor’, or the push of the egocentrically incorrect ‘wrong target’. We can pull apart these two effects by comparing performance in the baseline condition (top left of Fig. 1) with the conditions shown bottom left (Contrast A) and top right (Contrast B). The right distractor contrast (Contrast A in Fig. 1) compares performance in the baseline condition with a condition in which the distractor is not egocentrically correct and so minimizes the right distractor effect (while keeping constant the wrong target effect). Similarly, the wrong target contrast (Contrast B in Fig. 1) compares performance in the baseline condition with performance in a condition in which the target is identifiably another digit from the self-perspective. This condition maximizes the wrong target effect (while keeping constant the right distractor effect). Results from the original version of this task pointed to an additional difficulty caused by the requirement to select a target identifiable as another number (9) relative to the upside-down 4 (Samuel, Legg, et al., 2019a). In the present study, we used this task to examine whether these two effects contribute independently to the difficulty of performing the perspective-taking task.

Fig. 1

Example stimuli from the present experiment. Opposite perspective trials afforded four types of stimulus combinations. The top-left grid illustrates an example in which there is a ‘right distractor’ (the 4 in the top-right corner) and an ambiguous and thus only minimally wrong target (always an upside-down 4 from the participant’s perspective). In the bottom-left grid, the distractor is unrelated to the instruction to find the 4, and thus the difference between this grid and the one in the top left forms the right distractor contrast (A), with trials where the distractor matches the instruction (the top-left grid) predicted to be harder. Note that across this contrast the target is held constant. In the top-right grid, the target is maximally wrong because it is another number (always a 9) from the participant’s perspective. The comparison in performance between grids like this and grids like those illustrated in the top left thus forms the wrong target contrast (B), with trials where the target is maximally wrong (the top-right grid) predicted to be harder. Note that in this contrast the distractor is always a perfect match for the instruction from the egocentric perspective, holding the nature of the distractor constant. Although the diagonal arrangement of the digits within the grid varied, only these specific stimulus pairings were used to calculate the contrasts of interest, because they allowed a measurement of one effect while keeping the other constant. Shared-perspective trials in which the avatar was at the bottom of the grid, and grids with a target 6 and unrelated distractor (e.g., the bottom-right example), did not form part of the calculations of these two contrasts.
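The two contrasts described in Fig. 1 can be sketched as simple differences between condition means. This is a minimal illustration only: the condition labels and the response times below are hypothetical, and the sketch is not the analysis code used in the study.

```python
from statistics import mean


def contrast_effects(rts_by_condition):
    """Return (right_distractor_effect, wrong_target_effect) in ms.

    Hypothetical condition labels, all for opposite perspective trials:
      "baseline"             target 4, related distractor (top-left grid)
      "unrelated_distractor" target 4, unrelated distractor (bottom-left grid)
      "wrong_target"         target 6 seen as a 9, related distractor (top-right grid)
    """
    baseline = mean(rts_by_condition["baseline"])
    # Contrast A: same target, distractor no longer egocentrically correct.
    right_distractor = baseline - mean(rts_by_condition["unrelated_distractor"])
    # Contrast B: same kind of distractor, target now identifiably another digit.
    wrong_target = mean(rts_by_condition["wrong_target"]) - baseline
    return right_distractor, wrong_target


# Invented correct-trial RTs (ms) for one hypothetical participant.
example = {
    "baseline": [1500, 1600],
    "unrelated_distractor": [1100, 1200],
    "wrong_target": [1600, 1700],
}
print(contrast_effects(example))
```

Both differences share the baseline condition, which is what allows one effect to be measured while the other is held constant.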

It has often been suggested that domain-general executive functions might serve to reduce egocentric biases (Brown-Schmidt, 2009; Lin, Keysar, & Epley, 2010), but usually on the understanding that egocentric biases result from the presence of right distractors. Immediately after the perspective-taking task, we gave participants a Simon task (Simon & Rudell, 1967), which provides a measure of the ability to inhibit distracting information, known as the Simon effect. By correlating the Simon effect with the two effects of interest, we could check whether executive function predicts the ability to ignore right distractors or to select wrong targets.

Method

Participants

We considered medium effect sizes the minimum of interest for the right distractor and wrong target effects (based on the contrasts shown in Fig. 1, one-tailed). A power analysis using G*Power 3.1.9.5 found that a 95% chance of detection required approximately 44 participants. All participants were required to be aged 18–35 years, be native English speakers, have normal or corrected-to-normal vision, and demonstrate a minimum 60% accuracy on the task (chance being 25%). Participants were recruited using the University of Essex online recruiting system and were compensated with course credit. Ethical approval was obtained from the University of Essex Science and Health Ethics Sub-Committee. Total participation time was approximately 30 minutes. We recruited 47 participants; after removals following accuracy checks, N = 43 remained for the analyses (Mage = 19 years, range: 18–24; 36 females, six males, one non-binary).
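As a rough check on the sample size reasoning, the following sketch uses a normal approximation for a one-tailed paired comparison. It is not the G*Power computation, which uses the noncentral t distribution and can differ by a participant or so. The inputs d = 0.5 (the conventional ‘medium’ value) and alpha = .05 are assumptions, since neither figure is stated explicitly in the text.

```python
from math import ceil
from statistics import NormalDist


def approx_paired_n(effect_size, alpha, power):
    """Normal-approximation sample size for a one-tailed paired t test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-tailed critical value
    z_power = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_power) / effect_size) ** 2)


# Assumed inputs: d = 0.5, alpha = .05, power = .95.
print(approx_paired_n(0.5, 0.05, 0.95))  # prints 44
```

Under these assumptions the approximation agrees with the figure of approximately 44 participants reported above.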

Materials and procedure

Perspective-taking task

Participants were instructed that they would hear a target number and that they should locate this number from an avatar’s perspective, and then press a button that corresponded to where they themselves saw it. For example, if the target was in the top-right position in the grid from their perspective, the participant should press the top-right button. They were told to respond as quickly as possible and to use the forefinger of their left hand for the left-sided buttons and the right hand for the right-sided buttons.

Each trial began with a blank (blue) screen and a cue (1,000 ms) via headphones, which was either ‘four’ or ‘six’ and always spoken in a female voice (the avatar was described as female). At 250 ms after the cue, an empty 2 × 2 grid appeared (100 ms), followed immediately by the avatar (wearing a red cap, seen from above), the target (a 4 or a 6), which was always upright from the avatar’s perspective, and the distractor. The distractor was always in the diagonally opposite square to the target. On related-condition trials (50%), the distractor was the target digit rotated 180 degrees. On unrelated-condition trials (50%), the distractor was a different digit (a 6 if the target was a 4, and vice-versa) but upright from the avatar’s perspective. On half the trials the avatar shared the participant’s perspective (shared perspective trials), and on half she was located above the grid and saw the scene upside down (opposite perspective trials). We included shared perspective trials to ensure that the egocentric response was sometimes the correct one, but together with unrelated trials with a target 6 (bottom-right image in Fig. 1) these did not form part of the analyses. Responding terminated the trial; if 3,500 ms elapsed without a response, the trial terminated automatically. A blank screen then appeared for 1,000 ms prior to the next trial.

Before performing the task, participants completed 16 warm-up trials, four each of the shared/related, shared/unrelated, opposite/related, and opposite/unrelated trial types, each further subdivided into two ‘four’ cue trials and two ‘six’ cue trials, with feedback. The experimental block consisted of 64 randomly presented trials, equally divided among all trial types and grid location such that, for example, there were 32 shared perspective trials, 16 of which occurred with a related distractor, eight of which with the target cue ‘six’, appearing twice in each of the four grid squares.
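The structure of the experimental block can be illustrated with a short sketch that crosses the design factors and shuffles the result. The field names are hypothetical, and the sketch assumes that ‘grid squares’ refers to the target’s location; it is not the presentation script used in the study.

```python
import random
from itertools import product

PERSPECTIVES = ("shared", "opposite")
DISTRACTORS = ("related", "unrelated")
CUES = ("four", "six")
SQUARES = ("top-left", "top-right", "bottom-left", "bottom-right")
REPEATS = 2  # each combination appears twice in each grid square


def build_block(seed=None):
    """Return the 64 experimental trials in random order."""
    trials = [
        {"perspective": p, "distractor": d, "cue": c, "target_square": s}
        for p, d, c, s in product(PERSPECTIVES, DISTRACTORS, CUES, SQUARES)
        for _ in range(REPEATS)
    ]
    random.Random(seed).shuffle(trials)
    return trials


block = build_block(seed=1)
print(len(block))  # prints 64
```

Crossing 2 perspectives × 2 distractor types × 2 cues × 4 squares × 2 repetitions gives the 64 trials described above.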

Simon task

The Simon task also consisted of 16 practice trials (with feedback as before) followed by a block of 64 experimental trials, randomly presented and equally divided between congruent/incongruent and red/green squares. Each trial began with a fixation cross for 150 ms, followed by a 350-ms blank interval and then the stimulus square for 400 ms on either the left or right side of the screen. Participants were instructed to press either 3 or 9 on the top row of the keyboard according to the colour of the square (key/colour mappings counterbalanced across participants), not its position. The 6 on the top row was aligned with the centre of the screen. They were told to be as quick but also as accurate as possible. Participants could respond during the stimulus presentation and for up to 900 ms of blank screen afterwards. On congruent trials, the location of the correct key corresponded to the spatial location of the square, and on incongruent trials it did not. The difference in RT (incongruent minus congruent trials) is the Simon effect (Simon & Rudell, 1967), a measure of the ability to inhibit information on an irrelevant (spatial) dimension.
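The Simon effect for a single participant can be sketched as follows. The trial format and the example RTs are invented for illustration, and restricting the means to correct responses is an assumption of this sketch rather than a detail stated above.

```python
from statistics import mean


def simon_effect(trials):
    """Return mean incongruent RT minus mean congruent RT, in ms.

    `trials` is a list of (is_congruent, is_correct, rt_ms) tuples.
    Only correct responses are used (an assumption of this sketch).
    """
    congruent = [rt for is_congruent, is_correct, rt in trials
                 if is_correct and is_congruent]
    incongruent = [rt for is_congruent, is_correct, rt in trials
                   if is_correct and not is_congruent]
    return mean(incongruent) - mean(congruent)


# Invented RTs for one hypothetical participant.
example = [
    (True, True, 390), (True, True, 396),
    (False, True, 420), (False, True, 424),
    (False, False, 300),  # error trial, excluded from the means
]
print(simon_effect(example))  # prints 29
```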

Results

Accuracy was high (M = 95%, 95% CI [93%, 96%]). There was a total of 13 trials with no response (time-outs), which were classified as errors. None of the RT variables deviated from normality (Shapiro–Wilk tests >.5), but all accuracy variables did. All correct trials were included in the RT analyses (all >259 ms).

Right distractor effect

Participants were on average 387 ms slower (SE = 38 ms) to select a target 4 from the avatar’s perspective when the distractor was a match (4) from the self-perspective than when it was not (6), t(42) = 10.11, p < .001, d = 1.542, BF10 > 1000, one-tailed. A Wilcoxon signed-rank test found accuracy was also lower (M = 89% vs. 97%), W(43) = 40, p = .001, d = .557, one-tailed. Participants thus demonstrated a right distractor effect.

Wrong target effect

Participants were on average 76 ms slower (SE = 41 ms) to select a target 6 that looked like a 9 than an ambiguous target (an upside-down 4), t(42) = 1.898, p = .032, d = .289, BF10 = 1.6, one-tailed. A Wilcoxon signed-rank test found accuracy was also lower (M = 85% vs. 89%), W(43) = 158.5, p = .039, d = .267, one-tailed. Participants therefore also demonstrated a wrong target effect.

Comparison of effects

The right distractor effect was larger than the wrong target effect, MDiff = 311 ms, 95% CI [175, 448], t(42) = 4.604, p < .001, d = .702, BF10 = 293, two-tailed.
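The reported effect sizes are consistent with the standard conversion for paired designs, d = t / √N. The sketch below applies that conversion to the t values reported above with N = 43; that the effect sizes were computed in this way is an inference from the numbers, not something stated in the text.

```python
from math import sqrt


def dz_from_t(t_value, n):
    """Cohen's d for a paired (within-participant) t test: d = t / sqrt(n)."""
    return t_value / sqrt(n)


N = 43
reported_t = {
    "right distractor effect": 10.11,
    "wrong target effect": 1.898,
    "difference between effects": 4.604,
}
for label, t_value in reported_t.items():
    print(label, round(dz_from_t(t_value, N), 3))
```

The three rounded values (1.542, 0.289, and 0.702) match the d statistics given for the corresponding tests.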

Relationship with the Simon task

There was no evidence of a relationship between the size of the Simon effect (Congruent RT = 393 ms; Incongruent RT = 422 ms; MDiff = 29 ms, 95% CI [21, 38], t(42) = 7.063, p < .001, d = 1.077) and either the size of the right distractor effect, r(43) = .012, p = .939, or the wrong target effect, r(43) = .049, p = .756.

New analyses of previous data

To test for the robustness of these effects, we ran the same tests on the data from the original study by Samuel, Legg, et al. (2019a). There are differences between the present study and the original experiments, most notably the inclusion in the latter of trials from both 90-degree perspectives around the grid, but the fundamental contrasts indicated in Fig. 1 were nevertheless present. We used one-tailed t tests for the contrasts or one-tailed Wilcoxon signed-rank tests where the distribution of at least one cell was not normal. There were significant wrong target effects in both experiments, Exp. 1: MDiff = 115 ms, t(30) = 2.04, p = .025, d = 0.366, BF10 = 2.3; Exp. 2a–b combined: MDiff = 81 ms, W(61) = 1277, p = .009, r = .215; and significant right distractor effects in both experiments, Exp. 1: MDiff = 598 ms, t(30) = 7.194, p < .001, d = 1.292, BF10 > 1000; Exp. 2a–b combined: MDiff = 561 ms, W(61) = 1885, p < .001, r = .611. The present experiment thus represents a third demonstration of independent effects of a right distractor and a wrong target, in each case with similar magnitudes and effect sizes.

Other results

We conducted a 2 (target: 6 vs. 4) × 2 (distractor: related vs. unrelated) × 2 (perspective: shared vs. opposite) repeated-measures analysis of variance (ANOVA) on mean RTs for correct trials (see Table 1). This was to confirm that the task conformed to the expectation that opposite perspective and related distractor trials would be harder. The analysis found the expected main effects of perspective, MShared = 1,009 ms, MOpposite = 1,358 ms, F(1, 42) = 167.327, MSE = 62473, p < .001, ηp2 = .799, and distractor, MRelated = 1,338 ms, MUnrelated = 1,028 ms, F(1, 42) = 237.425, MSE = 34825, p < .001, ηp2 = .850. There were also significant interactions between perspective and target, F(1, 42) = 19.353, MSE = 13129, p < .001, ηp2 = .315, owing to longer response times on ‘six’ trials from the opposite perspective, and between perspective and distractor, F(1, 42) = 27.988, MSE = 24424, p < .001, ηp2 = .400, owing to longer RTs on related distractor trials from the opposite perspective.

Table 1 Mean response times and standard errors

Discussion

Results from the present study showed that the presence of a right distractor and the requirement to select a wrong target both contributed independently to the difficulty of a perspective-taking task. Analyses of earlier data showed that these effects are robust, occurring twice before in previous research (Samuel, Legg, et al., 2019a). Our results therefore imply a reconfiguration of our understanding of what egocentric bias actually is, because egocentricity has traditionally been defined in terms of difficulty ignoring what is correct from one’s own perspective. Overall, our data suggest that this right distractor effect is the larger bias, but not the only bias; the test-appropriate effect size measurements for the wrong target effect (ds = 0.289, 0.366, r = .215, BFs10 = 1.6, 2.3) indicate that it is a small-to-medium effect overall, compared with a consistently powerful right distractor effect (ds = 1.292, 1.542, r = .611, BFs10 > 1000).

How important is the wrong target problem in perspective-taking? Given the difficulty in devising stimuli to tease apart the two effects found here, it is highly likely that quotidian, real-world conflicts of perspective require solutions to both problems simultaneously to generate an appropriate response. For instance, if I am asked by someone opposite me to pass an object that is on their right (my left), I need both to ignore what is on my right and to select what is on my left to succeed. Although less intrusive, the wrong target problem is therefore likely to occur with only slightly less frequency than the right distractor problem under such conditions.

At least two important theoretical considerations follow from these findings. The first concerns the source of the wrong target effect. Some scholars support the idea that theory of mind is to some extent domain specific (Baron-Cohen, 1995; Cohen, Sasaki, & German, 2015; Leslie, German, & Polizzi, 2005; Leslie & Thaiss, 1992), while others argue that more generalized processes are involved (Gopnik & Wellman, 1992), and that low-level alternative explanations exist for some important results in the field (Heyes, 2014a, 2014b; Santiesteban, Catmur, Hopkins, Bird, & Heyes, 2014; Santiesteban, Shah, White, Bird, & Heyes, 2015). At present, it is not clear whether the wrong target effect is generated by a generalized error avoidance process (Footnote 1). Future research might attempt to relate the two using an error avoidance task with no perspective-taking element. Support for a role of general error avoidance would weaken the argument for domain specificity, or further caveat its remit. However, underpinning this hypothesis is a further question, namely how generalizable the wrong target effect in perceptual perspective-taking might be to analogous tasks involving mental states such as beliefs or desires. While the logic of the wrong target effect applies to all such cases, we have so far only demonstrated it in perceptual perspective-taking.

Secondly, and crucially, regardless of the source of the wrong target effect, it should manifest only in tasks which require an outward response. This would place the wrong target effect in a later perspective selection phase rather than an earlier perspective calculation phase (Baillargeon, Scott, & He, 2010; Qureshi & Monk, 2018). In contrast, the right distractor effect should be present for both explicit and implicit tasks, such as violation of expectation and anticipatory looking paradigms. In recent years, the results of such tasks have reduced the age at which false belief understanding is thought to emerge from around 4 years (Wellman et al., 2001) to shortly after the first year (Onishi & Baillargeon, 2005; see also Tauzin & Gergely, 2018), and indicated that chimpanzees understand that others can have false beliefs (Krupenye, Kano, Hirata, Call, & Tomasello, 2016). It has been proposed that removing the requirement to respond allows young infants, whose ability to select between perspectives is underdeveloped, to succeed (Baillargeon et al., 2010; though see Heyes, 2014a). An alternative or perhaps complementary explanation is that implicit tasks do away with the wrong target problem. In support of this possibility, in our task the requirement to select the avatar’s perspective was the same whether the target was wrong or merely ambiguous, and therefore the wrong target effect demonstrates extra difficulty over and above perspective selection alone.

There are caveats we should apply to our wrong target contrast. Firstly, we could only create targets that contrast in terms of their recognizability. Our reasoning here is that a 9 is ‘more wrong’ because it is identifiable as something other than the cue, but the upside-down 4 is ambiguous and is thus wrong in a more limited sense. This is not precisely the same as a target that is clearly wrong and a target that is not, but rather a proxy for such a contrast, which might be empirically impossible to create in its purest form. We therefore allow that, at its most basic, the wrong target effect shows that the right distractor effect does not hold a monopoly over difficulty in such tasks. An additional caveat is that our results are based on a single paradigm, and with numerical stimuli only. Further research would be useful in determining whether independent effects are also found in other tasks and with other stimuli. However, as we described in the Introduction, it is difficult to conceive of tasks and stimuli that allow each effect to be measured separately.

Finally, the results of the Simon task showed no relationship between either effect and our measure of executive function. The relevant Pearson’s r figures (r = .012 and .049, respectively) suggest that any such effect would be too small to be of interest (Footnote 2). This is problematic for accounts that treat egocentric bias as predicated at least in part upon such processes, but given the plurality of the forms of executive function and the means of measuring them (Miyake & Friedman, 2012; Miyake et al., 2000), we suggest further research is necessary before drawing firm conclusions. However, the absence of any relationship allows us to rule out the possibility that the independence of the wrong target and right distractor effects is an artefact of variable demands upon executive control.

Conclusion

The difficulty in taking perspectives that conflict with our own has usually been ascribed to the difficulty in ignoring our perception of what is correct. Overall, our results identify a right distractor problem and a wrong target problem, both of which must be solved to arrive at a correct judgment about other perspectives.