Introduction

Theory of Mind is the ability to attribute subjective mental states to oneself and to others (Baron-Cohen et al. 2000). This ability is crucial to the understanding of one’s own and other people’s behaviour. Autism Spectrum Disorders (ASD) are strongly associated with impairments of Theory of Mind skills. Following the great number of studies that have established this impairment (Yirmiya et al. 1998), interventions have been developed worldwide to improve the Theory of Mind skills of individuals with autism (Hess et al. 2008). Despite these efforts, little is known about whether, when, where and for whom these treatment programs work in autism (Koenig et al. 2009). Research to date is hampered by small samples, the absence of randomized controlled trials, and poorly designed outcome measures (Smith et al. 2007; Lord et al. 2005). The current study describes a randomized controlled trial on the effectiveness of a Theory of Mind training in 40 school aged children with ASD and normal intelligence (High Functioning Autism Spectrum Disorders, HFASD).

Large scale, randomized controlled trials (RCTs) are rare in autism, even when considering the wider domain of social skills training. Several recent reviews of studies on social skills treatment indicate that programs are often not even based on manual curricula, and fail to identify primary outcome measures (Seida et al. 2009; White et al. 2007; Matson et al. 2007). Given the variation in the nature of both treatment and outcome variables, it is unsurprising that the empirical support for the effectiveness of social skills training in autism is regarded as incomplete. However, currently available evidence is generally regarded as positive, and these programs continue to be used widely (Reichow and Volkmar 2010; Rao et al. 2008).

Considering the more narrowly defined Theory of Mind training programs, one might expect clearer treatment programs and outcome measures. Indeed, despite the variety in terminology—including trials that target ‘Theory of Mind’ (Ozonoff and Miller 1995), ‘social cognition’ (Turner-Brown et al. 2008), ‘mental state- or mind reading’ (Golan and Baron-Cohen 2006), ‘picture-in-the-head teaching’ (Swettenham 1996), or ‘thought-bubble training’ (Wellman et al. 2002)—there is a clear overlap. All programs focus children on the internal, subjective mental representations of themselves and those around them. When considering this overlapping construct, a next question is what outcome measures are sensitive to change and clinically relevant to the construct of Theory of Mind (Smith et al. 2007; Scahill et al. 2009).

While Theory of Mind research initially relied on behavioural tasks in primates (Premack and Woodruff 1978), most studies with children are characterized by a strong focus on conceptual measures. Starting in the early 1980s, ‘false belief’ type tasks were developed, where a child was asked to explain the behaviour of an ignorant story character in a hypothetical scenario (Wellman et al. 2001). These were later adapted to advanced levels, suitable for adolescents and adults, which included more complex scenarios, requiring participants to reason about embedded mental states, such as what one person thinks about another person’s thoughts (second order belief reasoning) (White et al. 2009). Theory of Mind is usually also linked to, and sometimes even identified with, tasks that focus on emotion recognition (Baron-Cohen et al. 2001). However, when considering the function of Theory of Mind, which involves the mastering of social situations, it is surprising to find so little research on how individuals actually use their Theory of Mind skills in social interactions (Begeer et al. 2010). Direct measures of these types of behaviour may be difficult to use in RCTs, which is why most studies have used informants (parents, teachers), who report on children’s real life application of Theory of Mind skills through questionnaires. Applied or naturalistic measures of advanced Theory of Mind have also used videotaped social interactions or emotional expressions, which may be suitable as outcome measures (Golan and Baron-Cohen 2006).

To date, two RCTs have specifically focused on the effectiveness of training Theory of Mind skills. Fisher and Happe (2005) selected 6–15 year olds with ASD and varying cognitive abilities, based on their poor Theory of Mind skills. The training, based on Swettenham et al.’s (1996) picture in the head procedure, included up to 10 individual 20–25 min sessions, spread over 5–8 days. Compared to a control group, the trained children, whose abilities ranged from intellectual disability to normal intelligence, showed marked improvements in their performance on Theory of Mind tasks, which remained stable at follow up, between 6 and 12 weeks later. However, the training did not affect children’s emotion recognition skills, nor their daily life Theory of Mind use, as reported by their teachers (Fisher and Happe 2005). The second RCT showed the effect of a computer program for training emotion recognition. Six to 18 year olds with ASD and varying cognitive abilities improved compared to a control group on emotion recognition in cartoons and second order Theory of Mind reasoning, but not on their recognition of facial emotional expressions (Silver and Oakes 2001).

Furthermore, two controlled (but not randomized) trials with HFASD adults indicated that training the recognition of emotions and mental states with a computer program improved performance on those measures that were also used in the training sessions. No improvements were found on tasks targeting emotion and mental state recognition skills that were not used in the training (Golan and Baron-Cohen 2006). The effectiveness of the ‘Social Cognition and Interaction Training’ was shown in adults with HFASD who improved on Theory of Mind skills but less on social communication, relative to a non-treated control group (Turner-Brown et al. 2008). In short, the examination of various programs for training Theory of Mind skills in ASD and HFASD shows that generalization of Theory of Mind skills to daily life behaviour is often poor.

Besides the elementary question of effectiveness, it is pivotal to predict which children benefit most from treatments (Koenig et al. 2009). When considering the larger number of studies that have addressed social skills training, age is regarded as an important factor, and it is generally recommended to treat children as young as possible (Dawson et al. 2010; Lord et al. 2005; Granpeesheh et al. 2009). However, ASD is diagnosed at an average age of 5.7 years, which highlights the need for effective treatments at later ages (Shattuck et al. 2009). Furthermore, individuals with less severe subtypes of autism, such as PDDNOS, and higher intellectual functioning generally respond better to interventions (Lord et al. 2005), though findings on the effect of IQ have been mixed (Beglinger and Smith 2005).

To date no RCTs have specifically focused on children with high functioning ASD, despite the wide use of these types of training in high functioning samples. The current study describes a randomized controlled trial on the treatment effect of a Theory of Mind training. This program has been shown to increase the Theory of Mind skills of children with social handicaps (Steerneman et al. 1996), and its efficacy in children with autism has been suggested in an open trial (Gevers et al. 2006). To our knowledge, this is the first RCT for this training program, and one of the few for Theory of Mind treatments worldwide.

The study included 40 children with HFASD aged 8–13 years old. During 16 one-hour weekly group meetings children were trained on precursors of Theory of Mind (perception, imitation, emotion recognition, pretence), elementary Theory of Mind understanding (belief and false belief understanding) and advanced Theory of Mind understanding (second order reasoning and the use of irony and humour). Parents were provided with five psycho educative sessions, and they were actively involved in the training. We specifically aimed to examine the effect of the current training using outcome measures with different levels of complexity. This enabled us to more closely delineate at what specific levels the training was effective. Both mental state and emotion understanding were therefore measured at elementary and advanced levels. Furthermore, we assessed self-reported empathy, and we asked parents to judge the social skills of their children before and after the treatment.

It was hypothesised that the treatment group would increase their Theory of Mind skills in comparison to the waitlist control group on all domains. Furthermore, it was expected that the conceptual skills, as measured with the mental state and emotion understanding tasks, would improve at a higher rate than the practical skills, as measured by the parent reports. Finally, it was expected that children with PDDNOS would benefit more from the treatment than children with Autism or Asperger’s syndrome.

Method

Participants

Participants were 40 children with HFASD, aged 8–13 years old. Inclusion criteria were a clinical diagnosis within the Autism Spectrum (Autism, Asperger Syndrome or PDD-NOS), and IQ scores within the normal range (70 or above), as measured by the short version of the Dutch Wechsler Intelligence Scale for Children (WISC-III; De Kort et al. 2002). The ASD participants were recruited from an academic centre for child and adolescent psychiatry in Amsterdam, the Netherlands (de Bascule). Their diagnostic classification was based on assessments on multiple occasions by a child psychiatrist and multiple informants (psychologists and educationalists). All participants fulfilled established diagnostic criteria according to the DSM-IV-TR (APA, 2000). For all children, additional diagnostic information was obtained with the Social Responsiveness Scale (SRS; Constantino et al. 2003), the parent-reported Autism Spectrum Quotient (Baron-Cohen et al. 2001, 2006), or both. These measures confirmed the ASD diagnoses of all participants (Table 1). The VU University of Amsterdam Human Ethics Committee approved the project. Informed consent was obtained from parents and assent from children.

Table 1 Baseline Demographics and Clinical Features of Treatment and Waitlist Groups (n = 36)

Intervention

The Theory of Mind training (Gevers et al. 2006; Steerneman et al. 1996) is a manualized treatment program that includes 16 weekly sessions of approximately 1.5 h each, provided to 5 or 6 children simultaneously, with a mutual age difference that does not exceed 3 years. All sessions are supervised by certified therapists. During the last 15 min of each session, children are joined by their parents, who are informed about the meeting and briefed about the assignments for the next meeting. In addition, parents attend monthly training sessions, where they are informed about the content of the training and about the progress of their children, and given suggestions on how to promote social cognition through playing games and storytelling.

The training includes 53 structured sessions, which increasingly focus on the use of Theory of Mind skills. After initially highlighting precursors of Theory of Mind, such as listening to others, making acquaintance, perception and imitation, children are focused on the difference between fantasy and reality, learning to assess a social situation and the recognition of others’ intentions and emotions (happiness, anger, fear and sadness). Following this, attention is given to elementary Theory of Mind skills, such as placing oneself in the thoughts and feelings of another (first order mental state reasoning). Deceit and deception are central elements of this type of mental state reasoning, and children are focused on deceived others who have a different perspective on reality than they do. Furthermore, the use of imagination is stimulated, and children practise the understanding of humour. A final stage of the training involves practising second order mental state reasoning, where embedded mental states are attributed to others (e.g., ‘where does Mary think that John thinks he will find the toy?’). Each activity is dealt with in a specific session, and the sessions become increasingly difficult as the training proceeds. See Steerneman (1994) for detailed descriptions of all sessions, and Steerneman et al. (1996) for various specific examples of the training approach.

To sustain treatment integrity, therapists received training in the procedure and were required to follow a manual that delineates each treatment on a session-by-session basis (Steerneman et al. 1996). Furthermore, a random 10% sample of therapy sessions was videotaped for content review and intervention adherence. Therapists received ongoing clinical supervision and training throughout the study.

Randomization

Randomization took place at an individual level after the baseline measurement and 1 week before the start of the interventions. Subjects were randomized into an intervention group (n = 20) and a waiting list control group (n = 20). An independent researcher made the allocation schedule. The waitlist control group started with the intervention after their waitlist period. Unfortunately, we were unable to obtain data from three children (one from the treatment group, two from the waitlist group) at Time 2 (see Fig. 1).
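As an illustration of individual-level allocation into two equal arms, the following minimal sketch shuffles 40 participant identifiers and splits them into two groups of 20. It is not the procedure used in the trial, whose allocation method is not described beyond the details given here; the function name `allocate`, the numeric identifiers and the seed are all hypothetical.

```python
import random


def allocate(participant_ids, seed):
    """Shuffle participants and split them evenly into two arms.

    Simple (unstratified) randomization is an assumption made for this
    sketch; the trial text does not specify the allocation method.
    """
    rng = random.Random(seed)  # seeded so the schedule can be reproduced
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "intervention": sorted(shuffled[:half]),
        "waitlist": sorted(shuffled[half:]),
    }


# Hypothetical identifiers 1..40 and an arbitrary seed.
schedule = allocate(range(1, 41), seed=2024)
```

Fixing the seed lets an independent researcher regenerate and audit the same schedule later.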

Fig. 1 CONSORT diagram showing disposition of the entire sample

Outcome Measures

The ToM Test

The Theory of Mind test (Muris et al. 1999; Gevers et al. 2006) is a 72-item standardized interview targeting the Theory of Mind understanding of 5–13 year old children. The interview includes stories and drawings, and focuses on precursors of ToM (22 items, including perception and imitation, emotion recognition, pretence and physical-reality distinction), elementary Theory of Mind (38 items, including first order belief reasoning and false belief understanding) and advanced Theory of Mind (12 items, including second order belief understanding and the understanding of complex humour). The internal consistency of the task ranges from .80 to .92. Concurrent validity of the ToM test with traditional ToM tasks is high (r between .37 and .77) and the test–retest reliability was satisfactory (ICC between .80 and .99). Discriminant validity of the ToM task was supported by worse performances of children with ASD compared to typically developing children, but also compared to children with other psychiatric disorders (i.e., ADHD and anxiety disorders). Furthermore, these differences remained when controlling for intelligence (Muris et al. 1999).

The Levels of Emotional Awareness Scale for Children (LEAS-C)

The LEAS-C is a performance based assessment of the structure and complexity of emotional awareness, including 12 scenarios where the subject has to imagine him or herself in a hypothetical interaction with another person. Subjects are asked to describe the feelings of themselves and the other person following scenarios that are designed to elicit the emotions happiness, anger, sadness or fear. The scoring of the LEAS-C determines the degree of complexity in children’s responses with regard to their own and the other person’s emotions. Four levels of awareness are differentiated with increasing complexity: 1. Somatic (e.g., ‘I would feel sick’), 2. Action (e.g., ‘I would feel like smashing the wall’), 3. One-dimensional emotions (e.g., ‘I would feel happy’), or 4. Multiple emotions (e.g., ‘I would be angry but also a bit sad’). Besides the awareness of these emotions in oneself and others, specific scores were awarded for responses that acknowledged one’s own and others’ mixed emotions (e.g., ‘I would feel happy because I won, but sad for my friend’), or complex emotions (e.g., ‘She would be jealous’). Mixed and complex emotions could be attributed to the self (one point), the other (two points) or both self and others (three points). Internal consistency ranges from .64 to .71, and convergent validity was acceptable, based on significant correlations with emotion expression (.30) and emotion comprehension (.28) (Bajgar et al. 2005; Gooren et al. 2008).

Self Reported Empathy

The Index of Empathy for Children and Adolescents (Bryant 1982) measures empathy in children of 6 years and older, targeting various emotional reactions. It includes 10 dichotomous items such as “It makes me sad to see a boy who can’t find anyone he can play with”. The scale shows adequate internal consistency (ranging from .68 to .79), good test–retest reliability (ranging from .81 to .83) and strong convergent validity with affect based empathy scales (r = .76), and is not related to reading achievement (Bryant 1982).

The Children’s Social Behaviour Questionnaire (CSBQ) is a 49-item parent questionnaire, relating to six scales: behaviour and emotions not optimally tuned to the social situation, reduced social contacts and social interests, orientation problems in time, place, or activity, difficulties in understanding social information, stereotypical behaviour, and fear of and resistance to changes (higher scores indicate more problem behaviour). This questionnaire has been shown to discriminate between typically developing (TD) children and children with ASD. The internal consistency is .90, and interrater reliability is .80 (de Bildt et al., 2009; Hartman et al., 2006). In the current sample, the reliability of the CSBQ was .92.

Data Analysis and Presentation

Data analysis focuses on estimating the size and clinical importance of the effects in the population based on the sample data, using Time 1 and Time 2 difference score means, 95% confidence intervals (CIs), effect sizes, and clinical importance, as recommended by CONSORT (Moher et al. 2001) and the American Psychological Association (2001). One-way between groups analyses of variance are reported to compare how groups, on average, differ in gains. Between-group effect sizes were calculated according to Cohen’s d. Effect sizes of 0.8 can be assumed to be large, while effect sizes of 0.5 are moderate, and effect sizes of 0.2 are small (Cohen 1988).
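The quantities named above can be sketched in a few lines. The example below computes a between-group Cohen's d and a two-group one-way ANOVA F statistic on gain scores, and shows how the two are related when there are only two groups. The pooled standard deviation as denominator is an assumption (the text only states that effect sizes follow Cohen's d), the gain scores are invented for illustration, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treated, control):
    """Between-group Cohen's d, using the pooled standard deviation
    (an assumption; other denominators are possible)."""
    n1, n2 = len(treated), len(control)
    pooled_var = ((n1 - 1) * variance(treated)
                  + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treated) - mean(control)) / sqrt(pooled_var)


def one_way_f(treated, control):
    """F statistic of a one-way ANOVA with two groups, df = (1, n1 + n2 - 2)."""
    n1, n2 = len(treated), len(control)
    grand_mean = mean(treated + control)
    ss_between = (n1 * (mean(treated) - grand_mean) ** 2
                  + n2 * (mean(control) - grand_mean) ** 2)
    ss_within = (n1 - 1) * variance(treated) + (n2 - 1) * variance(control)
    return ss_between / (ss_within / (n1 + n2 - 2))


def d_from_f(f_value, n1, n2):
    """With two groups, F = t^2, so |d| = sqrt(F * (1/n1 + 1/n2))."""
    return sqrt(f_value * (1 / n1 + 1 / n2))


# Hypothetical Time 2 minus Time 1 gain scores, invented for illustration only.
treatment_gain = [6, 4, 7, 3, 5, 8, 2, 5]
waitlist_gain = [2, 3, 1, 4, 0, 3, 2, 1]

d = cohens_d(treatment_gain, waitlist_gain)
f = one_way_f(treatment_gain, waitlist_gain)
```

Because the two statistics are tied together in the two-group case, `d_from_f(f, 8, 8)` returns the same value as `d` for the invented data. This identity offers a quick plausibility check of a reported F and d pair, provided the group sizes are known and a pooled standard deviation was used.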

Results

Preliminary Analyses

Baseline differences in demographic and clinical characteristics were investigated using Chi-square tests and analyses of variance (ANOVA). No significant group differences were found in terms of chronological age, gender, diagnoses, verbal, non-verbal and full scale IQ, SRS and AQ score. In Table 1, the descriptive information is shown for all demographic data.

Improvement in Theory of Mind as a Function of Training

The effect of the Theory of Mind training was first examined on the total score of the Theory of Mind test. The treatment group showed significantly more improvement in their Theory of Mind understanding than the control group, F(1, 34) = 5.01, p < .03, d = .75. Of the three subscales of the Theory of Mind test, the elementary Theory of Mind tasks in particular, including first order and false belief reasoning, showed a strong improvement compared to the control group, F(1, 34) = 9.00, p < .01, d = 1.00. No treatment effect was found on the more basic precursors of Theory of Mind, including perception, imitation and emotion recognition, or on more advanced ToM, including questions regarding second order beliefs and humour (Table 2).

Table 2 Means (SD), difference scores, effect sizes and 95% confidence intervals (CI), and summary statistics for one-way between groups ANOVA for treatment and waitlist control groups

Improvement in Emotion Understanding as a Function of Training

Training effects on the level of emotional awareness were analysed separately for own, others, and total emotional awareness. No difference was found between the increase on these measures for the treatment and the control groups. However, the treatment group improved significantly compared to controls on their report of mixed emotions, F(1, 31) = 6.39, p < .05, d = .84, and complex emotions, F(1, 31) = 13.26, p < .01, d = 1.19 (Table 2).

Improvement in Self Reported Empathy and Parent Reported Social Skills as a Function of Training

No effects of the training were found on self reported empathy, F < 1, or on parent reported social skills, F < 1 (Table 2).

Diagnosis and Co-Morbidity

When treatment effects were analysed separately in children with PDDNOS or high functioning autism (HFA)/Asperger’s syndrome, the PDDNOS group performed in keeping with the overall analysis. Treatment effects were found on the total Theory of Mind scores, F(1, 18) = 3.29, p < .05 (one-sided), d = .79, and in particular on the elementary Theory of Mind tasks, F(1, 18) = 6.23, p < .05, d = 1.08, and the understanding of mixed emotions, F(1, 18) = 5.06, p < .05, d = .97, and complex emotions, F(1, 18) = 6.37, p < .05, d = 1.09. Interestingly, the HFA/Asperger’s syndrome group only showed improvement on their understanding of complex emotions, F(1, 9) = 6.18, p < .01, d = 1.41, but not on any of the other measures. Neither self reported empathy nor parent reported social skills showed treatment effects in the separate diagnostic groups.

In addition, running the main analyses without the children with comorbid ADHD or learning disorder showed the same results as the overall analysis including all children: the treatment was effective on conceptual Theory of Mind skills and complex emotion understanding, but failed to show an effect on self reported empathy or parent reported social behaviour.

Discussion

While Theory of Mind training programs are no novelty for individuals with autism, their effectiveness has not been well researched, with only two RCTs to date. Establishing evidence-based treatments for autism is particularly urgent because of the variety of novel and alternative treatments that are offered to this group, some of which can be dangerous to the child (Wadman 2008d).

The current study was conducted to examine the effect of training Theory of Mind skills on conceptual understanding of Theory of Mind and emotion, self reported empathy and parent reported social skills in children with HFASD. Effects of the treatment were found on the conceptual understanding of Theory of Mind, in particular on the ability to reason about beliefs and false beliefs, and on the understanding of mixed and complex emotions. Other conceptual measures such as precursors of Theory of Mind (perception and imitation, emotion recognition, pretence and physical-reality distinction), advanced Theory of Mind (second order reasoning and understanding humour), and the awareness of emotions were not affected by the treatment. Furthermore, the Theory of Mind training did not improve children’s social skills according to their parents, nor their self reported empathy.

As expected, the treatment had a higher impact on conceptual abilities than on daily life skills, as reported by children’s parents. Within the conceptual domain, it should be noted that no effects were found on the precursors of Theory of Mind and basic emotion understanding. This finding may be unsurprising given the children’s cognitive abilities and their average age of 10 years at treatment onset. However, it should be noted that the children did not perform at ceiling level on any of the precursor tasks (Table 2). These findings may be used to focus the treatment on more advanced levels of Theory of Mind and emotion understanding, though the absence of conceptual improvement on advanced Theory of Mind understanding may also indicate the limitations of their conceptual growth abilities.

The understanding of beliefs and false beliefs, which is a main focus of the treatment, was shown to improve relative to the waitlist control group. This confirms, for the first time in an RCT, that children with HFASD can be taught to understand beliefs, desires and emotions (Hadwin et al. 1996; Swettenham 1996). Still, the current finding may be unsurprising, given that the treatment program includes a strong focus on belief and false belief reasoning, and, more importantly, because the extensive belief and false belief material from the Theory of Mind understanding scale was also used during the training. Effects on outcome measures incorporated in the treatment, together with an absence of effects on measures outside the program, have been found repeatedly in children with ASD (Golan and Baron-Cohen 2006; Fisher and Happe 2005; Turner-Brown et al. 2008), suggesting “teaching to the test” effects. Still, previously suggested links between Theory of Mind understanding and everyday social behaviour in ASD give some merit to training these conceptual skills: they may have some bearing on children’s actual behaviour (Peterson et al. 2009).

Generalization difficulties of individuals with ASD have been reported for decades (Klin et al. 2003). These difficulties may represent a strong tendency to conceive the world systematically, without making the uncertain assumptions that are needed to generalize skills to new situations (Golan and Baron-Cohen 2006). This tendency may be intrinsic to the autistic disorder, and one way forward may be to look for specific types of children with ASD that show better generalization skills. Alternatively, treatment programs may be specifically designed to train the generalization of behavioural skills rather than conceptual understanding. On a more fundamental level, it may even be questioned whether social behaviour should be conceived of as the generalization of social understanding. Young typically developing children often show adequate empathic behaviour while being unable to pass the most elementary Theory of Mind tasks, and adults who pass the most advanced conceptual Theory of Mind tasks have been shown to fail in direct measures of their perspective taking abilities in natural situations (Begeer et al. 2010). In addition, the ToM training may be too broad in its current form, and could be revised to focus on more specific areas, which may then be measured with specific outcome measures.

Children with PDDNOS may benefit more from the treatment than children with HFA/Asperger. While this finding is in need of replication, and more measures are needed to differentiate between PDDNOS and HFA/Asperger, it highlights an important issue with regard to treatment in autism. Children with ASD show large individual differences. While it is often difficult and time consuming to find appropriate treatments for each individual child, it would be extremely useful to delineate whether children with PDDNOS benefit more from this type of treatment than children with HFA/Asperger, for whom it is questionable whether they benefit from the treatment at all. Besides the DSM categories of the autistic disorder, it may also be useful to highlight other possible categories, based on age, IQ, or severity of the disorder. Unfortunately, due to the size of the current sample, we were unable to identify predictors or moderators of treatment effects in a meaningful way. Further limitations of the current study include the absence of diagnostic instruments such as the ADOS and the ADI-R, and the absence of follow up data. Furthermore, the fidelity checks on the treatment were based on 10% of the sessions, leaving the possibility that the other sessions had lower fidelity.

In short, the current study suggests that the Theory of Mind treatment could be a promising intervention for children with HFASD, but further study is indicated. The conceptual Theory of Mind understanding increased relative to a control group, while self reported empathy and parent reported social skills remained stable. An important issue for future studies is the use of more sensitive measures of daily life Theory of Mind skills. The current CSBQ questionnaire, like many parent questionnaires of social skills, focuses on broad domains of behaviour, which may have minimal sensitivity to change (e.g., ‘he or she lives in his or her own world’). Using ratings of specified Theory of Mind related behaviour over a fixed period of time may provide more sensitive outcomes. Furthermore, future studies could consider looking beyond social skills and highlighting how children experience their functioning before and after the treatment period, to test the possibility that treatment may not increase children’s objectively measured social skills, but could enhance their quality of life nonetheless, by increasing their self-esteem.