Introduction

Despite a broad consensus on the behavioural phenotype of autism in the guidelines of ICD-10 and DSM-IV-TR, unifying cognitive characteristics are a matter of ongoing research and debate. In addition to the three prominent circumscribed neuropsychological approaches to autism (i.e., attention to detail, executive dysfunction, lack of theory of mind; see Hill and Frith 2003, for a review), research has been carried out into the more general intellectual capacities of this population. Findings indicate that autism can appear at all levels of intelligence, although there is probably a peak in the mentally retarded range. However, the reported rate of coexisting mental retardation varies considerably in epidemiological studies (e.g., 100% in Wignyosumarto et al. 1992, and 40% in Baird et al. 2000). Furthermore, individuals with autism produce spiky IQ profiles in multidimensional tests, with weaknesses on verbal subtests and marked strengths on visuospatial subtests such as the Block Design (e.g., Happé 1994). The weak central coherence account of autism has been derived from such findings (Shah and Frith 1993).

A reliable estimation of IQ in autism is important for many reasons: intelligence has proven a good predictor of outcome (Gillberg and Steffenburg 1987) and guides the selection of intervention type. Conversely, a systematic underestimation of intelligence may further increase the stigma that some individuals with autism experience and may adversely affect opportunities in everyday life (e.g., employment).

The most widely used and best-studied IQ batteries in autism are the Wechsler Intelligence Scales for Children and Adults (WIS; Wechsler 1958, 1991). It has been recommended that the WIS should be applied in clinical practice whenever possible, “because they provide valid measures across a large number of relevant constructs and yield profiles of functioning that can be readily translated into intervention objectives” (Klin et al. 2005, p. 788). It has also been pointed out that the WIS are preferable for matching IQ levels in scientific settings (Mottron 2004).

The Coloured and Standard Raven’s Progressive Matrices (RPM) (Raven et al. 2003) are nonverbal visual reasoning tests. They are considered to be culturally fair power tests and are often used as unidimensional measures of fluid intelligence. Recently, Dawson et al. (2007) reported children and adults with autism to score about 30 percentile points higher on the RPM than on the WIS. As the authors deem the RPM a paramount metric of reasoning and problem solving, and therefore a valid measure of general IQ, they conclude that intelligence has been underestimated in individuals with autism. Put differently, this conclusion suggests that while RPM do allow fair IQ testing in autism, the WIS do not, which would dispute the utility of the WIS as the standard IQ measure for clinical or research purposes. Given the WIS’s wide distribution, such a postulate requires additional evidence such as longitudinal data on prognostic validity of both scales regarding functional outcome. As a first step, the current study sought to replicate and elaborate the results by Dawson et al. on RPM versus WIS performance in autism.

Method

Participants

The total sample comprised n = 48 individuals with idiopathic autism (AUT), n = 28 individuals with other psychiatric diagnoses and no family history of an autism spectrum disorder [CLIN: attention deficit hyperactivity disorder (ADHD) (7), conduct disorder (CD)/ADHD (5), CD (3), social phobia (4), dyscalculia (2), personality disorder (2), language disorder (2), dyslexia (1), mental retardation (1), mutism (1)] and n = 25 neurotypical controls (NT) (see Table 1). The CLIN and NT groups were included to test whether possible RPM versus WIS differences are specific to AUT. CLIN and NT were not matched to AUT, but age- and sex-independent normative values were applied, and statistics did not yield significant effects of age, sex, or other potentially confounding variables on the criterion (see Results).

Table 1 Demographic information and IQ measures for the individuals with autism (AUT), the mixed clinical (CLIN) and neurotypical (NT) control groups

Participants with autism and other clinical diagnoses fulfilled the ICD-10 research criteria for the conditions. In addition, probands with autism met the autism algorithm cut-offs on the German versions of the Autism Diagnostic Interview-Revised (Bölte et al. 2006) and the Autism Diagnostic Observation Schedule (Rühl et al. 2004), which have demonstrated good reliability and validity (Poustka et al. 1996; Bölte and Poustka 2004). Participants of the AUT and CLIN groups were consecutive inpatients and outpatients, and were recruited within the clinical routine of the Department of Child and Adolescent Psychiatry at Frankfurt/M. University. The neurotypical control participants were students, volunteers of other ongoing research projects at the department, and a few (4) unaffected siblings of individuals with a condition unrelated to the autism spectrum (mutism, eating disorder, social phobia, Down syndrome). All study participants and their relatives, respectively, gave informed written consent and the study was approved by the local ethics committee.

Measures

The German forms of the WIS for Children–Third Edition (Tewes et al. 1999) (WISC) and WIS for Adults–Revised (Tewes 1991) (WAIS) were used in this study (see Table 1). The WIS are the most frequently administered tests to measure general intelligence in international clinical practice, which holds true also for German child and adolescent psychiatry (Bölte et al. 2000). They comprise nonverbal and verbal scales, yielding separate performance and verbal IQs and, as a composite of both, a general or Full Scale IQ. The verbal subtests are Information, Similarities, Arithmetic, Comprehension, and Vocabulary. The non-verbal subtests are Picture Completion, Coding, Picture Arrangement, Block Design, and Object Assembly. The WIS are realisations of the concept of intelligence by David Wechsler, which intends to assess intellectual capacities multidimensionally with special reference to daily activities using items of face validity.

The German adaptations of the Coloured (CPM), the Coloured Board-form (CPMFB) (puzzle), and the Standard RPM (SPM) were used (Bulheller and Häcker 1998, 2002) (see Table 1). The CPM assesses attention to detail (Part A), pattern matching (Part AB) and the ability to analyze and reason about nonverbal stimuli (Part B). Each part contains 12 items. The SPM are similar and require pattern recognition and analogy to choose among a multiple-choice array of six to eight possibilities for the correct answer in a series of black-and-white designs that have a portion missing. The answers depend on appreciating spatial, design, or numerical relationships for pattern matching or completion. There are 60 items divided into 5 parts (A, B, C, D, E). In the CPMFB, each pattern or matrix is separately presented in the form of a board from which the part required for completion has been removed. The options from which a choice has to be made are available as movable pieces. By placing a selected piece in position, people can see the results of their judgement. Use of the Board Form makes the testing easier for certain clinical patients. The CPMFB form was applied if participants had difficulties performing one of the paper-pencil RPMs.

Procedure and Data Analysis

RPM and WIS were routinely administered in random order as part of the diagnostic process. As the current study is retrospective in nature, all assessments were made independently of the study’s research question. Pearson correlations were computed to determine quantitative associations between IQ measures. RPM IQ scores, WIS Full Scale IQ, WIS verbal IQ, WIS performance IQ, and WIS scaled scores (for subtests) were used for data analyses. In addition, percentile ranks (PR) were used for reporting of results. Repeated measures ANOVAs with type of IQ test (WISC/WAIS; CPM/SPM/CPMFB) as within-group factor and diagnostic group (AUT, CLIN, NT), age group (below versus above 16 years; compare Dawson et al.), gender (male, female) and IQ group (WIS IQ above versus below 85; normative intelligence versus learning disability/mental retardation) as between-group factors were computed to determine effects (Table 1). IQ groups were introduced to examine whether RPM versus WIS differences might be associated with qualitative aspects of intelligence. The normative WIS IQ subgroup (>85) (M = 111.1, SD = 11.8) consisted of 16 participants with AUT, 13 CLIN and all 25 NT. There were 46 males and 8 females with a mean age of 18.9 years (SD = 9.0). The learning disability/mental retardation WIS IQ subgroup (<85) (M = 60.4, SD = 17.5) included 32 individuals with AUT and 15 CLIN (32 males, 15 females), with a mean age of 14.4 years (SD = 5.3).
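The relation between standard scores and percentile ranks can be illustrated with a minimal sketch. It assumes normally distributed norms with the usual Wechsler metrics (IQ: mean 100, SD 15; subtest scaled scores: mean 10, SD 3); the function name percentile_rank is hypothetical, and the norm tables actually used for the German test forms may differ from this idealised model.

```python
from statistics import NormalDist


def percentile_rank(score, mean=100.0, sd=15.0):
    """Percentile rank of a standard score under an assumed normal norm distribution."""
    return 100.0 * NormalDist(mu=mean, sigma=sd).cdf(score)


# IQ metric (assumed mean 100, SD 15): the cut-off of 85 lies one SD below the mean.
pr_cutoff = percentile_rank(85)

# Scaled-score metric (assumed mean 10, SD 3), e.g. a subtest mean of 8.6.
pr_subtest = percentile_rank(8.6, mean=10.0, sd=3.0)

print(f"IQ 85 -> PR {pr_cutoff:.1f}; scaled score 8.6 -> PR {pr_subtest:.1f}")
```

Under these assumptions an IQ of 85 corresponds to a percentile rank of about 16, which is why a difference expressed in percentile points is not a fixed number of IQ points: the same IQ gap covers more percentile points near the mean than in the tails.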

Results

Mean WIS Full Scale IQ, WIS Verbal and Performance IQ, as well as mean RPM IQ are listed in Table 1. Although RPM IQ was higher in the autism group, the difference was much smaller (9 versus 30 percentile points) than reported by Dawson et al. In fact, repeated measures ANOVAs yielded a significant interaction between diagnostic group and IQ group (F(1, 82) = 4.8, p = 0.03, partial η² = 0.06), indicating that a WIS versus RPM difference was present only in individuals with autism in the low functioning IQ range and not in the mixed clinical or control group (see Fig. 1). There were no main effects of type of IQ test (within subject), diagnostic group, age group or sex (between subjects) (F < 1.7, p > 0.19, partial η² < 0.02), nor any other interaction effects.
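As a plausibility check, the effect size reported for the interaction can be recovered from the F statistic and its degrees of freedom, because partial eta squared for a single effect equals F·df_effect / (F·df_effect + df_error). The sketch below is illustrative only; the function name is hypothetical, and the input values are the rounded figures given in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared of one effect, derived from its F ratio and degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Interaction of diagnostic group and IQ group: F(1, 82) = 4.8
eta_p2 = partial_eta_squared(4.8, 1, 82)
print(round(eta_p2, 2))  # 0.06
```

The result, 4.8 / 86.8 ≈ 0.055, rounds to the reported value of 0.06.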

Fig. 1 RPM and WIS IQs (Full Scale, Verbal, Performance) of the individuals with autism (AUT) and the clinical (CLIN) and neurotypical (NT) control groups, stratified by WIS IQ (above versus below 85). Error bars are 2 SEMs

We found the characteristic IQ profile of autistic individuals, with Block Design and Object Assembly yielding the highest scores (M = 8.6/PR = 32 and 8.2/27, respectively) and Comprehension and Picture Arrangement the lowest scores (M = 4.9/PR = 5 and 5.4/7, respectively). Correlations between WIS Full Scale scores and RPM IQ were high for all three groups (AUT: r = 0.74, CLIN: r = 0.99, NT: r = 0.64; p < 0.001). While in AUT associations were somewhat higher between WIS Performance IQ and RPM IQ (r = 0.76) than between WIS verbal IQ and RPM IQ (r = 0.59), this was not the case for the CLIN (r = 0.99–0.99) and NT groups (r = 0.61–0.65) (all p’s < 0.0001).

Discussion

Our data confirm the findings by Dawson et al. of higher mean RPM than WIS IQs in individuals with autism. However, differences were much smaller in our study (9 vs. 30 percentile points) and restricted to autistic individuals with WIS IQs in the lower functioning range. In addition, as the sample presented by Dawson et al. had a higher average IQ level than ours, the identified total sample difference might have been even smaller than 9 percentile points in a group matching Dawson et al.’s in this respect. The observed difference between WIS and RPM IQs in lower functioning individuals with autism seems to be specific to autistic conditions, since clinical control subjects with low IQs did not show this effect.

The association between the WIS and RPM IQ scores was high not only in the neurotypical and mixed clinical groups, but also in the autistic group. Thus, both tests seem to share a similar source of variance and probably measure qualitatively comparable capacities in autism, even though in the total group RPM findings on average yield a higher estimate of level of intellectual functioning than the WIS.

The sizeable discrepancy between WIS IQ and RPM IQ in people with autism who have WIS IQ estimates in the learning disability or mentally retarded range, and perhaps language difficulties, argues for the use of more culture-reduced, language-fair tests like the RPM for those individuals. In contrast, for individuals with autism with average or above-average IQ, the WIS seem to provide fair judgements of intellectual capacities while also providing a much broader picture of cognitive functioning, owing to the WIS’s multidimensional design. In fact, for high functioning individuals on the autism spectrum who present with superior language skills, such as those with Asperger syndrome, an opposite bias might be introduced. Here, the exclusive use of Raven IQs could lead to a systematic underestimation of true intellectual abilities. Numerous previous studies have shown relatively better Wechsler verbal IQ in individuals with Asperger syndrome and relatively better Wechsler performance IQ in individuals with autism (e.g., Miller and Ozonoff 2000; Mottron 2004; Ehlers et al. 1997; Hayashi et al. 2008). Those patterns have been equated with higher fluid intelligence in autism and higher crystallized intelligence in Asperger syndrome (Ehlers et al. 1997), which would indeed predict underestimation of IQ in Asperger syndrome by tests of fluid intelligence such as the Raven.

In conclusion, the claim that intelligence has been underestimated in autism seems somewhat premature, even though this study did not use exactly the same methodology as Dawson et al. For instance, different kinds of samples (NT, CLIN and AUT children, adolescents versus NT and AUT) and versions of the RPM (German forms of CPM, SPM, CPMFB versus Canadian SPM) and WIS (German forms of WISC-III and WAIS-R versus Canadian WAIS-III) were investigated, which limits comparability. We found that a discrepancy between WIS IQ and RPM IQ is present only in low functioning individuals with autism, while scores are comparable in high functioning individuals. Therefore, we suggest that while the WIS should be considered the first choice IQ measures in high functioning individuals with autism, additional testing with nonverbal, culture fair scales like the RPM is recommended in the lower end of the spectrum.