Introduction

Autism Spectrum Disorders (ASD) are a group of related developmental disorders that are characterized by impairments in reciprocal social interaction, language development and intentional communication, and restricted interests and stereotyped motor behaviors (American Psychiatric Association 1994). Once considered to be a rare disorder, ASD is now estimated to occur in as many as one in 150 births or even more (Centers for Disease Control and Prevention 2007). ASDs are almost universally regarded as life-long conditions, although the severity of cognitive, language, social and adaptive skill impairments varies widely among children and across time within children. However, in recent years, it has been claimed that a significant minority of children with well-documented ASD have recovered. In this paper, we will (1) define “recovery”, (2) present evidence that supports the phenomenon of “recovery” in ASD, (3) briefly review the evidence concerning child and treatment characteristics that can lead to “recovery”, and (4) suggest mechanisms that might underlie “recovery” from this neurological developmental disorder.

Defining “Recovery”

Improvement in all aspects of ASD, including language, adaptive skills, academics, social interaction, and decreased repetitive behavior, among others, has been well documented in the treatment literature, and especially in the studies that describe behavioral treatments (Filipek et al. 2000; Harris and Handleman 2000; Howard et al. 2005; Lord and McGee 2001; Myers and Johnson 2007; Sallows and Graupner 2005). However, except in a few behavioral studies (reviewed below), improved behavior and skills do not reach levels within the normal range. We present a general, and then a more specific, definition of what is meant here by “recovery”. What do the children recover from and what do they recover to?

In order to be defined as “recovered”, a child must first have a convincing history of ASD. Some of his/her development will have been delayed in onset, slow to progress, and/or abnormal in quality. To be considered “recovered”, the child must now be learning and applying a core set of skills at a level and with a quality that reaches the trajectory of typical development in most or all areas. A corollary of this is that there will probably have been a period in the child’s development in which his/her progress was more rapid than normal; in fact, accelerated learning has been reported by Sallows and Graupner (2005) and Howard et al. (2005). Furthermore, the recovered individual no longer meets criteria for any ASD.

The term “recovery” or “best outcome” was probably first used by the UCLA group, headed by Lovaas, in describing the outcomes of their program of intensive Applied Behavior Analysis (ABA) therapy (Lovaas 1987). They used the term to describe children whose IQ had risen into the average range and who were functioning in regular education classrooms. However, Mundy (1993) pointed out that it is not easy to demonstrate recovery to full normal functioning, and that children with high functioning autism who still show clear autistic symptoms might also have average (or higher) IQs and be able to function in a regular classroom. We agree that more needs to be demonstrated to warrant the term “recovered”.

In our current study of 8–18 year old children with a history of ASD who are now “recovered” or have reached “optimal outcomes”, we use the following specific definition (the definition might need to be modified for different age groups):

  • By history: (1) The child was diagnosed with an ASD in early childhood (i.e., by age 5) by a specialist (i.e. someone whose practice is at least 50% devoted to autism). (2) There was early language delay (either no words by 18 months or no word combinations by 24 months). (3) Review by one of our team, blind to current group membership, of early reports (age 2–5) and/or videotapes, with diagnostic formulations elided, confirms early ASD.

  • By current functioning: (1) The participant does not meet criteria for any Pervasive Developmental Disorder, including PDD-NOS (at least one symptom in the social domain plus one additional symptom), which generally means that no social symptom of ASD is present by best clinical judgment. (2) The participant does not meet the ASD cutoff on the social or communication domain of the Autism Diagnostic Observation Schedule. (3) Any special education services the participant receives are to remediate difficulties with attention, organization, or specific academic difficulties and not to address features of autism. (4) The participant is functioning without an individual assistant in a regular education classroom. (5) VIQ, PIQ, and FSIQ are all at 78 or above (that is, no more than about 1.5 standard deviations below average). (6) Vineland Communication and Socialization Scales are both at 78 or above.
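The criteria above are conjunctive: a participant must satisfy every history item and every current-functioning item. As an illustration only, they can be written as a single predicate. The sketch below is not part of the study protocol. All field names are hypothetical, each clinical judgment is reduced to a yes/no value that a clinician would supply, and the ADOS criterion is simplified to one flag covering both the social and communication domains.

```python
from dataclasses import dataclass, replace

# Threshold stated in the text: about 1.5 SD below a mean of 100 (SD 15).
SCORE_FLOOR = 78


@dataclass
class Participant:
    # Hypothetical field names, chosen only for this sketch.
    diagnosed_by_specialist_by_age_5: bool
    early_language_delay: bool
    blind_review_confirms_early_asd: bool
    meets_pdd_criteria: bool              # any PDD, including PDD-NOS
    meets_ados_asd_cutoff: bool           # simplified: either ADOS domain
    services_address_autism_features: bool
    in_regular_education_classroom: bool
    has_individual_assistant: bool
    viq: int
    piq: int
    fsiq: int
    vineland_communication: int
    vineland_socialization: int


def meets_history_criteria(p: Participant) -> bool:
    """History items (1)-(3): early specialist diagnosis, delay, blind review."""
    return (p.diagnosed_by_specialist_by_age_5
            and p.early_language_delay
            and p.blind_review_confirms_early_asd)


def meets_current_criteria(p: Participant) -> bool:
    """Current-functioning items (1)-(6)."""
    scores = (p.viq, p.piq, p.fsiq,
              p.vineland_communication, p.vineland_socialization)
    return (not p.meets_pdd_criteria
            and not p.meets_ados_asd_cutoff
            and not p.services_address_autism_features
            and p.in_regular_education_classroom
            and not p.has_individual_assistant
            and all(score >= SCORE_FLOOR for score in scores))


def is_optimal_outcome(p: Participant) -> bool:
    """True only when both the history and current-functioning sets hold."""
    return meets_history_criteria(p) and meets_current_criteria(p)


# Invented example values, not data from any participant.
example = Participant(True, True, True, False, False, False, True, False,
                      viq=102, piq=95, fsiq=99,
                      vineland_communication=90, vineland_socialization=85)
below_floor = replace(example, fsiq=77)  # one score just under the floor
```

Under these assumptions, `is_optimal_outcome(example)` holds, whereas `below_floor` fails on the full-scale IQ item alone, which shows that a single unmet item is enough to exclude a child from the group.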

These are the working criteria for our current study, although modifications may be made as we study the individual children. Some children with clear early ASD clinical pictures may show no convincing early language delay; in addition, some recovered children with excellent social skills with familiar adults and children have social anxiety with strangers and show mildly elevated ADOS scores as a consequence. It will also be seen that children who no longer meet criteria for an ASD but are functioning in the mentally-retarded range are not considered “recovered” by these criteria. In addition, there are many possible impairments or diagnoses that are not ruled out: For example, the child may have clinically significant problems with attention, learning disabilities, and psychiatric diagnoses such as anxiety (including social phobia and obsessionality) or depression. Also, the requirement of language delay, if it is retained, precludes an initial accurate diagnosis of Asperger’s Syndrome; the purpose of this exclusion is to eliminate children with normal early development with later emerging eccentric personalities. Some children that we include may have had an early Autistic Disorder or PDD-NOS diagnosis and then later received an Asperger’s diagnosis (before their apparent recovery), or they may have received an early diagnosis of Asperger’s despite language delay. The definition of recovery, furthermore, applies to the behavioral level, and is neutral with regard to neurobiological mechanisms by which this behavioral “recovery” is achieved.

Many researchers and clinicians are highly skeptical about the possibility of “recovery” in ASD, believing, for example, that if ASD is an organic condition, “recovery” is necessarily unattainable (Schopler et al. 1989). In particular, Mundy (1993) raises some cogent objections. He rightly criticizes the claim that “recovered” children function within the normal range emotionally and socially, and even cognitively, without the support of extensive documentation; a normal range Vineland and IQ score is insufficient. Mundy points out that weak executive functions can co-exist with an IQ in the normal range, and might be expected to characterize children with high functioning autism who have not recovered, as might obsessive or odd thoughts, or depression. In his comment on McEachin et al. (1993) (discussed further below), Mundy points out that about half of the ‘best-outcome’ children had elevated scores on a personality test, and that therefore “...it seems difficult to interpret these data as evidence for the achievement of normal functioning in the best-outcome group.” (p. 383).

However, such residual challenges do not, in themselves, refute the possibility of an optimal outcome. The claim of recovery from autism does not necessarily entail the claim of fully normal cognitive, social, and emotional functioning. Children who have recovered from autism are at risk for other disorders, and thus may not be fully normal. Research is required to identify these continuing vulnerabilities, both for what they can tell us about autism, and for what they can tell us about the children’s continued treatment needs.

How can “recovery” from a neurodevelopmental disorder be possible? There are at least three fundamental but not mutually exclusive answers. The first is that these children did not really have an ASD to begin with. We will review below the extensive pretreatment similarities between children who “recover” and those who do not. On the behavioral level, the two groups are very hard to distinguish. The second possibility is that there are forms of ASD that are alleviated with maturation alone. Third is that successful treatment moved children who otherwise would have retained the full ASD picture off the spectrum. Since most children who receive the best intervention do not recover, the treatment alone cannot be responsible. Some combination of child and treatment characteristics therefore seems the most likely possibility.

Having offered our definition of “recovery”, we will dispense with the quotation marks, keeping in mind that losing the behavioral characteristics of ASD is what is meant here by “recovery”. In some cases, “optimal outcome” will be used as a synonym for recovery.

Evidence for Recovery

Outcome research in the field of ASD has historically focused on broad-based measures of functioning (intellectual level, adaptive behavior, living and working situations) primarily in adults. Only recently have studies begun to focus on more specific outcome measures in older children and adolescents. We will briefly discuss some of the seminal outcome studies in adults, and then in younger children. This is by no means a comprehensive review of the many outcome studies; we will focus primarily on those that document cases in which autistic behavior and cognition disappear to the extent that an ASD diagnosis is no longer warranted and/or cognition or adaptive living skills are within the normal range.

In one of the first contemporary adult-outcome studies, Gillberg and Steffenburg (1987) found a generally poor prognosis for individuals who had been diagnosed with autism as children. Only one out of their 23 participants was living independently. Living independently, working full-time, being married, and having friends have generally been considered to be indicators of an optimal outcome in the outcome literature, at least for adults. Similar results were reported by Billstedt et al. (2005), who found that of 108 individuals followed from childhood, only four were living relatively independently and only one was in a long-term relationship; however, this study included both low- and high-functioning individuals and therefore would be expected to show a low proportion of good outcomes. In a study of 58 high-functioning adolescents and young adults with a history of autism, Ventner et al. (1992) found a wide range of outcomes. While a few individuals were doing quite well (which was defined as being mainstreamed in school or, if older, living independently), generally the individuals in this study still required extensive help in their daily lives and were quite dependent upon their parents.

Over the years, adult outcome studies, like that of Ventner et al. (1992), have frequently found a handful of individuals in their samples who have achieved an optimal outcome, including fully independent living and some successful relationships. Perhaps the first study to hint at the possibility of individuals with an ASD diagnosis losing the diagnosis was by Rutter (1970). In this early longitudinal outcome study, Rutter found that 1.5% of the original group were functioning normally on follow-up, while the rest were divided between “fair or good” adjustment (35%) and severely handicapped (60%). Higher numbers of recovered individuals in the more recent studies described below perhaps result from the great improvement in early intervention and educational services in later years.

In a review of the outcome literature, Seltzer et al. (2004) found that the core symptoms of autism tend to improve by adulthood, especially communication deficits. Restricted and repetitive behaviors become subtler and more complex. Seltzer and colleagues found that in a number of adult outcome studies it appeared that about 10–20% of the sample no longer met criteria for a diagnosis on the autism spectrum. They did note, however, that in the majority of outcome studies, the criteria for a good outcome are very poorly defined. It is also unclear if these individuals actually no longer met the criteria for a diagnosis on the autism spectrum since standardized diagnostic instruments were not always used.

More recently, studies that have examined only higher-functioning individuals have tended to find a higher proportion of individuals who are achieving good or optimal outcomes. Howlin et al. (2004) studied a group of 68 adults who had an IQ score greater than 50 as children. Although the majority of this sample was still living with their parents or in residential care, one-third of the sample was working and two of the 68 had married. This study found that social and adaptive outcomes were more highly correlated with verbal than with performance IQ. They concluded that having an IQ over 70 is necessary but not sufficient for an optimal outcome. Similar findings were reported by Szatmari et al. (1989). They assessed 16 very high-functioning individuals with a history of an ASD, and found that four no longer met criteria for any ASD as adults. Of these 16 individuals, one had married, one lived with a roommate, and three lived alone. Eight of them were able to manage their own finances. Perhaps most impressively, at least eight of the 16 scored within the normal range on all subscales of the Vineland Adaptive Behavior Scales. This is particularly striking because studies have generally found that adaptive behavior lags well behind IQ and remains problematic throughout the lifespan in individuals with ASD (Eaves and Ho 2004; Loveland and Kelley 1988, 1991; Ventner et al. 1992).

An early diagnosis of Asperger’s Syndrome (AS) carries a better prognosis than Autistic Disorder (AD). In a group of individuals with an early diagnosis of AS, 19 out of 70 were either employed or in school full-time and were either living independently (over age 22 years) or had two or more friends or a steady relationship (age 22 years and under) (Cederlund et al. 2008). Using the Diagnostic Interview for Social and Communication Disorders, they found that 12% of their AS sample no longer met criteria for a diagnosis on the autism spectrum. None of their AD group had moved off the spectrum or had achieved a relatively high level of functioning as adults; however, the majority of this group functioned in the mentally-retarded range as children. Similarly, PDD-NOS carries a better prognosis for recovery than Autistic Disorder (Lord et al. 2006; Sutera et al. 2007).

Clearly, there are a handful of high-functioning individuals on the autism spectrum who appear to improve to a great extent by adolescence or adulthood. These adult outcome studies, however, do not clearly demonstrate when this improvement may occur. Moreover, because the assessment is conducted so long after the original diagnosis, it is more difficult to assess which factors might be predictive of this optimal outcome. Some studies that have examined outcome in younger samples have begun to address these questions. Beadle-Brown et al. (2000), in a review of the child outcome literature, found that, generally, self-care, communication, and educational achievement tended to improve over the course of childhood and level off in adolescence. They found that the higher the IQ of the children, the greater the gains that were made. No reference was made to any optimal outcome studies in their review; however, very little research had examined optimal outcome in children with an ASD at the time this review was written.

Several more recent studies have examined outcome in middle childhood or adolescence. Sigman and Ruskin (1999) followed children longitudinally, with some cases dating as far back as 1979. In their sample of 51 children who were originally diagnosed with an ASD (at a mean age of 45 months), 17% lost their diagnosis of an ASD over time (at a mean age of 154 months). Fein et al. (1999) and Stevens et al. (2000) studied a large group of preschool children with ASD, 95 of whom were followed to school age (7 or 9 years old). At preschool, cluster analysis indicated that the children could best be classified into low- and high-functioning groups based on cognitive scores (with a nonverbal IQ of 65 the best dividing line and the two groups about equal in size). At school age, again, the ASD group was divided into a lower functioning group, which was now much bigger (n = 71), and a higher functioning group, which was smaller (n = 24). In general, as has been found by others, the lower-functioning preschool group tended to lose ground relative to peers, while the higher-functioning group tended to show improvement in standard scores. Although this paper did not explicitly discuss optimal outcome or loss of diagnosis, the smaller, high-functioning group showed mean verbal and nonverbal scores within the normal range, and few autism symptoms; many of them would probably have met our current definition for recovery. Gabriels et al. (2001) also found two clearly separable developmental trajectories in a group of individuals with ASD who were studied from preschool to school-age. Unlike Stevens et al. (2000), however, Gabriels and colleagues found that in both groups, IQ tended to increase across development. They hypothesized that the children might become more testable as their autistic symptoms wane over time.
Although no studies, to our knowledge, empirically assess whether the children with autism are actually more cooperative in testing situations over the course of development, many clinicians do report this observation, and it is certainly a factor to keep in mind when interpreting the results of these outcome studies.

An additional study reports on 11 cases of children with clear early histories of an ASD in which the clinical picture evolved into cases of ADHD with no autism, about equally divided into inattentive and combined type (Fein et al. 2005). As in other studies, more of these children had original PDD-NOS than AD diagnoses. Interestingly, nine of the 11 showed evidence of a regressive history, and ten of the 11 had recurrent ear infections. Eight had received intensive ABA therapy, while the other three had received intensive preschool classroom interventions that included some ABA methods. In contrast to the later age reported by Sigman and Ruskin (1999), the average age at loss of the ASD diagnosis was 7 years; in some cases, the children may have lost their ASD behaviors earlier, but the diagnosis was not withdrawn immediately because the clinician did not want to be premature or the child was not seen right away. However, age seven is consistent with the UCLA findings (McEachin et al. 1993). Some of the children in the Fein et al. study had some mild residual features, including social awkwardness (but more of the ADHD than autistic type), and mild perseverative interests. The authors suggest that attention problems may have been a core symptom in the early development of these children with ASD; when the other core symptoms resolve, the attention problems persist, resulting in a clinical picture for which “ADHD” is the best description.

A similar series of cases was reported by Zappella (1999, 2002, 2005a, b). These were young children with PDD who subsequently evolved into cases of ADHD and/or Tourette’s Syndrome. They were predominantly male, most showed a regressive course, the initial autistic behaviors resolved, and they were left with tics, many with co-morbid ADHD. Zappella also notes extensive family histories of tics and ADHD in this series. He also notes that none of his cases were treated with ABA, but most did receive a developmental therapy that is described in Zappella (2005a).

Were the apparently recovered individuals misdiagnosed in early childhood and did they not really have an ASD? Two types of evidence bear on this question: First is whether the children who show later recoveries are behaviorally distinct in early childhood (e.g. have milder or qualitatively different symptoms); this will be considered below. Second is the stability of an ASD diagnosis in early childhood. Several recent studies have documented the accuracy of ASD diagnoses in children under the age of 3 years old (Charman and Baird 2002; Cox et al. 1999; Eaves and Ho 2004; Gillberg and Steffenburg 1987; Kleinman et al. 2008; Lord 1995; Moore and Goodson 2003; Stone et al. 1999; Sutera et al. 2007; Turner and Stone 2007). In a review of the earlier literature, Kleinman et al. (2008) report that between 75% and 95% of children diagnosed before 3 years old retained an ASD or non-ASD diagnosis at later evaluation. They found that 81% of children retained an ASD diagnosis between the ages of 2 and 4 years old, and none gained an ASD diagnosis. Stability was particularly good for clinical judgment, Autism Diagnostic Observation Schedule (ADOS) diagnosis, and Childhood Autism Rating Scale (CARS) score, and less so for the Autism Diagnostic Interview (ADI). Eaves and Ho (2004) also assessed children at age 2 and 4 years old; three of the 49 children moved off the autism spectrum (one of the 34 with AD and two of the nine with PDD-NOS); 94% retained the ASD diagnosis. As in the Kleinman study, no children moved onto the autism spectrum between 2 and 4 years of age. Turner and Stone (2007) found somewhat lower stability: 68% of children diagnosed with an ASD at age 2 years retained that diagnosis at age 4 years, while Sutera et al. (2007) (using some of the same participants as Kleinman) reported that 82% of 2-year-olds with ASD retained the diagnosis.

Therefore, diagnostic stability even in children as young as two is good; although a number of children move off the spectrum in each study, the overall percent of children who retain their diagnosis ranges from 68% to 95%, and few if any children move onto the spectrum following an evaluation at age 2 years. This, in addition to the pretreatment similarity of recovered and persistent ASD children (see below), suggests that early misdiagnosis is not a major factor in apparent recovery.
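The retention figures above are simple proportions, and the arithmetic can be checked directly. The sketch below is illustrative only: the function name is hypothetical, the counts for Eaves and Ho (2004) are taken exactly as stated in the text (three of 49 children moving off the spectrum), and the other studies enter only as the percentages quoted above.

```python
def retention_pct(n_followed: int, n_moved_off: int) -> float:
    """Percent of followed children who kept their ASD diagnosis."""
    return 100 * (n_followed - n_moved_off) / n_followed


# Counts as stated in the text for Eaves and Ho (2004).
eaves_ho = retention_pct(n_followed=49, n_moved_off=3)

# Percentages quoted in the text for the individual studies.
reported = {
    "Kleinman et al. 2008": 81,
    "Eaves and Ho 2004": round(eaves_ho),
    "Turner and Stone 2007": 68,
    "Sutera et al. 2007": 82,
}

lowest = min(reported.values())
highest = max(reported.values())
print(f"Eaves and Ho retention: {eaves_ho:.1f}%")
print(f"Individual studies range from {lowest}% to {highest}%")
```

Under these assumptions the individual studies span 68% to 94%; the 95% upper bound quoted above comes from the range that Kleinman et al. (2008) report in their review of the earlier literature, not from one of these four samples.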

A number of studies have investigated optimal outcome specifically in older children, where the issue of diagnostic error is presumably less of a factor than the impact of treatment or the child’s characteristics. Lovaas and his colleagues (Lovaas 1987; McEachin et al. 1993) described a group of children who had undergone intensive behavioral therapy as young children and seemed to be indistinguishable from their typically-developing peers. These studies suggested that a significant proportion of children with autism can benefit appreciably from this early intervention. The benefits accrued include intellectual functioning within the average range and being mainstreamed into regular classrooms without requiring any extra support. Although these two studies have been criticized for their lack of experimental rigor in the assignment of individuals to the different treatment groups (Gresham and MacMillan 1998; Schopler et al. 1989), they still clearly document a group of children who had improved to function normally or close to normally. Several studies have failed to confirm the “best outcome” group (Anderson et al. 1987; Birnbrauer and Leach 1993); however, Howard et al. (2005) point out that the children in these studies started with lower IQs and did not receive treatment of comparable intensity or duration to that in the UCLA studies. Sallows and Graupner (2005) investigated the differences in outcome between clinic- and home-based behavioral interventions. However, rather than finding differences between groups they found differences within groups; that is, they found differences between ‘rapid’ and ‘moderate’ learners, who were evenly distributed across the clinic- and home-based intervention groups. The rapid learners (11 of the 23 children) made striking gains between intake to the study in preschool and follow-up 4 years after completion of treatment.
These gains occurred across many areas of functioning including language, adaptive behavior, autistic symptomatology, and intelligence; indeed their mean full-scale IQ increased from 55 to 104. Furthermore, eight of the 11 no longer met criteria for an ASD according to the ADI-R. Forty-eight percent of the children reached ‘best outcome’ status, scoring normally on tests of IQ, language, adaptive functioning, school placement, and personality, with mild elevations on some personality and diagnostic scales (two of the rapid learners were given parent scores in the clinically-significant range on “worrying,” and teachers rated one rapid learner as high on aggression). Three of these ‘best outcome’ children needed classroom aides for attention problems, and one would probably still meet criteria for ASD, but the remaining seven or eight children would probably meet our criteria for optimal outcome (OO), outlined above. Of these children, one still had language problems on the ADI, and one had rigid play, but no other autism features.

We have been conducting an additional longitudinal study of a group of optimal outcome children who no longer meet the criteria for an autism diagnosis on the ADI-R or the ADOS, and who have been mainstreamed into regular classrooms without the help of an aide (see above for research criteria for “optimal outcome”). At the first data collection point (Kelley et al. 2006), the 14 optimal-outcome children with a history of autism were between the ages of 5 and 9 years old. Although all of these children no longer carried a diagnosis on the autism spectrum and were mainstreamed without help, they continued to experience some subtle difficulties in certain aspects of language. Specifically, the optimal-outcome children performed well within the average range on tests of receptive vocabulary, grammar, and verbal memory (Kelley et al. 2006). They also demonstrated intact grammatical competence on less structured, experimental tasks and a narrative task. However, the optimal outcome group continued to experience difficulties with the more semantic aspects of language. They had more difficulty in understanding the certainty differences between mental state verbs, such as think and guess, versus know. Categorical induction was problematic; in comparison to their typically-developing peers, they had difficulties extending the properties of an object to a new object based on the semantic label alone. Interestingly, this inability to extend properties was more problematic with animate than with inanimate categories. In addition to their semantic language difficulties, the optimal outcome group continued to experience problems with social-cognitive and pragmatic language tasks. They scored lower than their typically developing peers on tests of theory of mind, or the ability to understand that others have mental states that may be different from one’s own.
Although the optimal outcome group experienced no grammatical difficulties while telling a story from a wordless picture book, they were less likely than their typically-developing peers to discuss the goals of the main character and the causes of various events in the story, elements that are considered key to a well-structured narrative. Moreover, the optimal outcome group was more likely than the control group to misinterpret what was going on in the story.

Since some of these children were still quite young and had only recently lost their diagnosis or been mainstreamed, it was unclear whether they would continue to close the gap with their typically-developing peers or whether behavioral and cognitive problems would re-emerge as they entered adolescence. Thus, we decided to re-evaluate them approximately 3 years after the first study to further explore their strengths and weaknesses. Moreover, in addition to comparing the optimal-outcome children to their typically developing peers, we also compared them to a group of children with ASD whose intelligence was in the average range, but whose diagnoses clearly persisted. This high-functioning autism (HFA) group was expected to perform at the same level as the optimal-outcome and typically-developing groups on standardized tests of vocabulary and grammar, but show clear autistic symptomatology, and semantic and pragmatic language difficulties (Tager-Flusberg 1997). All children in the study were between the ages of 8 and 13 (Kelley et al., in preparation). They were tested on a large battery of language tests assessing grammar, semantics, and a number of pragmatic tasks. Additionally, we assessed their adaptive behavior as measured by the Vineland, as well as socio-emotional functioning as assessed by the Behavior Assessment System for Children. The pattern of test results was consistent across all measures: the typically-developing children had the highest average scores, followed by the optimal-outcome group, and the HFA group showed the lowest level of functioning. Additionally, the optimal outcome group, as a whole, scored within the normal range on all tasks and only the high-functioning ASD group scored in the impaired range on some of the standardized tests.
Specifically, the HFA group scored in the impaired range on tests of pragmatic language, verbal memory, expressive language, general communication and socialization, and daily-living skills. Our typically-developing group, which was matched on age and socioeconomic status to the OO group, was above average in intelligence, however, and thus there were a number of areas in which the optimal outcome group scored significantly lower than the typically developing group, including pragmatic language. The OO group also scored lower than the typically developing group (but well within the average range) on parent ratings of attention problems, atypical behavior, and depression. On the numerous other tasks that we used to assess these groups, the children in the optimal-outcome group were statistically indistinguishable from their typically developing peers. In sum, we appear to have found a group that, with the possible exception of some very subtle pragmatic deficits, is currently functioning at the same level as their typically developing peers, and we are continuing to follow this group.

Predictors of Outcome: Child Characteristics

Although the mechanisms of improvement for any given child are not known, a combination of treatment characteristics and the child’s own characteristics probably contributes to cognitive, behavioral, and diagnostic status in later life. A few studies have examined early predictors of development and symptomatology at an outcome point. It should be noted, however, that the vast majority of these studies examined predictors of relative severity of behavioral and cognitive impairments at outcome, rather than optimal-outcome status. It can be presumed that the indicators of relative improvement would also predict recovery, but this has not yet been proven.

The most consistent prognostic indicator is early communication and language abilities (e.g., Mawhood et al. 2000; Ventner et al. 1992). Luyster et al. (2007) found communication scores at age 2 years, and especially age 3 years, to predict language and other outcomes at age 9 years. After they covaried for nonverbal IQ and age at the final time period, age three receptive and expressive language scores significantly predicted age nine verbal and nonverbal IQs, receptive and expressive language, and ADI-R/ADOS composite score. Use of symbolic and communicative gestures at age 2 also predicted age nine verbal IQ, expressive language, and adaptive skills. The predictive value of expressive and receptive language and of gesture suggests the importance of early symbolic and imitative skills as foundational skills that may make intervention more effective. Charman et al. (2003) also examined potential predictors of language outcomes in young children with ASDs (evaluated at 20 months and 42 months of age). They found that the children who met criteria for AD in early life had significantly poorer language outcomes than children with PDD-NOS diagnoses, as well as more impaired initial joint attention. Language outcomes were also positively associated with early joint attention but not with play or “goal detection”; however, there were significant floor effects on these variables. In contrast to Luyster’s study, initial NVIQ in this study was not related to later expressive or receptive language skills. Toth et al. (2006) found that early imitation, joint attention, and toy play were good predictors of later language, and that joint attention, in particular, mediated the relationship between social engagement and language.

Dietz et al. (2007) found a high correlation between IQ scores as measured by the Mullen Scales of Early Learning at 24 months and 43 months of age. However, there was significant heterogeneity in the scores; 12 of their 39 children had their scores increase by at least one standard deviation (15 points) and three children displayed a commensurate decrease in their scores. The children whose scores increased had milder initial delays. In the Sigman and Ruskin (1999) study, early joint attention skills predicted later expressive language and early play skills, and nonverbal communication abilities predicted peer engagement in later childhood. Goldstein (2002) found that verbal imitation, IQ and age, together, strongly predicted language outcome. Gabriels et al. (2001), in their study of differential outcome after 3 years of treatment, found that no early characteristics significantly predicted outcome, although initial IQ scores tended to predict outcome. There was a 21-point gap in initial developmental IQ between the children who responded best to intervention and the “low outcome” group; this difference was 51 points at follow-up.

None of the aforementioned studies specifically examined predictors of optimal outcome. In the Sallows and Graupner (2005) study mentioned above, initial status was examined to see what would predict membership in the “rapid learning” group after 4 years of ABA-based treatment. They found that optimal outcome at follow-up was predicted by a combination of pretreatment verbal and nonverbal imitation skills, language ability, and social interest, where higher initial skills in these areas predicted better functioning post-treatment. The most accurate regression model was produced by a combination of pretreatment verbal imitation and ADI Communication score. Individual scores, however, were not strong predictors: in examining the pretreatment scores of the rapid vs. moderate learners, only NVIQ showed a substantial difference (14 points); most other scores of the two groups were very close (e.g. Vineland Communication 61 vs. 59, receptive language 39 vs. 38). In their 4- to 6-year follow-up of 27 children who received services at an ABA-based preschool center, Harris and Handleman (2000) also found that higher baseline IQ predicted higher later cognitive functioning. In addition, they found that age when treatment was begun was related to classroom placement in elementary school. Those who were younger when treatment was initiated (mean age = 42 months) were more likely to be in inclusive classrooms while those who were older (mean age = 54 months) when they began treatment were more likely to be in special-education classrooms. There was no correlation between age when treatment began and initial IQ. Autistic symptom severity, as measured by the CARS, was not predictive of later cognitive functioning or classroom placement.

Sutera et al. (2007) reported on 13 preschoolers who were diagnosed with an ASD at age two but who failed to meet criteria for an ASD diagnosis at follow-up at approximately age four. These 13 children were drawn from a sample of 73 children who were evaluated initially after screening positive on the M-CHAT (Robins et al. 2001) and who were given an ASD diagnosis. There was a significant difference in early diagnosis: of the children who were initially diagnosed with PDD-NOS, 39% exhibited optimal outcome while only 11% of children with Autistic Disorder did. Aside from diagnosis, the only significant differences between the optimal outcome and persistent ASD groups at age two were in the motor area: Mullen Fine Motor and Vineland Motor Skills were significantly higher initially in the optimal outcome group. This difference in early motor skills may be due to these measures serving as proxies for underlying cognitive and/or neurological impairments. There were no other statistically significant differences between those children with optimal outcome and those who remained on the spectrum on initial nonverbal skills (as measured by the Mullen), expressive or receptive language skills (from the Mullen and Vineland), socialization (Vineland), or measures of autistic symptoms (CARS and number of DSM-IV-TR symptoms), except that receptive language and IQ scores showed trends; a larger sample of optimal outcome children might well show significantly higher scores in these areas. Similar predictive value of diagnostic status was found by Lord et al. (2006), who followed children with a diagnosis of either PDD-NOS or Autistic Disorder from age 4 years to age eight or nine. Only one child with Autistic Disorder lost the diagnosis by age 9 years, whereas almost half of the children with PDD-NOS lost the diagnosis.

Remington et al. (2007) also examined a subset of preschool-aged children who achieved a “best outcome” status in their study comparing an EIBI group with a treatment-as-usual group. The children were evaluated at baseline and then again after 2 years of intervention. They did not look at diagnostic change but defined best outcome as “reliable and clinically-significant change” in IQ scores based on Jacobson and Truax’s criteria (1991). They found that five of the 23 children who received early intensive behavioral intervention and three of the 21 comparison group children showed such change. Exploratory analyses suggested that the most positive responders had higher initial IQ, mental age, Vineland Communication and Socialization scores, more behavioral problems as reported on the Developmental Behavior Checklist Autism Algorithm, and fewer hours of individual intervention in the second year as compared to those children whose IQs diminished. Preliminary examination of our own M-CHAT sample also suggests that for some measures, there may be an inverse relationship between number of hours of intervention and outcome; we presume that this is because the highest functioning and most rapid responders are eventually given fewer hours of service.
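
To make the “reliable change” part of the Jacobson and Truax (1991) criteria concrete, the following is a minimal sketch of the reliable change index, RC = (post − pre) / S_diff, where S_diff = sqrt(2 × SE²) and SE = SD × sqrt(1 − r). The function names, the standard deviation of 15, and the test-retest reliability of 0.90 are illustrative assumptions, not values reported by Remington et al. (2007); the separate “clinically significant” cutoff is not shown.

```python
import math


def reliable_change_index(pre, post, sd, reliability):
    """Jacobson-Truax reliable change index for one child's pre/post scores.

    sd          -- standard deviation of the measure (assumed, e.g. 15 for IQ)
    reliability -- test-retest reliability of the measure (assumed value)
    """
    # Standard error of measurement of a single score.
    se_measurement = sd * math.sqrt(1.0 - reliability)
    # Standard error of the difference between two scores.
    s_diff = math.sqrt(2.0 * se_measurement ** 2)
    return (post - pre) / s_diff


def is_reliable_change(pre, post, sd, reliability, criterion=1.96):
    """True if the change is unlikely (p < .05) to reflect measurement error alone."""
    return abs(reliable_change_index(pre, post, sd, reliability)) > criterion


# Hypothetical scores, for illustration only.
print(is_reliable_change(70, 85, sd=15, reliability=0.90))
print(is_reliable_change(70, 80, sd=15, reliability=0.90))
```

Under these assumed values a change has to exceed roughly 13 IQ points before it counts as reliable, which is why a criterion of this kind is stricter than simply asking whether a score went up.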

In the Fein et al. (1999) and Stevens et al. (2000) studies mentioned above, almost all the lower-functioning group at preschool stayed in the lower-functioning school-age group, whereas the higher-functioning preschool group had divergent outcomes, with some going into the lower-functioning school age group and the remainder forming the small, high-functioning school age group. When examining specific preschool predictors of group membership at school age, cognitive and developmental variables (receptive vocabulary standard scores, nonverbal IQ, Vineland Socialization and Communication) strongly differentiated the groups, while degree of autistic symptomatology in any domain failed to differentiate the groups. As with some of the aforementioned adult studies, therefore, early appearing higher intelligence level was a necessary, but not sufficient, factor in predicting optimal school-age outcomes. Turner and Stone (2007) found that the children who were more likely to move off the spectrum were those who were under 30 months of age when diagnosed, had milder social impairment, and higher intelligence levels. They did not find any differences between those who moved off the spectrum and those who did not on the amount of intervention received, although this may have been an issue of restricted range. Turner and Stone conclude that, “maturation alone may lead to significant improvement in symptoms for some children” (p. 799).

One promising predictor is early response to intervention. Although not strictly speaking a pretreatment characteristic, a small number of studies document that rapid responses to intervention are positive predictors for later outcomes (Newsom and Rincover 1989; Weiss 1999). In particular, early learning of verbal and motor imitation and receptive language is important in predicting outcome (Weiss 1999). This is certainly a fruitful avenue for more study.

Therefore, severity of autistic symptoms is not a good predictor of optimal outcome, but better cognitive and motor development, and a PDD-NOS rather than AD diagnosis are predictors of optimal outcome. It is hard to reconcile the lack of power of symptom severity to predict outcome, with the better outcome for PDD-NOS over AD. Two possibilities present themselves: one is that the presence of restricted, repetitive behaviors per se, rather than severity of social and communication symptoms, is the poor prognostic feature. The other is that children with AD tend to be lower functioning intellectually than those with PDD-NOS diagnoses. Both of these possibilities are supported by the literature (Szatmari et al. 2006; Gabriels et al. 2005; Lord et al. 2006). It is also interesting to note that in our current sample of optimal outcome children, their IQ is not only higher than non-recovered children, but significantly above average in some cases. Ongoing and future studies should investigate whether above-average IQ is a predictor of recovery.

While the above articles describe behavioral factors which may be related to outcome, there are physiologic factors which, when finally identified and investigated, will have far greater predictive value. The fact that ASD varies across such a wide range of severity, and that behaviorally similar children can respond very differently to the same intervention, makes this obvious. Accelerated head growth may be one such marker of a biological subtype, as well as seizures. In their meta-analysis of studies involving participants with ASD, Amiet et al. (2008) found that seizures are associated with intellectual disability (ID), with higher seizure rates in children with more significant intellectual impairment. They found that the pooled prevalence of seizures was 21.4% in individuals with ID (n = 1485) but only 8% in those participants without ID (n = 627). Early onset of seizures, especially infantile spasms or medication-refractory seizures, is associated with a poorer prognosis for children with ASDs (Saemundsen et al. 2007a, b; Danielsson et al. 2005). “Secondary autism” that complicates other conditions also has generally poorer outcomes. These conditions include congenital rubella, tuberous sclerosis complex, Fragile X syndrome, Joubert syndrome, Down syndrome, and many other genetic disorders (Peake et al. 2005; Asano et al. 2001). The poorer prognosis is probably due to the underlying neurological deficits that produce mental retardation, limiting amount and speed of learning independent of the autistic behaviors. In addition, children with idiopathic ASDs who also have other disabilities, especially sensory impairments, may have more limited potential for recovery. 
Children who exhibit high levels of stereotyped behaviors that are resistant to behavioral and pharmacological management (especially motor and object stereotypies, and delayed echolalia) face additional challenges because these self-stimulatory behaviors limit the ability of the child to attend to interventions and to engage in adaptive behaviors. They also tend to be associated with lower developmental or intelligence quotients (Bishop et al. 2006; Szatmari et al. 2006).

Two studies bear on the predictive value of head circumference development. Elder et al. (2008) conducted a records review of 77 younger siblings of children with a confirmed ASD diagnosis (considered at high risk for ASD) to examine whether early differences in head circumference predicted later ASD diagnosis for the younger siblings. Head circumference slopes and intercepts at twelve months of age were associated with social and communication (but not repetitive behavior) symptoms at age 22–24 months as well as M-CHAT critical items at the same age. In addition, the rate of z-score change in head circumference was associated with social symptoms; the slope was steeper for those children with more social impairment. Children with more communication symptoms had larger head circumference at 12 months of age, with a slope that leveled off more quickly between 12 and 24 months of age. Mraz et al. (2007) and Mraz (2007) also examined growth records to see if the abnormal patterns of growth reported by Elder et al. (2008) and a number of others (e.g., Courchesne et al. 2001) would differentiate the optimal-outcome children from those with persisting ASD. However, the optimal-outcome group had the same pattern of head growth as the ASD group—normal or slightly small at birth, and accelerating until about 1 year of age, then leveling off. The optimal outcome group, however, did show less acceleration of body length and weight, showing values close to the CDC averages for these variables across the first 2 years, while the ASD group showed an acceleration of length and weight that paralleled their head circumference. Thus, while accelerated head circumference in the first year has been confirmed to statistically predict the emergence of autistic symptoms, especially in children at risk, it does not seem to predict the possibility of an optimal outcome.

Which Characteristics Improve?

Another way of asking this question is to ask what residual or comorbid problems the recovered children experience. There are almost no data bearing directly on this question, but a few observations can be made: In Howard et al. (2005) the behavior-therapy group is described as a whole, rather than separating out the “best outcome” children. Some data, however, are very suggestive: As a group, the behavior therapy group did extremely well in most areas; the scores that were somewhat below average were in the areas of receptive language, expressive language, and self-help (although these data were collected after only 14 months of therapy, they suggest which areas might be most difficult to remediate). In the Kelley et al. (2006) study described above, which focused on language functioning, there were residual problems with higher-order language functions, including constructing a narrative, discourse, and social cognitive problems such as understanding the subtleties of mental-state verbs. The Fein et al. (2005) case series suggests that children who move off the autism spectrum are still at risk for significant attention problems, as well as some subtle social difficulties and perseverative interests. Our current study should shed some light on this question; although data collection is still ongoing, preliminary examination of the functioning of the optimal outcome group suggests minimal difficulties with executive functions (cognitive functioning by testing and behavioral functioning by parent report), verbal memory, or other standardized IQ and language tests (Rosenthal et al. 2008; Tyson et al. 2008). In addition, the optimal outcome and HFA groups appear to have significant psychiatric co-morbidities, whereas the typically-developing controls do not. Specifically, the 12 optimal outcome children examined so far showed present or past history of depression (1), phobias (8), ADHD (4), and tics (2). 
Zappella (2005a) reported tics and attention problems in his series of children who moved off the spectrum. In the Sallows and Graupner study (2005), similarly, one or more of the best outcome children were “worriers”, had still-delayed social skills, preoccupation/inattention, or somewhat poor communication skills. Bailey (2001) examined children who met criteria for the UCLA “best outcome” status, and found that they obtained lower scores than typically-developing children on most measures of social competence, especially parent rating of inappropriate behavior (but social functioning within the normal range was not required for “best outcome” status in the UCLA definition). McEachin et al. (1993) also followed a group of nine “best outcome” UCLA patients to an average age of 13 years old. They were administered an IQ test, the Vineland Adaptive Behavior Scales, and the Personality Inventory for Children. Except for one child, they were still in regular education settings. IQs ranged from 99 to 136, confirming the tentative findings of our current study that high IQ may facilitate recovery. Vineland scores, including Socialization, were overall at an average level, but several of the children had borderline Socialization scores. Aside from one child whose scores were of questionable validity (the same child who was no longer in regular education), the personality scores were mostly normal, with one child elevated for delinquency, two borderline for social withdrawal, and one borderline for psychosis (odd behaviors). Most important, blinded clinical assessors did not discriminate the eight still-best outcome children from the children with no histories of ASD.

Thus, although data are quite preliminary, the residual vulnerabilities of the recovered children appear to include anxiety (especially social anxiety), depression, tics, attention problems, and perhaps continuing difficulty with higher-level, complex social and language interactions.

Predictors of Outcome: Treatment Characteristics

A comprehensive treatment of this issue is well beyond the scope of this paper. The Journal of Autism and Developmental Disorders’ special issue (volume 30 (5), 2000) is devoted to treatments. Rogers and Vismara’s (2008) recent comprehensive review is also focused on evidence-based treatments. It becomes apparent that no treatment has been subjected to the same level of examination as Lovaas’s behavioral approach and treatments stemming from it. In addition to some pharmacological approaches, psychosocial treatments such as Pivotal Response Training (PRT) and the Denver Model have shown promise in single-subject designs but have not been held to the same level of empirical scrutiny. Rogers and Vismara (2008) separated treatment protocols published between 1998 and 2006 into three effectiveness categories: “well-established”, “probably efficacious”, and “possibly efficacious”. Lovaas’s treatment is the only protocol that meets criteria for being “well-established” because it incorporates a treatment manual and has clearly specified participant groups. It has been shown to be better than placebo or alternative treatments by two independent well-designed group studies, and has been studied by several single-subject design studies (Rogers and Vismara 2008). See Smith et al. (2000), Howard et al. (2005) and Eikeseth et al. (2002), in particular, for comparisons of behavioral treatment to other therapies, or clinic vs. parent-directed behavioral treatment. None of the remaining treatment protocols in the Rogers and Vismara review fell into the “well-established” category because they lack rigorously obtained empirical support (Rogers and Vismara 2008).

Pivotal Response Training (PRT), developed by Koegel et al. (1999), uses both developmental and applied behavior analysis procedures to increase a child’s motivation to participate in learning skills within the domains of communication, language, play, and imitation (Schreibman and Koegel 1996). Although PRT does not meet the necessary criterion of strict empirical group comparisons, Rogers and Vismara suggest that this treatment protocol should be considered “probably efficacious” because of the large number of independent single-subject design studies that have demonstrated PRT to be effective compared to other treatments.

The Denver Model integrates behavioral, developmental, and relationship-oriented intervention to enhance function in language and developmental domains and is described in detail by Rogers et al. (2000). In short, this treatment technique has a curriculum and makes use of specific teaching techniques (trials and naturalistic behavioral exchanges) within an interpersonal relationship to teach necessary skills. A number of pre-post studies have been conducted demonstrating improvements across a range of skills for children who participate in this treatment (Rogers et al. 2006). Like PRT, however, the Denver Model has not been compared to other treatment approaches in a controlled manner and hence can currently be classified only as probably efficacious.

Three interventions that are included in the Rogers and Vismara (2008) review were deemed “possibly efficacious” because these studies compared their interventions to other protocols and found their interventions to be effective. Aldred et al. (2004) implemented a combination of parent-training pragmatic language workshops, speech and language therapy, the North Carolina TEACCH model, and social-skills training. A second treatment protocol included a parent-training group in which parents implemented techniques to foster joint attention and behavior management in a naturalistic setting and received in-home speech and language consultation every 6 weeks, for 3 h (Drew et al. 2002). The local services group received a mixture of standard treatments (speech and language, occupational therapy, etc.), with some parents receiving direct treatment, and three children receiving in-home 1:1 discrete trial training for an average of 33 h/week. Third, Jocelyn et al. (1998) implemented a 12-week protocol targeting language, social, and play development, and decreasing unwanted behavior, delivered by trained child care workers in a typical day care center and at home by trained parents (15 h of training and additional consultation). The control group attended community day care alone.

Most of the published treatment studies compare relative outcome of groups receiving two different treatments, or different intensities of the same treatment. They are generally not designed to examine retrospectively the treatment parameters for the best-outcome children. Examination of the studies mentioned previously in documenting the existence of recovery is not generally informative about treatments received by the most successful children, but there are some clues. Gabriels et al. (2001) noted that children in their “high outcome” group received an average of 40.3 more hours per month of intervention. Although this difference was not statistically significant, the authors suggested that it may reflect differential treatment effects in community-based settings in which children with initially higher developmental ability may be given more hours of intervention. Of the 11 children in the case study conducted by Fein et al. (2005), eight received intensive ABA therapy and three of the children were in an intensive preschool program with interventions that included some ABA techniques (but this was determined by record review, with no random assignment to groups). While the combination of treatments for children diagnosed with autism in the Sigman and Ruskin (1999) study is largely unknown because treatment data were collected by parent questionnaires, it is known that 93% of the children in the autism group were enrolled in special education. Of the 93%, 83% of the children received speech and language therapy, 25% received play therapy, 25% received physical therapy, and 45% received therapy focused on social skills. Sallows and Graupner (2005), whose sample included some best-outcome children (see above), compared a clinic-treatment group to a parent-directed treatment group. Both groups received Lovaas treatments, and (unintentionally) did not differ in intensity of intervention. 
Children in the Zappella (2005a) treatment study did not receive any behavioral therapy, but all were enrolled in some form of developmental therapy.

Although none of the studies found significant treatment differences between the children who moved off the spectrum and those who did not, measuring treatment is generally done by measuring treatment quantity and type rather than quality, which is much more difficult to assess. In addition to the confounding factor that Gabriels et al. (2001) suggested, another potential confound could work in the opposite direction: children who make slower progress are sometimes given more intensive treatment, making interpretation of the relationship between progress and treatment intensity very difficult. All of the children in the studies that reported participants with optimal outcome were receiving at least some level of treatment and thus it is possible that the treatment, in combination with the potential for normal levels of cognition, was responsible for their improvements. While the majority of studies reporting on recovery included some behavioral methodology, this was not always the case.

Preliminary Conclusions About Recovery

It is very difficult to integrate results across studies because both initial and outcome data vary so widely. Also, although many of the studies (e.g. Stevens et al. 2000) meet most aspects of our definition of recovery, they do not explicitly assess whether the participants continue to meet criteria for any ASD.

However, the following tentative conclusions seem to be warranted: (1) A certain number of children with well-documented ASD lose the diagnosis and function within the generally normal range of cognitive, adaptive, and social skills. This improvement may be attributable to treatment techniques, the nature of the original clinical presentation, brain maturation, or other endogenous biological changes such as diminution of neuroinflammation. (2) The percentage of children with ASD who can reach this outcome varies widely; studies with unselected samples show anywhere between 3% in the earliest studies to about 25%, although a few ABA studies claim higher rates of success (up to 50%, but some of these started with higher IQ children). (3) Factors that seem to predict the potential for recovery are higher intelligence (when it can be reliably measured), receptive language, verbal and motor imitation, motor development, a diagnosis of PDD-NOS rather than AD, and earlier age at diagnosis and initiation of treatment. Social development, play, and joint attention show more mixed results: although joint attention in particular predicts relative improvement, there is no evidence as yet that early joint attention can predict recovery, although it would make sense that it would. However, severity of autistic symptoms per se generally fails to predict optimal outcome. (4) Physiological factors (e.g. seizures) that are associated with poorer outcome probably mark the presence of significant mental retardation and possibly specific syndromes; head circumference trajectory in the first year fails to predict recovery. (5) Almost no controlled studies directly compare outcome between behavioral vs. other therapies (e.g. developmental stimulation, Denver Developmental model, “Floortime”) or with “biomedical” treatments. Therefore, no definitive statements can be made about which treatments can produce recovery in the greatest number of children. 
However, although it cannot be stated categorically that behavioral treatment is necessary for recovery, the majority of studies that report actual recovery used behavioral techniques, alone or in combination with other therapies, for some or all of the children, and therapies that include behavioral methods are the most empirically validated. In addition to the well-described learning principles that govern behavior therapy, competent behavioral therapy requires a highly affective, emotionally positive set of interactions that promote the reward value of social interactions and more or less continuous social engagement, especially in very young children. (6) The range of residual vulnerabilities in recovered children is not yet known. Preliminary evidence suggests potential weaknesses in some children in higher order communication functions, as well as possible vulnerability to tics, depression, phobias (including social phobias), and ADHD.

It is very difficult or impossible to predict speed or ultimate level of progress at initial evaluation. However, if a child is seen after a year or more of good intervention and has made limited progress, clinical experience suggests that it is possible to clinically identify the “rate-limiting factor” for that child. For some, it seems to be a significant degree of mental retardation, which places a limit on speed and amount of learning. For others, it seems to be a very significant language disorder, where nonverbal learning may be good, but receptive and expressive language are severely impaired, despite reasonable teaching as well as attention and effort by the child. For yet others, the extent of repetitive behaviors is the limiting factor. Probably both because engaging in repetitive behaviors distracts attention away from learning opportunities, and because these behaviors can become increasingly reinforcing and compulsive with practice, severe repetitive behaviors can interfere greatly with development and behavioral improvement.

A key component of early intervention is that it occurs early enough in development to harness maximum plasticity (Thomas and Karmiloff-Smith 2002; Kolb et al. 2001). Harris and Handleman (2000) showed that optimal outcome, as measured by successful full inclusion, is more likely when intervention starts at an earlier age. Animal models of social deficits provide myriad examples of lesions being more or less consequential dependent upon whether they were inflicted early or later in development. The greater biological plasticity of the infant brain affords more potential for healing. If the brain can be forced to engage in “exercises” that represent normal behavior and cognition, there is more potential for these activities to develop neurological representation. This, however, should not be used as an argument against therapeutic interventions in older children because there is a growing literature on plasticity throughout the lifespan (e.g., Doidge 2007). On the other hand, if a child begins to create alternate experiences for him/herself or to use alternate information processing strategies, the brain’s plasticity will work against him/her by wiring itself in an alternate way, thus making the child an expert at maladaptive cognitive strategies. In some children, maladaptive plasticity may have progressed too far to be reversed. Similarly, just as auditory deprivation may cause cortical elaboration of vision, early deprivation of social stimuli may cause elaboration of other modules, such as spatial skills, in the autistic brain. In the absence of language input, or where a maladaptive strategy such as chunking auditory stimuli into long segments has been solidified, a child’s mental lexicon may develop in terms of “pictures” rather than words and consequently there may be a sophistication of thought processes beyond which a child is unable to reconstruct his/her fundamental units of thought. 
An emerging structure or schema places constraints on the structures and schemas that can emerge next; Lewis (2004) refers to this principle as cascading constraints. Beyond a certain point, the window in which the brain teaches itself what to learn will have been missed. Many basic cognitive skills are needed early in life to scaffold development of more complex cognitive skills. In short, early-onset neurological disorders such as autism may have the potential for both excellent and devastating outcomes, depending on whether plasticity is harnessed to work for the child or allowed to work against the child. If primary experiences and cognitive skills can be forced early in development, preventing the harder to reverse secondary consequences of the disorder, and if deficits such as severe mental retardation are not present, recovery may be possible.

Possible Mechanisms for Recovery

Our understanding of the mechanisms of recovery will depend on basic assumptions about whether ASDs constitute a unitary disorder, or several disorders, whether they are congenital, or diseases that can arise at different ages in childhood, whether the underlying abnormalities are modular or network properties, and whether autistic deficits and behaviors are fixed or state-dependent. Eventually, adequate explanations of the mechanisms of recovery will need to take into account the following general points:

First, any explanation of recovery that is applicable to the majority of cases needs to encompass great diversity in severity, pattern of impairment and age of onset. For instance, a currently influential approach assumes highly specific (modular) core deficits that limit learning opportunities very early in development, leading to a broadening range of secondary limitations in environment-expectant processes. This approach would predict that the later the autistic deficits present, the milder or less extensive the ultimate impairments would be expected to be. This is because an initial period of normal or near-normal development would offer opportunities for learning that are not available to early onset cases. Therefore, one would expect that the 20–40% of children who regress into ASD in the second year of life or later would be less severely and less extensively involved than the children with early onset. In fact, if anything, the opposite seems to be the case; outcome is generally very similar in regressive vs. non-regressive cases (Werner et al. 2005), and to the extent that there are differences, the regressive children as an overall group tend to have worse outcomes (Rogers 2004), although some studies have reported that a large number of their optimal outcome children have experienced regressive courses (e.g., Fein et al. 2005; Zappella 2005a, b). That being so, we are left wondering why the regressive phenotype is nonetheless so similar to the early onset phenotype in its pattern of impairments and response to therapies.

Second, if there are core deficits in key structures in ASD, focused, mass-trial interventions such as intensive learning sessions applied to the central deficits might be effective. Alternatively, the core deficits might be regarded as untreatable, and efforts could be directed at “by-pass”, teaching alternative approaches to practical goals, and perhaps engaging intact areas of the neural network. But if, as has been vigorously advocated, there are network impairments of a broad organizational type (Happe and Frith 2006; Just et al. 2004; Rippon et al. 2007), then a quite different set of constraints on functioning might be hypothesized. A dearth of long-range cortico-cortical connections would be expected to handicap distant associations and limit the individual to concrete and local solutions. Such impairment in executive processes or abstract thinking would hardly be addressable by intense training, but would call for by-pass. A variant of the network-impairment model is that the network impairments may be a consequence of biological mechanisms such as oxidative stress that are difficult but not impossible to reverse, and that if reversed or even diminished, could have widespread impact on functioning due to a widespread improvement in connectivity parameters (Herbert and Anderson 2008).

In the light of these more general considerations, we present possible mechanisms by which early intervention might result in loss of ASD diagnosis and normalization of surface behavior and cognition. We begin with attempts to avert the full autistic syndrome by attempting to treat before hypothesized core deficits have had enough time to broaden into the full syndrome. This methodology is based on assumptions about the evolution of autism from its early beginning.

If the pre-autistic infant is subject to core limitations which result in a cumulatively-reduced exposure to and experience of the social environment, then secondary detrimental effects on additional brain areas are anticipated, which would culminate in the gradually unfolding full panoply of autistic symptoms, behaviors and cognitive limitations. If early intervention can ameliorate the core limitations, further expansion of the autistic syndrome could potentially be averted. The reader is referred to Mundy and Crowson (1997), Dawson and Zanolli (2003) and Dawson (2008) for additional discussions of this issue.

Dawson (2008) presents a model in which risk factors (genetic as well as environmental factors such as viruses, toxins and intrauterine conditions) lead to risk processes, which are the behaviors, such as very early abnormalities in social interaction and attention, which precede the full syndromic picture. These risk processes prevent exposure to the normal social and linguistic inputs that are needed to drive development during early sensitive periods. Specifically, Dawson suggests that social engagement is necessary for the brain regions that underwrite perception of social stimuli to integrate with areas that mediate reward, thus motivating the developing child to seek social engagement for its own sake, and benefit from the experiences that it offers. These risk processes would be the appropriate targets of intervention, in order to forestall the development of the full syndrome. Furthermore, restriction of early social interaction prevents social contact from acquiring reward value, with all the downstream consequences to the types of learning that require an ongoing social context, and permanent epigenetic consequences to the stress/arousal system. In the model’s timeline, the initial risk processes are most prominent at 6–12 months, after which these basic events (social reward, anticipatory pleasure at being called) form the foundation for more elaborate social and cognitive processes, beginning at 12–18 months, including joint attention, imitation, and intentional communication. The Dawson paper presents a very heuristic model, for which research can focus on filling in the details and testing specific candidates for risk factors, risk processes, and interventions.

The idea that autism develops from a set of core deficits and gradually broadens into a “full syndrome” is, however, hard to reconcile with the fact that the full syndrome arises rather quickly in children who regress into autism, or who become autistic due to encephalopathies. In such cases, why do they not seem to have benefitted from their period of early normality or near-normality?

In the not-too-distant future, it is to be hoped, biological therapies will directly address causes of autism: Identification of missing gene products or verification of neuroinflammatory reactions (Vargas et al. 2005) or abnormal immune response in children with ASD might, for example, lead to direct medical treatments (see Herbert and Anderson 2008, for evidence for some of these processes). The recovered children studied by us and others, and described above, however, have generally not received any biomedical intervention. In this section we consider psychological or biological mechanisms that may underlie the empirically-demonstrated effectiveness of behavioral treatments (in this context, “behavioral” treatments include any treatment that works at the behavioral level, including treatments, such as the Denver model, not usually defined as “behavioral”). The suggestions we make here are consistent with the general Dawson (2008) model, and suggest some more specific mechanisms that might be possible to test.

It is currently not known which specific cognitive or affective mechanisms are impacted by such therapy and how the brain may be changed by such intervention. This question is made more difficult by the fact that autism may be the end point of multiple pathophysiological processes. It may very well be that the different responses of children with ASD to intervention are a function of which cognitive or affective mechanisms are inhibiting an individual child’s learning (which may differ from child to child) and how much plasticity underlies each one. Alternatively, there may be one disease process or set of cognitive mechanisms and the variable responses to intervention may reflect the timing or severity of these processes. Finally, there may be multiple, relatively independent deficits in autism, and intervention may tap multiple routes to recovery simultaneously. Uncertainties aside, we will lay out what we view as the major candidates for the mechanisms of change. These are certainly not mutually exclusive or exhaustive; several may be operating simultaneously, and some are similar. The first three mechanisms are all variants of normalizing environmental input:

1. Normalizing input through forcing of attention

Experience-expectant programming may begin to diverge from the normal developmental trajectory because the inborn deficits of children with ASD prevent their exposure to experiences that allow for typical development (Dawson 2008; Johnson et al. 2002; Mundy and Crowson 1997). For example, human infants are able to categorically perceive phonemes from all languages, but by the age of 1 year are only able to categorically perceive phonemes from their own language (Kuhl et al. 1991; Werker and Tees 1984). Similarly, human infants have been shown to be better at discriminating monkey faces than adults are (Pascalis et al. 2002), leading some to argue that there is a perceptual narrowing, or critical period, for both language and face processing, and that lack of early experience with the stimuli these systems “expect” to encounter will prevent infants from developing specialized face processing or phonetic systems (Dawson and Zanolli 2003; Dawson 2008). Intervention then, by forcing attention to those critical stimuli, hypothetically prevents the catastrophic cascade into an autistic endpoint, and puts in place the necessary cognitive and affective building blocks for typical development to take place. If providing these critical experiences via intervention simply allows development to resume its natural course, resulting in normalization of neural processes, then structural and especially functional brain studies should be similar or identical to those of normal children with no ASD histories.

The most likely candidates for a psychological deficit that would prevent normal environmental input would be abnormal attention or deficient motivation. It has been suggested over the past two decades or so that one fundamental aspect of autism is a very early social disinterest (Baranek 1999; Waterhouse et al. 1996; Werner et al. 2000). That lack of observable interest could be a selective deficiency in a specific brain mechanism, or it could be due to negative reinforcement of social interaction by aversive concomitants such as overwhelming arousal (Kinsbourne 1987) or hyperstimulation from cortical noise (Belmonte and Yurgelun-Todd 2003; Rubenstein and Merzenich 2003). Lacking the keen interest in caregivers and others that drives much of early behavior and learning in normal development (e.g. Trevarthen and Aitken 1994), the entire motivational structure that drives attention and learning would be abnormal. Two related impairments in the operation of attention have been posited: one is inability to disengage attention from the current focus (Courchesne et al. 1994; Kinsbourne 1987). This “sticky” or overfocused attention (Kinsbourne 1987) could confine the acquisition of skills and information to restricted areas, as well as cause severe deficiency in social functions such as joint attention, that require rapid shifting of attention (Courchesne et al. 1994). Indeed, Zwaigenbaum et al. (2005) found social disinterest and inability to disengage visual attention to be among the earliest signs of autism (in the first year) in children at risk for ASD.

A second possible attentional deficit is that on the continuum of inward vs. outward directed attention, individuals with ASD are stuck at the extreme inward end (Kinsbourne 1987). This certainly seems to be clinically observable, in cases where it is difficult or impossible to draw the child’s attention away from inward preoccupations to the instructional environment. It is also consistent with research findings using neuroimaging. Recent studies have delineated a “default network” in the brain, which is activated at rest and deactivated during task performance, and includes medial frontal cortex, anterior and posterior cingulate gyrus and precuneus (Gusnard et al. 2001; Johnson et al. 2006). This network is active when people engage in self-reflective thought. Kennedy et al. (2006) reported that in individuals with ASD, this network fails to deactivate when the child is given tasks to do. This appears to demonstrate that autistic individuals maintain a maladaptive degree of inward-focused attention. One would therefore expect that treatments that lead to recovery would result in the normal occurrence of deactivation when the individual is engaged in tasks.

It is interesting to consider further the timing of the early “disinterest” in, or aversion to, social stimuli. Typically developing infants exhibit intense motivation for social interaction (e.g. Trevarthen and Aitken 1994; Yarrow et al. 1975); however, children with autism seem, generally, not to develop the core deficit of social disinterest/aversion until they are more than six months old. So for some children it may not be, as Dawson suggested, that interpersonal interactions fail to become motivating, but that, having been motivating early on, they cease to be so by the end of the first year of life. Perhaps autism is a disease that has its onset or first clinical expression during the first year rather than being present at birth, as is usually assumed. Or perhaps social engagement, initially rewarding, becomes gradually less so because of some aversive accompaniment, such as excessive phasic aversive arousal and/or sensory overload due to unstable activating systems charged with the control of excitation/inhibition balance. Early intervention might seek to render these interactions less aversive by controlling external factors that modulate arousal and sensory stimulation, and by attempts at desensitization.

Whatever underlying deficit causes the experience-expectant systems to be deprived of input, intervention would work by forcing attention to the instructionally- and socially-relevant aspects of the environment, thereby normalizing the crucial early input. Typical infants have pre-set and unobstructed biases that amplify features to which attention should be paid, including speech, faces, and gestures. These attentional biases facilitate perceptual processing, leading to imitation, and rapid learning, and culminating in expertise. If this normal attentional bias is absent or obstructed, early intervention might force the child to attend to these stimuli. In this view, treatment bypasses the abnormal motivation system, possibly without ever fully correcting it, or else compensates for the obstruction by amplifying inputs that would otherwise be too weak to overcome the biological obstruction. In other words, treatment prevents the child’s neurologically-based deficit in social orienting from disrupting further neurological development (Mundy and Crowson 1997) by preventing the child from missing out on critical, early social learning. In recovered children, social orienting presumably becomes autonomous at some point and no longer dependent on their attention being specifically directed during interventions.

The accelerated head growth in many children with ASD in the second year of life may reflect or contribute to a failure of this experience-expectant learning. Experience-expectant learning may work by overproducing synapses to be pruned. The overgrowth of the brain that peaks by the second year might reflect a failure of this pruning while an obstruction model would be appropriate if metabolic abnormalities rather than failure of synaptic pruning were at play (Herbert and Anderson 2008). If most underlying cognitive systems are potentially intact or have functional or metabolic changes that are reversible, then future research might show that the brains of recovered children appear similar to those of children with no history of autism.

2. Promoting reinforcement value of social stimuli

A related process is helping the child to associate adults, and then peers, with reward value, promoting motivation to attend to other people. Most behavioral programs utilize conditioning techniques that begin by rewarding the child with primary reinforcers or objects and experiences that have already acquired reward value for the child (e.g. food, tickles, breaks to do preferred activities). Dawson and Zanolli (2003) and Dawson (2008) speculate that this allows the adult to acquire reward value for the child. But since the reward value of the adult does not extinguish when the primary reinforcers are withdrawn, the question arises of whether the learning process involved is best described as conditioning of someone with absent social motivation or, instead, recovery of obstructed motivation.

By explicitly pairing attention and response to other people with reinforcers that create emotional reactions, the child may be described as acquiring what Grossberg and Seidman (2006) refer to as “drive representation” of these social stimuli. Hypothetically, instructions to the prefrontal and sensory cortices amplify the signal on any incoming sensory stimuli to which the child is expected to have an emotional reaction. A drive representation is an expectation that the child will have a positive emotional reaction to a class of stimuli. Once classical conditioning takes place and a child independently experiences emotional arousal in conjunction with social stimuli apart from reinforcement, this circuit is presumably created between the amygdala and the prefrontal cortex. In effect, this mechanism creates the biases that are pre-set in typical infants but that are not originally present or accessible in the autistic infant. Classical conditioning may have more profound effects in very young children as their brains develop salience maps of the world that will guide them for the remainder of their lives. This may allow the connection between social stimuli and reinforcement not to become extinguished once the explicit pairing ceases. It is also possible that by mere repetitive orienting and responding toward social stimuli, the social motive takes on functional autonomy, as can often be seen in the persistence of unreinforced habitual behaviors.

These first two proposed mechanisms imply that underlying cognitive systems are initially intact in an autistic child but that the emotional systems engaging and motivating social attention are not. In this scenario, controlling the focus of the child’s attention and pairing reinforcers with socioemotional stimuli is central to treatment success.

3. Early intervention provides an ‘enriched environment’

A similar possibility for a mechanism underlying effective intervention is that it creates an enriched environment for the child (Dawson 2008). Consistent with the notion that autism comes about when experience-expectant stimuli are not encountered, studies of Romanian adoptees have shown that children who experience an absence of stimulation are at substantial risk for developing some autistic symptoms. Approximately one-third of the environmentally-deprived Romanian adoptees show autistic traits (Rutter et al. 1999). Rutter et al. (1999) note that these symptoms seem to be associated with “prolonged perceptual and experiential deprivation” (p. 546). The deprivation might lead to the extinction of the normal, early predisposition to seek and enjoy social interactions. Restricted and repetitive movements and activities, a hallmark sign of autism, are also commonly observed in animals kept in small cages, but not in enriched environments (Lewis et al. 2007), consistent with the idea that they are the result of deprivation. Of course, in the case of children with autism the deprivation is neurologically, rather than environmentally, imposed. However, the full autistic syndrome fails to appear even under conditions of environmental deprivation that are apparently far more severe than those that autistic behavior imposes on children with ASD. This indicates that environmental deprivation, though it may contribute to autistic symptomatology, cannot account for the bulk of it.

In addition, it seems to be precisely during the earliest months after birth that the deprivation is least marked, or minimally observable. The degree of deprivation of normal experience in hospitalism and certain orphanages seems to be far more severe than that which autistic children experience in their first year of life, and yet it is deprivation during the first year that has the most severe and enduring, deleterious consequences for mental development. Nonetheless, the prognosis for children who suffered early social deprivation is far better than that for children with incipient autism. Few of them become classically autistic, and their development seems to be suspended during the deprivation rather than permanently impaired. In other words it is possible to overrate the negative effects on the development of key brain areas of environmental restriction. Held and Hein (1963) deprived newborn kittens of any opportunity to explore the environment or even to locomote by harnessing them to other kittens, who did all the moving, for the first 6 weeks of their lives. When the experimental kittens were unharnessed, they were unsteady, tentative and insecure. As little as 48 h later, they were indistinguishable from their control age-mates. At least some developmental skills can develop absent the expected environmental opportunities. Perhaps mental skills are more vulnerable than motor skills. However, recovery should not be considered impossible because certain brain areas are presumed to have failed to pursue a normal maturational course early in life.

Like the first two proposed mechanisms, this one implies that the child’s underlying cognitive systems are intact but his/her motivation and emotional response are dampened. Treatment compensates for this dampened motivation by creating a highly emotional and perceptually-rich environment that causes the child to experience the same (or near normal) frequency and intensity of emotion as other children. Such emotional experiences draw the child’s attention outward. Grossberg and Seidman (2006) proposed that individuals with autism have a dampened amygdala response, meaning that the intensity of a stimulus will have to be much greater than is typically required in order to provoke an emotional reaction. Intervention therapists typically use heightened affect when interacting with children with ASD, create and emphasize emotional situations for the child, and, when teaching cognitive skills, use objects that hold emotional value for the child (i.e., reinforcers such as food, tickles, and favorite toys). Thus, what may appear to be an enriched environment is approximating a normal environment for a child with autism.

In addition, enriched environments may even prolong critical periods during which neurons are maximally sensitive to modification by experience (Hensch 2004). This is potentially important, since effective early intervention is often not started until after the third birthday, well beyond what evidence (reviewed by Dawson and Zanolli 2003) suggests is the critical period for automatic face processing as well as phoneme discrimination. On the other hand, the deprivation presumably experienced by inattentive autistic children may itself prolong certain critical periods. It is possible that a lack of competition prevents synapse elimination and extends the sensitive period for many circuits. For example, rats reared with a modulated but limited repertoire of sounds experience early closing of critical periods before functional maturation is achieved. In contrast, rats reared with continuous, unmodulated sounds experience prolonging of critical periods indefinitely (Chang and Merzenich 2003).

The idea of providing an enriched environment is not inconsistent with the normalization of input if the untreated autistic environment is one of functional deprivation. One finding that argues against this explanation is that animals and children suffering severe deprivation tend to have smaller brains and even frank atrophy, which has not been reported in autism.

4. Early intervention provides mass trialing/practice

Animal models amply demonstrate that intensive training can overcome brain-damage-induced learning deficits and even reverse hippocampal hypoplasia (Loupe et al. 1995). In the intensive training conducted by Loupe et al. (1995), it was crucial to start with easy discriminations and proceed incrementally with gradually more difficult ones, as is done in effective behavior therapy for children. From the material in the foregoing section, we will proceed under the assumption that the best therapy, the most likely to promote recovery and reverse neurological impairment, is instituted early (before age 4 years and the earlier the better), is intense (20+ h a week), incorporates structured teaching using behavior principles, administers large doses of positive affect designed to promote social engagement with adults and later with peers, and involves parents to directly administer treatment or help with generalization and maintenance of skills and to motivate positive affective interactions. Successful early intervention is intensive, generally entailing 20–40 h per week (Dawson 2008). This provides “mass trialing”, or repetitive exposure to stimuli and repetitive practice in acquiring skills that typically developing children do not require. Studies on expertise, rehabilitation of acquired injury, and early intervention for developmental or early onset disorders in humans all demonstrate that intense repetition of a set of physical or cognitive exercises leads to some amount of neural reorganization, either through increased synaptic connections or alterations in the cortical map (Sur and Rubenstein 2005).
Children with autism may need extreme amounts of practice or exposure because: (a) the child may have a deficit in implicit learning, (b) the child may have difficulty attending or discriminating because of a relatively undifferentiated cortex or other biological interferences with higher order functions, (c) the child may not benefit from observation or imagination due to a simulation deficit and needs explicit teaching, or (d) the child may have underdeveloped areas of the brain that require extra practice to bring them online.

(a) Rubenstein and Merzenich (2003) have proposed that environmental factors affect the neural circuits of young children at risk for autism, causing premature termination of critical periods before their neural maps are fully differentiated; that is, before neurons have selected their permanent repertoire of inputs from among a wider array of possibilities. This hypothesis carries two sets of implications. Although the brain is capable of learning throughout life, critical periods represent a massive sensitivity of neuronal properties to modification by experience (Hensch 2004). When a critical period is open, a child merely has to be exposed to stimuli such as speech to learn from them, even during sleep (Cheour et al. 2002), whereas after it has expired, a child must deliberately attend to material in order to learn from it (e.g., second language learning beyond early childhood, unlike first language learning, is effortful). Consequently, whereas typically-developing children will implicitly learn what they need to about the world via simple exposure, autistic children may need to be formally taught nearly everything in a tightly controlled environment that ensures attention and effortful learning (see Renner et al. 2000 for a review on implicit learning deficits in autism).

(b) A secondary consequence of neural maps being relatively undifferentiated might be chronic overarousal (Goodwin et al. 2006; Hutt et al. 1965; Kinsbourne 1987; Rubenstein and Merzenich 2003) as well as difficulty attending to particularly salient environmental stimuli because of reduced signal/noise ratios. As a result of this lack of perceptual differentiation, areas of the brain would begin to respond broadly to stimuli rather than selectively to the specific stimuli that would normally engage each area of the brain. This would potentially flood the brain with stimuli to which it will respond (with both baseline and stimulus-induced electrical activity heightened and disorganized, placing the child at risk for seizures). Children may attend to all environmental signals, experience abnormal signal modulation, and have difficulty blocking irrelevant stimuli, thus making it hard to attend selectively to what is salient. Support for this hypothesis comes from an animal model. The primary auditory cortex of rats reared in continuous 70 dB acoustic noise continues to show the immature pattern of broad, high frequency tuning curves and imprecise tonotopy characteristic of the earliest stage of auditory development (Chang and Merzenich 2003), a pattern similar to that observed in high functioning autistic participants in psychoacoustic experiments (Plaisted et al. 2003). Furthermore, perceptual training normalized this pattern in these same rats as adults, whereas passive exposure uncoupled from reward did not reverse the effects (Zhou and Merzenich 2007). These findings demonstrate that adult cortex remains plastic for attended, rewarded stimuli. An effective early intervention environment would be one that generally amplifies all informative signals in the child’s environment, teaching discrimination to the undifferentiated cortex through rewarded discrimination learning. These findings also suggest that the notion of loss of plasticity after the end of critical periods may inappropriately deflect potential therapeutic opportunities.

(c) A third possibility is that autistic children do not automatically mimic or simulate the actions of others as do typically developing infants and children. Simulation may underlie imaginative play (Goldman 2006), which is disrupted in individuals with autism, as evinced by their reduced amount of pretend play (Charman et al. 1997). Mental practice, for example, is said to be as effective in increasing synaptic connections and expanding cortical representations as physical practice under many conditions (Jackson et al. 2003). In short, it is possible that typically-developing children benefit from observation and imagination in a way that children with autism do not. Imagining previously-observed signals and actions may give normal children’s brains extra practice, allowing the proper amount of exposure for neural development. Children with autism, on the other hand, may only benefit from actions as they are being enacted and stimuli as they are being directly experienced, meaning that they require the enhanced direct experience with cognitive and sensorimotor experiences provided by structured teaching. This lack of automatic delayed and immediate mimicry has been related to the “mirror neuron system”, a set of neurons distributed in several areas of cortex that have been suggested to be functionally disordered in individuals with autism (see Iacoboni and Dapretto 2006, for a review). Typical individuals, but not autistic individuals, show activation in these areas that is similar whether they are experiencing a stimulus or enacting a behavior, or whether they are watching someone else experience a stimulus or enact a behavior. It seems to us more parsimonious to regard this as an effect of early social disinterest, rather than suggesting that a system which is not morphologically, neurochemically, or anatomically distinct is selectively abnormal and causes social disinterest. Particularly given growing documentation of widespread connectivity abnormalities, it is more likely that “mirror neuron system” dysfunction is secondary to much more widespread processing disturbances or a preceding deficit in social motivation, depriving it of input. But in either case, dysfunction in this system is consistent with the lack of mimicry that preliminary research suggests is the case in autism (McIntosh et al. 2006).

  4. (d)

    Yet another possibility is that in children with autism, particular neural systems or areas are underdeveloped, and need enhanced practice and experience in order to bring systems online. The concept of constraint-induced movement therapy (i.e. forcing use of a damaged system; e.g., Taub et al. 2004) has recently entered the literature on cognitive rehabilitation (Sohlberg and Mateer 2001). Constraint-induced movement therapy, which forces individuals to rely on their damaged (as in the case of stroke patients) or dysfunctional (as in the case of children with cerebral palsy) motor system by restraining the spared limb, has been shown to result in both restoration of lost cortical representations and acquisition of new representations (Johansen-Berg et al. 2002; Liepert et al. 2000). Early intervention that emphasizes extreme amounts of practice may be conceptually quite similar. Just as a normal system may become enhanced with extreme amounts of practice, a damaged system may reach normal levels of functioning with extreme amounts of practice. If an autism cascade begins as a structural or connective defect of some sort, intervention may force use on a damaged attentional, social-emotional, or language system until the system’s cortical representation grows enough to function without continued treatment.

  5.

    Compensatory input

Alternatively, there may be irreversible damage to underlying neural systems responsible for key behaviors such as language and face processing. If so, the only route to recovery would be to teach the child early on to compensate for this damage by learning alternative strategies for mastering these pivotal skills, so as to recruit areas of the brain that are not typically used for such functions. In this case, fundamental building blocks to development are acquired in an alternative way that allows for nearly typical development of surface behaviors, but involves substantial alterations in the cortical map of cognitive processes or neural representations of information.

Interventions that lead to effective change in one or two pivotal skills that are fundamentally disordered in autism might lead to collateral changes in other prominent symptoms and skills (Koegel et al. 1999). The more sophisticated cognitive and emotional processes that children with optimal outcomes go on to master may only require foundation skills such as language, face-processing, and imitation upon which to build, regardless of how these foundation skills were initially developed. This possibility implies that underlying cognitive systems representing language and social skills are damaged in an autistic child but that other aspects of cognition are intact (and therefore recovery would be precluded by comorbid mental retardation). In these successful children, intact parts of the brain may take over in representing these pivotal functions (just as left hemisphere stroke victims sometimes show significant right hemisphere activity taking over for language) (Kinsbourne 1971; Mussol et al. 1999). In this scenario, acquiring a basic set of foundational skills, necessary for further development, is the mechanism central to early intervention success.

This possibility is illustrated by considering how young children with autism learn language and social skills. Typical children acquire these concepts and skills implicitly (Church et al. 1986; Karmiloff-Smith 1992). However, intervention programs make the learning of these skills explicit. Doupe and Kuhl (1999) suggested that the act of learning itself may limit the extent to which one can learn to vocalize specific phonemes. Acquiring these abilities through the conscious effort most children apply to math and reading, but not conversation or imaginative play, may bypass the normal course of cognitive development, leading to atypical cortical representation of these skills. In other words, the alternate strategies generally used by children with autism to acquire communicative and social abilities may lead to compensatory, rather than normalized, functional systems in the brain.

If compensation underlies recovery from autism, it should be accompanied by changes in the localization of function of skills that were emphasized by therapy in recovered children. Indeed, preliminary evidence supports this idea. Typical individuals show unique activation in the fusiform face area (FFA) when looking at faces. However, individuals with autism consistently show hypoactivation in this region (Schultz et al. 2000; Pierce et al. 2001). Schultz et al. (2000) have interpreted these findings as a reflection, rather than a cause, of the tendency of autistic people not to look at faces. A popular theory regarding the FFA is that it is linked to expertise (Gauthier et al. 2000), and although most of us become face experts by a very young age, Schultz et al. (2003) argue that people with autism, due to their lack of social attention, do not. Expecting that increased experience with faces might activate the fusiform face area, Bolte et al. (2006) trained 10 high-functioning autistic adults in facial affect recognition. While the group showed significant behavioral improvements, these improvements did not lead to increased post-training activation of the FFA. Instead, they led to increased activation in the superior parietal lobule (Bolte et al. 2006). Other studies have found activation in the precuneus region, or retrosplenial cortex, in autistic children (Wang et al. 2004) and adults (Schultz et al. 2003) in response to familiar faces. This region may represent a higher-order processing system that is unique to individuals with autism (Schultz et al. 2003).

The possibility of alternate specialization in the autistic brain is consistent with the idea that regional specialization is sensitive to experience (Jacobs 1999). The evidence cited above suggests that treatment-enhanced ability of autistic individuals to recognize faces and facial affect may involve compensatory pathways and activation, and thus neuroimaging studies should show different degrees or location of activation in processing this kind of information. To date, no study has examined these processes in recovered children. Evidence from successful treatment of dyslexia (Aylward et al. 2003; Richards et al. 2000; Shaywitz et al. 2004; Simos et al. 2002; Temple et al. 2003) demonstrates both normalization of cerebral activity and compensation by recruiting additional brain areas; furthermore, this plasticity has been observed across the life span (e.g., Eden et al. 2004; Shaywitz et al. 2004; Simos et al. 2002). Of course, one might expect later-acquired skills such as reading, as opposed to skills usually acquired very early, such as face processing expertise and basic language skills, to show an extended period of possible effective plasticity.

  6.

    Effective intervention suppresses interfering behaviors

Behavioral treatment may bring about functional recovery in some children with autism by suppressing those of the child’s behaviors that interfere with attention to his/her environment, especially the repetitive behaviors that therapists call “self-stimulatory”. This may work by suppressing the abnormal cortical input that restricted and repetitive behaviors induce, thereby preventing them from taking up valuable cortical space, or from altering the neurochemical balance in the brain, such as by reducing brain-derived neurotrophic factor (BDNF) levels in the hippocampus (Branchi et al. 2004). Alternatively, interrupting the child’s repetitive behaviors may simply make him/her available for teaching and receiving meaningful input from his/her environment. If the repetitive behaviors are not interrupted, the child’s potential to benefit from meaningful input will be diminished as more and more processing capacity is occupied by meaningless input and he/she loses neuroplastic “degrees of freedom” (Lewis 2004). At the same time, the child will miss out on input and experiences necessary for normal neural and social development.

Turner (1999) reviewed the literature about the possible functions of repetitive behaviors in autism, including the arousal-reduction hypothesis (Kinsbourne 1980, 1987); the operant hypothesis, according to which stereotyped behaviors are maintained by their sensory consequences, attention elicited from caregivers, or escape from aversive tasks; and the executive hypothesis, in which stereotyped behaviors result from impairments in initiating new behaviors or in inhibiting ongoing behaviors. A variant of the arousal and operant hypotheses is that repetitive behaviors are a response to sensory overload which overwhelms discrimination in proprioceptive as well as other sensory channels, with the movements being an attempt to ramp up physical stimulation to restore a sense of the location of the physical body in space. Lewis et al. (2007) also review animal models of stereotyped behavior, including CNS insults, pharmacological interventions, and rearing in restricted environments. These attentional mechanisms are suggested to result in functionally impoverished environments. The hypotheses postulated above carry different implications for the treatment of stereotyped behaviors. The arousal hypothesis implies that behavioral limitations are at least in part state dependent. It would divert remedial efforts toward manipulation of the environment, for instance avoiding novelty and stress, or attempting to acclimate the individual to unavoidable uncertainties. Many deviant behaviors would then be recognized as being attempts at compensation. If stereotypic behaviors serve a de-arousing purpose, then the child should not be deprived of these behaviors until more socially-acceptable maneuvers are successfully substituted (Kinsbourne 1980). Another implication of this hypothesis is that attempts at environmental enrichment would be quite counterproductive, since they might increase arousal levels. 
In contrast, if the stereotyped behaviors in some children with ASD arise for the same reasons as in animals reared in restricted environments, then providing enriched environments (or forcing attention to the normal environment, which might have the same result) should reduce these behaviors. And if the repetitive behaviors are supported by their consequences, then operant procedures to change the consequences (e.g. not allowing escape from tasks, preventing the sensory consequences) should reduce the behaviors. What supports these behaviors may differ between behaviors, between children, and within children over time. A behavior initially caused by one factor, for example, may come to be supported by secondary consequences. Advances in functional behavior analysis allow therapists to study the antecedents and consequences of these behaviors for each child, at one point in his/her development. Difficult as this makes theory, this focus on individual differences in the purposes or causes of specific behaviors seems likely to be productive.

  7.

    Successful early intervention reduces stress and stabilizes arousal

Another contender for the critical mechanism underlying the success of behavioral intervention is that it structures and organizes the child’s world in such a way that it normalizes the child’s arousal levels, thereby allowing learning to take place. Kinsbourne (1987) proposed that social stimuli are selectively avoided by individuals with autism because social interactions are, by nature, the most unpredictable. This unpredictability, he argued, creates an untenable level of arousal for children with autism, because they have unstable arousal systems, causing them to seek comfort in objects and routines which are generally de-arousing. Behavioral intervention may structure social interactions in such a way that they become more predictable and therefore less arousing. This would make social interactions less aversive. Recent work documenting connectivity abnormalities may support this arousal model since unpredictability can overwhelm the reduced capacity of the autistic brain to coordinate complex and rapid stimuli, leading to overload-related stress. It is also possible that as the child grows, rather than turning to a nurturing social environment for soothing, he/she may learn to self-regulate by engaging in repetitive, self-stimulatory behaviors, thus making him/her increasingly less available to the outside world at the risk of being overwhelmed. Making the environment more predictable might lower arousal and therefore decrease the need for the de-arousing repetitive behaviors that interfere with learning. On the other hand, forcing a child into highly arousing social situations (face to face interaction with others for hours per day) may create habituation and lower stress in that way.

Chronic stress can instigate developmental brain damage in several different ways. This cascade could be stopped early in development, and the environment made able to compensate for the child’s biological overarousal, either by making social interaction more predictable or by desensitizing the child to the unpredictability. This theory implies that the children with autism who recover have cognitive systems that are initially intact but that, due to chronic overarousal, the children are initially too stressed to attend and learn as they otherwise might. This theory leads to direct predictions that in early childhood, particularly before effective intervention has begun, signs of autonomic overarousal, overreactivity, or instability should be detectable.

  8.

    Boosting recovery via biomedical treatment

There is currently no evidence that biomedical intervention alone can result in recovery from autism. However, such intervention may boost the effectiveness of educational interventions. A child who is sleep-deprived, experiencing gastrointestinal distress, eating a self-restricted imbalanced diet, underactive, or suffering from depression and anxiety may not receive the full benefit of behavioral treatment or education. Slow wave sleep has been shown to enhance critical period plasticity in the visual system (Dang-Vu et al. 2006) and it is possible that it does so for additional systems. Indeed, in mammals, high levels of sleep coincide with the rapid phases of brain development, and decline when brain maturation has occurred. Sleep-deprived rats show significant decreases in the size of the cerebral cortex and brainstem (Mirmiran et al. 1983). Moreover, sleep-deprived rats (as opposed to rats whose sleep cycles were undisturbed) showed no plasticity benefits from exposure to an enriched environment (Mirmiran et al. 1983). These findings suggest strongly the need for optimal sleep management for young children. On the positive side, exercise has been shown to increase BDNF levels and generally promote neural plasticity (Cotman and Berchtold 2002; Widenfalk et al. 1999) as well as improving cognitive and brain function (Kramer and Erickson 2007). Thus, treatment for such ancillary symptoms may improve the benefit the child receives from behavioral intervention.

Another state change that has been noticed in children with ASD is improved behavior with significant fevers (Curran et al. 2007). Parent report has confirmed reductions in adverse behaviors with fever. Despite the significance of this observation, it would be even more important if anecdotal reports of increased emotional contact and speech with fevers could be confirmed. This might lead to a better understanding of state changes that promote normal behaviors and the underlying chemistry of autism. Finally, there are multiple reports of autistic children speaking in conditions of perceived emergencies, starting with Rimland’s 1964 book. While these otherwise mute or almost-mute children did not produce complex, advanced speech, they did produce utterances (“look out”, “take it out”, “I don’t want to go”) that were thought to be beyond their ability. Assuming these reports are true, they bolster the motivational theory as an explanation for at least some autistic behaviors.

Several lines of research lend hope to the idea that biomedical treatments may someday improve the prognosis for a larger majority of children diagnosed with ASD. Many children with ASD may experience some form of immune compromise (Warren et al. 2005). Herbert and Anderson (2008) suggest that early immunological insults to the brain, such as by toxicants and infectious agents, may not be eliminated from the body if encountered during critical periods of early development. If viruses or heavy metals penetrate the nervous system, they may stimulate an oxidative stress response which could lead to neural inflammation. Inflammation and oxidative stress could interfere with optimal neural functioning through multiple mechanisms. By contributing to excitotoxicity and suboptimal cellular energetics they could exacerbate the neurochemistry underlying the stress response and contribute to excessive arousal, as well as to a more general phenomenon of cortical noise with decreased signal-to-noise ratio that could contribute to abnormal thresholding and diminished specificity in response to sensory stimulation (Anderson et al. 2008). The astroglial activation component of immune activation may well lead to the hypoperfusion often seen in children with ASD (e.g. Degirmenci et al. 2008), since activated astroglia are enlarged and can reduce brain capillary lumen by as much as 50%, reducing oxygen support of brain tissue, increasing the difficulty of eliminating waste products to the blood system, and hence impairing the cellular activities associated with neural activity and synchronization (Aschner et al. 1999). Over time, this could result in various areas of the brain developing in poor relation to one another, with each area of the brain perhaps developing hypersensitivities or special properties, but making it difficult for multiple neural systems to work in concert (see Muller 2007 for a review on lack of synchronicity in autism).
If this inflammation could be controlled early in life, it might prevent such atypical development from taking place. This might be accomplished by agents that reduce microglial and astroglial activation, address the triggers for this activation, or that counteract the consequent hyperglutaminergic state. This scenario is consistent with the idea that intrinsic bias toward social motivation is obstructed rather than absent in children affected with this type of pathophysiology.

A recent study reversing the symptoms of Rett’s Syndrome in adult mice (Guy et al. 2007) raises the possibility that biological treatment may not even need to occur early in life. The authors found that activating MeCP2 in adult affected mice resulted in phenotypic reversal of the syndrome. This demonstration that defective neurons may be repaired even in adulthood, and that developmental damage done during brain formation may sometimes be reversible, is a further caution against holding too rigidly that deficits are totally fixed after “critical periods”.

However, neuronal circuits that control behavior are largely shaped during critical periods in the first few years of postnatal life. Another possibility for future treatment is that critical periods may be extended, or even reopened, via pharmacological intervention to treat children with autism. For example, autistic children do not experience the period of high serotonin synthesis during childhood that typically developing children do (Chugani 2004). Serotonin is critical to postnatal synaptogenesis, and so one possibility would be to treat very young children with serotonin agonists in an attempt to replicate for autistic children a more typical period of early brain plasticity (Chugani 2004). Similarly, activity-dependent development of sensory systems has been shown to be dependent upon GABA neurotransmission, and treatment with GABAergic drugs extends the time course of the critical period for vision (Hensch et al. 1998). As research progresses as to the timecourse of various neurochemical developmental processes in different subtypes of autism, more potential pharmacological interventions aimed at modulating experience-induced synaptic plasticity in young children may present themselves. Pharmacological interventions may be particularly potent when delivered between 12 and 24 months, an active period of synaptogenesis when children with autism are frequently observed to regress and/or become symptomatic.

Genes that control activity-regulated synaptic development and function are affected in some autistic children (Garber 2007; Morrow et al. 2008; Sutcliffe 2008; Zoghbi 2003). Normalizing the malfunctioning control genes or providing the missing gene product, of course, would be a direct treatment for such children. However, children so affected may not be among the ones for whom intense behavioral intervention can produce recovery. The children in the Morrow study for whom information is given appear to be severely affected, with comorbid MR and sometimes seizures, as might be expected with a widespread malfunction of synapses. Most of the mechanisms suggested in this paper would probably not produce recovery for such children, although treatment could certainly still produce improvement. Gene–environment interactions may also affect synaptic functioning; for example, in addition to the multiple candidate genes impacting calcium channels, multiple ubiquitous environmental toxins targeting these same channels could also impair function both prenatally and postnatally (Pessah and Lein 2008). This may be pertinent in less severe cases of autism. If the contribution of such environmental triggers in the setting of genetic vulnerability is substantial, reducing exposure to environmental toxins may decrease gene penetrance and increase receptivity to behavioral intervention.

Conclusions and Future Directions

The gold standard in treatment evaluation is the randomized prospective study. Despite the absence of such studies in the field of treatment of autistic children, we are able to draw some tentative conclusions.

Recovery in children with ASD through behavioral and educational interventions seems possible in a significant minority of cases. Ideally, treatment methodologies are based on an understanding of the underlying brain abnormalities and dynamic issues. In autism treatment we are compelled to reason in the opposite direction. Having determined what seems to work empirically, we suggest which biobehavioral mechanisms might underlie its success. There are many possible psychological and neurobiological mechanisms through which this improvement can come about. We have listed some that broadly fall into the categories of intensive practice (“treating to weakness”), environmental enrichment and stress/anxiety reduction coupled with reinforcements that guide attention outward into the physical and social environment, as well as the possibility of increasing receptivity to behavioral interventions by reducing the severity of treatable biological processes that impair neural functioning. These efforts appear most promising when implemented early in life, even before the autistic symptoms have fully presented.

In addition to the more fundamental questions about the biological causes of autism, many questions remain about how behavioral intervention can work, answers to which may provide basic information not only about autism but about neuroplasticity in general.

Which children have the potential for recovery through behavioral means, and how many are there? Recovery may occur through spontaneous reorganization of the brain, through behavioral adjustments that circumvent permanent brain impairments, through brain reorganization facilitated by behavioral interventions, and/or through facilitation of behaviorally-induced brain reorganization through reduction of biological barriers to learning. What genetic, physiological, or developmental factors may predict recovery? Are there structural or neurotransmitter defects from which it is possible to recover through behavioral means and others from which it is not? Different cognitive or affective systems may have more or less potential for reorganization or normalization, and thus, an individual child’s outcomes may depend upon the nature of the initial neurological impairments. Children from consanguineous or multiplex families may have a somewhat different set of conditions (Morrow et al. 2008) and therefore their potential for recovery may differ. Does a regressive course have a different probability of recovery? Some evidence suggests that regressive course may have a slightly worse outcome, in general (Rogers 2004), although the data are inconclusive (Werner et al. 2005), and yet many of the recovered children in the Fein et al. (2005) and Zappella (2005a, b) series seem to have had a regressive course; how can these findings be reconciled?

Does any sort of matching factor play a role? Certain treatment protocols, and certain therapists, may emphasize varying levels of factors such as positive affect and reward value for adults, teaching fundamental cognitive skills, forcing attention to the environment in a continuous way, etc. Some of these may have stronger effects on certain phenotypes of the disorder.

What is the critical time period for intense intervention to begin? Is there a “zone of modifiability” (Ramey and Ramey 1998) during which the developmental trajectory can be maximally impacted?

Is behavioral intervention necessary for such recovery or are there other interventions that might have the same result? Do some children with ASD achieve recovery with no specific intervention, merely through maturation, because of the type of ASD they have?

Another question that has not been well addressed, either empirically or even theoretically, concerns the nature of the predictor variables. Firm data support the predictive value of motor development, IQ, and receptive language, and suggest the probable predictive value of joint attention, social interest, and play. But what do these predictive factors indicate? If they are gateways to further learning (as would be easily imagined for receptive language and joint attention), then treating them directly should improve outcome. However, if they are markers of underlying CNS integrity (as might be imagined for motor skills), then treating them would not have much effect (analogous to treating pain that indicates a serious underlying condition).

What can we learn from the residual vulnerabilities of the recovered children? Although data are meager, so far they suggest that recovered children are subject to difficulties with higher-level language pragmatics (e.g. discourse), attention, tics, anxiety and depression. Does this reflect the comorbidity of ASD with several of these disorders? Simonoff et al. (2008) found that ASDs share high comorbidity with social anxiety, ADHD, and oppositionality. Or does it suggest that problems with attention and anxiety are central to ASD (Kinsbourne 1987) and persist when other parts of the syndrome resolve? Do these problems need to be treated in their own right, regardless of the autistic comorbidity, and are they treatable with standard therapeutic methods?

When recovered children perform language, social, or academic tasks to normal levels, are they using the same neural networks to the same level of activation as children with no ASD history? Is normalization or compensation more prominent, in different tasks, and in different children?

As the recovered children enter adolescence and then adulthood, are any at risk for regressing back into their ASD symptomatology? So far, our studies and those of the UCLA group indicate that this does not happen, but the research is certainly insufficient for a definite conclusion.

Research that examines functioning in persistent or recovered ASD, either through behavioral/cognitive testing or through physiological or neuroimaging methods, should specify the treatment that its participants received. This will help untangle the effects of intervention on behavior and the brain, and assist our understanding of the critical differences between ASD itself and ASD in its treated state.