Introduction

Autism is a behavioral syndrome defined by social deficits, abnormalities in communication, stereotyped, repetitive behaviors, and a characteristic course [1]. The clinical phenotype is widely regarded as heterogeneous: language ranges from no functional language to subtle deficits in pragmatic language, and ritualistic, repetitive behaviors range qualitatively from motor stereotypies to difficulty tolerating minor changes in routine. Cognitive abilities may range from severe mental retardation to above average intelligence.

Clinical heterogeneity is thought to be due to underlying etiologic heterogeneity. The main argument in support of etiologic heterogeneity in autism has come from studies showing that approximately 10% of individuals with autism have an identifiable medical condition of known etiology [2]. The existence of distinct mechanisms underlying these medical conditions is often put forward as the best evidence that multiple etiologic pathways lead to autistic behavior. Etiologic heterogeneity is often invoked as a major reason for non-replication in studies of autism. To date there has not been consistent evidence that particular profiles of autistic behavior are specifically associated with biologically defined subgroups of autistic individuals, although associated features (e.g., hyperarousal in the case of fragile X syndrome) are often noted to differ between individuals affected with these various medical conditions and autism.

Fragile X syndrome (FXS) is a well characterized X-linked genetic disorder and the leading cause of heritable intellectual disability [3]. FXS is one such condition that has been strongly associated with autistic behavior. Approximately one third of individuals with FXS meet criteria for autistic disorder [4, 5] and approximately 1–3% of individuals with autistic disorder are found to have FXS. In studies of boys with the full mutation for FXS, behaviors similar to but milder than those seen in autistic disorder have been observed, such as social deficits with peers, abnormalities in communication, unusual responses to sensory stimuli, stereotypic behavior, social avoidance, and gaze aversion [6–11]. Morphological brain abnormalities on MRI have been described in individuals with FXS and in individuals with autistic disorder [12], but to date no studies have directly compared brain structure in autistic individuals with and without FXS.

Converging evidence suggests that brain volume enlargement is a characteristic feature of autism, with onset of this enlargement most likely occurring in the latter part of the first year of life [13–21]. We found generalized enlargement of cerebral cortical gray and white matter in a large sample of 2-year-olds with autism compared to controls [22]. Increased volume of selected subcortical structures (i.e., amygdala, caudate nucleus), along with decreased size of others (i.e., corpus callosum), has also been reported in autistic individuals [23–25], and more recently our group reported amygdala enlargement in toddlers with autism [26].

Neuroimaging studies have shown specific subcortical structures to be enlarged in FXS, such as the caudate nucleus, thalamus (in females), and lateral ventricles (see review by Hessl et al. 2004 [27]). Increased caudate volume has also been reported in individuals with FXS in association with severity of stereotyped, repetitive behavior [12, 28].

In the present study we contrast the morphological patterns of selected subcortical structures—components of the basal ganglia (caudate, putamen/globus pallidus), amygdala, and hippocampus—in individuals meeting DSM-IV criteria for autistic disorder with and without FXS. Specifically, we examined the brain volumes in boys with FXS compared to a group of controls and to a group of children with idiopathic autism (without FXS). We also compared a subgroup of children with FXS in our sample who met criteria for autism with the group of children with idiopathic autism. We hypothesized that we would find significant brain volume differences in the children with FXS compared to controls, and that the pattern of brain volumes might differ for children who had both FXS and autism. Studies to date have compared autistic individuals with controls and individuals with FXS with controls, but have not examined the neuroanatomical profiles underlying autistic individuals defined by both the presence and absence of the fragile X mutation.

Methods

Sample

Subjects were combined from two sites, Stanford University (SU) and the University of North Carolina (UNC), and included 52 male children with FXS (with and without autism) and 113 male comparison cases, 18–42 months of age. In the comparison group, there were 63 boys with idiopathic autism (AUT), 19 boys with developmental delay (DD) and 31 boys with typical development (TYP). Children with FXS were recruited from both the SU and UNC registry databases, postings on the National Fragile X Foundation website and quarterly newsletter, and mailings to regional FXS organizations. Children with autism were primarily referred from nine specialty clinics for pervasive developmental disorders in North Carolina (TEACCH, Treatment and Education of Autistic and related Communication Handicapped Children) and were referred from community clinics for the SU sample. Subjects with autism were referred after receiving a clinical diagnosis of an autism spectrum disorder. Children with DD and TYP were recruited locally through early intervention programs, preschools, child care centers, community media, and state run agencies (Regional Center system in California and Child Development Service Agencies in North Carolina).

Clinical assessment

Subjects were enrolled between 18 and 42 months of age. Medical records and developmental history were reviewed for all subjects. Inclusion in the FXS group required DNA testing confirming the fragile X full mutation as diagnosed with the standard Southern blot technique. These children were also tested for fragile X mental retardation protein (FMRP) expression by calculating the percentage of peripheral lymphocytes containing FMRP using immunostaining techniques [29]. Subjects with autism were included after receiving a clinical diagnosis of autism, which was then confirmed by our team using the Autism Diagnostic Interview-Revised (ADI-R) [30] and the Autism Diagnostic Observation Schedule-G (ADOS-G) [31]. Subjects were only included in the AUT group if they met criteria for autism in all domains of the ADI-R and ADOS-G, and if they showed no evidence of the fragile X mutation on DNA testing. Because the children with autism were participants in a longitudinal study, they were re-assessed using these measures at age 4, and we were able to confirm the classification of autism at that time. Only children who met full autism criteria at both timepoints (age 2 and 4) were included in our autism group. Inclusion in the DD group was defined as having significant global delays (developmental IQ ≤ 80), scores consistent with DD on the other assessment measures, no known identifiable cause for their delay (on medical record review), and no indication of a pervasive developmental disorder. Inclusion in the TYP group was defined as having average developmental and cognitive abilities (i.e., developmental IQ ≥ 85).

Subjects were excluded for evidence of medical or genetic conditions such as Tuberous Sclerosis (TS), gross CNS injury (e.g., cerebral palsy, significant complications or perinatal/postnatal trauma, drug exposure), prematurity (<34 weeks), low birth weight (<2000 g), seizures, and significant motor or sensory impairments. Medical records were reviewed for any evidence of autism or PDD-NOS for the DD and TYP subjects and they were excluded from these groups for any evidence of these disorders. DD and TYP children were screened for autism with the Childhood Autism Rating Scale [32] and excluded if they approached the cutoff for autism (≥25 total score). All autistic and DD subjects received testing (cytogenetics or molecular) to exclude FXS.

All subjects were given a battery of measures including the Mullen Scales of Early Learning [33], the Vineland Adaptive Behavior Scales [34], behavioral rating scales (e.g., Repetitive Behavior Scales), and a standardized neurodevelopmental examination to exclude subjects with any notable dysmorphology, evidence of neurocutaneous abnormalities, or other significant neurological abnormalities. Study approval was obtained from both the SU and UNC/Duke Institutional Review Boards, and written informed consent was obtained from a parent (or custodial guardian) of each subject. Table 1 displays the subject characteristics (e.g., age, cognitive ability, and adaptive functioning) of the final study population.

Table 1 Sample characteristics

MRI acquisition

All subjects were scanned on a 1.5 Tesla GE Signa MRI scanner (GE Imagine Systems, Milwaukee, WI) at either Stanford-Lucile Packard Children’s Hospital or Duke-UNC Brain Imaging and Analysis Center (BIAC). Image acquisition was designed to maximize gray/white tissue contrast for the pediatric brain and included: (1) a coronal T1 IR Prepared sequence: TI 300 ms, TR 12 ms, TE 5 ms, 20° flip angle, 1.5 mm thickness, 1 NEX, 20 cm FOV, and 256 × 192 matrix; (2) a coronal PD/T2 2D dual FSE sequence: TR 7200 ms, TE 17/75 ms, 3.0 mm thickness, 1 NEX, 20 cm FOV, and 256 × 160 matrix. A series of localizer scans and a set of phantoms were used to standardize assessments across sites and time (for the longitudinal study).

Subjects with FXS, autism and DD were scanned using sedation administered by a sedation nurse and under the supervision of a pediatric anesthesiologist. Physiological monitoring was conducted throughout the scan and recovery. TYP subjects were scanned without sedation, in the evening, while sleeping. All scans were reviewed by a pediatric neuroradiologist and screened for significant clinical abnormalities (e.g., malformations, lesions, etc.).

Image processing

Our segmentation procedure involved an automated pipeline that utilized a probabilistic spatial prior template (atlas) aligned to each subject MRI using a linear, affine transformation and included bias estimation, inhomogeneity correction, and non-brain stripping procedures. The result was gray, white, and CSF tissue segmented images for each subject. The methods and procedures are described elsewhere [35–39]. Total brain volume (TBV) measures included total gray and white matter and all CSF. Total tissue volume (TTV) included all gray and white matter in the cerebrum and cerebellum. These measures were obtained so that TBV could be used as a covariate in the substructure analyses. Standardized tracing protocols used for each of the substructures we examined are briefly described below. All brain volume measurements were completed by raters in the UNC image processing lab. Reliability was obtained by two raters who made independent measurements on a set of 15 images, which included 5 images repeated 3 times (in random order). We were unable to successfully process some scans for all the regions examined because of insufficient scan quality or artifact. Group comparisons showed no bias (e.g., group, gender, age) in association with scans that could not be successfully segmented.

Caudate nucleus (CN)

The CN was measured on high resolution T1 images in ACPC alignment using a semi-automated 3D segmentation tool (IRIS/SNAP [40, 41]) that employs a user-defined threshold window, initialization, and region-growing parameters. This semi-automated method is more reliable and efficient than a fully manual protocol. The tool automatically finds tissue boundaries and labels the caudate. The segmentation label can then be manually edited as necessary, so that the caudate trace excludes the nucleus accumbens. The average intra-rater reliability was r = 0.97 and average inter-rater reliability was r = 0.96. There were 1 FXS, 2 AUT, 4 DD, and 3 TYP scans that could not be processed successfully, and these were excluded from analyses.

Putamen/Globus Pallidus

The PUT and GP were manually traced on high resolution T1 images in ACPC alignment as a combined structure using the IRIS/SNAP tool. The major boundaries of the combined PUT and GP are the internal and external capsules. A second step involved using designated landmarks to separate the two structures. The average intra-rater reliability for the PUT was r = 0.97 and for the GP was r = 0.93. The average inter-rater reliability for the PUT was r = 0.97 and for the GP was r = 0.83. Due to the difficulty in obtaining manual traces of the PUT/GP, there were 2 FXS, 8 AUT, 8 DD, and 3 TYP cases that were of insufficient quality to be included.

Amygdala (AMY)

The AMY was manually traced on high resolution T1 images aligned along the long axis of the hippocampus using the IRIS/SNAP tool, following a protocol developed by the Center for Neuroscience and the M.I.N.D. Institute at UC Davis [42]. We first established our reliability with the UC Davis group (average inter-rater reliability r = 0.92) to ensure we had been adequately trained on the protocol. Subsequently, reliability was established on scans from our sample of 18–35 month olds. Average intra-rater reliability was r = 0.90, and inter-rater reliability was r = 0.78. A single rater (intra-rater r = 0.90) completed all the AMY traces. There were 1 FXS, 4 AUT, 2 DD, and 4 TYP scans that were of insufficient quality to obtain a valid AMY trace and were excluded.

Hippocampus

The HIP was obtained using a semi-automated tool where the user defines landmarks but the HIP is automatically segmented via a high-dimensional deformation of a template (MOJO [43–46]). The unaligned T1 gray level image is the input image to the MOJO tool. The average intra-rater reliability was 0.95 and average inter-rater reliability was 0.81. There were 3 FXS, 1 AUT, 3 DD, and 3 TYP cases that were excluded because the HIP could not be adequately visualized to perform the segmentation.

Statistical analyses

Descriptive statistics and data plots were first examined to look for anomalous data or outliers. No anomalous data were observed or removed. We used a multivariate analysis (a repeated measures mixed model) in SAS 9.1 to test our a priori hypotheses (see Gueorguieva & Krystal, 2004 [47] for a description of this approach). The brain volume measure was the dependent variable, with diagnostic group as the predictor. Diagnostic group was entered as a 5 level categorical variable (FXS, FXS+Aut, AUT, DD, TYP). A model was fit examining group differences for 5 substructures: AMY, CN, HIP, GP, and PUT. This model included up to 10 observations per subject. All analyses adjusted for the effects of age and IQ ratio on the measured brain volume by including them as covariates. TBV was included in the model to test for disproportionate differences in the subcortical volumes. Data collection site (UNC, SU) was not included as a predictor because no systematic differences in brain volumes (GM, WM, CSF, TTV, and the 5 substructures of interest) were observed between sites.
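To make the "up to 10 observations per subject" concrete, the following is a minimal sketch of a long-format data layout, assuming that the 10 observations correspond to the 5 substructures measured in each hemisphere. It is an illustration only and is not the SAS code used in the study; the function name, field names, and example values are hypothetical.

```python
from itertools import product

# Structure and hemisphere labels follow the abbreviations used in the text.
STRUCTURES = ["AMY", "CN", "HIP", "GP", "PUT"]
HEMISPHERES = ["left", "right"]


def to_long_format(subject):
    """Expand one subject record into one row per available measurement.

    `subject` is a hypothetical dict with keys: id, group, age_months, iq,
    tbv, and volumes, where volumes maps (structure, hemisphere) to cm^3.
    Structures that could not be traced are simply absent from `volumes`.
    """
    rows = []
    for structure, hemisphere in product(STRUCTURES, HEMISPHERES):
        volume = subject["volumes"].get((structure, hemisphere))
        if volume is None:
            continue  # no usable trace for this structure and side
        rows.append({
            "id": subject["id"],
            "group": subject["group"],            # one of the 5 group levels
            "age_months": subject["age_months"],  # covariate
            "iq": subject["iq"],                  # covariate
            "tbv": subject["tbv"],                # covariate
            "structure": structure,
            "hemisphere": hemisphere,
            "volume": volume,                     # dependent variable
        })
    return rows


# Made-up example values, not data from the study.
example = {
    "id": "S001", "group": "FXS", "age_months": 30, "iq": 55, "tbv": 1100.0,
    "volumes": {(s, h): 1.0 for s, h in product(STRUCTURES, HEMISPHERES)
                if s != "AMY"},
}
rows = to_long_format(example)
print(len(rows))  # 8 rows: the two AMY measurements are missing
```

A layout of this kind lets subjects with an unusable trace for one structure still contribute their remaining measurements to a single model.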

Our primary hypotheses focused on three group comparisons (FXS vs AUT vs Controls). For these comparisons, combined estimates for ‘controls’ (DD + TYP) and ‘FXS’ (FXS and FXS+Aut) were created using post-estimation commands to create weighted averages. By using a weighted average of the subgroups the combined group estimates are accurate estimates of the means, while the possible error variance that could be accounted for by mean group differences is minimized. Our secondary analysis included four group comparisons (FXS+Aut, AUT, DD, TYP). However, only a single model was fit to obtain these estimates.
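The combined-group estimates described above can be sketched as a weighted average of subgroup estimates. This sketch assumes the weights are the subgroup sample sizes (19 DD and 31 TYP, as reported in the Sample section); the text does not state which weights were used, and the subgroup means below are made-up values chosen only for illustration.

```python
def weighted_mean(estimates, weights):
    """Return the weighted average of subgroup estimates."""
    if len(estimates) != len(weights):
        raise ValueError("estimates and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(e * w for e, w in zip(estimates, weights)) / total


# Hypothetical adjusted means (cm^3) for the DD and TYP subgroups.
dd_mean, typ_mean = 6.0, 6.5
# Assumption: weights are the subgroup sizes reported in the Sample section.
n_dd, n_typ = 19, 31

control_mean = weighted_mean([dd_mean, typ_mean], [n_dd, n_typ])
print(round(control_mean, 2))  # 6.31
```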

Laterality was assessed by examining the significance of interactions between group and hemisphere. A significant group by hemisphere interaction indicates that the group difference varies significantly by side (left vs right).

Results

A description of the sample (N, age, IQ) appears in Table 1. Group differences were evaluated for age, adaptive functioning from the Vineland Adaptive Behavior Scales-Interview Edition Survey Form [34], and developmental IQ from the Mullen Scales of Early Learning [33]. Age differences were observed, with the TYP group being slightly younger, so age was included as a covariate. Because the mean IQ of the combined (TYP + DD) control group was higher than that of the FXS and AUT groups, IQ was also included as a covariate.

Substructure volumes: comparing FXS with control and autism groups

Age-adjusted means are reported in Table 2 and comparisons for total substructure volumes are reported in Table 3 (adjusted for age, IQ, and TBV). The main effect of group varied between structures (p < .001); therefore, the effect of group is reported individually for each structure.

Table 2 Adjusted mean volumes (cm³) for substructures by group
Table 3 Group* comparisons for selected substructure volumes controlling for age, IQ, and TBV

CN and PUT/GP

Total CN volume was significantly enlarged in the FXS group compared to the control (40%), AUT (26%), TYP (34%), and DD (45%) groups. This pattern was also observed for right and left CN volumes, suggesting no laterality effect. Total PUT volume was significantly enlarged in the FXS group compared to the control (8%), TYP (8%), and DD (9%) groups, and was larger (but not significantly) than in the AUT group (3%). No laterality of the PUT was observed. Volume of the GP was significantly enlarged in the FXS group compared to the control (13%), TYP (10%), and DD (16%) groups. No laterality of the GP volumes was observed.
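For readers interpreting the percentages above, a percent difference between two group means can be computed as in the sketch below. This assumes the reported values are relative differences with the comparison group as the reference; the text does not spell out the formula, and the volumes used here are hypothetical rather than values from Table 2.

```python
def percent_difference(group_mean, reference_mean):
    """Percent by which group_mean differs from reference_mean."""
    if reference_mean == 0:
        raise ValueError("reference_mean must be non-zero")
    return 100.0 * (group_mean - reference_mean) / reference_mean


# Hypothetical volumes (cm^3): a group mean of 7.0 against a reference of 5.0.
print(percent_difference(7.0, 5.0))  # 40.0
```

A negative result indicates a smaller volume than the reference group, matching the sign convention used for the AMY comparisons.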

Amygdala and Hippocampus

FXS subjects had smaller AMY volumes than the control (−7%), DD (−7%), and TYP (−8%) groups, but only the difference with the control group was significant. AMY volume in the FXS group was significantly smaller than in the AUT group (−19%). The right and left AMY volumes followed this same pattern (see Table 3), suggesting no laterality effect. The FXS group had significantly larger HIP volume than the DD group (26%), but showed no significant differences with other groups.

Comparison of FXS subgroups: with and without autism

We identified children in the FXS group who also met criteria for autistic disorder, referred to as FXS with autism (“FXS+Aut”). The children with FXS who did not meet our autism criteria were labeled as FXS without autism (“FXS−Aut”). Membership in the FXS+Aut subgroup required meeting cut-offs for autistic disorder on the ADI-R and the ADOS-G. Using this classification scheme, there were 17 children with FXS (33%) who also met criteria for autistic disorder. This is comparable to the rate of autistic disorder observed in toddlers and preschoolers with FXS reported by Rogers et al. 2001. Age adjusted means for the FXS subgroups were similar to those observed in the total FXS sample. Percent differences in substructure brain volumes for FXS, AUT, and FXS+Aut groups compared to the controls are displayed in Fig. 1. Comparison of the FXS+Aut group to the AUT group revealed significant CN enlargement (p < .001), and significantly smaller AMY (p < .001). The CN enlargement in the FXS+Aut and FXS−Aut groups compared to controls was dramatic in comparison to the AUT group. However, these groups showed opposite trends in their AMY volume, with the AUT group showing enlargement but the FXS groups showing decreased volume compared to controls. There were no significant differences in CN volume between the FXS+Aut and FXS−Aut groups, but both of these groups had significantly larger CN volume compared to the AUT group (see Fig. 2). Volumes of the AMY for the FXS+Aut and FXS−Aut were significantly smaller than in the AUT group, and there was approximately a 5% difference (not significant) between the FXS+Aut and FXS−Aut subgroups (see Fig. 2). As was observed in the overall FXS group comparison, we observed a double dissociation of greatly enlarged CN and small AMY in FXS, regardless of autism status.

Fig. 1 Percent differences in substructure brain volumes for FXS, AUT, and FXS+Aut groups compared to controls. *p < .05; **p < .01; ***p < .001

Fig. 2 Percent differences in caudate and amygdala volumes for children with FXS who either met autism criteria (FXS+Aut) or did not meet criteria (FXS−Aut) compared to children with autism. *p < .05; **p < .01; ***p < .001

Exploratory analyses were conducted to determine whether brain–behavior relationships differed between the FXS+Aut, FXS−Aut, and AUT groups. ADI-R subdomain scores are presented in Table 4. Examination for clinical correlates using global autism measures (ADI-R algorithm subdomain scores; ADOS algorithm domain scores) as well as more refined measures of repetitive behaviors (e.g. RBS-R) did not reveal any significant brain-behavior correlations in any of these groups (FXS+Aut, FXS−Aut, AUT).

Table 4 ADI-R subdomain scores for FXS with autism (FXS+Aut) and Autism (AUT) groups

Discussion

In this study we observed significant differences in the neuroanatomical profiles of male children with autistic disorder with FXS relative to those who did not have FXS. Specifically, boys with both FXS and autism had substantially enlarged CN volume and smaller AMY volume compared to boys with idiopathic autism. In contrast, boys with idiopathic autism (no FXS) had only modest enlargement in their CN volumes compared to controls, but more robust enlargement of their AMY volumes. Although we observed this double dissociation among selected brain volumes, no significant differences in severity of autistic behavior as measured by subdomains of the ADI-R were detected between these two study groups. This study therefore provides evidence of a substantially different pattern of brain structures in two clinical populations with presentations of autistic behavior. The findings suggest that heterogeneity may be underestimated in studies attempting to identify common biological underpinnings in individuals meeting DSM-IV criteria for the behaviorally defined syndrome of autism (e.g., genetic linkage studies). The study of biological mechanisms underlying autistic behavior in etiologically defined subgroups, such as those with FXS, is an important and probably under-employed strategy for dealing with the heterogeneity issue.

The finding of enlarged CN in FXS is consistent with other studies finding enlargement of the CN in FXS [12, 28, 48], and the magnitude of enlargement (∼40%) suggests this is a robust finding in children with FXS. The CN has been implicated in the repetitive behaviors seen in both autism and FXS. One possible cause for such an enlargement of the CN in FXS may be linked to the underlying genetics of the disorder. An association has been reported between a measure of FMR1 gene inactivation (activation ratio, AR) and caudate volume, as well as an association of IQ with both caudate and ventricular volumes [49]. FMRP has also been shown to play a direct role in brain development [50], and decreased FMR1 protein has been associated with the cognitive deficits seen in FXS [51–54]. The significant CN enlargement observed in FXS, in this study almost 3–4 times greater than in controls and about 3 times greater than cases with autism, may be more related to the FXS mutation than to a diagnosis of autism, since the findings in the FXS autistic and non-autistic individuals are the same as those in the overall FXS group, regardless of autism status.

We report decreased AMY volume in our FXS group (with and without autism) and enlarged AMY in our AUT group. Our finding of increased AMY volume in our AUT sample is consistent with other reports of increased AMY in young children with autism, where AMY enlargement has also been associated with deficits in social behavior [21, 55] and social orienting [26]. Our finding of decreased AMY volume in boys with FXS is contrary to some reports of AMY enlargement in FXS [28], but consistent with studies that included very young children [48]. Dalton and colleagues hypothesized that early AMY enlargement is the result of amygdalar hyperactivity and hypertrophy, in response to the aversive nature of social stimulation in autism, with subsequent decreased AMY volume occurring as the aversive stimulation becomes chronic [56, 57]. Here we find support to suggest that children with FXS have decreased AMY despite having social deficits characteristic of autism. On the other hand, children with autism have enlargement of the AMY, yet display the same behaviors. It may be that having an AMY that is either too large or too small represents ‘two sides of the same coin’ in the same way that hyper or hypo function of the MECP2 gene both result in the Rett Syndrome phenotype [58].

We did not find phenotypic differences in autistic behavior, as measured by the ADI-R and ADOS, between our FXS+Aut and AUT groups. The defining features of autism as they appear in the DSM, while perhaps good ways to characterize the most impairing clinical features of autism, are not necessarily the best phenotypic features for separating out the underlying etiologic heterogeneity. Other behavioral features such as hyperarousal, thought to be more characteristic of FXS, may better distinguish groups of autistic individuals with and without FXS and may eventually find a place in studies of the autistic phenotype. Additional studies that seek to identify different behavioral profiles in autistic individuals with and without FXS would provide important clues to meaningfully subsetting the autism phenotype.

In this study, we also found evidence for significantly enlarged CN volume in FXS compared to our control group and to the TYP and DD subgroups. We found significant enlargement in the PUT and GP in the FXS group compared to the controls. This is the first report of significant enlargement in the PUT and GP structures in FXS, and because these structures are part of the fronto-striatal circuit with the caudate, this enlargement provides support for a neuroanatomical abnormality in this pathway in FXS. We also found significant group differences for the HIP volumes between the FXS and DD groups.

There are several limitations in the current study. The size of our subgroups of the controls (DD and TYP) and FXS (with and without autism) was modest. The heterogeneity we observe in idiopathic autism may have also limited the generalizability of the findings in this study. We only examined male children and therefore our findings may not generalize to females. Lastly, the measures we employed for behavioral assessment (ADI-R, ADOS) were developed for categorical diagnosis of autistic disorder and not for the use we made of them here, namely contrasting items and domains. Tools designed for this purpose that have better dimensional qualities and that examine more varied behaviors (e.g., arousal, face processing using eye gaze, etc.) may have revealed behavioral differences that more closely correlated with the neuroanatomical differences observed between the groups. The study also has a number of strengths, most notably the large sample size of our autism and FXS groups within the narrow age range we examined. There is good evidence of age-dependent variation in neuroanatomical structures in autism (albeit indirectly from cross sectional studies [42]) and both FXS and idiopathic autism are well known to be developmental conditions where presentations vary with age.

In conclusion, this study offers a unique examination of early brain development in two behaviorally overlapping disorders, FXS and autism, and finds two distinct patterns of brain morphology. The present study finds distinct neurodevelopmental profiles for these two behaviorally similar disorders and suggests that comparative neuroimaging studies may provide the best window into teasing apart genetically meaningful aspects of the autism phenotype in FXS. These findings also underscore the importance of addressing heterogeneity in studies of autistic behavior. Simply mentioning that it exists and that it may be the cause of a lack of replication may not be enough, as the continued failure to take etiologic heterogeneity into account in autism is likely to continue to handicap our best efforts to find circumscribed genetic and neurobiologic mechanisms underlying this condition. Studies examining the molecular basis for this difference may provide the best approach to getting a foot-hold into the pathogenesis of autistic behavior. For example, mouse studies have revealed that while social deficits qualitatively similar to those seen in autism are present in C57B6 FX (−/−) mice, they are not observed in FX (−/−) mice on an FVB background. Such findings suggest that it is the effect of interacting genes (and perhaps environment) on FMR1 that may have a role in causing abnormal social behavior, and similarly, studying genes interacting with FMR1 in humans with autism may provide clues to the genetic profiles causing autism in this and perhaps even other subgroups of autistic individuals. Working in an iterative fashion, going from genotype to phenotype and back again from phenotype to genotype, may eventually reveal more subtle phenotypic variations, perhaps outside of the traditional defining features of autism, to distinguish underlying etiologically defined subgroups that could be employed to find new autism susceptibility genes.