Introduction

Autism spectrum disorders (ASDs) are heritable, genetically complex neurodevelopmental conditions. In this paper, we search for ASD genes through combined analysis of two datasets collected by the Autism Genome Project (AGP). The AGP previously reported linkage to 11p12–p13 along with notable copy number variations (CNVs) in the largest collection of multiplex ASD families analyzed to date (Szatmari et al. 2007; see also Liu et al. 2008), followed by CNV and association analysis in a large cohort of trios (Anney et al. 2010; Pinto et al. 2010). Here, we use a distinct statistical method, based on the PPL framework (Vieland 1998, 2006; Yang et al. 2005; Vieland et al. 2008; Wratten et al. 2009; Huang and Vieland 2010), to reconsider the combined multiplex and trio data sets.

The PPL statistical framework has three principal advantages in this context. (1) It handles genetic heterogeneity via “sequential updating” across data subsets (Vieland et al. 2001; Huang and Vieland 2001; Govil and Vieland 2008). The posterior evidence from previously analyzed data is carried forward as prior evidence as new data subsets are analyzed, with underlying genetic parameters (allele frequencies, penetrances, and levels of heterogeneity) allowed to vary between subsets. This can be a powerful method for discovering genetic signals arising from even very small subsets of the data, provided only that we have classification variables allowing the division of families (or cases) into relatively more homogeneous groups (Govil and Vieland 2008). (2) The PPL accumulates evidence against genetic hypotheses as well as in favor of them. Thus inspection of subset-specific contributions to the omnibus signal can distinguish among subsets that are supporting the hypothesis and subsets that are actually contributing evidence against it. (3) The PPL permits analysis of multiplex families and trios in a unified manner. The multiplex families provide linkage information, while the trios provide information on allelic associations. Here, we introduce a novel method for genome-wide analysis based on simultaneous use of linkage and association information from two different sets of data.

It is widely accepted that ASD is genetically heterogeneous, but less clear whether clinical features can be used to demarcate more homogeneous subclasses. Familial concordances for specific ASD symptoms are not strong, and there is generally high intrafamilial variability. However, familiality for nonverbal IQ has been reported in several studies (Le Couteur et al. 1996; Silverman et al. 2002; Szatmari et al. 2008; MacLean et al. 1999). This is in line with more recent family and twin studies suggesting that IQ is the most heritable component of the ASD phenotype (Szatmari et al. 2008). Furthermore, subgrouping ASD patients on the basis of IQ has provided the most consistent method for distinguishing patients on a number of dimensions. At the lower end of the IQ range, there is considerable overlap between autistic features and chromosomal syndromes (Xu et al. 2004); epilepsy is more prevalent, and the ratio of females to males approaches unity in contrast to the preponderance of males among higher IQ cases (Amiet et al. 2008). Moreover, there is compelling emerging evidence of considerable etiologic overlap between the clinical classification of intellectual disability, various mental retardation syndromes, and ASD in terms of rare de novo and inherited CNVs (Guilmatre et al. 2009; Bijlsma et al. 2009; Marshall et al. 2008). Indeed, the distinction of “high” and “low” functioning autism is often based on IQ, and indicates groups that differ with respect to associated brain dysfunction, outcome, and response to treatment (Lotspeich et al. 2004; Allen et al. 2001; Stevens et al. 2000).

Here, we accumulate the total, or “omnibus,” evidence across subsets of families characterized by the presence or absence of lower IQ autistic individuals while allowing for the fact that the subsets may differ substantially from one another. We find compelling evidence that, indeed, the lower IQ group appears to be genetically distinct.

Methods

Participants

Multiplex families (N = 1,069), each containing at least 2 individuals diagnosed with autism by the Autism Diagnostic Interview (ADI; Le Couteur et al. 1989) and clinical best estimate, were contributed by 10 sites. (See Szatmari et al. (2007) for additional details. Note that some “sites”, or research groups, covered multiple data collection locations; however, sample sizes precluded further subdivision of the data.) IQ was recorded as a dichotomous trait. Families were classified as lower IQ (LIQ, N = 255) if at least one ASD individual had performance IQ ≤ 50 or was coded as “missing due to low functioning”; as normal IQ (NIQ, N = 580) if all ASD individuals had IQ > 50; and as missing IQ (MIQ, N = 234) if there were no lower IQ individuals and at least one affected individual missing IQ information.

Trios (N = 1,129) were contributed by 8 sites. (See Anney et al. (2010) for additional details. Again, some sites covered multiple data collection locations.) Children met criteria for either autism or ASD based on ADI and ADOS criteria. A trio was classified as LIQ (N = 285) if the child had performance IQ ≤ 70 or was coded as “missing due to low functioning”, as NIQ (N = 394) if the child had IQ > 70, and as MIQ (N = 450) if the IQ information was missing. Changes in IQ classification by the AGP over time have led to the slight difference in IQ classification compared to the multiplex families. However, IQ is used only to subdivide the sample into relatively more homogeneous subsets, not as an outcome variable in its own right, and this is therefore unlikely to appreciably affect the results. Note too that the proportion of LIQ families is similar (25% vs. 24%) in the trios and multiplex families, respectively, suggesting that the change in cutoff might actually appropriately compensate for differences between the two datasets. Of the trios, 283 overlapped with the multiplex families, but only a single case from each overlapping family was used in the LD analyses; thus there is no overlap in the information extracted from the overlapping samples. All trios were of European ancestry (Anney et al. 2010). All sites had Institutional Review Board approval for this study, and the research was conducted in accordance with the World Medical Association Declaration of Helsinki (2000). Written informed consent was obtained from all subjects after the study had been fully explained.
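The IQ grouping rules for multiplex families and trios can be summarized in a short sketch. This is an illustration only, not the AGP's actual pipeline: the function name, the `LOW_FUNCTIONING` marker, and the use of `None` for missing IQ are assumptions made for the example. The cutoff is passed in explicitly because the two datasets use different values (50 for multiplex families, 70 for trios).

```python
# Hypothetical marker for an IQ coded as "missing due to low functioning".
LOW_FUNCTIONING = "low_functioning"


def classify_iq_group(iqs, cutoff):
    """Assign a family (or trio) to LIQ, NIQ, or MIQ.

    iqs: one entry per affected individual; each entry is a number,
         LOW_FUNCTIONING, or None (IQ missing for other reasons).
    cutoff: performance IQ threshold (50 for multiplex families,
            70 for trios, as described in the text).
    """
    def is_low(value):
        if value == LOW_FUNCTIONING:
            return True
        return isinstance(value, (int, float)) and value <= cutoff

    # Any lower-IQ individual makes the whole family LIQ.
    if any(is_low(v) for v in iqs):
        return "LIQ"
    # No lower-IQ individual, but at least one missing value: MIQ.
    if any(v is None for v in iqs):
        return "MIQ"
    # Every affected individual has a recorded IQ above the cutoff.
    return "NIQ"


# A multiplex family with one affected member at IQ 45 is LIQ;
# a trio whose child has IQ 60 is LIQ under the trio cutoff of 70.
print(classify_iq_group([45, 90], cutoff=50))
print(classify_iq_group([60], cutoff=70))
```

Note that an IQ of 60 would count as NIQ under the multiplex cutoff but LIQ under the trio cutoff, which is the difference in classification discussed above.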

Genotyping and data cleaning

Details of genotyping methods are given in Szatmari et al. (2007) and Anney et al. (2010). In preparation for linkage analysis, marker data were cleaned for family structure problems and Mendelian inconsistencies. Merlin (Abecasis et al. 2002) was run to detect and remove unlikely double recombinants, and to cluster any SNPs in LD groups. (Most parents were genotyped and LD in the marker map proved not to affect the results.) In preparation for LD analyses, marker data were additionally cleaned for marker missingness (>5%), sample missingness (>5%), and excess Mendel errors both by SNP and individual. Markers with minor allele frequency <1% were dropped, as were SNPs with a Hardy–Weinberg (HW) p value < 1 × 10⁻¹⁰ in at least one data subset or HW p value < 0.05 in at least three subsets. After cleaning, 749,933 SNPs remained in the analyses.
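The per-SNP filters used for the LD analyses can be written as a single predicate. The sketch below is illustrative: the function and argument names are assumptions, and the sample-missingness and Mendel-error filters are omitted because they operate on individuals or depend on thresholds not stated here.

```python
def keep_snp(missing_rate, maf, hw_pvalues):
    """Return True if a SNP passes the marker-level filters.

    missing_rate: fraction of genotypes missing for this SNP.
    maf: minor allele frequency.
    hw_pvalues: Hardy-Weinberg p values, one per data subset.
    """
    if missing_rate > 0.05:                      # marker missingness > 5%
        return False
    if maf < 0.01:                               # minor allele frequency < 1%
        return False
    if any(p < 1e-10 for p in hw_pvalues):       # extreme HW departure in any subset
        return False
    if sum(p < 0.05 for p in hw_pvalues) >= 3:   # nominal HW departure in >= 3 subsets
        return False
    return True


# Two nominal HW departures are tolerated; a third removes the SNP.
print(keep_snp(0.01, 0.20, [0.04, 0.03, 0.50]))
print(keep_snp(0.01, 0.20, [0.04, 0.03, 0.02, 0.50]))
```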

Statistical methods

All analyses were conducted using the software package Kelvin (Huang et al. 2006), which implements the PPL class of models for measuring the strength of genetic evidence (Vieland 1998, 2006). The two specific statistics employed were the PPL itself (posterior probability of linkage) and the posterior probability of trait-marker linkage disequilibrium (PPLD). Linkage analyses utilized LOD scores computed in Merlin (Abecasis et al. 2002; Lander and Green 1987) as input to PPL calculations (Vieland 1998). The genetic map is based on http://compgen.rutgers.edu/mapopmat (Matise et al. 2007; release 10/09/06).

The PPL as applied here is parameterized as a dichotomous trait model with parameters α (the admixture parameter of Smith (1963), representing the proportion of ‘linked’ pedigrees), p (the disease allele frequency), and the penetrance vector f_i, representing the probability that an individual with genotype i develops disease, for i = 1, 2, 3. All trait parameters are integrated out of the final statistic, using uniform prior distributions, implicitly allowing for dominant, recessive, and additive models along with intra-subset heterogeneity. This provides a robust approximation for mapping complex traits in terms of the marginal model at each locus, and because the parameters are integrated out, no specific assumptions regarding their values are required. The likelihood also contains two location parameters: the recombination fraction θ and the standardized LD parameter D′, representing trait–marker association due to physical proximity.
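To make "integrating out" a trait parameter concrete, the following sketch averages an admixture likelihood ratio over a uniform grid on α at a fixed θ. It is a deliberately reduced illustration and not Kelvin's implementation: only α is integrated (the full model also integrates over p and the penetrances), the per-family likelihood ratios are hypothetical inputs, and a simple equally weighted grid stands in for the numerical integration.

```python
def admixture_lr(alpha, family_lrs):
    """Admixture likelihood ratio at a fixed theta.

    Each family is 'linked' with probability alpha (contributing its own
    likelihood ratio) and 'unlinked' with probability 1 - alpha
    (contributing a ratio of 1).
    """
    result = 1.0
    for lr in family_lrs:
        result *= alpha * lr + (1.0 - alpha)
    return result


def integrated_lr(family_lrs, n_grid=101):
    """Average the admixture LR over a uniform prior on alpha in [0, 1]."""
    alphas = [k / (n_grid - 1) for k in range(n_grid)]
    return sum(admixture_lr(a, family_lrs) for a in alphas) / n_grid


# Hypothetical per-family likelihood ratios at one value of theta.
# No value of alpha has to be chosen in advance.
print(integrated_lr([3.0, 0.8, 2.5, 0.6]))
```

Because the result is an average over α rather than a maximum, no particular proportion of linked families needs to be assumed.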

The PPL framework accumulates evidence across data subsets by integrating the trait parameters out of the likelihood separately for each subset, using Bayesian sequential updating to combine the marginal information regarding θ and D′ across subsets. This procedure allows for genetic differences among data subsets, and is far more robust in retaining true signals originating from individual subsamples than analyses that simply combine subsets for a single analysis (Vieland et al. 2001; Huang and Vieland 2001; Govil and Vieland 2008). Here, we have subdivided the data and sequentially updated across IQ groups. Because the AGP families have been contributed by multiple research groups, we also sequentially update over “site.” Sites can vary with respect to the populations from which they recruit, ascertainment strategies and criteria, and subtle differences in clinical evaluations; simple sampling variability can also lead to inter-site differences. While not usually considered as a separate source of variation in genetic studies, the importance of allowing for site effects has been long appreciated in other settings, such as clinical trials. After dividing by IQ and site, subset sizes ranged from N = 20–148 (mean = 62) for the multiplex families and N = 20–169 (mean = 71) for the trios.
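Sequential updating across subsets can be sketched as follows. This is a simplified illustration under stated assumptions, not the Kelvin algorithm: each subset is assumed to supply an already-integrated likelihood ratio at each point of a shared θ grid, the grid points are weighted equally, and the function name and input layout are hypothetical.

```python
def posterior_probability(prior, subset_ratios_by_theta):
    """Combine per-subset evidence and return a posterior probability.

    prior: prior probability of linkage (0.02 in the text).
    subset_ratios_by_theta: one list per data subset; each list holds that
        subset's integrated likelihood ratio at every theta grid point.
        Trait parameters were integrated out separately within each subset,
        so they are free to differ between subsets.
    """
    n_theta = len(subset_ratios_by_theta[0])
    combined = [1.0] * n_theta
    # Sequential update: the running product after each subset acts as
    # the prior evidence for the next subset.
    for ratios in subset_ratios_by_theta:
        combined = [c * r for c, r in zip(combined, ratios)]
    average = sum(combined) / n_theta  # equal weight on each theta value
    return prior * average / (prior * average + (1.0 - prior))


# One subset supports linkage, one is neutral, one gives evidence against.
subsets = [[12.0, 6.0, 1.5], [1.0, 1.0, 1.0], [0.6, 0.8, 1.0]]
print(posterior_probability(0.02, subsets))
```

A subset with ratios below 1 pulls the posterior down and one with ratios above 1 pulls it up, so the contribution of each site and IQ group can be inspected separately.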

The PPL is on the probability scale, and its interpretation is therefore straightforward: e.g., PPL = 40% means that there is a 40% probability of a trait gene at the given location based on these data. Based on earlier calculations (Elston and Lange 1975), the prior probability at each location is set to 2%, so that PPLs > 2% indicate (some degree of) evidence in favor of a trait gene at that locus, while PPLs < 2% represent evidence against the location. The prior probability of LD given linkage (L) is also set to 0.02, so that in the absence of prior linkage information P(L&LD) = 0.02 × 0.02 = 0.0004 (see also Wellcome Trust Case Control Consortium 2007 for justification of a comparable figure).

Novel here is a mathematically rigorous method for using linkage information from the multiplex families to inform the association analyses, based on the fact that PPLD = PP(LD|L) × PPL (see Huang and Vieland 2010 for additional details). We interpolated the PPL results onto the physical map, and inserted the measured PPL into this equation. Thus, PPLs < 2% will depress PPLDs, and PPLs > 2% will increase PPLDs, by increasing the prior probability of LD under a linkage peak, up to a maximum of 2% prior probability of LD when PPL = 1 (see Roeder et al. (2007) for a related approach). This assumes that at least some ASD genes are etiologically relevant to both the multiplex and trio sets.

The PPL and PPLD are measures of statistical evidence, not decision-making procedures; therefore, there are no “significance levels” associated with them and they are not interpreted in terms of associated error probabilities (Royall 1997; Vieland and Hodge 1998). By the same token, no multiple testing corrections are applied to the PPL or PPLD, just as one would not “correct” a measure of the temperature made in one location for readings taken at different locations (Vieland 2006). Nevertheless, it may assist readers to have some sense of scale relative to more familiar frequentist test statistics. In simulations of 10,000 replicates of sets of 1,000 affected sib-pairs under the null hypothesis (no linkage), PPLs of 5%, 25%, and 80% were associated with type 1 error probabilities of 0.00128, 0.00002, and <0.00001, respectively. In 10,000 null (no linkage, no LD) replicates of sets of 1,000 trios, no PPLDs > 1% were observed, while PPLD > 0.1% occurred in just 0.04% of replicates. At a locus with PPL < 2%, this represents a PPLD < 0.1%; while at a locus with PPL = 80% this would still only correspond to a PPLD of 3.9%.
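The way the measured PPL rescales the LD analysis can be illustrated with a small calculation. The sketch assumes that the linkage-informed prior probability of LD is 0.02 × PPL and that the trio evidence enters as a single Bayes factor on the odds scale; the function names are hypothetical and Kelvin's actual computation may differ in detail. Under these assumptions the calculation reproduces the 3.9% figure quoted above.

```python
PRIOR_LD_GIVEN_L = 0.02  # prior probability of LD given linkage (from the text)


def to_odds(p):
    return p / (1.0 - p)


def to_prob(odds):
    return odds / (1.0 + odds)


def ppld(ppl, ld_bayes_factor):
    """Posterior probability of LD when the prior is scaled by the PPL."""
    prior = PRIOR_LD_GIVEN_L * ppl
    return to_prob(to_odds(prior) * ld_bayes_factor)


def implied_bayes_factor(ppld_value, ppl):
    """Bayes factor that would yield ppld_value at a locus with this PPL."""
    prior = PRIOR_LD_GIVEN_L * ppl
    return to_odds(ppld_value) / to_odds(prior)


# Trio evidence that gives PPLD = 0.1% when the PPL sits at its 2% prior...
bf = implied_bayes_factor(0.001, 0.02)
# ...gives a PPLD of about 3.9% when the same evidence falls under a PPL of 80%.
print(round(ppld(0.80, bf), 3))  # 0.039
```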

It is also of interest to consider “power” in the trio sample in particular. For relative risk (RR) of 1.3–1.7, our ability to detect association in regions lacking evidence of linkage is low; e.g., for RR = 1.7, PPLDs > 10% occur just 10% of the time. However, LD under linkage peaks is expected to be considerably stronger. For RR = 2.0, with PPL = 80%, 91% of PPLDs are > 29% and 59% of PPLDs are >82%; with RR = 2.5 99.6% of PPLDs are >82%. (Here we generated data with disease and marker minor allele frequencies of 0.1, varying D′ and the penetrances to achieve different RRs; actual power can obviously deviate from these results.) Thus, we are unable to draw definitive conclusions regarding absence of LD in unlinked regions of the genome based on the current sample size. However, the sample size appears adequate for detection of moderate allelic effects under linkage peaks.

Results

Omnibus linkage analysis

Figure 1a shows genome-wide PPL results for the omnibus (all groups) analysis. 92.6% of the genome showed evidence against linkage, 97.4% of the genome had PPLs < 5%, and 98.7% of PPLs were <10% (99.6% ignoring chromosome 11, which shows several broad peaks). Against this backdrop, several peaks stand out. Two peaks on chromosome 11 coincide with locations reported in the two previous AGP analyses of this dataset (PPL = 60%@11p13; PPL = 93%@11p15.2). Also noteworthy is the very high PPL = 87% on 16q21, as well as several other peaks including: 2p25 (PPL = 12%), 4q31 (PPL = 33%), 6q14 (PPL = 11%), 18q22 (PPL = 18%), and possibly two additional peaks on 11p15 and 11q14, which are more moderate in size although still salient relative to the background. We note that the detection of multiple loci in this dataset is attributable largely to the PPL’s use of sequential updating. For instance, if we simply “pool” all sites and IQ groups together for a single analysis, on 16q21, the PPL at the peak is just 4%, compared to 87% based on sequential updating.

Fig. 1

Genome-wide linkage analyses in a omnibus, b LIQ, c MIQ, and d NIQ groups. The PPL (posterior probability of linkage) represents the probability of an ASD gene at each position. The x-axis represents chromosomes 1–23 (X) on the Kosambi cM scale; the y-axis is on the probability scale. The horizontal line at PPL = 0.02 corresponds to the prior probability of linkage. Values below this line represent evidence against linkage, while values above the line represent evidence for linkage, at the given position

Linkage analysis by IQ group

Plotting the IQ groups separately (Fig. 1b–d), we see that the linkage plots suggest substantially different genetic profiles, with peaks occurring at different positions and more peaks in the LIQ group than the NIQ group. Notably, in several cases in which one IQ group gives evidence in favor of linkage, the other IQ group is actually giving evidence against linkage across the region. For example, the NIQ group gives PPL < 2% across the entire region surrounding the peak on 16q21 in which the LIQ group is giving evidence for linkage.

In this context, the MIQ group serves as a kind of control. Combining data from two genetically distinct groups in a single analysis tends to attenuate linkage signals (Govil and Vieland 2008). Thus, if the LIQ and NIQ groups differ in their underlying genetic etiology, then the MIQ group, presumably comprised of a mixture of LIQ and NIQ families, should produce smaller linkage signals overall. On the other hand, if the appearance of two distinct genomic patterns comparing the LIQ and NIQ groups were the result of random variations rather than true genetic differences, the larger MIQ group would be expected to yield larger linkage peaks, perhaps in separate locations. The observed pattern in the MIQ group corroborates the interpretation of these graphs as indicating that IQ is indeed demarcating genetically different subsets of the data.

The linkage signals on 1q31.3, 13q22.1, 14q24.2, and 16q21 are clearly driven by one IQ group in particular (with the other giving evidence against linkage), and in three of the four cases it is the LIQ group that is driving the signal. The peaks on 11p13 and 11p15.2 are more difficult to parse: on the one hand, the omnibus PPLs are higher than the PPLs from either the LIQ or the NIQ subset; on the other hand, Fig. 1 strongly suggests that there are multiple loci on this chromosome, and possibly distinct genes operating in the two IQ groups (see below), which is consistent with the absence of appreciable signals from the MIQ group. Note too the small but visible omnibus signal on 15q11.2 (PPL = 4%), which rises to PPL = 14% in the NIQ group. This signal is directly over the known Prader–Willi ASD locus (van der Zwaag et al. 2010; Vorstman et al. 2006).

Omnibus combined linkage and association analysis

Figure 2a shows omnibus PPLD results. Against a very clean background, two modest peaks stand out. These occur at rs11603469 (11p15.2, PPLD = 26%) and rs10221112 (16q21, PPLD = 15%). In both cases, surrounding SNPs are also giving PPLDs elevated above the baseline (prior) probability of LD. On 11p15.2, rs11603469 is one of a small cluster of SNPs showing some LD evidence and overlapping the gene FAR1 (rs11603469 itself is 10 kb from the FAR1 start site); on 16q21 the SNP falls 351 kb from the nearest annotated gene (GOT2). A third, smaller, LD signal stands out on 4q31.23 (PPLD = 6% at rs7668351, which falls in BC031092). In each case, these SNPs fall directly under corresponding linkage peaks (Fig. 3). It is noteworthy that in each case, multiple data subsets (sites) support LD, but also, multiple sites give evidence against LD, and some are merely neutral. In situations where allelic effects may vary across strata, pooling data across strata will tend to wash out these types of signals.

Fig. 2

Genome-wide combined linkage and association results from a omnibus, b LIQ, and c NIQ analyses. The PPLD (posterior probability of LD) represents the probability of allelic association with ASD due to LD for each SNP in turn, and utilizes both linkage information from the multiplex families and association information from the trios. The x-axis represents the physical map for chromosomes 1–23 (X); the y-axis is on the probability scale. An additional 151 markers from the pseudoautosomal region of X are not shown on the graph; none had PPLD exceeding the prior probability of LD

Fig. 3

Omnibus PPL and PPLD for chromosomes a 4, b 11, c 16. Units on the x-axis are in cM

Combined linkage and association analysis by IQ group

Because the linkage results strongly suggest distinct etiology in the LIQ and NIQ groups, it is also of interest to consider the two groups separately in the association analyses. As expected, different SNPs are salient in the two groups (Fig. 2b–c). In general, the LIQ plot is slightly noisier than the NIQ plot, with a smaller maximum peak and more “chatter” at the bottom of the plot. In part, this is consistent with the smaller sample size. However, “power” is not merely a function of sample size, but also of the underlying genetic model. The LIQ multiplex family dataset is also smaller than the multiplex NIQ dataset, yet the linkage signals are more numerous and higher in the LIQ group. The different pattern observed for the PPLD analyses may therefore be revealing real differences in the underlying genetic architecture, and not just reflecting relative sample sizes. We return to this point below.

Table 1 shows all PPLDs ≥ 10% from the separate LIQ and NIQ analyses. Compared with the omnibus results, on 11p15.2, the omnibus signal in FAR1 is driven by the NIQ group (maximum PPLD = 32%). On 16q21, the omnibus signal is driven by the LIQ group, which on its own gives a PPLD = 7%, bolstered by a small signal from the MIQ group (not shown); none of these SNPs falls in an annotated gene. Some additional signals also appear in the subgroup analyses that were not salient in the omnibus results (see Fig. 4; this figure also shows the distinct genetic linkage patterns on chromosome 11). On 8q21.12 (LIQ, not in an annotated gene), a pair of SNPs is showing evidence of LD in a region not showing evidence of linkage (the second SNP, rs7007634, has PPLD = 8%). Additional association signals from the separate analyses are found on 3p12.1 (NIQ) and Xq13.1 (NIQ, with no clear difference between males and females) and 16p13.2 (LIQ).

Table 1 All PPLDs ≥ 10% from the LIQ, NIQ analyses
Fig. 4

PPL and PPLD for LIQ, NIQ groups respectively, for chromosomes a 3, b 8, c 11, d 16, e X containing SNPs shown in Table 1. Units on the x-axis are in cM

Discussion

These analyses represent an examination of the AGP data from a unique statistical perspective. In contrast to the original analyses of the multiplex families (Szatmari et al. 2007), we have found multiple strong linkage signals. Disappointingly, however, PPLD analysis failed to find strong evidence of allelic effects under the linkage peaks, which would point us to the individual genes driving the linkage results. (We note, however, that follow up molecular work focused on one of the linkage peaks has established a strong prima facie case for involvement of the gene CDH8 (Pagnamenta et al. 2011).) The apparent absence of allelic effects could reflect a genuine absence of LD under the peaks, or limitations in 1 M coverage of the peaks for LD mapping purposes. Another possibility is that there is sufficient heterogeneity that the trios are simply too dissimilar to the multiplex families to be informative at the same genes. The absence of dense SNP array data in the majority of the multiplex families makes direct evaluation of this possibility difficult.

It is also important to keep in mind that the trio sample is still relatively small, and in particular, the LIQ and NIQ groups individually may be too small to provide strong evidence on their own. The AGP is currently completing a second phase of trio data collection and genotyping, which will effectively double the sample size, and sequentially updating with the new dataset will provide better differentiation between SNPs truly supporting LD and SNPs with evidence against LD.

However, the overall pattern of results might reflect heterogeneity between the IQ groups rather than sample size. The linkage analysis finds more loci in the LIQ analyses than the NIQ analyses, despite the fact that the multiplex NIQ sample is 2.3 times the size of the multiplex LIQ sample; while in the LD analyses, where the sample sizes are better matched (the NIQ trio sample is just 1.4 times as large as the LIQ trio sample), the strongest signals are found in the NIQ group. Linkage analysis is powerful for identifying relatively major effects, that is, those in which mutations at a single locus greatly increase disease risk, even if only in a small subset of cases or against specific genetic and environmental backgrounds. Association analysis is particularly powerful for detecting alleles that individually confer small effects on disease risk, but do so in a relatively homogeneous manner across the study population. Thus, the two sets of results can be interpreted as telling a complementary story. The LIQ families may represent more strongly “genetic” forms of disease, in which a single gene or a small number of genes cause the disorder in any given individual, with sufficient overlap in causal genes across families to permit linkage mapping. The NIQ families, on the other hand, could involve more of a spectrum of conditions, possibly more highly influenced by the accumulation of variants in multiple genes each of smaller effect, or perhaps simply involving even higher levels of heterogeneity and/or many private mutations.

Of course, until more data are available, this remains highly speculative. Further work to fully characterize the distinction between the LIQ and NIQ groups, combined with additional genetic analyses, will be needed to refine and test this hypothesis. But the results obtained thus far require us to at least consider the possibility that subtypes of autism have distinct genetic architectures. This means that no single study design or experimental approach is likely to be optimal for all subtypes, and that we must be prepared for disparate results across different types of studies, or across data sets comprising different mixtures of subtypes. This point almost certainly applies to other complex disorders as well.

Finally, it is interesting to note that the association signal on 16p13.2, which does not fall under a PPL linkage peak, does fall within a linkage interval previously reported in a subset of AGP families, using a very different approach to untangling clinical heterogeneity based on latent class modeling (Bureau et al. 2008). Thus, the PPLD may be indicating a true association, but at a locus that our linkage analysis lacked power to detect, given the particular phenotypic classifications used here.