Overlaps Between Autism and Language Impairment: Phenomimicry or Shared Etiology?
- Cite this article as:
- Bishop, D.V.M. Behav Genet (2010) 40: 618. doi:10.1007/s10519-010-9381-x
Traditionally, autistic spectrum disorder (ASD) and specific language impairment (SLI) are regarded as distinct conditions with separate etiologies. Yet these disorders co-occur at above chance levels, suggesting shared etiology. Simulations, however, show that additive pleiotropic genes cannot account for observed rates of language impairment in relatives, which are higher for probands with SLI than for those with ASD + language impairment. An alternative account is in terms of ‘phenomimicry’, i.e., language impairment in comorbid cases may be a consequence of ASD risk factors, and different from that seen in SLI. However, this cannot explain why molecular genetic studies have found a common risk genotype for ASD and SLI. This paper explores whether nonadditive genetic influences could account for both family and molecular findings. A modified simulation involving G × G interactions obtained levels of comorbidity and rates of impairment in relatives more consistent with observed values. The simulations further suggest that the shape of distributions of phenotypic trait scores for different genotypes may provide evidence of whether a gene is involved in epistasis.
Keywords: Autism, Specific language impairment, Comorbidity, Epistasis
Specific language impairment (SLI) refers to a condition where a child fails to develop spoken language on the normal schedule, for no obvious reason (Bishop and Norbury 2008). Potential causes such as hearing loss, low general ability, or physical impairment of articulators are excluded. Development in areas such as skills of daily living and nonverbal ability is age-appropriate. Autism is also excluded, and the textbook picture of SLI is of a child with normal social interaction and nonverbal communication, but with specific difficulties in mastering structural aspects of language, especially syntax (grammatical devices such as word order and inflectional endings) and phonological skills (identification and production of speech sounds). Autistic disorder also involves impairments of communication, but these are much broader, affecting pragmatics, i.e., the appropriate use of language in context, as well as nonverbal communication. In addition, there are impairments in social interaction and understanding, and the repertoire of behaviour and interests is unusual and often restricted (Dover and Le Couteur 2007). The diagnostic manuals ICD-10 and DSM-IV (American Psychiatric Association 1994; World Health Organization 1993) use different terminology, but both make a clear diagnostic distinction between a specific developmental disorder affecting language and the more pervasive difficulties seen in autism.
The past few decades have seen two major changes in our conceptualisation of the etiology of both autism and SLI. The first breakthrough was the abandonment of purely environmental explanations for these conditions as it became clear from twin studies that they were both highly heritable. Several studies showed that identical, monozygotic (MZ) twins were significantly more concordant than fraternal, dizygotic (DZ) twins for autism (Bailey et al. 1995; Folstein and Rutter 1977; Hallmayer et al. 2002; Steffenburg et al. 1989) and SLI (Bishop et al. 1995; Bishop and Hayiou-Thomas 2008; Lewis and Thompson 1992; Tomblin and Buckwalter 1998), despite growing up together and sharing many environmental influences. The next shift in understanding was from a focus on single genes of large effect to an etiological model of these conditions as complex and multifactorial, resulting from the influence of many genes of small effect, combined with environmental influences (see review by Bishop 2009). There were several reasons for this development. First, the high heritability estimates from twin studies did not translate into discoveries of common single gene mutations of large effect, as might have been expected. In the field of SLI, discovery of a mutation of the FOXP2 gene in one multigenerational family led to a fascinating series of studies exploring the gene’s evolution and mode of action (see Fisher, 2007, for review and Konopka et al. 2009 for more recent work), but it has become clear that it is a rare cause of speech and language impairments (Newbury et al. 2002). Second, pedigree studies showed that in both autism and SLI, rates of impairment in first degree relatives are higher than in the general population, but it is unusual to observe a classic Mendelian pattern of inheritance (Lewis et al. 1993; Rutter 2005a). In short, these conditions aggregate but do not segregate (Sing and Reilly 1993). 
Third, family studies indicated that first degree relatives of affected individuals often manifest subthreshold symptoms, such as subtle phonological difficulties in relatives of children with SLI (Barry et al. 2007), or mild social and communicative difficulties in relatives of those with autism (see review by Bailey et al. 1998). This suggested that these conditions correspond to points on a continuum of impairment, rather than all-or-none diseases. Finally, studies in clinical medicine led to growing awareness that complex multifactorial etiology is the rule rather than the exception for disorders that are common in the general population, consistent with arguments by evolutionary scientists that genetic variants that had large effects on reproductive fitness in ancestral humans would be unlikely to persist in modern humans (Keller and Miller 2006).
The cutoff on the causal trait will determine the frequency of disorder. The prevalence of SLI depends on the operational criteria that are used; an epidemiological study in the US estimated prevalence as 7% in kindergarten (Tomblin et al. 1997). This prevalence corresponds to a z-score threshold of around −1.5. Estimates of the prevalence of autism have mounted steadily (Rutter 2005b), from 4 per 10 000 in the 1960s up to 38.9 per 10 000 in a recent epidemiological survey in the UK (Baird et al. 2006). This is usually thought to reflect changing diagnostic criteria (King and Bearman 2009), although a genuine increase cannot be ruled out. In addition, autism is now regarded as a spectrum disorder rather than an all-or-none disease, with milder and partial forms of disorder being labelled as cases of ‘autism spectrum disorder’ (ASD) or pervasive developmental disorder not otherwise specified. When these cases are included, the prevalence rises to 116.1 per 10 000 (Baird et al. 2006), with a corresponding z-score threshold of around −2.3. Nevertheless, even this estimate is lower than the prevalence of SLI and the cutoffs are shown as different in Fig. 1 to reflect this. In simulations discussed below we use the latter prevalence rate, and hence refer to ASD rather than autism.
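The step from a prevalence to a cutoff on a standardized liability distribution can be checked directly. The following is a minimal sketch, assuming a standard normal liability; the function name z_threshold is hypothetical and is not taken from the author's program.

```python
from statistics import NormalDist

def z_threshold(prevalence):
    """z-score below which a proportion `prevalence` of a standard normal lies."""
    return NormalDist().inv_cdf(prevalence)

# Prevalence figures quoted in the text
sli_z = z_threshold(0.07)               # SLI, 7% (Tomblin et al. 1997)
autism_z = z_threshold(38.9 / 10_000)   # autism (Baird et al. 2006)
asd_z = z_threshold(116.1 / 10_000)     # ASD (Baird et al. 2006)

for label, z in [("SLI", sli_z), ("autism", autism_z), ("ASD", asd_z)]:
    print(f"{label}: z = {z:.2f}")
```

Rounded to one decimal place, the SLI and ASD values agree with the thresholds of around −1.5 and −2.3 given above; the rarer the condition, the more extreme the cutoff.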
The neat divide between independent disorders, shown in Fig. 1, has been questioned in recent years. It has been argued that the conditions of ASD and SLI may be less distinct than the textbooks imply. There are three lines of evidence that any etiological model has to account for: (a) apparently above chance levels of comorbidity between SLI and ASD; (b) rates of language impairment in relatives of probands with SLI and ASD; (c) molecular genetic findings of shared genetic risk factors for ASD and SLI.
Comorbidity between SLI and ASD
According to conventional diagnostic frameworks, SLI and ASD are mutually exclusive diagnoses—ASD is explicitly excluded when making a diagnosis of SLI, which is, by definition, a ‘specific’ developmental disorder. From this perspective it does not make sense to talk of comorbidity. Nevertheless, diagnostic frameworks do not necessarily reflect clinical reality, and there has been interest over many years in the idea that there might be overlapping language deficits in the two conditions.
In discussing overlaps between SLI and ASD, it is important to distinguish between different aspects of communication. On the one hand, children need to master the structural aspects of their language—phonology and syntax. These are the domains that are most often noted to be impaired in SLI. On the other hand, children need to use those skills to communicate with others—pragmatics. Although there are exceptions, most formal language tests focus either on vocabulary or structural aspects of language, but do not assess how effectively language is used to communicate in everyday situations. The conventional view of SLI maintains that pragmatic skill is intact and the child may communicate reasonably despite having limited structural language skills (Bishop 2000).
As demonstrated in a landmark study by Bartak et al. (1975), many children with ASD are poor at both structural and functional aspects of communication. These authors compared children with severe receptive SLI (termed ‘developmental dysphasia’), and children with autism. They documented similarities for the two groups on language milestones and measures of language structure, but striking differences in the functional use of language (Cantwell et al. 1978). Children with autism had much broader communicative difficulties than those with SLI, extending to encompass nonverbal as well as verbal communication.
Nevertheless, while poor functional communication is a hallmark of ASD, not all cases are impaired on formal language tests. Kjelgaard and Tager-Flusberg (2001) noted substantial variation in language abilities in a group of 89 children with autism. On a wide-ranging battery of language tests commonly used to diagnose SLI, 76% of them performed in the impaired range. The remaining 24% had no evidence of structural language deficits. In an epidemiological sample, Loucas et al. (2008) found that 41 of 72 (57%) children with autism and normal nonverbal IQ had impaired performance on a language battery. Subsequently, Tager-Flusberg and Joseph (2003) noted that many children with ASD were particularly poor at repeating nonsense words, a measure that has been proposed as a marker of heritable SLI (Bishop et al. 1996). Furthermore, these children tended to make similar morphosyntactic errors to those seen in SLI, i.e., omission of verb inflectional endings.
These studies indicate that structural and pragmatic language deficits are logically separable, but often co-occur. Tager-Flusberg and Joseph (2003) argued that the existence of cases of ASD whose language features resembled those of SLI suggested overlaps between these disorders at a deeper level. In this paper, a distinction is drawn between pure ASD, SLI and the apparently ‘comorbid’ cases, referred to here as ASD+LI, who have classic autism together with language impairments of the kind seen in SLI.
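To make 'above chance' concrete, the rate of language impairment expected among children with ASD if the two disorders were independent can be compared with a reported rate. This is a rough sketch only: it assumes that the 7% SLI prevalence and the criteria used by Loucas et al. (2008) are comparable, which may not hold, and the variable names are illustrative.

```python
p_sli = 0.07              # SLI prevalence (Tomblin et al. 1997)
p_asd = 116.1 / 10_000    # ASD prevalence (Baird et al. 2006)

# Under independence, the joint rate is the product of the two prevalences
chance_comorbid = p_sli * p_asd
# ...so the rate of LI among ASD cases is just the population rate of LI
chance_li_given_asd = chance_comorbid / p_asd

# Reported rate: 41 of 72 children with autism and normal nonverbal IQ
observed_li_given_asd = 41 / 72
excess = observed_li_given_asd / chance_li_given_asd

print(f"chance comorbidity: {chance_comorbid * 10_000:.1f} per 10 000")
print(f"LI among ASD cases: chance {chance_li_given_asd:.0%}, "
      f"observed {observed_li_given_asd:.0%} ({excess:.1f} times chance)")
```

On these assumptions the reported rate of language impairment among children with autism is several times the rate expected from independent disorders.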
Rates of language impairment in relatives of probands with SLI or ASD
There have been several studies of parents and siblings of people with autism on language measures, with rather mixed results (see Bailey et al. 1998). In general, relatives of people with autism are more likely than control relatives to report a personal history of language or literacy problems, but these have proved harder to demonstrate on formal testing, especially when the measures come from instruments that are sensitive to SLI (Bishop et al. 2004; Whitehouse et al. 2007). A study by Lindgren et al. (2009), which is noteworthy for its methodological rigour and relatively large sample size, explicitly compared parents and siblings from three groups of probands: pure SLI (N = 36), pure ASD (N = 20), and comorbid ASD+LI (N = 32). Particular care was taken to exclude from the SLI group any individuals with autistic features. The probands with SLI and comorbid ASD+LI were similar in their language profiles, but their siblings and parents differed. The relatives of those with SLI had language deficits, in line with previous studies of SLI relatives (Barry et al. 2007), but relatives of those with ASD+LI had language scores in the normal range. Relatives of the pure ASD group tended to obtain higher scores than the ASD+LI parents on language measures, but both the language scores and nonverbal IQ of the pure ASD group were above average. A supplementary analysis in which relatives were categorised according to whether they met criteria for language impairment showed the following rates of language impairment in relatives of those with pure ASD, ASD+LI and SLI respectively: siblings, 11%, 16% and 42%; fathers, 21%, 35% and 54%; mothers, 5%, 29% and 60%. The rates of LI in relatives were significantly lower in ASD+LI than in SLI probands for all relatives except fathers.
Molecular genetic risks for SLI and ASD
Vernes et al. (2008) studied a sample of individuals with SLI and demonstrated association between nonword repetition skills (a marker of SLI) and polymorphisms of CNTNAP2, a gene on chromosome 7q35 that is a downstream target of FOXP2. CNTNAP2 encodes a neurexin and is expressed in the developing human brain. These authors noted that association with ASD had been demonstrated for the same locus (Arking et al. 2008), with the strength of association greatest when cases were restricted to probands with severely delayed language milestones (Alarcón et al. 2008).
Taken together, the evidence from comorbidity, impairments in relatives and molecular genetic risks presents a puzzling picture. The comorbidity and molecular genetic findings would appear to point to overlapping etiology, yet the data from relatives are inconsistent with that picture. In this paper, I shall first present a formal simulation of overlapping etiology through additive pleiotropic effects to demonstrate the problems this model has in accounting for observed data, before going on to consider two radically different accounts of the etiological relationship between the two disorders.
A simulation of overlapping etiology
A ‘correlated additive risks’ model
Before describing the simulation, it is worth noting the different routes by which risk factors may be correlated: In the simulation presented here, genetic correlation is induced by including pleiotropic genes that lead to increased risk for both disorders. However, there are other possibilities: risk genes for the two disorders may be transmitted together because they are close together on a chromosome (linkage): in general, if linkage is tight, then predictions are similar to those from a pleiotropic model. Furthermore, nonrandom (assortative) mating could lead to different risk genes being contributed by each parent. In that case, predictions about affectedness in relatives may differ for parents and siblings. Environmental risks could also be correlated, but are not discussed here because evidence from twin studies suggests that environmental factors play a relatively minor role in the etiology of both SLI and ASD (Bishop 2006; Newschaffer et al. 2002).
A Matlab program that simulates the CAR model is available from http://psyweb.psy.ox.ac.uk/oscci/Miscellaneous.htm. The simulation starts by assigning a set of probands values for a set of genotypes, aa, aA and AA, whose frequency is determined by the user, with default minor allele frequency of .5. The ‘a’ allele is designated the risk allele, and has an additive impact on the causal trait for one or both disorders, with genotype aA having an effect intermediate between aa (low) and AA (high). The user specifies the number of probands, the number of genes affecting a causal trait, and the proportion of genes that affect both traits (pleiotropy). If we have 10 genes, when pleiotropy is set to zero, there are five genes affecting each trait, and none affecting both, and the genetic correlation (rg) is zero (corresponding to Fig. 1). If the proportion of pleiotropic genes is set to .2, then genes 1, 2, 3 and 4 affect SLI only, genes 7, 8, 9 and 10 affect ASD only, and genes 5 and 6 affect both traits, giving a computed value for rg of .33. The model was designed to simulate twin data, and can be set to give estimated trait scores for MZ as well as DZ twins. The focus of interest here, however, is in similarities between probands and their first degree relatives, and so only the predictions for DZ twins are considered. In addition, environmental effects that are shared between relatives are modelled by a random normal variable that is identical for first degree relatives (and for MZ twins), and nonshared environmental effects (including measurement error) are modelled with a random normal variable that has no correlation between relatives. The genetic effects, shared and nonshared environmental effects are then combined in a weighted sum where the weights reflect the values of h², c² and e² input by the user.
The diagnostic categories of probands and their relatives are then assigned depending on whether or not the standardized scores on the liability distributions for ASD and SLI fall below threshold.
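The gene allocation, weighted sum and diagnostic assignment just described can be sketched in a few lines. This is a minimal Python illustration, not the author's Matlab program. The function names, the assumption of equal effect sizes behind the rg formula, the use of square-root weights, and the default cutoffs of −1.5 and −2.3 are all assumptions made for this sketch.

```python
def allocate_genes(n_genes, pleiotropy):
    """Split gene numbers 1..n_genes into SLI-only, shared and ASD-only sets."""
    n_shared = round(n_genes * pleiotropy)
    n_specific = (n_genes - n_shared) // 2
    sli_only = list(range(1, n_specific + 1))
    shared = list(range(n_specific + 1, n_specific + n_shared + 1))
    asd_only = list(range(n_specific + n_shared + 1, n_genes + 1))
    return sli_only, shared, asd_only

def genetic_correlation(n_genes, pleiotropy):
    """rg under equal effect sizes: shared genes / genes acting on one trait."""
    sli_only, shared, _ = allocate_genes(n_genes, pleiotropy)
    return len(shared) / (len(sli_only) + len(shared))

def liability(genetic, shared_env, nonshared_env, h2, c2, e2):
    """Weighted sum of standardized components.

    Square-root weights are assumed so that the components contribute
    variance in the proportions h2, c2 and e2.
    """
    return h2 ** 0.5 * genetic + c2 ** 0.5 * shared_env + e2 ** 0.5 * nonshared_env

def diagnose(sli_score, asd_score, sli_cut=-1.5, asd_cut=-2.3):
    """Assign a category from standardized scores on the two liabilities."""
    sli, asd = sli_score < sli_cut, asd_score < asd_cut
    if sli and asd:
        return "ASD+LI"
    if asd:
        return "ASD"
    return "SLI" if sli else "unaffected"

print(allocate_genes(10, 0.2))
print(round(genetic_correlation(10, 0.2), 2))
```

With 10 genes and pleiotropy of .2, two of the six genes acting on each trait are shared, which under these assumptions reproduces the value of .33 quoted in the text.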
As shown in Fig. 3, the prevalence of comorbid cases increases as the genetic correlation increases. Regardless of the size of genetic correlation, pure disorders tend to ‘breed true’, but the less extreme threshold for SLI means that probands with ASD+LI will have more relatives with SLI than with pure or comorbid ASD. The proportion of relatives with language impairment is similar for the SLI and comorbid proband groups, at around 22–23%.
Comparison with observed data
As we have seen, the CAR model can explain why rates of comorbidity occur in excess of the chance rate that would be expected if the two disorders were independent. It also is compatible with molecular genetic evidence of common risk variants for ASD and SLI. However, it further predicts that relatives of those with comorbid ASD+LI should resemble relatives of those with pure ASD on ASD trait markers, and resemble relatives of those with pure SLI on language measures. As noted above, this is inconsistent with observed data from studies by Bishop et al. (2004), Lindgren et al. (2009) and Whitehouse et al. (2007). The CAR model therefore fails to provide a plausible account of etiological overlaps.
There are other models of overlapping risk factors that have been proposed in the literature, but in all cases they predict that relatives of those with comorbid ASD+LI should have an increased rate of language impairments. Thus, Tager-Flusberg and Joseph (2003) suggested that pure ASD and ASD+LI might be distinct subtypes, with a common neurocognitive phenotype for ASD+LI and SLI: however the subtype hypothesis predicts that ASD+LI will ‘breed true’, so relatives of affected individuals would show language difficulties.
Another model of overlapping risk factors is the dimensional account of ASD by Ronald et al. (2006). Rather than assuming two liability distributions, one for autism and one for SLI, they suggest that the different components of autism (social interaction, communication and behavioural repertoire) are independently heritable, and only when all dimensions were impaired would they qualify as cases of autism. A problem for this view is that the communication dimension is not well-specified. As noted above, while an autism diagnosis requires that the child have difficulties with communication, this does not necessarily mean problems with structural aspects of language of the kind seen in SLI. Thus, to represent the full range of observed phenotypes, we need at least four dimensions, corresponding to language structure, pragmatic aspects of communication, social interaction and behavioural repertoire. As noted by Bishop (2003), such a model has the advantage of being able to capture a wide range of clinical conditions, including classic SLI (only language structure impaired), pragmatic language impairment (communication impaired, with or without poor language structure), Asperger syndrome (impairment in all domains except language structure) and autism (impairment in pragmatics, social interaction and behavioural repertoire, with or without poor language structure). The model runs into difficulties, however, in accounting for patterns of trait markers in relatives, because, like the CAR model, it predicts that the different dimensions should ‘breed true’, so relatives of those with comorbid autism+LI should resemble relatives of those with SLI on language trait markers, with both being impaired. Furthermore, it is hard to specify thresholds on the four causal traits that can generate plausible prevalence rates for the different types of disorder. If the traits are semi-independent, then the more traits that are impaired, the rarer the disorder will be. 
Autism without structural language impairment should therefore be far more common than autism with language impairment, which is not what is found (Loucas et al. 2008).
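The prevalence argument can be illustrated with simple arithmetic. The sketch below assumes fully independent traits and a single hypothetical impairment rate of 7% on each of the four dimensions; neither assumption comes from the dimensional account itself.

```python
p = 0.07  # hypothetical probability of impairment on any one dimension

# Autism here requires impairment in pragmatics, social interaction and
# behavioural repertoire; language structure may or may not be impaired.
autism_without_li = p ** 3 * (1 - p)   # three dimensions impaired, language spared
autism_with_li = p ** 4                # all four dimensions impaired

ratio = autism_without_li / autism_with_li   # simplifies to (1 - p) / p
print(f"autism without LI is {ratio:.1f} times as common as autism with LI")
```

Under these assumptions the language-spared form outnumbers the comorbid form by (1 − p)/p, so any threshold that keeps language impairment uncommon in the population forces comorbid cases to be a small minority of autism cases.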
An account similar to that shown in Fig. 5 was proposed by Williams et al. (2008), who argued that apparently similar language deficits in SLI and ASD had different underlying causes. They argued that, although some children with ASD made errors in using inflectional endings on verbs, the types of errors differed from those seen in SLI. In a similar vein, Whitehouse et al. (2007) noted that the pattern of errors on a nonword repetition test was different in the two disorders, and argued that deficient phonological memory, indexed by disproportionate difficulty with long nonwords, may be implicated only in pure SLI.
Problems for a phenomimicry account
Although data on relatives appear compatible with a phenomimicry account, there are two problems for this explanation of comorbidity. First, it cannot readily explain the finding that CNTNAP2 is implicated as a risk factor for both SLI and ASD. Furthermore, we also have to explain why only a subset of individuals with ASD have SLI-like language problems, if such problems are a consequence of having ASD. An obvious possibility is that the likelihood of language problems increases with severity of ASD, as suggested by Whitehouse et al. (2007). However, this was not found in the larger study by Lindgren et al. (2009), nor in the epidemiologically-based sample studied by Loucas et al. (2008).
The evidence for etiological overlap between ASD and SLI is therefore somewhat inconsistent. On the one hand, phenotypic similarities between language deficits in the two disorders, and findings that CNTNAP2 variants confer risk for both ASD and SLI, suggest a common causal pathway. On the other hand, qualitative differences in language phenotypes, coupled with the relatively spared language abilities in relatives of those with ASD, point to distinct etiologies.
A modified model: correlated risks with epistasis (CRE)
A possible way of resolving the inconsistencies is to incorporate nonadditive interactions between genes in an etiological model. There are several lines of evidence that suggest that a genetic model of autism needs to include interactions between genes, rather than just additive effects. The first is an analysis by Pickles et al. (1995), who considered frequency of autism in relatives (twins and other family members) for probands with autism. If the etiology involves many genetic variants with additive effects, then the prediction would be that the rate of autism in DZ twins or sibs of those with autism should be around 50% of the rate seen in MZ twins. In fact, the rate in these first degree relatives is considerably less than that (with around 10% of siblings and DZ twins affected with ASD, compared to around 80% in MZ twins; Pickles et al. 1995). The authors concluded that nonadditive genetic influences must be implicated in the etiology. A similar conclusion was reached by Risch et al. (1999), who compared the proportion of alleles with a common identity by descent (IBD) in affected sib-pairs vs. discordant sib pairs. They found that there was a small increase in IBD-sharing for affected sib pairs across all 360 markers that they considered, rather than an effect confined to a few loci. They concluded that the etiology of autism involved a large number of loci, perhaps more than 15, and probably involves interactions between genes (i.e., epistasis) as well as additive effects. A third line of evidence comes from consideration of the functional networks in which genes are involved; Bill and Geschwind (2009) noted that many autism susceptibility genes are involved in the same pathways, suggesting possible interactions between proteins in signalling pathways.
The Matlab script in the Appendix includes the option of specifying G × G interactions. Each gene is identified by number, and the user specifies a list of genes involved in each interaction. The first gene in the list has the effect of its risk genotype increased by a specified amount if and only if all the subsequent genes in the list have the risk genotype. For instance, if a G × G term is specified as [1 7 8] this means that an individual with a homozygous risk genotype (aa) for gene 1 will have the effect of that genotype amplified if a homozygous risk genotype is also present for genes 7 and 8. This means that even if two relatives have the risk genotype for language impairment, they may differ in terms of the effect of that genotype, because the added effect depends on the presence of a constellation of genotypes on other (ASD) genes. The probability of relatives sharing such a constellation decreases with the number of genes involved in epistasis.
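The rule for a G × G term such as [1 7 8] can be written out as follows. This is an illustrative Python sketch and not a transcription of the Matlab script: the function names, the coding of genotypes as counts of the risk allele, the unit additive effect and the boost of 3.0 are all assumptions.

```python
def trait_score(genotypes, trait_genes, effect=1.0, gxg_terms=(), boost=0.0):
    """Additive trait score with optional G x G amplification.

    genotypes   -- dict mapping gene number to count of risk 'a' alleles (0, 1, 2)
    trait_genes -- gene numbers with an additive effect on this trait
    gxg_terms   -- lists such as [1, 7, 8]: gene 1 is amplified only if
                   genes 1, 7 and 8 are all homozygous for the risk allele
    """
    score = 0.0
    for gene in trait_genes:
        score -= effect * genotypes[gene]      # each risk allele lowers the score
    for first, *others in gxg_terms:
        if first not in trait_genes:
            continue
        if genotypes[first] == 2 and all(genotypes[g] == 2 for g in others):
            score -= boost                     # applied only for the full constellation
    return score

def p_constellation(k, risk_allele_freq=0.5):
    """Population probability that k genes are all aa (random mating assumed)."""
    return (risk_allele_freq ** 2) ** k

li_genes = [1, 2, 3, 4, 5]
proband = {g: 1 for g in range(1, 11)}
proband.update({1: 2, 7: 2, 8: 2})   # aa on gene 1 and on ASD genes 7 and 8
sibling = dict(proband)
sibling[8] = 1                        # same LI risk genotype, constellation incomplete

proband_li = trait_score(proband, li_genes, gxg_terms=[[1, 7, 8]], boost=3.0)
sibling_li = trait_score(sibling, li_genes, gxg_terms=[[1, 7, 8]], boost=3.0)
print(proband_li, sibling_li, p_constellation(2))
```

The proband and sibling carry identical genotypes on every language gene, yet only the proband receives the amplified effect, because the sibling lacks the risk genotype on one of the interacting ASD genes.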
Table 1 Results from correlated risks with epistasis simulation, with 10 genes, one of which is pleiotropic. The effect of the pleiotropic gene is magnified by risk genotypes from ASD genes (see text). Rows reported: percentage of relatives with pure LI, with pure ASD, and with ASD+LI; LI and ASD trait means in probands; LI and ASD trait means in relatives.
Another inconsistency with data from Lindgren et al. (2009) is that their ASD+LI and SLI probands had similar scores on language measures, whereas Table 1 indicates poorer performance in the ASD+LI probands. These simulation results are, however, compatible with some other empirical studies, which report poorer language test scores in ASD+LI cases than SLI for some receptive language measures (Loucas et al. 2008; Rapin and Dunn 2003).
The simulation was re-run with different values specified for G × G interactions, number of genes, and the size of epistatic effect. Examples of outputs are shown in Supplementary Material. Including more than two ASD genes in interaction with a LI gene makes outcomes of relatives less similar to the proband, but also makes it less likely that conditions for epistasis will be met, and so has little impact unless the size of epistatic effect is large. Altering the number of genes did not, in general, affect outcomes, unless the proportion of genes involved in epistasis was too low to exert much influence on overall trait scores. Increasing the size of epistatic effect decreased similarity between relatives, but large values induced strong skew, and sometimes bimodality, in the data.
It is striking, however, that the distributions of phenotype scores are more skewed for risk genotypes that enter into G × G interactions. Bold lines denote cases where the variance on a trait for the aa genotype is significantly greater than for other genotypes on F-test. It is evident from inspection that this is a feature that characterises just those genes that are implicated in epistasis. This is because the genotype usually has no effect on language ability, but occasionally exerts a large effect, when occurring in the context of a set of other risk genotypes. This skew is distinctive compared with that seen for the other genotypes of this gene, and for distributions of genotypes of other genes. If this kind of G × G interaction is in play, then significant differences in variances of phenotypic trait scores between genotypes could provide evidence that a gene may be implicated in epistasis.
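The variance and skew signature described above can be reproduced in a toy simulation. The sketch below is an assumption-laden illustration rather than the author's analysis: it uses one language gene (gene 1) interacting with two hypothetical ASD genes (7 and 8), a risk allele frequency of .5, a small additive effect of 0.1 per allele and an epistatic effect of 3.0. It reports only the variance ratio; a p-value for the F-test would need an F distribution from a statistics library.

```python
import random
from statistics import mean, pstdev, variance

def skewness(xs):
    """Population skewness: mean cubed deviation in standard deviation units."""
    m, s = mean(xs), pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

def simulate(n=20000, additive=0.1, boost=3.0, seed=1):
    """Return trait scores grouped by gene 1 genotype (count of risk alleles)."""
    rng = random.Random(seed)
    by_genotype = {0: [], 1: [], 2: []}
    for _ in range(n):
        # Each genotype is the sum of two independent allele draws (0, 1 or 2)
        g = {k: (rng.random() < 0.5) + (rng.random() < 0.5) for k in (1, 7, 8)}
        score = rng.gauss(0, 1) - additive * g[1]
        if g[1] == 2 and g[7] == 2 and g[8] == 2:
            score -= boost             # epistatic effect, met only occasionally
        by_genotype[g[1]].append(score)
    return by_genotype

scores = simulate()
risk = scores[2]                       # aa genotype
others = scores[0] + scores[1]         # AA and aA pooled

f_ratio = variance(risk) / variance(others)
print(f"variance ratio (aa vs others): {f_ratio:.2f}")
print(f"skewness: aa {skewness(risk):.2f}, others {skewness(others):.2f}")
```

Because the epistatic condition is met in only a small fraction of aa carriers, the mean difference between genotypes is small, while the aa distribution acquires a long lower tail that inflates its variance and skew.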
Although ASD and SLI have traditionally been regarded as distinct disorders, they often involve similar language deficits, raising the question of whether this is merely a superficial resemblance, or indicative of a deeper similarity, with overlap in etiology. Three models of etiology were considered. The first, of CAR, was simulated to test its predictions. The simulation confirmed that a model of correlated additive genetic risks can explain the relatively high rate of comorbid ASD+LI cases, but it does not predict observed data showing that relatives of people with ASD+LI tend to do better than relatives of those with pure SLI on language measures. The second model, in terms of ‘phenomimicry’, could account both for comorbidity and the patterns of deficit in relatives, but is unable to explain why the CNTNAP2 gene has been found, in independent samples, to be associated with both ASD and SLI. It also leaves unexplained why only a subset of those with ASD have language difficulties resembling SLI.
The final model was a modified version of the CAR model that incorporated G × G interaction. With one pleiotropic gene, whose impact was enhanced when a risk genotype occurred in the context of ASD risk genotypes, the model gave a pattern of results more in line with observed findings. In particular, the model could account for (a) comorbidity of ASD+LI at above chance levels, (b) similar or more severe levels of language impairment in ASD+LI as in SLI probands, while at the same time predicting (c) higher rates of language impairment in relatives of SLI cases than in relatives of ASD+LI cases.
Of course, the fact that a simulation can fit a pattern seen in observed data does not mean that the model is correct. Phenomimicry could also be implicated: we need more studies of qualitative aspects of language phenotypes in ASD and SLI to test this hypothesis convincingly. Other mechanisms, such as gene–environment interaction or assortative mating, could also be involved. Nevertheless, the simulation program used here showed that incorporating epistasis allows us to retain a model that postulates overlapping genetic etiology for ASD and SLI, in line with the molecular genetic findings on CNTNAP2. The G × G interaction reduces the correlation between probands and first degree relatives, and so can accommodate the result observed by Lindgren et al. (2009) whereby relatives of those with comorbid ASD+LI were less impaired on the language trait than relatives of pure LI cases, even though the comorbid probands themselves were at least as impaired as SLI cases on language measures. Although a model with G × G is less parsimonious than the basic CAR model, it is compatible with evidence from other sources that point to epistasis being implicated in the etiology of complex disorders in general (Carlborg and Haley 2004) and in ASD in particular (Bill and Geschwind 2009; Pickles et al. 1995; Risch et al. 1999).
The model not only provides a better fit to the observed data on relatives; it also suggests a way to identify genes that are involved in epistatic interactions. The CRE model included a single gene whose effect was magnified when its risk genotype co-occurred with risk genotypes on other genes. When this condition was met, this gene had a substantial impact on the pattern of observed data, introducing a skew in the tail of the distribution of liability markers. Nevertheless, when effect size is measured simply by comparing overall liability marker scores for those with the risk and nonrisk versions of the gene, the effect sizes of the interacting genes were small. Importantly, as shown in Fig. 7, the genes involved in epistatic interaction could be distinguished from other genes in terms of the shape of the distributions of liability marker scores for risk and nonrisk genotypes. It follows that if such mechanisms do operate, then one could identify genes likely to be involved in epistasis by considering the distribution of liability marker scores associated with different allelic variants. In practice, non-normal distributions are often regarded as a nuisance to be corrected by removal of outliers, transformation or use of nonparametric tests (see also Bishop 2005). The sheer scale of information now available about the human genome means that one needs a strategy for prioritising which genes to analyse in order for association studies to be both statistically and methodologically tractable (Tabor et al. 2002). These simulations suggest that increased skew in phenotypic distributions for one genotype vs. others may be an indicator that a gene is likely to be involved in epistatic interactions.
The author is supported by a Principal Research Fellowship from the Wellcome Trust (ref 082498/Z/07/Z). The ideas in this paper were stimulated by discussions at a meeting organised by Gina Conti-Ramsden and funded by the Economic and Social Research Council, ‘Language and Social Understanding in Developmental Disorders’. The author would like to thank Andrew Whitehouse and Courtenay Frazier Norbury for their insightful comments on an earlier draft of this paper.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.