Introduction

Language is a quintessential human trait that, for the most part, proceeds along a recognized trajectory with minimal explicit instruction [1]. In some cases, however, language acquisition is not so straightforward and language ability is delayed or permanently impaired. Sometimes these impairments form part of a recognized medical condition (such as learning deficit, autism or deafness), but often no obvious cause can be identified. In such cases, the language deficit is usually classified as specific language impairment (SLI) [2]. As such, SLI is usually diagnosed through exclusionary criteria rather than on the basis of any specific clinical test. SLI affects between 5% and 8% of English-speaking (primarily UK and US) pre-school children, and is a lifelong disability with an increased risk of behavioral disorders, social problems and literacy deficits [3–5]. The disorder shows significant overlap with associated developmental conditions, such as attention deficit hyperactivity disorder (ADHD), speech sound disorder (SSD), dyslexia and autism [6].

Over the past decade, researchers have begun to identify genetic factors that may have roles in the etiology of language disorders. It is hoped that the study of these genes will facilitate a better understanding of the cause of language impairments, leading to the development of improved diagnostic and treatment strategies for affected individuals. In turn, knowledge regarding the cause of such impairments may further our understanding of the biological pathways that underpin normal language acquisition [7].

Here, we focus on specific genes that have been identified to have a role in language impairment. Genetic linkage and association studies of SLI and related learning disorders are reviewed elsewhere [8–10].

FOXP2

Until recently, the only gene that had been directly implicated in the etiology of speech and language disorders was the FOXP2 gene on chromosome 7q (OMIM 605317). In 2001, a study by Lai and colleagues [11] implicated mutation of FOXP2 in a monogenic form of speech and language disorder found in a three-generation pedigree (the KE family) and in an unrelated individual with a chromosome translocation. In both cases, the disorder was characterized by verbal (or articulatory) dyspraxia, that is, difficulties controlling the movement and sequencing of orofacial muscles, causing deficits in the production of fluent speech. In-depth studies of the KE family showed that, in these individuals, speech production problems are accompanied by a complex array of linguistic deficits that include varying degrees of expressive and written language problems and, in some members, nonverbal cognitive impairments [12]. Subsequent screening studies have shown that although FOXP2 mutations are unlikely to be involved in the etiology of typical forms of SLI [13, 14], heterozygous disruptions of this gene (point mutations or chromosomal rearrangements) invariably lead to syndromes that include aspects of verbal dyspraxia [15–21].

The FOXP2 gene encodes a transcription factor that regulates the expression of other genes. Downstream target screening studies have highlighted a variety of genes that may be regulated by FOXP2 and indicate that the effect of FOXP2 can vary greatly between tissues and developmental time points [22–24]. FOXP2 may thus be involved in a variety of biological pathways and cascades that may ultimately influence language development. Pathway analyses of the identified targets indicate an enrichment of genes involved in the functioning, development and patterning of the central nervous system. In analyses of human neuronal cell models, Vernes et al. [23] estimated that FOXP2 may bind directly to approximately 300 to 400 gene promoters in the human genome. Although statistically significant overlaps were seen between the individual studies of FOXP2 targets, there were also notable differences in the sets of downstream genes that were identified. This finding demonstrates the complexity of these regulatory pathways and the inherent difficulties of precisely defining them in the laboratory.

FOXP2 in the brain

The expression of FOXP2 is not limited to the brain but also seen in several other organs, primarily those derived from the foregut endoderm, such as the lungs and esophagus [25]. In the human brain, FOXP2 is expressed in a range of regions, including sensory and limbic nuclei, the cerebral cortex and several motor structures, particularly the striatum and cerebellum [26, 27]. Within these anatomical areas, FOXP2 expression is often limited to selected subdivisions or neuron types (for example, deep layers of the cortex, medium spiny neurons in the striatum and Purkinje cells in the cerebellum).

Mice that are bred to carry disruptions of both copies of Foxp2 survive only a few weeks. They are small for their age and have widespread developmental delays, severe motor abnormalities and impaired cerebellar growth [28–32]. Given that total absence of functional Foxp2 results in lethality, in-depth behavioral investigations have focused on heterozygous mouse models, which carry a single working copy of Foxp2. Note that this matches the heterozygous state of humans with FOXP2 mutations; no humans carrying homozygous mutations have ever been identified. In general, it was found that these animals have normal motor skills and no obvious gross abnormalities. However, in-depth behavioral and morphological profiling has uncovered subtle deficits. Interestingly, two groups have reported that heterozygous pups produce fewer innate ultrasonic vocalizations than wild-type animals [28, 30]. Other groups have questioned the reliability of this finding, instead describing deficits in motor skill learning [31], abnormal synaptic plasticity in striatal and cerebellar neural circuits [31] and differences in auditory brainstem responses [32] in heterozygous pups. In song-birds, it has been reported that reducing the expression of FoxP2 in an area of the brain necessary for vocal learning can interfere with the song learning process [33]. For an in-depth discussion of these animal studies, see [34].

Brain imaging studies of KE family members have also revealed structural and functional abnormalities in the cerebellum and striatum [12, 35, 36]. Affected individuals were found to have reduced gray matter densities in the caudate nucleus, the cerebellum, the inferior frontal gyrus and the lower primary motor cortex [12, 35]. During the performance of language-related tasks, in contrast to the expected left-lateralized pattern of activation, affected members of the KE family showed bilateral, diffuse activation with little or no activity in the left inferior frontal cortex (which includes Broca's area, involved in speech production) and reduced activation in other speech-related cortical and sub-cortical brain regions. In addition, brain areas not usually activated during linguistic tasks, including the posterior parietal, occipital and postcentral regions, were found to be over-activated in affected individuals [36].

Evolution of FOXP2

Because of the proposed function of FOXP2 in speech and language development, this gene has been widely investigated from an evolutionary perspective. Versions of FOXP2 are found in many organisms and show striking similarities in terms of sequence and expression patterns across vertebrate species [26, 27, 33, 37–39]. Aside from a difference in polyglutamine tract length, there are only three coding changes between the mouse and human versions of the FOXP2 gene, making it one of the most highly conserved genes found in comparisons of human-rodent genomes [38, 39]. Interestingly, analyses of primates demonstrated that two of these three changes occurred in the human lineage after splitting from the chimpanzee and found additional signs that FOXP2 may have undergone accelerated evolution in humans [38, 39]. Population modeling estimated that the gene was subject to positive selection approximately 200,000 years ago, a period that coincides with, or is subsequent to, the emergence of modern humans [38, 39]. Note, however, that the errors attached to these estimates are large. Moreover, subsequent sequencing of paleontological samples has identified the human-specific coding changes of FOXP2 in Neanderthal tissues, which suggests a more ancient origin, given that Neanderthals split from humans at least 400,000 years ago [40]. Thus, the interpretation of these data is still under debate [41].

Two studies have investigated the functional differences between the human version of FOXP2 and that found in the chimpanzee. Enard et al. [42] reported that when human-specific coding changes were engineered in mice (partially 'humanizing' them at the locus), this resulted in an altered structure of innate pup vocalizations, decreased levels of exploration, decreased levels of dopamine in the brain and an increased dendrite length and synaptic plasticity in the striatum. These findings are intriguing, given that mice carrying disrupted versions of Foxp2 (described above) showed contrasting alterations in similar developmental areas. Konopka et al. [24] investigated potential differences in the functionality of the human and chimpanzee versions of FOXP2. They identified 116 genes that were differentially expressed between neuronal cell lines engineered to express either the human or the chimpanzee protein. They postulated that the identified set of genes may represent a biological network that could have a role in the evolution of human language, noting that the identified targets included genes involved in cerebellar motor function, craniofacial formation, and cartilage and connective tissue formation [24].

In conclusion, although the exact contributions of FOXP2 to the development of speech and language remain unclear, the consensus from expression studies, neuro-imaging data and animal models is that this gene is of particular importance in the central nervous system, such that its dysfunction disturbs the development and function of the motor cortex, striatum and cerebellum. Investigations of the properties of FOXP2 and its downstream targets are beginning to identify networks of genes that could be crucial players in neural circuits that facilitate language acquisition.

CNTNAP2

The CNTNAP2 gene on chromosome 7q (OMIM 604569) was the first gene to be associated with genetically complex forms of SLI. This association was achieved through a candidate gene approach that arose from downstream target screening studies of FOXP2 [43]. Vernes et al. [43] discovered that FOXP2 directly binds a regulatory region of the CNTNAP2 gene. CASPR2, the protein encoded by CNTNAP2, is a member of the neurexin family, a family that is particularly interesting from a functional point of view as members are known to interact with neuroligins to adhere presynaptic neuronal membranes to postsynaptic ones. In the case of CASPR2, the protein mediates interactions between neurons and glia during nervous system development and is also involved in localization of potassium channels within differentiating axons [44, 45]. Furthermore, both neurexins and neuroligins have been strongly implicated in autistic disorder, a neurodevelopmental condition that shows strong overlap with SLI [46–52].

The regulation of CNTNAP2 by FOXP2 was verified both in neuronal cell lines and in vivo (in human fetal cortical slices). In both of these experiments, the level of FOXP2 was found to be inversely correlated with that of CASPR2 [43]. An association analysis of 38 single nucleotide polymorphisms (SNPs) across CNTNAP2 was performed in 184 families ascertained by the SLI Consortium (SLIC). These families were identified by various different groups from across the UK but all contained a proband who, currently or in the past, had expressive and/or receptive language abilities more than 2 standard deviations (SD) below that expected for their age [53]. In accordance with SLI diagnostic guidelines, individuals with autistic features, signs of mental retardation or co-occurring medical conditions were excluded from this cohort. Three quantitative measures of language were considered in this group; composite scores of expressive and receptive language ability were derived from the Clinical Evaluation of Language Fundamentals battery (CELF-R) [54]. In addition, a measure of non-word repetition [55] was collected for all probands and siblings. This test involves the repetition of nonsensical words of increasing length and complexity and the results from it have been shown to be highly heritable and a consistent marker of the presence of language impairment. Non-word repetition is considered to be a measure of phonological short-term memory, leading to the proposal that short-term memory deficits may underlie some aspects of language impairment (reviewed in [56]). Nine single SNPs in CNTNAP2 showed association primarily with the non-word repetition phenotype but also with expressive and receptive language measures. The most strongly associated SNP was rs17236239 (P = 5.0 × 10⁻⁵), a variant that falls within an intronic sequence near the middle of the gene. This same region has also been implicated in a quantitative language-related trait (age at first word) in autism [57]. The exact mechanism by which the identified SNPs alter CNTNAP2 function has yet to be elucidated, but the integration of evidence from these various routes of investigation makes CNTNAP2 a compelling candidate for language disorders.
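
The ascertainment rule described above (a proband whose expressive and/or receptive language ability lies more than 2 SD below the age-expected level) can be illustrated with a minimal sketch. The norm values (mean 100, SD 15), the class name and the function names are assumptions made for illustration only; they are not taken from the SLIC protocol.

```python
from dataclasses import dataclass

# Assumption: composite scores are standardized to mean 100, SD 15.
# These defaults are illustrative, not the values used by the SLIC.
NORM_MEAN = 100.0
NORM_SD = 15.0


@dataclass
class LanguageScores:
    """Hypothetical container for one child's composite scores."""
    expressive: float
    receptive: float


def z_score(score: float, mean: float = NORM_MEAN, sd: float = NORM_SD) -> float:
    """Number of standard deviations a score lies from the age norm."""
    return (score - mean) / sd


def meets_proband_criterion(scores: LanguageScores, cutoff_sd: float = 2.0) -> bool:
    """True if expressive and/or receptive ability is more than
    cutoff_sd standard deviations below the age-expected level."""
    return (z_score(scores.expressive) < -cutoff_sd
            or z_score(scores.receptive) < -cutoff_sd)
```

Under these assumed norms, a child with an expressive score of 65 would meet the criterion regardless of the receptive score, whereas a score of exactly 70 (2 SD below the mean) would not, because the rule as stated requires a deficit of more than 2 SD.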

The CNTNAP2 gene has recently been implicated in multiple neurodevelopmental disorders, including Gilles de la Tourette syndrome [58], schizophrenia [59], epilepsy [59, 60], autism [57, 61–65], ADHD [66] and mental retardation [45] (Table 1). This diverse range of studies provides evidence for the disruption of CNTNAP2 by copy number variants (CNVs), gross chromosomal rearrangements and mutations as well as association with common variants. It remains unclear how one gene can contribute to such an array of neurological conditions, although it should be noted that the implicated disorders are not completely disparate and can be expected to involve some shared neuropathology. Nonetheless, it is obvious that CNTNAP2 must have vital roles in neuronal development and that perturbations of the function of this gene significantly increase the chances of some form of neurological dysfunction. It is likely that the differences in outcome are decided by a complex function that includes the nature of the mutation and both the genetic and environmental background of the affected individual. For example, it is feasible to consider that gene deletions may have different effects from point mutations, and that the consequence of a point mutation will vary according to its location in the protein or its effect on gene expression. Equally, one can see how different combinations of point mutations or common variants across gene networks may have divergent outcomes that depend on the exact genes involved.

Table 1 Investigations implicating CNTNAP2 in neurological disorders

It is likely that a gene such as CNTNAP2 functions in overlapping and intersecting neurodevelopmental pathways and thus even a seemingly subtle disruption of its function may affect a variety of processes. The eventual outcome at the organ or organism level may in turn be modulated by the ability of downstream genes and proteins to compensate for these variations. We can therefore view CNTNAP2 as a neuronal buffer; subtle disruptions of this gene alone may be insufficient to cause disorder but may place a critical load on neurological systems, which manifest in different ways depending on the nature of additional load factors. Once a critical threshold of load is exceeded, it is likely that neurological imbalance will ensue.
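
The buffer idea described above can be pictured as a toy threshold model, in which independent load factors add up and disorder is expected only once their sum passes a critical value. The sketch below is purely illustrative: the additive form, the factor names, the numeric loads and the threshold of 1.0 are all hypothetical and are not estimated from any data.

```python
from typing import Mapping


def total_load(contributions: Mapping[str, float]) -> float:
    """Sum the load contributed by each factor (assumed additive)."""
    return sum(contributions.values())


def exceeds_threshold(contributions: Mapping[str, float],
                      threshold: float = 1.0) -> bool:
    """True once the combined load passes the (hypothetical) critical threshold."""
    return total_load(contributions) > threshold


# A subtle CNTNAP2 disruption alone stays below the threshold...
cntnap2_only = {"CNTNAP2 variant": 0.4}

# ...but the same variant plus additional load factors may exceed it.
combined = {
    "CNTNAP2 variant": 0.4,
    "other genetic factors": 0.4,
    "environmental factors": 0.3,
}
```

In this toy setting `exceeds_threshold(cntnap2_only)` is false and `exceeds_threshold(combined)` is true, mirroring the proposal that the same variant may or may not lead to disorder depending on the additional load carried by the individual.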

ATP2C2 and CMIP

The calcium-transporting ATPase 2C2 (ATP2C2) and cMAF inducing protein (CMIP) genes, both on chromosome 16q, were identified as SLI candidates by a positional cloning approach, which involved a genome-wide linkage study followed by a targeted high-density association investigation [53, 67–70]. These phased investigations were performed using the SLIC sample, as described above [53]. Genome-wide linkage analyses in these families revealed a strong and consistent linkage signal on chromosome 16q with a measure of non-word repetition [53, 67–69]. Association analyses of chromosome 16q indicated significant association with two clusters of SNPs, one between exons 2 and 5 of the CMIP gene (most significant P = 5.5 × 10⁻⁷) and another 3 megabases distal between exons 7 and 12 of ATP2C2 (most significant P = 2.0 × 10⁻⁵) [69]. Individuals carrying risk alleles at both these loci had an average non-word repetition score more than 1 SD below those carrying homozygous non-risk alleles. Association between ATP2C2 and performance on the non-word repetition task was subsequently replicated in a language-impaired sample selected from a population cohort (most significant P = 0.006) [69]. In this replication sample, some association was also observed with CMIP but in an opposite direction to that seen in the discovery cohort (most significant P = 0.02) [69]. Although this does not preclude the presence of a genuine association, as it may be caused by differences in linkage disequilibrium patterns, it does highlight the need for careful interpretation of this result as well as for further replication in additional cohorts.

Both ATP2C2 and CMIP show expression in the brain and, although little is known about their role in this tissue, hypothetical links can be made between their putative functions and language and memory-related processes. The CMIP protein forms part of the cellular scaffold linking the plasma membrane to the cytoskeleton [71], and cytoskeletal remodeling represents a critical step in neuronal migration and synaptic formation processes. In addition, CMIP has been shown to interact with filamin A and nuclear factor κB, both of which have important neurological functions [72, 73]. ATP2C2 is responsible for the removal of calcium and manganese from the cytosol into the Golgi body [74]. Calcium is an important ion in the regulation of many neuronal processes, including working memory, synaptic plasticity and neuronal motility [75], and manganese dysregulation has been linked to neurological disorders [76]. Interestingly, in a recent meta-analysis of genetic data for ADHD, which shows significant co-morbidity with SLI, chromosome 16q was highlighted as the most consistently linked region for this disorder [77]. Concurrent genome-wide association studies described significant association with a variant in ATP2C2 [78], reinforcing the fact that, as discussed above, the correlation between genetic susceptibility and surface phenotype is far from straightforward.

As with CNTNAP2, the specific causal variants and the underlying mechanisms by which ATP2C2 and CMIP might contribute to language impairment have yet to be elucidated. The characterization of these factors will not only provide definitive evidence for the involvement of these genes but may also lead to the identification of further neurological pathways that contribute to language acquisition. Given the proposed reliance of non-word repetition performance on short-term memory ability, one can postulate that the investigation of ATP2C2 and CMIP may provide a biological link between memory-related pathways and language acquisition. The fact that neither ATP2C2 nor CMIP has been identified as a downstream target of FOXP2 suggests that the eventual combination of information from converging routes of investigation will enable the characterization of overlapping and interacting neurological systems that serve the acquisition of language.

Conclusions

The past few years have seen exciting progress in the genetics of language impairment. The increased knowledge of the FOXP2-dependent molecular networks has enabled the identification of brain regions and pathways that this gene may influence. Although FOXP2 mutations seem to contribute to only a relatively small number of language disorder cases, it seems likely that variations in the genes it controls, such as CNTNAP2, may be implicated in common forms of language impairment. Thus, as our understanding of downstream targets grows, so will our list of potential candidate genes for SLI. The association of CNTNAP2 variations with an array of developmental disorders indicates that alternative deficits may arise from the dysfunction of a neurological network, demonstrating the complexity of brain development processes.

Although the expression of FOXP2 seems to be particularly important for neurological mechanisms relevant to motor skills, we predict that ATP2C2 and CMIP are likely to be involved in memory-related circuits. Thus, although language is unique to humans, we should not necessarily expect the pathways underlying it to be exclusive to humans. Processes such as memory and motor skills have key roles in language development, but they are certainly not specific to, and may not be completely essential for, language acquisition. Rather, we expect that a variety of pre-existing and diverse neurological pathways have been adapted to promote the development of human language [79]. Characterization of these pathways and the way they overlap and interact will be an enormous task but one that is becoming increasingly feasible thanks to advances in genetic techniques. Given the expected complexity of such pathways, it seems unlikely that the identification of genetic susceptibility factors will ever lead to the discovery of a 'cure' for SLI. Nonetheless, this is a worthwhile endeavor, as a better understanding of the causes of SLI will allow the development of better diagnostic systems and therapies for affected individuals. Furthermore, it is clear that the achievement of the ultimate goal - the elucidation of a genetic network underpinning language processes - will have an impact on our understanding not only of language impairment and acquisition, but also of human development, brain function and the neuropathology of associated developmental disorders.

Authors' information

DFN is a post-doctoral researcher in APM's lab. She leads the SLI research project and was involved in the positional cloning of ATP2C2 and CMIP. SEF is a Royal Society Research Fellow and Reader in Molecular Neuroscience at the WTCHG, where he pioneers investigations into molecular mechanisms underlying speech and language. After working with APM on the identification of FOXP2, he became head of his own laboratory, which uses state-of-the-art methods to uncover how language-related genes influence the brain at multiple levels. APM is the head of the developmental neurogenetics group at the Wellcome Trust Centre for Human Genetics (WTCHG) in Oxford. His group works in two main areas: the genetics of neurodevelopmental disorders, including complex genetic diseases such as autism, specific language impairment and developmental dyslexia; and the positional cloning and functional characterization of monogenic neurological diseases, including chorea acanthocytosis, speech and language disorder and Menkes disease. All three authors are members of the SLI Consortium.