Two initial GWLA studies of SLI identified novel candidate regions that are linked to performance on tests of language-related ability, in families affected by SLI [5–7]. The first study, conducted by the SLI Consortium (SLIC), identified two novel susceptibility loci on chromosomes 16q23–24 (SLI1, OMIM 606711) and 19q13 (SLI2, OMIM 606712) [6]. A subsequent study by Bartlett et al. highlighted 13q21 (SLI3, OMIM 607134) and 2p22 as additional loci predisposing to SLI [7].
SLIC analysed 98 UK families (473 individuals), each with a proband whose language scores fell ≥1.5 standard deviations (SDs) below the mean for their age and whose non-verbal IQ scores were within the specified normal range (>80). This study considered three quantitative language measures derived from the Clinical Evaluation of Language Fundamentals—Revised (CELF-R) and non-word repetition (NWR) [6]. CELF-R is a commonly used test battery designed to assess both receptive and expressive language ability in school-age children [18]. The NWR test measures the ability to retain novel phonological (speech sound) information for short periods of time; this ability is commonly impaired in people with SLI [5, 11]. Following the language tests, linkage analysis was performed using 400 highly polymorphic microsatellite markers. Significant linkage was detected on chromosome 16 with the NWR trait (LOD score 3.55; P = 0.00003) and on chromosome 19 with the CELF-R expressive language score (ELS) (LOD score 3.55; P = 0.0004) [6].
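The relationship between a LOD score and a pointwise P value can be sketched numerically. The snippet below is a minimal illustration, not the procedure used by SLIC: it assumes the standard large-sample approximation in which 2 × ln(10) × LOD follows a chi-square distribution with one degree of freedom, and it assumes a one-sided test. The function name `lod_to_pointwise_p` is hypothetical. Published P values may instead come from simulation or other trait-specific procedures, so they need not match this approximation.

```python
import math


def lod_to_pointwise_p(lod: float) -> float:
    """Approximate one-sided pointwise P value for a LOD score.

    Assumption: 2 * ln(10) * LOD is asymptotically chi-square distributed
    with 1 degree of freedom, and the P value is halved because linkage
    is tested in one direction only.
    """
    if lod < 0:
        raise ValueError("this approximation expects a non-negative LOD score")
    chi_square = 2.0 * math.log(10.0) * lod
    # For 1 degree of freedom, P(X > x) equals erfc(sqrt(x / 2)).
    upper_tail = math.erfc(math.sqrt(chi_square / 2.0))
    return 0.5 * upper_tail


if __name__ == "__main__":
    for lod in (2.31, 3.0, 3.55):
        print(f"LOD {lod:.2f} -> approximate pointwise P = {lod_to_pointwise_p(lod):.2e}")
```

Under these assumptions, a LOD score of 3.55 corresponds to a pointwise P value of roughly 0.00003, and the conventional threshold of LOD 3 corresponds to roughly 0.0001.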
Bartlett et al. analysed five Canadian families (73 individuals) of Celtic ancestry. Individuals were sorted into three categorical groups, labelled ‘language-impaired’, ‘reading-impaired’ and ‘clinically impaired’ on the basis of their performance across six quantitative language tests. Language-impaired individuals were defined by a Spoken Language Quotient (SLQ) score (from the Test of Language Development) [19] of ≤85, reading-impaired individuals had a non-word reading and IQ discrepancy, and clinically impaired individuals had a history of speech and/or reading therapy [7, 8]. Linkage to region 13q21 was detected under a recessive model in the reading-impaired group (LOD score 3.92; P < 0.01), and linkage to 2p22 was detected under a recessive model in the language-impaired group (LOD score 2.86; P < 0.06) [7].
The lack of overlap between the SLI susceptibility regions implicated by these two independent linkage studies not only supports the theory of locus heterogeneity but also demonstrates the statistical complexity of replicating genetic loci in different cohorts with variable phenotypes. Susceptibility regions 13q21 and 2p22, highlighted by Bartlett et al., may not have been detected by SLIC because of different allele frequencies within the UK sample set. Although the Canadian families in the study by Bartlett et al. were not considered population isolates, they were selected from a different ethnic background compared with that of the SLIC cohort, and thus variants carried in these regions may have been common enough to produce a detectable linkage peak in this group. Furthermore, SLIC and Bartlett et al. used slightly different linkage methodologies and diagnostic criteria for determining SLI affection status. SLIC diagnosed probands on the basis of a clinical verbal-language battery [5, 6], whereas Bartlett et al. included a more varied set of phenotypes, including reading ability (designed to reflect the proband’s overall language ability), and then classified all individuals as being affected or unaffected under three alternative definitions [7, 8]. SLIC used non-parametric linkage methods, whereas Bartlett et al. used parametric linkage, assuming 7 % population penetrance, and used both dominant and recessive models of inheritance [5–8].
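To make the notion of a model-based (parametric) LOD score more concrete, the following is a minimal sketch of the textbook two-point LOD calculation for phase-known, fully informative meioses. It is a deliberate simplification and not the analysis performed by Bartlett et al.: it ignores penetrance, disease allele frequency and the dominant or recessive models mentioned above, and the function name `two_point_lod` and the example counts are hypothetical.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10( L(theta) / L(0.5) ), where
    L(theta) = theta**r * (1 - theta)**(n - r),
    r is the number of recombinant meioses and n the total number.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    if theta == 0.0:
        # Any observed recombinant is impossible when theta is exactly 0.
        if recombinants > 0:
            return float("-inf")
        return meioses * math.log10(2.0)
    log_likelihood = (recombinants * math.log10(theta)
                      + non_recombinants * math.log10(1.0 - theta))
    log_null = meioses * math.log10(0.5)  # likelihood under free recombination
    return log_likelihood - log_null


if __name__ == "__main__":
    # Hypothetical data: 1 recombinant observed in 10 informative meioses.
    for theta in (0.01, 0.05, 0.1, 0.2, 0.3, 0.5):
        print(f"theta = {theta:.2f}: LOD = {two_point_lod(1, 10, theta):.3f}")
```

In practice the LOD score is maximised over the recombination fraction, and parametric analyses of complex traits add an explicit disease model, which is one reason why results can differ between parametric and non-parametric approaches applied to similar data.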
Despite the inconsistency of loci linked to SLI, both SLIC and Bartlett have since replicated their findings [5, 8]. SLIC conducted a targeted linkage study in 2004, with an additional 86 nuclear families selected and characterised as described for the SLIC samples above. Linkage was detected again on chromosomes 16 (LOD score 2.86; P < 0.02) and 19 (LOD score 2.31; P < 0.02), both to the NWR trait. The two SLIC cohorts were then pooled to total 184 families and 840 individuals. In this pooled dataset, highly significant linkage was detected on chromosome 16 [5].
In 2007, SLIC applied a genome-wide multivariate linkage approach to their pooled cohort, which allowed linkage to multiple quantitative traits to be analysed simultaneously. In total, they investigated 11 measures of spoken and receptive language ability, reading ability and non-verbal IQ. This study supported the Consortium’s previous evidence linking loci on chromosomes 16q (P = 0.008) and 19q (P = 0.017), and highlighted a novel region of linkage on chromosome 10q26 (P = 0.019) [9]. The linkage on chromosome 16q was specific to NWR in the previous univariate SLIC studies, and in the multivariate study it was linked to NWR and literacy measures (single-word reading and single-word spelling) [5, 6, 9], indicating that variation in this region is likely to affect phonological memory. An inability to store short-term verbal information is likely to impair the ability of an individual to acquire and retain language skills, and this has become an increasingly prominent aetiological theory of SLI [20, 21]. In contrast, linkage to chromosome 19q had previously been detected with multiple traits, firstly to ELS [6], then to NWR [5], and in the multivariate study it was found to be linked to a variety of expressive and language traits [9]. This suggests that the risk variants within this linkage region may affect a variety of language abilities.
One final study further expanded the SLIC cohort, analysing an additional 300 individuals from 93 families affected by SLI, and again replicated linkage of chromosome 16q (P = 0.002) with NWR, and linkage of chromosome 19q (P = 0.007) with ELS [22].
Bartlett et al. also expanded their cohort to include 22 US-based nuclear families with at least one individual per family affected by SLI, in combination with the original families studied [8]. In total, 365 DNA samples were genotyped for microsatellites on chromosomes 2 and 13, enabling replication of the chromosome region 13q21 linkage, using similar parametric modelling procedures. Further analysis revealed that only a small percentage of the families contributed to this linkage, suggesting that the risk factor is not sufficient or necessary for SLI to manifest itself in all families. In this combined sample set, the linkage region on chromosome 2 could not be replicated, suggesting that if SLI risk factors exist on chromosome 2, they may have a small effect size with low penetrance, which would make them difficult to identify using a linkage study [8].
In addition to the SLIC and Bartlett studies, a few other groups have also investigated the genetic basis of SLI, using single families and isolated populations [12, 23, 24, 25•]. These studies provided a unique opportunity to look at individuals with an increased level of shared environmental and genetic influences. When certain phenotypes become more prevalent in isolated populations, it may suggest that a founding genetic influence has been shared amongst the group and may thus be more common and somewhat easier to identify.
A classic linkage study, conducted prior to the two described above, linked language impairment to a region designated SPCH1 on chromosome 7q [26, 27]. This study analysed a single, large, three-generation pedigree known as the KE family, in which approximately half of the individuals were affected by a severe speech and language disorder. GWLA identified a region on chromosome 7q31 (maximum LOD score 6.62) that co-segregated with the language disorder in this particular family [26]; the signal has since been narrowed down to a causative mutation in the FOXP2 (forkhead box P2) gene (OMIM 605317) [24]. It is important to note that not all members of the KE family would meet the selection criteria for studies of specific language impairment, because they had evidence of both intellectual disability and motor impairment. In addition, the severe language impairment exhibited by members of the KE family surpasses the typical SLI phenotype, in the sense that it involves a variety of associated neurological dysfunctions. All of the FOXP2 coding regions, or exons, have since been screened for association with SLI, using 43 SLIC probands, but no associations or mutations were detected in this study [28]. It is likely, then, that the FOXP2 gene remains functional in typical SLI probands. Despite this, the FOXP2 gene is clearly vital for language acquisition, as demonstrated by the KE phenotype [26]. The KE phenotype has since been described as childhood apraxia of speech (CAS), which has been linked to disruptive variants within FOXP2, as demonstrated by similarly affected families [29–32]. Subsequent studies of CAS also identified 16 submicroscopic deletions and duplications (copy number variants [CNVs]) in half of the participants [33]. These fell across ten different chromosomes and have the potential to cause disruptions in speech and language development.
Of note, overlapping deletions at chromosome 16p13.2 were found in two of the participants, though the region does not overlap with previously implicated loci on chromosome 16, and the phenotypes associated with CNVs in this region have not been characterised [33]. High-throughput sequencing methods have also been applied to individuals affected by CAS [34]. Although this study sequenced the entire exomes (i.e. all known gene-coding regions) of ten CAS probands, it reported only mutations that affected known candidate genes. The study reported potentially clinically relevant variants (i.e. those variants that were predicted to have a deleterious effect upon protein function and had a reported population frequency of <0.3 %) in eight of the ten individuals investigated. These were distributed across six candidate genes that had previously been associated with CAS (FOXP1 [forkhead box P1], CNTNAP2 [contactin associated protein-like 2]) or overlapping phenotypes (ATP13A4 [ATPase type 13A4], CNTNAP1 [contactin associated protein 1], KIAA0319 and SETX [senataxin]) but did not include any FOXP2 mutations [34]. Although preliminary, the findings of this study suggest that the application of high-throughput methodologies and comprehensive analyses of the arising data may prove fruitful in future studies of speech and language impairments.
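The reporting criteria described above (a variant in a known candidate gene, predicted to be deleterious, with a population frequency below 0.3 %) amount to a simple filter. The sketch below illustrates that logic only; it is not the pipeline used in the cited study. The `Variant` class, the function names and the example variants are all hypothetical, and the candidate gene set is an assumption built from the genes named in the text plus FOXP2.

```python
from dataclasses import dataclass

# Assumed candidate set for illustration: genes named in the text plus FOXP2.
CANDIDATE_GENES = frozenset(
    {"FOXP1", "FOXP2", "CNTNAP2", "ATP13A4", "CNTNAP1", "KIAA0319", "SETX"}
)
MAX_POPULATION_FREQUENCY = 0.003  # 0.3 % expressed as a fraction


@dataclass(frozen=True)
class Variant:
    gene: str
    predicted_deleterious: bool
    population_frequency: float  # fraction of alleles in a reference population


def is_potentially_relevant(variant: Variant) -> bool:
    """Apply the three reporting criteria to a single variant."""
    return (
        variant.gene in CANDIDATE_GENES
        and variant.predicted_deleterious
        and variant.population_frequency < MAX_POPULATION_FREQUENCY
    )


def filter_variants(variants):
    """Return the variants that satisfy all three criteria, in input order."""
    return [v for v in variants if is_potentially_relevant(v)]


# Invented example data, not taken from any study.
EXAMPLE_VARIANTS = [
    Variant("CNTNAP2", True, 0.001),    # rare, deleterious, candidate gene
    Variant("SETX", True, 0.02),        # candidate gene but too common
    Variant("KIAA0319", False, 0.0005), # rare but not predicted deleterious
    Variant("GENE_X", True, 0.0001),    # rare and deleterious, not a candidate
]

if __name__ == "__main__":
    for variant in filter_variants(EXAMPLE_VARIANTS):
        print(variant)
```

Restricting the output to a predefined gene list, as in this sketch, means that deleterious variants in genes outside the list go unreported, which is the limitation noted above for a study that sequenced whole exomes.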
In 2010, a linkage study was conducted on a three-generation German family, the NE family, with multiple members affected by variable language and literacy impairments [23]. Psychoacoustic tests demonstrated an auditory processing deficit that co-segregated with these impairments. The investigators hypothesised that this deficit prevented the affected family members from discriminating between tone durations, putting them at an increased risk of language-related disorders. Linkage analysis suggested that a large 58.5 Mb region, containing 600 genes, on chromosome 12p13.31–12q14.3 may contain a contributory variant, but the specific variant has yet to be identified [23].
Another independent linkage study investigated an isolated Chilean island population with an increased prevalence of SLI. A series of language tests indicated that ~35 % of the island’s children met criteria for SLI, and a further 27.5 % had impaired language skills accompanied by other neurological deficits [12]. Parametric and non-parametric linkage analyses found consistent linkage to a 48 Mb stretch of chromosome 7q31–q36 (LOD score 6.73; P = 4.0 × 10⁻¹¹), which included the genes FOXP2 and CNTNAP2 (OMIM 604569) [12], the significance of which is discussed later in this article. No single co-segregating chromosome region was identified using parametric linkage analyses, which supports the likelihood of a polygenic aetiology of SLI in this population.
A recent linkage study detected a heterozygous 4 kb deletion at chromosome 2q36 (SLI5; OMIM 615432) in 15 Southeast Asian probands with language delay and white-matter hyper-intensities (WMH), which are common markers of ageing [25•]. The deletion, which eliminated exon 3 of the TM4SF20 (transmembrane 4 L six family member 20) gene, co-segregated with language delay in the 15 families studied and appeared to represent an ancestral haplotype confined to Southeast Asian populations, notably Vietnamese, Thai and Burmese, with an allele frequency of ~1 % [25•]. The function of the TM4SF20 gene is unknown, and its role has yet to be assessed in other SLI populations.
Another study described a geographically isolated, Russian-speaking population with an increased prevalence of SLI [35]. The settlement comprises ~871 people, 20–40 % of whom have language impairment [35]. At present, no genetic studies have been conducted using this population, but the increased prevalence of SLI may be caused by a founder mutation that is now widespread amongst the settlers, as seen in previous family and isolated-population studies of language impairment [12, 23, 25•].
Populations with an increased prevalence of language impairment can assist with the identification of candidate genes. It is assumed that the causative variant will be found more commonly within the affected population and will thus be easier to associate with the impairment. Future studies would benefit from investigating the role of the candidate genes that have been identified in these populations.