Introduction

Stuttering is a speech disorder characterized by involuntary repetitions or prolongations of words or syllables, or by involuntary interruptions in the flow of speech, known as blocks (Bloodstein and Ratner 2008). The disorder typically arises in children at age 3–4 years, when it is common, affecting up to 20% of all children. In 75–80% of these children, the disorder spontaneously resolves. However, in the remaining fraction, the disorder persists into adolescence and adulthood, when it is difficult to treat. The disorder preferentially affects males. At the typical age of onset, males slightly outnumber females, and with time, females tend to recover more frequently than males, which leaves a male/female ratio of approximately 4:1 in persistent stuttering. While the disorder has been clinically well described for millennia, its underlying causes have remained unknown. A remarkable array of potential causes of stuttering has been hypothesized; however, the lack of clear data supporting any of these suggested causes has hampered the development of more effective speech therapies.

Although specific causes of stuttering long remained obscure, it was clear that genetic factors play a role in the disorder. Evidence supporting this has come from numerous independent twin studies (Andrews et al. 1991; Howie 1981; Felsenfeld et al. 2000; Ooki 2005; Dworzynski et al. 2007), from adoption studies (Felsenfeld and Plomin 1997; Bloodstein 2006), and from studies of large families in which there are many cases of stuttering (MacFarlane et al. 1991). However, although a clear role for genetic factors was established, a more precise understanding of the nature of these factors has been elusive. Although the disorder predominantly affects males, transmission from fathers to sons is commonly observed, which rules out X-linked inheritance. In addition, there have been conflicting reports regarding other possible modes of inheritance of stuttering (e.g., dominant or recessive), and there has been conflicting evidence for the presence of a single major gene causing the disorder in families.

A traditional approach in inherited disease research is a linkage study, in which genetic markers positioned along the length of each human chromosome are tested for co-inheritance (i.e., linkage) with the disorder as it is passed down through families. Because the chromosomal location of each of the markers tested is known, the observation of linkage with a particular marker or markers identifies the location of the causative gene(s) in those families. However, the lack of Mendelian inheritance of stuttering in families rendered such studies, which traditionally rely on a clear model of inheritance, unfeasible. The development of so-called model-free or non-parametric methods for the statistical analysis of linkage data addressed this problem, and a number of genome-wide linkage studies of stuttering have now been published (Shugart et al. 2004; Wittke-Thompson et al. 2007; Suresh et al. 2006). In general, the results of these studies have been somewhat disappointing, because they have predominantly generated linkage scores of marginal statistical significance. Moreover, linkage locations observed in one study were not replicated in others.
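To make the idea of a linkage score concrete, the following minimal sketch computes a classical two-point LOD score for the simplest textbook case, in which every meiosis can be scored unambiguously as recombinant or non-recombinant. This is the model-based form of the statistic, not the model-free methods used in the stuttering studies cited above, and the function name and example counts are hypothetical.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known meioses (illustrative only).

    Compares the likelihood of the observed recombinant count at a
    recombination fraction `theta` with the likelihood under no linkage
    (theta = 0.5), on a log10 scale.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")

    non_recombinants = meioses - recombinants
    likelihood_linked = (theta ** recombinants) * ((1.0 - theta) ** non_recombinants)
    likelihood_unlinked = 0.5 ** meioses

    # A recombinant observed at theta = 0 is impossible under the linked model.
    if likelihood_linked == 0.0:
        return float("-inf")
    return math.log10(likelihood_linked / likelihood_unlinked)


# Hypothetical example: 10 informative meioses, no recombinants.
example_lod = lod_score(recombinants=0, meioses=10, theta=0.0)
```

With no recombinants, each informative meiosis contributes log10(2), or roughly 0.3, to the score, which is why families with many affected members are so valuable for linkage studies.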

To address this problem, we turned to families with many cases of stuttering from highly consanguineous populations. Such inbred populations have traditionally been advantageous for studies of genetic disorders, and the incidence of recessive disorders in particular is highly elevated compared to that in outbred populations. Although stuttering does not in general display a recessive or any other clear Mendelian inheritance pattern, we hypothesized that such consanguineous families would enable more effective linkage studies of stuttering. We initially recruited 44 consanguineous families, each with multiple individuals who stutter, from the city of Lahore, Pakistan and surrounding areas. Other speech-language disorders or other neurological phenotypes were not reported by the families, and not observed by the local evaluating speech-language pathologist. The Stuttering Severity Instrument, 3rd Edition, was used to characterize stuttering in these families.

We then performed a genome-wide search for linkage in these 44 Pakistani families. The results of this study were clear. We observed a strong linkage signal on the long arm of chromosome 12 (Riaz et al. 2005). Our subsequent studies have focused on identifying the causative genetic variation that resides in this region of chromosome 12.

Analysis of genes on chromosome 12

The region of chromosome 12 identified in our linkage study spanned more than 10 million bp of DNA and contained 87 known and predicted genes. Given the daunting task of evaluating this large candidate region, plus the complex inheritance pattern observed for stuttering, we hypothesized that analysis of a single large family would provide the best chance of identifying a unique causal genetic variant. We chose one of the largest families from our Pakistani linkage study, family PKST 72 (Fig. 1), for this purpose. We initially performed a customized array comparative genomic hybridization study using a CGH 385K Array (NimbleGen) to look for large-scale genetic lesions (such as chromosomal deletions or inversions) that were shared only by the affected members of family PKST 72. No such variants were identified. We then undertook systematic DNA sequencing of 45 of the 87 genes in this region in PKST 72 family members. This allowed us to identify a number of genetic variants in affected members of PKST 72; however, most of these variants were commonly observed in the normal Pakistani population, indicating that they were non-pathogenic natural polymorphisms.

Fig. 1

Pedigree structure of Pakistani stuttering family PKST 72. Filled blocks and circles denote affected family members, open blocks and circles denote the unaffected ones

The variant that co-segregated most closely with stuttering in PKST 72 was an apparent mutation that substituted an adenine residue for the normal guanine residue at position 3598 in the coding sequence of a gene designated GNPTAB (Kang et al. 2010). This mutation predicted the substitution of a positively charged lysine for the normal, negatively charged glutamic acid present at amino acid position 1200 in the GNPTAB protein. A comparison of the sequence of the GNPTAB gene in other species revealed that the glutamic acid at residue 1200 was conserved in the GNPTAB protein in all species known, suggesting that a glutamic acid is necessary for an important functional role at this position in the protein. An initial screen of 96 normal Pakistani control subjects showed that only one carried a lysine at this position, and no amino acids other than glutamic acid or lysine were observed at this position. Sequencing a single representative affected individual from each of the other 43 families included in the genome-wide linkage study of Riaz et al. (2005) revealed the lysine mutation at position 1200 in three additional Pakistani stuttering families, PKST 05, 25, and 41. The appearance of the same mutation in the same gene in a number of families, independently ascertained solely on the basis of familial stuttering, was a strong piece of evidence favoring the hypothesis that mutations in this gene cause stuttering. In these families, affected individuals carried either one or two copies of this mutation. This observation is consistent with the complex, non-Mendelian inheritance pattern well known to occur in stuttering families (Kidd et al. 1981; Cox et al. 1984).
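The relationship between coding position 3598 and amino acid position 1200 can be checked with a short calculation. The sketch below assumes the standard genetic code and 1-based numbering from the first base of the coding sequence. The source does not state which of the two glutamic acid codons is present in GNPTAB, so both are shown, and the function names are hypothetical.

```python
# Only the four codons needed for this illustration (standard genetic code).
CODON_TO_AMINO_ACID = {
    "GAA": "Glu",
    "GAG": "Glu",
    "AAA": "Lys",
    "AAG": "Lys",
}


def locate_coding_position(cds_position: int) -> tuple:
    """Map a 1-based coding-sequence position to (codon number, offset in codon).

    The offset is 0, 1, or 2 for the first, second, or third base of the codon.
    """
    codon_number = (cds_position - 1) // 3 + 1
    offset_in_codon = (cds_position - 1) % 3
    return codon_number, offset_in_codon


def substitute_base(codon: str, offset_in_codon: int, new_base: str) -> str:
    """Return the codon with one base replaced at the given offset."""
    return codon[:offset_in_codon] + new_base + codon[offset_in_codon + 1:]


codon_number, offset = locate_coding_position(3598)

# Apply the G-to-A change to each possible glutamic acid codon.
for reference_codon in ("GAA", "GAG"):
    mutant_codon = substitute_base(reference_codon, offset, "A")
    print(
        f"codon {codon_number}: {reference_codon} "
        f"({CODON_TO_AMINO_ACID[reference_codon]}) -> "
        f"{mutant_codon} ({CODON_TO_AMINO_ACID[mutant_codon]})"
    )
```

Position 3598 is the first base of codon 1200, and a guanine-to-adenine change at the first base of either glutamic acid codon yields a lysine codon, consistent with the substitution described above.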

We then examined the GNPTAB gene more fully in 316 unrelated individuals who stutter and in 276 unrelated neurologically normal controls. Additional mutations at other sites in this gene were identified in our cases, but no variants were observed in any of our controls (Table 1). In human gene finding studies, the identification of different mutations in the same gene in different families with the same disorder constitutes substantial proof that the disease gene has been correctly identified. Thus, our finding of additional mutations in the GNPTAB gene in unrelated familial cases of stuttering added significant additional support to our hypothesis that mutations in GNPTAB cause stuttering.

Table 1 Mutations found in the GNPTAB, GNPTG and NAGPA genes in stuttering (Kang et al. 2010)

The GNPTAB gene and the lysosomal enzyme-targeting pathway

The GNPTAB gene encodes a portion of the enzyme GlcNAc-phosphotransferase (EC 2.7.8.17). This enzyme acts in the first step of the so-called lysosomal enzyme-targeting pathway. This pathway generates the mannose-6-phosphate targeting signal, a marker recognized by the mannose-6-phosphate receptors that directs a large group of degradative hydrolases (~60) to the lysosome (Reitman and Kornfeld 1981) (Fig. 2). The GNPTAB gene encodes a single polypeptide that is subsequently cleaved to generate the α and β subunits, which together constitute the catalytic subunits of the GlcNAc-phosphotransferase enzyme. The holoenzyme also contains a third subunit, designated γ, which functions in substrate recognition. The γ subunit is encoded by the GNPTG gene. This gene thus became a functional candidate gene for stuttering.

Fig. 2

The two steps that add mannose-6-phosphate markers to lysosomal enzymes. In the first step, GlcNAc-phosphotransferase (GNPTAB/G) transfers GlcNAc-1-phosphate from UDP-GlcNAc to terminal mannose residues of N-linked glycans on enzymes destined for the lysosome. In the second step, N-acetylglucosamine-1-phosphodiester-alpha-N-acetylglucosaminidase (NAGPA), also known as the uncovering enzyme, cleaves off the GlcNAc residue, thereby exposing mannose-6-phosphate (circled). Enzymes with this recognition marker are then transported from the Golgi to the lysosome

Examination of the sequence of the GNPTG gene revealed a number of mutations, all at amino acid positions evolutionarily conserved within the encoded protein, in individuals who stutter (Table 1). None of these variants were observed in the GNPTG gene of 276 neurologically normal control subjects or in databases of normal human genetic variants, such as dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/).

The lysosomal enzyme-targeting pathway has two steps. The first is carried out by the GNPTAB/G phosphotransferase enzyme. The second step is carried out by the enzyme N-acetylglucosamine-1-phosphodiester alpha N-acetylglucosaminidase (EC 3.1.4.45), encoded by the NAGPA gene. This enzyme is commonly referred to as the Uncovering Enzyme (UCE), because it removes the terminal GlcNAc residue, exposing the mannose-linked phosphate (Fig. 2) that serves as the ultimate targeting signal. The NAGPA gene thus also became a candidate gene for stuttering. Sequencing this gene quickly revealed a number of different mutations in stuttering cases, none of which occurred in normal controls (Table 1). One of these mutations (encoding a cysteine in place of the normal arginine at position 328 in this protein) was observed in five unrelated individuals who stutter, suggesting that it may be a commonly occurring mutation in stuttering.

Together, our data provide a strong argument for the involvement of mutations in the lysosomal enzyme-targeting pathway in stuttering. The mutations we observed were generally uncommon in the stuttering population. In aggregate, a mutation was observed in one of these three genes in approximately 8% of our unrelated cases of familial stuttering. However, given the relatively high frequency of persistent stuttering in the general population (estimated to be as high as 1%), this frequency could represent a substantial number of cases attributable to such mutations, perhaps tens of thousands of cases in the United States alone.

Lysosomal targeting disorders in humans — the mucolipidoses

Mutations in the genes encoding the lysosomal enzyme-targeting pathway in humans are well studied. Mutations in GNPTAB cause mucolipidosis type II (ML II, formerly known as I-cell disease, MIM#252500) or type IIIA (ML IIIA, formerly known as pseudo-Hurler polydystrophy, MIM#252600) (Kudo et al. 2006), while mutations in GNPTG cause mucolipidosis type IIIC (ML IIIC, formerly known as variant pseudo-Hurler polydystrophy, MIM#252605) (Raas-Rothschild et al. 2000). ML II and IIIA are rare lysosomal storage disorders characterized by pathology in many tissues, including the skeletal system, connective tissues, brain, liver and spleen. A detailed clinical examination of four stuttering subjects carrying mutations in these genes revealed no symptoms associated with ML II or III, and other than moderate to severe persistent stuttering, all subjects were neurologically normal by medical examination. One possible explanation for this discrepancy is that ML II and III are typically associated with complete or near complete loss-of-function mutations, such as deletions or stop codons, while virtually all the mutations we observed in stuttering are mis-sense mutations, which may not abolish the functioning of these enzymes. Biochemical functional studies of the enzymes that carry the mutations we have found in stuttering subjects will be an important adjunct to our genetic findings thus far. An interesting observation in children with ML II is that although this severe disease is fatal in the first decade of life, even the children who survive the longest fail to gain the ability to speak (Otomo et al. 2009).

No human disorder has previously been associated with mutations in the NAGPA gene. This has been puzzling, as such mutations might be expected to cause symptoms similar to mucolipidosis types II and III. We hypothesize that the predominant manifestation of mutations in the NAGPA gene is persistent nonsyndromic stuttering.

Other potential mutations affecting lysosomal function in stuttering

Many different chromosomal locations have generated suggestive evidence for linkage in family studies of stuttering (Shugart et al. 2004; Wittke-Thompson et al. 2007; Suresh et al. 2006). Given our results implicating mutations affecting lysosomal function in stuttering, it is reasonable to hypothesize that mutations in other genes encoding lysosomal functions may contribute to this disorder. Although several such genes reside near a previously suggested linkage signal for stuttering, we have not yet identified mutations in any of these genes in our stuttering population.

Population genetics of stuttering

Many of the mutations in these three genes were observed only once in our stuttering cases, suggesting that they are rare. However, two mutations, the substitution of a lysine at position 1200 in GNPTAB and the substitution of a cysteine at position 328 in NAGPA, were repeatedly observed in apparently unrelated individuals. The lysine variant at position 1200 was observed in the affected members of four ostensibly unrelated Pakistani families (PKST 05, 25, 41, and 72) as well as in three unrelated individual cases. The cysteine variant at position 328 in NAGPA has now been observed in five unrelated individuals.

There are two ways in which a mutation comes to be commonly observed in apparently unrelated individuals. The first is a so-called mutation hotspot, in which new mutations repeatedly arise at the same position in the same gene. The second is a founder mutation, which occurred once in the distant past and has been passed down to the individuals who carry it today. Such individuals are thus actually distantly related to each other through the common ancestor in whom the mutation arose. These two possibilities can be distinguished by examining the DNA sequence of the chromosomes surrounding the mutation in different individuals. Recurrent mutations arise independently at the same site, on copies of the chromosome that differ between individuals, so the DNA sequence surrounding the mutation shows occasional variant sites that differ on each mutation-carrying chromosome. Founder mutations, in contrast, typically still carry some of the chromosome on which they arose, leading to a so-called conserved haplotype of DNA sequence variants surrounding the mutation in different individuals (Drayna 2005). We have not yet identified enough cases of the cysteine mutation at position 328 in NAGPA for a statistically significant analysis. However, we have observed a total of 14 chromosomes carrying the lysine mutation at position 1200 in GNPTAB, sufficient for an initial study of this question.
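The distinction between a founder mutation and a recurrent mutation can be illustrated with a toy comparison of the markers flanking the mutation. The sketch below is not the analysis used in our studies; the marker strings are invented data, each letter stands for the allele at one marker, "M" marks the mutation itself, and the function name is hypothetical.

```python
from typing import List, Tuple


def shared_haplotype_span(haplotypes: List[str], mutation_index: int) -> Tuple[int, int]:
    """Return the (first, last) marker indices of the contiguous block around
    the mutation on which every chromosome carries the same allele."""
    n_markers = len(haplotypes[0])
    if any(len(h) != n_markers for h in haplotypes):
        raise ValueError("all haplotypes must cover the same markers")

    def all_share_allele(marker: int) -> bool:
        return len({h[marker] for h in haplotypes}) == 1

    if not all_share_allele(mutation_index):
        raise ValueError("not every chromosome carries the mutation")

    # Extend the shared block outward from the mutation in both directions.
    first = mutation_index
    while first > 0 and all_share_allele(first - 1):
        first -= 1
    last = mutation_index
    while last < n_markers - 1 and all_share_allele(last + 1):
        last += 1
    return first, last


# Invented example data: three mutation-carrying chromosomes in each scenario.
founder_like = ["ABBAMABAB", "BBBAMABBA", "AABAMABAA"]
recurrent_like = ["ABABMBABA", "BABAMABAB", "AABBMBBAA"]

founder_span = shared_haplotype_span(founder_like, mutation_index=4)
recurrent_span = shared_haplotype_span(recurrent_like, mutation_index=4)
```

In the founder-like data the shared block extends two markers to either side of the mutation, whereas in the recurrent-like data the chromosomes agree only at the mutation itself.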

All 14 chromosomes containing this mutation shared a common haplotype of variants in the surrounding chromosomal DNA, indicating that it is a founder mutation. The length of the shared haplotype can be used to estimate the age of this mutation. In this case the estimate is subject to wide error due to the small sample size, but it is clear that the mutation is quite old, with a most likely estimate of ~570 generations, or about 14,000 years (Fedyna et al. 2011). It is noteworthy that all individuals carrying this mutation have their family origins in either Pakistan or India. A limited geographic distribution is common for many founder mutations, further supporting our conclusion regarding the origin of this mutation.
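The conversion from generations to years depends on the generation time assumed. The sketch below uses 25 years per generation, which is an assumption chosen for illustration and is not stated in the source. The rounded figure of about 14,000 years quoted above corresponds to a slightly shorter generation time of roughly 24.6 years.

```python
def generations_to_years(generations: float, years_per_generation: float = 25.0) -> float:
    """Convert a mutation age in generations to years.

    The default generation time of 25 years is an illustrative assumption.
    """
    return generations * years_per_generation


# Most likely age estimate quoted in the text: ~570 generations.
estimated_age_years = generations_to_years(570)

# Generation time implied by the rounded values in the text (~14,000 / ~570).
implied_generation_time = 14000 / 570
```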

Neuropathology of stuttering

How do mis-sense mutations in genes encoding the lysosomal enzyme-targeting pathway specifically cause stuttering? At present, we do not know. The lysosomal enzyme-targeting pathway operates in all cells of the body, and the GNPTAB/G and NAGPA enzymes are expressed in all of them (Kornfeld et al. 1999). However, we hypothesize that mutations in these genes generate a metabolic defect in a class of neurons uniquely dedicated to speech. At this time, we believe such neurons are unique to speech because of the lack of other detectable neurologic symptoms in the individuals carrying these mutations that we have evaluated to date. Consistent with this hypothesis is the fact that lysosomal storage disorders frequently display a striking specificity in the organs, or the subtissues within organs, that are affected (Kornfeld and Sly 2010). The future goals of our research are to identify the cells within the brain that are affected by these mutations, to determine the normal function of these cells, to trace their connections within the central and peripheral nervous system, and to determine how the function of these cells is altered by the stuttering mutations we have found.

Finding additional stuttering genes

Our findings to date can explain less than 10% of familial stuttering. Our goal has been to apply our genetic approach to identifying the underlying genes in other cases of this disorder. To this end, we have recently enrolled an additional set of highly consanguineous families in Pakistan, each having multiple individuals affected with persistent stuttering. A genetic linkage study of one of these families, designated PKST 77, has identified a locus on chromosome 3 that shows strong evidence for linkage to stuttering (Fig. 3) (Raza et al. 2010). No gene or linkage signal for stuttering has previously been suggested to reside at this locus, indicating that the identification of this gene is likely to provide new insights into the pathology of persistent stuttering.

Fig. 3

Pedigree structure of Pakistani stuttering family PKST 77. Filled blocks and circles denote affected family members, open blocks and circles denote the unaffected ones

Future research opportunities

The finding of mutations associated with stuttering has opened a number of possible new avenues for research. One longstanding question in speech-language pathology has been why stuttering therapy can have very different outcomes in different individuals. One possibility is that such outcomes are influenced by different genetic mutations carried by different individuals. Because we now have genetic mutations that we can test, it becomes possible to address this question.

Another exciting avenue of research is based on the large body of knowledge that exists on the biochemistry and cell biology of the GNPTAB/G and NAGPA enzymes, which is available due to decades of careful medical research on the mucolipidoses. Why do some mutations in these genes cause mucolipidosis, a fatal multi-organ system disease, while other mutations in the same genes cause only stuttering? What are the enzymatic and cell biological consequences of these stuttering mutations, and how can they help elucidate the neuropathology underlying this disorder? The well-established biochemical assays for these enzymes (Reitman and Kornfeld 1981; Varki and Kornfeld 1981) hold the promise of answering these questions in some detail.

Finally, it is clear from comparative genomic analysis studies that humans and other mammalian species share the vast majority of their genes. Although speech and language are unique to humans, these abilities are built upon structures and functions that are encoded by genes that are not unique to humans. This suggests that studies in model organisms may be useful in understanding stuttering. Mice carrying complete loss-of-function mutations (knock-out mice) have been engineered for the GNPTAB and GNPTG genes. These mice display symptoms similar to human mucolipidosis and die at an early age (Gelfman et al. 2007; Vogel et al. 2009). Therefore, it will be necessary to engineer the equivalent of human stuttering mutations in these genes (creating knock-in mice) to generate plausible mouse models of human stuttering. While mouse vocalizations are extremely rich (Holy and Guo 2005), they remain poorly characterized, partly because most of them are ultrasonic and not accessible to analysis without sophisticated acoustical methods. However, the observation of altered vocalization in a mouse carrying a human stuttering mutation would open a wide array of studies that could exploit the unparalleled genetic methods available in this species. In addition, the availability of such mice will allow a search for other neurological deficits that may manifest themselves in such mice, and will facilitate neuropathology studies that could identify specific degenerative processes in the mouse brain. Although genetic studies of stuttering are in their infancy, we believe such studies hold the promise of significant advances in our understanding of this common speech disorder at the molecular and cellular level.