Background

Speech and language disorders can be classified into numerous categories, including stuttering, speech sound disorder (SSD), verbal dyspraxia, specific language impairment (SLI) and developmental dyslexia (DD) [1]. Dyslexia, also known as reading disability (RD), is characterized by difficulties in reading and spelling despite normal intelligence and adequate education, in the absence of neurological impairment [2,3]. Although language disorders such as dyslexia are conceptually distinct from speech disorders, in many cases it is difficult to distinguish a language disorder from a speech disorder in a specific individual [4]. Hence, some researchers regard them as a continuum of language disorders [5-7]. Motor deficiency might be one of the underlying mechanisms that explain how the two types of disorder are connected. For instance, stuttering has been attributed to a temporal motor defect in speech preparation [8,9]. As for dyslexia, some recent studies have revealed that dyslexic individuals also suffer from motor problems, especially in performing fine movements [6,10]. A substantial body of evidence suggests that language disorders and speech disorders may share some genetic factors. For example, forkhead box P2 (FOXP2) and its downstream target gene contactin associated protein-like 2 (CNTNAP2) have been shown to be an important link in the networks of several speech and language disorders, including SLI, dyslexia, stuttering and dyspraxia [1,11-20]. This viewpoint prompted us to test whether candidate genes for stuttering are also involved in the pathogenesis of developmental dyslexia.

Recently, in a study of stuttering individuals from Pakistan and North America, candidate gene and linkage analyses identified several mutations in the lysosomal enzyme-targeting pathway genes N-acetylglucosamine-1-phosphate transferase gene (GNPTAB), N-acetylglucosamine-1-phosphate transferase, gamma subunit (GNPTG) and N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase (NAGPA) [21]. Subsequent studies of stuttering identified mutations in GNPTAB and in the two functionally related genes GNPTG and NAGPA, both in large families and in sporadic patients, reaffirming their association with stuttering [22-24]. However, the relevance of these genes to dyslexia has not yet been reported. Stuttering has been shown to be more common in children who suffer from concomitant speech, language, or motor deficiencies, implying that speech and language disorders may be genetically connected to some extent. Therefore, the three genes (GNPTAB, GNPTG and NAGPA) that may predispose people to stuttering are potential candidate risk genes for other speech and language disorders. Based on this evidence, we performed an association analysis of these genes with dyslexia in a large cohort of unrelated Chinese children.

Results

Single marker analysis

In the present study, we genotyped Tag SNPs in three candidate genes for stuttering: GNPTAB, GNPTG and NAGPA. Association results were additionally adjusted for age and sex. Table 1 shows the SNP markers with significant unadjusted P-values (<0.05) in the study.

Table 1 Association between significant SNP markers and dyslexia under the additive, dominant, genotypic and recessive models

In GNPTAB, we genotyped 11 Tag SNPs and found a nominal association of one SNP with dyslexia before adjustment (Additional file 1). SNP rs10778148 showed a significant association with dyslexia under the recessive model (P = 0.007633, OR = 7.568) and for the homozygous genotype (P = 0.008803, OR = 7.3083). After adjustment for age and sex, the association between SNP rs10778148 and dyslexia remained significant under the recessive model (Padjusted = 0.01205, OR = 7.462) and for the homozygous genotype (Padjusted = 0.01364, OR = 7.2499). Moreover, rs17031962 reached significance under the dominant model (Padjusted = 0.003443, OR = 0.6647) and for the heterozygous genotype (Padjusted = 0.007001, OR = 0.6738) after adjustment for age and sex. However, only the P-value of rs17031962 under the dominant model (FDR-adjusted P = 0.0357) remained significant after FDR adjustment for multiple comparisons.

In GNPTG, we genotyped 2 Tag SNPs (Additional file 2) and found only one SNP significantly associated with dyslexia before adjustment. SNP rs2887538 was significantly associated with dyslexia under the dominant model (P = 0.03411, OR = 0.7634). However, no significant association was found after FDR correction.

In NAGPA, we genotyped 8 Tag SNPs (Additional file 3) and found only one SNP significantly associated with dyslexia before adjustment. SNP rs882294 was significantly associated with dyslexia under the additive model (P = 0.006043, OR = 1.404), under the dominant model (P = 0.006426, OR = 1.462) and for the heterozygous genotype (P = 0.01175, OR = 1.4361). After adjustment for age and sex, the association between SNP rs882294 and dyslexia remained significant under the additive model (Padjusted = 0.001571, OR = 1.531), under the dominant model (Padjusted = 0.00167, OR = 1.611) and for the heterozygous genotype (Padjusted = 0.003546, OR = 1.6765). After FDR correction, the association between SNP rs882294 and dyslexia remained significant under the additive model (FDR-adjusted P = 0.0336) and the dominant model (FDR-adjusted P = 0.0357).
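To make the genetic models used in this section concrete, the following minimal Python sketch shows how the three genotype counts of a biallelic SNP can be collapsed into a 2x2 table under the dominant, recessive, heterozygous and homozygous comparisons, and how an unadjusted odds ratio follows from that table. The genotype counts and function names are hypothetical illustrations, not data or code from this study. The age- and sex-adjusted estimates reported above would come from logistic regression, not from this simple ratio.

```python
def collapse(counts, model):
    """Collapse genotype counts into (exposed, unexposed) for one group.

    counts: (ref_hom, het, alt_hom), where "alt" is the tested (minor) allele.
    model:  which comparison to build; names here are illustrative only.
    """
    ref_hom, het, alt_hom = counts
    if model == "dominant":        # carriers of >= 1 minor allele vs. none
        return het + alt_hom, ref_hom
    if model == "recessive":       # two minor alleles vs. all others
        return alt_hom, ref_hom + het
    if model == "heterozygous":    # heterozygotes vs. reference homozygotes
        return het, ref_hom
    if model == "homozygous":      # minor homozygotes vs. reference homozygotes
        return alt_hom, ref_hom
    raise ValueError(f"unknown model: {model}")


def odds_ratio(case_counts, control_counts, model):
    """Unadjusted odds ratio of being a case for the 'exposed' genotype class."""
    case_exp, case_unexp = collapse(case_counts, model)
    ctrl_exp, ctrl_unexp = collapse(control_counts, model)
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


# Hypothetical genotype counts (ref_hom, het, alt_hom); NOT study data.
cases = (300, 170, 30)
controls = (350, 160, 10)

for model in ("dominant", "recessive", "heterozygous", "homozygous"):
    print(model, round(odds_ratio(cases, controls, model), 3))
```

An odds ratio above 1 under a given model marks the exposed genotype class as more frequent among cases, whereas a value below 1, as reported for rs17031962, points in the protective direction.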

Haplotype analysis

We built 3 blocks within GNPTAB and 3 blocks within NAGPA through Haploview software (Figures 1 and 2).

Figure 1

Linkage disequilibrium analysis of the 11 SNPs in GNPTAB investigated in healthy controls (a). Three blocks were identified using Haploview software (b).

Figure 2

Linkage disequilibrium analysis of the 8 SNPs in NAGPA investigated in healthy controls (a). Three blocks were identified using Haploview software (b).

In GNPTAB, haplotype analysis was conducted in three blocks (Table 2). None of the blocks was associated with dyslexia (P > 0.05, omnibus test), but a four-marker protective haplotype TTCT (Block 1, rs1811338-rs17031962-rs10778148-rs11111007) was identified after adjustment for age and sex (Padjusted = 0.00985, OR = 0.761). However, none of these P-values remained significant after FDR correction.

Table 2 Haplotypes of the three blocks in GNPTAB between developmental dyslexia and control subjects

In NAGPA, haplotype analysis was conducted in three blocks (Table 3). Block 3, consisting of rs1001170, rs882294 and rs17137545, was associated with dyslexia (P = 0.0228, omnibus test) and included one risk haplotype, TCT (Punadjusted = 0.0129, OR = 1.38). After adjustment for age and sex, the association for haplotype TCT in Block 3 remained significant (Padjusted = 0.00289, OR = 1.52), and a risk haplotype GTC in Block 2 (rs12929808-rs7110-rs3743840) reached significance (Padjusted = 0.0494, OR = 1.28). However, none of these P-values remained significant after FDR correction.

Table 3 Haplotypes of the three blocks in NAGPA between developmental dyslexia and control subjects

Discussion

Generally, deficits in speech and language functions can be characterized as expressive (production), receptive (comprehension) or mixed [4]. Genetically, different mental disorders may share some common factors [1,11-20]. The present study aimed to test for association between dyslexia and three stuttering-associated genes, GNPTAB, GNPTG and NAGPA. Our data showed that genetic variants of GNPTAB and NAGPA might contribute to the pathogenesis of dyslexia.

GNPTAB and GNPTG encode the alpha and beta subunits and the gamma subunit, respectively, of the enzyme UDP-GlcNAc-1-phosphotransferase (GNPT), which is essential for proper trafficking of lysosomal acid hydrolases [25]. Mutations in GNPTAB and GNPTG can cause mucolipidosis types II and III, which are severe autosomal recessive lysosomal storage diseases [26,27]. Here we found that two SNP markers, rs17031962 and rs10778148, were associated with dyslexia with significant P-values after adjustment for age and sex. However, only the intronic SNP marker rs17031962 remained associated with dyslexia, under the dominant model, after FDR correction.

NAGPA encodes a Golgi enzyme that catalyzes the second step in the formation of the mannose 6-phosphate recognition marker on lysosomal hydrolases [28]. Our data showed that SNP rs882294 was associated with dyslexia after FDR correction, with allele C as the risk allele. Recently, three mutations in the NAGPA gene, one deletion and two missense mutations, have been identified in patients with persistent stuttering. Further biochemical analysis showed that these mutations could impair protein folding and alter degradation by the proteasomal system [29]. Since both GNPTAB and NAGPA are involved in lysosomal decomposition, the above evidence may point to a role for inherited enzyme deficiencies in lysosomal metabolism in speech and language disorders such as stuttering and dyslexia. Furthermore, this knowledge may prompt new investigations into the biological mechanisms underlying speech and language disorders.

Conclusion

In conclusion, we found a significant association between developmental dyslexia and genetic variants in genes encoding the lysosomal targeting system in a large cohort of unrelated Chinese children. Our data also support the view that common genetic factors underlie the pathophysiology of different speech and language disorders.

Methods

Subjects

Dyslexia screening followed the two-stage procedure reported previously, and the criteria for dyslexic patients and healthy individuals were also described previously [30]. This study was approved by the ethical committee of Tsinghua University School of Medicine. The guardians of children under 16 gave informed, written consent for participation in the study. Briefly, 6,900 primary school students aged 7 to 13 from Shandong province of China took a Chinese reading test consisting of character-, word-, and sentence-level questions. Then, 1,794 participants whose reading scores were above the 87th percentile or below the 13th percentile among all students in the same grade were chosen for further evaluation. These participants were individually given a character reading test composed of 300 Chinese characters to assess their reading ability. The Raven’s Standard Test was then performed to exclude individuals with intellectual deficiency. In total, 1,024 children were selected for subsequent analysis, including 502 dyslexic patients and 522 controls.

SNP markers selection and genotyping

In total, 21 Tag SNPs covering GNPTAB, GNPTG and NAGPA were selected with the Tagger program [31], using a minor allele frequency (MAF) over 5% and a pairwise r² threshold of 0.8. Genomic DNA was extracted from saliva samples using the Oragene™ DNA self-collection kit (DNA Genotek Inc., Ottawa, Ontario, Canada), and DNA quantity was determined by Nanodrop spectrophotometry (Nanodrop 1000 Spectrophotometer, Thermo Scientific, Wilmington, DE). SNP genotyping was performed on the Sequenom MassARRAY platform (Sequenom, San Diego, CA) at CapitalBio Corporation (Beijing, China). Locus-specific PCR and primer extension reactions were designed using the MassARRAY Assay Design software package (v3.1). A MALDI-TOF mass spectrometer and MassARRAY Typer 4.0 software were used for mass determination and data acquisition.
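The two selection parameters above (MAF over 5% and pairwise r² of at least 0.8) can be illustrated with a minimal Python sketch. It computes a minor allele frequency from genotype counts, computes r² from haplotype and allele frequencies, and picks tags with a simplified greedy pairwise scheme. This is an assumption-laden illustration and not Tagger's actual implementation. The SNP names, counts and r² values are hypothetical.

```python
def minor_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """MAF of a biallelic SNP from its three genotype counts."""
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    alt_freq = (2 * n_alt_hom + n_het) / total_alleles
    return min(alt_freq, 1.0 - alt_freq)


def pairwise_r2(p_ab, p_a, p_b):
    """LD r^2 between two SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def greedy_tag_snps(snps, r2, threshold=0.8):
    """Simplified greedy pairwise tagging (illustrative only).

    snps: list of SNP names that already passed the MAF filter
    r2:   dict mapping frozenset({snp1, snp2}) -> pairwise r^2
    Repeatedly picks the SNP that captures the most still-uncaptured SNPs.
    """
    def captures(tag, other):
        return tag == other or r2.get(frozenset((tag, other)), 0.0) >= threshold

    uncaptured = set(snps)
    tags = []
    while uncaptured:
        best = max(snps, key=lambda s: sum(captures(s, o) for o in uncaptured))
        tags.append(best)
        uncaptured -= {o for o in uncaptured if captures(best, o)}
    return tags


# Hypothetical example: snpB is in strong LD with snpA and snpC; snpD stands alone.
example_snps = ["snpA", "snpB", "snpC", "snpD"]
example_r2 = {
    frozenset(("snpA", "snpB")): 0.90,
    frozenset(("snpB", "snpC")): 0.85,
    frozenset(("snpA", "snpC")): 0.50,
}
print(greedy_tag_snps(example_snps, example_r2))
```

In this toy example a single tag covers three correlated SNPs, which is the reason a relatively small set of Tag SNPs can represent the common variation across a gene.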

Data analysis

Statistical analysis was undertaken using PLINK software (http://pngu.mgh.harvard.edu/~purcell/plink/), an open-source whole genome association analysis toolset that is commonly used to perform a range of basic, large-scale analyses [32]. Hardy-Weinberg equilibrium (HWE) tests were undertaken for each SNP, and association tests were performed using additive, dominant, or recessive genetic models. Haplotype analyses were performed using Haploview software (Version 4.2), a package that computes linkage disequilibrium (LD) in genetic data, performs association studies, chooses tagSNPs and estimates haplotype frequencies [33,34]. Chi-square tests were used to test for haplotype association and full model association (Genotype, Dom, Rec). Fisher’s exact test was used for allelic association. Logistic regression was applied for risk stratification with or without covariates (age and sex) in both single marker and haplotype analysis. False discovery rate (FDR) correction for multiple testing was undertaken for the 21 SNPs that were included in the single-site association analysis.
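As an illustration of FDR correction for multiple testing, the sketch below implements the Benjamini-Hochberg step-up adjustment in plain Python. The text above does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption here. The function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    Each raw P-value is scaled by m / rank (rank 1 = smallest P), and a
    running minimum taken from the largest P downwards keeps the adjusted
    values monotone and capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):                         # walk from largest P down
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for four tests; NOT study data.
raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"raw P = {p:.3f}  adjusted P = {q:.3f}")
```

A test is then declared significant when its adjusted P-value falls below the chosen FDR level, for example 0.05.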