Background

Stuttering is a fluency disorder that produces various forms of speech interruption and affects all language groups. It typically arises in children aged approximately 2 to 5 years, when they begin to develop more complex speech and language (Reilly et al. 2013; Didirková et al. 2021; Polikowsky et al. 2022). Stuttering is more common in males than in females, with a male-to-female ratio of 5:1, and most affected individuals, particularly females, recover spontaneously or with the aid of speech therapy (Drayna and Kang 2011; Yairi and Ambrose 2013). It has long been observed that stuttering frequently runs in families and is highly heritable (Fedyna et al. 2011; Barnes and Neutel 2016; Bloodstein et al. 2021). Various studies have demonstrated a strong genetic influence on stuttering risk and identified coding variants in the GNPTAB, GNPTG, and NAGPA genes, linking stuttering to mutations in the lysosomal enzyme-targeting pathway (Riaz et al. 2005; Kang et al. 2010; Raza et al. 2016; Frigerio-Domingues and Drayna 2017; Gunasekaran et al. 2021).

Over the years, research on the neurological aspects of stuttering has been carried out to understand the nature and metabolism of the disorder (Alm 2021). Analysis of the expression of stuttering genes (GNPTAB and NAGPA) in children with persistent stuttering and non-stuttering controls revealed gray matter differences linked to lysosomal deficits (Chow et al. 2020). Lysosomal deficits likely reduce the processing of biomolecules (Alm 2021). Energy metabolism has also been implicated: mice carrying the mutant GNPTAB gene had fewer astrocytes in the brain, a finding that may be related to a reduced peak rate of energy supply to the motor system (Barnes and Neutel 2016).

The genes GNPTAB [NM_024312.4] (N-acetylglucosamine-1-phosphate transferase subunits alpha and beta), located on chromosome 12q23.2, and GNPTG [NM_032520.4] (N-acetylglucosamine-1-phosphate transferase subunit gamma), located on chromosome 16p13.3, together encode a phosphotransferase enzyme. NAGPA [NM_016256.3] (N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase), also located on chromosome 16p13.3, encodes an enzyme responsible for the removal of N-acetylglucosamine, thereby uncovering the mannose 6-phosphate (M6P) signal that targets acid hydrolases to lysosomes (Kazemi et al. 2018; Gunasekaran et al. 2021). Several candidate missense variants, namely GNPTAB: rs137853824, rs137853823, rs137853825; GNPTG: rs137853827; and NAGPA: rs139526942, were previously detected in stutterers (Kang et al. 2010).

Genome-wide association studies (GWAS) using high-throughput sequencing have identified several loci linked with the trait and pointed to additional candidate genes. Exonic mutations in the SLC6A3 gene (rs2617604, rs28364997, rs28364998) and the DRD2 gene (rs6275, rs6277) were detected among Han Chinese patients with speech disfluency (Lan et al. 2009); AP4E1 gene variants (rs760021635, rs556450190) in a large African family (Raza et al. 2013, 2015); a CYP17A1 gene variant (rs743572) among Kurdish patients (Mohammadi et al. 2017); and a CYRIA gene variant (rs12613255) in patients of European ancestry (Shaw et al. 2021). High-throughput sequencing has also made it possible to detect an abundance of variants in noncoding segments (introns) through several GWAS (Reuter et al. 2015; Elliott and Larsson 2021). Sequence elements within nuclear introns may modulate important functions in gene expression, mRNA export, splicing, alternative splicing, and transcription coupling (Berk 2016; Panaro et al. 2022). Studies on intronic variants in stuttering are limited, and only a few have reported a small number of intronic alleles. Therefore, the current study was performed to identify the intronic single-nucleotide variants (iSNVs) of three candidate genes (GNPTAB, GNPTG, and NAGPA) and to assess the possible pathogenicity of these intronic variants in a south Indian cohort of people who stutter.

Methods

Recruitment and stuttering examination

The study included 100 participants (94 male and 6 female) aged > 18 years who enrolled for speech impairment assessment at the All India Institute of Speech and Hearing (AIISH). All study participants underwent a detailed speech pathology examination. Individuals with developmental stuttering but without any associated communication, cognitive, psychological, or neurological problems were selected. Of the 100 participants, 67 (67%) had a family history of stuttering and the remaining 33 (33%) had none. Severity ranged from very mild (46/100, 46%) through moderate (36/100, 36%) to very severe (18/100, 18%) stuttering, with an average onset age of 2–5 years. The Stuttering Severity Instrument (Riley and Bakker 2009) was used to document the severity of overt stuttering.

Sample and DNA isolation

About 5 ml of peripheral venous blood was collected from each study participant (n = 100) by standard phlebotomy. DNA was isolated using the PureLink™ Genomic DNA Mini Kit (Thermo Fisher Scientific) as per the manufacturer's protocol.

Massively parallel sequencing and analysis

Of the 100 samples, only 79 (75 males and 4 females; mean age ± SD = 26 ± 6.49 years) were selected for sequencing based on DNA quantitation. Custom-targeted libraries were constructed with the Ion AmpliSeq Library Kit Plus (Life Technologies), and PCR enrichment was done using the Ion AmpliSeq Exome RDY panel (Life Technologies) according to the manufacturer's protocols. Sequencing was performed on the Ion Proton™ next-generation sequencing system (Life Technologies) following the manufacturer's guidelines. All sequencing data passed specific minimal quality control requirements. Sequence reads were aligned to the reference genome (hg19) using TMAP Alignment (Thermo Fisher Scientific), and variants were detected using Ion Reporter (Thermo Fisher Scientific).

Allele frequency estimation, functional annotation, and pathogenicity prediction of iSNVs

Intronic variants were filtered based on their allele frequencies. The allele frequencies of the variants were compared with those in the gnomAD database (Karczewski et al. 2020) (https://gnomad.broadinstitute.org/), which served as a control. RegulomeDB (http://regulomedb.org), a database integrating information from the Encyclopedia of DNA Elements (ENCODE), was used to annotate single-nucleotide variants (Boyle et al. 2012). RegSNPs-intron (https://regsnps-intron.ccbb.iupui.edu/), a computational framework, was used to predict the pathogenic impact of intronic single-nucleotide variants (Lin et al. 2019).
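As an illustration of the allele-frequency comparison described above, the following Python sketch computes a cohort alternate-allele frequency from genotype counts and compares the observed allele count with a reference frequency using an exact binomial test. The source does not state which statistical test produced its p-values, so the binomial test is a stand-in chosen for illustration only; the function names, the genotype counts, and the reference frequency of 0.10 are all hypothetical.

```python
from math import comb


def allele_frequency(n_het, n_hom_alt, n_samples):
    """Alternate-allele frequency in a diploid cohort.

    Each heterozygote carries one alternate allele, each alternate
    homozygote carries two, and the cohort holds 2 * n_samples alleles.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return (n_het + 2 * n_hom_alt) / (2 * n_samples)


def binomial_two_sided_p(k, n, p):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under reference frequency p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


# Hypothetical genotype counts for one variant in a 79-sample cohort.
n_samples = 79
n_het, n_hom_alt = 20, 5
cohort_af = allele_frequency(n_het, n_hom_alt, n_samples)

# Hypothetical reference (for example, gnomAD-style) allele frequency.
reference_af = 0.10
alt_alleles = n_het + 2 * n_hom_alt
p_value = binomial_two_sided_p(alt_alleles, 2 * n_samples, reference_af)
print(f"cohort AF = {cohort_af:.3f}, p = {p_value:.3g}")
```

In practice, a per-variant test of this kind would be repeated for each iSNV and for each reference population of interest.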

Statistical analysis

Descriptive statistics, i.e., mean, standard deviation (SD), and probability values of the allele frequencies, were computed using the Statistical Package for the Social Sciences (SPSS v21; IBM Corp., New York).

Results

In this study, massively parallel sequencing of the three genes GNPTAB, GNPTG, and NAGPA identified 41 iSNVs in 79 samples (75 males and 4 females; mean age ± SD = 26 ± 6.49 years). Among the index patients, mild stuttering (46%) was most prevalent, followed by moderate (36%) and severe (18%) stuttering; all study participants were of south Indian descent. Allele frequencies of the 41 iSNVs were compared with the South Asian and total allele frequencies recorded in the gnomAD database; the comparison was highly significant, and consistently so, for both the South Asian (p = 0.001) and total (p = 0.001) allele frequencies (Table 1).

Table 1 Allele frequencies of the 41 variants observed in GNPTAB, GNPTG, and NAGPA genes among the 79 unrelated persistent stutterers and their comparison with gnomAD database

Functional annotation of intronic SNVs identified in this study

RegulomeDB was used to identify the potential regulatory/functional iSNVs. Of the 41 iSNVs identified in this study, 38 had RegulomeDB scores of 1–6 and 3 had a score of 7 (Table 1 and Additional file 1: Table S1). Further, 6 iSNVs showed comparatively more evidence of being regulatory elements, with a score of 1: 5 iSNVs (rs11111002, rs4764814, rs4764813, rs1001171, and rs1001170) with a score of 1f and 1 iSNV (rs11110995) with a score of 1d. Expression quantitative trait loci (eQTLs), which explain a fraction of the genetic variance of a gene expression phenotype (Nica and Dermitzakis 2013), were observed among the GNPTAB and NAGPA gene variants. Notably, the lower the RegulomeDB score, the more likely the variant is to lie within a potential functional region (Liao et al. 2016). Detailed information about the regulatory iSNVs and the functional annotation of the other variants observed in the study, viz. likely/less likely to affect binding and minimal binding, is shown in Additional file 1: Table S1.
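The ranking logic described above, in which a lower RegulomeDB score indicates stronger evidence, can be sketched in Python as follows. The rsIDs and scores are those reported in this study; the helper name `regulomedb_sort_key` and the assumption that, within category 1, earlier letters rank ahead of later ones are illustrative choices rather than part of the original analysis.

```python
def regulomedb_sort_key(score):
    """Split a score such as '1d' into (1, 'd') so that scores sort
    first by numeric category and then by sub-category letter."""
    return (int(score[0]), score[1:])


# RegulomeDB scores reported in this study for selected iSNVs.
scores = {
    "rs11111002": "1f",
    "rs4764814": "1f",
    "rs4764813": "1f",
    "rs1001171": "1f",
    "rs1001170": "1f",
    "rs11110995": "1d",
    "rs11830792": "6",
}

# Order variants from strongest to weakest regulatory evidence,
# breaking ties by rsID so the output is deterministic.
ranked = sorted(
    scores, key=lambda rsid: (regulomedb_sort_key(scores[rsid]), rsid)
)

# Variants in category 1 (the strongest evidence class).
top_tier = [rsid for rsid in ranked if regulomedb_sort_key(scores[rsid])[0] == 1]

for rsid in ranked:
    print(rsid, scores[rsid])
```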

Pathogenicity of iSNVs

The pathogenic impact of intronic SNVs was analyzed using RegSNPs-intron, which estimates the impact of an intronic variant on splicing using structural features of the corresponding potential alternatively spliced exons. The analysis identified three iSNVs with a prediction score indicating a potentially deleterious effect (Table 2): GNPTAB: c.3603-1359A>G (rs11110995) in 30/79 (37.9%) cases and c.324-457A>G (rs11830792) in 21/79 (26.5%) cases, and NAGPA: c.543-404T>A (rs1001171) in 47/79 (59.4%) cases. The remaining 38 iSNVs were predicted to be benign (Additional file 1: Table S2).

Table 2 Pathogenicity prediction of the variants using the RegSNPs-intron tool

Discussion

Stuttering is a disorder of speech interruptions or disfluency that is highly heritable and has a strong genetic influence. This study describes the potential regulatory and pathogenic effects of intronic SNVs. In addition to coding exonic variants, noncoding introns play a vital part in gene regulation (Rose 2019). Protein diversity is enhanced by alternative splicing, in which introns play important roles in producing multiple protein variants from a single gene in a eukaryotic cell (Wang et al. 2015; Yang et al. 2021). Conserved sequences in the introns flanking conserved alternative exons regulate alternative splicing (Pan et al. 2008; Vaz-Drago et al. 2017; Yang et al. 2021). In this study, we investigated the intronic variants of the GNPTAB, GNPTG, and NAGPA genes and predicted the pathogenic impact of intronic SNVs using the RegSNPs-intron tool. We identified three possibly pathogenic intronic variants: rs11110995, rs11830792, and rs1001171. Previous studies have reported an intronic variant, g.10985G>A, in the GNPTG gene among Iranian stutterers (Kazemi et al. 2018) and another intronic variant, c.192+618G>A (rs7837758), in the ZMAT4 gene in stutterers of African ancestry (Shaw et al. 2021).

Of the three possibly pathogenic intronic variants detected, rs11110995 (A>G) in the GNPTAB gene had a RegulomeDB score of 1d, marking it as an eQTL that likely affects binding and is linked to the expression of a gene target; pathogenicity estimation indicated a damaging effect, and the variant was detected in 30/79 (37.9%) of the stutterers. The variant rs11830792 (A>G) in the GNPTAB gene had a RegulomeDB score of 6, a category that reflects only whether a given position in the DNA sequence is bound or unbound by a transcription factor; pathogenicity estimation was possibly damaging for this iSNV, which was detected in 21/79 (26.5%) of the stutterers. The variant rs1001171 (T>A) in the NAGPA gene had a RegulomeDB score of 1f, likewise indicating an effect on binding and a link to the expression of a gene target; its pathogenicity was estimated to be possibly damaging, and it was detected in 47/79 (59.4%) of the stutterers. No pathogenic iSNVs were detected in the GNPTG gene. In summary, these databases provided evidence that allowed us to examine the nucleotide variations in terms of conservation, chromatin state, and their effect on regulatory motifs. However, these regulatory variants are associated only with altered gene expression; they are not established risk loci for disease pathogenesis and progression, and they may not be as disruptive as coding-region variants, which may modify the gene products.
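The overlap between the two computational annotations discussed above can be made explicit with simple set operations. The Python sketch below uses only rsIDs reported in this study; the variable names are illustrative and not part of the original analysis.

```python
# iSNVs with a RegulomeDB score in category 1 (1d or 1f) in this study.
regulatory_category_1 = {
    "rs11111002", "rs4764814", "rs4764813",
    "rs1001171", "rs1001170", "rs11110995",
}

# iSNVs predicted to be potentially deleterious by RegSNPs-intron.
predicted_deleterious = {"rs11110995", "rs11830792", "rs1001171"}

# Variants flagged by both tools.
flagged_by_both = regulatory_category_1 & predicted_deleterious

# Variants predicted deleterious but without a category 1 score.
deleterious_only = predicted_deleterious - regulatory_category_1

print("Both tools:", sorted(flagged_by_both))
print("RegSNPs-intron only:", sorted(deleterious_only))
```

Under these inputs, rs11110995 and rs1001171 are flagged by both approaches, while rs11830792 (RegulomeDB score 6) is flagged only by the pathogenicity prediction.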

Conclusions

This study used the RegSNPs-intron tool to identify three intronic variants with a predicted pathogenic impact (rs11110995, rs11830792, and rs1001171) in stuttering patients, in genes previously associated with the disorder. The regulatory function of the intronic variants was assessed using the RegulomeDB database, which documented a few potential regulatory variants and susceptible loci. Thus, the combination of the two computational approaches may help to characterize regulatory regions and to derive valid hypotheses about their function. The limitations of this study include the relatively small sample size and the recruitment of patients from a single center, which may limit generalizability. Future work in a larger cohort of stutterers, together with a cohort of fluent controls, is therefore warranted to confirm the current findings and to better understand the role of the intronic variants.