Identification of genetic variation that determines human trehalase activity and its association with type 2 diabetes
- Muller, Y.L., Hanson, R.L., Knowler, W.C. et al. Hum Genet (2013) 132: 697. doi:10.1007/s00439-013-1278-3
A prior linkage scan in Pima Indians identified a putative locus for type two diabetes (T2D) and body mass index (BMI) on chromosome 11q23-25. Association mapping across this region identified single nucleotide polymorphisms (SNPs) in the trehalase gene (TREH) that were associated with T2D. To assess the putative connection between trehalase activity and T2D, we performed a linkage study for trehalase activity in 570 Pima Indians who had measures of trehalase activity. Strong evidence of linkage of plasma trehalase activity (LOD = 7.0) was observed in the TREH locus. Four tag SNPs in TREH were genotyped in these subjects and plasma trehalase activity was highly associated with three SNPs: rs2276064, rs117619140 and rs558907 (p = 2.2 × 10−11–1.4 × 10−23), and the fourth SNP, rs10790256, was associated conditionally on these three (p = 2.9 × 10−7). Together, the four tag SNPs explained 51 % of the variance in plasma trehalase activity and 79 % of the variance attributed to the linked locus. These four tag SNPs were further genotyped in 828 subjects used for association mapping of T2D, and rs558907 was associated with T2D (odds ratio (OR) 1.94, p = 0.002). To assess replication of the T2D association, all four tag SNPs were additionally genotyped in two non-overlapping samples of Native Americans. Rs558907 was reproducibly associated with T2D in 2,942 full-heritage Pima Indians (OR 1.27 p = 0.03) and 3,897 “mixed” heritage Native Americans (OR 1.21, p = 0.03), and the strongest evidence for association came from combining all samples (OR 1.27 p = 1.6 × 10−4, n = 7,667). However, among 320 longitudinally studied subjects, measures of trehalase activity from a non-diabetic exam did not predict those who would eventually develop diabetes versus those who would remain non-diabetic (hazard ratio 0.94 per SD of trehalase activity, p = 0.29). 
We conclude that variants in TREH control trehalase activity, and although one of these variants is also reproducibly associated with T2D, it is likely that the effect of the SNP on risk of T2D occurs by a mechanism different than affecting trehalase activity. Alternatively, TREH variants may be tagging a nearby T2D locus.
A prior genomic linkage scan in Pima Indians identified an obesity susceptibility locus on chromosome 11q23-25 (LOD = 3.6). There was also suggestive evidence that the same genomic region contains a susceptibility locus for type two diabetes (T2D, LOD = 1.7) (Hanson et al. 1998). This region of linkage to obesity and/or T2D was replicated in several other studies including Caucasians who are part of the Framingham study (Atwood et al. 2002), part of the Breda Study Cohort (van Tilburg et al. 2003), Utah Caucasians (Elbein et al. 1999) and Mexican-Americans who are part of the GENNID study (Duggirala et al. 2003). However, no variant in this region on chromosome 11 was reported to be among the top associations in large-scale meta-analyses of genome-wide association data for T2D or obesity in Caucasians (Zeggini et al. 2008; Morris et al. 2012; Willer et al. 2009).
The region of linkage in Pima Indians spans approximately 23 Mb, from approximately 112–134 Mb and contains ~345 genes. This segment of chromosome 11 contains the dopamine receptor D2 gene, which is a good physiologic candidate gene for obesity; however, variations in this gene do not account for the linkage signal for BMI in Pima Indians (Jenkinson et al. 2000). In subsequent association mapping studies, several SNPs with nominally significant associations with T2D mapped to or near TREH, UBASH3B, KIRREL3 and SNX19. Some of the most strongly associated variants are in or near the potential candidate gene TREH (Hanson et al. 2006), which codes for trehalase. In British and Nigerian studies, plasma trehalase activity was higher in diabetic subjects than in non-diabetic subjects (Eze 1989). Higher serum trehalase activity has also been observed in diabetic subjects with glycosuria compared to diabetic subjects without glycosuria (Isichei and Gorecki 1993). Trehalase activity is also elevated in both alloxan-induced and genetic (Ob/Ob, Db/Db) diabetic mice (Baumann et al. 1981; Ramaswamy and Flint 1980). However, the causal direction underlying this association is not known.
Our association mapping for the T2D locus on chromosome 11q23-25 led us to perform detailed analysis of the TREH gene structure and its enzymatic activity, reported herein. Trehalase splits the disaccharide α, α-trehalose into two molecules of glucose. This enzyme is found in bacteria, insects and mammals. In mammals, it is restricted to the small intestine, kidney, liver and bile (Kalf and Reider 1958; Kenny and Maroux 1982). In humans, the physiological function of trehalase is to digest dietary trehalose in the small intestine, but its exact role in carbohydrate metabolism is not clear.
One of the notable properties of substrate trehalose is its ability to protect cellular integrity during desiccation. Desert plants, bacteria, yeast, mushrooms and insects all manufacture trehalose as a defense mechanism against dehydration (Wingler 2002; Elbein et al. 2003). Trehalose can also protect against oxidative stress (Echigo et al. 2012). In humans, the main dietary sources of naturally occurring trehalose are mushrooms, baker and brewer’s yeast and certain kinds of shrimp. However, trehalose’s unique properties of protecting the integrity of a cell during desiccation have made it an important additive in the food and drug industries. It is also commonly used as a sweetener in bakery goods, beverages, confectionery, fruit jam, breakfast cereals, rice, and noodles (Richards et al. 2002).
Some individuals are unable to absorb trehalose, and the hereditary trehalose malabsorption is correlated with trehalase deficiency. In Greenland, the prevalence of trehalase deficiency has been reported to be 8 %, which is considerably higher than that seen elsewhere (Gudmand-Høyer et al. 1988). Recently, this sugar has been widely touted as an alternative to sucrose in diets proposed to prevent diabetes, due to reduced glycaemic and insulinaemic responses following trehalose ingestion (van Can et al. 2009). On account of this, and on account of the linkage studies described above, we sought to explore the relationship between trehalase activity and T2D and also determine the genetic basis for trehalase activity.
All subjects in this study are participants of a longitudinal study of the etiology of T2D among the Gila River Indian Community in Arizona, where most of the residents are Pima Indians or Tohono O’odham (a closely related tribe; Knowler et al. 1978). Diabetic status was determined by an oral glucose tolerance test according to the criteria of the American Diabetes Association (The Expert Committee on the Diagnosis and Classification of Diabetes Mellitus 1997). The original observation of genetic linkage on chromosome 11q23-25 was determined in 966 individuals from 264 nuclear families who were selected for having family members with T2D (Hanson et al. 1998). Among these 966 individuals, 828 had complete data on covariates for association analysis with T2D, and, thus, were used for association mapping of the suggestive T2D linkage signal. Characteristics of these 828 individuals were: 648 (78 %) had T2D; 320 (39 %) were men; mean (±SD) BMI = 38.4 ± 8.5 kg/m2; and mean age at last exam = 48.3 ± 13.0 years. Replication of SNP associations with T2D was assessed in two additional, non-overlapping samples from our longitudinal study on the Gila River Indian Community. The first replication sample included all additional individuals who were full-heritage Pima Indian (n = 2,942): 1,149 (39 %) had T2D; 1,294 (44 %) were men; mean BMI = 37.1 ± 8.6 kg/m2; and mean age = 37.1 ± 16.6 years. The second replication sample included all additional individuals who were not full-heritage Pima Indian (n = 3,897). Heritage in this “mixed heritage” group was, on average, ½ Pima and ¾ American Indian, which may include other tribes. Among these individuals, 727 (19 %) had T2D; 1,791 (46 %) were men; mean BMI = 34.6 ± 8.8 kg/m2; and mean age at the last exam = 27.6 ± 13.7 years.
Sequencing of the trehalase gene and genotyping of SNPs
Sequencing of the TREH coding regions and putative promoter region was initially done in 39 Pima subjects using Big Dye terminator (Applied Biosystems) on an automated DNA capillary sequencer (model 3730; Applied Biosystems). These 39 subjects (68 % had T2D) were selected to maximize identification of genetic variation. All 39 subjects were from different nuclear Pima Indian families, and were selected as having the most diverse combinations of microsatellite markers near the TREH locus in our linkage study. More recently, complete exome sequencing was performed in 177 additional full-heritage Pima Indians (28 % had T2D), each from a different nuclear family, using next-generation sequencing technology (ShanghaiBio Corp., North Brunswick, NJ). In addition, whole genome sequence data are currently being generated (Complete Genomics Inc, Mountain View, CA); to date, genomic data on 30 Pima Indians (17 % had T2D) are available for analyses.
SNPs identified by sequencing and tag SNPs (in intron and flanking regions) selected from databases were genotyped for association analyses using the TaqMan Allelic Discrimination (AD) Assay (Applied Biosystems) on an ABI Prism 7700 (Applied Biosystems) or SNPlex genotyping System 48-plex (Applied Biosystems) on an automated DNA capillary sequencer (model 3730; Applied BioSystems).
Measurement of plasma trehalase activity
Plasma samples for measurement of trehalase activity levels were available on 570 subjects who were part of the original linkage study. These samples were drawn during an exam where the subject was determined to be non-diabetic. Among these 570 subjects, 320 had follow-up information from subsequent exams, and 214 of the 320 subjects subsequently developed diabetes. Plasma trehalase activity was measured using the method adapted from Eze (1989). In brief, each plasma sample was incubated with or without substrate trehalose, and glucose concentration was measured by a glucose oxidase method (Beckman Instruments). The liberated glucose (i.e., the difference) was taken as trehalase enzyme activity. Each sample was measured in duplicate and the mean of the two measurements was used in statistical analyses.
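The activity calculation described above can be sketched in a few lines. This is a minimal illustration, not the assay protocol itself: the function names and the example readings are hypothetical, the units are left arbitrary because the text does not state them, and the sketch assumes each duplicate consists of one reading with substrate and one without.

```python
from statistics import mean

def liberated_glucose(glucose_with_substrate, glucose_without_substrate):
    """Glucose released by trehalase in one replicate: the reading after
    incubation with trehalose minus the reading without trehalose."""
    return glucose_with_substrate - glucose_without_substrate

def sample_activity(replicates):
    """Mean liberated glucose over replicate (with, without) reading pairs."""
    return mean(liberated_glucose(w, wo) for w, wo in replicates)

# Hypothetical duplicate readings in arbitrary units, not data from the study.
duplicates = [(112.0, 95.0), (110.0, 95.0)]
activity = sample_activity(duplicates)
```

Averaging the duplicate differences, rather than the raw readings, keeps each replicate paired with its own no-substrate blank.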
Statistical analyses were performed using the software of the SAS Institute (Cary, NC). Linkage analysis of plasma trehalase activity, as a quantitative trait, was conducted for sibships by means of variance-components methods (Amos et al. 1996). The GENEHUNTER program (Pratt et al. 2000) was used to derive multipoint estimates of the proportion of alleles identical by descent at each chromosomal location for these analyses. Linkage analysis of diabetes, accounting for the age-specific occurrence of the disease, was accomplished with a cumulative incidence “residual” method. It uses age and affection status to produce an “age-adjusted” diabetes score that can be analyzed as a quantitative trait (Hanson and Knowler 1998). Single trait analysis suggested that both trehalase activity and diabetes were linked to the same region; therefore, bivariate linkage analysis was conducted by covariance-components models (Lange and Boehnke 1983; Almasy et al. 1997) to assess the extent to which the presumed gene affects both traits. Detailed linkage analysis methods have been described previously (Hanson et al. 1998).
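For readers less familiar with LOD scores, the sketch below converts a LOD score into its likelihood-ratio chi-square equivalent and an approximate pointwise p value. The p value step assumes the usual variance-components result that the test statistic follows a 50:50 mixture of a point mass at zero and a chi-square distribution with one degree of freedom; that assumption, and the function names, are illustrative and are not taken from the analysis described above.

```python
import math

def lod_to_chisq(lod):
    """Likelihood-ratio chi-square equivalent of a LOD score: 2 * ln(10) * LOD."""
    return 2.0 * math.log(10.0) * lod

def chisq1_upper_tail(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def lod_pointwise_p(lod):
    """Approximate pointwise p value for a LOD score.

    ASSUMPTION: the statistic is a 50:50 mixture of a point mass at zero
    and chi-square(1), as is commonly used for variance-components linkage.
    """
    return 0.5 * chisq1_upper_tail(lod_to_chisq(lod))

# LOD scores quoted in the text: T2D (1.7), obesity (3.6), trehalase activity (7.0).
p_values = {lod: lod_pointwise_p(lod) for lod in (1.7, 3.6, 7.0)}
```

These are pointwise values only; they make no correction for testing many positions across the genome.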
The general association of genotypes with T2D was assessed with logistic regression analysis and was adjusted for covariates (age, sex, birth year and heritage). The model was fit with a generalized estimating equation (GEE) technique to account for correlation among siblings. Genotype was analyzed as a numeric variable representing the number (0, 1, 2) of copies of a given allele (i.e., an “additive” model). Estimates of the proportion of European ancestry were derived by the method of Hanis et al. (1986) from 45 informative markers with large differences in allele frequency between populations (Tian et al. 2007) for use as a covariate in these analyses. The association of trehalase activity with genotype was assessed by a linear mixed model that incorporated a random effect to account for the correlation among siblings in addition to fixed effects for genotype, age and sex. p values were calculated by the likelihood ratio test. Linkage disequilibrium (LD), haplotype blocks and haplotype frequencies were analyzed by Haploview (version 4.2) (Gabriel et al. 2002).
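The additive coding can be illustrated with a short sketch. It only shows how genotypes become allele counts and what a per-allele odds ratio implies for carriers of two copies; it does not fit the logistic model or the GEE correction. The genotype strings and function names are hypothetical, and 1.27 is the combined-sample odds ratio reported for rs558907.

```python
import math

def allele_count(genotype, allele):
    """Number of copies (0, 1 or 2) of `allele` in a two-letter genotype such as "AG"."""
    return genotype.count(allele)

def additive_odds_ratio(per_allele_or, copies):
    """Odds ratio implied for `copies` alleles when the model is additive on
    the log-odds scale: each copy multiplies the odds by the per-allele OR."""
    return math.exp(copies * math.log(per_allele_or))

# Hypothetical genotypes, coded as the number of "G" alleles.
genotypes = ["AA", "AG", "GG"]
counts = [allele_count(g, "G") for g in genotypes]

# Implied odds ratio for two copies, given a per-allele OR of 1.27.
or_two_copies = additive_odds_ratio(1.27, 2)
```

Under this coding a single regression coefficient describes the genotype effect, so the odds ratio for two copies is the square of the per-allele odds ratio.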
The extent to which associations could explain the linkage signal was assessed by a model that tests the amount by which an associated polymorphism reduces the variance attributed to the linked locus (Hanson and Knowler 2008). This method fits a bivariate linkage model to the original trait and to the residual adjusted for genotype. To assess the potential independent contribution of multiple associated SNPs, conditional analyses were conducted in which a SNP of interest was added to a model containing one or more additional SNPs. These analyses were conducted in a “step-wise” fashion to identify a set of SNPs that were associated with the trait of interest. Haplotypes were analyzed by a modification of the zero recombinant haplotyping method as previously described (Vozarova de Courten et al. 2005). Briefly, the MLINK program is used to calculate the probability that each individual carries one or two copies of each haplotype, given their genotypes and the genotypes of their family members. These probabilities are then used to conduct the analysis for each haplotype in a fashion analogous to that for single SNP.
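One way to read "analogous to that for single SNP" is that the haplotype probabilities are collapsed into an expected number of copies, which then plays the role of the 0, 1, 2 allele count. The sketch below shows that conversion. It is an assumption made for illustration: the text does not state exactly how the MLINK probabilities enter the model, and the function name and example probabilities are hypothetical.

```python
def expected_haplotype_copies(p_one_copy, p_two_copies):
    """Expected number of copies of a haplotype carried by one individual,
    given the probabilities of carrying exactly one and exactly two copies.

    ASSUMPTION: the expected count stands in for the 0/1/2 allele count
    used in the single-SNP additive model.
    """
    if min(p_one_copy, p_two_copies) < 0 or p_one_copy + p_two_copies > 1:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    return p_one_copy + 2.0 * p_two_copies

# Hypothetical individual: 50 % chance of one copy, 25 % chance of two copies.
dosage = expected_haplotype_copies(0.5, 0.25)
```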
Association mapping of the region of suggestive linkage to T2D
Linkage and association mapping of TREH activity
Sequencing of TREH and selection of tag SNPs
Association of tag SNPs with trehalase activity
Associations of 4 tag SNPs in the TREH gene with trehalase activity in 570 non-diabetic Pima Indians along with effect on the linkage analysis
Trehalase enzyme activity (Mean ± SE)
Percent variance explained
10.83 ± 0.50 (178); 20.50 ± 0.93 (264); 29.30 ± 3.36 (60); 3.48 × 10−14; 8.99 × 10−5
14.68 ± 0.55 (427); 35.14 ± 2.14 (72); 80.27 ± 23.02 (4); 1.41 × 10−23
17.43 ± 1.00 (255); 18.82 ± 1.13 (217); 19.06 ± 1.82 (40)
20.43 ± 0.84 (362); 12.49 ± 0.85 (141); 4.16 ± 0.77 (10); 2.16 × 10−11
Association of tag SNPs with T2D
Associations of 4 tag SNPs in the TREH gene with T2D in Pima Indians
Participants in linkage study: All linkage study (n = 828), OR (95 % CI)
Participants not included in linkage study: Full-heritage Pima (n = 2,942), OR (95 % CI); “Mixed” heritage Pima (n = 3,897), OR (95 % CI)
All combined (n = 7,667), OR (95 % CI)
Rs117619140 intron 11
1.6 × 10−4
Association of haplotypes with trehalase activity and T2D
TREH activity and T2D prediction
Although all individuals were non-diabetic at the examination for which trehalase activity was measured, 320 had been subsequently examined in the longitudinal study and 214 had developed diabetes. Among these individuals, trehalase activity did not significantly predict development of diabetes (hazard ratio 0.93 per SD increase in trehalase activity, 95 % CI 0.81–1.07, p = 0.29). In an additional small nested case–control study comparing 48 individuals who initially had normal glucose tolerance, but who subsequently developed diabetes, with matched controls who remained free of diabetes in follow-up (Lindsay et al. 2002; Krakoff et al. 2003), baseline trehalase activity also did not predict diabetes (hazard ratio 1.18, 95 % CI 0.75–1.85, p = 0.48).
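The reported hazard ratios, confidence intervals and p values can be cross-checked against one another. The sketch below recovers an approximate standard error from a 95 % confidence interval and derives a two-sided Wald p value. It assumes the intervals are symmetric on the log scale and were built with a critical value of 1.96; the function names are hypothetical. Because the published figures are rounded, the results should land near, not exactly on, the reported p values.

```python
import math

def log_se_from_ci(lower, upper, z_crit=1.96):
    """Standard error of a log ratio, recovered from its confidence limits.
    ASSUMPTION: the interval is symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z_crit)

def wald_p_from_ci(estimate, lower, upper):
    """Two-sided Wald p value for a ratio estimate and its 95 % CI."""
    z = math.log(estimate) / log_se_from_ci(lower, upper)
    return math.erfc(abs(z) / math.sqrt(2.0))

# Figures quoted in the text for the two analyses.
p_longitudinal = wald_p_from_ci(0.93, 0.81, 1.07)   # reported p = 0.29
p_case_control = wald_p_from_ci(1.18, 0.75, 1.85)   # reported p = 0.48
```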
Plasma trehalase activity is strongly linked (LOD = 7.0) to a region near the TREH locus on chromosome 11q23 in 570 non-diabetic Pima Indians. Genetic variants in or near the TREH locus are strongly associated with trehalase activity and account for a large portion of the linkage signal. One of these variants, rs558907, is also reproducibly associated with T2D in three non-overlapping samples of Pima Indians.
That genetic polymorphisms influence plasma trehalase activity was suggested decades ago by Eze (1989). In that report, a bimodal distribution of activity was observed in 30 normal Nigerian subjects, suggesting a potential underlying genetic influence. We identified four tag SNPs that are associated with the overall trehalase activity in plasma of Pima Indians. These associations are strong in that most achieve the statistical threshold typically required for genome-wide significance (p < 7.2 × 10−8) (Dudbridge and Gusnanto 2008). Together, these SNPs explain 51 % of the variance in trehalase activity and 79 % of the variance attributed to the linked locus, thus accounting in large part for the observed linkage signal. This suggests that these variants are in linkage disequilibrium with functional variants that influence trehalase activity. It is noteworthy that in the initial analyses of the effect of TREH variants on trehalase activity, some of the SNPs had a negligible effect on the linkage signal, despite strong evidence for association (e.g., rs558907). However, the association with rs558907 produced a substantial and significant effect on the linkage signal when adjusted for the effect of rs2276064. This may occur because none of the SNPs individually accounts for the entire linkage signal and, in this situation, adjustment for a variant that partially accounts for the signal may actually enhance power to detect residual linkage. More broadly, this suggests that additional susceptibility variants should be sought in regions implicated by linkage analysis for complex traits and in which strong associations have been observed that do little to explain the linkage signal.
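The statement that most, but not all, of the associations reach genome-wide significance can be checked directly against the threshold. The sketch below uses the p values quoted in the abstract: the range 2.2 × 10−11 to 1.4 × 10−23 for the three marginally associated SNPs, 2.9 × 10−7 for the conditional association of rs10790256, and 1.6 × 10−4 for the combined T2D association of rs558907. The dictionary keys are descriptive labels chosen for this illustration.

```python
# Threshold quoted in the text (Dudbridge and Gusnanto 2008).
GENOME_WIDE_THRESHOLD = 7.2e-8

# p values quoted in the abstract; the keys are illustrative labels only.
reported_p = {
    "activity, weakest of three marginal SNPs": 2.2e-11,
    "activity, strongest of three marginal SNPs": 1.4e-23,
    "activity, rs10790256 conditional": 2.9e-7,
    "T2D, rs558907 combined samples": 1.6e-4,
}

def is_genome_wide_significant(p, threshold=GENOME_WIDE_THRESHOLD):
    """True when a p value is below the genome-wide significance threshold."""
    return p < threshold

significant = {label: is_genome_wide_significant(p) for label, p in reported_p.items()}
```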
Previous studies have linked plasma/serum trehalase activity to T2D in humans (Eze 1989; Isichei and Gorecki 1993). At present, the mechanism of the association between trehalase activity and T2D is unknown. Plasma trehalase activity was higher in subjects with than without T2D in both Nigerian and British populations (Eze 1989). Higher serum trehalase activity was also observed in diabetic subjects with glycosuria compared to diabetic subjects without glycosuria (Isichei and Gorecki 1993). In mice, trehalase activity was elevated in both alloxan-induced and genetic (Ob/Ob, Db/Db) diabetic mice (Baumann et al. 1981; Ramaswamy and Flint 1980). However, it is not clear whether elevated trehalase activity observed in these studies is the effect of T2D, or if people with higher trehalase activity, perhaps due to linked genes, are more prone to develop T2D. Moreover, no significant correlation was observed between plasma/serum trehalase activity and blood glucose levels in these studies or in the present study (data not shown). Among 241 non-diabetic, full-heritage Pima Indians who had undergone detailed metabolic testing and also had serum collected, trehalase activity was associated with basal carbohydrate oxidation (p = 0.03, adjusted for age, sex and percentage body fat) and carbohydrate oxidation under low- and high-dose insulin stimulations (adjusted p = 0.007 and 0.009, respectively). Higher trehalase activity was associated with lower carbohydrate oxidation. However, it remains to be determined if trehalase is involved in regulation of blood glucose level and/or glucose oxidation, if trehalase activity changes secondary to these processes or if both are influenced by correlated factors. A recent study by Boyd et al. (2009) has indicated that the TREH gene expression was directly regulated by the transcriptional factor HNF4α via binding of the TREH promoter in human intestinal epithelial cells. 
HNF4α has been extensively studied in hepatocytes and pancreatic β-cells for its role in development of MODY (maturity onset diabetes of the young) and T2D. However, in the present study we did not find that plasma trehalase activity measured at a non-diabetic stage directly predicts the development of T2D in Pima Indians, although our sample size for this analysis (n = 320) may have low statistical power to detect a modest predictive effect of trehalase activity on T2D as indicated by the confidence interval of the hazard ratio (95 % CI 0.81–1.07).
In the present study, a tag SNP in the TREH gene, rs558907, was associated with T2D in Pima Indians. The association with T2D was observed initially in individuals who participated in a linkage study for T2D, and consistent results were observed in other members of the population, both those of full Pima heritage and those of mixed heritage. The level of statistical significance (p = 1.6 × 10−4), however, does not achieve that typically required for genome-wide significance (Dudbridge and Gusnanto 2008), so it remains possible that the diabetes association is spurious. However, it should be noted that in all of the genetic studies that have been conducted in Pima Indians to date, which include genome-wide association studies using both the 100 K and 1 million Affymetrix SNP chips (Malhotra et al. 2011; Hanson et al. 2007 and unpublished), the association in TREH remains among the top 15 strongest associations with T2D for variants genotyped in or near ~1,800 genes. It is, however, likely that the functional variant for the T2D association may not be rs558907 or, given the extent of linkage disequilibrium in the region, in TREH itself. The flanking region of the TREH gene is in a large haplotype block and contains several additional known and predicted genes including PHLDB1 and DDX6 (Supplemental Figure 1B). Because of the high degree of LD between variants in the TREH locus and nearby loci, localization of any diabetes association may be difficult.
Ultimately, replication in other populations with lower LD across this region may be required to determine whether the association between TREH variants and T2D is spurious or real. At present, there are few data, except from Caucasian populations. For 13 TREH SNPs (rs7928371, rs745663, rs10790256, rs607527, rs692750, rs673770, rs642530, rs525485, rs11216943, rs561845, rs582630, rs519982 and rs644498) directly genotyped or imputed in the large Caucasian DIAGRAM study, meta-analysis has shown no association between these SNPs and T2D (Zeggini et al. 2008). In Caucasians, these 13 SNPs were captured by two tag SNPs, rs10790256 and rs558907. Although rs558907 was not genotyped in the DIAGRAM study, a proxy, rs692750 (r2 = 1 with rs558907), was genotyped in the DIAGRAM study and was not associated with T2D in Caucasians (p = 0.79). Further replication studies in other, particularly non-Caucasian, populations are required to determine the relative contribution of this gene to T2D susceptibility.
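The r2 value used to describe proxy SNPs is the squared correlation between alleles at two loci. The sketch below computes it from haplotype and allele frequencies using the standard definition. The frequencies are hypothetical and chosen only to show the two extremes: complete LD, where one SNP is a perfect proxy for the other, and no LD.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared allelic correlation between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denominator = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denominator == 0:
        raise ValueError("r2 is undefined when either locus is monomorphic")
    return d * d / denominator

# Hypothetical frequencies: the two alleles always occur together (complete LD).
r2_complete = ld_r2(p_ab=0.3, p_a=0.3, p_b=0.3)

# Hypothetical frequencies: the alleles occur together only by chance (no LD).
r2_none = ld_r2(p_ab=0.3 * 0.4, p_a=0.3, p_b=0.4)
```

When r2 = 1, as reported for rs692750 and rs558907 in Caucasians, genotypes at one SNP fully determine genotypes at the other, which is why the proxy result can stand in for the untyped SNP.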
Our results indicate that SNPs in TREH strongly influence trehalase activity. However, the exact functional variant(s) which contribute to the trehalase activity have not been identified. The programs “SIFT” and “PolyPhen”, which bioinformatically predict the effects of coding variants, both predict a “tolerated” effect of Arg486Trp (rs2276064) on protein function. This does not necessarily preclude an important effect of this variant on plasma trehalase activity, but, alternatively, intronic or promoter variants could be involved in the regulation of TREH expression. Variant analysis by Ingenuity Pathway Analysis predicts the loss of promoter function of rs642530, rs673770 and rs36077162 which are in complete LD with rs558907 (r2 = 1). Therefore, future functional studies are required to further define the causative variant(s) in this gene. Although our association analyses showed that SNPs in and/or near the TREH gene were associated with T2D, it remains unclear whether the diabetes causative variant(s) are within the TREH gene or a nearby gene. Furthermore, since there was little correspondence between the effect of TREH haplotypes on trehalase activity and diabetes risk, and little evidence that trehalase activity itself predicts T2D, it is likely that the associations between TREH variants and T2D are mediated by a different mechanism than the effects on trehalase activity.
We thank the participants from the Gila River Indian Community for their cooperation and Jill Loebel and Glenn Nishimoto for technical assistance. This research was supported by the Intramural Research Program of the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK).
Open Access: This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.