Multifactor-dimensionality reduction shows a two-locus interaction associated with Type 2 diabetes mellitus
- First Online:
- 823 Downloads
Type 2 diabetes mellitus is a complex genetic disease, which results from interactions between multiple genes and environmental factors without any single factor having strong independent effects. This study was done to identify gene to gene interactions which could be associated with the risk of Type 2 diabetes.
We genotyped 23 different loci in the 15 candidate genes of Type 2 diabetes in 504 unrelated Type 2 diabetic patients and 133 non-diabetic control subjects. We analysed gene to gene interactions among 23 polymorphic loci using the multifactor-dimensionality reduction (MDR) method, which has been shown to be effective for detecting and characterising gene to gene interactions in case-control studies with relatively small samples.
The MDR analysis showed a significant gene to gene interaction between the Ala55Val polymorphism in the uncoupling protein 2 gene (UCP2) and the 161C>T polymorphism in the exon 6 of peroxisome proliferator-activated receptor γ (PPARγ) gene. This interaction showed the maximum consistency and minimum prediction error among all gene to gene interaction models evaluated. Moreover, the combination of the UCP2 55 Ala/Val heterozygote and the PPARγ 161 C/C homozygote was associated with a reduced risk of Type 2 diabetes (odds ratio: 0.51, 95% CI: 0.34 to 0.77, p=0.0016).
Using the MDR method, we showed a two-locus interaction between the UCP2 and PPARγ genes among 23 loci in the candidate genes of Type 2 diabetes. The determination of such genotype combinations contributing to Type 2 diabetes mellitus could provide a new tool for identifying high-risk individuals.
KeywordsMultifactor-dimensionality reduction Type 2 diabetes mellitus gene to gene interaction uncoupling protein 2 peroxisome proliferator-activated receptor γ
single nucleotide polymorphism
peroxisome proliferator-activated receptor γ
fatty acid binding protein
insulin receptor substrate 1
Type 2 diabetes mellitus is a common complex genetic disease. Its high incidence in certain populations and among first-degree relatives of Type 2 diabetic patients, as well as the high concordance in identical twins, provides strong evidence that genetic factors underlie susceptibility to the disease . Monogenic forms account for only five per cent or less of all cases of Type 2 diabetes , while the common form of the disease results from interactions between multiple genes and environmental factors, without any single factor having strong independent effects . Thus, searching for susceptibility genes for the common form of Type 2 diabetes requires a thorough understanding of gene to gene and gene to environment interactions.
Traditionally, gene to gene interactions in complex diseases have been examined by logistic regression, multilocus linkage disequilibrium tests and the Hardy-Weinberg equilibrium test, all of which have limitations in their general application . Thus, the identification and characterisation of gene to gene interactions has been limited mainly by a lack of powerful statistical methods and a lack of large sample size . To overcome these limitations, the multifactor-dimensionality reduction (MDR) method was developed. It is used for detecting and characterising high-order gene to gene interactions in case-control and discordant-sib-pair studies with relatively small samples . The MDR defines a single variable that incorporates information from several loci and/or incorporates environmental factors that can be divided into high-risk and low-risk combinations. This new variable can be evaluated for its ability to classify and predict disease risk status using cross-validation (CV) and permutation testing. It can also be used in conventional analytic methods . The MDR method has been shown to have good power in relatively small case-control studies [6, 7].
In this study, we genotyped 23 different loci in the 15 candidate genes of Type 2 diabetes in unrelated Type 2 diabetic patients and non-diabetic control subjects, and examined gene to gene interactions with the MDR method.
Subjects and methods
We studied 504 unrelated patients with Type 2 diabetes mellitus (age: 59±10 years, 241 men, 263 women) and 133 non-diabetic control subjects (age: 65±3 years, 40 men, 93 women). Type 2 diabetes was diagnosed according to World Health Organization criteria . Selection of the non-diabetic control subjects was according to the following criteria: 60 years or older, no past history of diabetes, no diabetes in first-degree relatives, a fasting plasma glucose concentration of less than 6.1 mmol/l, and an HbA1c value of less than 5.8%. The Institutional Review Board of Clinical Research Institute in Seoul National University Hospital, Korea approved the study protocol and informed consent for genetic analysis was obtained from each subject. This study was carried out in accordance with the Declaration of Helsinki as revised in 2000 (http://www.wma.net/e/policy/17cnote.pdf).
Selection of polymorphisms
Using public databases, e.g. PubMed and Online Mendelian Inheritance in Man (http://www.ncbi.nlm.nih.gov/Omim/), we selected 15 candidate genes that have been characterised and potentially associated with Type 2 diabetes. Among these candidate genes, calpain 10 and adiponectin genes (located in 2q37.3 and 3q27 respectively) are positional candidates evidenced by genome-wide linkage analysis [9, 10]. In addition, peroxisome proliferator-activated receptor γ (PPARγ), fatty acid binding protein (FABP) 2, β3-adrenergic receptor (ADRB3) and insulin receptor substrate 1 (IRS1) have been linked to the phenotypes of Type 2 diabetes, e.g. obesity or insulin resistance [11, 12, 13, 14]. GenBank accession numbers for genes studied in this article are as follows: uncoupling protein (UCP) 2 (NM_003355), adiponectin (NM_004797), PPARγ (NM_015869), FABP2 (NM_000134), PPARγ coactivator 1 (NM_013261), FABP3 (NM_004102), uncoupling protein 3 (NM_003356), calpain 10 (NM_021251), resistin (NM_020415), mitochondrial NADH dehydrogenase subunit 2 (J01415), mitochondrial ATPase 8 gene (J01415), mitochondrial D-loop (J01415), ADRB3 (NM_000025), IRS1 (NM_005544), and glycogen synthase 1 (NM_002103).
Allele frequency of each polymorphism
ins/del in 7848T
ins/del in −530 CTC
Genotyping of polymorphisms
Genomic DNA was isolated with a commercial kit (Gentra Systems, Minneapolis, Minn., USA). The amplifying and extension primers were designed for single base extension for SNPs and ins/del. Primer extension reactions were done with SNaPshot ddNTP Primer Extension Kit (Applied Biosystems). To clean up the primer extension reaction, one unit of shrimp alkaline phosphatase was added to the reaction mixture, which was then incubated for 1 h at 37°C, followed by 15 min at 72°C. The DNA samples containing extension products and GeneScan 120 Liz size standard solution were added to Hi-Di Formamide (Applied Biosystems) according to the manufacturer’s instructions. The mixture was incubated for 5 min at 95°C followed by 5 min on ice, and then was electrophoresed by ABI Prism 3100 Genetic Analyzer. Automatic genotype assignment was done by ABI Prism GeneScan and Genotyper software (Applied Biosystems).
Multifactor-dimensionality reduction analysis
MDR was done as described previously [4, 5, 6, 7]. Briefly, the data set is divided into ten parts of equal size for 10-fold CV. A training set composed of 9/10 of the data used to build the MDR model is selected. The remaining 1/10 of the data is used to compose the testing set, which is used to determine the generalisability of the MDR model. Next, a set of n genetic and/or environmental factors is selected. The n factors are represented in n-dimensional space. During the model selection process, each multifactor cell class in n-dimensional space is labelled as high-risk if the ratio of cases to controls meets or exceeds the threshold of 1.0 and as low-risk if the threshold is not exceeded. We have observed in simulation studies that a threshold of 1.0 is optimal at this stage of MDR modelling for unbalanced data sets (unpublished results by the authors).
When the final best model is selected, a model for high-risk and low-risk genotype combinations is formed using an adjusted threshold that is equal to the ratio of cases and controls in the dataset. Among all of the two-factor combinations, a single model that maximises the cases to controls ratio is selected. This two-locus model has the minimum classification error among the two-locus models. To evaluate the predictive ability of the model, the model is evaluated on the basis of the 1/10 of the data left out for testing. This procedure is done ten times, each time using a different 1/10 of the data for testing. In our study we also performed this entire ten-fold CV procedure ten times, each time using different random number seeds to reduce the chance of observing spurious results due to chance divisions of the data.
This MDR procedure can be carried out for each possible model size (i.e. each number of loci and/or environmental factors), if computationally feasible. Due to computation restrictions, we set out to detect all two-locus interactions through five-locus interactions. The result is a set of models, one for each model size considered. From this set, the model with the combination of loci and/or discrete environmental factors that maximises the CV consistency and minimises the prediction error is selected. Cross-validation consistency is a measure of the number of times a particular set of loci and/or factors are identified in each possible 9/10 of the subjects [5, 6, 7]. The proportion of subjects for whom an incorrect prediction was made is the prediction error. When CV consistency is maximal for one model and prediction error is minimal for another, statistical parsimony is used to choose the best model. Thus when the CV metric and the prediction error metric support different models, the model with the fewest loci/factors is selected. Hypothesis testing of this final best model can then be done by evaluating the magnitude of the prediction error. We determined statistical significance by comparing the average prediction error from the observed data with the distribution of average prediction errors under the null hypothesis of no associations derived empirically from 100 permutations. In this study, prediction error is evaluated for the best model identified by MDR in each permuted dataset. The null hypothesis was rejected when the upper-tail Monte Carlo p value derived from the permutation test was 0.05 or lower.
Allele frequencies were estimated by gene-counting method, and the chi-square test was used to examine the Hardy-Weinberg equilibrium. Each genotype was assessed by chi square test with the use of dominant, recessive and additive genetic models. To examine the results of the MDR analysis, logistic regression analysis was used and odds ratios (OR) with 95% CIs were obtained. A p value of less than 0.05 was considered statistically significant.
Allele frequencies and association with Type 2 diabetes mellitus
All 23 polymorphisms examined in this study were in Hardy-Weinberg equilibrium and their allele frequencies were similar in the patients with Type 2 diabetes and in the control subjects (Table 1). Among the results of the association study using the dominant, recessive and additive genetic models, no locus was found to be associated with Type 2 diabetes after correction for multiple comparisons.
Summary of MDR results
No. of loci in model
Genes included in the best combination in each modela
UCP2,FABP2, PGC1, FABP3b,CAPN10
Conventional statistical analyses
With regard to the question of whether the genotype combinations from the UCP2 Ala55Val polymorphism and the PPARγ 161C>T polymorphism are associated with susceptibility to Type 2 diabetes, the overall chi square of all the genotype combinations of UCP2 Ala55Val and PPARγ 161C/C was 8.805 (df=4, p=0.0030). As genotype combinations with PPARγ 161T/T homozygote are very rare (as shown in Fig. 1), it is difficult to classify these combinations into high- or low-risk groups. Excluding the rare genotype combinations harbouring PPARγ 161T/T homozygote, the combination of UCP2 55Ala/Val and PPARγ 161C/C was the low-risk combination, while all others corresponded to the high-risk combination. In this model, the OR for the low-risk combination of the UCP2 55Ala/Val heterozygote and the PPARγ 161C/C homozygote was 0.51 (95% CI: 0.34 to 0.77, p=0.0016).
Using the MDR method followed by conventional statistical analysis, we showed an interaction between two of 23 loci in 15 candidate genes of Type 2 diabetes. It is well known that Type 2 diabetes is polygenic or multifactorial in nature but the gene to gene interactions associated with the disease have been very difficult to demonstrate. This obstacle has been mainly explained by the lack of powerful statistical methods and lack of large sample sizes . In this study, we effectively found evidence for a significant gene to gene interaction among two loci out of a total of 23 loci, proving the MDR method used by us to be effective in detecting multigenic interactions among many different loci in many different genes.
Using only two candidate SNPs, several studies have shown gene to gene interactions in obesity and insulin resistance but not in Type 2 diabetes. The ADRB3 gene has been the gene most frequently examined and reported in connection with gene to gene interaction, e.g. with UCP1 gene in Danish  and in Finnish populations , with the human type 2 deiodinase gene in white populations , with the FABP2 gene in Japanese  and with the PPARγ2 genes in Mexican American populations . In a different way, using multi-point allele-sharing analysis with lod score, other investigators showed that the loci on chromosome 2 (NIDDM1) and 15 (near CYP19) interact to increase susceptibility to Type 2 diabetes in Mexican Americans .
Among several models with different numbers of loci, the best gene to gene interaction model identified in our study from 23 loci of 15 candidate genes was a two-locus model including the UCP2 Ala55Val polymorphism and the PPARγ 161C>T polymorphism. In this model, the combination of the UCP2 55Ala/Val heterozygote and the PPARγ 161C/C homozygote was associated with a reduced risk of Type 2 diabetes. Although these variants have been studied for their association with susceptibility to Type 2 diabetes in several populations [22, 23, 24], a significant association with Type 2 diabetes has not been reported independently for each variant. In this study, by contrast, the combined effect of these variants was a significantly reduced risk of Type 2 diabetes, which implies a genetic interaction between these genes in the pathogenesis of the disease.
However, the nature of the gene to gene interaction between the UCP2 Ala55Val polymorphism and the PPARγ 161C>T polymorphism is not clear, as suggested previously when the MDR method was applied to breast cancer . If there was a consistent trend of high-risk or low-risk cells across a series of rows or of columns, it could indicate that a particular locus had a main effect. We could not find such a trend in our best two-locus model. As Type 2 diabetes does not have a typical pattern of Mendelian inheritance that can be explained by a linear model, our observation could be regarded as quite natural. Similarly, other investigators reported that the greatest risk of Type 2 diabetes in Mexican Americans was associated with the 112/121-haplotype combination in CAPN10 gene, but they were unable to find an increased risk of Type 2 diabetes in subjects who were homozygous for either of the haplotypes .
Although Type 2 diabetes is a heterogeneous syndrome resulting from defects of insulin secretion and insulin action , we mainly included the genes responsible for insulin resistance in this study. A balanced number of genes responsible for both insulin secretion and insulin action was not studied. Given gene to gene interactions are expected to be a ubiquitous component of the genetic architecture of common human diseases such as Type 2 diabetes mellitus , and the identification, by MDR, of genotype combinations contributing to Type 2 diabetes could enable timely identification of high-risk individuals, who could benefit from early behavioral or medical intervention to prevent the disease developing.
This study was supported by a grant from the Korea Health 21 R & D Project, Ministry of Health and Welfare, Republic of Korea (00-PJ3-PG6-GN07-001).