Introduction

Idiopathic inflammatory bowel disease (IBD) is a multifactorial disorder characterized by chronic and relapsing inflammation specific to the gastrointestinal tract, thus resulting in intestinal malabsorption, mucosal immune system abnormalities, and exaggerated inflammatory responses [1–5]. IBD has two main subtypes, namely ulcerative colitis (UC) and Crohn’s disease (CD). Although the precise etiology remains unknown, several environmental factors, such as commensal bacteria, food antigens, and smoking, as well as multiple genetic factors may contribute to the occurrence and development of IBD [1–5]. Genome-wide linkage-based family studies, candidate gene-based association studies, and large-scale genome-wide association (GWA) studies using single-nucleotide polymorphisms (SNPs) have shown possible IBD susceptibility genes and chromosomal loci [6–10].

IBD involves a complex interplay of innate and adaptive immune cells, including lymphocytes, macrophages, and dendritic cells. In such a setting, alterations in cytokine synthesis and cytokine signaling pathways are implicated in the pathogenesis of IBD [1–5]. GWA studies have recently indicated that multiple genes in the interleukin 23 (IL-23) and IL-23 receptor (IL23R) signaling pathway are associated with susceptibility to CD as well as UC [6–12]. For example, IL-23, IL23R, interleukin 12 precursor (IL12B), interleukin 12 receptor (IL12R), the Janus kinase (JAK) family, and the signal transducers and activators of transcription (STAT) family belong to a gene network in the IL-23/IL23R signaling pathway [13], implying that a subset of these genes can play a central role in the pathogenesis of IBD and may function as key conductors of innate and adaptive inflammatory responses at multiple levels in the intestinal mucosa of IBD patients.

IL-23 as well as JAK2 [9, 14] and STAT3 [9, 12, 14] are associated with susceptibility to CD. These genes are involved in a gene network in the IL-23/IL23R signaling pathway. JAK2 has recently been identified as a CD susceptibility gene in a meta-analysis of GWA data [9, 14] but not by any single-marker association studies on CD. Likewise, tyrosine kinase 2 (TYK2), a member of the JAK family that associates with the β1 subunit of the IL-12 receptor (IL12RB1) [4, 15–17] and with gp130 of the interleukin 6 receptor (IL6R) [18], was also identified as a CD susceptibility gene by the meta-analysis [10] but did not reach significance in single-marker association studies. Furthermore, TYK2 is activated via signaling from a broad range of cytokine receptors and induces phosphorylation, homodimerization, and nuclear translocation of STAT3 [15–19], thus resulting in the transcription of several genes and leading to IL-23-induced production of IL-17, a pro-inflammatory cytokine, in natural killer cells, natural killer T cells, CD4+ T cells, and CD8+ T cells [20–22]. This signaling cascade also plays a role in the differentiation of naive CD4+ T cells into Th17 cells [17, 20–22], which are involved in the first line of host defense by controlling immune responses [22].

Therefore, we performed a candidate gene-based association study by selecting TYK2 and STAT3 as candidate genes. The purpose of this study was to investigate whether SNPs and their combination polymorphisms, which are referred to as haplotypes, in TYK2 and STAT3 are also associated with susceptibility to IBD in a Japanese population and whether such polymorphisms can be used as new genetic biomarkers for predicting the onset of IBD.

Subjects and Methods

Subjects

The study subjects were Japanese who were unrelated to one another. The subjects included 112 patients with UC, 83 patients with CD, and 200 gender-matched, healthy volunteers as control subjects. IBD patients were enrolled from eight general hospitals in Nagasaki, Japan from October 2003 to October 2008. The clinical characteristics of the subjects at the end point of this study are shown in Table I. The study protocol was approved by the Committee for Ethical Issue dealing with the Human Genome and Gene Analysis at Nagasaki University, and written informed consent was obtained from each subject.

Table I The Clinical Characteristics of Study Subjects

The diagnosis of IBD was made based on the endoscopic, radiological, histological, and clinical criteria established by both the World Health Organization Council for International Organizations of Medical Sciences and the International Organization for the Study of Inflammatory Bowel Disease [23–25]. Patients with indeterminate colitis, multiple sclerosis, systemic lupus erythematosus, or any other diagnosed autoimmune diseases were excluded from this study.

Patients with UC were classified into subgroups according to age at onset (≤40 or ≥40 years), extension of disease (proctitis, left-sided colitis, or pancolitis), disease severity (mild, moderate, or severe), and disease activity (active or inactive) (Table I). Likewise, patients with CD were classified into subgroups according to age at onset (≤40 or ≥40 years), the location of lesions (ileal, ileocolonic, colonic, or isolated upper), disease severity (mild, moderate, or severe), disease activity (active or inactive), and the behavior of disease (stricturing, penetrating, or perianal; Table I). The location and extension of UC and CD, disease severity of UC and CD, and behavior of CD were stratified in accordance with the Montreal classification [26] with slight modification. Patients with a high clinical activity index (CAI ≥ 5) for UC [27] or a high Crohn’s disease activity index (CDAI ≥ 150) for CD [28] were regarded as being in the active phase.

Preparation of Genomic DNA

Genomic DNA was extracted from a whole blood sample from each subject using a DNA Extractor WB-Rapid Kit (Wako, Osaka, Japan) according to the manufacturer’s protocol.

Sources of the Candidate Genes and Their Polymorphisms

All SNPs in STAT3 (GenBank accession number, AY572796; MIM 102582), located on chromosome 17q21 [29], and TYK2 (GenBank accession number, AY549314; MIM 176941), located on chromosome 19p13.2 [30], were obtained from data available on the International HapMap Web site [31]. Candidate tag SNPs were selected with priority given to those with a minor allele frequency of more than 5%. Subsequently, linkage disequilibrium blocks and the genotyped tag SNPs among the candidate tag SNPs were determined using the iHap software program [32]. The gene structure and positions of the genotyped tag SNP sites in STAT3 and TYK2 are shown in Figs. 1 and 2, respectively.

Fig. 1

Locations of the genotyped tag SNP sites in STAT3 in the International HapMap (upper) and iHap (lower) data. The horizontal bars in the middle indicate the genomic sequence of STAT3. Blue vertical bars indicate the positions of all SNP sites. Yellow rectangles represent the positions of linkage disequilibrium blocks. A list and the locations of candidate tag SNPs, generated with Haploview 4.0 software, are shown in the lower right. Red inverted triangles indicate the genotyped tag SNP sites in this study, and their names are presented above each inverted triangle.

Fig. 2

Locations of the genotyped tag SNP sites in TYK2 in the International HapMap (upper) and iHap (lower) data. The horizontal bars in the middle indicate the genomic sequence of TYK2. Blue vertical bars indicate the positions of all SNP sites. A yellow rectangle represents the position of a linkage disequilibrium block. A list and the locations of candidate tag SNPs, generated with Haploview 4.0 software, are shown in the lower right. Red inverted triangles indicate the genotyped tag SNP sites in this study, and their names are presented above each inverted triangle.

Determination of Three SNPs in STAT3

Three SNPs, rs8074524 in intron 3, rs2293152 in intron 11, and rs957970 in intron 23, were selected as genotyped tag SNPs (Fig. 1) and were subsequently analyzed by polymerase chain reaction (PCR)-restriction fragment length polymorphism (RFLP). The polymorphic region was amplified by PCR with a GeneAmp PCR System 9700 thermal cycler (Applied Biosystems, Foster City, CA, USA) using 25 ng of genomic DNA in a 25-µl reaction mixture containing 0.8× GoTaq Green master mix (Promega, Madison, WI, USA) and 15 pmol each of the following primers: forward primer 5′-GTCTGGAAAGCTCCATCTGC-3′ and reverse primer 5′-AGAGGCCAGATTAGTGCTGG-3′ for rs8074524; forward primer 5′-TCCCCTGTGATTCAGATCCC-3′ and reverse primer 5′-CATTCCCACATCTCTGCTCC-3′ for rs2293152; and forward primer 5′-CTGGGCTCAAGTGATCTTCC-3′ and reverse primer 5′-GTACTCATCGCCCTCCATTG-3′ for rs957970. The amplification protocol comprised initial denaturation at 95°C for 2 min, followed by 30 cycles of denaturation at 95°C for 30 s, annealing at 62°C for 30 s, and extension at 72°C for 30 s, and a final extension at 72°C for 5 min. The PCR products were digested with the restriction enzyme Hpa II (Takara Bio Inc., Kyoto, Japan) for rs8074524 and rs2293152 and with Xsp I (Takara Bio Inc.) for rs957970. The digests were separated by electrophoresis on a 2% agarose gel (Nacalai Tesque, Kyoto, Japan) and visualized with an ultraviolet transilluminator (Alpha Innotech Co., San Leandro, CA, USA) after ethidium bromide (Nacalai Tesque) staining.
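As a small worked example, the STAT3 thermal program described above can be written down as data and its total programmed hold time summed. This is only a sketch in Python: the variable and function names are illustrative, and ramp times between temperatures, which the text does not report, are ignored.

```python
# Thermal program for the STAT3 PCR, taken from the protocol in the text.
# Ramp times between temperature steps are not reported and are ignored here.
INITIAL_DENATURATION_S = 2 * 60      # 95 C for 2 min
CYCLES = 30
CYCLE_STEPS_S = {
    "denaturation (95 C)": 30,
    "annealing (62 C)": 30,
    "extension (72 C)": 30,
}
FINAL_EXTENSION_S = 5 * 60           # 72 C for 5 min


def total_hold_time_s() -> int:
    """Sum of all programmed hold times, in seconds."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


total_s = total_hold_time_s()
print(f"{total_s} s = {total_s / 60:.0f} min")  # 3120 s = 52 min
```

The actual run time on the thermal cycler would be somewhat longer because of temperature ramping.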

Determination of Four SNPs in TYK2

Four genotyped tag SNPs in TYK2, rs280496 in intron 3, rs280519 in intron 14, rs2304256 in intron 18, and rs280523 in intron 20 (Fig. 2), were analyzed by PCR-RFLP using 15 pmol each of the following primers: forward primer 5′-CGGGGTGATATGCTCATTGG-3′ and reverse primer 5′-CAACGTGCTGCTGGACAACG-3′ for rs280496; forward primer 5′-CCGCCATGGTGAAAGTTAGC-3′ and reverse primer 5′-ATTTGTGCAGGCCAAGCTGC-3′ for rs280519; forward primer 5′-TCACCAGGCACTTGTTGTCC-3′ and reverse primer 5′-CGGCTTCCAGCATGTGTATG-3′ for rs2304256; and forward primer 5′-ACATTTCCCCCTGCCTACAC-3′ and reverse primer 5′-TTACAGACATGCGCCACCAC-3′ for rs280523. The other constituents of the PCR mixture were the same as described above. The amplification protocol comprised initial denaturation at 95°C for 2 min, followed by 30 cycles of denaturation at 95°C for 30 s, annealing at 64°C (rs280496, rs280519, and rs280523) or 62°C (rs2304256) for 30 s, and extension at 72°C for 30 s and final extension at 72°C for 5 min. The PCR products were digested with Bsl I (New England BioLabs Inc., Beverly, MA, USA) for rs280496, Hpy99 I (New England BioLabs Inc.) for rs280519, Bsm I (New England BioLabs Inc.) for rs2304256, and BsiE I (New England BioLabs Inc.) for rs280523. The digests were then separated on a 2% agarose gel as described above.

Haplotype Structure of TYK2

Two tag SNPs in TYK2, which were significantly associated with susceptibility to CD and were located within the same linkage disequilibrium block (Fig. 2), were used to infer the haplotype structure and to analyze the haplotype frequencies using the SNP Alyze 7.0 standard software package (Dynacom Inc., Yokohama, Japan), with the aim of capturing the variability and enhancing the power to detect allelic associations of rare variants [33, 34].

Statistical Analysis

Differences in age and gender between UC or CD patients and control subjects were evaluated by an unpaired Student’s t test and a chi-square test, respectively, using the SPSS 17 (SPSS Japan Inc., Tokyo, Japan) and Prism 5 (GraphPad Software Inc., San Diego, CA, USA) statistical software packages. The frequencies of the expected alleles were calculated from those of the observed genotypes according to the Hardy–Weinberg equilibrium. The frequencies of the observed and expected alleles were compared by the chi-square test with Yates’ correction using the SNP Alyze 7.0 standard software package. The frequencies and distributions of alleles, genotypes, haplotypes, and diplotypes were statistically compared between UC or CD patients and control subjects by the chi-square test and logistic regression analysis using Prism 5 and SPSS 17. Subsequently, a comparison of the genetic risk factors between the statistically significant genotype of STAT3 and diplotype of TYK2 for susceptibility to CD was carried out by a multivariate logistic regression analysis using SPSS 17. The odds ratio (OR) with 95% confidence interval (CI) was calculated using SPSS 17. A P value of less than 0.05 was considered to be statistically significant.
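The Hardy–Weinberg comparison described above can be sketched in a few lines of Python. This is an illustration, not the SNP Alyze implementation: the function name is hypothetical, and the example counts (86 C/C, 90 C/A, 24 A/A) are back-calculated from the percentages reported in the Results for rs2304256 in the 200 control subjects (43.0% and 45.0%, with the remainder assumed to be A/A).

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int, yates: bool = True):
    """Chi-square goodness-of-fit test of genotype counts against HWE (1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of the first allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = 0.0
    for obs, exp in zip(observed, expected):
        diff = abs(obs - exp)
        if yates:
            # Yates' continuity correction, clipped at zero as a safeguard.
            diff = max(diff - 0.5, 0.0)
        chi2 += diff * diff / exp
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts back-calculated from reported percentages (an assumption, see above).
chi2, p_value = hwe_chi_square(86, 90, 24)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")  # chi2 = 0.000, P = 1.000
```

With these counts every observed value lies within 0.5 of its expectation, so the corrected statistic is zero, which is consistent with the reported agreement with Hardy–Weinberg equilibrium.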

Results

Association of Tag SNPs in STAT3 with Susceptibility to IBD

The frequencies and distributions of alleles and genotypes at the three tag SNPs in STAT3 were identified and compared between UC or CD patients and control subjects (Tables II and III, respectively). The C allele at rs8074524 SNP, G allele at rs2293152 SNP, and A allele at rs957970 SNP are major alleles, whereas the other alleles are minor alleles (Table II). The distributions of these tag SNPs in STAT3 among IBD patients and control subjects corresponded well to the Hardy–Weinberg equilibrium, thus implying that the subject base has a homogeneous genetic background.

Table II Distributions of Polymorphic Alleles at the Genotyped Tag SNP Sites in STAT3 and TYK2 Among Study Subjects
Table III Distribution of Genotypes at the Tag SNP Sites in STAT3 and TYK2 Between CD Patients and Control Subjects

The frequencies of the C allele and its homozygous C/C genotype at rs2293152 in CD patients were significantly higher than those in control subjects (45.2% vs. 33.3%, P = 0.007 and 22.9% vs. 8.5%, P = 0.001, respectively). No significant differences were observed in the frequency of other alleles and genotypes between patients and control subjects.
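The allele comparison above can be illustrated with a plain 2×2 chi-square test. The Python sketch below is an assumption-laden reconstruction: the allele counts (75 of 166 chromosomes in CD patients, 133 of 400 in control subjects) are back-calculated from the reported percentages and group sizes, the function name is hypothetical, and no continuity correction is applied.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail, 1 degree of freedom
    return chi2, p_value


# Rows: CD patients, control subjects. Columns: C allele, G allele at rs2293152.
# Counts are back-calculated from 45.2% of 166 and 33.3% of 400 chromosomes.
chi2, p_value = chi_square_2x2(75, 91, 133, 267)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}")
```

Under these assumptions the statistic is roughly 7.2 and the P value is close to 0.007, in line with the value reported in the text.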

Association of Tag SNPs in TYK2 with Susceptibility to IBD

The frequencies and distributions of alleles and genotypes at the four tag SNPs in TYK2 were identified and compared between UC or CD patients and control subjects (Tables II and III, respectively). The C allele at rs280496 SNP, A allele at rs280519 SNP, C allele at rs2304256 SNP, and G allele at rs280523 SNP are major alleles, whereas other alleles are minor alleles (Table II). The distributions of these tag SNPs in TYK2 among IBD patients and control subjects corresponded well to the Hardy–Weinberg equilibrium. The frequencies of the A allele and its homozygous A/A genotype at rs280519 in CD patients were significantly higher than those in control subjects (65.7% vs. 56.0%, P = 0.034 and 44.6% vs. 31.5%, P = 0.037, respectively). Likewise, the frequencies of the C allele and its homozygous C/C genotype at rs2304256 in CD patients were also significantly higher than those in control subjects (77.7% vs. 65.5%, P = 0.004 and 62.7% vs. 43.0%, P = 0.003, respectively). In contrast, the frequency of the C/A heterozygous genotype at rs2304256 was significantly lower in CD patients in comparison to that in control subjects (30.1% vs. 45.0%, P = 0.021).
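To give a sense of the effect size behind such a genotype comparison, an unadjusted odds ratio with a 95% confidence interval can be computed by Woolf's logit method. This is a simpler technique than the logistic regression in SPSS used in this study, and the Python sketch below rests on counts back-calculated from the reported percentages for the C/C genotype at rs2304256 (52 of 83 CD patients and 86 of 200 control subjects). The function name is hypothetical.

```python
import math


def odds_ratio_woolf(case_pos: int, case_neg: int, ctrl_pos: int, ctrl_neg: int,
                     z: float = 1.96):
    """Unadjusted odds ratio with a Woolf (logit) confidence interval."""
    odds_ratio = (case_pos * ctrl_neg) / (case_neg * ctrl_pos)
    se_log_or = math.sqrt(1 / case_pos + 1 / case_neg + 1 / ctrl_pos + 1 / ctrl_neg)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# C/C carriers vs. all other genotypes at rs2304256 (back-calculated counts).
or_cc, ci_low, ci_high = odds_ratio_woolf(52, 31, 86, 114)
print(f"OR = {or_cc:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

With these counts the odds ratio is about 2.2 and the interval excludes 1, which agrees in direction with the significant difference reported above.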

Association of Haplotypes and Diplotypes of TYK2 with Susceptibility to CD

Subsequently, four haplotypes composed of the two tag SNPs (rs280519 and rs2304256) that displayed a significant association with CD susceptibility and were located within the same linkage disequilibrium block were constructed and identified using the SNP Alyze 7.0 standard software package (Table IV). A logistic regression analysis revealed that the frequency of the Hap 1 haplotype (A allele at rs280519 SNP and C allele at rs2304256 SNP) was significantly higher in CD patients than in control subjects (65.7% vs. 55.3%, P = 0.023, OR = 1.549). In contrast, the frequency of the Hap 2 haplotype (G allele at rs280519 SNP and A allele at rs2304256 SNP) was significantly lower in CD patients than in control subjects (22.3% vs. 33.7%, P = 0.007, OR = 0.563).
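The strength of linkage disequilibrium between the two SNPs can be illustrated from the control frequencies reported in this section. The Python sketch below uses the standard definitions of D, D′, and r²; the function name is hypothetical, and the inputs (A allele at rs280519, 0.560; C allele at rs2304256, 0.655; Hap 1, 0.553) are the rounded percentages from the text, so the output is approximate.

```python
def pairwise_ld(p_hap: float, p_a: float, p_b: float):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_hap is the frequency of the haplotype carrying both reference alleles;
    p_a and p_b are the frequencies of those alleles at the two loci.
    """
    d = p_hap - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Control-subject frequencies as reported (rounded) in the text.
d, d_prime, r2 = pairwise_ld(p_hap=0.553, p_a=0.560, p_b=0.655)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

With these inputs D′ is close to 1 (about 0.96) and r² is about 0.62, which is consistent with the two SNPs lying in the same linkage disequilibrium block.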

Table IV Distributions of Haplotypes of TYK2 Between CD Patients and Control Subjects

Furthermore, eight diplotypes composed of the four haplotypes were identified (Table V). A logistic regression analysis showed that the frequency of CD patients possessing the Hap 1/Hap 1 diplotype was significantly higher than that of control subjects (44.6% vs. 30.5%, P = 0.024, OR = 1.833). In contrast, the frequency of CD patients having the Hap 1/Hap 2 diplotype was significantly lower than that of control subjects (24.1% vs. 36.5%, P = 0.045, OR = 0.552). The results of the diplotype analysis regarding the Hap 1 haplotype of TYK2 coincided well with those of the haplotype analysis between CD patients and control subjects; however, the Hap 2/Hap 2 diplotype was not significantly associated with reduced susceptibility to CD (Table V).
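For context, four haplotypes can in principle pair into ten unordered diplotypes, of which eight were observed in this study. The short Python enumeration below makes the count explicit; the labels Hap 3 and Hap 4 are assumed names for the two remaining haplotypes, which are not named in the text.

```python
from itertools import combinations_with_replacement

# Hap 1 and Hap 2 are named in the text; Hap 3 and Hap 4 are assumed labels.
haplotypes = ["Hap 1", "Hap 2", "Hap 3", "Hap 4"]

# A diplotype is an unordered pair of haplotypes, homozygous pairs included.
diplotypes = list(combinations_with_replacement(haplotypes, 2))

for first, second in diplotypes:
    print(f"{first}/{second}")
print(len(diplotypes))  # 10
```

The two diplotypes that were not observed presumably involve the rarer haplotypes, although the text does not say which ones they are.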

Table V Distributions of Diplotypes of TYK2 Between CD Patients and Control Subjects

Gene–Gene Interaction Between STAT3 and TYK2 for Susceptibility to CD

The gene–gene interaction between STAT3 and TYK2 was analyzed between CD patients and control subjects. A multivariate logistic regression analysis indicated that two variable genetic factors, the C/C genotype at rs2293152 SNP in STAT3 and the Hap 1/Hap 1 diplotype of TYK2, independently contributed to susceptibility to CD (P = 0.002, OR = 3.113, 95% CI = 1.515–6.399 and P = 0.030, OR = 1.783, 95% CI = 1.042–3.053, respectively; Table VI).

Table VI Gene–Gene Interaction Between STAT3 Genotype and TYK2 Diplotype for Susceptibility to CD

Furthermore, with regard to the combined effect of the STAT3 genotype and the TYK2 diplotype on susceptibility to CD, a multivariate logistic regression analysis showed that the OR was significantly increased (OR = 7.486, P = 0.0008, 95% CI = 2.310–24.261) in individuals possessing both the C/C genotype at rs2293152 SNP in STAT3 and the Hap 1/Hap 1 diplotype of TYK2 in comparison to individuals possessing the other genotypes (Table VII).

Table VII The Gene–Gene Combination Effect of STAT3 Genotype and TYK2 Diplotype for Susceptibility to CD

Discussion

This study is the first demonstration of the single-marker association of STAT3 and TYK2 polymorphisms with CD susceptibility in the Japanese population, although meta-analyses using GWA data previously indicated that STAT3 and TYK2 appear to be genetic determinants of CD in European and North American populations [9, 10, 14]. Furthermore, in this candidate gene-based association study, STAT3 was associated only with CD, and not with UC, in the Japanese population, thereby supporting the meta-analysis of the GWA data in populations of European and North American ancestry [9].

The presence of the C allele and its homozygous C/C genotype at rs2293152 SNP in STAT3 conferred susceptibility to CD. CD- and UC-susceptible rs744166 SNP, which was identified by GWA studies [9, 12, 14], was not analyzed in this study because this SNP was not selected as a genotyped tag SNP by the iHap software program. Because rs744166 and rs957970 SNPs are located within the same linkage disequilibrium block (Fig. 1) and rs957970 SNP was not associated with susceptibility to CD in this study, rs744166 SNP may not be associated with CD in the Japanese population. The difference in susceptibility to CD at the SNP site between Caucasian and Japanese subjects can be attributed to genetic background, although STAT3 may contribute to the same mechanisms of the immunopathogenesis of CD in both Caucasian and Japanese patients.

The presence of the A allele and its homozygous A/A genotype at rs280519 SNP in TYK2, the C allele and its homozygous C/C genotype at rs2304256 SNP in TYK2, the Hap 1 haplotype (A allele at rs280519 SNP and C allele at rs2304256 SNP) of TYK2, and its homozygous Hap 1/Hap 1 diplotype of TYK2 conferred susceptibility to CD. The CD-susceptible rs12720356 SNP, which was identified by GWA studies [10], was not analyzed in this study because this SNP was not selected as a genotyped tag SNP by the iHap software program. Although rs280519 and rs2304256 SNPs, which were examined in this study, are located within the same linkage disequilibrium block, rs12720356 SNP does not belong to any linkage disequilibrium block (Fig. 2). Furthermore, because rs280496 SNP, which lies near rs12720356 SNP, was not associated with susceptibility to CD in this study, rs12720356 SNP may not be associated with CD in the Japanese population. This disparity can also be attributed to genetic differences between Caucasian and Japanese individuals, although TYK2 may contribute to the same immunopathogenesis in both Caucasian and Japanese CD patients.

In addition, the C/C genotype at rs2293152 in STAT3 and the Hap 1/Hap 1 diplotype of TYK2 independently contributed to susceptibility to CD, and the presence of both markedly increased the odds ratio for CD, indicating an approximately 7.5-fold increase in susceptibility to CD in this study, although individuals carrying both account for only approximately 13% of the CD patients (11 of 83 = 13.3% in Table VII). These findings imply that STAT3 and TYK2 are genetic determinants for the predisposition to the onset and/or development of CD in Japanese individuals. However, this study population was relatively small, and further studies on a larger number of Japanese subjects and on other ethnicities are necessary to confirm the association between the STAT3 and TYK2 polymorphisms and CD. Such studies are needed because different populations often have different allele frequencies and haplotype structures.

Recent GWA studies on the IL-23/IL23R signaling pathway have shifted the focus to the IL-23 cytokine [6–12]. After IL-23 binds to its receptor, which comprises IL23R and IL12RB1 [35], IL-23 signaling may induce the activation of JAK2 bound to IL23R as well as TYK2 bound to IL12RB1, because signaling through IL12RB1 and IL23R requires TYK2 [16], thus resulting in the phosphorylation of STAT3 as well as STAT1, STAT4, and STAT5 in activated macrophages and dendritic cells [35]. The signaling cascade eventually leads to the differentiation of naive CD4+ T cells into Th17 cells [17, 20–22]. Th17 cells produce IL-17A, IL-17F, and IL-22, which are involved in the first line of host defense by controlling immune responses [22]. Indeed, the expression of IL-12, IL-23, STAT3, IL-17, and IL-22 has been reported to increase in the lamina propria of the intestinal mucosa in CD patients [36–41]. Taken together, these findings suggest that the IL-23/IL23R signaling pathway is central to the inflammation leading to CD and modifies an individual’s risk of developing CD. For these reasons, it may be speculated that the polymorphisms of STAT3 and TYK2, especially the C/C genotype at rs2293152 in STAT3 and the Hap 1/Hap 1 diplotype of TYK2, confer a gain of function on both STAT3 and TYK2, thus altering the efficiency of the IL-23/IL23R signaling pathway. These changes could lead to the perpetuation of the chronic intestinal inflammatory process, thereby resulting in the onset and/or development of CD.

Conclusions

As TYK2 and STAT3 appear to be the genetic determinants of CD in the Japanese population, the combination polymorphism of TYK2 and STAT3 may be useful as a new DNA-based diagnostic biomarker for identifying high-risk individuals susceptible to CD. Finally, STAT3 and TYK2 may be good target molecules for the development of novel drugs in the future.