Introduction

Celiac disease (CD) is a chronic inflammatory condition of the small intestine caused by an immunological intolerance to the food protein gluten. Patients have to adhere to a life-long gluten-free diet to prevent the detrimental effects of prolonged nutrient and mineral deficiency (Green and Jabri 2003). Susceptibility to CD is predominantly determined by genetic factors, and the complex inheritance patterns suggest the interaction of multiple genes (van Heel et al. 2005). It is well established that the adaptive immune response to gluten plays a pivotal role in the pathogenesis of CD. Th1 activation of CD4+ T cells follows gluten-peptide presentation by DQ2 or DQ8 molecules expressed on antigen-presenting cells (Sollid 2002). The HLA-DQA and -DQB gene variants coding for these molecules are the major genetic determinants of CD susceptibility (Koning et al. 2005). Recently, the importance of innate immunity in CD pathogenesis was also underscored by the observation of induced IL15 expression and NKT cell chemotaxis through the MICA and NKG2D molecules (Hue et al. 2004; Meresse et al. 2004). However, no genetic contribution of the cognate genes has been demonstrated. The notion of crosstalk between the adaptive and innate immune systems is not limited to CD and receives much attention in studies of the inflammatory process (Hoebe et al. 2004). This raises the question of whether some aspects of innate immunity may contribute to the genetic susceptibility to CD. The innate immune system uses a wide array of defense mechanisms against the invasion of pathogens. These encompass the expression of pattern recognition receptors, release of antimicrobial molecules, and preservation of epithelial barrier and tissue integrity by, e.g., serine protease inhibitors (Kimbrell and Beutler 2001).

One branch of the family of serine protease inhibitors is that of the Kazal type (SPINK), which originally consisted of four members in humans (SPINK1, SPINK2, SPINK4, and SPINK5). Recently, as part of a cluster of SPINK genes on chromosome 5q32 that already included SPINK1 and SPINK5, five new SPINK(-like) members were identified that are located more distally: SPINK5L2, SPINK6, SPINK5L3, SPINK7, and SPINK9 (NCBI Map Viewer, build 36.1). However, these new members lack functional annotation and were therefore not included in this study. SPINK family members 1, 2, and 4 have a comparable size and structure, encoded by four exons with a single Kazal-type serine protease inhibitor domain. SPINK5, in contrast, contains 33 exons that encode 15 inhibitory domains. All four SPINK members are thought to be involved in the protection of epithelial and mucosal tissues against proteolytic degradation, although their major sites of expression may differ. SPINK1 is expressed in the pancreas and the gastrointestinal tract, and mutations in this gene are reported in various forms of pancreatitis (Pfutzer and Whitcomb 2001). SPINK2 (located on 4q12) is expressed in the testis, epididymis, and seminal vesicle, where its antimicrobial function may be involved in protection of fertility (Rockett et al. 2004). SPINK4 was originally isolated from pig intestine (Agerberth et al. 1989) and is abundantly expressed in human and porcine goblet cells in the crypts of Lieberkühn but was also found in monocytes and in the central nervous system (Metsis et al. 1992; Norberg et al. 2003). SPINK5 is expressed in the thymus, vaginal epithelium, Bartholin’s glands, oral mucosa, tonsils, and the parathyroid glands (Magert et al. 1999). Mutations in SPINK5 are responsible for Netherton syndrome, a lethal skin disorder characterized by ichthyosis, hair shaft defects, atopy, skin barrier defects, and recurrent bacterial infections (Bitoun et al. 2002). Mouse models of Netherton syndrome have shown enhanced proteolysis of desmoglein 1 and filaggrin in SPINK5 mutants (Descargues et al. 2005; Hewett et al. 2005). Moreover, SPINK5 has also been associated with asthma and atopic dermatitis (Blumenthal 2005).

Interestingly, both SPINK1 and SPINK5 are located on chromosome 5q32. This region contains the CELIAC2 susceptibility locus that emerged repeatedly in linkage studies (Babron et al. 2003). Despite the fact that this region is rich in candidate cytokine genes and intense mapping efforts were made, no closely associated genes were identified (Ryan et al. 2005). Likewise, SPINK4 is located on chromosome 9p13.3 and resides within a linkage region (9p21-13) where we previously identified a novel CD locus that segregated within a four-generation Dutch family (van Belzen et al. 2004). Taken together, the role of SPINK genes in epithelial and mucosal protection and the important genetic locations of SPINK1, SPINK4, and SPINK5 prompted us to subject the four conventional members of the SPINK family to gene expression and genetic association analyses to ascertain their possible role in CD pathogenesis.

Materials and methods

Patient material

Duodenal biopsy samples were collected by endoscopy as part of a routine CD diagnostic procedure or to monitor the response to a gluten-free diet in previously diagnosed patients. All patients were classified using the Marsh nomenclature according to the UEGW criteria (Report of a working group of the United European Gastroenterology Week in Amsterdam 2001). Two biopsy samples taken in parallel to those used for histological examination were pooled and used for determination of gene expression. In total, 62 individuals were examined with quantitative reverse transcription polymerase chain reaction (qRT-PCR), of whom 16 were normal controls and 46 were CD patients (Fig. 1). The patient group consisted of 15 untreated cases with villous atrophy (MIII) and 31 patients treated with a gluten-free diet who were in various stages of mucosal recovery: MII (crypt hyperplasia; n = 11), MI (lymphocyte infiltration; n = 8), and M0 (complete remission; n = 12). The biopsies of all participating patients were reevaluated by an experienced pathologist, and only CD patients with a proven original Marsh III lesion were included in this study. The genetic association study on all four SPINK genes was initially conducted on a cohort of 310 independent CD patients and 180 independent age- and sex-matched controls, all of whom were of Dutch Caucasian descent. In a second stage, focused exclusively on the SPINK4 SNPs, we added 360 controls to a total of 540. For three SPINK4 variants with suggestive P values, the power of the study was further enhanced by adding 169 extra CD cases to a total of 479. In parallel, we also examined the SPINK4 gene in a previously described four-generation Dutch CD family (van Belzen et al. 2004). Six family members and a CEPH control were subjected to DNA sequence analysis of the SPINK4 coding regions and splice-site boundaries. Additionally, we performed SPINK4 SNP haplotype analysis in this family. All patients and family members who volunteered for this study gave signed informed consent. The study was approved by the Medical Ethics Committee of the University Medical Center Utrecht.

Fig. 1

Results of qRT-PCR of SPINK genes in normal controls (NC) and CD patients, either untreated (MIII) or on a gluten-free diet (MII to M0). The Marsh stages refer to the pathological conditions of the mucosa, characterized by atrophy of the villi (MIII); hyperplastic crypts between the villi (MIII and MII); and enhanced lymphocyte infiltration (MIII, MII, and MI). Stage M0 indicates complete remission comparable to controls. The genes tested were as follows: SPINK1 (a); SPINK2 (b); SPINK4 (c); and SPINK5 (d). Measurements were made in triplicate, on pools of separately prepared cDNA samples. Expression data were normalized to the normal control pool. (e) Relative expression of all four SPINK genes with respect to SPINK2 in the healthy duodenal mucosa. Note the logarithmic scale here. The GUSB gene was used as an endogenous control in all tests. Error bars indicate standard deviations

Expression study

The isolation of total RNA from biopsy samples and the analysis of gene expression by real-time qRT-PCR on an ABI Prism 7900HT were performed as described before (Wapenaar et al. 2004). We used the commercially available Assay-on-Demand tests for SPINK1 (Hs00162154_m1), SPINK2 (Hs00221653_m1), SPINK4 (Hs00205508_m1), SPINK5 (Hs00199260), and the endogenous control gene GUSB (PDAR 4326320E; Applied Biosystems, Foster City, CA). All samples were tested in triplicate on pooled cDNA samples representing each Marsh class. The results were confirmed with cDNA from individual samples tested in duplicate. Relative levels of gene expression were obtained using the SDS2.1 software (Applied Biosystems).
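The relative quantification step can be illustrated with a minimal sketch. It assumes the widely used comparative Ct (2^-ΔΔCt) calculation with roughly 100% amplification efficiency; the text states only that relative levels were obtained with the SDS2.1 software, so the exact computation is an assumption here. The function name and all Ct values below are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of a target gene versus a calibrator sample (2^-ddCt).

    Assumption: comparative Ct method with ~100% PCR efficiency.
    ct_target / ct_reference: replicate Ct values of the target gene and the
    endogenous control (e.g. GUSB) in the sample of interest.
    ct_target_cal / ct_reference_cal: the same for the calibrator
    (e.g. the normal control pool).
    """
    delta_ct_sample = mean(ct_target) - mean(ct_reference)
    delta_ct_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical triplicate Ct values, for illustration only.
fold = relative_expression(
    ct_target=[22.0, 22.1, 21.9],         # target gene in patient pool
    ct_reference=[25.0, 25.0, 25.0],      # endogenous control in patient pool
    ct_target_cal=[26.0, 26.1, 25.9],     # target gene in control pool
    ct_reference_cal=[25.0, 25.0, 25.0],  # endogenous control in control pool
)
print(f"fold change relative to calibrator: {fold:.1f}")
```

A lower Ct in the sample than in the calibrator, after correction for the endogenous control, gives a fold change above 1.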

Genetic association and data analysis

Haplotype tagging SNPs were selected for SPINK1, SPINK2, SPINK4, and SPINK5 based on HapMap (Phase I) data using Haploview (Barrett et al. 2005). For each haploblock containing SNPs in high linkage disequilibrium, one or more representative SNPs were selected that should capture the genetic variation within that block. For the four SPINK genes tested, this resulted in a set of 18 haplotype tagging SNPs and one coding SNP (see Table 1 and Fig. 2). SNP assays were obtained from Applied Biosystems and analyzed on an ABI Prism 7900HT. Hardy–Weinberg Equilibrium (HWE) was evaluated separately in cases and controls for all SNPs tested. Allele frequencies were compared between cases and controls, and P values were obtained by χ² analysis.
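The two tests named above can be sketched as follows. This is an illustrative sketch, not the authors' analysis pipeline: it assumes a Pearson χ² test on the 2 × 2 table of allele counts (1 degree of freedom, no continuity correction) and a 1-degree-of-freedom χ² goodness-of-fit test for HWE. Function names and all counts are hypothetical.

```python
import math


def chi2_pvalue_1df(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square on allele counts (alleles A and B, cases vs controls).

    Assumes both alleles are observed, so that no expected count is zero.
    """
    table = [[case_a, case_b], [ctrl_a, ctrl_b]]
    total = case_a + case_b + ctrl_a + ctrl_b
    row_sums = [case_a + case_b, ctrl_a + ctrl_b]
    col_sums = [case_a + ctrl_a, case_b + ctrl_b]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2, chi2_pvalue_1df(chi2)


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test for deviation from Hardy-Weinberg proportions.

    Assumes a polymorphic SNP (both alleles present in the sample).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = [n_aa, n_ab, n_bb]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_pvalue_1df(chi2)


# Hypothetical counts, for illustration only.
assoc_chi2, assoc_p = allelic_chi2(case_a=60, case_b=40, ctrl_a=40, ctrl_b=60)
hwe_stat, hwe_p = hwe_chi2(n_aa=25, n_ab=50, n_bb=25)
print(f"allelic test: chi2={assoc_chi2:.2f}, P={assoc_p:.4f}")
print(f"HWE test:     chi2={hwe_stat:.2f}, P={hwe_p:.4f}")
```

Each individual contributes two alleles to the allelic table, so a cohort of 310 cases yields 620 case alleles per SNP when genotyping is complete.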

Table 1 Allelic distribution of SPINK haplotype tagging SNPs in a Dutch CD case-control cohort
Fig. 2

Genomic organization of the four SPINK genes. The upper horizontal line indicates exon locations (vertical bars) and SNP positions (numbered asterisks). The SNPs are numbered for each gene consecutively as they appear in Table 1. SPINK4 SNP no. 4 represents the nonsynonymous (Val7Ile) coding SNP rs706107. The arrowheads indicate the orientation of transcription. The lower portion of the figure shows the pairwise linkage-disequilibrium structure between the indicated SNPs, given by D′ statistics based on the European population in the HapMap database (Phase II). Darker red intensities indicate higher D′ values (numbers indicate the D′ value, whereas SNP pairs without a number have D′ = 1)
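For readers unfamiliar with the D′ statistic shown in Fig. 2, the following sketch computes the standard normalized disequilibrium coefficient for a pair of biallelic SNPs from haplotype and allele frequencies. It reports the absolute value |D′|, which is an assumption about the plotted quantity; the function name and the example frequencies are hypothetical.

```python
def d_prime(p_ab, p_a, p_b):
    """Normalized linkage disequilibrium |D'| between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A (SNP 1) and allele B (SNP 2)
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # at least one SNP is monomorphic; D' is undefined, report 0
    return abs(d) / d_max


# Hypothetical frequencies, for illustration only.
complete_ld = d_prime(p_ab=0.5, p_a=0.5, p_b=0.5)   # alleles always co-occur
equilibrium = d_prime(p_ab=0.25, p_a=0.5, p_b=0.5)  # alleles independent
print(complete_ld, equilibrium)
```

A value of 1 means that at most three of the four possible haplotypes are present in the sample, which is why tagging SNPs within such a block can stand in for one another.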

DNA sequence analysis

DNA sequence analysis was performed on SPINK4 in six members of a four-generation Dutch CD family and one CEPH control (family 1331, individual 2). Of these six family members, four were affected (index 02, 08, 32, and 41) and carried the disease-linked haplotype, and two were nonaffected (index 21 and 31) and lacked this haplotype (Fig. 3). All coding sequences of the SPINK4 gene were PCR-amplified, including the intron–exon boundaries (for primers and protocols, see supplementary data Table 1). PCR products were examined on a 2% agarose gel and purified with a Millipore Vacuum Manifold (Billerica, MA). Samples were prepared with the BigDye terminator cycle sequencing ready kit (Applied Biosystems) according to the manufacturer’s protocol. PCR and sequencing amplification were performed on a GeneAmp PCR system 9700 (Perkin Elmer, Foster City, CA). Sequences were run on an ABI Prism 3730 analyzer (Applied Biosystems). Analysis and sequence alignment were carried out with the Sequence Navigator (Applied Biosystems) and Vector NTI (InforMax, Massachusetts) software packages.

Fig. 3

Pedigree of the Dutch multigeneration CD family. Only affected descendants are depicted (10 out of 13 siblings in the second generation were affected). The grandparental SPINK4 haplotypes that are boxed and shaded are identical to the grandmaternal at-risk haplotype (noninformative). The SNPs are ordered (top-to-bottom) as they appear in Table 1. Genotype numbers 1, 2, 3, and 4 refer to A, C, G, and T alleles, respectively. Sequence analysis was performed on family members 02, 08, 21, 31, 32, and 41. Family member index numbers are indicated in bold

Results

SPINK gene expression in the CD mucosa

The expression of all four conventional members of the human SPINK family was determined by real-time qRT-PCR on duodenal biopsy-derived cDNA pools from normal controls and CD patients, either untreated or in various stages of remission on a gluten-free diet. The results shown in Fig. 1 indicate that only SPINK4 (Fig. 1c) is differentially expressed, and that its transcriptional activity, which is at its highest in Marsh III (20-fold compared to controls), decreases sharply (fourfold) when patients improve and make the transition to Marsh II. To rule out that the results for SPINK4 were biased by fortuitous differences in individual expression levels within the generated pools, we also examined the control and case samples individually. This did not change the observed drop in SPINK4 expression during tissue recovery (see supplementary Fig. 1). Likewise, the same analysis for the other three SPINK genes did not alter the profile already observed in the pools (results not shown). We also examined the relative expression of the four SPINK genes with respect to each other in the normal intestinal mucosa. This showed that SPINK1 and SPINK4 have the highest expression, respectively 480-fold and 240-fold higher than SPINK2, whereas SPINK5 is in the same order of magnitude (fivefold) as SPINK2 (Fig. 1e). In conclusion, only the SPINK4 gene appears to be differentially regulated in the intestinal mucosa during recovery from the gluten-evoked CD lesion. This observation prompted us to examine whether SPINK4, or any of the other SPINK genes, could also be causally related to CD pathogenesis.

Genetic association analysis of SPINK genes

We designed a haplotype tagging SNP strategy to capture all genetic variation in SPINK1, -2, -4, and -5. An overview of these four SPINK genes with their genomic organization, linkage-disequilibrium structure, and the position of the haplotype tagging SNPs used is depicted in Fig. 2. Initially, these haplotype tagging SNPs were tested in 310 CD cases and 180 controls (Table 1), and none of them showed significant association in any of the four SPINK genes. Despite the initial negative result, we decided to pursue SPINK4 further because it is expressed in goblet cells (Metsis et al. 1992), displayed a CD pathology-related differential expression in the intestinal mucosa, and mapped within a CD linkage region (van Belzen et al. 2004). First, we expanded the control group with 360 samples to a total of 540 for all SPINK4 SNPs tested. As a result, the Val7Ile coding variant rs706107 and its flanking haplotype tagging SNPs rs891671 and rs706109 yielded suggestive but nonsignificant P values of 0.0595, 0.0510, and 0.1122, respectively (data not shown). To increase the power of the study even further, we subsequently added 169 CD cases to a total of 479. With these additional cases, none of the three SNPs tested reached the significance threshold (see Table 2). From this, we conclude that the four SPINK genes tested do not contribute to the genetic susceptibility in the Dutch CD population.

Table 2 Allelic distribution of three selected SPINK4 SNPs in the extended Dutch CD case-control cohort

SPINK4 sequence analysis in a multigeneration family

We have previously described a four-generation CD family with an extraordinarily high incidence of affected individuals (see Fig. 3). The disease segregated with a grandmaternal haplotype on chromosome 9p21-13 (van Belzen et al. 2004), a region that encompasses SPINK4. The apparent dominant inheritance pattern could be caused by a mutation that is rare in the general CD population but present with a high penetrance in this specific family. To assess whether any functional variants of the SPINK4 gene were present in this family, we sequenced all its exons and intron–exon boundaries in six family members. However, we did not observe mutations in any of the samples tested (results not shown). Nor was the exon 1 coding SNP rs706107 specific for affected individuals, as all seven individuals tested (including the CEPH control) carried the most frequent GG genotype (Fig. 3). To exclude the possibility that deletions in SPINK4 were misinterpreted from the sequence data as homozygous genotypes, we also performed segregation analysis of the grandparental SPINK4 haplotypes within the entire family but observed no suspect inheritance pattern (Fig. 3). In conclusion, we have found no evidence that SPINK4 is a candidate gene for the chromosome 9p21-13 CD locus in the Dutch population in general or in the multigeneration Dutch CD family specifically.

Discussion

Chronic inflammatory conditions and autoimmune disorders are typically characterized by a deregulated adaptive and innate immune system. The innate defense consists of multiple components that include physical barriers, antimicrobial molecules, pattern recognition receptors, circulating phagocytes, and the complement system (Hoebe et al. 2004). A breach of the epithelial barrier and loss of microbial containment is often the first of a series of events that trigger or sustain chronic inflammatory diseases (Tlaskalova-Hogenova et al. 2004) as described, e.g., in Crohn’s disease, atopic eczema, asthma, and psoriasis (Schreiber et al. 2005). In CD, the gut–lumen separation is undermined by dietary gluten that evokes a combined innate and adaptive immune response (Londei et al. 2005). It is the joint action of gluten peptides, environmental factors, and genetic determinants that precipitates this enteropathy. The human leukocyte antigen locus is the major genetic contributor to the adaptive Th1 reaction (Koning et al. 2005). Recently, we identified MYO9B as a susceptibility gene in the Dutch population that possibly has an effect on epithelial barrier integrity (Monsuur et al. 2005). Several other studies have underscored the involvement of innate immunity in CD, however, without identification of underlying causative gene variants (Londei et al. 2005). Interestingly, it was also reported that the epithelial glycocalyx and the bacterial composition in the CD gut are distinct (Forsberg et al. 2004; Tjellstrom et al. 2005).

In search of genes that may have a primary contribution to CD pathogenesis, we focused our attention on the SPINK family of serine protease inhibitors that play an important role in tissue preservation through the containment of uncontrolled proteolysis and bacterial growth. In this study, we demonstrated differential gene expression of mucosal SPINK4 in CD. Crypt hyperplasia is a feature of the Marsh III and Marsh II stages of CD, and the concomitant increase in the number of goblet cells may contribute to the increased SPINK4 expression. However, the observed sharp decrease in gene expression sets in during the MIII/MII transition, whereas crypt normalization is observed only later at the MII/MI recovery phase. This suggests that SPINK4 downregulation sets in soon after commencement of the gluten-free diet. This SPINK4 differential expression probably reflects altered goblet cell activity, but its functional significance and regulatory mechanism in CD pathology remain to be established.

The combination of functional relevance and mapping to CD linkage intervals pointed to the SPINK family members as attractive functional and positional candidate genes. We chose a robust strategy for genetic association testing based on haplotype tagging SNPs and the linkage-disequilibrium structure of the SPINK loci, applied to a sizeable Dutch case-control cohort. With our study design, we had 75% power to confirm association with SPINK1, -2, and -5 (relative risk 2.0; allele frequency 0.1–0.45; 95% confidence interval), whereas this was even 95% (RR 2.0) and 80% (RR 1.6) for SPINK4. These power estimates reflect a Type I error rate of 0.05, which is appropriate for testing a previously reported result. Initial detection of a new genetic association would require much more stringent criteria to ensure reproducibility, and power would be correspondingly less.
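How such power estimates depend on sample size and effect size can be illustrated with a rough sketch. It is not the calculation used for the figures quoted above, and it is not expected to reproduce them: it assumes a two-sided two-proportion z-test on allele counts, treats the effect size as an allelic odds ratio, and uses a normal approximation. The function name and parameter choices are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def allelic_power(n_cases, n_controls, control_freq, odds_ratio, alpha=0.05):
    """Approximate power of an allelic case-control comparison.

    Assumptions (not taken from the study): two-sided z-test comparing allele
    frequencies, effect expressed as an allelic odds ratio, normal approximation,
    contribution of the opposite tail ignored.
    """
    n1 = 2 * n_cases      # number of case alleles
    n0 = 2 * n_controls   # number of control alleles
    p0 = control_freq
    # Case allele frequency implied by the allelic odds ratio.
    p1 = odds_ratio * p0 / (1.0 - p0 + odds_ratio * p0)
    p_bar = (n1 * p1 + n0 * p0) / (n1 + n0)
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n0))
    se_alt = sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)


# Cohort sizes from the text; allele frequency and odds ratio are illustrative.
power_initial = allelic_power(310, 180, control_freq=0.1, odds_ratio=2.0)
power_extended = allelic_power(479, 540, control_freq=0.1, odds_ratio=2.0)
print(f"initial cohort:  {power_initial:.2f}")
print(f"extended cohort: {power_extended:.2f}")
```

Lowering `alpha` in this sketch shows the point made in the last sentence of the paragraph: a more stringent significance criterion reduces power for the same cohort.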

In parallel, we examined the extended Dutch CD family for variants and deletions in SPINK4. We hypothesized that a specific SPINK4 mutation, although rare in the general population, could have a dramatic impact on mucus composition, bacterial containment, and gluten sensitivity, thereby explaining the apparent dominant and high-penetrance inheritance pattern in our extended CD family. With neither approach were we able to establish a genetic involvement of the SPINK genes tested. However, we cannot completely rule out the possibility of a rare noncoding mutation in SPINK4 (outside the splice donor and acceptor regions) that might specifically segregate in this atypical CD family, characterized by an exceptionally high prevalence of affected members.

Despite this negative result in the Dutch CD population, we cannot formally rule out the possibility of a genetic contribution of SPINK genes to CD in other European populations such as the Italian, in whom, unlike the Dutch (van Belzen et al. 2003), chromosome 5q linkage was established (Greco et al. 1998; Percopo et al. 2003). Genuine population heterogeneity has been reported before, e.g., for the associations between CARD15/NOD2 and Crohn’s disease (Lesage et al. 2002; Croucher et al. 2003) and between SPINK5 and asthma (Blumenthal 2005; Jongepier et al. 2005). The new SPINK members on chromosome 5q (SPINK5L2, SPINK6, SPINK5L3, SPINK7, and SPINK9) were not part of this study. Currently, no functional annotation is available for these genes, which are located near SPINK1 and SPINK5 in a chromosomal region that appears to have been subjected to gene duplication during evolution. Therefore, we cannot exclude their possible involvement in CD or any other inflammatory disorder.