Background

T1D is characterised as a common autoimmune disease, mainly resulting from a T-cell mediated destruction of pancreatic beta cells that leaves patients completely dependent on exogenous insulin to regulate their blood glucose level. T1D is strongly clustered in families with an overall genetic risk ratio, an estimate of the familial clustering of the disease, of approximately 15[1]. However, of the hundreds of association studies reported to date, only four loci have been identified and successfully replicated: the HLA class II genes on chromosome 6p21[2]; the insulin gene (INS) on chromosome 11p15[3, 4]; CTLA4 on chromosome 2q33[5, 6]; and PTPN22 on chromosome 1p13[7, 8]. CD25 on chromosome 10p15 has been implicated, but this finding awaits independent replication[9]. Given that these genes alone cannot explain the familial clustering of T1D, many other genes remain to be identified.

Recently, there have been several reports focusing on the relationship between autoimmune disease and the complement system, which is composed of more than 30 soluble and membrane-bound proteins[10, 11] and plays an important role in innate host defence. As inappropriate regulation of the complement system can lead to significant damage of host tissues[12], a number of membrane-bound complement regulatory proteins are active, such as DAF, a glycosylphosphatidylinositol-anchored membrane protein that restricts complement activation by inhibiting the formation of C3 convertases in both the classical and alternative pathways[13, 14].

Dysfunction of human DAF on erythrocytes contributes to paroxysmal nocturnal hemoglobinuria (PNH) by increasing their sensitivity to complement lysis[13, 15, 16]. In addition, a proportion of DAF-deficient (Cromer INAB) patients develop inflammatory bowel disease. However, little is known about DAF's role in autoimmune disease in vivo[17].

Recently, it has been reported that DAF modulates T cell immunity by controlling T cell- and antigen-presenting cell-induced alternative pathway C3 activation during cognate interactions[18–20]. According to gene targeting studies, mice deficient in the DAF1 gene, the murine homologue of human DAF, showed greater susceptibility to complement-mediated inflammatory injury. In particular, DAF1-deficient female mice on an MRL/lpr background, a model for human systemic lupus erythematosus, showed aggravated lymphadenopathy and splenomegaly, higher serum anti-chromatin autoantibody levels, and dermatitis[21].

Given this prior evidence, it has been suggested that DAF may function as a negative regulator of the autoimmune response by modulating T cell activity and directly protecting host tissues in vivo, and that recombinant DAF may be an ideal therapeutic agent for autoimmunity[22]. On the other hand, DAF does not lie under any of the reported T1D linkage peaks[23, 24], nor have there been any reports of genetic association studies between DAF and autoimmune disease. Recently, however, differential expression of DAF was observed when comparing T cells from nonobese diabetic (NOD) mice and diabetes-resistant NOD mice having a congenic interval containing the DAF gene, thereby making it a candidate gene for the Idd5.4 region (William Ridgway and Linda Wicker, unpublished observations).

In this study, to assess whether variation in DAF influences susceptibility to T1D, we performed an association study using a LD mapping approach, together with the direct analysis of three non-synonymous SNPs (nsSNPs) in large case-control and family collections.

Results

Linkage disequilibrium analysis

Initially, we used phase II genotyping data from the HapMap project[25, 26], a catalogue of common human genetic variants that provides allele frequencies and intermarker LD patterns within and among populations of African, Asian, and European ancestry. In the DAF region, about 40 kb on chromosome 1q32, 21 common SNPs (minor allele frequency (MAF) ≥ 0.05) have been genotyped in 60 U.S.A. residents with northern and western European ancestry, collected in 1980 by the Centre d'Etude du Polymorphisme Humain (CEPH, CEU). We note that all of these SNPs were located in non-coding regions and that the average inter-SNP distance was 2 kb. A LD map of the region, using pairwise D', shows little evidence of recombination within the region (Figure 1a).
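The pairwise D' statistic used for the LD map can be illustrated with a minimal sketch. It assumes phased haplotypes with alleles coded 0/1, which is a simplification: HapMap genotypes are unphased, so in practice haplotype frequencies have to be estimated first. The function name d_prime and the input layout are illustrative and are not taken from the software used in this study.

```python
def d_prime(haplotypes):
    """Lewontin's D' for two diallelic loci.

    haplotypes: list of (a, b) tuples, one per chromosome, with the
    allele at each locus coded 0 or 1 (assumed already phased).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient
    if d == 0:
        return 0.0
    # Normalise by the largest |D| attainable given the allele frequencies.
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d) / d_max


# Only three of the four possible haplotypes are present, so D' is 1.
sample = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0)] * 2
print(d_prime(sample))
```

A D' of 1 means that at most three of the four possible two-locus haplotypes are observed, which is why high pairwise D' is read as little evidence of historical recombination between the markers.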

Figure 1

LD map for the human DAF region on chromosome 1q32. All markers with a MAF of less than 0.05 or with insufficient genotyping data were excluded from the LD measurement. a. LD map with 21 markers genotyped in 60 individuals, obtained from HapMap II. b. LD map with 22 markers identified by resequencing in 32 CEPH individuals. c. LD map with 38 markers, a combined dataset of both HapMap II and in-house resequencing data from 32 CEPH individuals.

DAF resequencing

As we were concerned about adopting a LD mapping approach given the HapMap SNP density[27], we resequenced DAF in 32 CEPH individuals, selected from the 60 CEPH individuals used by the HapMap project, to increase the SNP density across the region.

Analysis of the resequencing data identified 32 polymorphisms, 26 of which were SNPs and six were deletion/insertion polymorphisms (DIPs), of which 12 SNPs and four DIPs were novel when compared to dbSNP build 125 (Table 1). Twenty-two polymorphisms were common (MAF ≥ 0.05), five of which were also found in the HapMap II data. The relatively small number of common polymorphisms found in both datasets is not unexpected, as HapMap II SNPs were selected to provide an even coverage in terms of distance across the genome, whereas the resequencing is focused on regions of interest and extracts all common polymorphisms present in these individuals. A LD map of the region, based on these 22 polymorphisms (Figure 1b), revealed additional evidence of recombination within the region over and above that apparent in HapMap II data alone (Figure 1a). There was a breakdown in LD towards the 5' end of DAF that was not evident in HapMap II data.

Table 1 Polymorphisms identified in human DAF. Map positions on human chromosome 1 were from NCBI build 35. Selected tag SNPs are in boldface. Alleles are coded Major > Minor. MAF calculated from the 32 individuals used for tag SNP selection, except for nsSNPs, in which case the MAF is calculated from genotyped controls. DIL = identified by in-house resequencing, HapMap = identified in HapMap II dataset, DIL&HapMap II = Identified in both in-house resequencing and HapMap II dataset.

Tag SNP analysis of DAF

To test for an association between T1D and the DAF region, we adopted a LD mapping approach, which exploits the non-random relationships between SNPs (known as LD) in a region of interest to reduce the amount of genotyping required. As the causal SNP is unknown, we assume that predicting the causal SNP is likely to be no more difficult than predicting any other SNP. The predictive performance of the tag SNPs was assessed using an R² measure, which measures the ability to predict each known SNP by multiple regression on the set of tag SNPs. The tag SNPs were analysed using a multilocus test, as described by Chapman et al[28], which tests for an association between the tag SNPs and T1D due to LD with one or more causal variants[28, 29].
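The R² criterion described above can be sketched as follows. This is an illustrative pure-Python version, not the Stata implementation used in the study: each SNP is assumed to be scored 0, 1 or 2 per individual, the regression includes an intercept, and the tag SNPs are assumed not to be perfectly collinear. The names solve, r_squared and min_r2 are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(target, tags):
    """R^2 from regressing one SNP's scores on the tag SNP scores.

    target: list of 0/1/2 scores for the SNP to be predicted.
    tags:   list (one entry per individual) of lists of tag SNP scores.
    """
    n = len(target)
    x = [[1.0] + list(row) for row in tags]          # add intercept column
    k = len(x[0])
    xtx = [[sum(x[i][r] * x[i][c] for i in range(n)) for c in range(k)]
           for r in range(k)]
    xty = [sum(x[i][r] * target[i] for i in range(n)) for r in range(k)]
    beta = solve(xtx, xty)                           # least-squares coefficients
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in x]
    mean = sum(target) / n
    ss_tot = sum((y - mean) ** 2 for y in target)
    ss_res = sum((y - f) ** 2 for y, f in zip(target, fitted))
    return 1 - ss_res / ss_tot


def min_r2(genotypes, tag_idx):
    """Smallest R^2 over all non-tag SNPs for a candidate tag set."""
    tags = [[row[j] for j in tag_idx] for row in genotypes]
    others = [j for j in range(len(genotypes[0])) if j not in tag_idx]
    return min(r_squared([row[j] for row in genotypes], tags) for j in others)
```

Under this sketch, a candidate tag set would be acceptable when min_r2 is at least the chosen threshold, so that every known common polymorphism is predicted to that level.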

We first combined our resequencing data with the HapMap data, providing a panel of 38 common polymorphisms genotyped in 32 individuals (Table 1), and generated a combined LD map of the region (Figure 1c). Subsequently, seven tag SNPs were selected[9] from the 38 common polymorphisms, required to capture the variation within the DAF region with a minimum R² of 0.80[28] (Table 1). The tag SNPs were genotyped in 3,523 cases and 3,817 controls, and in 725 Caucasian multiplex T1D families (Table 2). The case-control and family multilocus P-values were 0.12 (3,523 case and 3,817 control genotypes; F(7, 7321) = 1.63) and 0.69 (parent-child trio genotypes = 1,390; χ² = 4.72, 7 df), respectively, providing no evidence for an association between T1D and the DAF region. In the case-control collection, the multilocus test was stratified by broad geographical region within the UK in order to minimize any confounding due to variation in allele frequencies across Great Britain[9, 30].

Table 2 Genotyping data of tag SNPs. The number of individuals with each genotype in case-control data and the number of alleles in family data are indicated. T, Transmitted. UT, Untransmitted. MAF, minor allele frequency calculated from either control samples or parents.

Analysis of DAF non-synonymous SNPs

Recently, it has been proposed that complex diseases such as T1D may result from the effects of a large number of rare variants, with substantial allelic heterogeneity at causal loci[31, 32]. In DAF, several rare non-synonymous SNPs (nsSNPs) were reported in the exons encoding the short consensus repeat (SCR) domains of the DAF protein, which have subsequently been shown to be related to antigens of the Cromer blood group system[13, 14, 33, 34]. On the basis of the rare variant hypothesis, we genotyped three rare nsSNPs in 3,490 cases and 3,814 controls (Table 3), under the hypothesis that a rare functional variant in DAF might have a strong effect in T1D. The following three nsSNPs were assessed: DAF-WESa/b(G > T), located in exon 2, with a MAF of 0.0055–0.0060 in a Finnish population[35, 36]; rs28371588(C > A), also located in exon 2[34]; and rs12135160(G > A), identified by the SsahaSNP detection tool (NIH and Sanger Institute, UK) in exon 8 and not previously genotyped. All result in amino-acid substitutions, but their phenotypic influences have not been characterized. In the present study, the MAF of rs12135160(G > A) was 0.00042 in 3,768 controls, and consequently, we have no statistical power to detect an association. Both rs28371588(C > A) and DAF-WESa/b(G > T) were monomorphic in the case-control collection.
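The lack of power for rs12135160 follows from simple arithmetic on the reported frequency. The sketch below, with a hypothetical helper name, converts a MAF and a number of genotyped individuals into the expected number of minor-allele copies, assuming two chromosomes per individual.

```python
def expected_minor_alleles(maf, n_individuals):
    """Expected count of minor-allele copies among 2N chromosomes."""
    return 2 * n_individuals * maf


# Reported values for rs12135160 in controls: MAF 0.00042, 3,768 individuals.
copies = expected_minor_alleles(0.00042, 3768)
print(round(copies, 2))  # roughly three copies among 7,536 chromosomes
```

With only about three copies of the minor allele among the controls, no realistic difference in frequency between cases and controls could be detected in a collection of this size.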

Table 3 Genotyping data of non-synonymous SNPs. The numbers of individuals with each genotype in case-control data are indicated. MAF, minor allele frequency calculated from control samples.

Discussion

In this study, we did not find any evidence for an association between T1D and the DAF region in large case-control and family collections using a LD mapping approach. We combined the HapMap II genotyping data and resequencing data for the selection of tag SNPs. Had we chosen the tag SNPs using only the HapMap II genotyping data, only two tag SNPs (rs2564978 and rs1507765) would have been required to capture the detected variation within the ~40 kb DAF region with a minimum R² of 0.8. However, when the predictive performance of the two tag SNPs was assessed in the combined sequence dataset, they no longer captured the variation within the region to the required level, since seven of the thirty-six common polymorphisms had an R² below 0.8. The inability of the tag SNPs selected from HapMap II data to tag the combined dataset (minimum R² = 0.35) suggests that for the analysis of localized regions containing candidate genes, as opposed to whole-genome association studies, HapMap II data alone may not provide sufficient information to facilitate a comprehensive LD-mapping approach. In the tag SNP approach, as the causal variant is unknown, we assume that the problem of predicting the causal polymorphism is likely to be no more difficult than that of predicting any other polymorphism[28]. Consequently, the power of the tag SNP approach to detect a causal polymorphism is based upon the minimum R²[28], assuming that the majority of common polymorphisms in a region are known. In this instance, incomplete knowledge of the common polymorphisms in a region inflated the minimum R², providing false confidence in the ability of the tag SNPs to capture the variation within a region, and in the power to detect a causal variant. Our results indicate that for some genes/regions HapMap II data may need to be supplemented by additional resequencing data to allow comprehensive association mapping of common variants.

Conclusion

We conclude that variation in DAF itself is unlikely to have a major effect in T1D in these populations. Analysis of an extended region, surrounding the DAF region analysed in this study, showed a cluster of several other genes involved in the complement system, including C4b binding protein (C4bp) and membrane cofactor protein (MCP), both known regulators of complement activation (RCA) genes[11, 37, 38]. Like DAF, C4bp and MCP restrict complement activation by inhibiting the formation of C3 convertases in the classical pathways, suggesting that they modulate each other in direct and indirect ways. To clarify the relationship between autoimmune disease and the complement system, including DAF, further genetic association studies and functional studies on RCA genes are needed. The set of tag SNPs and the LD map for the DAF region will be useful for such further studies.

Methods

Subjects

The resequencing panel consisted of 32 CEPH individuals, Utah residents with ancestry from Northern and Western Europe collected in 1980 by the Centre d'Etude du Polymorphisme Humain (CEPH).

The 3,523 cases were recruited as part of the Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory's United Kingdom Genetic Resource Investigating Diabetes (U.K. GRID) study, which is a joint project between the University of Cambridge Department of Paediatrics and the Department of Medical Genetics at the Cambridge Institute for Medical Research. Most cases were < 16 years of age at the time of collection, all resided in Great Britain, and all were of European descent (self-reported). The 3,817 control samples were obtained from the 1958 British Birth Cohort (1958 BBC), an ongoing follow-up of all persons born in Great Britain during one week in 1958 (National Child Development Study)[39]. All cases and controls were of white ethnicity.

All families were Caucasian and of European descent, with two parents and at least one affected child. The family collection consisted of 457 multiplex families from the U.K. British Diabetic Association Warren 1 repository[40] and 268 multiplex families from U.S.A. Human Biological Data Interchange[41].

The Cambridge Local Research Ethics Committee gave full ethical approval, and informed consent was obtained for the collection and use of these DNA samples from all subjects.

DAF resequencing

We first annotated the DAF gene locally[42, 43] and displayed the annotation through gbrowse[44] within T1DBase[45]. Using these annotations, we resequenced all 11 exons, exon/intron boundaries and up to 3 kb of 3' and 5' flanking sequence of the DAF gene in 32 CEPH individuals, to increase the SNP density across the region. The sequencing reactions were carried out on nested PCR products using Applied Biosystems (ABI) BigDye terminator v3.1 chemistry and the sequences resolved on an ABI3700 DNA Analyser. Polymorphisms were identified using the Staden Package[46] and double-scored by a second operator.

Statistical analysis

The multilocus test has been described in detail elsewhere[9, 28, 29, 47]. Briefly, for the case-control data, the multilocus test is essentially Hotelling's T²[48, 49], in which we score each diallelic locus as 0, 1 or 2 and compare the mean score vectors between cases and controls. In the case of the family data, the multilocus test takes the form of a multilocus TDT[28], in which, for each parent, we calculate a vector whose elements describe transmissions of each of the tag SNPs. If the parent is homozygous at a locus, the corresponding element is scored as zero; otherwise it is scored as either +1 or -1 depending on which allele was transmitted. The multilocus test tests the mean of this vector against zero; it is asymptotically distributed as a χ² with degrees of freedom (df) equal to the number of tag SNPs[28, 47].
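The family-based test can be sketched in a few lines. The version below assumes one common form of the statistic: the summed transmission vector U is tested against zero using the sum of outer products of the per-parent vectors as its variance estimate, giving U' V⁻¹ U. This form is an assumption made for illustration and may differ in detail from the published Stata implementation; the function names are hypothetical, and V is assumed to be non-singular.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def multilocus_tdt(vectors):
    """Multilocus TDT statistic and its degrees of freedom.

    vectors: one list per parent, with one element per tag SNP:
             0 if the parent is homozygous at that locus, otherwise
             +1 or -1 depending on which allele was transmitted.
    """
    k = len(vectors[0])
    # U: summed transmission scores per locus.
    u = [sum(v[j] for v in vectors) for j in range(k)]
    # V: variance estimate under the null (mean zero), as a sum of outer products.
    var = [[sum(v[r] * v[c] for v in vectors) for c in range(k)]
           for r in range(k)]
    x = solve(var, u)                         # V^-1 U
    stat = sum(ui * xi for ui, xi in zip(u, x))
    return stat, k                            # compare with chi-square on k df


# One locus: 30 transmissions of one allele, 20 of the other, 10 homozygous parents.
single = [[1]] * 30 + [[-1]] * 20 + [[0]] * 10
print(multilocus_tdt(single))
```

With a single locus the statistic reduces to (b - c)² / (b + c), the classic TDT, where b and c are the transmission counts of the two alleles from heterozygous parents.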

The programs for the selection of tag SNPs[28] and the association analysis used here are implemented in the Stata statistical system and may be downloaded from our website[50].

Genotyping

Genotyping was performed using Taqman MGB (Applied Biosystems Inc, Foster City, CA)[51]. All genotyping data were double-scored to minimize error. All genotyping data were in Hardy-Weinberg equilibrium (P > 0.05). Genotyping failure rates for all assays in both the family and case-control collection were ≤ 6%.
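A Hardy-Weinberg check of the kind reported above can be illustrated with a one-degree-of-freedom chi-square goodness-of-fit test. The text does not state which test was used, so this is only one common choice; the function name is hypothetical, and the marker is assumed to be polymorphic (a monomorphic marker would give zero expected counts).

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium at a diallelic SNP.

    n_aa, n_ab, n_bb: observed genotype counts.
    Returns (chi-square statistic, P-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)           # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = hwe_chi2(25, 50, 25)
print(chi2, p_value)  # counts match expectation exactly: 0.0 1.0
```

A marker would pass the criterion used here when the returned P-value exceeds 0.05.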