Introduction

Rheumatoid arthritis (RA) is the most frequent chronic inflammatory rheumatic disease in the world, with prevalence estimates of 0.25% to 0.5%. Its pathogenesis is multifactorial, and genetic factors may account for 40% to 60% of the total risk [1]. Among possible genetic factors, the HLA-DRB1 gene is clearly associated with RA [2]. This association was first suggested more than 30 years ago [3] and was elaborated 10 years later by Gregersen and colleagues [4], who demonstrated that RA was associated with several HLA-DRB1 alleles (DRB1*0101, DRB1*0102, DRB1*0401, DRB1*0404, DRB1*0405, DRB1*0408, DRB1*1001, and DRB1*1402) encoding the RAA sequence of amino acids at positions 72 to 74 in the third hypervariable region of the DRβ1 chain, known as the shared epitope (SE). Despite significant improvements in molecular biology techniques, the mechanisms underlying the association between HLA-DRB1*SE+ alleles and RA remain debated, and authors have demonstrated that the SE alleles do not all confer the same risk [4-6].

In a more recent study, Tezenas du Montcel and colleagues [8] advanced a new classification of HLA-DRB1 alleles, reconsidering the SE model in RA susceptibility. According to this new classification, the susceptibility to RA, which depends on whether the RAA sequence occupies positions 72 to 74, was modulated by the amino acids at positions 70 and 71, which led to the definition of five groups of HLA-DRB1 alleles: S1, S2, S3P, S3D, and X alleles. Michou and colleagues [9] tested and validated this new classification in an independent sample of 100 French Caucasoid RA trio families, providing estimates for the susceptibility risk genotypes. In the present study, we used worldwide RA samples from the 13th International Histocompatibility Working Group (IHWG) to investigate the relevance of this new HLA-DRB1 allele classification in terms of RA susceptibility across various Caucasoid and non-Caucasoid population samples.

Materials and methods

Selection process of case and control population samples

RA cases and healthy controls included in the present study were selected from a population of 2,376 individuals (1,210 cases and 1,166 controls), initially gathered by 19 laboratories in 17 countries in the framework of the 13th IHWG. The data are publicly available from the dbMHC (Major Histocompatibility Complex database) website of the National Center for Biotechnology Information (Bethesda, MD, USA) [7]. All RA cases met the following criteria: adult-onset RA (by definition, onset at 16 years of age or older) and the American College of Rheumatology criteria for RA [8]. For each laboratory, healthy controls were selected within the same geographical area as the RA cases. A selection procedure was applied to cases and controls to allow comparison of the data from the different laboratories that participated in the 13th IHWG: (a) cases and controls of undocumented origin were excluded, (b) samples consisting of cases without controls and samples of fewer than 20 individuals were discarded, and (c) cases and controls that were matched beforehand for specific HLA-DRB1-HLA-DQB1 haplotypes were excluded. Data from different submitters consisting of individuals of the same origin were pooled when no significant departure was found by an admixture test, whose statistic asymptotically follows a chi-square distribution with 1 degree of freedom. According to this selection procedure, 758 cases and 789 controls, drawn from 10 subsamples of different ethnic origins, were included in the present study (Table 1).

Table 1 Composition of the selected rheumatoid arthritis case and control population samples

HLA-DRB1 genotyping

All RA cases and controls were genotyped for HLA-DRB1. HLA-DRB1 typing techniques used in the framework of the 13th IHWG are described extensively in the 13th IHWG Proceedings [9, 10].

HLA-DRB1 classification

HLA-DRB1 alleles were divided into five groups according to the classification proposed by Tezenas du Montcel and colleagues [11, 12]. Briefly, the HLA-DRB1 alleles were first divided into two groups according to the presence or absence of the RAA sequence at positions 72 to 74 and were denoted S and X alleles, respectively. The S alleles were subsequently divided into three groups according to the amino acid (alanine [A], glutamic acid [E], lysine [K], or arginine [R]) at position 71: S1 for ARAA and ERAA, S2 for KRAA, and S3 for RRAA. Since an aspartic acid (D) at position 70 was reported to be protective against RA in contrast to a glutamine (Q) or an arginine (R) at the same position [13], two additional groups were defined: S3D for DRRAA and S3P for QRRAA or RRRAA [11, 12].
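The grouping rules described above can be sketched as a small function. This is only an illustrative sketch, not the authors' software: the function name `classify_drb1` and the convention of passing the five residues at positions 70 to 74 as a one-letter string are assumptions made here, and motifs not covered by the rules in the text are rejected rather than guessed.

```python
def classify_drb1(motif_70_74: str) -> str:
    """Assign an HLA-DRB1 allele to S1, S2, S3P, S3D, or X.

    `motif_70_74` is a hypothetical input convention: the one-letter
    amino acid codes at positions 70, 71, 72, 73, and 74, in that order.
    """
    motif = motif_70_74.upper()
    if len(motif) != 5:
        raise ValueError("expected exactly five residues (positions 70 to 74)")

    pos70, pos71, pos72_74 = motif[0], motif[1], motif[2:]

    # X alleles: RAA sequence absent at positions 72 to 74.
    if pos72_74 != "RAA":
        return "X"

    # S alleles are split on the residue at position 71.
    if pos71 in ("A", "E"):
        return "S1"
    if pos71 == "K":
        return "S2"
    if pos71 == "R":
        # S3 alleles are further split on the residue at position 70.
        if pos70 == "D":
            return "S3D"
        if pos70 in ("Q", "R"):
            return "S3P"

    # The text does not define a group for any other combination.
    raise ValueError(f"motif {motif!r} is not covered by the classification")
```

For example, under this convention `classify_drb1("QKRAA")` falls in the S2 group and `classify_drb1("DRRAA")` in the S3D group.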

Statistical analysis

To identify associations with RA susceptibility, odds ratios (ORs) were calculated for the presence of the S1, S2, S3P, S3D, and X alleles. Confidence intervals (CIs) are given at the 95% level. Consistent with previous findings [7-9] and with the main objective of this work (which is to challenge these previous findings in various Caucasoid and non-Caucasoid populations), we performed the whole analysis under a dominant effect model, comparing carrier frequencies for the different HLA-DRB1 allele groups defined by the classification between RA patients and controls across the 10 Caucasoid and non-Caucasoid samples.
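As an illustration of the carrier comparison under a dominant model, the sketch below computes an OR and a 95% CI from a 2 × 2 table of carriers and noncarriers. It assumes the usual log-OR (Woolf) standard error, which the text does not specify; the function names and the counts in the usage note are hypothetical and are not data from this study.

```python
import math


def is_carrier(genotype, group):
    """Dominant model: an individual is a carrier if at least one of
    the two alleles belongs to the allele group of interest."""
    return group in genotype


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and confidence interval for one 2 x 2 table.

    a: carrier cases        b: noncarrier cases
    c: carrier controls     d: noncarrier controls
    z: normal quantile (1.96 gives an approximate 95% CI)

    Assumption: Woolf's method (standard error of the log OR).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With hypothetical counts of 40 carriers among 100 cases and 20 carriers among 100 controls, `odds_ratio_ci(40, 60, 20, 80)` gives an OR of 8/3 with an interval lying entirely above 1.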

We used a meta-analysis approach to combine the data from the different laboratories that participated in the 13th IHWG. The Mantel-Haenszel method assumes a fixed effect and combines studies using a method similar to inverse variance approaches to determine the weight given to each study. It provides a common OR estimate and its 95% CI, taking into account the weight of the different samples. ORs and 95% CIs are shown on forest plots for each allele group studied. Statistical heterogeneity of the considered samples was assessed on the basis of the Q test (chi-square), using a significance level of 0.05, and reported with the I² statistic (in which high values indicate high heterogeneity). An I² value of greater than 50% was considered the threshold for heterogeneity. Genotype risk analyses were conducted using the same method. All computations were performed using the RevMan 4.2.8 software package developed by the Nordic Cochrane Center (Copenhagen, Denmark) [14] and Stata version 7.0 software (StataCorp LP, College Station, TX, USA). All p values were two-sided. P values of less than 0.05 were considered significant, and corrections for multiple testing are mentioned when relevant.
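The pooling and heterogeneity steps can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of the RevMan or Stata computations: the function names are hypothetical, Q is computed here with inverse-variance weights around the Mantel-Haenszel pooled log OR (one common formulation), and the CI of the pooled OR is not included.

```python
import math


def mantel_haenszel_or(tables):
    """Fixed-effect Mantel-Haenszel pooled odds ratio.

    `tables` is a list of (a, b, c, d) tuples, one per sample:
    a: carrier cases        b: noncarrier cases
    c: carrier controls     d: noncarrier controls
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


def heterogeneity(tables):
    """Cochran's Q and the I² statistic (as a percentage).

    Assumption: each sample's log OR is weighted by the inverse of its
    Woolf variance and compared with the pooled Mantel-Haenszel log OR.
    All cells must be positive.
    """
    pooled_log_or = math.log(mantel_haenszel_or(tables))
    q = 0.0
    for a, b, c, d in tables:
        log_or = math.log((a * d) / (b * c))
        variance = 1 / a + 1 / b + 1 / c + 1 / d
        q += (log_or - pooled_log_or) ** 2 / variance
    df = len(tables) - 1
    # I² = (Q - df) / Q, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2
```

Two samples with identical tables give Q close to 0 and I² = 0%, whereas samples with opposite effects give a large Q and an I² above the 50% threshold used in this study.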

Results

Carrier frequencies of the different HLA-DRB1 allele groups in RA cases and controls for the various Caucasoid and non-Caucasoid population samples

Figure 1 shows the carrier frequencies for the different HLA-DRB1 allele groups, as defined according to the classification developed by Tezenas du Montcel and colleagues [11], in cases and controls of each sample selected from the 13th IHWG. No significant departures from Hardy-Weinberg equilibrium were observed (all p > 0.05 after correction for multiple testing). Statistical testing for heterogeneity in the X allele group revealed a significant difference between samples (I² = 62.9%, p = 4 × 10⁻³). No significant heterogeneity could be observed for the S1 (I² = 0%, p = 0.57), S2 (I² = 15.9%, p = 0.30), S3P (I² = 19.5%, p = 0.27), or S3D (I² = 23.6%, p = 0.23) groups of HLA-DRB1 alleles.

Figure 1

Carrier frequency comparisons of the different HLA-DRB1 allele groups between rheumatoid arthritis (RA) cases and controls across the various Caucasoid and non-Caucasoid population samples and overall effect estimation. This figure provides a summary meta-analysis of allele carrier frequencies according to the HLA-DRB1 allele classification, in samples selected from the data available from the 13th International Histocompatibility Working Group on Rheumatoid Arthritis. For each population sample, odds ratios (ORs) and 95% confidence intervals (95% CIs) evaluate the significance of the association between the different HLA-DRB1 allele groups and RA susceptibility (blue boxes). The combined ORs and 95% CIs (black diamonds) evaluate the significance of the global effect of the different HLA-DRB1 allele groups on RA susceptibility over all population samples. P values were calculated with the Mantel-Haenszel method.

Carrier frequency comparisons of the different HLA-DRB1 allele groups between RA cases and controls across the various Caucasoid and non-Caucasoid population samples and overall effect estimation

Results of allele carrier frequency comparisons between RA cases and controls across the various Caucasoid and non-Caucasoid population samples are presented in Figure 1. An overall positive association with RA susceptibility was found for S2 alleles (OR 2.15, 95% CI 1.54 to 3.00; p < 10⁻⁵) and S3P alleles (OR 2.74, 95% CI 2.01 to 3.74; p < 10⁻⁵). An overall negative association with RA susceptibility was found for S1 alleles (OR 0.60, 95% CI 0.48 to 0.76; p < 10⁻⁴) and X alleles (OR 0.58, 95% CI 0.39 to 0.84; p = 4 × 10⁻³). No significant association with RA susceptibility was found for the S3D group of alleles (OR 0.89, 95% CI 0.69 to 1.14; p = 0.88). In such an analysis, a potential bias may be introduced by the presence of alleles with an opposing effect in the control group. For example, in the analysis of the S2 effect, the association may be overestimated owing to the presence of S3D carriers in the control group (noncarriers of S2). Similarly, the effect of S2 may be underestimated owing to the presence of S3P carriers. After controlling for the opposing effects of S3D and S1 in the analysis of S2, the association with RA susceptibility remained significant (p < 0.05).

Carrier frequency comparisons of the different HLA-DRB1 allele groups between RA cases and controls in Caucasoid and non-Caucasoid population samples

Results of allele carrier frequency comparisons between RA cases and controls in Caucasoid and non-Caucasoid population samples are presented in Table 2. In the Caucasoid population sample, S2 alleles (OR 2.61, 95% CI 1.87 to 3.64) and S3P alleles (OR 1.86, 95% CI 1.39 to 2.49) were positively associated with RA susceptibility, whereas S1 alleles (OR 0.59, 95% CI 0.45 to 0.79) and X alleles (OR 0.74, 95% CI 0.56 to 0.96) were negatively associated with RA susceptibility. In the non-Caucasoid population sample, S3P alleles (OR 2.93, 95% CI 2.21 to 4.04) were positively associated with RA susceptibility, whereas S1 alleles (OR 0.52, 95% CI 0.37 to 0.71) and X alleles (OR 0.61, 95% CI 0.45 to 0.83) were negatively associated with RA susceptibility.

Table 2 Carrier frequency comparisons of the different HLA-DRB1 allele groups between rheumatoid arthritis cases and controls in Caucasoid and non-Caucasoid population samples

Overall effect estimation of genotypes resulting from the classification of HLA-DRB1 alleles on RA susceptibility

Using the approach proposed by Michou and colleagues [9], we further pooled the three low-risk allele groups (S1, S3D, and X), producing a new grouping called L alleles. In subsequent analyses, we therefore considered only three allele groups (S2, S3P, and L alleles), with six corresponding genotypes [12]. The observed genotype distributions and the genotype relative risks are shown in Table 3. S2/S3P and S3P/S3P were associated with the greatest risks for RA, with ORs (95% CIs) of 7.25 (3.26 to 16.14) and 5.15 (2.91 to 9.12), respectively. These were followed by S2/S2, S2/L, and S3P/L, with ORs (95% CIs) of 4.95 (2.20 to 11.18), 2.41 (1.60 to 3.65), and 2.33 (1.57 to 3.45), respectively. These analyses were all performed using the L/L genotype as reference.
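The pooling of low-risk groups and the resulting six genotypes can be sketched as below. The names (`pool`, `genotype`, `ALL_GENOTYPES`) and the display order S2, S3P, L are illustrative assumptions, not part of the original analysis.

```python
from itertools import combinations_with_replacement

# The three low-risk groups pooled into the L group.
LOW_RISK_GROUPS = {"S1", "S3D", "X"}

# Display order for genotype labels (an arbitrary convention chosen here).
GROUP_ORDER = ["S2", "S3P", "L"]


def pool(group):
    """Map an allele group to S2, S3P, or the pooled low-risk group L."""
    return "L" if group in LOW_RISK_GROUPS else group


def genotype(group1, group2):
    """Unordered genotype label for the allele groups of an individual.

    Raises ValueError if a group is not one of the five defined groups.
    """
    pair = sorted((pool(group1), pool(group2)), key=GROUP_ORDER.index)
    return "/".join(pair)


# Three groups taken two at a time, with repetition: six genotypes.
ALL_GENOTYPES = [
    "/".join(pair) for pair in combinations_with_replacement(GROUP_ORDER, 2)
]
```

Here an individual carrying one S1 and one S2 allele is labeled `S2/L`, and one carrying X and S3D alleles falls in the reference genotype `L/L`.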

Table 3 Overall effect estimation of genotypes resulting from the classification of HLA-DRB1 alleles on rheumatoid arthritis susceptibility

Discussion

In the present association study, we investigated the relevance of the classification of HLA-DRB1 alleles proposed by Tezenas du Montcel and colleagues [11] regarding susceptibility to RA, across various Caucasoid and non-Caucasoid population samples, using publicly available data from the 13th IHWG RA studies. Across these various population samples, our approach strengthens the relevance of this classification, exhibiting an overall positive association with RA susceptibility for S2 and S3P alleles and an overall negative association with RA susceptibility for S1 and X alleles. The genotype analysis performed in the present study fits with the genotype risk hierarchy previously reported in Caucasoid RA sporadic cases [11] and families [12].

The present combined analysis included 10 samples from different genetic backgrounds. Although we did not observe significant heterogeneity for the S1, S2, S3D, and S3P allele groups, we observed significant heterogeneity for the X allele group across the different population samples. The fixed effect model of the Mantel-Haenszel method, used for the overall effect analysis of the HLA-DRB1 allele and genotype groups on RA susceptibility in the present study, assumes that each allele group exerts a homogeneous effect on RA susceptibility across the various Caucasoid and non-Caucasoid samples. The heterogeneity observed for the X allele group may be examined in light of the heterogeneity of the HLA-DRB1 allele and genotype groups at two levels across the different population samples: the effect level and the frequency level. Our data suggest that there is a differential effect of the S1, S2, S3D, and S3P allele groups on RA susceptibility. Each of these effects seems homogeneous across the various population samples. Because the SE allele distribution varies across these populations, the resulting effect of the X allele group on RA susceptibility depends both on the frequencies of the S1, S2, S3D, and S3P allele groups and on their respective effects on RA susceptibility, which might explain the observed heterogeneity of the effect of the X allele group in our study.

The contribution of SE alleles to RA susceptibility has been confirmed by numerous studies in different populations. For example, a recent meta-analysis of Latin American RA patients has shown the important role played by SE in RA susceptibility [15]. However, RA prevalence studies have shown differences in frequency estimates between populations with different genetic backgrounds. The highest prevalence rates have been found in Native American populations, with estimates ranging from 32 to 48 per 1,000 men and from 59 to 70 per 1,000 women. In Afro-Caribbean people who live in the UK, RA prevalence appeared to be lower than that in the general population. In urban African populations, RA prevalence was estimated at around 10 per 1,000 and was found to be significantly higher than in rural populations. Studies of Chinese populations have reported lower prevalence estimates than those of European ones. Molokhia and McKeigue previously pointed out the difficulty posed by admixture in investigating the etiology of rheumatic diseases, notably RA [16]. The significant variations observed in the incidence and prevalence of RA among different populations or ethnic groups could be explained, in part, by genetic variations in the HLA region, especially variations in the prevalence of SE in different populations [17, 18]. In addition, as variations in environmental exposure between the population samples studied were not considered, the heterogeneity could be explained by the differing impact of environmental factors on RA susceptibility in each sample, such as nutrition, as previously suggested in particular for the Greek population [18, 19]. In addition to nutrition, environmental factors such as exposure to cigarette smoking [20, 21] or individual factors such as gender [22] may influence susceptibility to RA by interacting with genetic factors such as HLA-DRB1.

The classification proposed by Tezenas du Montcel and colleagues [11], based on the amino acid sequence at positions 70 to 74, does not aim to account for all previously reported associations between particular HLA-DRB1 alleles and RA susceptibility in specific ethnic backgrounds. For example, the previously reported association between the HLA-DRB1*0901 allele and RA susceptibility in East Asian populations could not be tested in the present study, as this particular allele was classified together with many others as an X allele [23-25]. The high frequency of the HLA-DRB1*0901 allele in the Javanese population could contribute both to the association found between X alleles and susceptibility to RA in this particular population sample and to the observed heterogeneity of the X allele group.

The contribution of the HLA-DRB1 allele classification in accounting for the genetic contribution of the HLA-DRB1 gene was previously analyzed in terms of RA severity and in terms of autoantibody production, such as anti-cyclic citrullinated peptide (anti-CCP) antibodies and anti-deiminated human fibrinogen autoantibodies. As neither RA severity outcomes nor anti-CCP information were collected in the framework of the 13th IHWG, we were not able to assess the relevance of the classification of HLA-DRB1 alleles proposed by Tezenas du Montcel and colleagues [8] regarding RA severity or autoantibody production in the various Caucasoid and non-Caucasoid population samples included in the present study.

Conclusion

Across these various samples coming from both Caucasoid and non-Caucasoid populations, we investigated the relevance of the classification of HLA-DRB1 alleles proposed by Tezenas du Montcel and colleagues [11] regarding susceptibility to RA. We confirm previous findings on the contribution of the S2 and S3P risk allele groups to RA susceptibility. In spite of the small sample size in some ethnic groups, the present study allows the differentiation between predisposing and protective HLA-DRB1 SE alleles in both Caucasoid and non-Caucasoid RA patients.

This report also emphasizes the crucial importance of the public release of large-scale study data in genetic epidemiology. The need for large samples to refine the study of effects of modest magnitude and the necessity to replicate studies across different ethnic backgrounds rely on easy access to a large variety of data organized in a systematic way. After an initial period of restricted use of the data by the initial investigators, access to anonymized individual clinical and genetic data should be made possible; this is the current policy of the National Institutes of Health (Bethesda, MD, USA) for genome-wide association study results [26]. Combined with a detailed description of the sampling scheme for both patients and controls, advanced statistical analysis will help enhance secondary uses of data, adding value to the efforts of previously completed studies [27].