Introduction

Systemic lupus erythematosus (SLE) is an autoimmune disease with an unclear etiology. It usually presents with a diverse spectrum of clinical manifestation spanning from malar rash to kidney injury. Genetic factors explain about 50% to 60% of the disease etiology [1]. The concordance rate for SLE is much higher in monozygotic (25% to 70%) than in dizygotic (2% to 9%) twins [2,3], indicating the importance of genetic contributions.

In genome-wide association studies (GWAS) on SLE, researchers have identified more than 40 loci associated with the disease [4-16]. Many regions, such as STAT4, BLK and IRF5, were found to modulate risk of multiple diseases, although the causal variant(s) may not be shared by different diseases [17]. Cotsapas et al. [17] estimated that 44% of single-nucleotide polymorphisms (SNPs) associated with one autoimmune disease might also be associated with another one. These observations support the hypothesis that autoimmune diseases may share common genetic basis.

The 22q11.21 genomic region was found to be associated with multiple autoimmune diseases, including SLE [14,18], systemic sclerosis (SSc) [19], Crohn’s disease (CD) [20], celiac disease (CeD) and rheumatoid arthritis (RA) [21], psoriasis (PS) [22] and inflammatory bowel disease (IBD) [23]. SNP rs5754217-A (denoting risk allele A of rs5754217), located in the intron of UBE2L3, showed a suggestive association with SLE in women of European ancestry (P = 7.53 × 10−8) [18]. SNP rs463426-A and rs131654-A, located downstream of HIC2 and upstream of UBE2L3, was identified as susceptibility variants with SLE in a Han Chinese population (P = 1.48 × 10−16 and 2.99 × 10−16, respectively) [14]. SNP rs2298428-T, which is a missense variant in YDJC, was significantly enriched in diffuse SSc, though no independent study showed the variant reaching GWAS significance (P =0.017) [19]. SNP rs2298428-T was also suggested to be associated with CD (P = 5.22 × 10−5) [20]. The same SNP was also reported to be associated with both CeD and RA (rs2298428-T, P = 2.5 × 10−10) [21]. Another SNP, rs181359-A, located in intron 2 of UBE2L3, was established as associated with PS and CD (P = 8.02 × 10−10, P = 6.3 × 10−13, respectively) [22]. SNP rs2266959-T, also located in intron 2 of UBE2L3, was also robustly associated with IBD (P = 1 × 10−16) [23]. Despite the various reports on this region, the details of association and potential independent effects are unclear. Further elucidation of the signals in this region may help improve understanding of the shared etiological basis among autoimmune and inflammatory diseases.

In the present study, we examined the association for SLE in the 22q11.21 region and further replicated the association of SNP rs2298428 in a total of 4,271 cases and 5,721 controls of Asian ancestry. To that end, our results confirmed rs2298428 as the most significant SNP associated with SLE in this region. Meanwhile, the risk allele of this SNP is highly correlated with higher expression of UBE2L3 in different cell lines.

Methods

Study participants

The samples included in the present study were collected from Hong Kong and Anhui, China, and from Bangkok, Thailand (Additional file 1). All the cases fulfilled the revised criteria of the American College of Rheumatology for diagnosis of SLE. Cases from Hong Kong were recruited from five hospitals in Hong Kong: Queen Mary Hospital, Tuen Mun Hospital, Queen Elizabeth Hospital, Pamela Youde Nethersole Eastern Hospital and Princess Margaret Hospital (HK_GWAS and HK_REP). Clinical records were well documented with autoantibody profiles and subphenotypes. Controls from Hong Kong were individuals from other GWAS studies who had no overlapping manifestations with SLE in the discovery stage (HK_GWAS). Cases from Anhui were patients visiting the Department of Rheumatology at Anhui Provincial Hospital and the First Affiliated Hospital of Anhui Medical University in Hefei, Anhui province (AH_REP), with corresponding controls from healthy blood donors in Anhui (AH_REP). The cases for the Anhui GWAS (AH_GWAS) were recruited from several hospitals in central and southern China, and the controls were carefully selected with geographically matched and clinically unrelated individuals (AH_GWAS). The cases from Thailand were patients at King Chulalongkorn Memorial Hospital (TH_REP), and geographically matched healthy donors were used as controls (TH_REP). All the individuals involved in the present study gave informed consent. The study conducted in Hong Kong was approved by the institutional review board of the University of Hong Kong/Hospital Authority Hong Kong West Cluster. The study in Anhui was approved by the institutional review board of Anhui Medical University. The Thailand study was approved by the intuitional review board of the Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand.

Imputation

IMPUTE2.3.1 was used to perform imputation on the Hong Kong and Anhui data by using all the samples from 1000 Genome Project (released in September 2013). SNPs that violated Hardy-Weinberg equilibrium (HWE) and SNPs with minor allele frequency <0.05% were removed for further analysis.

Genotyping

The GWAS on the Hong Kong and Anhui cohorts were conducted using the Illumina Human610-Quad BeadChip array (Illumina, San Diego, CA, USA), as previously reported (HK_GWAS and AH_GWAS). Further replication of the candidate SNPs was performed by using the TaqMan genotyping method (Life Technologies, Carlsbad, CA, USA) with the remaining samples from the Hong Kong cohort that were not included in the discovery stage (HK_REP); samples collected from Bangkok, Thailand (TH_REP); and samples from an independent Anhui cohort (AH_REP). Genotyping concordance between Illumina Human610-Quad BeadChip and TaqMan SNP genotyping method was also checked on randomly selected samples, and the two methods showed complete concordance.

Association analysis

We used inverse variance method for the meta-analysis installed in METAL [24]. Joint analysis of association was conducted using the Cochran-Mantel-Haenszel (CMH) test, taking into account the effect of SNP differences between cohorts. The homogeneity of the effect size between different cohorts and different stages of the study was evaluated by using the Breslow-Day test (P_het in Table 1), both installed in PLINK 1.07.

Table 1 Association results of single-nucleotide polymorphism rs2298428 from each cohort and joint analysis a

Stepwise logistic regression was performed using IBM SPSS 16.0 software (IBM, Armonk, NY, USA). Tests of independent contributions toward disease associations for SNPs in a single locus were done using logistic regression, adjusting for the effect of a specific SNP in the same locus, while also taking into account differences among cohorts. SNPTEST v2.2.0 was used to perform the logistic regression tests in this study [25]. Linkage disequilibrium (LD) patterns and values were obtained using Haploview [26].

Results

Imputation and meta-analysis of two genome-wide association studies on Han Chinese populations from Hong Kong and Anhui

First, imputation was performed using IMPUTE2 [27] on two GWAS on Han Chinese populations. Association analysis was performed using SNPTEST v2.2.0, taking the genotype uncertainty into account. Meta-analysis was performed using METAL [24] with the inverse variance-based model. We examined the meta-analysis results in 22q11.21 and observed a total of 4,834 SNPs in this 1.9-Mb region, from 20,220,110 to 22,131,990 bp (GRCh37/hg19). On the basis of meta-analysis P-values (P_meta), 121 SNPs showed suggestive associations (P_meta <0.0001), aggregating in a 187-kb region (Additional file 2). SNP rs2298428 showed the most significant P-value (P_meta =2.70E-09). Of the 96 SNPs, 25 SNPs had P_meta-values reaching genome-wide significance (5E-08), including SNP rs2298428. The other 24 SNPs all had high LD with rs2298428 (r 2 > 0.9).

Linkage disequilibrium pattern of single-nucleotide polymorphisms in 22q11.21 associated with immune-related diseases

For the purpose of finding potentially shared susceptible variants and/or causal variants between SLE and other immune-related diseases, we focused on SNP rs2298428, which showed the most significant P-values for SLE in our meta-analysis results, and other reported SNPs in this region that showed association with SLE (rs5754217, rs463426 and rs131654) [14,18], CD and PS (rs181359) [22], and IBD (rs2266959). SNP rs2298428 was also reported to be associated with other immune-related diseases, including SSc [19], CeD and RA [21]. As shown in Table 2, all six SNPs reported for different diseases showed strong evidence of association with SLE in Asian populations (2.70E-09 ≤ − ≤ 7.27E-05). The LD patterns of these six SNPs are shown in Figure 1, based on different populations from HapMap data including Han Chinese, Beijing, population and Utah residents with ancestry from northern and western Europe (CHB and CEU, respectively), Hong Kong (HK) and Anhui (AH). The LD patterns in CHB, HK and AH populations were similar. In these three Chinese populations, all but rs463426 showed moderate to high LD between each other (r 2 > 0.5). We also compared the LD patterns between the Chinese populations and CEU (Figure 1). In Caucasians, SNPs rs463426 and rs131654 showed minimal LD with all the other SNPs (r 2 < 0.2). The other four SNPs showed high LD with each other (r 2 > 0.95).

Table 2 Meta-analysis results for single-nucleotide polymorphisms reported in previous genome-wide association studies on other immune-related diseases in the 22q11.21 region a
Figure 1
figure 1

The linkage disequilibrium patterns of the susceptibility single-nucleotide polymorphisms for autoimmune diseases in different populations. AH, Anhui; CeD, Celiac disease; CEU, Utah residents with ancestry from northern and western Europe; CHB, Han Chinese, Beijing; HK, Hong Kong; IBD, Inflammatory bowel disease; PS, Psoriasis; RA, Rheumatoid arthritis; SLE, Systemic lupus erythematosus.

Independence test on the single-nucleotide polymorphisms associated with immune-related diseases

A stepwise logistic regression analysis was performed to test the independence of these SNPs (Table 3). The method begins with an empty model, to which the variables were added one at a time. The analysis showed that SNP rs2298428 exhibited the strongest, and the only, significant association with SLE. Further addition of any other SNPs involved did not show significant improvement of the model, which is partially explained by the high LD among most of these SNPs. However, the SNPs with moderate LD did not show significant improvement in the model, demonstrating a lack of evidence of further independent signals of association for SLE in this region.

Table 3 Independent effects among the single-nucleotide polymorphisms in the 22q11.21 region a

Conditional logistic regression analysis was performed to investigate the independent effects among these replicated SNPs with immune-related diseases (Table 4). SNP rs2298428 remained significant when the effect of any other SNPs was accounted for, except for SNP rs2266959. More intuitively, the association P-values of all SNPs before and after the effect of rs2298428 was adjusted for are shown in Figure 2.

Table 4 Conditional logistic regression analysis a
Figure 2
figure 2

Association P -values of all single-nucleotide polymorphisms before and after adjustment for the effect of rs2298428. The graph depicts the association P -values of all single-nucleotide polymorphisms (SNPs) before (the gradient colors of dots reflects the linkage disequilibrium (LD) between the SNPs and SNP rs2298428, using hg19/1000 Genomes March 2012 ASN (Asian) as a reference) and after (gray) adjustment for the effect of rs2298428. The arrows show the reduction of the SNPs after the association adjusted by the effect of rs2298428.

Replication on single-nucleotide polymorphism rs2298428

Replication for SNP rs2298428 was performed by using a TaqMan SNP genotyping method (Life Technologies) on the Hong Kong cohort independent from Hong Kong samples genotyped in the GWAS stage; samples collected from Bangkok, Thailand; and samples from Anhui, China, which were independent from the Anhui GWAS cohort. As shown in Table 5, replications in different cohorts showed consistent results as those from the discovery stage (Table 1). The joint analysis of association, taking into account the effect of SNP differences among cohorts from Hong Kong, Anhui and Thailand from both discovery stage and replication stage, was conducted using the Cochran-Mantel-Haenszel test. SNP rs2298428 showed stronger evidence of association with SLE (P =1.31E-11), and testing of between-population heterogeneity of odds ratios by the Breslow-Day test did not show significant differences among the cohorts (P_het =0.26).

Table 5 The expression quantitative trait loci evidence on the single-nucleotide polymorphism rs2298428 on systemic lupus erythematosus and other immune-related diseases a

Expression quantitative trait loci in this region

Expression quantitative trait loci (eQTL) associations between SNP rs2298428 and UBE2L3 and other genes in this region were closely examined. Two datasets, from Stranger et al. [28] and Fairfax et al. [29], were investigated. In the first study, the researchers examined the correlation of SNPs to gene expression using the lymphoblastoid cell lines (LCLs) of 726 individuals in 8 cohorts from the HapMap3 project. In the second study, the investigators assessed the correlation of SNPs to gene expression in a cell-specific manner using 288 paired, purified primary monocytes and B cells from Caucasians. As shown in Table 5, the genotype of SNP rs2298428 correlated with expression of UBE2L3 in five different populations. (The other three cohorts did not show significant correlation, possibly due to the low frequency of the alternative allele, thus lower power.) The genotypes of the SNP also significantly correlated with the expression of UBE2L3 in B cells and monocytes. The results consistently demonstrated that the risk allele T from rs2298428 is correlated with higher expression of UBE2L3.

The gene expression pattern of UBE2L3 was also examined using a publicly available database, NextBio [30] (Additional file 3). Five independent studies reported increased expression of UBE2L3 from patients with lupus compared with healthy controls in different cell lines. In addition, UBE2L3 was found to have increased expression in a number of other autoimmune diseases, including CD, T1D, SS, PS and PA, using the same database, indicating that UBE2L3 might be the key player in disease association of this region.

Discussion

In this study, through meta-analysis of two existing GWAS on Han Chinese populations with a total number of 1,659 cases and 3,398 controls matched geographically, we identified SNP rs2298428 as the SNP with the highest association with SLE in the 22q11.21 region (P_meta =2.70E-09). The association of rs2298428 was further supported by replication in three cohorts from Hong Kong, Anhui and Thailand, and the results improved by two orders of magnitude after joint analysis from the discovery stage and the replication stage (P_all =1.31E-11, odds ratio =1.23).

Many GWAS hits were aggregated in this region for different autoimmune diseases, and here we tried to find out whether all the reported SNPs (six SNPs included) were linked to the same causal variant or whether they were derived from independent signals. All of the SNPs showed strong evidence of association with SLE in the current investigation with P_meta <7.27E-05. Stepwise logistic regression and conditional logistic regression were performed to examine the independence of these SNPs. The results supported the notion that rs2298428 exhibited the strongest association with SLE. According to the LD pattern, SNP rs463426 is relatively independent from the other five SNPs (Figure 1). However, we were unable to find evidence of independence based on the presently reported results (Figure 2). This might be due to the fact that the exploration of independently contributing variants from this region is based mainly on the meta-analysis data, which might not have enough power to detect multiple independent signals. The five SNPs reported for different autoimmune diseases are located in the same LD block in Chinese population, and their association may be derived from the same casual variant. These cross-phenotype associations in this region highlighted the shared genetic involvement in autoimmune diseases.

In addition, to identify how this region might influence susceptibility to SLE and other autoimmune diseases, we investigated the potential biological function of the gene. eQTL analysis is an important approach in detecting functional mechanisms underlying association by testing whether identified variants may lead to variations in mRNA expression of nearby genes. Using publicly available eQTL datasets, the SNPs in the 22q11.21 region were analyzed. All the results pointed to increased expression of UBE2L3 as the mechanism for association with SLE. UBE2L3 encodes a member of the E2 ubiquitin-conjugating enzyme family. This enzyme was demonstrated to participate in the ubiquitination of p53, c-Fos and the nuclear factor κB precursor p105 in vitro [31,32]. There is also evidence showing the interaction between UBE2L3 and RNF125 [33]. RNF125 is reported as a negative regulator of type I interferon (IFN) signaling. It is well known that patients with SLE have elevated serum levels of type I IFN [34] and that these increased levels correlate with disease activity and severity [35]. Among numerous immunologic alterations present in patients with lupus, the type I IFN system is thought to play a pivotal role in pathogenesis [36-38], which points to a possible role of UBE2L3. However, the exact mechanism of UBE2L3 is still not fully understood.

Conclusions

Focusing on the SNPs in 22q11.21 region with strong evidence of being associated with SLE in previous work, we have identified one more novel susceptibility variant showing the most significant genetic contribution for SLE via meta-analysis and further replication in independent cohorts. The putative susceptibility gene, UBE2L3, is suggested to be related to the type I IFN signaling pathway in SLE pathogenesis. Our findings may shed light on the shared biological mechanisms between different diseases with immunological components.