Introduction

Rheumatoid arthritis (RA) is a chronic inflammatory disease resulting primarily in inflammation and destruction of symmetric joints. Persons with RA are often of working age, and the accompanying fatigue significantly affects working capacity. Ongoing joint destruction is, however, seen in more than a third of patients after initiation of a treatment regimen. There is currently no generally accepted way to predict treatment efficacy in individual patients, so medications are prescribed according to consensus recommendations. First-line treatment is typically methotrexate (MTX), an inhibitor of protein and nucleic acid synthesis that leads to inhibition of immune cells (1). Around 30% of RA patients do not respond to MTX and are then prescribed a combination of MTX and a TNF-blocking agent. TNF drives the inflammation within the joint, and blocking it reduces immune cell infiltration and immune-mediated joint destruction (2). About 30% of patients prescribed their first anti-TNF therapy fail to respond, upon which other biologic therapies are prescribed. The current challenge of translational research in this area is to better utilize the treatment options that already exist, in a personalized or stratified manner. Several groups have attempted to use transcriptomics (3–10), genetics (11–13) and proteomics (14), as well as better use of clinical data (15), to predict treatment response, particularly for TNF blockade. Success has been limited, with virtually no findings validated in independent material, and no biomarker for prediction of response is currently used in clinical practice (11). In any such study the collection of relevant biological samples is of key importance, and independent validation of results is necessary for further research.

We therefore set out to compile the COMBINE biobank of samples from RA patients that included global profiling of transcriptomics, genetics, proteomics, flow cytometry and clinical information. With this unique resource, as a first stage we performed a complete quantification of all previously suggested anti-TNF response biomarkers: to investigate how well precision medicine would actually work, given the input of all previous knowledge on RA precision medicine that we have today. To our knowledge this biobank is currently the largest collection of such multi-omics data from RA patients. We present this as an essential guidance in the highly discrepant field of drug response stratification research, as a resource for combining the findings of the many excellent studies already published.

Materials and Methods

Study Design and Sample Collection

The COMBINE biobank was generated after written informed consent from all participants had been obtained according to the Declaration of Helsinki and with approval by the Stockholm (number 2010-351-31-2) and Uppsala (2009-013) Regional Ethics Committees. The key inclusion criteria were patients with rheumatoid arthritis, according to the ACR 1987 or the 2010 ACR/EULAR criteria, who were undergoing change or start of a new treatment regimen at the Rheumatology Clinic, Karolinska University Hospital, Stockholm from February 2011 to May 2013. Our cohort includes three patient groups (Figure 1A): one group of patients with symptom onset no more than 14 months earlier who were initiating MTX treatment (cohort A), one group of early RA patients initiating anti-TNF therapy of any type (cohort B) and one group of patients initiating a second-line biologic treatment, which could be either anti-TNF or other biologics (cohort C). Additionally, healthy controls were recruited from the Swedish Blood Centre service in Uppsala (cohort HC), with age and sex matched as closely as possible to the patient groups. The blood samples were collected at the baseline visit (timepoint 0m) and at a follow-up visit approximately 3 months later (timepoint 3 months; mean 95.3 d, standard deviation (SD) 10.5 d in patient groups, with larger variation in healthy controls). At both visits patients underwent a full clinical evaluation, and data were recorded on number of swollen and tender joints, general health, C-reactive protein (CRP) and erythrocyte sedimentation rate (ESR). From these the disease activity score (DAS28) was calculated, and we have specifically used the DAS28 including CRP levels (DAS28-CRP) in our analysis.
At the clinical visits, information was recorded on smoking status, all current and previous medications and cell counts of all leukocyte subsets, and samples were taken for RNA sequencing of peripheral blood mononuclear cells (PBMC), flow cytometry analysis of PBMC and whole blood, and serum and plasma analyses for protein biomarkers. At baseline, patients were additionally evaluated for anti-citrullinated protein antibody (ACPA) and rheumatoid factor (RF) positivity, and gave DNA samples for microarray genotyping and human leukocyte antigen (HLA) haplotype characterization. A total of 492 blood samples from 246 different individuals were collected.
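
The DAS28-CRP calculation can be sketched as follows. The text does not state the formula, so the coefficients below are an assumption: they are the commonly published DAS28-CRP weights, and the function and parameter names are hypothetical. The sketch is in Python rather than the R used in the study.

```python
import math

def das28_crp(tender_joints, swollen_joints, crp_mg_per_l, general_health_mm):
    """DAS28-CRP from 28-joint counts, CRP (mg/L) and patient global health (0-100 mm scale).

    Assumption: standard published coefficients; not taken from this study.
    """
    return (0.56 * math.sqrt(tender_joints)
            + 0.28 * math.sqrt(swollen_joints)
            + 0.36 * math.log(crp_mg_per_l + 1)
            + 0.014 * general_health_mm
            + 0.96)

def delta_das28_crp(baseline_score, follow_up_score):
    """Change from baseline to follow-up; negative values mean improvement."""
    return follow_up_score - baseline_score
```

With this sign convention, a patient whose score falls from 5.0 to 3.5 has a change of −1.5, which lies below the −1.2 threshold used for response later in the text.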

Figure 1

A) Overview of study design and cohort setup. B) DAS28-CRP score stratified by cohort and visit time, with an assumed value of 0 for the HC cohort. Each dot shows one sample. The Y-axis shows DAS28-CRP level and the X-axis shows cohort membership and time point. Grey lines connect paired time points, and horizontal black bars indicate median value. Color-coding corresponds to Figure 2C. C) Reproducibility of RNA-seq expression levels from technical replicates, in different library preparations with three months separation. Each dot shows the read count of one single gene. The Pearson correlation coefficient of log-transformed counts was 0.991 (P<2e-16). D) Principal components analysis of genome-wide genotype profile and self-reported country of origin of individuals. Each dot shows one patient, color-coded by self-reported country of origin. TNFi: TNF inhibitor, MTX: Methotrexate, Rx naïve: no methotrexate initiation.

Serum Protein Measurements

Proteins were measured using multiple commercial protein biomarker panels. These included 62 analytes in the HumanMAP assay (Myriad RBM), 12 analytes in the VectraDA (Crescendo) as well as 33 autoantibodies using a fluorescence enzyme immunoassay technique from Phadia (Thermo Scientific). These plasma protein analyses were performed at external centers according to the standard protocols of the labs as indicated. The proteins BAFF and CXCL13 were analyzed using R&D Systems ELISA kits and APRIL using an AH Diagnostics ELISA kit. Depending on assay type, between 422 and 451 samples were successfully analyzed.

RNA Purification and Sequencing

RNA was purified by isopropanol extraction from PBMC samples collected using CPT tubes (BD Biosciences). The quality of each RNA sample was characterized using an Agilent Bioanalyser (2100 total RNA pico chip). RNA was sequenced at the Aros Applied Biotechnology center, using an Illumina HiSeq 2000, the TruSeq RNA sample preparation kit and a 2 × 100 base pair (bp) paired-end setup. Samples were randomized to flow cells with a focus on obtaining a balanced distribution of cohort type and drug response magnitude. Mean read depth was 15.7 million read-pairs per sample. Sample quality was inspected using read depth, fastqc quality calls and a check for Y-chromosome expression. Altogether 17 RNA samples were re-sequenced due to quality concerns and finally 2 samples were omitted from the study. A total of 360 samples were successfully RNA sequenced. One of the samples was divided into two aliquots before library preparation to investigate reproducibility of RNA-sequence expression.
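
The reproducibility check on the split sample can be illustrated with a minimal sketch of a Pearson correlation on log-transformed per-gene read counts. The pseudocount of 1 (to handle zero counts), the log base and all function names are assumptions; the study does not state how zero counts were treated.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def replicate_correlation(counts_a, counts_b, pseudocount=1.0):
    """Correlate per-gene read counts of two technical replicates on a log10 scale.

    Assumption: a pseudocount is added before the log so zero counts stay defined.
    """
    log_a = [math.log10(c + pseudocount) for c in counts_a]
    log_b = [math.log10(c + pseudocount) for c in counts_b]
    return pearson(log_a, log_b)
```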

Reads were pre-filtered on quality using the fastqc package (0.10.1), removing adaptors and applying fastqc_quality_trimmer (-q 30, -p 85) and fastqc_quality_filter (-t 30 -L 40). This was followed by tophat2 (2.0.10) alignment to the Ensembl GRCh37 genome together with the -gtf switch with the hg19 UCSC transcripts (16). Gene expression was quantified using htseq followed by TMM-normalization, mean-scaling and log2 transformation as implemented in the edgeR package (3.8.6) (17) as recommended by Dillies et al. (18).

DNA Purification and Microarray Genotyping

Blood was collected in Vacutainer citrate tubes (BD) and DNA purified according to the QIAsymphony 400-ul buffy coat protocol. The DNA was hybridized using the HumanOmniExpress BeadChip Kit and Illumina OmniExpress arrays (12v1). Y-chromosome call rates were compared with recorded sex and found to be mismatched for three samples, which were excluded. Because these sex-concordance failures indicated sample switches, coding genotypes in highly expressed genes were compared with those obtained from RNA-sequence data using the allelicImbalance package (1.4.2) (19). This resulted in an additional 3 samples being omitted due to mismatched genotype calls. Altogether 230 individuals were successfully genotyped. Principal components analysis was performed using the made4 package (1.40.0) on all measured genotypes. The first 10 principal components were investigated; the eighth and ninth showed the best resolution of larger ethnicity groups, the remaining components instead being driven by smaller outliers or sex.

Imputation of non-measured SNPs was performed using the Mach1.0 imputation algorithm (1.0.18.c), with separate phasing in minimac (2012-11-16). The 1000 Genomes Project EUR panel was used as reference (release_v3.20101123). This provided 7.7 million SNPs available for analysis at Rsq quality values above 0.7.

Association Between Biomarkers and DAS28-CRP

The primary outcome variable was defined as the change in DAS28-CRP (ΔDAS28-CRP) from baseline to three months after initiation of treatment with TNF inhibitors. To account for variance in methodology of the included discovery studies, the association between biomarker level at baseline and ΔDAS28-CRP was calculated using 4 different models: one model which used a Student t test between patients with ΔDAS28-CRP values above and below −1.2, and three linear regression models, one without covariates, one with age and sex as covariates and one with age, sex and baseline DAS28-CRP value as covariates. One exception was the protein data in Figure 2C, which was handled using the same split by median values described in Dennis et al. (14). This was followed by a linear regression with the indicated four protein level categories as predictor variable.
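
The four models can be sketched in a few lines. This is a minimal illustration, not the study's R code: it uses only the Python standard library, returns the t statistic and the biomarker regression coefficient without P values, and all function and argument names are hypothetical. Sex is assumed to be coded 0/1.

```python
import math
from statistics import mean, variance

def student_t(group_a, group_b):
    """Pooled-variance (Student) t statistic and its degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df

def ols(y, columns):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        pivot = a[c][c]
        a[c] = [v / pivot for v in a[c]]
        for r in range(k):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [a[i][k] for i in range(k)]

def four_models(biomarker, delta, age, sex, baseline, cutoff=-1.2):
    """Association of a baseline biomarker with delta DAS28-CRP under four models."""
    responders = [b for b, d in zip(biomarker, delta) if d < cutoff]
    non_responders = [b for b, d in zip(biomarker, delta) if d >= cutoff]
    return {
        # Model 1: biomarker level compared between responders and non-responders
        "t_test": student_t(responders, non_responders),
        # Models 2-4: coefficient of the biomarker in each linear regression
        "lm_unadjusted": ols(delta, [biomarker])[1],
        "lm_age_sex": ols(delta, [biomarker, age, sex])[1],
        "lm_age_sex_baseline": ols(delta, [biomarker, age, sex, baseline])[1],
    }
```

Running all four models on the same candidate makes it possible to see whether an association depends on the covariate setup or on the dichotomization at −1.2.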

Figure 2

A) Replication of mRNA as stratification biomarkers for anti-TNF-response in 59 patients from cohort B. Each dot shows one gene, and the Y-axis shows association strength. For each of 7 published studies, all reported genes were investigated for association with anti-TNF-response. Four different models were calculated: three linear regressions with different covariate setups, and one binary response/no-response setup set at ΔDAS28-CRP < −1.2. The strongest covariates reported in the respective studies are indicated by the color code in the legend. B) Comparison with mRNA expression data from Toonen et al. (GSE33377). The X- and Y-axes show the log10(P)-value for association with TNFi response for each measured gene, from the COMBINE and the GSE33377 data, respectively. The compared P-values were calculated based on binary response/no-response. C) Replication of serum protein expression biomarkers sICAM1 and CXCL13 previously reported by Dennis et al. (14). Protein levels were divided by median values for each of the two proteins as indicated along the X-axis. The P-value was calculated for a linear regression over the four levels. Each dot shows one patient. Color coding and symbol shape indicate baseline DAS28-CRP and gender. D) ROC plot illustrating the ability to estimate whether a patient will have a ΔDAS28-CRP value above or below −1.2, based on the fitted values of a model containing the 11 predictive biomarkers. This curve shows how different threshold levels will affect the rate of true and false positives predicted: the interpretation of the area under curve (AUC) is that a randomly selected TNFi responder will have a higher test result than a randomly selected non-responder in 81.5% of cases.

The multiple testing burden of the study corresponds to the count of literature-identified biomarker candidates. Details of the literature search are provided in Supplement 1. A total of 2 + 52 + 72 biomarkers were included in the restricted search space. For these, P values are always reported as raw uncorrected values. The comparison plot in Figure 2B was made using the smoothScatter function from the R base graphics package, and the receiver operating characteristic plot in Figure 2D was calculated using the ROCR package. All calculations were performed in R version 3.1.2.

All supplementary materials are available online at www.molmed.org.

Results

In the COMBINE biobank we collected 492 blood samples from 246 individuals, and applied a range of high-throughput measurement techniques to most of them (Table 1). The study design is illustrated in Figure 1A, and the DAS28-CRP changes in Figure 1B. Figure 1B also clearly illustrates the main issue of non-response to common drug types: only 47% of the patients had ΔDAS28-CRP scores better than −1.2 at the evaluation 3 months after treatment initiation. Figure 1C shows the precision of the RNA sequencing and Figure 1D illustrates the self-reported and genotype-derived ethnicity of participants, which could be an important aspect of the likelihood of reproducing SNP-based drug stratification markers.

Table 1 Demographic information for patients in the biobank.

Transcriptomics-Based Anti-TNF-Response Association

Using the keywords anti-TNF, RA and gene expression profiles in blood cells, we identified eight other studies with similarly designed gene expression profiling of anti-TNF response (3–10). For each of these studies we sought to replicate the findings in the COMBINE biobank (Figure 2A). From a total of 72 previously proposed blood gene expression biomarker candidates, we observed that the genes SORBS3, AKAP9, CYP4F12, MUSTN (4), CX3CR1 (5), SLC2A3 (6), C21orf58 and TBC1D8 (7) could be replicated as stratification biomarkers at P < 0.05 in the COMBINE dataset, with consistent direction. One study had provided genome-wide expression data from whole blood, accession GSE33377 (10), and we additionally attempted to quantify how much overlap of findings there was when considering all genes (Figure 2B). Ideally one would expect several genes to have strong significance in both studies, which was not the case. There is thus a limited ability to reproduce findings between cohorts of otherwise similar size, which emphasizes the need for more validation studies in the field and argues for integration of data obtained from several patient cohorts.

Proteomics-Based Anti-TNF-Response Association

Next, we performed a similar analysis for recent findings that sICAM1 and CXCL13 serum protein levels could be used for stratification of anti-TNF-response (14). When dividing the anti-TNF initiating patients (cohort B) at the medians of sICAM1 and CXCL13 levels and comparing the results of the four groups as done by Dennis et al., we found a strong replication of the original findings (Figure 2C). Interestingly, correlation and other approaches that were not based on the arbitrary median cutoffs did not provide any ability to stratify patient groups. The number of tocilizumab-treated patients in the cohort was too low to allow separate replication of the tocilizumab findings in Dennis et al. Using only the levels of these two proteins to predict an anti-TNF-response of ΔDAS28-CRP below −1.2, we obtained a Receiver Operating Characteristic (ROC) curve with an Area Under Curve (AUC) of 0.714.
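
The median split into four groups can be sketched as below. This is an illustrative Python sketch with hypothetical names. Two details are assumptions because the text does not specify them: values equal to the median are assigned to the "low" group, and the groups are returned as labels only, since the numeric ordering of the four levels used in the regression is not stated here.

```python
from statistics import median

def median_split_groups(sicam1, cxcl13):
    """Label each patient as (sICAM1 low/high, CXCL13 low/high) relative to cohort medians.

    Assumption: values exactly at the median count as "low".
    """
    m_sicam1, m_cxcl13 = median(sicam1), median(cxcl13)
    return [("high" if a > m_sicam1 else "low",
             "high" if b > m_cxcl13 else "low")
            for a, b in zip(sicam1, cxcl13)]
```

Because the cutoffs are the cohort's own medians, the resulting groups depend on the sample at hand, which is one reason the text calls these cutoffs arbitrary.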

Genetic Associations with Anti-TNF-Response

We identified three published studies that reported SNPs associated with anti-TNF-response (11–13), as well as the results from the winning algorithm at the recent Rheumatoid Arthritis Responder Challenge, developed by team GuanLab, University of Michigan (https://www.synapse.org/#!Synapse:syn2504648). Altogether 52 potentially anti-TNF-response associated SNPs were considered. Since the COMBINE database is more than 10 times smaller than current cohorts for genetic association studies, it is underpowered for a formal validation of these SNPs. Nonetheless we tested all 52 SNPs for association with ΔDAS28-CRP after treatment with TNF-blocking agents for 3 months. Only two SNPs were found to be associated with ΔDAS28-CRP in COMBINE: rs6028945 (13) (P = 0.047) and rs7305646 (12) (P = 0.014).

Integration of Known Biomarkers for Anti-TNF-Response Prediction

To provide an integrative approach for prediction of anti-TNF treatment response, we tested the prediction level obtainable when combining the best current high-throughput methodology and published findings replicated in the COMBINE cohort. In the test we included the expression of the 8 genes from transcriptomics literature, the sICAM1/CXCL13 finding (14) and the two SNPs from GWAS literature (12,13). Combining these 11 variables gives a model that explains 51% of the variation in ΔDAS28-CRP in the TNFi group of the COMBINE cohort (linear regression, adjusted R²). The largest explanatory power came from the sICAM1/CXCL13 finding as well as the two most significant measures of gene expression levels (CX3CR1 and SLC2A3). If the ΔDAS28-CRP is converted to a binary response with −1.2 as cutoff, a ROC curve based on the fitted values yields an AUC of 0.815 (Figure 2D). An overview of all response and model definitions is given in Supplement 3. This includes a comparison with age-gender-smoking status as well as ACR good response definitions.
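
The two summary statistics used here can be written out in a short sketch. This is an illustration in Python with hypothetical names, not the ROCR-based calculation used in the study. The AUC is computed through its rank interpretation: the fraction of responder/non-responder pairs in which the responder has the higher score, with ties counted as half. Since responders have lower ΔDAS28-CRP, the score is assumed to be the negated fitted value.

```python
def auc_from_scores(scores, is_responder):
    """AUC as the probability that a random responder outscores a random non-responder.

    Assumption: higher score means more likely responder, for example the
    negated fitted delta DAS28-CRP from the regression model.
    """
    pos = [s for s, r in zip(scores, is_responder) if r]
    neg = [s for s, r in zip(scores, is_responder) if not r]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R-squared: penalizes R-squared for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)
```

Note that both statistics are computed here on the same patients used to fit the model, as in the text, so they describe in-sample fit rather than performance in an independent cohort.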

Discussion

The main findings of our study were the replication of 11 biomarkers for anti-TNF response in RA and the successful integration of these biomarkers in a single model with high predictive value.

We collected blood samples at two time points from 185 RA patients, and acquired clinical information, transcriptomics (RNA-sequencing), genomics (genotyping microarrays) and proteomics (HumanMAP and VectraDA) data. Each of these steps is described in detail herein and extensive quality control was performed. We chose to strictly investigate already published biomarkers for anti-TNF treatment response as a first stage to prove the validity and usefulness of the COMBINE biobank. To ultimately benefit patients, it is essential that such validation studies are performed (20). Additionally, it is noteworthy that a voluntary restriction of the search space for our investigations limits the otherwise large problems with multiple testing burdens and underpowered studies that are typically seen in high-throughput discovery studies.

The serum protein levels of two previously proposed biomarkers (sICAM1 and CXCL13) had the strongest impact on prediction. This was closely followed in strength by the PBMC gene expression of two previously suggested transcriptomic biomarkers. The knowledge gained from genetic markers, in this case two previously proposed SNPs, improved predictions only to a comparatively lesser extent. Keeping in mind that the COMBINE study design was fundamentally method agnostic, this has an important impact on where to focus future efforts in biomarker discovery.

Specifically for the sICAM1 and CXCL13 serum protein levels it is of interest that these have already been extensively discussed as biomarkers in the literature: high baseline plasma CXCL13 levels in RA patients are associated with an improved chance of remission after 2 years (21), and it has also been proposed that high concentrations of CXCL13 in plasma indicate recent onset of inflammation that may respond better to early aggressive treatment. On the other hand, high expression levels of the CXCL13 gene in the synovium of RA patients correlate with more severe disease (22). Likewise, serum CXCL13 correlates with synovial CXCL13 measured at a single joint, suggesting synovitis as an important source of circulating CXCL13. Cell and animal models implicate ICAM1 in RA (23), but comparable evidence on ICAM1 levels in serum and synovial tissue of RA patients is not available.

The set of 11 SNPs, genes and proteins that were shown to be predictive in this study represents a step toward the use of anti-TNF-response-predictive signatures in a clinical setting. In addition to being individually replicated, their compound prediction accuracy exceeds what was previously observed. While the prediction accuracy is still below that of precision medicine in cancer (for example, Herceptin), it is conceivable that a sizeable fraction of anti-TNF-initiating patients could be selected who would benefit from progressing directly to other biologic treatments now available. To accurately assess the size of this fraction, however, a prospective study with a pre-defined biomarker signature is required.

Conclusion

To our knowledge this is the first time a combined multi-omics approach has been taken to the question of drug-response stratification in RA. The prediction strength was calculated as an AUC of 0.815, which is relatively high for pharmacogenomics studies.

In conclusion, we present these results as a comment on the state of precision medicine, providing evidence that a few blood biomarkers may serve as a prognostic tool for the prediction of clinical response to TNF-blocking therapy. If a prediction proficiency of this magnitude is introduced in a clinical setting, it may have a strong impact on established clinical treatment routines.

Disclosure

The vast majority of expenses related to this study, including laboratory measurements and study costs, were funded by the pharmaceutical company Novo Nordisk A/S, at which the authors LF and UGWM were employed at the time of the data collection. However, the authors wish to make clear that this company had no influence on any conclusions in this study, nor are any authors currently affiliated with or employed by this company. The stratification potential of this prediction signature is being investigated for patenting as US application number 14947077 and UK application number 1520524.8.