Background

Rheumatoid arthritis (RA) is a chronic systemic autoimmune disease, characterized by synovitis, systemic inflammation and generation of autoantibodies. In industrialized countries, up to 1.0% of the adult population is affected by RA and suffers from joint damage and loss of physical function [1]. Disease-modifying antirheumatic drugs (DMARDs) are used to treat RA. The most commonly used DMARD is methotrexate (MTX), which is the “anchor drug” for active RA and can be combined with a variety of other drugs. Biological agents can be used when arthritis is aggressive and/or not sufficiently controlled by chemical DMARDs [2]. However, the worldwide use of biological agents is restricted by their high costs and risk of severe infections [3].

Tripterygium wilfordii Hook F (TwHF) is widely used in traditional Chinese medicine as a potent treatment for joint pain, fever, chills, edema and local inflammation [4, 5]. Extracts of TwHF have been analyzed and the three major diterpenoids, triptolide, tripdiolide and triptonide, are mainly responsible for its anti-inflammatory and immune regulatory activities [6,7,8,9]. TwHF has been approved to treat RA in China. Our clinical experiences from treating more than 30,000 patients with RA each year in Peking Union Medical College Hospital (PUMCH) also support the high cost effectiveness of TwHF or a combination of MTX + TwHF, with increases in daily therapeutic expense less than 1 US dollar [10]. Previously in three randomized controlled trials, extracts of TwHF have also been shown to have good efficacy in treating RA compared with placebo or sulfasalazine [11,12,13].

To further evaluate the role of TwHF in treating RA by comparing its effects to MTX, we recently conducted the “Comparison of Tripterygium wilfordii Hook F with methotrexate in the treatment of active rheumatoid arthritis” (TRIFRA) study [14]. In this open-label, multicenter randomized controlled trial, 207 DMARD-naïve patients were randomly allocated into three arms and treated with TwHF, MTX or TwHF+MTX. We evaluated the proportion of patients achieving an American College of Rheumatology (ACR) 50% response (ACR50) at week 24, together with other parameters to measure disease activities including ACR20, ACR70, European League Against Rheumatism (EULAR) good or moderate response, clinical Disease Activity Index (cDAI), 28-joint count Disease Activity Score (DAS28), Health Assessment Questionnaire (HAQ) and 36-item Short-Form Health Survey questionnaire (SF-36) scores. At week 24, ACR50 response was achieved in about half of the patients using MTX or TwHF alone, and in more than three quarters of the patients receiving combination therapy. Similar patterns were found for other parameters. Moreover, with TwHF monotherapy and the combination therapy there was no increased incidence of adverse events compared to MTX alone. Thus, we concluded that TwHF monotherapy was not inferior to, and MTX + TwHF was better than, MTX monotherapy in controlling disease activity safely in patients with RA [14].

Long-term control of disease activity and associated joint damage, with the preservation of physical function in a safe manner is the ultimate goal of RA management. Therefore, the evaluation of treatment efficacy from long-term trials is needed to determine long-range benefit. Aside from benefit in clinical measures such as ACR50, prevention of joint damage evident on radiography (radiographic joint damage) is an important outcome in determining the long-term treatment effects in clinical trials, and recommended as a surrogate marker for overall functional status in patients with RA [15]. Previous studies have shown that treatment with MTX or other DMARDs could slow the progression of radiographic damage [16, 17]. The TRIFRA study was designed to be a 24-week, multicenter, randomized controlled trial (RCT). After its termination, the patients continued to be followed, and disease activities were monitored in the real-world situation. Based on changes in perceived disease activity, treatment could be modified accordingly. In this observational report, we followed the participants from the TRIFRA trial for 2 years after the study initiation. Both functional measures and radiological images were collected at year 2 to determine whether the same efficacy patterns observed in the first 24 weeks were sustainable. Moreover, the long-term impact on radiographic progression and physical function was determined.

Methods

The TRIFRA study was designed as a 24-week, open-label, randomized study to evaluate the efficacy and safety of TwHF alone, MTX alone or the combination of TwHF + MTX in the treatment of active RA. It was previously registered in ClinicalTrials.gov (NCT01613079). A detailed description of the study design was published previously [14]. After the end of the trial, subjects were followed by the investigators for 18 additional months and monitored using the same clinical outcome measures. In addition, radiographs of the hands and wrists were repeated to evaluate progressive joint damage.

Patients

At the time of enrollment, patients eligible for this trial had to meet the following criteria: (1) 18–65 years of age; (2) diagnosed with RA as determined by meeting the 2010 ACR/EULAR classification criteria and having had RA for at least 6 weeks; (3) at least three swollen joints (swollen joint count (SJC)) and five tender joints (tender joint count (TJC)); (4) erythrocyte sedimentation rate (ESR) >28 mm/h or C-reactive protein (CRP) >20 mg/L. Patients who completed 24 weeks in the TRIFRA study continued into the next 18 months of follow up. All patients signed written informed consent at the time of enrollment.

Study protocol

The protocol was approved by the Peking Union Medical College Hospital (PUMCH) ethical review board. Initially, enrolled participants were allocated into three arms by centralized randomization, as follows: oral TwHF pills 20 mg three times a day; MTX starting from 7.5 mg once a week and increasing to 12.5 mg once a week (0.20–0.25 mg/kg) within 4 weeks, with folic acid 10 mg on the day after each MTX administration; or TwHF plus MTX at the same dosages. In this study, the TwHF used was the same as that in the TRIFRA study, in which the concentration of triptolide (C20H24O5), the major immunosuppressive anti-inflammatory diterpenoid, was 1.2 μg/10 mg, and the concentration of wilforlide (C30H46O3), an anti-inflammatory triterpene, was 36.6 μg/10 mg. Each patient’s DAS28 was evaluated at week 12, and monotherapy was continued only if the DAS28 had decreased by more than 30%; otherwise the patient was switched to MTX + TwHF combination therapy. After 24 weeks, patients were followed up and their therapy was modified based on the physician’s judgement. A detailed description of the study design was published previously and is also shown in Fig. 1.
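The week-12 decision rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, the arm labels and the reading of "more than 30%" as a relative reduction from the baseline DAS28 are assumptions, not details taken from the trial protocol.

```python
def week12_regimen(current_arm, das28_baseline, das28_week12):
    """Return the regimen a patient continues with after the week-12 check.

    Assumption: the 30% threshold is a relative reduction from baseline DAS28.
    Arm labels ("MTX", "TwHF", "MTX+TwHF") are hypothetical names.
    """
    if current_arm == "MTX+TwHF":
        # Patients already on combination therapy simply continue.
        return current_arm
    reduction = (das28_baseline - das28_week12) / das28_baseline
    # Monotherapy continues only if DAS28 fell by more than 30%.
    return current_arm if reduction > 0.30 else "MTX+TwHF"


if __name__ == "__main__":
    print(week12_regimen("MTX", 6.0, 4.0))   # reduction of about 33%
    print(week12_regimen("TwHF", 6.0, 5.0))  # reduction of about 17%
```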

Fig. 1

Study design and numbers of patients in each group who completed or withdrew from the 24-week TRIFRA study and 2-year follow up. RA, rheumatoid arthritis; MTX, methotrexate; TwHF, Tripterygium wilfordii Hook F

Outcomes and measurements

As in the initial 24-week TRIFRA study, the treating doctors and patients were not blinded to medication allocation. However, a similar set of clinical efficacy parameters was evaluated in each patient at the end of the second year by trained evaluators who were unaware of the specific therapeutic regimen. The parameters included the ACR criteria [18], HAQ [19], the ESR or serum CRP level, EULAR good or moderate response, cDAI good response (defined as achieving ≥ 50% improvement in the cDAI, or cDAI ≤ 2.8) [20], clinical remission (defined as DAS28 < 2.6) and low disease activity (LDA) (defined as DAS28 < 3.2) [21], and change in HAQ or 36-item Short-Form Health Survey questionnaire (SF-36) scores. The proportion of patients achieving 20% improvement according to the ACR criteria was calculated as ACR20; ACR50 and ACR70 were calculated similarly. The safety profile was also recorded.
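The threshold-based definitions above (remission, LDA and cDAI good response) can be written out as simple predicates. The following Python sketch is illustrative only; the function names are hypothetical, and the handling of a zero baseline cDAI is an added assumption.

```python
def is_remission(das28):
    """Clinical remission as defined in the text: DAS28 < 2.6."""
    return das28 < 2.6


def is_low_disease_activity(das28):
    """Low disease activity as defined in the text: DAS28 < 3.2.

    Note that any patient in remission also satisfies this criterion.
    """
    return das28 < 3.2


def cdai_good_response(cdai_baseline, cdai_current):
    """cDAI good response: >= 50% improvement from baseline, or cDAI <= 2.8."""
    if cdai_current <= 2.8:
        return True
    if cdai_baseline <= 0:
        # Assumption: relative improvement is undefined without a positive baseline.
        return False
    improvement = (cdai_baseline - cdai_current) / cdai_baseline
    return improvement >= 0.5


if __name__ == "__main__":
    print(is_remission(2.5), is_low_disease_activity(3.0))
    print(cdai_good_response(20.0, 10.0))
```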

During the 2 years, radiographic progression was analyzed in patients who had at least two radiographic examinations with time intervals longer than 1 year. Radiographic images of the hands and wrists were independently read by two radiologists who were masked to treatment allocation, time sequence of radiographs and the patient’s clinical response. Joint erosions (JE) and joint space narrowing (JSN) were scored, and the two scores were summed to calculate the modified total Sharp score (mTSS) [22]. Inter-reader variability was assessed by the intraclass correlation coefficient, which, based on status scores, ranged from 0.794 to 0.907. To balance the time-interval difference, linear extrapolation of actual change from baseline images was used for patients whose image was missing at the 2-year time point. Mean scores of the two radiographic readers were used for analysis. The radiographic data were reported in a systematic way as recommended [23]. mTSS non-progression was defined as a change from baseline mTSS between −0.5 and 0.5 units at 2 years, or less than the smallest detectable difference (SDD) [24]. The SDD was computed based on the observed difference between the readers. The estimated yearly mTSS progression at baseline was defined as the baseline mTSS score divided by disease duration for each patient.
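The radiographic quantities defined above reduce to simple arithmetic, sketched below in Python. The function names are hypothetical, disease duration and follow-up intervals are assumed to be expressed in years, and the two non-progression criteria are kept as separate predicates because the Results report them separately.

```python
def estimated_yearly_progression(baseline_mtss, disease_duration_years):
    """Baseline mTSS divided by disease duration (assumed here to be in years)."""
    return baseline_mtss / disease_duration_years


def extrapolate_change_to_2_years(baseline, followup, interval_years, target_years=2.0):
    """Linearly scale the observed change from baseline to the 2-year time point."""
    return (followup - baseline) / interval_years * target_years


def within_half_unit(change):
    """Non-progression as a change from baseline between -0.5 and 0.5 units."""
    return abs(change) <= 0.5


def within_sdd(change, sdd):
    """Change not exceeding the smallest detectable difference (SDD)."""
    return change <= sdd


if __name__ == "__main__":
    # Illustrative values only, not patient data from the study.
    change = extrapolate_change_to_2_years(baseline=10.0, followup=13.0, interval_years=1.5)
    print(change, within_half_unit(change), within_sdd(change, sdd=5.24))
```

Because the extrapolation assumes a constant rate of change, it inherits the linearity assumption that the Discussion lists as a limitation.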

Statistical analysis

Analysis was performed using the modified intent-to-treat (ITT) method, which included all the patients who received the originally allocated treatment at least once. This method was used for the analysis of ACR responses, cDAI responses, EULAR responses, DAS28 remission, ESR, high-sensitivity (hs)CRP level, pain measured on a visual analog scale (VAS) and HAQ score at year 2, with missing data imputed using the last observation carried forward (LOCF) approach. To compare the efficacy variables of MTX monotherapy and TwHF monotherapy, a non-inferiority test was carried out. In the TRIFRA study, the non-inferiority margin was set at 10%, and the required sample size was calculated accordingly with at least 80% power and a 5% level of significance [14]. In this follow-up study, we used the same non-inferiority margin as in the TRIFRA study, as the sample size in the ITT analysis was the same. The efficacy variables were compared between the MTX + TwHF group and the MTX monotherapy group using the chi-square (χ2) test. We also conducted a per-protocol (PP) analysis that only included the participants who finished the 2-year follow up without violating the originally allocated treatment regimen.
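As a rough illustration of the two techniques named above, the Python sketch below shows LOCF imputation and a one-sided non-inferiority test for a difference in response proportions with a 10% margin. The text does not specify the exact test statistic, so the unpooled Wald z-test used here is an assumption, as are the function names. The counts in the usage lines are back-calculated from the ACR50 percentages reported in the Results (58.0% and 46.4%), assuming 69 patients per arm.

```python
import math


def locf(values):
    """Last observation carried forward: replace None with the last observed value."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def noninferiority_test(x_test, n_test, x_ref, n_ref, margin=0.10):
    """One-sided Wald z-test of H0: p_test - p_ref <= -margin.

    Returns (z, p_value). A small p-value supports non-inferiority of the
    test arm. Assumption: unpooled standard error; the study may have used
    a different formulation.
    """
    p_t = x_test / n_test
    p_r = x_ref / n_ref
    se = math.sqrt(p_t * (1 - p_t) / n_test + p_r * (1 - p_r) / n_ref)
    z = (p_t - p_r + margin) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return z, p_value


if __name__ == "__main__":
    print(locf([5.1, None, 4.2, None, None]))
    # Back-calculated, illustrative counts: 40/69 (TwHF) vs 32/69 (MTX).
    z, p = noninferiority_test(40, 69, 32, 69)
    print(round(z, 2), p < 0.05)
```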

A valid-for-efficacy (VFE) analysis was conducted for radiographic data, including patients who completed the 2-year follow up (completers). Radiographic changes in mTSS, JE and JSN scores were analyzed using analysis of covariance (ANCOVA) with treatment and baseline scores as covariates. Only patients with baseline images and at least one radiographic assessment after the initiation of treatment were included in the analysis. Radiographic non-progression was defined as an absolute value of the change in mTSS no greater than 0.5, which was analyzed using the χ2 test.

Categorical data are presented as number (n) or percentage (%). Continuous data are presented as mean (SD) or median (25th–75th centiles). Differences between groups were analyzed for significance using the χ2 test (categorical data) or ANCOVA with factors for treatment and baseline scores as covariates (continuous data). All analyses were computed using SPSS Statistics V.22.0 and SAS V.9.1.

Results

Patients’ follow up and withdrawal information

A total of 207 patients participated in the TRIFRA study. All three treatment groups were well-balanced with respect to baseline demographic and clinical characteristics [14]. Among 207 recruited patients, 33 patients (16%) dropped out in the first 24 weeks mainly because of side effects, inefficacy and protocol violation (Fig. 1). After that, 22 patients (11%) were lost to follow up and 41 patients (20%) refused to return for disease evaluation at year 2. Reasons for refusal included unwillingness because of symptom relief (10/41, 24%), non-medical reasons (22/41, 54%) and unwillingness to disclose information (9/41, 22%). The non-medical reasons mainly include travel expenditure and other personal issues. Two patients died of malignancy. Among 207 recruited patients, a total of 109 patients (53%) returned for the 2-year follow-up evaluation.

As shown in Fig. 1, the numbers of patients who withdrew from the study at year 2 were comparable among the MTX monotherapy group (32/69, 46.4%), the TwHF monotherapy group (35/69, 50.7%) and the combination therapy group (31/69, 44.9%) (p = 0.777), and the rate of maintaining the initial protocol was not significantly different among the three groups (14/69 (20.3%) in the MTX group, 11/69 (15.9%) in the TwHF group and 22/69 (31.9%) in the combination group, p = 0.069). However, there was a trend towards a higher compliance rate among patients in the combination group compared to the other two groups. The numbers of patients with favorable outcomes (including patients maintaining the initial protocol, patients no longer taking any drugs and patients declining to return for evaluation because of symptom relief) were also not significantly different (20/69 (29.0%) in the MTX group, 20/69 (29.0%) in the TwHF group and 29/69 (42.0%) in the combination group, p = 0.172), although there was again a trend towards a higher favorable outcome rate among patients in the combination group compared to the other two groups.
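The three-group comparisons in this paragraph can be checked with a Pearson χ2 test on a 3 × 2 table, using the counts given in the text. The Python sketch below is a minimal standard-library illustration, not the authors' SPSS/SAS procedure; the function name is hypothetical, and the closed-form p value used here is valid only for 2 degrees of freedom (three groups, binary outcome).

```python
import math


def chi_square_three_groups(events, totals):
    """Pearson chi-square test for a binary outcome across exactly three groups.

    events: number of patients with the outcome in each group
    totals: number of patients in each group
    Returns (statistic, p_value); the p-value formula exp(-x/2) is the
    chi-square survival function for 2 degrees of freedom only.
    """
    if len(events) != 3 or len(totals) != 3:
        raise ValueError("this sketch handles exactly three groups")
    n = sum(totals)
    n_events = sum(events)
    stat = 0.0
    for x, t in zip(events, totals):
        expected_event = t * n_events / n
        expected_non_event = t * (n - n_events) / n
        stat += (x - expected_event) ** 2 / expected_event
        stat += ((t - x) - expected_non_event) ** 2 / expected_non_event
    return stat, math.exp(-stat / 2)


if __name__ == "__main__":
    groups = [69, 69, 69]  # MTX, TwHF, combination
    for label, counts in [
        ("withdrawal", [32, 35, 31]),
        ("maintained initial protocol", [14, 11, 22]),
        ("favorable outcome", [20, 20, 29]),
    ]:
        stat, p = chi_square_three_groups(counts, groups)
        print(f"{label}: chi2 = {stat:.3f}, p = {p:.3f}")
```

Run on the counts above, this should give p values close to the reported 0.777, 0.069 and 0.172.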

Clinical efficacy

Disease activity evaluation

Disease activity was evaluated at year 2 by the ACR criteria, cDAI, EULAR good response, remission rate (DAS28 < 2.6) and LDA rate (DAS28 < 3.2). In the ITT analysis, we performed a non-inferiority test to compare the TwHF monotherapy group and the MTX monotherapy group, and statistical significance was shown for all the parameters (TwHF vs MTX): ACR20, 73.9% vs 55.0%; ACR50, 58.0% vs 46.4%; ACR70, 34.8% vs 21.7%; cDAI good response, 72.5% vs 56.5%; EULAR good response, 47.8% vs 23.2%; remission rate, 43.5% vs 17.4%; and LDA rate, 47.8% vs 26.1% (p < 0.05) (Table 1). A similar pattern was seen at week 24, as described previously [14]. When we compared disease activity in patients from the combination and MTX monotherapy groups, there were significant differences in ACR20, EULAR good response and DAS28 remission rate at year 2 (ACR20, 72.5% vs 55.0%; EULAR good response, 40.6% vs 23.2%; remission rate, 34.8% vs 17.4%, respectively; p < 0.05) (Table 1).

Table 1 Clinical efficacy measures over 2 years in intention-to-treat (ITT) analysis and per-protocol (PP) analysis

We also carried out the PP analysis, which only included the patients who followed the allocated treatment regimen for 2 years. The results agreed with those found in the ITT analysis. The non-inferiority test was used to compare the TwHF group and the MTX group, and showed statistical significance for all the parameters at year 2. However, there was no significant difference between the combination therapy and MTX monotherapy groups (Table 1).

We compared the core components of ACR responses and DAS28 in the three groups (Table 2). All treatment groups had decreases in DAS28 and HAQ scores and increases in SF-36 scores, suggesting improvement in functional disability and quality of life. However, there was no statistically significant difference in the improvement of these scores among the three groups at year 2 (p > 0.05) in either the ITT or the PP analysis (Table 2).

Table 2 Clinical and laboratory measures in the three groups over 2 years in intention-to-treat (ITT) analysis and per-protocol (PP) analysis

Radiographic outcome

At year 2, paired evaluable radiographic results before and after treatment initiation were available for 109 patients (52.7%) for the VFE analysis. Mean TSS at baseline was 28.15, 33.02 and 26.8 in the MTX, TwHF and combination therapy groups, respectively (p = 0.768) (Table 3). After 2 years, the mean changes from baseline in JE, JSN and TSS were not significantly different among the three treatment groups (p > 0.05). The estimated annual radiographic progression was lower in the TwHF and combination groups compared with the MTX group. However, the difference was not statistically significant on ANCOVA with the baseline score as a covariate (p = 0.615).

Table 3 Radiographic changes from baseline after 2 years of treatment

After 2 years, 34.21% of patients receiving combination therapy had no radiographic progression (change from baseline in the mTSS <0.5), compared with 35.29% of patients receiving TwHF and 45.95% of those receiving MTX (p = 0.520). In the majority of patients (81.58% of those receiving combination therapy, 79.41% of those receiving TwHF and 83.78% of those receiving MTX), the change in the TSS was equal to or less than the SDD (5.24 units) (p = 0.893). Similarly, when JE and JSN were evaluated independently, changes equal to or less than the SDD were observed in the majority of patients in the three groups (p > 0.05).

The changes from baseline in TSS, JE and JSN scores were presented in cumulative probability plots to visualize radiographic data in all three groups (Fig. 2), and the majority of observations in the three treatment groups had values close to zero. The plots presenting change within the three groups were similar, indicating a comparable change in radiographic damage. The association between JE/JSN progression and disease activity has been reported before [25]. In our data, we analyzed the association between radiological progression and tertiles of change in the DAS28 during the 2 years. The increasing tertiles of change in the DAS28 were associated with JE, JSN and mTSS progression when analyzed with treatment as a covariate (p < 0.05).

Fig. 2

Cumulative probability distribution for the modified total Sharp scores (a), joint erosions (b), and joint space narrowing (c) over the 2 years. MTX, methotrexate; TwHF, Tripterygium wilfordii Hook F

Side effects

Adverse events were monitored during the 2 years. Overall, 54.6% of the patients reported adverse events (65% in the MTX group, 48% in the TwHF group and 51% in the combination group; p = 0.089) (Table 5). Similar to previous reports, the most common adverse effects recorded were gastrointestinal, including nausea, abdominal discomfort and liver dysfunction. Serious infections, such as pneumonia and urinary tract infection, were reported in five patients in the MTX group and two patients in the combination group. Two deaths from malignancy were reported; one subject died of gastric cancer and the diagnosis was unclear in the other subject. Among 170 female patients, 101 were postmenopausal and 17 (10.0%) developed irregular menstruation during the 2-year follow up, including 5 in the MTX group, 7 in the TwHF group and 5 in the combination group (p = 0.744).

In this follow-up study, none of the patients was reported to discontinue the treatment because of adverse events in any of the three arms. Separately, we also compared the side effects reported by the 109 patients who completed the 2-year follow up. Overall, 36.7% of the patients reported adverse events. Among these patients, the most common adverse effects were nausea and liver function abnormalities. Serious infection was not reported in any of the groups.

Discussion

The previous 24-week TRIFRA clinical trial was the first RCT to compare the efficacy of TwHF and MTX in treating DMARD-naïve patients with RA [14]. At week 24, TwHF was not inferior to MTX as measured by multiple parameters of disease activity, including ACR20, ACR50 and ACR70 response criteria, EULAR and cDAI good response criteria, DAS28 remission criteria and LDA rate. More importantly, patients with RA receiving MTX + TwHF combination therapy had better improvement in disease activity. Considering that RA is a chronic disease, we followed up the patients from the TRIFRA trial at year 2 and evaluated disease activity in the same way. Among 207 patients recruited in the TRIFRA trial, 109 returned for the 2-year follow up. Notably, the frequency of adhering to the initial protocol was comparable among the three groups. The disease activity at year 2 followed a similar pattern when comparing the MTX monotherapy and TwHF monotherapy groups (Table 1). TwHF was not inferior to MTX in treating active RA. However, the combination therapy was not obviously more effective than MTX monotherapy at year 2. This may suggest that the combination therapy induces disease remission faster in the early stage of treatment, while the long-term efficacy is similar to that in the monotherapy groups. The limited sample size available is a concern at this stage and may have caused bias in the efficacy evaluation.

Aside from the clinical efficacy measures, we also obtained and scored paired radiological images of the hands and wrists from these patients to further objectively validate the efficacy of treatment. Consistent with previous studies [25], radiological progression was associated with disease activity measured by change in the DAS28 (Table 4). This confirmed the functional relevance of the radiological evaluation [26]. Radiographic progression was comparable among the three groups, though the annual radiographic progression trended toward being smaller in the combination group.

Table 4 Changes in JE, JSN and mTSS by changes in DAS28 tertiles over 2 years

Similar to the previous report, the safety profile showed that the frequency of adverse events was not significantly different among the three groups (Table 5). At year 2, the majority of patients who withdrew from the study did not do so in relation to adverse events. The antifertility effect of TwHF was well-known to the participants in our study, and the women recruited were mainly postmenopausal or were not planning to become pregnant. We monitored irregular menstruation in these women, and the incidence was similar among the three groups.

Table 5 Adverse events in patients

We did not attribute the two cancer deaths to the use of TwHF in this follow-up study. A number of clinical studies on TwHF have not found any association between cancer and TwHF, but rather with RA per se [12, 13, 27]. Notably, the TwHF + MTX combination group in this follow-up study also received the same dosage of TwHF, yet experienced no malignancies. Moreover, numerous pharmacological studies have suggested triptolide has an anti-tumor effect in various tumor models in vitro and in vivo [28, 29].

This study, as a long-term extension of a 24-week RCT, has several limitations. First, this follow-up study, together with the TRIFRA study, was designed as an open-label study. To increase the objectiveness of the results, blinded evaluators were employed. However, a randomized double-blinded trial is needed to provide more robust data and confirm our results. Second, a significant proportion of patients were lost to follow up or changed to other regimens for different reasons. This could bias the analysis, and, therefore, the power of the conclusion was weakened (Fig. 1). Furthermore, there might be significant bias in the analysis of adverse events. In this case, we performed both ITT and PP analysis of all the clinical measures to determine whether similar patterns were observed. Importantly, it was noted that the proportions of patients who withdrew or were lost to follow up were similar among the three groups. It was presumed that patients returned to “real-world” clinical practice and their treatments were monitored and optimized by their physicians’ judgement. Thus, we compared the rate of adherence to the original protocol and performed detailed cause analysis in the follow up, and showed that these were comparable among the three groups. Third, only the radiographic images of the hands and wrists were available for this analysis, and the feet were not included. Studies have shown that the joints of the feet are usually affected earlier than the joints of the hands, and therefore including the feet could help improve the sensitivity of joint damage assessment in early RA [30, 31]. In order to demonstrate the therapeutic efficacy, the patients recruited in the TRIFRA study were diagnosed with definite active RA and the mean disease duration was longer than 60 months. 
Moreover, considerable radiographic damage was noted in the hands and wrists, providing a reasonable baseline background to determine slowing of radiographic damage, even though the feet were not evaluated. The scores for the hands and wrists were representative of the disease activity at this stage, which was also indicated by their association with change in the DAS28. Ideally, the radiographic data should be obtained uniformly at baseline and the end of year 2. In this real-world follow-up study, we only successfully performed radiographic examination at the end of the 2 years in a proportion of subjects. To maximize the power of our results, the analysis included all the patients who had at least two radiographic examinations with time intervals greater than 1 year. Notably, we calculated the estimated yearly radiographic progression based on these images. This might introduce additional bias to our results because this model presumed that radiographic progression was linear. Finally, the dose of MTX was limited to 12.5 mg per week and the dose of folic acid was 10 mg per week, which is very common in Asia. This point was discussed in our previous report on the TRIFRA study. As a follow up of the TRIFRA study, the same doses of MTX and folic acid were maintained. Although there has not yet been a direct comparison of the response rates to MTX in patients with RA from different ethnicities, we did notice that the clinical response in our TRIFRA trial was comparable to the data from patients with RA in Europe and North America receiving a higher dose of MTX [32]. We recognize that the response rate in RA may be further improved with more intensive combination treatment [33], although the frequency of adverse events may also increase. Most reviews and recommendations suggest a prescription of at least 5 mg folic acid (FA) per week [34]. So far, insufficient evidence has been collected on the optimal dose of FA. 
Studies have proven that low-dose FA (≤ 7 mg/week) was able to reduce MTX side effects significantly [35]. Doses larger than 5 mg did not lead to further amelioration of the MTX side effects and neither did they affect the efficacy of MTX therapy [36, 37]. Because of these considerations, we are less concerned that the therapeutic effects of MTX were compromised by our use of FA in our study.

This study confirmed the original finding that TwHF was not inferior to MTX in treating active RA in the long term. A further study with a larger sample size should be performed to characterize the role of TwHF in RA therapy in greater detail.

Conclusions

Although this is a follow-up study with several limitations, including the relatively high rates of withdrawal and treatment strategy modifications, data from both the ITT and PP analyses indicate that TwHF monotherapy was not inferior to MTX monotherapy in controlling disease activity and retarding radiographic progression in patients with active RA. This is consistent with the previously published, randomized, controlled, parallel arm, active-comparator TRIFRA study.