Key Summary Points

Why carry out this study?

This study aimed to provide information about the safety profile of baricitinib relative to tumor necrosis factor inhibitors (TNFi) in real-world patients receiving treatment for rheumatoid arthritis during routine clinical care.

During the 24-week placebo-controlled period of the rheumatoid arthritis clinical program, a numerical imbalance of venous thromboembolism between baricitinib 4 mg (6 of 997 patients) and placebo (0 of 1070 patients) suggested the potential for an increased risk of venous thromboembolism in baricitinib-treated patients. Incidence rates remained stable over long-term follow-up in single-arm studies.

What was learned from the study?

After propensity score matching, patients treated with baricitinib (average treatment length: 9 months; range < 3–17 months) had a 1.5-fold statistically significant increased risk of venous thromboembolism (IRR = 1.51, 95% CI 1.10, 2.08; IR difference = 0.26, 95% CI −0.04, 0.57) compared to patients treated with TNFi.

Non-statistically significant increased risks of major adverse cardiovascular events (IRR = 1.54, 95% CI 0.93, 2.54; IR difference = 0.22, 95% CI −0.07, 0.52 per 100 person-years) and serious infection (IRR = 1.36, 95% CI 0.86, 2.13; IR difference = 0.57, 95% CI −0.07, 1.21 per 100 person-years) were also observed compared to patients treated with TNFi.

Introduction

Rheumatoid arthritis (RA) is a systemic inflammatory disease characterized by synovial inflammation causing pain, swelling, stiffness, and progressive joint damage. Patients with RA also experience an increased risk of significant non-musculoskeletal comorbidities, including malignancy [1], infection, including tuberculosis (TB) [2], venous thromboembolism (VTE) [3,4,5], cardiovascular disease [6,7,8], and overall early mortality [9, 10], among others.

Current treatment of RA is typically initiated with a conventional disease-modifying antirheumatic drug (cDMARD) such as methotrexate. Patients with poor control of disease activity can receive additional treatment with biologic DMARDs (bDMARD), such as tumor necrosis factor inhibitors (TNFi) and oral targeted synthetic DMARDs (tsDMARD), such as Janus kinase inhibitors (JAKi). Baricitinib, an oral selective JAK1/JAK2 inhibitor, is approved for the treatment of adults with moderately to severely active RA, moderate-to-severe atopic dermatitis, severe alopecia areata, and hospitalized patients with SARS-CoV-2 (COVID-19).

The safety profile of baricitinib for the treatment of RA is based on clinical trial data from over 14,000 person-years of exposure [11]. During the 24-week placebo-controlled period of the baricitinib RA clinical program, a numerical imbalance of VTE between baricitinib 4 mg (6 of 997 patients) and placebo (0 of 1070 patients) suggested the potential for an increased risk of VTE in baricitinib-treated patients [12]. The duration of the comparative observation period and number of patients in these data, i.e., the total person-time available, limited the evaluation of uncommon events such as major adverse cardiovascular events (MACE) and VTE [11]. Differences between patients who volunteer to participate in clinical trials and those who do not, and between the clinical care received in trials and real-world settings, can also influence the observed safety profile of a medication. Therefore, post-authorization safety studies within real-world populations are typically conducted to better characterize and establish the safety profile of medications.

Study B023 was initiated in January 2020 and aimed to compare the safety of baricitinib with that of TNFi, in terms of risk of VTE, MACE, or serious infection, among patients with RA treated in routine care. A meta-analysis was used to combine results across 14 post-marketing data sources, including bDMARD and disease registries, administrative claims databases, and national healthcare systems in Europe, the United States (US), and Japan.

Methods

Data and Study Design

A new-user, active-comparator design was used to reduce the risk of confounding and selection bias [13]. New users were defined as patients without prior use of the index medications (baricitinib or the specific TNFi or biosimilar) during the baseline period. The baseline period, or covariate assessment window, was defined as the 6 months prior to and including the cohort entry date. A schematic of the study design is available in Supplementary Material, Figure S1.

This study analyzed longitudinal information collected for purposes unrelated to the study objectives from 14 sources. Data came primarily from health insurance claims records and existing RA registries in Europe, the US, and Japan (Table 1; Supplementary Material, Table S1). All patients who were present in the data sources between the start of market availability of Olumiant (baricitinib) and the initiation of analyses were evaluated for eligibility. All data sources provided available longitudinal information on patient demographics, in- and outpatient medical diagnoses and procedures, including RA diagnosis, comorbidities, and prescription dispensing records, including treatments for RA, and healthcare resource utilization.

Table 1 Contribution of patients and person-time to VTE analyses, by data source

All data were de-identified to ensure patient confidentiality and used in accordance with data license agreements. This study was registered on the European Post-authorization Study (EU PAS) register (#32271; https://www.encepp.eu/encepp/studiesDatabase.jsp), where the protocol and a detailed study report will be available.

Study Population

The study population consisted of adults who were incident users of baricitinib (4 and 2 mg) or a specific TNFi (adalimumab, certolizumab pegol, etanercept, golimumab, or infliximab) or biosimilars. Cohort entry was defined as the date of first dispensing of baricitinib or a specific TNFi during the study period. In addition to the specific medications, patients in claims data were required to have a diagnosis of RA (ICD-10-CM M05–M05.9, M06.0, M06.8, and M06.9, or corresponding regional ICD-10 codes) from a physician encounter during the baseline period. In one US claims-based study, the positive predictive value (PPV) of these codes for RA was 86% [14]. Similar criteria were used in registries, with RA diagnosis and treatments identified from information contributed by rheumatologists. Patients in claims data were required to have continuous medical and prescription drug coverage for ≥ 6 months prior to cohort entry and throughout their follow-up, with any gaps limited to ≤ 45 days. Patients showing prior use of another JAKi, or with a dispensing of any combination of two or more bDMARDs and/or tsDMARDs on the cohort entry date, were excluded. In US data sources, patients in the TNFi cohort were required to have prior treatment with ≥ 1 TNFi identified during the baseline period to mirror the US indication for baricitinib [15]. This was not required for data collected outside of the US. Patients eligible for either treatment cohort were prioritized for entry to the baricitinib cohort to maximize cohort size. Patients with a baseline history of the outcome under analysis were excluded from analysis of the same outcome. Patients were also excluded from VTE and MACE analyses if there was evidence of anticoagulant use on the cohort entry date.
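The continuous-coverage criterion for claims data (≥ 6 months before cohort entry, with gaps of ≤ 45 days) can be illustrated with a minimal Python sketch. The function name, the representation of coverage as inclusive (start, end) date pairs, and the way gaps at the edges of the window are counted are all assumptions made for illustration; the source does not describe how each data source implemented this check.

```python
from datetime import date, timedelta

def has_continuous_coverage(spans, window_start, window_end, max_gap_days=45):
    """Check that coverage spans cover [window_start, window_end]
    with no uncovered stretch longer than max_gap_days.

    spans: iterable of (start, end) dates, both inclusive (assumed format).
    """
    # Treat the day before the window as the last covered day.
    covered_until = window_start - timedelta(days=1)
    for start, end in sorted(spans):
        if end <= covered_until:
            continue  # span adds no new coverage
        if start > window_end:
            break  # span begins after the window of interest
        # Uncovered days between the last covered day and this span.
        gap = (start - covered_until).days - 1
        if gap > max_gap_days:
            return False
        covered_until = max(covered_until, end)
        if covered_until >= window_end:
            return True
    # Any uncovered days left at the end of the window also count as a gap.
    return (window_end - covered_until).days <= max_gap_days

# Hypothetical patient: 6-month baseline window with a 30-day gap in April.
ok = has_continuous_coverage(
    [(date(2018, 12, 1), date(2019, 3, 31)), (date(2019, 5, 1), date(2019, 12, 31))],
    window_start=date(2019, 1, 1),
    window_end=date(2019, 6, 30),
)
```

In this hypothetical example the 30-day gap is within the 45-day allowance, so the patient would satisfy the coverage criterion.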

Exposure and Outcome Definitions

Exposures were based on an as-treated definition, with patients followed for outcomes from the start of treatment until treatment discontinuation or switch (including to another TNFi for patients in the TNFi cohort), initiation of a concomitant bDMARD or tsDMARD, disenrollment from the insurance plan or registry, death (where available), or the end of the study period. Exposure to baricitinib was defined based on aggregate doses (2 and 4 mg).
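Under the as-treated definition, follow-up for each patient ends at the earliest of the listed censoring events. The Python sketch below illustrates that rule; the function and argument names are hypothetical, and the person-time calculation (days divided by 365.25) is an assumed convention rather than one stated in the source.

```python
from datetime import date
from typing import Optional

def follow_up_end(
    discontinuation_or_switch: Optional[date],
    concomitant_b_or_ts_dmard: Optional[date],
    disenrollment: Optional[date],
    death: Optional[date],
    study_end: date,
) -> date:
    """Return the earliest censoring date; None means the event was not observed."""
    candidates = [
        discontinuation_or_switch,
        concomitant_b_or_ts_dmard,
        disenrollment,
        death,
        study_end,
    ]
    return min(d for d in candidates if d is not None)

def person_years(cohort_entry: date, end: date) -> float:
    """Person-time in years between cohort entry and end of follow-up (assumed convention)."""
    return (end - cohort_entry).days / 365.25

# Hypothetical patient who discontinues treatment before the study ends.
entry = date(2019, 1, 1)
end = follow_up_end(date(2019, 10, 1), None, None, None, study_end=date(2020, 12, 31))
exposure = person_years(entry, end)
```

Here the hypothetical patient is censored at treatment discontinuation and contributes about 0.75 person-years.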

The primary outcome was VTE, a composite of pulmonary embolism, deep-vein thrombosis, or other venous thrombosis. In claims data, VTE was identified using a validated case definition based on ICD-10 diagnosis codes, health care setting (emergent, inpatient, or outpatient), and, in some cases, dispensing of low molecular weight heparin or oral anticoagulant (PPV = 75.5%; see Supplementary Material, Methods). The definition was updated from a previous algorithm [16] and further adaptations were made to reflect differences in regional coding and healthcare systems. The case definition does not consider the fatality of an event. In French claims data (Système National des Données de Santé [SNDS]), the validated case definition included evidence of imaging procedures to address the absence of outpatient diagnosis codes (PPV ≥ 92%) [17]. In CorEvitas registry data, physician diagnosis and adjudicated endpoints, within the registry procedures, were used to identify VTE. In the Anti-Rheumatic Therapy in Sweden (ARTIS) data source, VTE was defined by a validated algorithm based on ICD-10 from the Swedish National Patient Register (PPV = 87%) [18].

MACE and serious infection were examined as secondary outcomes. MACE was identified in claims data based on ICD-10 diagnosis codes for myocardial infarction (PPV ≥ 93%) [19] or ischemic or hemorrhagic stroke (PPV ≥ 82% for ischemic and ≥ 87% for hemorrhagic stroke) [20], with local adaptation for other ICD-10 coding schemes. In registry data, MACE was defined based on physician diagnosis and adjudicated endpoints per registry procedures [21]. Serious infection was identified in claims data based on primary ICD-10 diagnosis codes from an inpatient stay (PPV = 90.2%) [22]. In registry data, serious infection was based on clinical judgement and adjudicated events when available for a specified infection. MACE and serious infections that met the case definition and subsequently led to death were included although fatal outcomes were not specifically identified. Hospitalized TB was also assessed, as a component of serious infection, and separately as a descriptive outcome.

Covariate Assessment

Patient characteristics were evaluated for potential imbalances in risk factors between groups, i.e., confounding, for each outcome. These included demographics, medical history, and comorbidities, RA treatments, and health care resource utilization evaluated from information available during baseline. Baseline was defined as the 6-month period prior to cohort entry, up to and including the cohort entry date.

Statistical Analysis

Comparative analyses were implemented after 1:1 baricitinib:TNFi nearest-neighbor propensity score matching [23, 24] to create comparison groups with a balanced distribution of baseline risk factors. Propensity score models were generated separately for each outcome, i.e., VTE, MACE, or serious infection. The performance of propensity score matching across baseline variables was assessed using standardized differences, with differences of ≤ 0.10 considered acceptable. The variables considered for inclusion in the propensity score models were risk factors specific to each outcome, including information on patient demographics, medical history including comorbidities, and RA disease treatments; not all data sources had data available on each risk factor (Supplementary Material, Table S2).
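The balance check can be illustrated with the commonly used formulas for standardized differences, applied to the ≤ 0.10 threshold stated above. This is a minimal Python sketch: the function names and example values are hypothetical, and the source does not specify which formula variant was used in each data source.

```python
import math

def std_diff_binary(p_treated: float, p_reference: float) -> float:
    """Standardized difference for a binary covariate (prevalences as proportions)."""
    pooled_var = (p_treated * (1 - p_treated) + p_reference * (1 - p_reference)) / 2
    if pooled_var == 0:
        # Both prevalences are exactly 0 or 1; no variability to standardize by.
        if p_treated == p_reference:
            return 0.0
        return math.copysign(math.inf, p_treated - p_reference)
    return (p_treated - p_reference) / math.sqrt(pooled_var)

def std_diff_continuous(mean_t: float, sd_t: float, mean_r: float, sd_r: float) -> float:
    """Standardized difference for a continuous covariate."""
    pooled_sd = math.sqrt((sd_t ** 2 + sd_r ** 2) / 2)
    return (mean_t - mean_r) / pooled_sd

def is_balanced(d: float, threshold: float = 0.10) -> bool:
    """Apply the acceptability threshold to the absolute standardized difference."""
    return abs(d) <= threshold

# Hypothetical values: prevalence of a comorbidity and mean age in two cohorts.
d_comorbidity = std_diff_binary(0.30, 0.28)
d_age = std_diff_continuous(60.0, 12.0, 58.0, 12.0)
```

With these hypothetical inputs the comorbidity would be considered balanced, whereas the 2-year difference in mean age (standardized difference of about 0.17) would not.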

Within each data source, patient characteristics, e.g., baseline demographic and clinical conditions, were summarized by treatment group (baricitinib versus TNFi) in unmatched and matched cohorts. Patients were permitted to contribute person-time and events to only a single treatment group, either the TNFi cohort or the baricitinib cohort in an analysis. For all comparative analyses, baricitinib was the treatment of interest and the TNFi cohort was the reference group.

Modified Poisson regression was used to generate an overall incidence rate ratio (IRR) from meta-analysis, as a measure of association comparing events in baricitinib and TNFi treatment cohorts. This allows inclusion of data from all sources, including those with low or no events in either or both cohorts. Both random effects and fixed effect regression models were implemented. Only results from the fixed effect model are reported for the IRR because the data were too sparse and estimation of the variance–covariance matrix of the random effects did not converge. Heterogeneity in the treatment effect was assessed using the standard Cochran χ² test, and the magnitude of heterogeneity was evaluated using the I-squared statistic [25]; however, the sparse data from several sources limited the ability of these tests to detect heterogeneity.
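To make the heterogeneity statistics concrete, the following Python sketch computes Cochran's Q and I-squared from source-specific log IRRs. It uses simple inverse-variance fixed-effect pooling, which is a stand-in and not the modified Poisson regression used in the study; unlike that model, it requires a defined log IRR (i.e., events in both cohorts) for every source. The function name and input values are hypothetical.

```python
def cochran_q_and_i2(log_irrs, std_errs):
    """Inverse-variance pooled log IRR, Cochran's Q, and I-squared (%)."""
    weights = [1 / se ** 2 for se in std_errs]
    pooled = sum(w * y for w, y in zip(weights, log_irrs)) / sum(weights)
    # Q: weighted squared deviations of each source from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_irrs))
    df = len(log_irrs) - 1
    # I-squared: share of total variation beyond chance, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2

# Hypothetical log IRRs and standard errors from two data sources.
pooled, q, i2 = cochran_q_and_i2([0.0, 1.0], [0.5, 0.5])
```

Under the null hypothesis of no heterogeneity, Q is compared with a χ² distribution with (number of sources − 1) degrees of freedom. With few events per source the standard errors are large, so Q stays small, which illustrates why sparse data limit the ability of these statistics to detect heterogeneity.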

Using Cochran–Mantel–Haenszel analysis, an overall incidence rate difference (IRD) was also estimated for each outcome as a supplemental result. Both the random and fixed effect model results were estimated but the random effects result is reported as the main IRD finding since it allows that the treatment effect may vary in different populations.

A sensitivity analysis was executed to understand the potential impact of bias due to unmeasured confounding by smoking, body mass index (BMI), and disease activity in US data sources and SNDS (details in Supplementary Material). Additional pre-planned sensitivity analyses to understand the impact of geography, disease severity, and length of baseline period are detailed in the final study report available on the EU PAS register.

Compliance with Ethics Guidelines

Study B023 was conducted in accordance with ethical principles of the Helsinki Declaration of 1964 and its later amendments, and Good Clinical Practice guidelines. Ethical approval was provided by Advarra IRB Committee, a centralized IRB in the USA (reference Pro00042607), and CNIL for French SNDS data (reference 919392). All data were de-identified to ensure patient confidentiality and used in accordance with data license agreements. The requirement for informed consent was therefore waived.

Results

Study Population

Patients were identified from 14 data sources across Europe, the US, and Japan (Table 1). Of 9013 eligible patients treated with baricitinib, 7606 (84%) were propensity score-matched 1:1 with patients treated with TNFi and included in the comparative analysis of VTE, for a total of 5879 and 6512 person-years of baricitinib and TNFi, respectively. A greater proportion of eligible patients were successfully matched in European (ARTIS 97%, Betriebskrankenkasse [BKK] 90%, SNDS 88%) than in US (70% overall) or Japan (85% overall) data. On average, patients were followed for 9 and 10 months of baricitinib and TNFi treatment, respectively. The largest data sources, ARTIS and SNDS, contributed 2314 (1685) and 1855 (2859) person-years of baricitinib exposure (and patients) to the meta-analysis, respectively, with an average follow-up of 16 and 8 months. These data sources contributed 39% (ARTIS) and 32% (SNDS) of the total baricitinib exposure, i.e., person-time, to the meta-analysis for VTE, with the third largest source, BKK in Germany, contributing 9%. Person-time and counts of patients in the MACE and serious infection analysis cohorts did not differ meaningfully from the VTE analysis cohort and are therefore not reported.

Patient characteristics were described from information available prior to cohort entry, i.e., the baseline period 6 months prior to and including the date of initiation of the index medication (claims data, ARTIS) and information collected at enrolment (CorEvitas). Prior to matching, patients treated with baricitinib were more likely to be female and older than those treated with TNFi, with non-US patients treated with the 2-mg dose tending to be more than a decade older. Baricitinib cohort patients were more likely to have received bDMARDs or concomitant cDMARDs during baseline and to take more medications (e.g., antibiotics, antihypertensives, beta-blockers, calcium channel blockers, and statins) compared to patients treated with TNFi (Supplementary Material, Tables S3–S6). After propensity score matching, differences between treatment groups resolved, with little to no difference remaining in the prevalence of measured risk factors between treatment cohorts for each outcome analyzed. Because the 6-month baseline may have limited information available on clinical history, a sensitivity analysis in French data extended the baseline to 2 years. No important differences emerged with this extended period (Supplementary Material, Table S7). Selected characteristics of patients from the largest US and European data sources initiating baricitinib or TNFi treatment and included in comparative analyses are described in Table 2, and the remaining data sources are included in Supplementary Material, Tables S8-S11.

Table 2 Selected baseline demographics and disease characteristics of patients in propensity score-matched VTE cohorts of the largest US and European data sources

Primary Outcome—VTE

Across all data sources, 97 patients experienced a VTE during a mean overall follow-up of 9 months (baricitinib) and 10 months (TNFi), 56 of whom were treated with baricitinib. In decreasing frequency of cases, the data sources where at least five patients experienced VTE during follow-up were ARTIS (n = 37), SNDS (n = 33), HealthVerity Private Source 20 (PS20) (n = 10), and BKK (n = 9). The overall IRR was statistically significantly elevated for baricitinib vs. TNFi (IRR = 1.51; 95% CI 1.10, 2.08) (Fig. 1A). The IRD between baricitinib and TNFi was 0.26 (95% CI −0.04, 0.57) per 100 person-years (Fig. 1B) from the random effects model, with the greater rate among patients treated with baricitinib; the IRD was not statistically significant. In other words, assuming a constant rate over time, for every 1000 patients treated with baricitinib instead of a TNFi, an additional three VTE would be expected each year. Both random and fixed effect model results were estimated for IRD (Fig. 1B) but only the random effects result is presented as the main IRD finding since it allows that the treatment effect may vary in different populations and point estimates did not differ. The incidence of VTE in each data source is provided in the Supplementary Material (Table S12). A bias analysis carried out to assess the possible impact of unmeasured confounding by smoking, obesity, and disease activity on the effect of baricitinib on VTE compared to TNFi suggested that results were unlikely to have been meaningfully impacted by not controlling for these factors (Supplementary Material, Table S13).
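The conversion from a rate difference per 100 person-years to expected additional events per 1000 patients per year is simple arithmetic, sketched below in Python. The function name is illustrative and not part of the study; the sketch assumes, as the text does, a constant rate over time, and additionally that each patient contributes one full year of follow-up.

```python
def additional_events_per_1000_patient_years(ird_per_100_py: float) -> float:
    """Rescale a rate difference from per 100 to per 1000 person-years."""
    return ird_per_100_py * 1000 / 100

# IRD point estimate and 95% CI for VTE as reported in the text (per 100 person-years).
vte_point = additional_events_per_1000_patient_years(0.26)
vte_lower = additional_events_per_1000_patient_years(-0.04)
vte_upper = additional_events_per_1000_patient_years(0.57)

print(
    f"{vte_point:.1f} ({vte_lower:.1f}, {vte_upper:.1f}) "
    "additional VTE per 1000 patients per year"
)
```

The point estimate of 2.6 rounds to the three additional events quoted above, and the rescaled confidence limits (about −0.4 to 5.7) include zero, in line with the IRD not being statistically significant.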

Fig. 1

Meta-analysis for VTE comparing baricitinib and TNF inhibitors showing A incidence rate ratios and B incidence rate differences. ARTIS anti-rheumatic therapy in Sweden, BKK Betriebskrankenkasse, CI confidence interval, GLMM generalized linear mixed model, HIRD HealthCore Integrated Research Database, IRR incidence rate ratio, JMDC JMDC, Inc.’s claims database, JP Japan, MDR military health system data repository, Optum® Clinformatics® Optum’s de-identified Clinformatics® Data Mart Database, PS20 private source 20, PY person-years, RD rate difference, SNDS Système National des Données de Santé, TNFi tumor necrosis factor inhibitor, US United States, VTE venous thromboembolism. For some data sources, low counts (i.e., < 11) were masked to maintain data privacy, as required by local regulations

Clinical characteristics of patients with VTE were similar to those of the overall RA cohorts, except for age and sex (Table 3) but comparisons are limited by the small number of patients with events. The mean age of patients with a VTE appeared higher (mean age in ARTIS 64 years; SNDS mean 68 years) than the mean age of patients included in VTE analyses (mean age in ARTIS 59 years; SNDS mean 58 years), although sample sizes were small and no statistical comparisons were made. In the ARTIS, SNDS, and BKK data sources, almost all patients in the baricitinib cohort with a VTE during follow-up were male, unlike for TNFi cohorts (Table 3). Notably within PS20, all six patients with VTE treated with baricitinib, and one of four patients with VTE treated with TNFi, had a recent hospitalization within the 4 weeks prior to the event. Within data sources with at least five patients with VTE, there were no other differences in selected VTE clinical risk factors, although the number of patients with events limits these qualitative descriptions. The distribution of time to VTE was variable, ranging from 1 to 1458 days. For the baricitinib cohort, the mean (median) time to event was 502 (454) days in ARTIS and 227 (204) days in SNDS, consistent with the mean (median) follow-up time in these cohorts. In the TNFi cohort, the mean (median) time to event was 565 (454) days in ARTIS and 181 (113) days in SNDS, consistent with the mean (median) follow-up time in these cohorts.

Table 3 Demographic characteristics (sex and age) of patients with events in the VTE analysis cohort in data sources with greater than five events

Secondary Outcomes

MACE

A total of 93 patients experienced MACE during a mean overall follow-up of 8 months (baricitinib) and 10 months (TNFi), 54 of whom were receiving treatment with baricitinib. There were four data sources with more than five patients with a MACE during follow-up: ARTIS (n = 29), SNDS (n = 36), BKK (n = 12), and PS20 (n = 6). A numerically greater, non-statistically significant overall IRR was estimated when baricitinib was compared with TNFi with respect to risk of MACE (IRR = 1.54; 95% CI 0.93, 2.54; Fig. 2A). The difference in incidence rates (IRD) between baricitinib and TNFi was 0.22 (95% CI −0.07, 0.52) per 100 person-years from the random effects model, with a non-significant greater rate observed in patients treated with baricitinib (Fig. 2B). Stated differently, for every 1000 patients treated with baricitinib instead of a TNFi, an additional two MACE would be expected each year. Both random and fixed effect model results were estimated for IRD (Fig. 2B) but only the random effects result is presented as the main IRD finding since it allows that the treatment effect may vary in different populations and point estimates did not differ. The incidence of MACE in each data source is provided in the Supplementary Material (Table S12).

Fig. 2

Meta-analysis for MACE comparing baricitinib and TNF inhibitors showing A incidence rate ratios and B incidence rate differences. ARTIS anti-rheumatic therapy in Sweden, BKK Betriebskrankenkasse, CI confidence interval, GLMM generalized linear mixed model, HIRD HealthCore Integrated Research Database, IRR incidence rate ratio, JMDC JMDC, Inc.’s claims database, JP Japan, MACE major adverse cardiovascular event, MDR military health system data repository, Optum® Clinformatics® Optum’s de-identified Clinformatics® Data Mart Database, PS20 private source 20, PY person-years, RD rate difference, SNDS Système National des Données de Santé, TNFi tumor necrosis factor inhibitor, US United States. For some data sources, low counts (i.e., ≤ 10) were masked to maintain data privacy, as required by local regulations

Clinical characteristics and use of RA medications in patients with MACE were generally consistent with the overall cohort of RA patients, except for age (Supplementary Material, Table S14) although these are qualitative comparisons limited by the small number of events. The mean age of patients treated with baricitinib with a MACE appeared higher (mean age in ARTIS 68 years; SNDS 68 years) than the overall age of the baricitinib cohort included in MACE analyses (mean age in ARTIS 59 years; SNDS 58 years). The majority of French patients in both treatment cohorts who experienced MACE were male (n = 22 of 36 overall; Supplementary Material, Table S14). The distribution of time to MACE was variable, ranging from 1 to 1460 days. For the baricitinib cohort, the mean (median) time to event was 503 (454) days in ARTIS and 216 (171) days in SNDS, consistent with the mean follow-up, i.e., treatment, in these cohorts. In the TNFi cohort, the mean (median) time to event was 583 (484) days in ARTIS and 226 (174) days in SNDS, consistent with the mean (median) follow-up time in these cohorts.

Serious Infection

There were 321 patients with serious infections during a mean overall follow-up of 10 months (baricitinib) and 11 months (TNFi), 176 of whom were treated with baricitinib. There were several data sources with more than five patients with serious infection during follow-up: ARTIS (n = 160), SNDS (n = 72), BKK (n = 29), PS20 (n = 16), CorEvitas Japan (n = 15), and PharMetrics Plus (n = 6). A numerically greater, non-statistically significant overall IRR was estimated when comparing risk of serious infection in baricitinib vs. TNFi cohorts (IRR = 1.36; 95% CI 0.86, 2.13) (Fig. 3A). The IRD between baricitinib and TNFi was 0.57 (95% CI −0.07, 1.21) per 100 person-years from the random effects model (Fig. 3B), with a greater incidence rate among patients treated with baricitinib; this difference was not statistically significant. This would mean for every 1000 patients treated with baricitinib instead of a TNFi, six additional serious infections would be expected each year. Both random and fixed effect model results were estimated for IRD (Fig. 3B) but only the random effects result is presented as the main IRD finding since it allows that the treatment effect may vary in different populations and point estimates did not differ. The incidence of serious infection in each data source is provided in the Supplementary Material (Table S12).

Fig. 3

Meta-analysis for serious infection comparing baricitinib and TNF inhibitors showing A incidence rate ratios and B incidence rate differences. ARTIS anti-rheumatic therapy in Sweden, BKK Betriebskrankenkasse, CI confidence interval, GLMM generalized linear mixed model, HIRD HealthCore Integrated Research Database, IRR incidence rate ratio, JMDC JMDC, Inc.’s claims database, JP Japan, MDR military health system data repository, Optum® Clinformatics® Optum’s de-identified Clinformatics® Data Mart Database, PS20 private source 20, PY person-years, RD rate difference, SNDS Système National des Données de Santé, TNFi tumor necrosis factor inhibitor, US United States. For some data sources, low counts (i.e., ≤ 10) were masked to maintain data privacy, as required by local regulations

Clinical characteristics of patients with serious infections were similar to those observed in the overall cohort of RA patients, but sample sizes were too small to be informative. The three data sources with the greatest number of serious infection events, consistent with their overall sample sizes, were ARTIS (n = 160), SNDS (n = 72), and BKK (n = 29). Within these sources, results suggest that patients with serious infections may be older and more often male than patients in the overall cohorts (Supplementary Material, Table S15). However, these are qualitative observations that were not statistically tested.

The distribution of time to serious infection was variable, with a minimum of 1 day and a maximum of 1460 days. Optum's de-identified Clinformatics® Data Mart Database (Optum® Clinformatics®) and BKK reported the shortest time to serious infection. For the baricitinib cohort, mean (median) time to serious infection was 80 (79) days in Optum® Clinformatics® (n < 11) and 107 (100) days in BKK (n = 17). For the TNFi cohort, mean (median) time to serious infection was 205 (216) days in Optum® Clinformatics® (n < 11) and 188 (96) days in BKK (n = 12). ARTIS (n = 94) reported the longest mean (median) time to serious infection, of 485 (428) days in the baricitinib cohort and 562 (453) days in the TNFi cohort, consistent with the mean follow-up times of these cohorts.

Among the total 9013 eligible patients treated with baricitinib available prior to matching, there were no cases of hospitalized TB recorded; three cases in total were identified in the TNFi cohort.

Discussion

Study B023 aimed to compare the safety of baricitinib with TNFi for the treatment of patients with RA in routine care for risk of VTE, MACE, or serious infection. A meta-analysis was used to combine results across 14 post-marketing data sources in Europe, the US, and Japan. With a mean overall exposure of 9 months, treatment with baricitinib was associated with a significantly increased risk of VTE versus TNFi (IRR = 1.51, 95% CI 1.10, 2.08). The incidence rate was greater among patients treated with baricitinib than with TNFi, with an IRD of 0.26 (95% CI −0.04, 0.57) per 100 PY. Risk of MACE was also numerically greater with baricitinib versus TNFi, although not statistically significant, during a mean overall exposure of 8 months (IRR = 1.54, 95% CI 0.93, 2.54; IRD = 0.22, 95% CI −0.07, 0.52 per 100 PY). Results for serious infection also estimated a numerically greater, non-statistically significant risk with baricitinib than with TNFi during a mean overall exposure of 10 months (IRR = 1.36, 95% CI 0.86, 2.13; IRD = 0.57, 95% CI −0.07, 1.21 per 100 PY). Overall incidence rates were not estimated in the study and comparative risk should be interpreted in terms of patient cohorts or populations, rather than individual risk.

Patients with RA are at greater risk of a wide range of comorbidities [26], including the conditions evaluated in this study. Risk of VTE in this population is increased by 30–40% compared to the general population [3, 5, 27], and has been associated with disease activity [28]. Among patients receiving treatment for RA, particularly those proceeding through a sequence of advanced therapies [29], risk of VTE was elevated with bDMARDs compared to cDMARDs or methotrexate treatment [5, 30].

Few studies have assessed the comparative risk of VTE associated with JAKi, the most notable of which was the post-marketing study for tofacitinib, the ORAL Surveillance randomized trial in patients with RA enriched for MACE risk factors [31]. A significant imbalance occurred in the incidence of pulmonary embolism and all-cause mortality in the 10-mg twice-daily arm of the trial, which led to a US Food and Drug Administration (FDA) black box warning update in 2021 for all JAKi approved in the US for the treatment of RA and other inflammatory conditions. Using US claims data (2012–2019), Desai et al. [32] compared new users of tofacitinib (5301 person-years) with TNFi (75,824 person-years) and did not detect a meaningful difference in risk of VTE (HR = 1.13; 95% CI 0.77, 1.65). A meta-analysis of data from 29 randomized trials (13,910 patients) found no significant association with risk of VTE for JAKi compared to placebo (odds ratio 0.91; 95% CI 0.57, 1.47), with consistent results for baricitinib (odds ratio 1.12; 95% CI 0.27, 4.69) [33]. Most recently, however, in an observational study conducted within the ARTIS data the risk of VTE in patients with RA treated with baricitinib (3412 person-years) was 1.79-fold (95% CI 1.25, 2.55) greater compared to patients treated with TNFi after adjusting for treatment history, smoking, and RA disease-related variables, i.e., DAS28, CRP, and HAQ [34].

Findings from previous studies examining the association between JAKi and cardiovascular outcomes have not been consistent. Results from the ORAL Surveillance randomized trial identified a 1.33-fold (95% CI 0.91, 1.94) greater risk of MACE with tofacitinib versus TNFi treatment in a cohort of patients enriched for cardiovascular risk factors (50 years or older with ≥ 1 risk factor) [31]. This elevated risk was present for both the 5- and 10-mg doses, although not statistically significant given the low incidence. The observational, non-randomized STAR-RA study, which compared tofacitinib with TNFi, detected a similar 1.24-fold (95% CI 0.90, 1.69) risk of MACE in a cohort designed to emulate the high-risk ORAL Surveillance population [35]. However, no difference in risk was detected when the comparison was made in the same data, but in an unselected real-world cohort with greater generalizability (HR = 1.01; 95% CI 0.83, 1.23). The authors hypothesize that the association between tofacitinib and cardiovascular outcomes is modified by baseline cardiovascular risk. An analysis of the ORAL Surveillance population found that increased risk of MACE was mainly observed in patients with older age and previous atherosclerotic cardiovascular disease [36].

The B023 study also calculated a numerically elevated risk of MACE (IRR = 1.54; 95% CI 0.93, 2.54), but point estimates from the two data sources that contributed the most person-time and events were not aligned, with an IRR in ARTIS of 0.94 (95% CI 0.45, 1.96) and an IRR in SNDS of 2.33 (95% CI 1.15, 4.74). One explanation for this observed difference in point estimates may be the different proportions of more refractory patients in each data source. Patients who are first to initiate newly approved medications may have more refractory disease or differ in other important ways, such as having more comorbidities. The index period for Swedish ARTIS patients included in B023 (Feb 2017 to Dec 2020) is more than 1 year longer than that for French patients in SNDS (Sept 2017 to Dec 2019). In France, national guidance in 2017–2018 required patients treated with baricitinib to have previously failed treatment with bDMARDs [37]. This suggests that a greater proportion of patients in SNDS than in ARTIS could be more refractory users. The observed differences in ARTIS and SNDS point estimates may therefore reflect baseline differences in risk that modify the relative risk of MACE, as proposed by the STAR-RA authors Khosrow-Khavar et al. [35]. While B023 was not designed to test for differences in MACE risk by baseline risk, this explanation is supported by the different incidence rates of MACE in ARTIS vs. SNDS (0.56 vs. 1.4 per 100 person-years, respectively). Alternatively, given the low incidence of MACE in general and in B023, this may also simply reflect variability due to low patient counts.
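As an illustration only, and not an analysis performed in B023, the following minimal Python sketch shows one common way to gauge whether two independent IRRs differ by more than chance. It recovers each standard error on the log scale from the reported 95% CI and applies a normal-approximation z-test to the difference in log IRRs. The function names are hypothetical, and the approach assumes the two estimates are independent and that the reported intervals are symmetric on the log scale.

```python
import math


def log_irr_and_se(irr, ci_low, ci_high, z_crit=1.96):
    """Return ln(IRR) and its standard error recovered from a 95% CI.

    Assumes the interval was built as exp(ln(IRR) +/- z_crit * SE).
    """
    log_irr = math.log(irr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_irr, se


def compare_irrs(first, second):
    """Two-sided z-test for a difference between two independent IRRs.

    Each argument is a tuple (irr, ci_low, ci_high).
    Returns (z statistic, two-sided p value).
    """
    log_a, se_a = log_irr_and_se(*first)
    log_b, se_b = log_irr_and_se(*second)
    z_stat = (log_b - log_a) / math.sqrt(se_a ** 2 + se_b ** 2)
    # erfc(|z| / sqrt(2)) equals the two-sided tail area of a standard normal
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))
    return z_stat, p_value


# MACE estimates reported above for the two largest data sources
artis = (0.94, 0.45, 1.96)
snds = (2.33, 1.15, 4.74)

z_stat, p_value = compare_irrs(artis, snds)
print(f"z = {z_stat:.2f}, two-sided p = {p_value:.3f}")
```

With only two data sources and few events, a test of this kind has limited power, so it cannot by itself distinguish effect modification by baseline risk from random variability.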

Patients with RA have an elevated risk of infection due to disease and therapeutic interventions [2]. Findings from interventional studies show that JAKi users have a similar risk of serious infection as TNFi users [38]. Incidence rates from development programs have tended to fall in the range of 3–4 cases per 100 person-years, with increased risk in older patients [38]. ORAL Surveillance detected a non-statistically significant, elevated risk of serious infection for treatment with 5 mg tofacitinib compared to TNFi (HR = 1.17; 95% CI 0.92, 1.50).

Incidence rates of serious infection in the B023 data sources were generally consistent with clinical trial rates for TNFi [31] and numerically greater than clinical trial rates for baricitinib [11]. The increased rates observed in the baricitinib cohorts may reflect a general upward shift in claims data due to differences in the case definitions between trials and claims data, differences in the populations analyzed, or both. Either way, the overall relative risk of serious infection estimated by the B023 meta-analysis was modestly increased, with differences once again observed between the individual ARTIS (IRR = 1.65; 95% CI 1.20, 2.26) and SNDS (IRR = 1.04; 95% CI 0.65, 1.65) point estimates. This result does not support the hypothesis that the difference in effect estimates is related to a larger proportion of early adopters in SNDS than in ARTIS. However, rates of infection vary considerably by time since treatment start, and there are important differences in mean follow-up between the two sources (ARTIS 1.3 years vs. SNDS 8 months).

Based on extensive longitudinal data from the baricitinib cohort from the clinical development program (> 14,000 person-years, median exposure 1683 days, max exposure 3405 days), the rates of VTE (pulmonary embolism or deep-vein thrombosis), MACE (myocardial infarction, stroke, and cardiovascular deaths), and serious infection during a median 4.6 years (maximum 9.3 years) of treatment with baricitinib remained stable over time at 0.49 (95% CI 0.38, 0.61), 0.51 (95% CI 0.40, 0.64), and 2.58 (95% CI 2.33, 2.86) per 100 person-years, respectively [11]. There did not appear to be differences between the 4 mg (VTE, MACE, and serious infection IRs of 0.51, 0.54, and 2.62, respectively, per 100 person-years) and 2 mg (VTE, MACE, and serious infection IRs of 0.49, 0.42, and 2.13, respectively, per 100 person-years) doses based on the available information. Observed VTE, MACE, and serious infection IRs in patients treated with baricitinib in the clinical development program appear numerically similar to those in other RA populations from various external sources, although no statistical comparison was conducted [39]. However, results from controlled comparative studies, including observational studies such as B023, suggest that incidence rates of these safety outcomes in patients treated with JAKi, including baricitinib, are elevated compared to similar populations treated with TNFi. The rate from the baricitinib cohort in the clinical program (VTE IR = 0.49 per 100 person-years) is not comparable with rates from individual B023 data sources (where VTE IR ranged from 0.60 to 2.55 per 100 person-years for baricitinib cohorts and 0.54 to 1.40 per 100 person-years for TNFi cohorts) as outcome definitions, prevalence of risk factors, and patient populations were different. Similar caution should be applied when comparing IRs from individual B023 data sources with results from external sources.

The characteristics of patients evaluated in the B023 study are those of real-world patients with RA treated with baricitinib, particularly in Europe, where the large majority of eligible patients were included in analyses after propensity score matching to TNFi.

In the future, results from two ongoing post-marketing randomized trials, RA-BRANCH (NCT04086745) and RA-BRIDGE (NCT03915964), will be available to provide a more complete understanding of the risk of VTE, MACE, and serious infection associated with baricitinib compared to TNFi in high-risk patients with ≥ 1 VTE risk factor and inadequate response or intolerance to ≥ 1 prior cDMARD or bDMARD.

Strengths and Limitations

Several strengths and limitations should be considered when interpreting the results of this study. Since this study was not randomized and is based on data collected for other purposes, the potential for bias due to confounding is a concern. Several risk factors known to be associated with the outcomes evaluated in this study are not available or only partially complete. Claims data present limited ability to control for confounding by lifestyle factors such as BMI and smoking, or clinical measures of disease such as severity, activity, duration, or treatment history. A summary of each limitation along with the mitigations taken to address it is presented below for consideration.

We begin with a brief review of the strengths of this study. First, there was broad geographic representation of patients with RA receiving treatment with baricitinib in routine care. Second, the study used validated case definitions confirming the accuracy of VTE identified in French, Swedish, and US data, with PPV of 75.5–92% [16,17,18]. Third, the study implemented several design and analysis strategies to control for and assess potential confounding, including the use of an active comparator new user study design, propensity score matching, and sensitivity analyses. Finally, the implementation of a common analytic strategy executed across individual data sources may also have reduced a source of heterogeneity.

There are also limitations. RA disease activity is a risk factor for each of the study outcomes (VTE, MACE, and serious infection) [28, 40, 41]. Inclusion of traditional risk factors may not account fully or at all for the effects of RA. In an effort to partially control for disease activity, an RA-specific measure of healthcare resource utilization was included in all propensity score models, but this measure is known to be a poor proxy [42]. A simple bias analysis was conducted to quantify the magnitude of bias that could have been introduced due to unmeasured confounding by disease activity [43]. For all outcomes, the bias analysis result suggested that the final interpretation of the study results is unlikely to have changed if information about disease activity had been fully accounted for. In support of this, a study in US RA registry patients showed that adding disease activity to a model of traditional risk factors for cardiovascular disease contributed limited additional ability to predict risk (change in c-statistic = 0.04) [44]. Further, the recent analysis conducted in ARTIS patients found that additionally adjusting for RA disease measures (treatment history, DAS28, CRP, and HAQ) did not attenuate the association between baricitinib and VTE [34].
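To make the idea of a simple bias analysis concrete, the sketch below applies a widely used external-adjustment formula for a single unmeasured binary confounder. It is not confirmed to be the method used in [43]. The observed IRR is divided by a bias factor that depends on the confounder-outcome relative risk and on the confounder prevalence in each treatment cohort. The function names and all parameter values other than the observed VTE IRR of 1.51 are hypothetical and chosen purely for illustration.

```python
def confounding_bias_factor(rr_cd, p_exposed, p_comparator):
    """Bias factor for one unmeasured binary confounder.

    rr_cd        -- assumed relative risk linking confounder and outcome
    p_exposed    -- assumed confounder prevalence in the exposed cohort
    p_comparator -- assumed confounder prevalence in the comparator cohort
    """
    numerator = rr_cd * p_exposed + (1 - p_exposed)
    denominator = rr_cd * p_comparator + (1 - p_comparator)
    return numerator / denominator


def externally_adjusted_irr(irr_observed, rr_cd, p_exposed, p_comparator):
    """Observed IRR divided by the bias factor gives the adjusted IRR."""
    bias = confounding_bias_factor(rr_cd, p_exposed, p_comparator)
    return irr_observed / bias


# Observed VTE IRR reported by the study; the remaining inputs are
# hypothetical: high disease activity doubles VTE risk and is present in
# 50% of baricitinib patients versus 30% of TNFi patients.
adjusted = externally_adjusted_irr(1.51, rr_cd=2.0, p_exposed=0.5, p_comparator=0.3)
print(f"Adjusted IRR under these assumptions: {adjusted:.2f}")
```

When the confounder is equally prevalent in both cohorts, the bias factor equals 1 and the estimate is unchanged. Repeating the calculation over a grid of plausible prevalences and confounder-outcome associations shows how strong the imbalance would need to be to alter the interpretation.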

Another limitation of the study was the brief length of follow-up. This study was designed to provide rapid insight into the safety of baricitinib with respect to specific outcomes rather than to evaluate long-term safety. The 9-month average follow-up of patients may have limited the ability to fully evaluate risk.

Next, insurance claims data present limited ability to control for confounding by lifestyle factors such as BMI and smoking, both of which are risk factors for the outcomes investigated in B023. In addition to the study design and analytic strategies, such as propensity score matching and active comparator new user design, which were incorporated to minimize the impact of these potentially confounding factors, additional bias analyses were used to assess the robustness of results to missing information on BMI and smoking. As before, quantitative evaluation of the magnitude of potential bias that could have occurred due to BMI or smoking suggests that the study results were unlikely to have been impacted in an important way by not controlling for these factors.

Finally, baseline risk factors were assessed in the 6 months prior to initiation of study drug. This period may be too short to allow for complete assessment of patient comorbidities and relevant risk factors. To evaluate the impact, a sensitivity analysis in the SNDS data extended the baseline to 2 years. As expected, the overall prevalence of comorbidities increased, but no differences appeared between the treatment cohorts that were more extreme than the prevalences evaluated in the bias analyses, suggesting these differences would not meaningfully impact results. This was not evaluated in other data sources.

To date, Study B023 is the largest real-world observational study evaluating VTE, MACE, and serious infections among patients treated with baricitinib compared to similar patients treated with TNFi. Despite the limitations, this large, multi-database study adds important information on the safety of baricitinib to the evolving landscape of safety for JAKi.

Conclusions

In conclusion, this study suggests that patients receiving treatment with baricitinib for RA have an increased risk of VTE compared with TNFi treatment. A numerically greater IRR was estimated for baricitinib compared to TNFi for MACE, but this did not attain statistical significance and point estimates from the largest data sources differed. Similarly, the overall IRR estimating risk of serious infection was numerically greater for baricitinib compared to TNFi but was not statistically significant. Findings from this study and their impact on clinical practice should be considered in the context of limitations and other evidence regarding the safety and efficacy of baricitinib and other JAK inhibitors.