Key Summary Points

Prophylactic treatment of hemophilia B with extended half-life recombinant factor IX products is associated with improved clinical outcomes, but head-to-head comparative studies of available treatments are lacking.

Matching-adjusted indirect comparisons (MAICs) are robust methods for conducting indirect treatment comparisons across trials.

This study used MAICs to estimate the efficacy of recombinant factor IX albumin fusion protein (rIX-FP) in the PROLONG-9FP trial relative to recombinant factor IX Fc fusion protein (rFIXFc) in the B-LONG trial in subjects that were on weekly prophylaxis.

The results demonstrated that rIX-FP may provide better clinical outcomes than rFIXFc relating to annualized bleeding rates and the proportions of patients without bleeding events of any nature.

Introduction

Hemophilia B (HB) is a congenital bleeding disorder characterized by a deficiency in coagulation factor IX (FIX) [1]. The symptoms associated with HB include spontaneous bleeding into joints and muscles, which can result in synovitis and chronic disabling arthropathy [2]. Bleeding in the intracranial space, although rare, may be life threatening. Prophylaxis with replacement FIX is the standard of care because it is associated with improvements in quality of life and favorable clinical outcomes, especially when initiated at an early age [2,3,4]. However, the short half-lives of standard FIX therapies necessitate frequent intravenous injections to maintain adequate protection, which may impair patients’ compliance and adherence [5].

To improve bleed prevention and reduce treatment-related burden, extended half-life (EHL) recombinant FIX (rFIX) products have been approved for use in clinical practice. The first EHL product approved was recombinant factor IX Fc fusion protein (rFIXFc) in 2014, following the phase 3 trial B-LONG. This study confirmed that rFIXFc has a prolonged half-life (82 h in patients 12 years of age or older) compared to standard rFIX (22 h) and a favorable safety profile [6, 7], and that prophylaxis with regular rFIXFc injections every 7 to 14 days resulted in a low annualized bleeding rate (ABR) [7]. Following the phase 3 trial PROLONG-9FP, recombinant factor IX albumin fusion protein (rIX-FP) was approved in 2016 for the treatment and prevention of bleeds in HB. The licensure trial demonstrated that rIX-FP has an improved half-life (102 h in patients 12 years of age or older) and is safe and effective in reducing bleeding rates when dosed prophylactically every 7, 10, or 14 days [8]. In comparison to rFIXFc, in silico analyses demonstrated that a lower weekly dose of rIX-FP was required to achieve target FIX trough levels and that rIX-FP maintained higher FIX activity levels in plasma [6]. Overall, however, both EHL rFIX therapies allowed for less frequent dosing and provided superior protection over standard rFIX products [3].

Although rIX-FP and rFIXFc have demonstrated efficacy and tolerability in preventing bleeding episodes in subjects with HB, no direct head-to-head trials are available to show whether the two products differ in efficacy. Matching-adjusted indirect comparisons (MAICs) are widely used and validated methods for indirect treatment comparisons (ITCs) across trials [9, 10]. In the absence of a common comparator (e.g., placebo) to compare treatment effects, subjects from each trial were aligned on key baseline characteristics to minimize bias in unanchored MAIC analyses. The aim of this study was to estimate the efficacy of rIX-FP relative to rFIXFc using MAICs with individual patient data (IPD) from the PROLONG-9FP trial and published summary-level data (SLD) from the B-LONG trial to balance the populations of interest for comparison.

Methods

Data Sources

Both the B-LONG and PROLONG-9FP trials were non-randomized, open-label studies that enrolled male subjects aged ≥ 12 years with severe or moderately severe HB (endogenous FIX ≤ 2 IU/dL) (Table 1) [7, 8]. Both studies were conducted in accordance with the Declaration of Helsinki and local regulations, the protocols were approved by the authorities and the institutional review board/ethics committee at each participating center, and signed informed consent was obtained from all patients. The primary efficacy endpoint of B-LONG was ABR. The primary efficacy endpoint of PROLONG-9FP was annualized spontaneous bleeding rate (AsBR) for bleeding episodes treated on-demand versus routine prophylaxis.

Table 1 Summary of trial designs for B-LONG and PROLONG-9FP

A qualitative ITC feasibility assessment was conducted on the study design, eligibility criteria, baseline characteristics, and outcomes and their definitions from the trials. Overall, the studies could be made comparable by matching and adjusting the IPD of PROLONG-9FP to the SLD of B-LONG. Subjects that were on weekly prophylaxis treatment were selected as the population of interest given that it was the only dosing regimen comparable between the two trials. From B-LONG, only subjects on weekly prophylaxis (B-LONG group 1) were selected for this MAIC (Table 1). From PROLONG-9FP, IPD was used to combine subjects (1) who received weekly prophylaxis (PROLONG-9FP group 1, excluding data on prophylaxis every 10 or 14 days from subjects who switched after 26 weeks of weekly prophylaxis; Table 1) and (2) who received weekly prophylaxis after 26 weeks of on-demand treatment (PROLONG-9FP group 2; Table 1) [8]. As a result, the datasets from both PROLONG-9FP and B-LONG contained only data collected during weekly prophylaxis and included subjects who had received either prior prophylaxis or prior on-demand treatment.

Matching and Adjusting Baseline Characteristics of Subjects

Individual patient data from the PROLONG-9FP trial were used to match and adjust subjects to the population of the B-LONG trial. No subjects were removed from PROLONG-9FP during matching given that the inclusion criteria of B-LONG were either comparable or broader. After matching, subjects from PROLONG-9FP were weighted using a method-of-moments propensity score algorithm to adjust for disease severity (endogenous plasma FIX level of < 1 IU/dL or 1–2 IU/dL), age, prior FIX regimen, and BMI so that the means and standard deviations (SDs), and proportions, were comparable to those in the B-LONG trial. The baseline characteristics selected for adjustment were based on previous literature, clinical input, and statistical performance assessments. The performance and suitability of each MAIC model were assessed on the basis of the effective sample size (ESS) and the distribution of patient weights, with a particular focus on the presence of extreme values. The ESS represents the number of non-weighted patients that would produce a treatment effect estimate with the same precision as the weighted sample estimate, and is derived as the square of the sum of patient weights divided by the sum of the squared patient weights [11]. A low ESS compared to the original sample size indicated large differences in patient weights due to large imbalances in patient populations prior to weighting. Analyses adjusting for prior ABR were explored, as this was also identified as a key baseline characteristic; however, these analyses could not be conducted because substantial differences between trials caused extreme weights and reduced the ESS to insufficient levels. In lieu of adjustment for prior ABR, prior treatment regimen may be a proxy for this factor.
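The weighting and ESS calculation described above can be sketched as follows. This is a minimal illustration in Python (the analyses in this study were run in R), assuming the common formulation in which each weight has the form exp(α·(x_i − target)) and α is found by minimizing the sum of the weights. The function names, the Newton solver, and the eight subjects are hypothetical and are not the study's code or data. Only means and proportions are matched here; an SD could be matched by adding the squared covariate as a further column.

```python
import math

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def maic_weights(ipd, targets, max_iter=50, tol=1e-10):
    """Method-of-moments weights w_i = exp(alpha . (x_i - target))."""
    # Center every covariate on the aggregate value of the comparator trial.
    centered = [[x - t for x, t in zip(row, targets)] for row in ipd]
    k = len(targets)
    alpha = [0.0] * k

    def raw_weights(a):
        return [math.exp(sum(ai * ci for ai, ci in zip(a, c))) for c in centered]

    for _ in range(max_iter):
        w = raw_weights(alpha)
        # Gradient of sum(w); it is zero exactly when the weighted means hit the targets.
        grad = [sum(wi * c[j] for wi, c in zip(w, centered)) for j in range(k)]
        if max(abs(g) for g in grad) < tol:
            break
        hess = [[sum(wi * c[p] * c[q] for wi, c in zip(w, centered))
                 for q in range(k)] for p in range(k)]
        step = solve_linear(hess, grad)
        # Damped Newton update: halve the step while the objective would increase.
        t = 1.0
        while (sum(raw_weights([a - t * s for a, s in zip(alpha, step)])) > sum(w)
               and t > 1e-8):
            t /= 2
        alpha = [a - t * s for a, s in zip(alpha, step)]
    return raw_weights(alpha)

def effective_sample_size(weights):
    """ESS = (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical IPD: (age in years, 1 if endogenous FIX < 1 IU/dL else 0).
ipd = [(18, 1), (22, 1), (25, 0), (30, 1), (34, 0), (41, 1), (47, 1), (55, 0)]
targets = (32.0, 0.8)  # hypothetical comparator mean age and proportion severe

weights = maic_weights(ipd, targets)
ess = effective_sample_size(weights)
print("weighted mean age:", weighted_mean([r[0] for r in ipd], weights))
print("weighted proportion severe:", weighted_mean([r[1] for r in ipd], weights))
print("ESS:", ess, "of", len(ipd))
```

In this sketch the ESS equals the sample size only when all weights are identical, so a large drop signals that a few subjects carry most of the weight.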

Efficacy Outcomes

Six outcomes were assessed in this study: ABR, AsBR, and annualized joint bleeding rate (AjBR), and the proportions of subjects without any bleeds, without spontaneous bleeds, and without joint bleeds. Both trials aligned on the definition for each outcome, and only bleeding episodes that were treated were included in the analyses [12]. Annualized bleeding rates for all three bleeding types were derived as

$$\frac{\text{Number of bleeding events}}{\text{Number of days of the observed period of interest}}\times 365.25.$$
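As a small worked illustration of the formula above, the following Python sketch computes an annualized rate for one subject. The function name and the example numbers are hypothetical and are not taken from either trial.

```python
def annualized_bleeding_rate(n_bleeds, days_observed):
    """Number of bleeding events per observed day, scaled to a 365.25-day year."""
    if days_observed <= 0:
        raise ValueError("days_observed must be positive")
    return n_bleeds / days_observed * 365.25

# Hypothetical subject: 3 treated bleeds during 182 days of weekly prophylaxis.
example_rate = annualized_bleeding_rate(3, 182)
print(round(example_rate, 2))
```

The same function applies to spontaneous and joint bleeds by counting only events of that type.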

Estimating Relative Treatment Effects

Analyses for each efficacy outcome included a naïve comparison and multivariable analyses which adjusted for 2–4 of the prognostic factors: disease severity, age, prior FIX regimen, and BMI. The analysis which adjusted for all four baseline characteristics was considered the full analysis.

Estimates of the comparative efficacy of rIX-FP versus rFIXFc were based on the difference between (a) an estimate of the outcome of interest for subjects in the comparator study (B-LONG) had they received rIX-FP and (b) the estimated outcome based on published SLD from the B-LONG trial. After subjects were matched to the comparator trial population, a weighted estimate of the outcome was derived from the PROLONG-9FP data using a weighted, intercept-only generalized linear model. Specifically, a negative binomial distribution with log link was used for the ABR, AsBR, and AjBR outcomes, and a binomial distribution with logit link (i.e., logistic regression) was used for the binary outcomes (i.e., proportion of patients without bleeding events, spontaneous bleeding events, and joint bleeding events). The intercept represents an estimate of the outcome of interest had patients from the comparator trial received rIX-FP. Robust standard errors (SEs) were estimated using the sandwich estimator with the R package “sandwich”, and relative treatment effects on the linear predictor scale (i.e., log-rate ratios [log-RR] and log-odds ratios [log-OR]) were derived by taking the difference between this estimate and the estimated outcome based on SLD from the B-LONG trial [7, 13]. The variance of the relative treatment effect between rIX-FP and rFIXFc was estimated as the sum of the variances of the individual estimators for each treatment included in the comparison. The SEs were used to construct two-sided 95% Wald CIs based on normal approximation. Relative treatment effects (i.e., RR and OR) and CIs were transformed to the natural scale after estimation.
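The final combination step described above (difference on the linear predictor scale, summed variances, Wald CI, back-transformation) can be sketched in Python as follows. The function name and the input values are hypothetical. The two-sided P value from a normal approximation is also an assumption, since the text does not state how P values were computed. The sketch takes the two estimates and their SEs as given, so the weighted model fit and the sandwich estimator are not reproduced here.

```python
import math

def relative_effect(est_index, se_index, est_comparator, se_comparator,
                    z_crit=1.959964):
    """Combine two independent estimates given on the log (or logit) scale."""
    diff = est_index - est_comparator                 # log-RR or log-OR
    se = math.sqrt(se_index ** 2 + se_comparator ** 2)  # variances add
    lower = diff - z_crit * se
    upper = diff + z_crit * se
    # Two-sided P value under a normal approximation (assumed, see lead-in).
    p_value = math.erfc(abs(diff / se) / math.sqrt(2))
    return {
        "ratio": math.exp(diff),                      # back to the natural scale
        "ci": (math.exp(lower), math.exp(upper)),
        "p_value": p_value,
    }

# Hypothetical inputs: log mean rates of 1.5 and 2.0 bleeds/year, SE 0.30 each.
result = relative_effect(math.log(1.5), 0.30, math.log(2.0), 0.30)
print(result)
```

Because the interval is built on the log scale, it is symmetric around the log-ratio but asymmetric around the ratio itself after exponentiation.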

All analyses were conducted using R® version 3.6.1.

Compliance with Ethics Guidelines

This article is based on two previously published phase 3 trials (PROLONG-9FP and B-LONG), does not contain any new studies with human participants or animals performed by any of the authors, and thus did not require ethics approval. Both PROLONG-9FP and B-LONG studies were conducted in accordance with the Declaration of Helsinki and local regulations, the protocols were approved by the authorities and the institutional review board/ethics committee at each participating center, and signed informed consent was obtained from all patients. Informed consent was not required for this analysis given the deidentified nature of the PROLONG-9FP individual patient-level data and the use of anonymized, previously published data for the B-LONG study.

Results

Baseline Characteristics of Subjects Before and After Adjustment

The population of interest for comparisons comprised subjects that were on a weekly prophylaxis regimen. Individual patient data from subjects who received weekly prophylaxis in PROLONG-9FP (n = 59) were used to weight and align the subjects with the B-LONG population (n = 63) on key baseline characteristics (Table 1). Three subjects from PROLONG-9FP had no outcome data available for ABR, AsBR, and AjBR and were therefore excluded from analyses of these three outcomes. In the analysis that adjusted for disease severity, age, prior FIX regimen, and BMI, the ESS was reduced by 23% for the bleeding rates and by 16% for the outcomes relating to the proportions of subjects with no bleeding events. After adjustment, the baseline characteristics of the PROLONG-9FP population aligned with the B-LONG population across all efficacy outcomes in the analysis that adjusted for all four prognostic factors (Table 2) and in the multivariable analyses adjusting for the first two or three prognostic factors (Table S1).

Table 2 Baseline characteristics of subjects treated with rFIXFc or rIX-FP before and after adjustment for four key prognostic factors

Efficacy Outcome Comparisons

Efficacy outcome comparisons between subjects treated with rIX-FP versus rFIXFc were performed before and after adjusting for disease severity, age, prior FIX regimen, and BMI. In the analysis that accounted for all four prognostic factors, AsBR was significantly lower in subjects treated with rIX-FP versus rFIXFc (RR 0.42; 95% CI 0.22, 0.82; P = 0.0107), and there was a numerical trend in favor of rIX-FP for total ABR (RR 0.75; 95% CI 0.32, 1.75; P = 0.5095) and AjBR (RR 0.82; 95% CI 0.37, 1.82; P = 0.6178) (Table 3). Treatment with rIX-FP also resulted in a significantly higher proportion of non-bleeders as compared with rFIXFc. Specifically, ORs of 3.24 (95% CI 1.41, 7.45; P = 0.0057), 3.47 (95% CI 1.56, 7.73; P = 0.0023), and 2.41 (95% CI 1.10, 5.26; P = 0.0274) were observed for the proportions of patients without bleeding events, spontaneous bleeding events, and joint bleeding events, respectively.
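Readers who want to check how a reported ratio, CI, and P value relate to one another can back-calculate the SE from the CI width on the log scale. The Python sketch below does this for two of the results reported above. It assumes a normal-approximation Wald test, and the function name is hypothetical. Because the published bounds are rounded to two decimals, the recovered P values only approximate the reported ones.

```python
import math

def p_from_ratio_ci(ratio, lower, upper, z_crit=1.959964):
    """Approximate two-sided P value from a ratio and its 95% Wald CI."""
    # On the log scale the CI spans 2 * z_crit standard errors.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(ratio) / se
    return math.erfc(abs(z) / math.sqrt(2))

# Reported values from the four-factor analysis.
p_abr = p_from_ratio_ci(0.75, 0.32, 1.75)        # total ABR, reported P = 0.5095
p_no_bleeds = p_from_ratio_ci(3.24, 1.41, 7.45)  # no bleeds, reported P = 0.0057
print(p_abr, p_no_bleeds)
```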

Table 3 Summary of efficacy outcomes for subjects treated with rFIXFc and rIX-FP before and after adjustment for four key prognostic factors

The results of the other multivariable analyses were consistent with the four-factor analysis and in favor of rIX-FP over rFIXFc, including statistical significance for AsBR and the proportion of patients without bleeds of any nature (Table S2). In the two-factor analysis, ABR was significantly lower for rIX-FP than for rFIXFc (P = 0.0439), and AjBR trended in favor of rIX-FP (P = 0.0789).

Discussion

Prophylactic treatment of HB with both rFIXFc and rIX-FP has demonstrated favorable safety and efficacy profiles, but the two products have not been compared in a direct head-to-head trial. In this study, an ITC was conducted using unanchored MAIC analyses after matching and adjusting the subject populations for key baseline characteristics. The analysis that adjusted for four factors showed that prophylaxis with rIX-FP is associated with a significantly lower AsBR and a significantly higher proportion of patients without bleeding episodes of any nature.

In a previous MAIC analysis, the relative efficacy of prophylactic rFIXFc in the B-LONG trial was shown to be similar to rIX-FP in the PROLONG-9FP trial based on ABR alone [14]. The data were analyzed between trials on the basis of prior on-demand or prophylactic treatment. For the prior on-demand treatment comparisons, mixed regimen groups (weekly and interval adjusted) from B-LONG were compared to a weekly regimen group from PROLONG-9FP. For the prior prophylaxis treatment comparisons, the same mixed regimen group from B-LONG was compared to a mixed regimen group from PROLONG-9FP, which included subjects that started on weekly prophylaxis and then switched to treatment every 10 or 14 days. The present study was performed with data on subjects that were exclusively on a weekly prophylaxis regimen, and the estimated efficacy between treatments was assessed on the basis of six outcomes. The analyses also adjusted for multiple prognostic factors to minimize the differences between the B-LONG and PROLONG-9FP populations. Although prior ABR could not be adjusted for, prior treatment regimen may be a proxy for this factor. Overall, the analyses that sequentially added up to four adjustment factors showed that the efficacy outcomes consistently trended in favor of rIX-FP compared to rFIXFc.

MAICs are a robust way of comparing available therapies in the absence of direct comparative data [9]. Similar to other unanchored MAICs, this study has several limitations. One limitation was that subjects were allocated to treatment groups by an investigator in the B-LONG trial [7]. This may have biased the assignment of subjects to weekly prophylaxis based on their baseline disease severity and the duration of their prophylaxis treatment interval, which was not accounted for in the present study. Another limitation was that although the populations in this study were matched and adjusted prior to performing the analyses, it was not possible to account for all of the differences because of the broader inclusion criteria of the B-LONG trial with respect to age and prior FIX therapy [7]. In particular, B-LONG allowed patients aged ≥ 12 years, while PROLONG-9FP allowed only patients aged from 12 to 65 years. However, only two patients in the B-LONG trial exceeded the maximum age of 65 years allowed in PROLONG-9FP (maximum age 71 years), and the mean age was actually higher in PROLONG-9FP (32.9 years) than in B-LONG (32.3 years). Age was also included as an adjustment variable, so this difference was considered to have a minimal impact on this analysis. Moreover, although the number of key prognostic factors adjusted for was limited because of sample size constraints and the degree of initial imbalance, it was similar to that in other published ITC studies [14]. Adjusting the populations of interest by weighting the subjects also resulted in numerical discrepancies in the efficacy outcomes compared to the naïve data. Furthermore, unanchored MAICs assume that absolute treatment effects are constant across all prognostic factors and that these factors are balanced after adjusting for covariates. To mitigate residual imbalances, IPD from PROLONG-9FP was used to adjust for the key prognostic factors that were identified as the most important [8].

Conclusion

Overall, the results demonstrated that, relative to rFIXFc, rIX-FP provided a statistically significant reduction in AsBR and a statistically significant increase in the proportions of subjects experiencing no bleeding events, no spontaneous bleeding events, and no joint bleeding events. Weekly prophylaxis treatment with rIX-FP was also associated with numerically favorable reductions in ABR and AjBR compared to rFIXFc. On the basis of these findings, prophylactic treatment of HB with rIX-FP may offer improved clinical benefits compared to rFIXFc.