Key Summary Points

Why carry out this study?

Uncertainty about survival in patients suffering with progressive fibrosing interstitial lung diseases (PF-ILDs) hinders clinical decision-making and economic modelling for developing new treatments and bringing them to market

We reasoned that we could use trial data from patients with idiopathic pulmonary fibrosis (IPF) to estimate survival of patients with other PF-ILDs

What was learned from this study?

Various models led to consistent estimates of overall survival in patients with PF-ILDs other than IPF, suggesting the reliability of our approach

Median overall survival since starting treatment with nintedanib was 6.34–6.50 years. The equivalent estimate was 3.42–3.76 years if PF-ILDs were left untreated

These survival estimates may help clinicians and patients make evidence-based decisions about treating and managing PF-ILDs other than IPF, and they may accelerate development of new treatments

Introduction

Progressive fibrosing interstitial lung diseases (PF-ILDs) are relatively rare conditions in which patients experience a decline in lung function that reduces their health-related quality of life and often leads to premature death [1,2,3]. PF-ILDs include idiopathic pulmonary fibrosis (IPF), idiopathic non-specific interstitial pneumonia, autoimmune interstitial lung diseases, sarcoidosis and exposure-related diseases such as asbestosis. These diseases show limited response to standard anti-inflammatory and immunosuppressive therapies [4, 5].

IPF is the best-studied PF-ILD [1, 2, 6]. Much less is known about the survival of patients with other PF-ILDs. The most extensive clinical data on other PF-ILDs have come from the phase 3 INBUILD trial involving 663 patients. That trial showed that nintedanib significantly slows disease progression, based on analysis of forced vital capacity (FVC) during 1 year. Whether the drug increases survival remains unclear.

Uncertainty about the survival of patients with PF-ILDs other than IPF makes it difficult for clinicians and patients to make evidence-based decisions about their treatment and management. It also prevents reliable economic modelling of costs and benefits of potential treatments for PF-ILDs, which payers, healthcare insurers, and pharmaceutical companies use to justify long-term investment in treatment allocation and drug development. Indeed, the ability of a new treatment to improve survival may be regarded by regulatory agencies, insurers and other stakeholders as a stronger argument than its ability to slow disease progression. However, survival data that have been collected over a sufficiently long period from an adequate number of patients are lacking for PF-ILDs other than IPF. A similar problem exists for rare and orphan diseases more generally, as well as for diseases that have only recently been defined clinically.

The Bayesian framework relies on a synthesis of evidence, such as historical data or subjective beliefs, to generate an informative prior, which is a prior belief of what the trial evidence (likelihood) may show. Bayesian extrapolation has been proposed to predict drug efficacy for children based on efficacy reported for adults [7], and the US Food and Drug Administration and the European Medicines Agency have integrated this approach into their guidance [8, 9]. The Decision Support Unit of the National Institute for Health and Care Excellence in the UK acknowledges the importance of “flexible survival models”, based on extrapolation using Bayesian methods or other techniques, in health technology assessments [10].

We attempted to use Bayesian methods to exploit the long-term survival data available for IPF, reflecting follow-up of up to 5.90 years [11], in order to estimate the survival of patients with other PF-ILDs. IPF is similar to other PF-ILDs in terms of the rate of FVC decline [2, 3, 12, 13]. We hypothesised that the survival would also be similar. First, we used propensity score matching to match patients with IPF in the TOMORROW, INPULSIS-1, INPULSIS-2 and INPULSIS-ON trials to patients with other PF-ILDs in the INBUILD trial. Next, we modelled the survival of the matched patients with IPF, and we used the best models from those patients to generate informative priors in Bayesian estimation of the survival of matched patients with other PF-ILDs, based on the method outlined in Soikkeli et al. [14]. The resulting survival estimates provide a foundation to understand the progression of other PF-ILDs and to promote the development of new treatments. Our approach may also be helpful in other rare, orphan or newly defined diseases for which survival evidence is limited.

Methods

Data Sources and Outcome

We considered patients with IPF who received nintedanib (300 mg/day) or placebo in the 52-week phase 2 TOMORROW trial (NCT00514683) [15] and the 52-week phase 3 INPULSIS-1 and INPULSIS-2 trials (NCT01335464 and NCT01335477) [15]. Long-term IPF data for patients receiving nintedanib were obtained from the open-label INPULSIS-ON extension trial (NCT01619085) [11]. Across the trials, patients with IPF received nintedanib for a median of 2.1 years (range 0.1–5.9 years).

We considered patients with PF-ILDs other than IPF who received nintedanib (300 mg/day) in the 52-week phase 3 INBUILD trial (NCT02999178) [16].

Overall survival (OS) was defined as the time between a patient’s first and last study visits, which corresponded to the time from treatment initiation until death or the last visit. The first study visit corresponded to the time when patients started treatment with nintedanib or placebo. Patients who were alive at the time of their last study visit were censored at that visit.
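The OS definition above can be illustrated with a minimal Python sketch. The `Patient` record and its field names are hypothetical and are not taken from the trial datasets; the conversion from days to years using 365.25 days per year is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Patient:
    # Hypothetical fields: study days counted from an arbitrary origin.
    first_visit_day: int          # day of treatment initiation
    last_visit_day: int           # day of the last study visit
    death_day: Optional[int]      # None if the patient was alive at the last visit


def overall_survival(p: Patient) -> Tuple[float, int]:
    """Return (time in years, event), where event=1 is death and event=0 is censoring."""
    if p.death_day is not None:
        return (p.death_day - p.first_visit_day) / 365.25, 1
    # Alive at the last visit: censor at that visit.
    return (p.last_visit_day - p.first_visit_day) / 365.25, 0


censored = overall_survival(Patient(first_visit_day=0, last_visit_day=730, death_day=None))
died = overall_survival(Patient(first_visit_day=0, last_visit_day=400, death_day=400))
```

Each patient therefore contributes a follow-up time and an event indicator, which is the usual input format for parametric survival models.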

Our analyses were based on previously conducted studies and did not involve the conduct of any new studies with human participants or animals.

Propensity Score Matching

Patients with IPF were matched to patients with other PF-ILDs on the basis of their propensity score to ensure similar baseline characteristics and disease severity between the two populations. The following baseline characteristics were used in the matching: age, sex, ethnicity (Asian vs other), time since diagnosis, haemoglobin-corrected percentage of predicted CO diffusion capacity (DLco), percentage predicted FVC, and smoking status (never smoked, used to smoke, currently smokes). Patients for whom data were unavailable for any of these characteristics were excluded from the analysis.

Scores were matched using a kernel approach, in which patients with IPF with a propensity score closer to that of a patient with another PF-ILD were weighted more than those with scores farther away; and using a radius approach, in which all patients with IPF within a certain radius of a patient with another PF-ILD were weighted the same [17]. In both approaches, we tested radii of 0.1 and 0.05. Matching was performed separately for patients who received nintedanib or placebo. Propensity score matching was conducted using Stata IC version 14.2 (StataCorp, College Station, TX, USA).
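The difference between the two weighting schemes can be sketched as follows. This is an illustration only, not the Stata routine used in the analysis: the function names are hypothetical, the Epanechnikov kernel is an assumed choice (the text does not name the kernel), and the propensity scores in the example are invented.

```python
from typing import List


def radius_weights(target_score: float, donor_scores: List[float], radius: float) -> List[float]:
    """Give equal weight to every donor whose score lies within `radius` of the target."""
    inside = [abs(s - target_score) <= radius for s in donor_scores]
    n_inside = sum(inside)
    if n_inside == 0:
        return [0.0] * len(donor_scores)   # no match available for this target
    return [1.0 / n_inside if flag else 0.0 for flag in inside]


def kernel_weights(target_score: float, donor_scores: List[float], bandwidth: float) -> List[float]:
    """Weight donors by closeness (assumed Epanechnikov kernel); zero beyond the bandwidth."""
    raw = []
    for s in donor_scores:
        u = (s - target_score) / bandwidth
        raw.append(0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0)
    total = sum(raw)
    return [w / total for w in raw] if total > 0 else raw


# Invented scores: one patient with another PF-ILD and four candidate patients with IPF.
ipf_scores = [0.45, 0.52, 0.58, 0.70]
w_radius = radius_weights(0.50, ipf_scores, radius=0.1)
w_kernel = kernel_weights(0.50, ipf_scores, bandwidth=0.1)
```

In this example the radius approach weights the three patients with IPF inside the radius equally, whereas the kernel approach gives the largest weight to the patient whose score is closest to 0.50.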

The ability of the matching to minimise differences between patients with IPF or other PF-ILDs was assessed in terms of bias; Rubin’s B, defined as the absolute standardised difference in mean linear propensity score index between the two groups; and Rubin’s R, defined as the ratio of propensity score index variances in the two groups. We considered the matching successful if bias was less than 5%, Rubin’s B was less than 25 and Rubin’s R was between 0.5 and 2.0 [18].
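A minimal sketch of how Rubin’s B and R could be computed is given below. It assumes that the linear propensity score index is the logit of the propensity score and that the standardisation uses the square root of the mean of the two group variances; it also ignores matching weights for simplicity. The function names are hypothetical.

```python
import math
from statistics import mean, variance
from typing import List, Tuple


def logit(p: float) -> float:
    """Linear propensity score index (assumed here to be the logit of the score)."""
    return math.log(p / (1.0 - p))


def rubin_b_and_r(scores_a: List[float], scores_b: List[float]) -> Tuple[float, float]:
    """Return (B, R) for two groups of propensity scores, unweighted."""
    idx_a = [logit(p) for p in scores_a]
    idx_b = [logit(p) for p in scores_b]
    var_a, var_b = variance(idx_a), variance(idx_b)
    pooled_sd = math.sqrt((var_a + var_b) / 2.0)
    b = 100.0 * abs(mean(idx_a) - mean(idx_b)) / pooled_sd   # expressed as a percentage
    r = var_a / var_b                                        # ratio of index variances
    return b, r


def balance_ok(b: float, r: float) -> bool:
    """Thresholds stated in the text: B below 25 and R between 0.5 and 2.0."""
    return b < 25.0 and 0.5 <= r <= 2.0


b_same, r_same = rubin_b_and_r([0.2, 0.4, 0.6], [0.2, 0.4, 0.6])
```

For two identical groups, B is 0 and R is 1, so `balance_ok` returns `True`.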

Bayesian Survival Analysis

We aimed to extend the approach of Soikkeli et al. [14] to the extrapolation of data from one disease (IPF) to clinically similar ones (other PF-ILDs). First, we tested the following seven standard frequentist survival models to determine the best-fitting model of the survival data for the matched patients with IPF: exponential, Weibull, log-normal, log-logistic, generalised gamma, gamma and Gompertz. All these models are routinely used in health technology assessments of new treatments [19, 20]. Modelling was performed using the “flexsurv” package in R 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria) [21]. Model fit was assessed by visual inspection and in terms of the Akaike and Bayesian information criteria, where lower values indicate better model fit. We identified the three best-fitting models and used them to generate informative priors for the “shape” parameter of the Bayesian model. Bayesian modelling was carried out using OpenBUGS 3.2.3 (revision 1012; MRC Biostatistics Unit, Cambridge, UK). The OpenBUGS code and further details of the analysis are provided in the online Supplementary Material. The number of iterations and burn-in are detailed in Table S1 in the online Supplementary Material. If autocorrelation was high, thinning factors were applied to determine whether they affected parameter estimates. If not, estimates without the thinning factor were used.
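The information criteria used for model selection can be illustrated with the simplest of the seven models. The sketch below is not the R code used in the analysis: it fits an exponential model to right-censored data using the closed-form maximum likelihood estimate and then computes both criteria. The function names and the toy data are hypothetical, and the number of patients is assumed to be the sample size in the Bayesian information criterion.

```python
import math
from typing import List, Tuple


def exponential_fit(times: List[float], events: List[int]) -> Tuple[float, float]:
    """Fit an exponential model to right-censored data (requires at least one death).

    The maximum likelihood rate is deaths / total follow-up time, and the
    log-likelihood is deaths * log(rate) - rate * total follow-up time.
    """
    deaths = sum(events)
    total_time = sum(times)
    rate = deaths / total_time
    loglik = deaths * math.log(rate) - rate * total_time
    return rate, loglik


def aic(loglik: float, n_params: int) -> float:
    """Akaike information criterion; lower values indicate better fit."""
    return 2 * n_params - 2 * loglik


def bic(loglik: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion; n_obs is assumed to be the number of patients."""
    return n_params * math.log(n_obs) - 2 * loglik


# Toy data: follow-up in years and event indicators (1 = death, 0 = censored).
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 0, 1, 0]
rate, loglik = exponential_fit(toy_times, toy_events)
model_aic = aic(loglik, n_params=1)
model_bic = bic(loglik, n_params=1, n_obs=len(toy_times))
```

The same two functions apply to any of the candidate models once its maximised log-likelihood and number of parameters are known, which allows the models to be ranked on a common scale.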

Patient and Public Involvement

Patients were not involved in the design or conduct of this analysis.

Results

Patients

Of the 1239 patients with IPF in the TOMORROW, INPULSIS-1, INPULSIS-2 and INPULSIS-ON trials, 140 were removed during propensity score matching, leaving 1099 matched patients, of whom 640 received nintedanib and 459 received placebo (Fig. 1). Of the 663 patients with other PF-ILDs in the INBUILD trial, nine were removed during propensity score matching, leaving 654 matched patients, of whom 326 received nintedanib and 328 received placebo. Of all the radius matching algorithms tested for the nintedanib and placebo groups, the algorithm with a radius of 0.1 produced the best results: a Rubin’s B score of 13.3 for nintedanib and 12.8 for placebo, and respective Rubin’s R scores of 0.96 and 0.89. After matching, small bias was observed in the percentage of predicted FVC and the percentage of predicted DLco. Nevertheless, the residual differences between the matched patients with IPF or other PF-ILDs were not clinically significant (Table 1). There was sufficient overlap of propensity scores within each group of matched patients (Fig. S1 in Supplementary Material), and the matching process reduced the standardised percentage bias across covariates (Fig. S2 in Supplementary Material).

Fig. 1 Summary of the study procedure. IPF idiopathic pulmonary fibrosis, PF-ILD progressive fibrosing interstitial lung disease

Table 1 Baseline characteristics of patients with IPF or other PF-ILDs after propensity score matching

Bayesian Estimation of OS

On the basis of the Akaike and Bayesian information criteria, three survival models gave good fits to the survival data for matched patients with IPF who received nintedanib or placebo: gamma, log-logistic and Weibull (Table S3 in Supplementary Material). For matched patients with IPF who received nintedanib, the log-logistic model estimated the highest median OS (6.48 years), while estimates were lower with the gamma model (6.13 years) and Weibull model (6.06 years). For patients who received placebo, the gamma and log-logistic models gave similar median OS estimates (2.93 or 3.00 years), while the Weibull model gave a lower median OS (2.61 years).

Since the three models visually fit the matched IPF data (Fig. 2), all three were considered in the Bayesian analysis to estimate OS for patients with other PF-ILDs. Diagnostic plots for the three models suggested good convergence across model parameter estimates (Figs. S3–S8 in Supplementary Material).

Fig. 2 Modelling of overall survival of matched patients with IPF using gamma, log-logistic or Weibull models. Model output is shown against the corresponding trial data. IPF idiopathic pulmonary fibrosis

The complete results for the different model parameters estimated in the Bayesian analysis are provided in Table S4 in the Supplementary Material. For patients who received nintedanib, all three models gave similar estimates of median OS (6.34–6.50 years) and 5-year OS rates (59–60%) (Table 2, Fig. 3). For patients who received placebo, the three models gave similar estimates of median OS (3.42–3.76 years), but the Weibull model estimated a substantially lower 5-year survival rate (21%) than the other models (32% or 34%).

Table 2 Estimates of OS for patients with PF-ILDs other than IPF based on Bayesian extrapolation from survival data for patients with IPF

Fig. 3 Estimation of overall survival of patients with progressive fibrosing interstitial lung diseases (PF-ILDs) other than idiopathic pulmonary fibrosis (IPF), based on Bayesian extrapolation from trial data for patients with IPF. Extrapolation was performed according to the (a) gamma, (b) log-logistic or (c) Weibull model

Discussion

Drawing on available survival data for patients with IPF, we used a Bayesian methodology to provide the first long-term OS estimates for patients with other PF-ILDs whose disease is left untreated or is treated with nintedanib. These estimates may help clinicians and patients make informed decisions about disease treatment and management, and they may help drug manufacturers and healthcare agencies more reliably estimate the costs and benefits associated with proposed treatments.

We estimated that when untreated, patients with PF-ILDs other than IPF show median OS of 3.42–3.76 years and 5-year survival of 21–34%. When treated with nintedanib, their median OS increases to 6.34–6.50 years and 5-year survival to 59–60%. While the three best-performing models for Bayesian extrapolation gave consistent results for median OS, the Weibull model gave a substantially lower 5-year survival estimate for placebo (21%) than the gamma and log-logistic models gave (32–34%). Given that a systematic review has reported a ≥ 5-year survival rate of 31% for patients with IPF not receiving antifibrotic treatment [22], it is likely that the Weibull model underestimates survival compared with the other two options. Our estimates may be refined in the future, as the ongoing open-label INBUILD-ON extension trial (NCT03820726) continues to provide new data. Our analysis suggests that to observe death of at least 50% of patients in clinical trials or registries, follow-up would need to exceed 3 years in the placebo arm and 6 years in the nintedanib arm.
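Why models with similar median OS can disagree on 5-year survival can be illustrated with the closed-form survival functions of the Weibull and log-logistic distributions. The sketch below uses a hypothetical median of 3.5 years and a hypothetical shape of 1.5 for both models; these values are chosen for illustration only and are not the fitted parameters from this analysis.

```python
import math


def weibull_survival(t: float, shape: float, scale: float) -> float:
    """Weibull survival function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_median(shape: float, scale: float) -> float:
    """Weibull median: scale * (ln 2) ** (1 / shape)."""
    return scale * math.log(2.0) ** (1.0 / shape)


def loglogistic_survival(t: float, shape: float, scale: float) -> float:
    """Log-logistic survival function S(t) = 1 / (1 + (t / scale) ** shape)."""
    return 1.0 / (1.0 + (t / scale) ** shape)


# Hypothetical parameters: both models are forced to share a median of 3.5 years.
median_years = 3.5
shape = 1.5
weibull_scale = median_years / math.log(2.0) ** (1.0 / shape)
loglogistic_scale = median_years   # the log-logistic median equals its scale

s5_weibull = weibull_survival(5.0, shape, weibull_scale)
s5_loglogistic = loglogistic_survival(5.0, shape, loglogistic_scale)
```

Under these assumptions the two curves cross at the shared median, after which the heavier tail of the log-logistic distribution yields a higher 5-year survival than the Weibull distribution. This is the same direction of difference as that observed for the placebo arm.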

The antifibrotic drug nintedanib has been shown to slow FVC decline in patients with IPF and other PF-ILDs [23], and it may also improve OS of patients with IPF, based on an exploratory extrapolation of trial data [24]. The present extrapolation suggests that the same may be true for other PF-ILDs: the difference in median OS between the nintedanib and placebo groups was 2.61–3.03 years across all three models that we used to generate IPF-based priors for Bayesian extrapolation. While our results require validation with real-world evidence, they justify further investment in nintedanib.

In fact, our results may help accelerate the development of new treatments for other PF-ILDs, since nintedanib is currently the only drug licensed for these conditions. Our approach may also prove useful for rationalising and stimulating investment in new treatments for other rare, orphan or newly described diseases where the lack of patient data may create too much uncertainty for drug or device manufacturers to justify investment. The reliability of such extrapolations from one disease to another depends on clinical similarity between the two. Here we relied on the demonstrated similarity in FVC decline between IPF and other PF-ILDs. Indeed, Simpson et al. showed that survival rates were similar between patients with IPF and those with other PF-ILDs during follow-up of approximately 2.5 years [13].

Our estimates should be treated with caution as no long-term IPF survival data were available to generate priors for the placebo arm. These patients either discontinued or crossed over to nintedanib in the open-label extension and were therefore censored in the present study. In addition, we did not take into account potential confounding due to the use of anti-inflammatory medications by patients with other PF-ILDs in the INBUILD trial. That trial allowed patients to take such medications in addition to the study drug.

Despite these limitations, our analysis demonstrates the feasibility of extrapolating from long-term survival data for patients with one disease to estimate survival of patients with a clinically related disease. The resulting estimates can improve treatment decisions as well as payer treatment allocation decisions.

Conclusions

We used a Bayesian approach to estimate survival in one disease based on clinical trial data from a similar disease, allowing us to provide the first estimates of long-term overall survival for patients with PF-ILDs other than IPF. Our analysis suggests that nintedanib may prolong their survival, justifying further investment in the drug. Our approach may prove useful for economic modelling of rare, orphan and newly defined disorders for which only limited survival data are available.