Key Points for Decision Makers

Early extubation to non-invasive ventilation (NIV) did not shorten time to liberation from ventilation.

However, the probability of NIV being cost effective relative to weaning without NIV was modest: between 57% and 59%.

For patients with chronic obstructive pulmonary disease, the probability of cost effectiveness of NIV was much higher (82–87%).

Future trials with extended follow-up are needed to reduce uncertainty surrounding the long-term cost effectiveness of NIV.

1 Introduction

Optimising techniques to wean patients from invasive mechanical ventilation (IMV) remains a key goal of intensive care practice [1, 2]. To date, no definitive guidelines exist on the best approach to use in the intensive care unit (ICU) [2]. Evidence from structured and systematic literature reviews of studies, including randomised controlled trials (RCTs), suggests that the use of non-invasive ventilation (NIV) as a weaning strategy (transitioning patients who are difficult to wean to early NIV) may reduce mortality, ventilator-associated pneumonia and ICU length of stay, although this beneficial effect may be limited to patients with chronic obstructive pulmonary disease (COPD) [3, 4, 8].

In light of current clinical practices [2], the findings of existing trials are of limited generalisability to clinical settings across a number of industrialised countries as treatment pathways for COPD exacerbations vary across settings and over time. Whereas patients with respiratory failure would have previously received IMV, in the UK for example, this is now largely reserved for patients who fail a trial of NIV. Furthermore, few published studies have reported the impact of NIV on health-related quality of life (HRQoL) outcomes [3] or economic costs associated with NIV in critical care settings [5, 6]. Where these have been reported, sample sizes are small [6] or focus has solely been on patients with an exacerbation of COPD. Importantly, only one study, conducted in Canada, has estimated the cost effectiveness of NIV as a weaning strategy in patients with COPD [7], whereas, to our knowledge, no study has estimated its cost effectiveness in patients within the ICU presenting with or without COPD.

It is crucial to evaluate the cost effectiveness of NIV before its use can be considered more widely. We therefore present a health economic evaluation from a multicentre RCT comparing protocolised weaning that includes early extubation onto NIV versus weaning without NIV (Breathe study; ISRCTN15635197).

2 Materials and Methods

2.1 Trial Background

Details of the design and clinical outcome measures for the Breathe study are reported elsewhere [8]. Briefly, patients aged ≥ 16 years recruited from 41 UK critical care units were eligible for randomisation if they had received IMV for over 48 h, were ready to wean and had failed a spontaneous breathing trial (SBT). Patients were randomised to receive either protocolised weaning with extubation to NIV (non-invasive weaning) or protocolised weaning via IMV with daily SBTs (invasive weaning). NIV refers to the delivery of mechanical ventilation without the need for an endotracheal airway. Positive pressure ventilation is delivered to the patient through the mouth or nose via an interface such as a mask or helmet.

Clinicians were permitted to use one of three types of SBT in accordance with local unit practices: a T-piece trial, use of continuous positive airway pressure (CPAP) or low-level pressure support (5–7 cm H2O). A T-piece trial involves the patient breathing spontaneously through their endotracheal tube, with the appropriate inspired oxygen concentration being maintained by a cross-flow device (T-piece). CPAP involved leaving a standing pressure of 5–10 cm H2O delivered via the ventilator at the top of the endotracheal tube but with no assistance on inspiration. A low-level pressure support trial provided 5–7 cm H2O inspiratory assistance.

Each SBT was scheduled to last for at least 30 min and could be increased up to 120 min in patients considered to be at higher risk of re-intubation (e.g. prolonged ventilation, past history of COPD, heart failure). During SBTs, patients were closely monitored for the following signs of distress or fatigue: heart rate > 20% of baseline or > 140 beats min−1; systolic blood pressure > 20% of baseline or > 180 mmHg or < 90 mmHg; cardiac arrhythmias; respiratory rate > 50% of baseline value or > 35 min−1, respiratory rate (min)/tidal volume (L) > 105 min−1 L−1; arterial blood gases; clinical assessments such as agitation, anxiety or depression. A patient was considered to pass the SBT if no signs of distress or fatigue developed. A patient who displayed any sign of distress or fatigue was judged to have failed the SBT.

Clinicians were provided with information about the Walsh criteria, which were suggested as guidance to indicate when the patient was ready to commence weaning [9]. The Walsh criteria recommend that meeting all the following conditions indicates readiness for weaning: cooperative and pain free; good cough; PaO2:FiO2 (ratio of arterial oxygen partial pressure to fractional inspired oxygen) > 24 kPa; positive end-expiratory pressure < 10 cm H2O; haemoglobin > 7 g dL−1; axillary temperature 36–38.5 °C; vasoactive drugs reduced or unchanged over previous 24 h; and spontaneous ventilatory frequency > 6 breaths per minute. Heated humidified oxygen (on both invasive ventilation and NIV) was not mandated or recommended in the study protocol but could be used at clinician discretion in accordance with local unit policy.

A sample size of 364 (90% power, two-sided 5% type I error), allowing for a 23% dropout rate, was required to detect a clinically meaningful median difference of 24 h between the non-invasive and invasive group for the primary outcome of time to liberation from ventilation.

2.2 Study Perspective and Time Horizon

The primary economic analysis was undertaken from the perspective of the UK National Health Service (NHS) and personal social services (PSS) [10]. The time horizon for the within-trial economic evaluation was limited to a follow-up period of 6 months. In addition, a 5-year time horizon was considered for a long-term cost-effectiveness analysis. A discount rate of 3.5% per annum was applied to both costs and effects during years 2–5 for the longer-term cost-effectiveness analysis [10].
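The discounting rule described above can be illustrated with a minimal sketch. It assumes that year 1 is left undiscounted and that year t (t ≥ 2) is discounted by 1/(1 + 0.035)^(t − 1); the function names and the yearly values are hypothetical and are not taken from the trial analysis.

```python
def discount_factor(year, rate=0.035):
    """Discount factor for a 1-indexed year; year 1 is left undiscounted (assumed convention)."""
    if year < 1:
        raise ValueError("year must be >= 1")
    return 1.0 / (1.0 + rate) ** (year - 1)


def discounted_total(yearly_values, rate=0.035):
    """Sum yearly costs or QALYs after discounting years 2 onwards."""
    return sum(
        value * discount_factor(year, rate)
        for year, value in enumerate(yearly_values, start=1)
    )


# Hypothetical stream: 1,000 units (costs or scaled effects) in each of 5 years.
total = discounted_total([1000.0] * 5)
print(round(total, 2))
```

The same factor is applied to costs and effects, in line with the equal discount rate stated for both.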

2.3 Measurement and Valuation of Resource Use

Resource use data were collected from randomisation to 6 months post-randomisation using case report forms (for initial hospitalisations) and participant questionnaires. The resource and cost components associated with the intervention were aggregated into five groups:

  1. intensive care support, including organ support, level of care and use of sedatives,
  2. tracheostomies,
  3. use of high-cost antivirals and antifungals or other high-cost drugs in the ICU,
  4. hospital care between ICU discharge and hospital discharge,
  5. use of any emergency transport to transfer patients between hospital sites.

For critical care stays, healthcare resource groups (HRGs) were assigned according to the maximum number of organ systems supported daily during each stay. Critical care HRGs value standard resource expenditures (e.g. staffing, consumables, diagnostics), such that high-cost drugs and interventions were separately collected and valued to derive a total overall critical care cost. Following critical care discharge, the costs of step-down care were based on the number of days spent in each step-down ward/facility (until death or discharge) multiplied by the respective per diem cost for that level of care [11]. Further details of healthcare resource use can be found in Table 1 in the electronic supplementary material (ESM) and include the number of organs supported, the number of days in ICU, highest level of care and details of tracheostomies, antifungal/antiviral use, inpatient and outpatient care, residential care services, community health and social care, frequency of prescription medications, equipment and aids and additional health resource used by patient or carer/supporter.

Table 1 Economic costs for complete cases for entire follow-up period, by trial allocation and cost category (£; year 2015–16 values)

Differences in resource use between the two intervention groups for the period between randomisation and hospital discharge were determined by (1) comparing the number of days that patients had two or more organs supported while in the ICU (or alternatively, days in level 3 care); (2) comparing differences in overall length of stay in the ICU; and (3) comparing proportions of patients who required tracheostomies, used high-cost antifungals, used high-cost antivirals or needed emergency transport for transfers between hospitals. Patients requiring support for two or more organs and those receiving advanced respiratory support alone were considered to receive level 3 care [12].

Broader health and PSS resource use data (e.g. hospital readmissions, contacts with community health and social care professionals, medication use) were collected at 3 and 6 months post-randomisation using postal questionnaires completed by participants or their primary carers. We also collected data on direct non-medical costs (including travel expenses) incurred by patients and their caregivers, days off work and loss of earnings. Resource use values were converted into costs by applying unit costs obtained from key UK national databases [13,14,15,16,17,18,19,20] (Table 2 in the ESM).

All costs were expressed in £ sterling (year 2015–2016 values). Where appropriate, costs were inflated to year 2015–2016 values using the NHS Hospital and Community Health Services Pay and Prices index [21].

2.4 Measurement and Valuation of Health Outcomes

The primary outcome for the economic evaluation was the quality-adjusted life-year (QALY) [10]. HRQoL was assessed using the three-level EuroQol 5-Dimensions (EQ-5D-3L) [22] at 3 and 6 months post-randomisation. The EQ-5D-3L descriptive system consists of five dimensions (mobility, self-care, usual activities, pain/discomfort and anxiety/depression), each divided into three ordinal levels: (1) no problems, (2) some or moderate problems, and (3) severe or extreme problems. The UK time trade-off tariff was applied to each set of responses to generate an EQ-5D-3L utility score for each participant [23]. Given the challenges of collecting baseline data from patients in critical care settings, we assumed in the baseline analysis that the baseline utility value was − 0.402, the value assigned by the EQ-5D-3L tariff to an unconscious health state [24]. QALY values for each patient were calculated as the area under the baseline-adjusted utility curve [25] assuming a fixed baseline of − 0.402 (equivalent to an unconscious health state) and using linear interpolation between baseline and follow-up utility scores. QALYs were also derived from 6-Dimension Short-Form survey (SF-6D) utilities (UK tariff), generated from responses to the SF-12 as a sensitivity analysis [26], assuming a baseline utility value of zero. Patients who survived the initial hospital admission were also asked to recollect their pre-admission health state using both the EQ-5D-3L and the SF-12 questionnaires.
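The area-under-the-curve calculation with linear interpolation can be sketched as follows. This is a simplified illustration only: it applies the trapezoidal rule to one patient's utility profile with the fixed baseline of −0.402, and it does not reproduce the regression-based baseline adjustment [25]. The follow-up utilities and the function name are hypothetical.

```python
def qalys_auc(times_years, utilities):
    """Area under the utility curve by the trapezoidal rule (linear interpolation)."""
    if len(times_years) != len(utilities) or len(times_years) < 2:
        raise ValueError("need at least two matched time/utility points")
    total = 0.0
    for i in range(1, len(times_years)):
        width = times_years[i] - times_years[i - 1]
        if width <= 0:
            raise ValueError("times must be strictly increasing")
        total += width * (utilities[i] + utilities[i - 1]) / 2.0
    return total


BASELINE_UTILITY = -0.402  # fixed baseline (unconscious state), as stated in the text

# Hypothetical patient: utility 0.50 at 3 months and 0.60 at 6 months.
q = qalys_auc([0.0, 0.25, 0.5], [BASELINE_UTILITY, 0.50, 0.60])
print(round(q, 5))
```

Because the baseline utility is negative, the first 3-month interval contributes very little to the 6-month QALY total in this illustration.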

2.5 Missing Data

Multiple imputation under chained equations [27] was used for missing resource use or HRQoL data, based on the tested assumption that data were missing at random. Regression models were used to estimate missing costs and QALYs at each time point, by treatment allocation, conditional on fully observed baseline variables: age, sex, randomisation centre, presence/absence of COPD, non-operative/operative status and post-SBT PaCO2. In total, 20 datasets were generated using predictive mean matching. Estimates obtained were pooled to generate mean and variance estimates for costs and QALYs in each allocation group over the trial time horizon using Rubin’s rule [28]. All mean incremental costs take into account heterogeneity in baseline costs, leading to estimates of incremental costs that would have adjusted for costs arising from differences between groups in terms of severity at baseline [8].
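Pooling across imputed datasets with Rubin's rules can be sketched as below. The pooled estimate is the mean of the per-dataset estimates, and the total variance is the within-imputation variance plus (1 + 1/m) times the between-imputation variance. The three estimates and variances shown are hypothetical (the trial generated 20 datasets), and the function name is an assumption for illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matched estimates and variances from >= 2 imputations")
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations (m - 1 denominator)
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance


# Hypothetical incremental-cost estimates from three imputed datasets.
est, var = pool_rubin([-300.0, -320.0, -280.0], [100.0, 110.0, 90.0])
print(est, round(var, 2))
```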

2.6 Analyses of Resource Use, Costs and Outcome Data

Economic values were summarised by treatment group, resource category and assessment time; differences between groups were analysed using two-sample t tests. Non-parametric bootstrapping, based on 10,000 replications (10,000 was expected to stabilise the confidence intervals [CIs] for point estimates), was used to assess whether differences in mean total costs between allocation groups were statistically significant. EQ-5D-3L utility scores were compared using two-sample t tests.
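A non-parametric bootstrap of the difference in mean costs can be sketched as follows. The sketch assumes a percentile confidence interval with resampling within each arm; the text does not state which interval type was used, so this is one plausible choice rather than the trial's method. The cost values and the function name are hypothetical.

```python
import random


def bootstrap_mean_diff_ci(group_a, group_b, n_boot=10_000, alpha=0.05, seed=1):
    """Percentile bootstrap CI for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)  # seeded for reproducibility
    diffs = []
    for _ in range(n_boot):
        # Resample each arm with replacement, keeping the original arm sizes.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(sum(resample_a) / len(resample_a) - sum(resample_b) / len(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical per-patient total costs (GBP) in each arm.
niv_costs = [21000.0, 35000.0, 18000.0, 40000.0, 27000.0, 30000.0]
imv_costs = [25000.0, 38000.0, 22000.0, 45000.0, 29000.0, 33000.0]
lower, upper = bootstrap_mean_diff_ci(niv_costs, imv_costs)
print(lower, upper)
```

If the interval spans zero, the difference in mean costs would not be judged statistically significant at the chosen level.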

2.7 Cost-Effectiveness Analysis

A cost-effectiveness analysis using individual patient-level data was conducted. Cost-effectiveness results were expressed in terms of an incremental cost-effectiveness ratio (ICER) and calculated by dividing the difference between trial arms in mean total costs by the difference in mean total QALYs. Value-for-money assessments involved comparing the ICER value with a range of cost-effectiveness thresholds. Cost-effectiveness thresholds held by UK decision makers typically range between £20,000 and 30,000 per QALY [10].
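The ICER calculation, together with the usual handling of dominance, can be sketched as below. The labels, function name and input values are hypothetical; the sketch simply divides the difference in mean costs by the difference in mean QALYs when neither strategy dominates.

```python
def summarise_icer(delta_cost, delta_qaly):
    """Return (label, icer) for an intervention versus its comparator.

    delta_cost and delta_qaly are intervention minus comparator.
    """
    if delta_cost < 0 and delta_qaly > 0:
        return "dominant", None      # cheaper and more effective
    if delta_cost > 0 and delta_qaly < 0:
        return "dominated", None     # more costly and less effective
    if delta_qaly == 0:
        return "undefined", None     # ratio cannot be computed
    # Note: when both differences are negative the ratio is read as
    # savings per QALY forgone rather than cost per QALY gained.
    return "ratio", delta_cost / delta_qaly


# Hypothetical inputs.
print(summarise_icer(2000.0, 0.25))   # ('ratio', 8000.0)
print(summarise_icer(-300.0, 0.02))   # ('dominant', None)
```

A ratio below the decision maker's threshold (for example £20,000 per QALY) would suggest value for money.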

Several types of uncertainty analyses were undertaken. Stochastic uncertainty was presented in terms of CIs; decision uncertainty was assessed using various cost-effectiveness thresholds to determine the incremental net monetary benefit (INMB); and uncertainty from heterogeneity was implicitly assessed by modelling the pre-specified covariates (presence/absence of COPD, operative status, baseline costs, baseline utility) considered to be related to cost-effectiveness outcomes. In addition, 10,000 estimates of incremental costs and benefits were generated through non-parametric bootstrapping to determine the level of sampling uncertainty around the ICER. The bootstrap replicates were used to populate cost-effectiveness scatterplots. We also calculated the INMB of using NIV versus IMV across three cost-effectiveness thresholds: £15,000 [29], £20,000 and £30,000 per QALY gained. A positive INMB indicates that the intervention is cost effective compared with the alternative at the given cost-effectiveness threshold. Cost-effectiveness acceptability curves (CEACs) summarised the likelihood that NIV was cost effective as the cost-effectiveness threshold varied.
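The relationship between bootstrap replicates, the INMB and the CEAC can be sketched as follows. The INMB is taken as the threshold multiplied by the incremental QALYs minus the incremental cost, and each CEAC point is the share of replicates with a positive INMB. The four replicates are hypothetical (the analysis used 10,000), and the function names are assumptions for illustration.

```python
def inmb(delta_cost, delta_qaly, threshold):
    """Incremental net monetary benefit at a given cost-effectiveness threshold."""
    return threshold * delta_qaly - delta_cost


def ceac(replicates, thresholds):
    """Probability of cost effectiveness at each threshold.

    replicates: list of (delta_cost, delta_qaly) pairs, e.g. from bootstrapping.
    """
    curve = {}
    for threshold in thresholds:
        n_positive = sum(1 for dc, dq in replicates if inmb(dc, dq, threshold) > 0)
        curve[threshold] = n_positive / len(replicates)
    return curve


# Hypothetical bootstrap replicates of (incremental cost in GBP, incremental QALYs).
replicates = [(-500.0, 0.02), (400.0, 0.01), (1200.0, 0.05), (200.0, -0.01)]
curve = ceac(replicates, [15_000, 20_000, 30_000])
print(curve)
```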

2.7.1 Sensitivity and Subgroup Analyses

Sensitivity analyses included (1) adopting a wider societal perspective encompassing direct non-medical costs incurred by trial participants and their families and economic values placed on attributable work absences, with these data collected in the case report form by asking patients/carers whether items such as travel costs and lost income had been incurred and, if so, how much; (2) restricting analysis to complete cases; (3) using SF-6D utility scores estimated from the SF-12 for the purposes of QALY estimation; and (4) additionally using the pre-randomisation EQ-5D-3L utility value (recalled at hospital discharge) as a covariate for the purpose of QALY adjustments. Pre-specified subgroup analyses were presence/absence of COPD and operative status.

2.8 Longer-Term Economic Modelling

2.8.1 Extrapolation of Survival Data

Long-term cost effectiveness was determined over a 5-year time horizon by extrapolating survival beyond 6 months. The choice of a 5-year horizon was arbitrary, but we felt that extrapolating beyond 5 years would have generated too much uncertainty. For a cost-effectiveness analysis conducted over a lifetime horizon, using data from many patients who were censored would potentially lead to highly uncertain survival rates. Observed survival curves (Fig. 1 in the ESM) showed that most deaths occurred between randomisation and hospital discharge; a 5-year time horizon therefore limited the uncertainty of the long-term cost-effectiveness results compared with modelling over a lifetime horizon.

A flexible parametric model was used to predict survival rates at each time point [30]. Flexible parametric models are commonly used in the assessment of the cost effectiveness of cancer drugs for prediction of survival beyond trial follow-up [31]. Three parametric models were considered: exponential, Weibull and a more flexible model using cubic splines with up to four knots (points at which splines are joined). This involved fitting survival curves using the observed survival data by modelling the background (baseline hazard) risk of death over time and the risk of death due to intervention. Model selection was based on Akaike’s information criterion (AIC). The parameter estimates and corresponding standard errors were reported. The observed survival data (time to death) were used along with a censoring variable, and stratified covariates were included in the model.
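Model choice by AIC can be illustrated with a short sketch using the standard definition, AIC = 2k − 2 ln L, where k is the number of estimated parameters and ln L is the maximised log-likelihood. The log-likelihoods and parameter counts below are hypothetical placeholders, not values from the fitted trial models.

```python
def aic(log_likelihood, n_params):
    """Akaike's information criterion; smaller values indicate better fit."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) for each candidate model.
candidates = {
    "exponential": (-520.0, 1),
    "weibull": (-510.0, 2),
    "spline": (-500.0, 5),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

The penalty term 2k means that a more flexible model is preferred only if its gain in log-likelihood outweighs its additional parameters.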

2.8.2 Extrapolation of Costs and Quality-Adjusted Life-Years (QALYs)

We adopted a conservative approach to estimating (extrapolated) longer-term costs and health utilities beyond 6 months separately for each treatment group. Costs between 6 months and 5 years post-randomisation were estimated by using the observed 3- to 6-month post-randomisation total costs but also adjusting for covariates and multiplying by the predicted survival probabilities over the 5-year post-randomisation period. The trial-based estimates of health utilities (adjusted for covariates) were used as a basis for estimating utility values beyond 6 months.

The primary assumption for longer-term cost effectiveness was that future costs and utility patterns beyond 6 months were equal, so only predicted survival rates were likely to drive future costs and QALYs. Thereafter, several sensitivity analyses were undertaken for how future cost and utility patterns could behave, including (1) the 6-month utility values were carried forward (linear constancy); (2) utilities declined in a linear fashion; (3) utilities declined exponentially; (4) survival rates were lower in the NIV arm by 10% (justified by an examination of the plots of log-survival and hazard function); (5) future costs and utilities differed between arms based on 6-month values (carried forward); and (6) future costs were assumed equal but future utilities could differ. The estimated utility values in each of these alternative scenarios (adjusted for proportion of patients alive at the corresponding time point) were used to compute the 5-year QALY estimates. A further sensitivity analysis of longer-term cost-effectiveness outcomes adopted a societal perspective.
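The way utilities and predicted survival combine into 5-year QALYs can be sketched as below. This is a simplified annual-cycle illustration, assuming one utility value and one survival probability per year and discounting from year 2 at 3.5%; the trial analysis may have used finer time steps. The utility and survival values, and both function names, are hypothetical.

```python
def linear_decline(start, yearly_drop, n_years):
    """Utilities that fall by a fixed amount each year, floored at zero."""
    return [max(start - yearly_drop * i, 0.0) for i in range(n_years)]


def extrapolated_qalys(utility_by_year, survival_by_year, rate=0.035):
    """Survival-weighted, discounted QALYs; year 1 is left undiscounted."""
    total = 0.0
    for year, (u, s) in enumerate(zip(utility_by_year, survival_by_year), start=1):
        total += u * s / (1 + rate) ** (year - 1)
    return total


# Hypothetical predicted proportions alive in years 1 to 5.
survival = [0.70, 0.65, 0.60, 0.55, 0.50]

carried_forward = extrapolated_qalys([0.6] * 5, survival)                 # constant utility
declining = extrapolated_qalys(linear_decline(0.6, 0.05, 5), survival)    # linear decline
print(round(carried_forward, 3), round(declining, 3))
```

Holding utilities equal between arms, as in the primary assumption, leaves the survival vector as the only source of QALY differences.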

All statistical and cost-effectiveness analyses were undertaken using SAS® version 9.4 on a Windows platform. A published SAS macro [31] was used for flexible parametric modelling. Reporting was made in line with the CHEERS statement [32]. The trial protocol was designed by the trial investigators and was approved by South Central C Research Ethics Committee (reference 12/SC/0515).

3 Results

Details of the clinical study have been reported elsewhere [8]. Briefly, 364 patients were randomised: 182 to non-invasive weaning and 182 to invasive weaning. The primary clinical endpoint showed no clinical or statistical difference in median time to liberation from mechanical ventilation: median 4.3 vs. 4.5 days; adjusted hazard ratio 1.1; 95% CI 0.89–1.40. Early extubation to NIV did not shorten time to liberation from any ventilation. Approximately 52 and 50% of all health resource use data were complete at 3 months for the non-invasive and invasive groups, respectively; this was 51 and 46%, respectively, at 6 months (Table 3 in the ESM).

A complete QALY profile was available for about 50% of patients (179/364) (Table 3 in the ESM). Data were missing because patients either died before 3 months (n = 82), were not available to provide a response (n = 60) or withdrew from follow-up (n = 37).

3.1 Resource Use and Economic Costs

Resource use for the period between randomisation and hospital discharge was generally higher for patients allocated to the invasive group (Table 1 in the ESM). The proportion of patients who used antifungals was significantly higher in the invasive group (12% for IMV vs. 5% for NIV; p = 0.0168). Broader resource use (post-initial hospital discharge) was similar between the non-invasive and invasive groups.

The mean intervention costs from randomisation until hospital discharge were £29,697 and 32,052 for NIV and IMV participants with complete data, respectively: mean cost difference − £2355; 95% CI − 7292 to 2750; p = 0.4472 (Table 1).

Mean total NHS and PSS costs over the first 6 months post-randomisation were lower for the non-invasive group than for the invasive group: £31,711 versus 32,468; mean difference − £756.20; 95% CI − 6642 to 5246; p = 0.8321 (Table 1). Mean societal costs were £31,934 and 32,999, respectively; mean difference − £1065; 95% CI − 6804 to 5056; p = 0.7981. The wide CIs for differences in mean costs reflected the uncertainty in the estimates. Significant differences between groups in terms of baseline clinical characteristics were not observed.

3.2 Health-Related Quality-of-Life Outcomes and QALYs

There were no significant differences in EQ-5D-3L outcomes between the trial groups prior to hospital admission or at 3 months post-randomisation (Tables 4 and 5 in the ESM).

However, mean EQ-5D-3L utility scores among complete cases were significantly lower at 6 months post-randomisation for the NIV group (0.53 vs. 0.66; p = 0.0147). In contrast, the mean QALY value was higher for the NIV group (0.0928 vs. 0.0747; p = 0.4522). This apparent discrepancy arose because mean utility at a specific time point was based on observed cases and did not consider deaths (utility score of 0) or differential survival rates, whereas the QALY calculation did. Since more patients died in the invasive group, more utility scores were set to zero. Moreover, our QALY estimates are derived over time, based on modelled estimates of mean utility at each time point, taking into account heterogeneity and missing data.

3.3 Cost-Effectiveness Analysis

The base-case economic evaluation, using imputed attributable costs and QALYs and covariate adjustment, indicated that, over the first 6 months, NIV was associated with a lower net cost (− £302; 95% CI − 5490 to 4760) and a higher net effect (0.02 QALYs; 95% CI − 0.01 to 0.05) and was therefore dominant (Table 2).

Table 2 Cost effectiveness, cost per QALY (£, year 2016 values): non-invasive vs. invasive weaning

The simulated ICERs showed uncertainty largely across the north-east and south-east quadrants of the cost-effectiveness plane (Fig. 1).

Fig. 1 a Cost-effectiveness plane, b cost-effectiveness acceptability curve for base case: fixed baseline utility, imputed costs, adjustment for covariates

The NIV protocol showed net economic gains based on INMBs of £541, £620 and £779, on average, at cost-effectiveness thresholds of £15,000, £20,000 and £30,000 per QALY, respectively (Table 2). The CEAC shows that the probability that NIV is cost effective is approximately 57–59% across cost-effectiveness thresholds (Fig. 1). INMBs were similar across scenarios considered by the sensitivity analyses, indicating that the results are robust to alternative assumptions (Fig. 2; Figs. 2 and 3 in the ESM).

Fig. 2 Sensitivity analyses and subgroup results (cost-effectiveness threshold of £30,000/quality-adjusted life-year)

3.3.1 Subgroup Analyses

Both presence of COPD and operative status had a notable impact on cost-effectiveness results (Fig. 2; Figs. 2 and 3 in the ESM). For patients with COPD, the probability that NIV is cost effective increased to 82–87% (Table 2). In contrast, the probability of cost effectiveness was < 30% among post-operative surgical patients. A tornado diagram displaying the impact on the INMB of variations in several inputs is provided in Fig. 5 in the ESM.

3.4 Results of Longer-Term Cost-Effectiveness Analysis

The base-case extrapolation analysis, based on mortality predictions using the Royston–Parmar model [3 knots; i.e. an RP(4) model], yielded mean survival times (over the 5-year time horizon) of 41.9 versus 33.3 months for the non-invasive versus the invasive group, respectively. Among the three models [exponential, Weibull and RP(4)] used to predict survival rates to 5 years (Table 7 and Fig. 1 in the ESM), the RP(4) model showed the smallest AIC value (a well-established metric of model fit). Extrapolated survival rate estimates were in broad agreement with those in published studies [33,34,35], which report 1-, 2- and 3-year survival rates of 69%, 50% and 47%, respectively. For the invasive group, these were 65%, 60% and 50% at years 1, 2 and 3, respectively. There was no statistically significant difference between the two survival curves for survival data observed during the trial (log rank p value = 0.366). The above models also confirmed this (Table 6 in the ESM).

The extrapolated survival to 5 years post-randomisation estimated that 67% of patients were expected to remain alive in the non-invasive group versus 45% for the invasive group. However, given the high uncertainty around these estimates as a result of extrapolation, we also conducted sensitivity analyses (Table 8 in the ESM) that included an assumption of equal future (between 6 months and 5 years) survival rates between arms.

Under the assumption that future costs and utility patterns beyond 6 months are equal (based on extrapolated survival data), the mean discounted expected QALYs were 2.25 (non-invasive group) and 1.82 (invasive group) (Table 7 in the ESM), resulting in an incremental QALY gain of 0.427. The mean NHS and PSS costs over the entire 5-year period were higher in the NIV group than in the IMV group (£43,759 vs. 41,787, respectively). The 5-year ICER associated with NIV was £4618 per QALY gained (an INMB of £10,838 at a cost-effectiveness threshold of £30,000). The probability that NIV was cost effective was > 90% at a cost-effectiveness threshold of £20,000 per QALY (Fig. 7 in the ESM).
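As an arithmetic check, the reported 5-year ICER and INMB follow directly from the mean costs and incremental QALYs given above. The symbols ΔC (incremental cost, £), ΔE (incremental QALYs) and λ (cost-effectiveness threshold, £ per QALY) are notation introduced here for illustration.

```latex
\begin{align*}
\Delta C &= 43{,}759 - 41{,}787 = 1{,}972 \\
\text{ICER} &= \frac{\Delta C}{\Delta E} = \frac{1{,}972}{0.427} \approx 4{,}618 \text{ per QALY gained} \\
\text{INMB}(\lambda) &= \lambda \, \Delta E - \Delta C \\
\text{INMB}(30{,}000) &= 30{,}000 \times 0.427 - 1{,}972 = 10{,}838
\end{align*}
```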

However, the sensitivity analyses indicated that the probability of cost effectiveness for NIV strongly depended on assumptions surrounding future costs, utilities and survival rates beyond 6 months (Table 8 and Figs. 4 and 6 [tornado plot] in the ESM). In particular, under the assumption of equal survival rates between 6 months and 5 years, and equal future utilities, the mean incremental costs and QALYs were (non-invasive vs. invasive group) £6453 and − 0.092 (Table 8 in the ESM), respectively, yielding negative INMBs ranging between £7833 and £9213 favouring invasive weaning. This showed that the incremental QALY gains in favour of NIV were largely driven by the higher extrapolated survival rates for the NIV arm. Hence, despite predicted survival rate estimates being in broad agreement with those in published studies [33,34,35], the uncertainty around the long-term cost-effectiveness results should be carefully considered.

When the projected survival rates were assumed to be lower by 10% in the non-invasive weaning arm (vs. the invasive arm), the INMB fell by > 90% from £10,838 to £603, and the expected probability of long-term cost effectiveness fell to around 79% (Table 8 and Fig. 8 in the ESM). If future survival estimates reflect patterns shown here (Fig. 1 in the ESM), and future patient costs accumulate based on patterns observed during the trial, without improvements in QALYs (Table 8 in the ESM, scenario 4), NIV is no longer cost effective (change from INMB of £10,838 in favour of NIV to INMB of £6061 favouring invasive weaning).

4 Discussion

This trial-based economic evaluation showed that NIV has potential to be cost effective compared with invasive weaning; the probability of cost effectiveness ranged from 57 to 59%, depending on the cost-effectiveness threshold. This is likely to be largely driven by the difference in the mean ± standard error number of days in the ICU (14 ± 1.08 vs. 15 ± 1.12) and ICU-related costs (Table 1; Tables 1 and 9 in the ESM).

This finding remained robust to most sensitivity and subgroup analyses considered. The main exception related to subgroups of patients without COPD and those who required surgery, in whom IMV was the dominant strategy.

The primary clinical endpoint of time to liberation from mechanical ventilation did not show a statistical difference: median 4.3 vs. 4.5 days; adjusted hazard ratio 1.1; 95% CI 0.89–1.40. However, a lack of difference in the time to liberation from ventilation does not imply that costs of ICU care beyond this point are no longer relevant for the purposes of cost effectiveness. In this cost-effectiveness analysis, health resource use and HRQoL beyond liberation from ventilation were considered, which could logically result in differing conclusions. We observed that, on average, patients in the invasive group stayed about 1 day longer in the ICU than those in the non-invasive group (Table 1 in the ESM). Moreover, other costs, including broader NHS and broader societal costs, also influenced the observed cost differential in favour of NIV. In addition, improved QALYs associated with NIV were observed once deaths were accounted for (driven by more deaths in the IMV arm). The cost-effectiveness analysis considered a broader assessment of consequences than the clinical study, resulting in a conclusion that NIV weaning may be cost effective, particularly for patients with COPD.

The differences in costs and QALYs between the trial comparators were not statistically significant. However, the trial was not powered to detect such differences over a short-term time horizon. A power calculation for a future trial covering a 5-year follow-up period, using the observed results from this trial, taking into account long-term survival predictions (i.e. using mean costs and projected QALYs and their standard deviations along with a £30,000 cost-effectiveness threshold), would require a sample size of at least 215 per group (430 in total) to show with at least 80% power (5% significance) that the INMB would be > 0. This would be the case even if NIV weaning was more expensive by as much as £2737 (one possible scenario of the sensitivity analyses in Table 8 in the ESM). Hence, this trial provides useful data to prospectively design a future (larger) confirmatory trial with longer follow-up that could demonstrate long-term cost effectiveness. In the more optimistic scenario, where NIV was projected over 5 years to be cheaper by £302 (Table 2; Table 8 in the ESM), the sample size required would be around 120 per group (240 in total). That the probabilities of cost effectiveness observed in this trial were < 80% may partly reflect a sample size (n = 364 vs. n = 430) that was smaller than required to demonstrate cost effectiveness [36].

Previous clinical trials have reported significant clinical benefits with the NIV protocol among patients with COPD, in contrast to studies that enrolled mixed populations [3]. However, none of those studies reported impacts on HRQoL among survivors. Therefore, our findings add new insights into available evidence, emphasising that other factors may need to be considered when deciding on optimal weaning approaches for patients in critical care settings who present without COPD.

When the benefits of NIV weaning on mortality were extrapolated to 5 years, the NIV protocol remained cost effective (5-year ICER £4618/QALY; INMB £6568 at a cost-effectiveness threshold of £20,000; probability of cost effectiveness of NIV > 90%). NIV as a bridge to liberation from mechanical ventilation could be recommended on economic grounds provided that the assumptions of constant costs and health utilities beyond the trial period continue to hold in practice. The longer-term cost effectiveness of NIV is highly dependent on assumptions surrounding costs and benefits beyond 6 months. However, we have included sensitivity analyses for various parameters (including survival rates) while noting that estimated survival rates were broadly similar to those reported in the literature [33,34,35]. We also suggest that additional information on longer-term outcomes is needed in future studies to reduce the uncertainty around our estimates.

To our knowledge, this is the first trial-based economic evaluation that compares the cost effectiveness of two protocolised weaning strategies for patients in the ICU. Previous studies have compared the cost effectiveness of NIV weaning with other strategies and in different countries and clinical settings. Chandra et al. [7] conducted a cost-effectiveness analysis of multiple interventions for COPD, including a comparative assessment of weaning protocols (NIV vs. weaning with invasive ventilation), within the Canadian setting, using a Markov probabilistic model. The results indicated that weaning with NIV dominated weaning with invasive ventilation; the probability of cost effectiveness for NIV weaning exceeded 99% at cost-effectiveness thresholds as low as $25,000 per QALY gained. Nonetheless, the analysis was restricted to patients with COPD. We also considered using the data from Chandra et al. [7] as an external input but felt that long-term extrapolations of survival could be misleading or unreliable because the populations differed. Other studies reported costs associated with NIV weaning in critical care settings [5, 6] but did not report HRQoL outcomes and were either based on small sample sizes [6] or focussed solely on patients with a COPD exacerbation [37].

A strength of the Breathe trial was that it was prospectively designed for an economic evaluation using individual-level data. Costs and outcomes were carefully considered in the trial design with a view to reaching a robust cost-effectiveness conclusion based on a large sample. However, potential limitations to this analysis do exist. First, we assumed that the baseline utility value for each patient was −0.402, the value assigned by the UK EQ-5D-3L tariff to an unconscious health state. This assumption is in keeping with broader methodological practice for trial-based economic evaluations conducted in critical care settings [38]. Moreover, we recently demonstrated that applying alternative fixed baseline utility scores generally had no effect on incremental QALY calculations [38]. Second, approximately 35% of QALY data and 6–40% of costs (at the component level) were missing by 6 months. Had our base-case cost-effectiveness analysis only considered individuals with complete QALY and cost data, we would have removed approximately 50% of patients from the analysis, which would likely have biased the results. After demonstrating that the data were missing at random, we used multiple imputation to ‘replace’ missing values to allow a comprehensive analysis using the whole dataset. Third, the modelling undertaken was constrained by assumptions concerning post-6-month costs and health utilities. However, a systematic search of external studies that compared these two weaning approaches revealed that none reported long-term economic outcomes. As a result, the evidence base for extrapolating cost effectiveness is currently weak. Longer follow-up would have reduced uncertainty surrounding our long-term cost-effectiveness estimates.
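The first limitation can be illustrated with a minimal sketch, assuming QALYs are computed as the area under the utility curve using linear interpolation (the trapezoidal rule). The assessment times (baseline, 3 months, 6 months) and the follow-up utility values below are hypothetical and are not trial data; only the baseline value of −0.402 is taken from the text.

```python
def qaly_auc(times_years, utilities):
    """QALYs as the area under the utility curve (trapezoidal rule)."""
    if len(times_years) != len(utilities) or len(times_years) < 2:
        raise ValueError("need matching time and utility lists, length >= 2")
    total = 0.0
    for i in range(1, len(times_years)):
        width = times_years[i] - times_years[i - 1]
        if width <= 0:
            raise ValueError("times must be strictly increasing")
        total += width * (utilities[i - 1] + utilities[i]) / 2
    return total


UNCONSCIOUS_UTILITY = -0.402   # UK EQ-5D-3L tariff value cited in the text
TIMES = [0.0, 0.25, 0.5]       # hypothetical assessments: baseline, 3, 6 months


def incremental_qaly(baseline):
    """QALY difference between two hypothetical patients sharing one baseline."""
    patient_a = qaly_auc(TIMES, [baseline, 0.60, 0.70])  # hypothetical utilities
    patient_b = qaly_auc(TIMES, [baseline, 0.50, 0.65])  # hypothetical utilities
    return patient_a - patient_b
```

In this simplified setting, `incremental_qaly(UNCONSCIOUS_UTILITY)` and `incremental_qaly(0.0)` return the same value, because a baseline shared by both patients and followed by the same first assessment time contributes equally to each total and cancels in the difference. This is consistent with the observation that the choice of fixed baseline utility generally does not affect incremental QALYs.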

5 Conclusions

The results from the Breathe study indicated that early extubation to NIV did not shorten time to liberation from any ventilation. The probability of NIV being cost effective relative to weaning without NIV ranged from 57% to 59% and was higher for patients with COPD (82–87%). Future trials with extended follow-up are needed to reduce uncertainty surrounding the long-term cost effectiveness of NIV.