Key Points for Decision Makers

Treatment with the biventricular pacemaker (CRT-P) is less expensive than treatment with the biventricular defibrillator (CRT-D), but treatment with CRT-D resulted in a higher expected median survival.

The cost difference between CRT-D and CRT-P is largely influenced by device costs, more frequent hospitalisations and shorter device longevity.

The uncertainty in the cost-effectiveness ratio is mainly driven by uncertainty in the survival benefit of CRT-D compared to CRT-P.

1 Background

Heart failure is one of the leading causes of death in Germany [1]. It reduces survival and impairs quality of life [2,3,4]. Healthcare resource utilisation is high in heart failure, especially due to hospitalisations [5]. The prevalence of heart failure in Germany ranges between 2% and 4%, and rises with age [6, 7]. According to the current European guideline, cardiac resynchronisation therapy (CRT) is indicated with the highest recommendation level for patients with symptomatic heart failure in New York Heart Association (NYHA) classes II–IV, reduced ejection fraction ≤ 35% and broad QRS complex [8].

CRT is a well-established form of treatment that relies on two different treatment options: the biventricular pacemaker (CRT-P) and the cardiac biventricular defibrillator (CRT-D). The additional defibrillator is intended to protect patients from sudden cardiac death. However, CRT-D devices are more complex and costly, and they are associated with a higher hospitalisation rate due to lead failure, infections or inappropriate shocks, which impairs quality of life [9, 10]. In Germany, 21,479 CRT procedures were performed in 2015, of which about 80% were CRT-D implementations. The share of CRT-D among all CRT devices is considerably higher in Germany than in other European countries [2].

Several randomised controlled trials (RCTs) have shown that patients with CRT devices have significantly better outcomes than patients treated solely with optimal medical therapy (OMT) or an implantable cardioverter defibrillator (ICD) [11,12,13]. However, there has been no sufficiently powered head-to-head trial of CRT-D and CRT-P to date. For the majority of patients who are eligible for CRT, both devices are appropriate according to the current guideline [8].

Two studies have conducted a health economic evaluation for the German healthcare system, either for CRT-P versus OMT [14] or for CRT-D versus OMT [15], but neither compared CRT-D with CRT-P. Therefore, this study aimed to evaluate the cost-effectiveness of CRT-D versus CRT-P by an indirect comparison from a German statutory health insurance (SHI) perspective. Since survival is the crucial parameter in this evaluation, a long-term perspective was applied by extrapolating survival from Kaplan–Meier curves.

2 Methods

A Markov-model was developed to analyse the cost-effectiveness of CRT-D + OMT compared to CRT-P + OMT. For a comprehensive analysis, the results of CRT-P + OMT compared to OMT are reported as well, as OMT is the low-cost alternative for treating this patient cohort. The model outcomes were quality-adjusted life-years (QALYs) and costs from a German SHI payer’s perspective.

Heart failure is a chronic disease and most common in older people; therefore, extrapolation beyond the follow-up was necessary. After 20 years, the model predicted that 13% of CRT-D patients, 10% of CRT-P patients and 0% of OMT patients were still alive. A time horizon of 20 years was therefore chosen: it captured the major health and economic outcomes, whereas a longer horizon would have added unnecessary uncertainty. In addition, the model converged from that time onwards. Model results for 9 years (the maximum follow-up time of CARE-HF) and for 15 years are reported as well. The cycle length was 1 month. The model was run for three identical and homogeneous cohorts, differing only in the three treatment strategies: (1) CRT-P + OMT, (2) CRT-D + OMT or (3) OMT. Each strategy was simulated with a cohort of 1000 hypothetical patients. Using a cohort simulation, expected costs and expected survival were calculated [16]. Costs and QALYs were discounted by 3% per year [17]. The outcomes were used to calculate the incremental cost-effectiveness ratio (ICER), which indicates the amount to be spent for an additional QALY. The model was set up in “R” [18] with the package “heemod” [19].
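The combination of monthly cycles and a 3% annual discount rate can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the heemod code used for the model; the function names and the convention of discounting by the fraction of a year elapsed are assumptions, and the constant monthly cost is a hypothetical value.

```python
def discount_factor(cycle, annual_rate=0.03, cycles_per_year=12):
    """Discount factor for a monthly cycle; cycle 0 is undiscounted (assumed convention)."""
    return 1.0 / (1.0 + annual_rate) ** (cycle / cycles_per_year)


def discounted_total(per_cycle_values, annual_rate=0.03, cycles_per_year=12):
    """Sum a per-cycle stream of costs or QALYs after discounting."""
    return sum(
        value * discount_factor(cycle, annual_rate, cycles_per_year)
        for cycle, value in enumerate(per_cycle_values)
    )


# Hypothetical stream: a constant 100 EUR per month over 20 years (240 cycles).
undiscounted = 100.0 * 240
discounted = discounted_total([100.0] * 240)
```

The same helper would apply to QALYs, since the text states that costs and QALYs were discounted at the same rate.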

2.1 Target Population

In the European guideline, CRT is recommended for patients with reduced ejection fraction ≤ 35%, broad QRS complex and symptomatic heart failure in NYHA classes II–IV despite OMT [8]. Three treatment options are available for this patient group: OMT, CRT-P and CRT-D. The patients face three different causes of death: heart failure death, sudden cardiac death and non-cardiac death. The additional defibrillator of a CRT-D device should reduce sudden cardiac death. Two RCTs cover this patient population: COMPANION [20] and CARE-HF [21]. The CARE-HF trial was chosen for the analysis of survival because it has the longest follow-up of patients with heart failure due to left ventricular systolic dysfunction and cardiac dyssynchrony, which is the relevant cohort for assessing the cost-effectiveness of CRT-D compared to CRT-P. The initial age (66 years) of the hypothetical cohort was set according to the median age of CARE-HF patients at baseline.

2.2 Model Structure

The model consists of six Markov-states (Fig. 1). All patients entered the model via the implementation surgery, which could either be successful or fail. The main part was divided into a CRT section and an OMT section. In case the implementation surgery failed, patients were solely treated with OMT. Each subsection of the model comprised a Markov state “Stable” and a Markov state “Hospital”, which included hospitalisations due to heart failure, lead failure, a device infection or ventricular arrhythmia. The distribution of NYHA classes over time was taken from Colquitt et al. [11] (see Table 1). The absorbing state was all-cause death. The possibility of device upgrades was excluded. The same events could occur in the OMT section, except for hospitalisations due to device malfunctions or infections.
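The mechanics of such a cohort simulation can be sketched in a few lines. The example below is a deliberately reduced illustration with three states (stable, hospital, dead) for a single treatment arm rather than the six states of the actual model, and all transition probabilities are hypothetical placeholders, not values from Table 2.

```python
STATES = ("stable", "hospital", "dead")

# Hypothetical monthly transition probabilities (each row sums to 1).
TRANSITIONS = {
    "stable": {"stable": 0.96, "hospital": 0.03, "dead": 0.01},
    "hospital": {"stable": 0.85, "hospital": 0.10, "dead": 0.05},
    "dead": {"stable": 0.0, "hospital": 0.0, "dead": 1.0},  # absorbing state
}


def run_cohort(start, n_cycles):
    """Return the state occupancy of the cohort for cycle 0..n_cycles."""
    trace = [dict(start)]
    for _ in range(n_cycles):
        current = trace[-1]
        # Redistribute the cohort according to the transition probabilities.
        following = {
            target: sum(current[source] * TRANSITIONS[source][target] for source in STATES)
            for target in STATES
        }
        trace.append(following)
    return trace


# 1000 hypothetical patients followed for 240 monthly cycles (20 years).
trace = run_cohort({"stable": 1000.0, "hospital": 0.0, "dead": 0.0}, n_cycles=240)
```

Expected costs and QALYs would then follow by multiplying each cycle's state occupancy by state-specific costs and utilities and summing over cycles.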

Fig. 1

Markov-model for cardiac resynchronisation therapy; CRT cardiac resynchronisation therapy, OMT optimal medical therapy, NYHA New York Heart Association

Table 1 Distribution of New York Heart Association (NYHA) classes over time

2.3 Model Parameters

The model parameters for the chosen model structure consisted of the mortality risk, the probability of adverse events, costs that occur for specific events and health-related quality-of-life (HRQoL) values for the health states in the model. The main model was deterministic. Mortality was the crucial parameter in this modelling. To date there has been no randomised controlled trial (RCT) that directly compared CRT-D to CRT-P, but many meta-analyses comparing CRT devices have already been performed. Therefore, PubMed was systematically searched for meta-analyses investigating the effect of the devices on mortality. We chose the analysis of Woods et al. [12] because it was the only network meta-analysis of RCTs with individual patient data from CRT-D, CRT-P, OMT and ICD trials. More recent meta-analyses such as Barra et al. [13] have not included other relevant RCTs.

Input parameters for the transition probabilities and device longevity calculations are based on single RCTs used by six meta-analyses [11,12,13, 22,23,24]. Two additional references [9, 25] were found by literature search via PubMed. With the exception of one observational study [9], only RCTs were used to compute transition probabilities, as these generally have a higher level of evidence. The probabilities were transformed to the cycle length of 1 month [16] and were pooled with a random-effects model [26] (see Table 2).

Table 2 Input parameters

One-way deterministic sensitivity analyses were conducted (see Table 2). For the probabilistic sensitivity analysis, a Monte Carlo simulation with 10,000 iterations was applied, where the values of the input parameters were assumed to follow certain probability distributions. Based on the recommendations by the ISPOR guideline, distributions for the input parameters were chosen [27]. The input parameters for all three treatment strategies are listed in Table 2.
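As an illustration of how one input could be drawn in the probabilistic sensitivity analysis, the sketch below samples the hazard ratio of CRT-D versus CRT-P (0.81, 95% CI 0.67–0.99, as reported in Sect. 2.3.1) from a lognormal distribution. The lognormal choice and the derivation of the standard error from the CI are assumptions made for this example; the distributions actually used are those listed in Table 2.

```python
import math
import random
import statistics

HR_POINT, CI_LOW, CI_HIGH = 0.81, 0.67, 0.99

# Standard error on the log scale, assuming a symmetric 95% CI for log(HR).
SE_LOG_HR = (math.log(CI_HIGH) - math.log(CI_LOW)) / (2 * 1.96)


def sample_hazard_ratios(n_draws, seed=1):
    """Draw hazard ratios from a lognormal centred on the point estimate."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.lognormvariate(math.log(HR_POINT), SE_LOG_HR) for _ in range(n_draws)]


draws = sample_hazard_ratios(10_000)
median_hr = statistics.median(draws)
```

Each draw would be combined with draws of the other inputs, and the model rerun once per iteration to obtain a distribution of incremental costs and QALYs.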

2.3.1 Effectiveness Data

To estimate survival, the Kaplan–Meier curves for all-cause mortality in the CARE-HF trial [28, 29] for patients with either CRT-P or OMT were digitised with DigitizeIt [30]. For OMT, the publication with the shorter follow-up was chosen because many treatment cross-overs were reported in the subsequent publication. In the CARE-HF study, CRT-P was compared to OMT with a maximum follow-up of 9 years. The individual patient data were reconstructed using the R code by Guyot et al. [31]. We plotted cumulative hazard functions for OMT and CRT-P. In both plots the hazard is monotonically increasing, which calls for a distribution that allows increasing hazards. Afterwards, parametric survival curves were fitted (exponential, Weibull, Gompertz, gamma, log-logistic, lognormal). The differences in the Akaike Information Criterion (AIC) indicated a similarly good fit for all curves, except possibly the lognormal and log-logistic curves. Thus, the parametric survival distributions were selected according to visual conformity to the original Kaplan–Meier survival curve and plausibility in the long run. A Gompertz distribution was chosen for CRT-P and for OMT.
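A Gompertz curve translates into time-dependent monthly death probabilities as sketched below. The parametrisation with hazard rate · exp(shape · t) is one common convention and is assumed here; the shape and rate values are hypothetical and are not the fitted CARE-HF estimates.

```python
import math


def gompertz_survival(t_years, shape, rate):
    """S(t) = exp(-(rate / shape) * (exp(shape * t) - 1)) for shape != 0."""
    return math.exp(-(rate / shape) * math.expm1(shape * t_years))


def monthly_death_probability(cycle, shape, rate):
    """Probability of dying during a monthly cycle, given survival to its start."""
    start = gompertz_survival(cycle / 12.0, shape, rate)
    end = gompertz_survival((cycle + 1) / 12.0, shape, rate)
    return 1.0 - end / start


# Hypothetical parameters; a positive shape gives an increasing hazard.
SHAPE, RATE = 0.10, 0.07
first_month = monthly_death_probability(0, SHAPE, RATE)
after_ten_years = monthly_death_probability(120, SHAPE, RATE)
```

With a positive shape parameter the monthly death probability rises over time, which matches the increasing hazard seen in the cumulative hazard plots.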

To construct a survival curve for CRT-D, a hazard ratio of 0.81 was applied to the parametric survival curve of CRT-P in order to derive device-dependent mortality rates. This value for the hazard ratio [95% confidence interval (CI) 0.67–0.99] was reported by Woods et al. in an individual patient data network meta-analysis [12], which incorporates the major RCTs for patients with heart failure and reduced ejection fraction who received an OMT, ICD, CRT-P or CRT-D, and is therefore considered the highest available evidence. In total, the meta-analysis included 12,638 patients with 2422 deaths. Mean follow-up was 2.5 years (range 0–7.5 years). For ten trials individual patient data were available. An unadjusted network meta-analysis was performed to determine the overall efficacy of the devices throughout all RCTs. An adjusted analysis was performed for different baseline characteristics to evaluate whether these subgroups experienced different treatment effects. In line with Mealing et al. [32], we assumed that the hazard ratio started to increase after 7.5 years, which was the maximum follow-up in the meta-analysis by Woods et al. [12], until it was 1 after 20 years. The observed survival curves for OMT and CRT-P were based on data from patients aged on average 66 years and the parametric survival curves were then extrapolated over the model time horizon (maximum follow-up 9 years, extrapolation for 20 years). The extrapolation of survival curves over such a long time period is subject to uncertainty. Thus, three scenario analyses were run for that input parameter.
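The waning survival benefit can be sketched as a time-dependent hazard ratio applied to the CRT-P death probability of each cycle. The text specifies only that the hazard ratio starts to rise after 7.5 years and reaches 1 after 20 years; the linear interpolation between those points and the function names are assumptions of this example.

```python
HR_BASE = 0.81              # hazard ratio CRT-D vs CRT-P from Woods et al. [12]
WANING_START_YEARS = 7.5    # maximum follow-up in the meta-analysis
WANING_END_YEARS = 20.0     # hazard ratio assumed to equal 1 from here on


def hazard_ratio(t_years):
    """Hazard ratio at time t; linear waning is an assumption of this sketch."""
    if t_years <= WANING_START_YEARS:
        return HR_BASE
    if t_years >= WANING_END_YEARS:
        return 1.0
    fraction = (t_years - WANING_START_YEARS) / (WANING_END_YEARS - WANING_START_YEARS)
    return HR_BASE + (1.0 - HR_BASE) * fraction


def crt_d_death_probability(p_crt_p, t_years):
    """Apply the hazard ratio to a per-cycle probability: 1 - (1 - p)^HR."""
    return 1.0 - (1.0 - p_crt_p) ** hazard_ratio(t_years)
```

Raising the survival complement to the power of the hazard ratio is the proportional-hazards equivalent of multiplying the underlying rate, so the CRT-D probability converges to the CRT-P probability as the hazard ratio approaches 1.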

The reported device longevity for CRT-P and CRT-D varied between the cost-effectiveness studies. Therefore, the median device longevity was taken from a National Institute for Health and Care Excellence (NICE) report calculated from approximately 40,000 implementations from 2000 to 2011 [33]. Lower and higher device longevity estimates were taken from cost-effectiveness studies for the sensitivity analysis [30,31,32] (see Table 2). In the case of battery depletion the whole CRT system has to be replaced.

2.3.2 Quality of Life

Besides reducing survival, heart failure is characterised by decreased quality of life and frequent hospitalisations, which were assumed to impair quality of life further. Quality of life decreases as heart failure progresses. Boczor estimated HRQoL weights for German patients with chronic heart failure (n = 3387) with the EQ-5D-5L questionnaire [34]. HRQoL weights were reported for each NYHA class (range 0.834–0.564). We assumed that patients in Markov-state “Hospital” have a reduced quality of life, both because hospitalisation itself can negatively affect a patient’s quality of life and because the worsened health condition that led to hospitalisation affects quality of life as well. HRQoL decrements for a hospitalisation were taken from Griffiths et al. [35] according to the severity of the NYHA class (range − 0.07 to − 0.21). As upper and lower bounds within the deterministic sensitivity analysis, we applied values from other CRT cost-effectiveness studies, ranging from 0.93 to 0.6 [23, 36] and from 0.815 to 0.44 [23, 37], respectively.

2.3.3 Costs

All costs associated with hospitalisation were calculated using the reimbursement catalogue for hospital admissions (G-DRG flat rate catalogue) [38]. Each hospitalisation could be matched to one or more diagnosis-related group (DRG) codes, and each DRG code is graded by ascending case severity. The costs for a hospitalisation event were estimated from these different DRG codes and their severity grades. This was achieved by weighting the DRGs with their distribution of case severities from the G-DRG Report Browser 2018, depending on ICD and OPS (operation and procedure keys) codes [39].

For instance, inpatient treatment for heart failure is grouped into four DRGs (F62A: €10,894.26; F62B: €8,058.05; F62C: €2,943.74; F62D: €794.01). Weighting these DRGs with the G-DRG Report Browser 2018 resulted in a hospitalisation cost for heart failure of €2,926.43. All other costs related to the management of heart failure, such as outpatient, rehabilitation and medication costs subject to the NYHA class, were derived from Biermann et al. [40] and inflated to 2018 prices (GDP inflation rate 1.38 [41]).
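The weighting step amounts to a share-weighted mean of the DRG reimbursements. In the sketch below the four DRG amounts are those quoted above, whereas the case shares are hypothetical placeholders, because the actual severity distribution from the G-DRG Report Browser is not reproduced in the text. The result therefore does not match the reported €2,926.43.

```python
# DRG reimbursement amounts in EUR as quoted in the text.
DRG_COSTS_EUR = {"F62A": 10894.26, "F62B": 8058.05, "F62C": 2943.74, "F62D": 794.01}

# Hypothetical case shares per DRG; NOT the G-DRG Report Browser 2018 distribution.
HYPOTHETICAL_SHARES = {"F62A": 0.05, "F62B": 0.10, "F62C": 0.60, "F62D": 0.25}


def weighted_cost(costs, shares):
    """Share-weighted mean cost of one hospitalisation event."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(costs[code] * shares[code] for code in costs)


example_cost = weighted_cost(DRG_COSTS_EUR, HYPOTHETICAL_SHARES)
```

In the probabilistic sensitivity analysis, it is these shares rather than the DRG amounts that would be varied.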

Costs incurred as a result of the implementation surgery were not considered because in Germany all additional hospital stays within 30 days are covered by one single DRG (case consolidation) and thus do not differ between CRT-D and CRT-P [42]. In the base-case analysis, we did not consider future costs that are not related to heart failure.

3 Results

The predicted median survival was 7.8 years with CRT-D and 6.5 years with CRT-P (see Fig. 2). The median survival with OMT was 4.1 years. Treatment of patients with CRT-D caused an average cost of €32,447, treatment with CRT-P an average cost of €18,502 and with OMT €5,472. Hence, CRT-D compared to CRT-P resulted in €13,945 incremental costs per patient. The incremental cost of CRT-P compared to OMT was €13,029. When comparing CRT-D to CRT-P, 0.57 incremental QALYs were gained over the time horizon. Comparing CRT-P to OMT, 2.23 incremental QALYs were gained. The ICER was €24,659 per additional QALY for CRT-D compared to CRT-P and €5,837 per additional QALY for CRT-P versus OMT. Table 3 shows the disaggregated results for CRT-D to CRT-P and CRT-P to OMT for different model runtimes.
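The ICER is the ratio of incremental costs to incremental QALYs. The sketch below recomputes it from the rounded figures quoted above; the function name is illustrative. Because the inputs are rounded, the recomputed values (roughly €24,465 and €5,843 per QALY) differ slightly from the reported €24,659 and €5,837, which were presumably calculated from unrounded model output.

```python
def icer(incremental_cost, incremental_effect):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    if incremental_effect == 0:
        raise ValueError("ICER is undefined for a zero incremental effect")
    return incremental_cost / incremental_effect


# Rounded values as reported in the text.
icer_crtd_vs_crtp = icer(32447 - 18502, 0.57)
icer_crtp_vs_omt = icer(13029, 2.23)
```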

Fig. 2

Model-predicted survival curves; RCT randomised controlled trial, CRT-D cardiac biventricular defibrillator, CRT-P biventricular pacemaker, OMT optimal medical therapy

Table 3 Model results

3.1 Sensitivity Analysis

The ten input parameters with the greatest influence on the ICER are depicted in a tornado diagram in Fig. 3a. The survival benefit (hazard ratio) of CRT-D compared to CRT-P was the input parameter with the most influence on the ICER. As an illustration of how strongly the survival benefit influenced the ICER, Fig. 3b shows the ICER as a function of the survival benefit, using the CI of the hazard ratio (0.67–0.99) as limits. Even small changes in the survival benefit had a major influence. Other influential parameters were the device cost of a CRT-D, the device longevity for CRT-D and the probability of implementation success. In further sensitivity analyses, other HRQoL estimates did not greatly alter the model result. With regard to different time horizons, the ICER per additional QALY was €41,020 for 9 years and €27,016 for 15 years. The results of the Monte Carlo simulation are presented in the cost-effectiveness plane (Fig. 3c). In total, 96% of all simulated cases were in the north-east quadrant with positive incremental costs and positive incremental QALYs. In 4% of the simulated cases CRT-P was dominant. The average ICER of the Monte Carlo simulation was €22,477 for an additional QALY. In the probabilistic sensitivity analysis, the hospitalisation costs themselves were not varied; instead, the distributions of the DRG severity grades were varied. Fig. 3d shows the cost-effectiveness acceptability curve (CEAC), which displays the probability of being cost-effective at a given willingness-to-pay.
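A CEAC can be derived from the Monte Carlo output by counting, for each willingness-to-pay threshold, the share of iterations with a positive incremental net monetary benefit. The sketch below shows this calculation on four made-up (incremental cost, incremental QALY) pairs; these values are hypothetical and are not results of the model.

```python
def probability_cost_effective(samples, wtp):
    """Share of draws with positive incremental net monetary benefit at `wtp`."""
    favourable = sum(1 for d_cost, d_qaly in samples if wtp * d_qaly - d_cost > 0)
    return favourable / len(samples)


def ceac(samples, thresholds):
    """Acceptability curve as a list of (threshold, probability) points."""
    return [(wtp, probability_cost_effective(samples, wtp)) for wtp in thresholds]


# Hypothetical draws: (incremental cost in EUR, incremental QALYs).
PSA_DRAWS = [(10000.0, 0.5), (10000.0, 0.2), (5000.0, -0.1), (20000.0, 1.0)]
curve = ceac(PSA_DRAWS, [0, 10000, 30000, 60000])
```

Draws with negative incremental QALYs and positive incremental costs, such as the third pair, never count as cost-effective, which corresponds to the iterations in which CRT-P is dominant.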

Fig. 3

Results of sensitivity analysis of the cost-utility analysis. a Tornado diagram; b incremental cost-effectiveness ratio as a function of the hazard ratio; c Monte-Carlo simulation; d cost-effectiveness acceptability curve. CRT-D cardiac biventricular defibrillator, CRT-P biventricular pacemaker, ICER incremental cost-effectiveness ratio

3.2 Scenario Analysis

Five scenario analyses were conducted for the cost-utility analysis of CRT-D compared to CRT-P. First, life-years (LY) were chosen as an outcome parameter. The ICER was €18,945 per additional LY. Second, to capture overall digitisation and parametrisation uncertainty, the parametrisation of the CARE-HF trial by Colquitt et al. [11] was used to model survival. Using these data, the ICER was €21,597 per additional QALY. Since the CARE-HF study is older, it can be assumed that the treatment of heart failure has changed over time. Third, to account for possible treatment changes, the Kaplan–Meier curve for CRT-P of a long-term observational study was parametrised [43] and for CRT-D the hazard ratio was applied. This scenario yielded an ICER of €22,073 per additional QALY. This is only a rough estimate because the digitised Kaplan–Meier curve was somewhat imprecise (only 550 of 580 deaths could be reconstructed). The estimated median survival varied only slightly between the different parametrisations. Fourth, Corrao et al. [44] estimated the probability of death within 30 days of hospital discharge (0.047), and this estimate was used to correct the probability of hospital mortality upwards for the period after the maximum follow-up. The ICER was €26,006 per additional QALY. Fifth, we calculated the ICER for additional future costs that were unrelated to heart failure. Applying an additional €7,275 per year for individuals aged 65–84 years and €16,616 for individuals over 85 years [45] resulted in an ICER of €34,460 per additional QALY.

4 Discussion

Applying a cohort Markov-modelling approach, we aimed to assess evidence on the cost-effectiveness of the CRT-D compared to CRT-P from a payer’s perspective. The development of the model was supported by systematic reviews for modelling heart failure, in particular CRT [42,43,44], various cost-effectiveness CRT studies [11, 36, 37], as well as by feedback from a cardiologist.

In the base case, therapy with CRT-D was more effective but also more costly. For the German healthcare system there was already one study for CRT-D compared to OMT [15] and one study for CRT-P compared to OMT [14]. Both CRTs were found to be more effective than OMT. Our study added the indirect comparison of CRT-D to CRT-P. In addition, we also compared CRT-P to OMT. In the comparison of CRT-D to CRT-P, the resulting ICER was €24,659 per additional QALY and €5,837 for CRT-P to OMT, respectively.

Our estimated ICER is lower than the ICERs reported in other studies investigating the cost-effectiveness of CRT-D compared to CRT-P. The range of costs per additional QALY reported in the literature is €30,447–€56,719 in 2014 prices [24]. Most of the studies are based on survival data from COMPANION or CARE-HF. However, the other analyses differ significantly from ours in utility weights, CRT device runtime, model time horizon, and the hazard ratio used. ICER differences could also be attributed to modelling disparities. A further reason for the lower ICER could be decreased costs for CRT devices over time or the reduced price difference between the CRT devices. The British HTA by Colquitt et al. [11] reported an ICER of €30,420 per additional QALY, which is quite similar to ours [24].

In Germany, no official threshold exists for the assessment of cost-effectiveness. If one follows the result of a recent study that surveyed individuals in Germany about their willingness to pay (WTP) per additional QALY, CRT-D would not be cost-effective. In that study, the WTP ranged from €8580 to €18,420 per additional QALY [46]. CRT-P, by contrast, would be cost-effective compared to OMT at this WTP per QALY. Measured against the threshold of the NICE guideline, the ICER of CRT-D would be considered cost-effective [range €23,600–€35,400; 1.18 as exchange rate (27 March 2019)] [47]. The CEAC (Fig. 3d) can be interpreted as the probability of being cost-effective at a given threshold from a payer’s perspective. Taking the upper boundary of the NICE threshold as a hypothetical WTP for an additional QALY, CRT-D would be cost-effective with a probability of 77%.

With respect to the sensitivity analysis, the model was robust: only 4% of the probabilistic ICERs were in the north-west quadrant. However, the model results responded sensitively to changes in the survival benefit, which had by far the strongest influence on the model results. There is an ongoing scientific debate about the additional value of the defibrillator [48, 49]. For instance, a reduction of sudden cardiac death as a result of improved medical therapy, such as sacubitril/valsartan or improved utilisation of beta blockers, ACE inhibitors and mineralocorticoid antagonists [50], would be expected to decrease the survival benefit of CRT-D and thereby increase the ICER. In addition, Shen et al. [51] reported that the rate of sudden cardiac death in heart failure studies has decreased over time. Furthermore, in patients who survived the first 5 years after implementation, the risk of sudden cardiac death was low, and the type of device was not a significant predictor of survival [48]. For this reason, the survival benefit of CRT-D was assumed to decrease steadily after the maximum follow-up time in the meta-analysis [12]. A reduced survival benefit could be seen as critical in view of the harms associated with CRT-D devices, such as device malfunctions, increased complication risk or impaired quality of life due to inappropriate shocks [52, 53, 54].

Discussions pertaining to the impact of survival benefits on the cost-utility of CRT-D must be approached with caution, as the model input data for survival come with some degree of uncertainty. While we have used the best available hazard ratio for the survival benefit, there are some restrictions. First, the survival curves of CRT-P and OMT have been extrapolated over a long-term horizon. Second, since no direct comparison of CRT-D to CRT-P was available in an RCT, the death probability of CRT-D was estimated with a hazard ratio applied to the survival curve of CRT-P.

Another influential input parameter was the device longevity. Device life and device replacement costs are interconnected: with higher battery capacity, the high device replacement costs would be incurred less frequently. Technological progress in battery capacity would change the ICER in favour of CRT-D. There are some studies on the longevity of CRT-D devices [55, 56], but there are no studies that explicitly examine CRT-P device longevity [2], and Colquitt et al. [11] noted that the estimates reported by NICE [33] could be overestimated. The ICER for a model horizon of 15 years was comparable to the base-case ICER, but with a model horizon of only 9 years, the survival advantage was less pronounced.

The study has several limitations. First, as already mentioned, the input parameters were derived from older studies. Thus, the evidence may be outdated considering the significant progress that has been made in the treatment of heart failure patients. To control for this effect in the scenario analysis, the Kaplan–Meier curves of the recent 16-year-long observational study by Leyva et al. [43] were used to model survival, which did not change the results substantially. Second, hospitalisation costs were modelled accurately while outpatient and medication costs were approximated, and costs of sacubitril/valsartan were not included. However, this bias is assumed to be negligible because these parts of the costs were comparatively low. Third, there are more complex approaches to modelling heart failure that take more possible events into account, but the available data are partly insufficient and, moreover, the influence of the survival benefit outweighs that of most other input parameters. Fourth, the probability of hospitalisation or death was not based on time-dependent variables such as previous hospitalisations, since no data were available. Fifth, information on NYHA class changes was only available until the 19th month. The cost of monthly NYHA management and the utilities depend on NYHA classes. We conducted scenario analyses, assuming the same NYHA class distribution for OMT and CRT for the remaining runtime. Model results were not sensitive to changes in this parameter.

In contrast to Yao et al. [37] and Colquitt et al. [11], this modelling approach did not distinguish between different mortality sub-classifications, because otherwise the hazard ratio by Woods et al. [12] would not have been sufficient to inform the model. The hazard ratio was taken from an individual patient data network meta-analysis of RCTs of CRT, and was therefore assumed to be the best available evidence. Moreover, such a division of mortality could be inaccurate [12], and Cleland et al. [28] reported the numbers at risk only in the Kaplan–Meier curves for all-cause death, which were needed to digitise the curves with the highest accuracy.

A strength of the modelling is the use of pooled-effect estimators for the input parameters wherever feasible. To check for inaccuracy in the digitisation process of the Kaplan–Meier curves, the parametrisation of Colquitt et al. [11] was used, which did not change the ICER substantially (€21,597 versus €24,659 per additional QALY).

5 Conclusions and Future Research

A literature-based cohort Markov-model was developed to examine the cost-effectiveness of CRT-D compared to CRT-P from a payer’s perspective in Germany. CRT-D compared to CRT-P was more effective but also more costly, yielding an ICER of €24,659 for an additional QALY and an ICER of €18,945 for an additional life-year. Future technological changes in device longevity or lower device costs for CRT-D could reduce the ICER. Although the results have to be seen in the context of uncertainty, especially concerning the survival benefit, our sensitivity analysis shows that the model results were robust. To reduce uncertainty, a direct comparison under RCT conditions with a longer follow-up would be desirable. It could provide the necessary data to perform subgroup analyses at the patient level and to weigh the advantages and disadvantages more appropriately.