Key Points for Decision Makers

This study presents a cost-effectiveness model that determines the cost-effectiveness threshold for life-extending new, innovative technologies based on health system opportunity costs.

The model suggests that a threshold value of nearly €90,000 per life-year gained for life-prolonging new, innovative health technologies (e.g., drugs) in Germany may not deteriorate the efficiency of the German health care system.

The analysis suggests that adjusting life-years for the quality of life does not decrease the threshold.

1 Introduction

According to an Organization for Economic Co-operation and Development paper [1], value-based pricing (VBP) refers to the regulation of reimbursement or pricing of pharmaceuticals on the basis of their therapeutic value. Accordingly, the following European Union (EU) countries have used some elements of VBP as a part of their policies (in some cases for pricing and in other cases for reimbursement) [1]: the Netherlands, Norway, Sweden, and the UK, based on formal economic evaluation; Belgium, France, and Germany because they assess the added therapeutic benefits of new products grouped into different categories and provide premiums for prices for innovative medicines; Italy for its system of innovation rating used in price negotiations and an advanced practice of performance-based agreements; and Denmark and Spain because they use some of the abovementioned elements of VBP in their decisions. A narrow definition of VBP refers to countries that rely exclusively on VBP for pricing, without the use of external reference pricing [2]. According to this definition, only Sweden applies VBP.

Pricing or reimbursement based on an economic evaluation requires the use of a cost-effectiveness threshold. The cost-effectiveness thresholds for a jurisdiction are termed policy thresholds. A policy threshold may or may not reflect the health opportunity costs of adopting cost-increasing technologies [3]. Cost-effectiveness thresholds that reflect health opportunity costs are called supply-side thresholds [4]. Health opportunity costs exist independently of whether a health care budget is exogenous or endogenous to the decision-making body [3].

Under a finite budget, the uptake of a new drug creates opportunity costs by displacing existing health care programs. Assuming that the existing programs are displaced at random, the cost-effectiveness threshold should reflect the average cost-effectiveness ratio of the health care system [5], that is, the average of the costs and effects over all interventions provided by the health care system. Assuming that it is care at the margin of health care spending that is displaced, the threshold should reflect the marginal ratio, that is, the costs and effects of the interventions funded by health care spending at the margin. As a word of caution, spending at the margin can be irreversible. This occurs when there is an investment in fixed costs or set-up costs and it is difficult to switch programs “on and off” from one year to another. Nevertheless, spending on variable costs (e.g., supplies) is reversible and thus represents a discretionary portion of resources. This distinction becomes more complex when the partial withdrawal of a program is feasible. That is, the program may be retained in sub-groups of the population where treatment is more cost effective (e.g., high-risk sub-groups). Another option could be the continued provision of a program, provided it can be procured at a lower price [6]. Unrelated to disinvestment decisions but related to fixed costs, there are more general concerns about the term “marginal”. It has been argued that, due to the problem of “lumpy” investments, spending decisions are not made at the margin but need to be based on increments [7]. Notwithstanding these caveats, neither of the two definitions of a threshold requires the currently funded interventions to be used appropriately or efficiently.

Recently, several empirical studies have been conducted to provide a (supply-side) estimate of the incremental cost-effectiveness ratio (ICER) threshold that reflects health opportunity costs. Ideally, these types of analyses include information on individual-level health spending over a lifetime and across all areas of health care as well as individual-level causes of death and health-related quality of life [8]. Claxton et al. [9] published the first and the most highly cited study in this category based on the theoretical framework described by Martin et al. [10]. It exploited the differences between English primary care trusts in terms of health care expenditure (HCE) and mortality at a single point in time. The authors deployed a widely used econometric technique, instrumental variables, to address the problems of endogeneity and reverse causality. Most importantly, this technique relies on the assumption of the exclusion restriction, that is, instruments need to be uncorrelated with the error term of the structural equation. Nevertheless, as this assumption cannot be proven, the validity of the results of Claxton et al. [9] remains to be established. For instruments capturing socioeconomic deprivation, the assumption that they only indirectly impact mortality through their impact on HCE, but not directly, is questionable. Adding control variables may be necessary to obtain causal effects. However, this requires that the control variables satisfy a set of assumptions [11]. In addition to this concern about the identification strategy, the analysis by Claxton et al. [9] is limited by a lack of information on mortality for more than 50% of the disease areas (the so-called National Health Service [NHS] program budget categories). 
Finally, the analysis is restricted to capturing contemporaneous health gains and savings (within the same year of health spending) but not delayed health gains and savings (which would require the addition of multiple lags of health gains and savings, respectively). While a more recent analysis of the NHS in England produced “similar results” in terms of the elasticity of all-cause mortality with respect to health expenditure despite “a very different approach to identification” [12], I refer readers to Sampson et al. [4] for a comment on this study. An overview of the empirical estimates of the ICER threshold in various countries was provided by Edney et al. [8].

The variation in the reported supply-side estimates of the ICER threshold has been considerable for countries with similar economic development, ranging from £5000 per quality-adjusted life year (QALY) gained in the English NHS [12] to €74,000 per QALY gained in the Netherlands [13]. While part of the gap may be attributable to cross-country and cross-study differences in budgets, health outcomes, time points (due to changes in country-specific ICER thresholds over time), and econometric specifications (e.g., the omission of lagged health gains), additional investigations are warranted.

Given the abovementioned concerns regarding prior research, this study estimated the supply-side cost-effectiveness threshold based on an alternative methodology, decision modeling. While the threshold can be applied to any health technology, it is specifically applied to new, innovative, life-prolonging technologies (e.g., drugs). The study defines this threshold as the cost-effectiveness ratio of the entire health care system on average or at the margin of health care spending. The model was applied to the German health care system (strictly speaking, German statutory health insurance [SHI]). While this study does not purport to provide a definitive answer on the supply-side estimate of the ICER threshold in Germany, it presents a reasonable range of estimates.

2 Methods

2.1 Basic Model

The cost-effectiveness ratio of the entire health care system was calculated based on the approach used by Cutler et al. [14], who divided the change in lifetime spending from one decade to the next by the change in life expectancy from one decade to the next. Specifically, they calculated the ratio of costs per LY gained over the entire life span (i.e., at birth) as well as at 15, 45, and 65 years of age. In the base case they assumed that 50% of the total gains in life expectancy are due to medical care (Footnote 1). My study also calculated the intertemporal differences in lifetime spending and life expectancy. Nevertheless, to avoid using a rule of thumb on the contribution of medical care to LYs gained, it used published data on mortality amenable to health care and all‐cause mortality in the base case (see “Model Application”). Additionally, to account for the age composition of the population, it weighted age-specific intertemporal changes in the remaining life expectancy and health spending by the age-specific population size. That is, it calculated the weighted-average remaining life expectancy and health spending, where the weights correspond to population size.

Differences in the remaining life-years (\(\Delta LY\)) and health care costs (\(\Delta C\)) between period \({y}_{1}\) and period \({y}_{2}\) were obtained in a population with individuals at age i = 1, 2, 3, …, j as follows:

$${LE}\;(y) = \frac{{\mathop \sum \nolimits_{k = 1}^{q} \left( {\mathop \sum \nolimits_{j = 1}^{k} \left( {\mathop \prod \nolimits_{i = 1}^{j} p_{i} } \right)} \right) \cdot N_{k} }}{{\mathop \sum \nolimits_{k = 1}^{q} N_{k} }},$$
(1)
$$\Delta {{LY}} = {{LE}}\;(y = y_{1} ) - {{LE}}\;(y = y_{2} ),$$
(2)
$${{EC}}\;(y) = \frac{{\mathop \sum \nolimits_{k = 1}^{q} \left( {\mathop \sum \nolimits_{j = 1}^{k} \left( {\mathop \prod \nolimits_{i = 1}^{j} p_{i} } \right) \cdot c_{j} } \right) \cdot N_{k} }}{{\mathop \sum \nolimits_{k = 1}^{q} N_{k} }},$$
(3)
$$\Delta C = {{EC}}\;(y = y_{1} ) - {{EC}}\;(y = y_{2} ),$$
(4)

where LE denotes life expectancy, EC denotes expected costs, \(N_{k}\) denotes population size at age \(k\), \(p_{i}\) denotes the probability of survival from age \(i\) to \(i + 1\), and \(c_{j}\) denotes the costs incurred during the time interval \((j, \;j + 1)\) (Footnote 2). As shown in Eqs. (1) and (3), age-specific remaining life expectancy and health spending are weighted by the age-specific population size. The denominator represents total population size. Note that population size was held constant between the two periods to normalize age-specific HCE and survival by population size. As is customary but not shown in the equations, costs and health benefits were discounted to the present value.

Finally, the cost effectiveness of the entire health care system was calculated as:

$${{CER}}_{{{\text{health }}\;{\text{care }}}} = \frac{\Delta C}{{\Delta {{LY}}}}.$$
(5)

While life expectancy gains have a reverse impact on HCE (i.e., they are expected to increase HCE), the inclusion of this effect is justifiable if the ICER of a new technology includes life extension costs. Nevertheless, to improve the comparability of our estimate with those of published econometric analyses, the study adjusted for reverse causality, that is, it excluded the impact of the gains in life expectancy on HCE. To calculate the latter, it kept age-specific HCE between the two periods constant and determined the age-specific increase in HCE by multiplying the age-specific HCE by the age-specific increase in the remaining life expectancy. Thus, it did not determine the impact of aging on HCE but rather the pure life extension effect (in agreement with Edney et al. [8]).
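The pure life extension effect described above can be sketched as follows. This is a simplified illustration under stated assumptions: the function name and inputs are hypothetical, discounting is ignored, and the per-capita figure is obtained with the same population weights as in Eqs. (1) and (3).

```python
def life_extension_cost_per_capita(hce_by_age, delta_le_by_age, population):
    """Per-capita HCE increase caused by longer survival alone.

    hce_by_age[k]      -- age-specific HCE, held constant across periods
    delta_le_by_age[k] -- gain in remaining life expectancy at age k
    population[k]      -- number of people at age k (weights)
    """
    weighted = sum(
        hce * delta_le * n
        for hce, delta_le, n in zip(hce_by_age, delta_le_by_age, population)
    )
    return weighted / sum(population)


# Hypothetical inputs for two age groups (not German data).
example_cost = life_extension_cost_per_capita(
    hce_by_age=[1000, 2000],
    delta_le_by_age=[0.1, 0.05],
    population=[100, 300],
)
```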

2.2 Model Application

The German health care system is primarily funded by the insurance premiums paid by employees and employers. The prices of new, innovative drugs are negotiated based on their added therapeutic value. A cost-effectiveness analysis can only be requested by the SHI or the manufacturer if the negotiated or arbitrated price is unsatisfactory. Therefore, the potential of cost-effectiveness analysis to inform the prices of new, innovative drugs is currently limited. Based on a recent methodological change [15], a cost-effectiveness analysis will be conducted using a conventional ICER calculation in the future. Moreover, there are ongoing discussions about reforms in Germany’s drug assessment procedure and the role of cost-effectiveness analysis.

The analysis takes the viewpoint of the German SHI. It first calculated the marginal cost-effectiveness ratio (MCER) of the German health care system in 2014 compared to 2011 based on Eq. (5). To this end, it used data from the German Federal Office of Statistics (probabilities of survival and population size by age and gender up to the age of 100 years) [16, 17] and the Federal Social Insurance Office [18] (HCE by age and gender up to the age of 100 years) for the period between 2011 and 2014. The total HCE includes expenditures for outpatient services (including primary health care), hospital care, pharmaceuticals, medical aids, rehabilitation services, and sick pay. The latter is paid after 6 weeks of sick leave by the SHI. By contrast, co-payments, administration costs, and costs for non-mandatory health care services (including public health services) were excluded.

The year 2011 was chosen as the starting date because of the lack of availability of German data on amenable mortality (see below) for earlier years. The study performed calculations for men and women separately and then aggregated the results. In the base case it discounted both costs and effects at an annual rate of 3% [15] and varied the rate in the sensitivity analysis. To account for inflation, it adjusted costs to 2020 euros using the German Consumer Price Index [19].

To determine the proportion of LYs only influenced by the health care system, I followed the distinction between amenable and preventable mortality suggested by Eurostat [20], which is the statistical office of the EU, whose mission is to provide high-quality statistics for Europe. According to Eurostat’s definition, amenable mortality is avoidable through optimal quality health care “in the light of medical knowledge and technology at the time of death”, whereas preventable mortality is avoidable by public health interventions focusing on wider determinants of public health, such as behavior and lifestyle factors, socioeconomic status, and environmental factors. Avoidable mortality encompasses both amenable and preventable mortality but it is smaller than the sum of both because certain diseases are both preventable and amenable. Total (i.e., all‐cause) mortality is the sum of avoidable and unavoidable mortalities. The contribution of health care to the reduction in all‐cause mortality over the observational period (2011–2014) was obtained as the ratio of the difference in amenable mortality to the difference in all‐cause mortality [21]. While data on amenable mortality in Germany were obtained from Eurostat [20], data on all‐cause mortality stem from the German Federal Office of Statistics [22].
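The ratio described above can be written as a short Python sketch. The function name is hypothetical and the input rates are invented for illustration; they are not the Eurostat or Federal Office of Statistics figures. Both mortality series are assumed to be expressed in the same unit (e.g., deaths per 100,000 population).

```python
def health_care_share(amenable_start, amenable_end,
                      all_cause_start, all_cause_end):
    """Share of the all-cause mortality reduction that is amenable."""
    delta_all_cause = all_cause_start - all_cause_end
    if delta_all_cause == 0:
        raise ValueError("all-cause mortality did not change")
    return (amenable_start - amenable_end) / delta_all_cause


# Hypothetical rates per 100,000 population (illustration only).
example_share = health_care_share(
    amenable_start=100.0, amenable_end=88.0,
    all_cause_start=1000.0, all_cause_end=975.0,
)
```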

Migrants and refugees are included in both the German expenditure and mortality data. If the population became healthier over time due to a healthy migrant effect, it would decrease both total and amenable mortality (i.e., mortality that is avoidable through optimal quality health care) in the population. As the model considers both total and amenable mortality, it indirectly adjusts for population composition and hence migration.

2.3 Supplementary Analysis

While decision makers may displace marginal care when covering a new technology, health care services may alternatively be displaced at random [5]. Therefore, I also estimated the average cost-effectiveness ratio. This was accomplished by comparing today’s health and HCE with that of time “zero”, that is, before the inception of modern health care. To exemplify this approach for the German health care system, this study used a German cohort life table from 1896, which is the earliest German life table available. A cohort life table describes the actual mortality of a cohort as it ages. Based on the cohort life table, all individuals alive in 1896 died by 1960 (i.e., the maximum remaining life expectancy was 64 years). Most importantly, the mortality of this cohort was influenced by health care interventions applied until 1960.

To attribute gains in life expectancy between 1896 and 2014 to the health care system and thus control for the contribution of factors outside the health care system such as lifestyle, hygiene, nutrition, and education, I could not use the Eurostat data on avoidable mortality because they were restricted to the recent past. Perhaps the most informative study on avoidable mortality over a relatively long period was conducted in Australia between 1968 and 2001 [23]. In females, the decline in death rate from causes amenable to medical care contributed 54% to the total decline in mortality rates. In males, the corresponding contribution rate was 32%. In agreement with previous publications [14, 24] as well as our base-case analysis, I assumed that half of the life expectancy gains are attributable to the health care system. Given that this assumption is at the higher end of the range provided by Australian data, it implies a relatively large contribution of health care. Therefore, it overestimates the cost effectiveness of the health care system and underestimates the ICER threshold.

Due to lack of German data on per-capita HCE before the year 1960, the study applied two scenarios describing HCE at time “zero”, one representing a lower bound (biasing the cost-effectiveness ratio downwards) and another representing an upper bound (biasing the ratio upwards). The upper-bound scenario assumed that per-capita HCE was zero in 1896, while in the lower-bound scenario, it was assumed that per-capita HCE was at the level of 1960 [25].

3 Results

As shown in Fig. 1, both life expectancy and per-capita HCE increased monotonically from 2008 to 2015. As average per-capita HCE by gender is not regularly published in Germany (and is not quantifiable based on data from the Federal Social Insurance Office for the years 2008 and 2009), it was not displayed. A linear trend line explained more than 98% of the variance in both variables. The small oscillations around the linear trend line supported the hypothesis that HCE exerts a constant impact on life expectancy (and vice versa) and that the period considered is representative. This should alleviate concerns regarding random shocks in HCE and life expectancy.

Fig. 1 Trends in life expectancy of women and men and per-capita health care expenditure in Germany. The upper graph on life expectancy refers to women. \(R^{2}\) values are based on a linear trend line

Dividing the change in HCE by the change in life expectancy over the period of interest (2011–2014) yielded an MCER of €42,634 per LY gained (Table 1). Nevertheless, the MCER overestimates cost effectiveness because factors outside the health care system also contribute to life expectancy gains. For the period between 2011 and 2014, the calculation showed that 48% of the total mortality reduction in Germany is attributable to a reduction in aggregated amenable mortality. Based on this proportion, the MCER of current health care increased to €88,107 per LY gained (42,634/0.48). Finally, I adjusted for reverse causality. Given the small gain in per-capita life expectancy attributable to health care in the total population (0.057 years), the increase in HCE was just €135 per capita, leading to MCERs of €42,500 and €87,972 per LY gained before and after adjustment for factors outside the health care system, respectively.
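The adjustment for factors outside the health care system can be retraced with the figures reported above. The sketch below uses only those figures; the variable names are hypothetical. Dividing by the rounded share of 0.48 reproduces the reported value only approximately, which suggests (as an assumption, not a statement from the underlying calculation) that an unrounded share was used.

```python
mcer_unadjusted = 42_634      # EUR per LY gained, before adjustment
share_rounded = 0.48          # rounded share attributable to health care
mcer_reported = 88_107        # EUR per LY gained, after adjustment

# Dividing by the rounded share gives roughly EUR 88,821 per LY gained.
mcer_from_rounded_share = mcer_unadjusted / share_rounded

# The reported value implies an unrounded share of roughly 0.4839.
implied_share = mcer_unadjusted / mcer_reported
```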

Table 1 Cost-effectiveness ratios of health care spending at the margin before and after adjusting for factors outside the health care system

3.1 Supplementary Analysis

To estimate the average cost-effectiveness ratio, I assumed, for HCE at time “zero”, zero HCE in the upper-bound scenario and HCE at the 1960 level in the lower-bound scenario. Independent of the scenario, however, the average cost-effectiveness ratio was far smaller than the cost-effectiveness ratio of health care spending at the margin, ranging between €23,823 and €33,060 per LY gained, after controlling for the contribution of factors outside the health care system to LYs gained (Table 2).

Table 2 Average cost-effectiveness ratios of health care spending before and after adjusting for factors outside the health care system

4 Discussion

The base-case analysis suggested that a threshold value of nearly €90,000 per LY gained for life-prolonging new, innovative health technologies (e.g., drugs) in Germany may not deteriorate the efficiency of the German health care system. The estimate is based on a 48% contribution of medical care to life expectancy gains, thus agreeing with the estimates of amenable mortality for the 1980s in East and West Germany [26]. Even the assumption that all gains in life expectancy are attributable to the health care system yielded a lower-bound estimate for the ICER threshold as high as €42,634 per LY gained.

An ICER threshold close to €90,000 per LY gained is supported by data from an independent source, the Global Burden of Disease Study [27]. According to this source, amenable mortality in Germany decreased annually by 0.35% between 2010 and 2015 (based on the Healthcare Access and Quality Index), whereas the decrease in the number of deaths (standardized by age) was considerably larger over the same period, amounting to 1.7% per year [28]. Although a precise estimate of the contribution of medical care to changes in the number of deaths cannot be deduced from these data, they clearly imply that the contribution is less than 50% (0.35/1.7).

The supplementary analysis shows that if the German health care system displaced average care as a consequence of introducing life-prolonging new, innovative technologies, the ICER threshold would be reduced to less than half. Unlike the base-case analysis, which relies solely on official public data from German and European institutions, the supplementary analysis was subject to greater uncertainty due to additional assumptions. Nevertheless, it shows a clear direction of results.

Applying the base-case ICER threshold may lead to a reduction in HCE growth if value-based prices, which are otherwise negotiated, are decreased to meet the ICER threshold. However, it may lead to higher expenditure growth if manufacturers can increase their prices to meet the threshold. Notably, under a finite budget, a technology funded at the threshold does not improve net population health. Net population health improves only when the ICER of a technology is below the threshold.

A limitation of the model is that it attributes life expectancy gains in one year to the HCE in the same year. However, part of the life expectancy gains in year t is attributable to spending in the earlier periods. Similarly, HCE in year t causes lagged health benefits. These two biases may or may not cancel each other out. Nevertheless, given that over the period of interest, increases in life expectancy and per-capita HCE show only a small deviation from the linear trend line, the net bias is likely to be small. Given the trend toward lower life expectancy gains in Germany even before the COVID-19 pandemic [29], which was projected to continue in the future [30], considering the effect of the current HCE on future life expectancy would support our (high) estimates. Previous econometric analyses using no lag or only 1-year lags between the independent and dependent variables suffer at least from a comparable limitation.

Furthermore, while the model (Eq. 3) is based on average age-specific cost data, one may also consider age-specific costs separated according to decedent/survivor status when the corresponding data are available. Moreover, in the numerator of Eq. (1) one may weight cumulative survival probabilities by quality-of-life weights and thus calculate QALYs. A cost-per-QALY threshold would be the same as the cost-per-LY threshold calculated in this study if quality-of-life gains produced by the health care system exactly offset a quality-of-life discount of gains in life-years due to underlying diseases. Nevertheless, if quality-of-life gains were larger than the quality-of-life discount of gains in life-years, the number of QALYs gained would be larger than the number of life-years gained. In this case the cost-per-QALY threshold would be lower than the cost-per-LY threshold, that is, the cost-per-LY threshold would underestimate the value of the health care system (cf. [14]) and thus overestimate the threshold value. However, empirical evidence for the period of interest, 2011–2014, stemming independently from Eurostat [31] and the Global Burden of Disease Study (Footnote 3), suggests that quality-adjusted life expectancy has decreased as a proportion of life expectancy. According to Eurostat, which, strictly speaking, reports life expectancy without any severe or moderate health problems, the reduction was 2%. While there is a theoretical possibility that an increase in this proportion through medical care has been more than offset by a simultaneous increase in risky behavior or environmental factors, this explanation is not supported by Eurostat’s [20] data, which show a decrease in preventable deaths by 4% over the period considered.

Therefore, the data suggest that fewer QALYs than life-years have been produced. Thus, the marginal cost-per-LY threshold presented in this study is at the lower bound of the cost-per-QALY threshold. The cost-per-LY threshold could also serve as the lower bound for assessing the cost effectiveness of purely quality-of-life–improving interventions. In support of a similar threshold (or at least not a lower cost-per-QALY threshold) are intervention-level data suggesting that in a sizable fraction of cost-effectiveness analyses, quality-adjustment did not substantially alter the estimated cost-effectiveness ratios [32, 33].
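As an illustration of the quality-weighted variant of Eq. (1) mentioned in this discussion, one possible formulation mirrors Eq. (3), with the costs \(c_{j}\) replaced by quality-of-life weights. The symbols \({QALE}\) and \(w_{j}\) are hypothetical notation introduced here for illustration only, where \(w_{j}\) is assumed to be a weight between 0 and 1 for the time interval \((j, \;j + 1)\):

$${QALE}\;(y) = \frac{{\sum_{k = 1}^{q} \left( {\sum_{j = 1}^{k} \left( {\prod_{i = 1}^{j} p_{i} } \right) \cdot w_{j} } \right) \cdot N_{k} }}{{\sum_{k = 1}^{q} N_{k} }}.$$

Setting \(w_{j} = 1\) for all \(j\) recovers Eq. (1), so the cost-per-QALY and cost-per-LY ratios coincide in that special case.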

While “amenable mortality” has been widely used as an indicator to compare the performance of international health systems (e.g., by the Organization for Economic Co-operation and Development), limitations with respect to the underlying data exist. Strictly speaking, the definition of amenable deaths by Eurostat refers to “all or most deaths” that could be avoided. As Eurostat [34] may report “most deaths” but not “all deaths”, the number of amenable deaths reported by Eurostat may be smaller than in reality. Furthermore, for most causes of death included in Eurostat’s definition of avoidable mortality, there is an upper age limit of 74 years [34]. This is because deaths at older ages are often difficult to definitively attribute to a single underlying cause and the chances of death are more affected by coexisting medical conditions and other factors. As both limitations apply to amenable and preventable mortality, they cancel out, to some degree, when dividing the change in amenable mortality by the change in total mortality. This holds, in particular, when unavoidable mortality has only changed to a small degree; that is, the change in total mortality has been largely attributed to changes in avoidable mortality (as shown, e.g., for New Zealand in the period between 1981 and 1997 [35]).

The results of this study for the German health care system may not be generalizable to other countries. Important reasons are differences in the growth of per-capita HCE and life expectancy over time as well as the contribution of factors outside the health care system to life expectancy gains.

In conclusion, the results of this study support an ICER threshold in Germany that lies in the upper range of previous supply-side estimates of the ICER threshold. Future research on the efficiency of the German health care system may collect data on intertemporal differences in health-related quality of life that are amenable to health care, thus allowing the calculation of QALYs gained through health care and an extension of the model to quality-of-life improving medicines. Additionally, continuous updating of the ICER threshold estimate is recommended based on the publication of new data on health care spending, life expectancy, and amenable mortality.