Annals of Operations Research

Volume 251, Issue 1–2, pp. 301–324

Fuzzy approach to decision analysis with multiple criteria and uncertainty in health technology assessment



Decision making in health technology assessment (HTA) involves multiple criteria (clinical outcomes vs. cost) and risk (criteria measured with estimation error). A survey conducted among Polish HTA experts shows that opinions on how to trade off health against money should be treated as fuzzy. We propose an approach that allows fuzziness to be introduced into the decision-making process in HTA. Specifically, in the paper we (i) define a fuzzy preference relation between health technologies using an axiomatic approach; (ii) link it to the fuzzy willingness-to-pay and willingness-to-accept notions and present the results of a survey conducted in Poland to elicit these; (iii) incorporate uncertainty in addition to fuzziness and define two concepts to support decision making: fuzzy expected net benefit and fuzzy expected acceptability (the counterparts of expected net benefit and cost-effectiveness acceptability curves, CEACs, often used in HTA). Illustrative examples show that our fuzzy approach may remove some problems encountered with other methods (e.g., CEACs possibly being non-monotonic) and better illustrate the amount of uncertainty present in the decision problem. Our framework can be used in other multiple criteria decision problems under risk where trade-off coefficients between criteria are subjectively chosen.


Keywords: Multiple criteria decision making · Fuzzy preferences · Uncertainty · Health technology assessment · Willingness to pay · Preference elicitation

1 Introduction

The motivation for the present paper comes from the complexity of decision problems encountered in health technology assessment (HTA)—an interdisciplinary field evaluating the consequences of using health technologies (e.g., drugs, medical procedures, diagnostic tests) in order to suggest to a decision maker (e.g., the health sector regulator) which technologies should be reimbursed or recommended for use in clinical practice (Gold et al. 1996). In the current operations research literature this problem is modeled as a stochastic multiple criteria optimization task (Moreno et al. 2010; Zaric 2010). Using a survey we show that this framework should be extended, because preferences in HTA are best described using a fuzzy approach. Following this insight we develop a methodology for incorporating fuzziness into stochastic multiple criteria decision problems in HTA, and we propose tools to analyze such models.

Multiple criteria appear in HTA because treatments are assessed with respect to both their clinical effects and cost. The clinical effects are typically very complex, but in applied HTA they are often reduced to a single measure, quality-adjusted life years (QALYs), expressing preferences towards duration and quality of health (Pliskin et al. 1980; Bleichrodt et al. 1997). Still, the decision maker is left with two criteria, wanting to maximize the effects and minimize the cost at the same time. Effectively she needs to decide about her willingness to pay (WTP) for one additional unit of health effect. This WTP may be given explicitly (e.g., in Poland as of this moment, 119,577 Polish Zlotys, PLN, per one year of life in full health,1 where ca. 3.7 PLN = 1 USD) but usually it is not. For example, the National Institute for Health and Care Excellence (NICE, United Kingdom) does not reveal its decision making rules, nor does it admit whether or not such a threshold value exists. Empirical studies have located this threshold at approximately 35,000 GBP (Devlin and Parkin 2004). Even in Poland, where WTP is defined, proving that additional effects cost less than the threshold value does not guarantee reimbursement, and sensitivity analysis showing the impact of WTP on the results is usually required.

Decision problems in HTA also involve uncertainty regarding both effects and costs; the actual outcomes differ between individual patients due to inherent randomness of the treatment process (first-order uncertainty), and even though the decisions are usually based on average characteristics (i.e., the expected cost and effect), the uncertainty regarding these mean values remains (second-order uncertainty) as they are only estimated based on random samples (e.g., randomized trials, registry data, expert surveys), cf. Briggs et al. (2012) or Zaric (2010).

Various tools have been proposed in the literature to support decision makers in HTA problems in view of uncertainty and the need to perform sensitivity analysis with respect to WTP (Hout et al. 1994; Löthgren and Zethraeus 2000; Briggs and Fenn 1998; Fenwick et al. 2001; Eckermann and Willan 2011; Moreno et al. 2013). We present some of these tools in Sect. 2, but we claim that they do not resolve the actual problem, because—putting it imprecisely for now—if WTP is known, they are not necessary; and if WTP is given with a stochastic uncertainty, then this uncertainty can be merged with the uncertainty of the cost and effect measurement. Either way, we end up with a decision problem with a single stochastic criterion. The survey we conducted in Poland among HTA experts with academic, pharmaceutical industry, and consultancy backgrounds shows that they do not have crisp opinions about what value of WTP should be used, but rather point to a range of values. We claim that this ambiguity is not of a stochastic nature: there is no true value, veiled only by estimation error, that we elicit. Typically a person cannot decidedly state a single value at which to trade life (obviously important) off against money (obviously scarce). When respondents are forced to present a single number they do so, but they are not fully convinced of this value, being rather split between moderately and rather convinced. Hence, a fuzzy approach is a natural model to express such preferences.

Additionally, previous research in behavioral economics in general, but also specifically in HTA, shows that willingness to pay may greatly differ from willingness to accept (WTA), i.e., the amount we require to give up some goods (Severens et al. 2005). From an operations research perspective this can be expressed as a value trade-off between cost and effects that varies conditionally on the sign of the effects. Due to uncertainty in the measurement of both criteria, we may actually not be sure whether with a given technology we are buying health (the new technology is on average more effective than the current standard) or selling it (it is less effective), which additionally complicates the picture and should be accounted for in a proper decision making model.

To the best of our knowledge, the present paper is the first to introduce fuzzy modeling into the methodology of health technology assessment. The major contributions are as follows. Firstly, we propose a set of axioms from which to derive the decision maker’s fuzzy preference, and we formalize the way WTP and WTA should be modeled in such an approach. In order to make the model more tractable and the axioms more intuitive, we build this model in the certainty case (i.e., the cost and effect given precisely). Secondly, we report survey results showing that a fuzzy approach is indispensable to reflect the actual opinions of HTA experts, and providing insight into the properties of various methods by which fuzzy preferences can be elicited. Thirdly, we suggest tools that can be used in a decision making process when multiple criteria, uncertainty, and fuzziness are combined. In particular we show how our model built for the certainty case can be extended to account for uncertainty. The tools we propose can be juxtaposed with concepts used in standard HTA, and we show how our approach may provide a better understanding of the problem.

The motivation for the study of fuzzy multiple criteria stochastic optimization problems comes strictly from the health technology assessment field. The ideas and obtained results can also be applied to other areas where uncertainty is present and multiple criteria may be difficult to trade off against each other (e.g., environmental problems).

The structure of the paper is as follows. In Sect. 2 we define more formally the decision problems typically encountered in HTA, and introduce basic terminology. In Sect. 3 we introduce fuzziness: we first present the survey results motivating our research, and then define the fuzzy preference structure in the certainty case we build on. In Sect. 4 we introduce uncertainty, put forward our tools, verify their properties, and show illustrative examples. Section 5 concludes the paper. All proofs are given in the Appendix.

2 Decision problems in health technology assessment

In HTA, health technologies (e.g., drugs, medical procedures, diagnostic procedures, health programs) are compared with respect to, inter alia, clinical and economic criteria.2 In the present paper we focus on cost-effectiveness analysis (CEA), where these two criteria are directly confronted. The effect is often expressed as a composite measure: quality-adjusted life years (QALYs), which represent a health profile (living in some health state for some time) as an equivalent number of years in full health. QALYs can be treated as an à la von Neumann–Morgenstern (vNM) utility measure of preference towards a health profile accounting for longevity and quality of life (for an axiomatic approach see Pliskin et al. 1980; Bleichrodt et al. 1997). Various types of cost can be analyzed (drugs, hospital, medical transport, etc.) depending on the perspective of the analysis, but they are typically aggregated into a single criterion. As the motivation for the present paper comes from the decision problems encountered by the health care sector, we assume the public payer perspective.

With no uncertainty we would simply represent each technology by two numbers: its effect and its cost. In general, several technologies may be compared in a given decision problem. In the paper we propose to approach this problem by defining a preference relation between two health technologies. Therefore, in what follows we focus on comparing two technologies, denoted by \(i=1,2\). Each is then described by an ordered pair \((e_i,c_i)\) of its effect and cost, respectively. Disregarding the trivial case of dominance, we can assume without loss of generality that \(e_1>e_2, c_1>c_2\), and calculate the incremental cost-effectiveness ratio (Black 1990):
$$\begin{aligned} \text {ICER}=\frac{c_1-c_2}{e_1-e_2}, \end{aligned}$$
measuring the cost of an additional unit of effect when switching to technology 1 from technology 2 (Garber 2000). We can say that we are buying health with this switch. If WTP is known and \(\text {ICER}<\text {WTP}\), then this switch is recommended since, intuitively, we buy effects more cheaply than our reservation price. Algebraically, we can define the net benefit for each technology as \(nb_i=e_i\times \text {WTP}-c_i\), i.e., a scalar evaluation combining both criteria. Then in our case \(e_2\times \text {WTP}-c_2=nb_2<nb_1=e_1\times \text {WTP}-c_1\) and indeed technology 1 is recommended. The difference \(nb_1-nb_2\) is called the incremental net benefit. We will usually omit the word incremental for brevity when it is clear that we compare two technologies, and the analysis is done for a new technology (technology 1) trying to motivate its usage in place of a current standard (technology 2).
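As a quick numerical illustration of the definitions above, the following sketch computes the ICER and the incremental net benefit for two hypothetical technologies; the effects, costs, and the use of the Polish threshold are illustrative assumptions, not data from the paper.

```python
def icer(e1, c1, e2, c2):
    """Incremental cost-effectiveness ratio of technology 1 vs. technology 2."""
    return (c1 - c2) / (e1 - e2)

def net_benefit(e, c, wtp):
    """Scalar evaluation nb = e * WTP - c combining both criteria."""
    return e * wtp - c

# Hypothetical technologies: 1 is more effective and more costly than 2.
e1, c1 = 11.0, 150_000.0   # effect (QALYs) and cost (PLN)
e2, c2 = 10.0, 60_000.0
wtp = 119_577.0            # the Polish threshold cited in the introduction

ratio = icer(e1, c1, e2, c2)                               # 90,000 PLN/QALY
inb = net_benefit(e1, c1, wtp) - net_benefit(e2, c2, wtp)  # 29,577 PLN
# ICER < WTP, or equivalently the incremental net benefit is positive,
# so switching to technology 1 is recommended.
```

Note that comparing the ICER with WTP and checking the sign of the incremental net benefit give the same recommendation here, as the text states for the first quadrant.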

Observe that ICER cannot be sensibly defined in the case of dominance: we get \(\text {ICER}<0\) when the new technology is either dominant or dominated (and so we may get the same ICER value in two radically different situations). Additionally, if \(e_1<e_2, c_1<c_2\), then \(nb_1>nb_2\) (and so technology 1 should be used) whenever \(\text {ICER}>\text {WTP}\). Therefore, the interpretation of ICER hinges upon the signs of the differences of cost and effect (Obenchain 1997). Another element comes from behavioral economics, which defined the endowment effect: a greater reluctance to give up things the decision maker already has, as compared to the readiness to pay to obtain new things (Kahneman et al. 2009). Similarly, in HTA it was suggested that a different threshold (willingness to accept, WTA) should be used when considering a switch to a less effective but cheaper therapy (O’Brien et al. 2002) than when we switch to a more effective therapy (willingness to pay, WTP).

So far we have assumed a deterministic case, but uncertainty is an inherent element of health care markets (Arrow 1963). As mentioned in the introduction, there are two levels of uncertainty prevailing in HTA decisions. First-order uncertainty denotes that the outcome in a given patient (i.e., clinical effects and cost) is stochastic: the outcome will vary between individual patients due to the randomness of the treatment process. We assume that the decision maker disregards this type of uncertainty completely and is concerned only about the expected outcomes. We do so because typically many patients will be treated with a given technology, and so the actual average will be close to the expectation. If the cost of each individual treatment is paid from a single budget (and not from each patient’s pocket), it is the average expenditure (for a given total number of patients treated) that matters. Secondly, health effects are expressed as a vNM utility, and therefore by definition the attitude towards risk is incorporated in this measure.3

Second-order uncertainty stems from us not knowing the average outcomes precisely as these are only estimated based on sample data (e.g., clinical trials, perhaps embedded in some pharmacoeconomic model). We take the Bayesian approach in which we know the a posteriori distribution of these averages, denoted as random variables \(E_i, C_i\), for effect and cost, respectively, for technology i. This distribution is often an empirical one, obtained via Monte Carlo simulation of a pharmacoeconomic model or some bootstrapping procedure (Hunink et al. 1998; Briggs et al. 2006, 2012); the uncertainty is then represented as a cloud of points in the cost-effectiveness plane (CE-plane), cf. Fig. 1.

We take the following standpoint here, typically also assumed in HTA decision making. The decision maker is, in general, risk neutral also with respect to the second-order uncertainty, and so is interested in the mean values of the a posteriori distribution: \({\mathbb {E}}(E_i)\) and \({\mathbb {E}}(C_i)\). We can neglect the variability in effects, again due to QALY being a vNM utility. We neglect the variability in cost as it is paid from a budget collected from many individuals via taxes, so the risk of the investment is spread over many taxpayers (by means of the Arrow and Lind 1970 theorem). The rationale behind risk neutrality was discussed in greater detail, e.g., by Claxton (1999).

There are, however, several reasons to understand and present to the decision maker the magnitude of uncertainty. Firstly, it has been argued that a slight risk aversion of the decision maker could be assumed (Zivin and Bridges 2002). The decision maker might then be interested in learning the amount of uncertainty, also in order to know how much weight to put on the other criteria (besides cost-effectiveness). Secondly, knowing this amount would allow her to determine whether additional evidence is needed to make a more informed decision. Thirdly, the decision maker may perceive situations in which the analysed technology is more effective and more costly (and we buy additional health) as qualitatively different from situations in which the analysed technology is less effective and cheaper (and we sell health). Focusing only on the mean of the posterior distribution does not allow one to notice the difference between these situations.
Fig. 1

Cost-effectiveness (CE) planes for two different comparisons (quadrants numbered in corners). A single point denotes a draw from the distribution of \((E_1-E_2,C_1-C_2)\). The new technology is more effective and more costly than the standard one in the left part: \(E_1-E_2\sim N(5,1.22), C_1-C_2\sim N(5,1.22), \rho =0\); and the opposite holds in the right part: \(E_1-E_2\sim N(-5,1.22), C_1-C_2\sim N(-5,1.22), \rho =0.66\). The slopes of OA, OB depict the limits of the 95 % CI for ICER

ICER can be calculated based on the expected values of \(E_1-E_2, C_1-C_2\) (if they do not point to dominance). Should WTP be known, we could analyze a random variable denoting the (incremental) net benefit \(\text {NB}=(E_1-E_2)\times \text {WTP}-(C_1-C_2)\). We then base our decision on the expected net benefit (ENB), i.e., \({\mathbb {E}}(\text {NB})\), and measure the amount of uncertainty, e.g., as \({\mathbb {D}}^2(\text {NB})\). As WTP might not be given, another idea would be to calculate confidence intervals (CIs, typically 95 % CIs) for ICER, e.g., to see how far below our notion of WTP it lies. In Fig. 1 these 95 % CIs are depicted by the lines OA, OB (whose slopes illustrate the limits of the 95 % CI). If the cloud of uncertainty is in the 1st quadrant and the 95 % CI of ICER is below any value we could possibly think of as a WTP, then sensitivity analysis confirms the decision to be robust (i.e., we are confident that for the true value of \((e_i,c_i)\) it is indeed best to select \(i=1\)).
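The ENB and the uncertainty measure \({\mathbb {D}}^2(\text {NB})\) can be estimated directly from posterior draws. Below is a minimal Monte Carlo sketch; the parameters loosely mimic the left panel of Fig. 1 (assuming the second parameter of \(N(\cdot ,\cdot )\) there is a standard deviation), and the WTP value is a hypothetical choice on the same scale as the increments.

```python
import random

random.seed(0)

N = 100_000
wtp = 1.5  # hypothetical WTP, on the same (unitless) scale as the increments

# Independent posterior draws of the incremental effect and cost,
# loosely following Fig. 1 (left panel): N(5, 1.22), rho = 0.
d_e = [random.gauss(5.0, 1.22) for _ in range(N)]
d_c = [random.gauss(5.0, 1.22) for _ in range(N)]

# Net benefit draws, NB = wtp * dE - dC.
nb = [wtp * de - dc for de, dc in zip(d_e, d_c)]

enb = sum(nb) / N                                # estimate of E(NB)
var_nb = sum((x - enb) ** 2 for x in nb) / N     # estimate of D^2(NB)
```

Under these parameters \({\mathbb {E}}(\text {NB})=1.5\cdot 5-5=2.5\) and \({\mathbb {D}}^2(\text {NB})=1.5^2\cdot 1.22^2+1.22^2\approx 4.84\), which the estimates recover up to sampling noise.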

It has been shown that many technical difficulties arise in this approach, ICER being a ratio measure (Briggs and Fenn 1998). Some methods of calculating CIs give imprecise results, and some may yield CIs in the form of disjoint intervals, or no result at all, when \((E_1-E_2,C_1-C_2)\) is distributed over several quadrants of the cost-effectiveness plane (CE-plane). E.g., in Fig. 2 the uncertainty is scattered over all quadrants and no meaningful 95 % CI can be given for the ICER.
Fig. 2

CE-plane (left part) with uncertainty scattered over all quadrants; \(E_1-E_2\sim N(0.2,3), C_1-C_2\sim N(-0.2,3), \rho =0.9\). CEAC for a new technology (right part) is non-monotonic

Problems with CIs for ICERs led to cost-effectiveness acceptability curves (CEACs) as a tool for sensitivity analysis. CEACs are functions presenting \({\mathbb {P}}(\text {NB}\ge 0)\) (and so the probability that the decision to use technology 1 is actually correct) depending on WTP (Hout et al. 1994; Löthgren and Zethraeus 2000). In Fig. 2 we present the CEAC associated with the scatter plot in the CE-plane on the left. For instance, the CEAC value for \(\text {WTP}=1\) is simply the fraction of points below the BOE line. The impact of WTP on the CEAC can be seen as the changes in this fraction when the line rotates counter-clockwise around the origin. CEACs can always be drawn for any distribution of \((E_1-E_2,C_1-C_2)\). Severens et al. (2005) suggested a way for CEACs to account for the possibility that \(WTA\ne WTP\); in the last section we discuss how our fuzzy understanding of the decision problem casts some doubt on this suggestion.

Still, many drawbacks of CEACs have been pointed out in the literature (Fenwick et al. 2001; Sadatsafavi et al. 2008). One of the deficiencies, important for the current paper, is that CEACs need not be monotonic (Fenwick et al. 2004; Jakubczyk and Kamiński 2010), as in Fig. 2. Hence, CEACs fail to comply with the basic intuition that increasing WTP denotes an increasing reservation price, so that we should become either more willing to switch to a technology (if it is more effective) or less willing (if it is less effective). Non-monotonicity may result in a technology looking attractive for some WTP, but unattractive for WTPs both smaller and greater, which may confuse a decision maker.
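The non-monotonicity can be reproduced numerically. The sketch below estimates a CEAC from correlated posterior draws, with parameters loosely following Fig. 2 (again reading the second argument of \(N(\cdot ,\cdot )\) as a standard deviation); the estimated curve rises and then falls as WTP grows.

```python
import random

random.seed(1)

N = 50_000
rho, sd = 0.9, 3.0  # strong positive correlation, as in Fig. 2

# Correlated bivariate-normal draws of (E1-E2, C1-C2) via the
# Cholesky-style construction from two independent standard normals.
draws = []
for _ in range(N):
    z1, z2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    d_e = 0.2 + sd * z1                                          # E1 - E2
    d_c = -0.2 + sd * (rho * z1 + (1.0 - rho**2) ** 0.5 * z2)    # C1 - C2
    draws.append((d_e, d_c))

def ceac(wtp):
    """Estimated P(NB >= 0) at a given WTP, with NB = wtp*dE - dC."""
    return sum(1 for de, dc in draws if wtp * de - dc >= 0) / N

low, mid, high = ceac(0.0), ceac(1.0), ceac(100.0)
# The middle value exceeds both neighbours: the estimated CEAC first
# rises and then falls, i.e., it is non-monotonic in WTP.
```

Analytically, \(\text {NB}\sim N\bigl(0.2\,w+0.2,\; 9(w^2-1.8w+1)\bigr)\) for \(\text {WTP}=w\), so \({\mathbb {P}}(\text {NB}\ge 0)\) peaks near \(w=1\) and falls back towards \(\varPhi (1/15)\) as \(w\rightarrow \infty \), matching the simulated curve.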

Observe, furthermore, that CEACs do not resolve the basic problem of an unknown WTP, and neither do confidence intervals. We still need to know the WTP either to see if it falls into the confidence interval, or to know at which point of the CEAC to look. A CEAC cannot directly show which values of WTP are more reasonable than others. If the WTP were known, we could simply analyze the NB distribution directly. If the WTP were known as a probability distribution, we could account for it when building and analyzing the NB distribution.
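The last remark can be sketched in code: if WTP were itself a random variable, each posterior draw of the increments could be paired with an independent WTP draw, collapsing the problem to a single stochastic criterion. All distribution parameters below are hypothetical.

```python
import random

random.seed(2)

N = 100_000
nb = []
for _ in range(N):
    d_e = random.gauss(1.0, 0.3)             # incremental effect (QALYs)
    d_c = random.gauss(80_000.0, 20_000.0)   # incremental cost (PLN)
    wtp = random.gauss(110_000.0, 15_000.0)  # stochastic WTP (PLN per QALY)
    nb.append(wtp * d_e - d_c)               # WTP uncertainty folded into NB

enb = sum(nb) / N                             # analytically 110,000*1 - 80,000
p_positive = sum(1 for x in nb if x >= 0) / N # P(NB >= 0) for the merged draw
```

The point is only that once WTP is stochastic, it disappears as a separate axis: the decision reduces to analyzing the single NB distribution. The paper's claim is that expert opinion on WTP is fuzzy rather than stochastic, so this reduction is not available.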

In the next section we show that the situation with WTP/WTA is qualitatively different. The trade-off between effects and cost is perceived by decision makers in a fuzzy manner (Klir and Yuan 1995; Lee 2005), i.e., some values are acceptable, some are somewhat acceptable, etc. Fortunately, as we show, it is possible to perform elicitation of these preferences using a simple questionnaire. Accounting for fuzziness of WTP/WTA in HTA requires defining new tools (Sect. 4.1).

Below (Sect. 3.2) we start building our model. We neglect uncertainty altogether in the first step, which makes it easier to define the primitives of our model, i.e., the axioms on the preference structure. We can say we focus on a single point of the CE-plane as presented, e.g., in Fig. 1. In this simplified context we are able to introduce fuzziness and, in particular, define a fuzzy net benefit—a measure of (dis)satisfaction from a given decision that accounts for possibly different attitudes towards buying or selling health. We then (Sect. 4) introduce uncertainty, i.e., account for the fact that the decision maker does not know the actual effect and cost increments, and so does not know the actual fuzzy net benefit. Because we generally believe that the decision maker is risk neutral also with respect to second-order uncertainty, we define the expected fuzzy net benefit, which is still capable of accounting for different prices, held with varying confidence, being put on health depending on whether health is gained or lost. Nevertheless, in order to present the amount of uncertainty present in the problem (motivated by the discussion above), we also define the fuzzy acceptability concept, showing the probability that implementing a considered technology is optimal. Fuzzy acceptability can be compared to the standard CEAC, but we show (Sect. 4.2) how the fuzzy version of CEACs resolves some paradoxes of the regular ones (lack of monotonicity). We also show that the expected fuzzy net benefit may present the actual uncertainty in decision problems better than standard tools. In this sense our fuzzy approach can be treated as an improvement over the sensitivity analysis typically present in the current literature (i.e., one in which no difference is made between values of WTP/WTA according to whether they are perceived as more or less appropriate by the decision maker).

3 Fuzzy preferences on CE-plane

3.1 Survey results

In order to verify our intuition that WTP/WTA cannot be located precisely by individuals (not to mention potential inter-personal differences), we conducted a survey among HTA experts in Poland, where HTA has been developing for many years now. The Polish Agency for Health Technology Assessment was established in 2005 and since then has been supporting the Ministry of Health in the decision making process. The first HTA guidelines were issued in 2007. Currently the HTA process is strongly regulated by the reimbursement act issued in 2011 and accompanying documents. The Polish Pharmacoeconomic Society, currently grouping over 150 HTA experts, was formed as early as 2000.

We explicitly asked respondents to focus on one specific area, diabetes, to avoid dilemmas about whether they should be thinking in terms of very specific situations, e.g., rare or ultra-rare diseases.4 We informed respondents that we were asking about their opinions and preferences, not about current regulations in Poland or other countries. The list of questions (shortened and translated) is presented in Table 1. Observe that questions 4 and 6–8 relate to WTP, while question 5 relates to WTA. Questions 4 and 5 were meant to allow us to look into the asymmetry between WTP and WTA. Questions 4 and 6–8 were designed to compare methods of WTP elicitation (so as not to overcomplicate the survey, we decided not to add any more questions on WTA). Lastly, we asked about the perceived difficulty of questions 4–7.
Table 1

Questions used in a survey among HTA experts


Question | Answers allowed

1. Cost (beyond effects) should be used as one of the criteria | 5-point Likert
2. An exact threshold for ICER should be used in decisions | 5-point Likert
3. The above threshold should be publicly announced | 5-point Likert
4. If \(e_2-e_1=1\), should 2 replace 1 (for various \(c_2-c_1>0\))? | 5-point Likert
5. If \(e_2-e_1=-1\), should 2 replace 1 (for various \(c_2-c_1<0\))? | 5-point Likert
6. What threshold (range, if unsure) should be used as WTP? | A number/range
7. What single value of the above range is most appropriate? | A number
8. How convinced are you that the above value is proper? | Not at all/moderately/quite/fully
9. How difficult was it to answer questions 4–7? | Very easy/easy/difficult/very difficult

Our sample encompassed 27 respondents, but we rejected two surveys due to serious internal inconsistencies (an increasing tendency to accept a technology the more expensive it is in question 4). Three surveys were only evaluated with respect to the first question (the answer: totally disagree); this approach was predefined and explicitly mentioned in the survey. Missing answers were simply omitted, but there were only four missing values in all the questionnaires.

The general opinion prevailed that economic aspects should be included, and should be based on some publicly available threshold: the percentages of agree/totally agree answers to the first three questions amounted to 88, 90.5, and 100 %, respectively. At the same time the respondents were quite unsure what exact value should be used as WTP/WTA, as presented below.

Figure 3 presents the results of the WTP (upper part of the figure, question 4) and WTA (lower part of the figure, question 5) elicitation. In question 4 (question 5) we asked whether a technology should be used if it offers one unit of effect (QALY) more (less), when it costs more (less) by some specific amount. Various specific amounts were used, between 0 PLN and 5 million PLN. For each amount the respondents were asked to express their conviction that the technology should be financed from public resources, using a 5-point Likert scale from 1 (totally disagree), through 3 (no opinion), to 5 (totally agree). In Fig. 3 a greater fraction of respondents selecting a given conviction level for a given amount is depicted by a darker shade of gray (black denoting that all the respondents selected this answer for a given amount, white denoting that nobody gave that answer). Hence, Fig. 3 presents a general overview of the replies coming from all the surveyed. More than half (54.5 %) of the respondents used the middle Likert answer for at least one cost value. There was just one person who switched directly from totally agree to totally disagree as the incremental cost changed in question 4 (and also switched immediately in the other direction in question 5). More than half (59 %) reported a range in question 6.
Fig. 3

Results of WTP (upper part) and WTA (lower part) elicitation. Shade of gray used in both grids denotes the fraction of respondents selecting a given Likert-scale answer for a specific value paid for (or obtained for) a unit of health

Except for a single respondent (the immediate switcher mentioned above), no one was fully convinced of the exact value reported in question 7 (not even those who reported a single value in questions 6 & 7). Almost half (45 %) were at most moderately convinced. Respondents tended to find the WTA question (5) more difficult than the WTP one (4), and the question demanding that a WTP range be reported directly (question 6) more difficult than stating their conviction for predefined WTP values (question 4), both on the verge of statistical significance (Mann-Whitney test, p values 0.061 and 0.062, respectively). Obviously, question 7 about a single value was considered more difficult than question 4 (p value \(=\) 0.032), but did not differ in difficulty from question 6.

In Fig. 3 we can see that in questions 4 and 5 the individuals differed among themselves more with respect to WTA than WTP (looking vertically, more areas are shaded for the various specific amounts of cost). This indicates that there is less consensus among respondents about WTA. To obtain a more aggregated picture, we averaged the individual choices over all the respondents. We treated the 5-point Likert scale as an interval one (and transformed it to the values 0, 0.25, 0.5, 0.75, and 1, so as to be able to interpret it as a fuzzy set with the notation introduced below). Figure 4 presents the averages taken over all the respondents separately for WTP and WTA; at the aggregate level WTP and WTA seem to be approximately symmetrical. We come back to this transformation and averaging later on, having introduced our model formally.
Fig. 4

Averaged results of WTP and WTA elicitation (presented on logarithmic scale and so values for 0 PLN are omitted)
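The transformation and averaging described above amount to mapping Likert answers onto \([0,1]\) and taking the mean across respondents for each predefined cost value. A sketch with made-up answer data (the actual survey answers are not reproduced here):

```python
def likert_to_membership(answer):
    """Map a 5-point Likert answer (1 = totally disagree ... 5 = totally
    agree) onto {0, 0.25, 0.5, 0.75, 1}."""
    return (answer - 1) / 4.0

# answers[cost] = hypothetical Likert answers of four respondents to
# "should we pay this much (PLN) for one unit of effect?"
answers = {
    0:       [5, 5, 5, 4],
    100_000: [4, 4, 3, 5],
    500_000: [2, 3, 1, 2],
}

# Average membership per cost value: the aggregate fuzzy-set reading
# analogous to the curves averaged in Fig. 4.
avg_membership = {
    cost: sum(likert_to_membership(a) for a in lst) / len(lst)
    for cost, lst in answers.items()
}
```

Reading `avg_membership` as a function of cost gives a decreasing curve, interpretable as the aggregate conviction that a given price per unit of effect is acceptable.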

Averaging the left and right ends of the ranges reported in question 6 (taken to be equal for zero-length ranges) yields 88,863–125,000 PLN; averaging the single values gives 105,000 PLN.

Figure 4 shows the average WTP range reported by respondents (vertical, dashed lines, questions 6 & 7) over the graph of how strongly respondents agree on average with various levels of WTP (question 4), so that we can see what the respondents’ confidence in the values in this range is. Two things are noticeable. Firstly, the reported range is not located symmetrically (along the ordinate): respondents are rather confident about the values in this range. That means that in open questions people tend to give values they strongly believe in. Secondly, the range is quite short, considering the much slower decrease in respondents’ confidence along the WTP Likert curve. That means that people are still not fully convinced that we should be willing to buy one unit of effect even if we can buy it more cheaply than the lower bound of the range. The opposite also holds (even more strongly, due to the asymmetry of this range): respondents still rather agree than disagree that we should be willing to pay more for one unit of health than the upper bound of the range. Thus, basing the elicitation on question 7 might lead to an underestimation of the range of the decision maker’s indecisiveness. That motivates eliciting preferences using a set of predefined values and a Likert scale.

3.2 Formal representation

We take as the primitive of our model the decision maker’s preferences between pairs of technologies defined in a deterministic way, i.e., as pairs of numbers \((e,c)\). Defining our approach in the deterministic case makes it more tractable, and as we show in Sect. 4, it can be easily extended to be used when uncertainty is present. We assume from the start, however, that the preferences may not be crisp, i.e., for some pairs \((e_1,c_1), (e_2,c_2)\) the decision maker would only be able to say that she somewhat prefers \((e_1,c_1)\) to \((e_2,c_2)\). Formally we use the fuzzy weak preference relation \(\mu ^*:{\mathbb {R}}^4\rightarrow [0,1]\), where \(\mu ^*(e_1,c_1,e_2,c_2)\) denotes the conviction of the decision maker that she weakly prefers \((e_1,c_1)\) to \((e_2,c_2)\), i.e., that \((e_1,c_1)\) is at least as good as \((e_2,c_2)\). In other words, \(\mu ^*\) is a fuzzy assessment of the truthfulness of the statement: technology 1 is at least as good as technology 2.

We adopt several axioms for the preference structure. The first one allows us to simplify the notation (and also imposes some regularity on \(\mu ^*\)).

Axiom 1

(Shift invariance) Preferences depend only on differences in both criteria:
$$\begin{aligned} {\forall (e_1,c_1,e_2,c_2)\in {\mathbb {R}}^4}:\, \mu ^*(e_1,c_1,e_2,c_2)=\mu ^*(e_1-e_2,c_1-c_2,0,0) \end{aligned}$$

Axiom 1 allows us to work with a simpler fuzzy relation \(\mu :{\mathbb {R}}^2\rightarrow [0,1]\), defined as \(\mu (e,c)=\mu ^*(e,c,0,0)\) (maybe it would be more intuitive to denote arguments of \(\mu \) as \(\varDelta e, \varDelta c\), but we simplify the notation). The remaining axioms are introduced using \(\mu \) but could be also expressed for the original \(\mu ^*\). We start with an axiom typically introduced for a regular (non-fuzzy) weak preference relation in decision theory.

Axiom 2

(Reflexivity) Preference relation is reflexive: \(\mu (0,0)=1\).

We take the above axiom to explicitly interpret \(\mu \) as a weak-preference relation, but within the whole system of axioms it only determines the value of \(\mu (0,0)\) and so is of little practical importance.

The next axiom imposes that the decision maker understands the meaning of both criteria, i.e., values the effects and dislikes the cost.

Axiom 3

(Monotonicity) \(\forall e\in {\mathbb {R}}:\, \mu (e,\cdot ):{\mathbb {R}}\rightarrow [0,1]\) is non-increasing; \(\forall c\in {\mathbb {R}}:\, \mu (\cdot ,c):{\mathbb {R}}\rightarrow [0,1]\) is non-decreasing.

We also assume that the decision maker is sensitive to even very small changes whenever the criteria need not be traded against each other.

Axiom 4

(Crisp sensitivity to single criteria)
$$\begin{aligned}&\forall \varDelta >0:\, \mu (-\varDelta ,0)=\mu (0,\varDelta )=0,\\&\forall \varDelta >0:\, \mu (\varDelta ,0)=\mu (0,-\varDelta )=1. \end{aligned}$$

Axioms 2, 3, and 4 fully determine the value of \(\mu \) in the 2nd and 4th quadrants (0 and 1, respectively). Observe that the second part of Axiom 4 is not needed, as it is implied by Axioms 2 and 3; we nonetheless write it down explicitly because Axiom 2 is likely to be dropped in other approaches.

We further assume that the decision maker focuses on the relative increments of effects and cost rather than on absolute values. In other words, the possibility of implementing a decision partially or multiple times does not change the decision maker’s level of conviction that the decision is good.

Axiom 5

(Scale invariance) \(\forall \gamma >0\, \forall (e,c)\in {\mathbb {R}}^2:\, \mu (e,c)=\mu (\gamma e,\gamma c)\).

If we adopt Axioms 2, 3, 4 and 5, then for \(e>0\) knowing only \(\mu (1,c)\) for \(c>0\) unambiguously determines \(\mu (e,c)\), as scale invariance allows us to project \(\mu \) radially in the 1st quadrant. We will call \(\mu (1,\cdot )\) the fuzzy willingness-to-pay (fWTP). It can be interpreted in two ways: either as a fuzzy set of amounts the decision maker is willing to pay for a unit of effect, or as a fuzzy assessment of the truthfulness of the statement that it is worth paying a given amount of money for a unit of effect. An explicit definition (whose sensibleness does not hinge on any axioms) is presented below.

Definition 1

(fuzzy willingness-to-pay, fWTP) For a given fuzzy relation \(\mu :{\mathbb {R}}^2\rightarrow [0,1]\) define fWTP as a fuzzy set by defining its membership function: \(\text {fWTP}(x)=\mu (1,x)\).

An exemplary fWTP membership function is pictured in Fig. 4 as WTP Likert. Axioms 3 and 4 guarantee that \(\text {fWTP}(0)=1\) and \(\text {fWTP}(\cdot )\) is non-increasing. As with all fuzzy sets, we can define \(\alpha \)-cuts (Klir and Yuan 1995): \(\text {fWTP}_\alpha =\left\{ x\in {\mathbb {R}}:\, \text {fWTP}(x)\ge \alpha \right\} \), for \(\alpha >0\).5

The survey presented in Sect. 3.1 could help to elicit fWTP (i.e., define this fuzzy set). A natural approach would be to translate levels 1–5 of the Likert scale in question 4 to \(\mu =0, 0.25, 0.5, 0.75\), and 1, respectively. That would result in fWTP being a normal fuzzy set if the respondent answered 5 (on the Likert scale) to at least one subquestion of question 4 (i.e., fully agreed that some incremental cost is worth paying). Observe that such a translation per se does not impose treating consecutive levels of the Likert scale as equidistant, as long as we interpret the \(\mu \) values only ordinally. Nonetheless, in actual applications we would most likely want to average the survey results over several respondents (as is done in Fig. 4), i.e., average the \(\mu \) values. For this to be a sensible operation we need to interpret the \(\mu \) values cardinally, and then the way we translate the 1–5 Likert scale to the [0, 1] interval has an impact. In this paper we do not pursue these issues further, as we focus on the theoretical properties of our approach. Assuming the simple translation and the possibility of averaging \(\mu \) values, in our survey we obtained on average \(\text {fWTP}(10^5)\approx 0.73\), which means that our respondents were on average rather convinced that it is worth paying 100,000 PLN for a unit of effect.
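The translation-and-averaging step described above can be sketched in a few lines. All numbers below (respondents’ answers, thresholds) are made up for illustration, and `likert_to_mu` is a hypothetical helper name:

```python
# Hypothetical Likert answers (levels 1-5) of three respondents to
# subquestions of the form "it is worth paying x PLN per unit of effect".
answers = {
    50_000:  [5, 5, 4],
    100_000: [5, 4, 3],
    150_000: [3, 2, 2],
    300_000: [1, 1, 2],
}

def likert_to_mu(level):
    """Translate Likert levels 1..5 to mu = 0, 0.25, 0.5, 0.75, 1."""
    return (level - 1) / 4

# Averaging the mu values over respondents (a cardinal interpretation)
# yields one point of the averaged fWTP membership function per threshold.
fwtp = {x: sum(likert_to_mu(a) for a in ans) / len(ans)
        for x, ans in answers.items()}
```

By construction the averaged curve is non-increasing in the threshold whenever the individual answers are, mirroring the monotonicity of fWTP.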

If Axioms 2, 3, 4 and 5 hold, then fWTP suffices to describe preferences in the right half of the CE-plane. That does not determine the left-hand side, however, and so either more assumptions or a separate elicitation are required. Various axioms could be thought of. One idea would be to impose some form of antisymmetry and connectedness to obtain, e.g., \(\forall (e,c)\in {\mathbb {R}}^2\setminus \{(0,0)\}:\,\mu (-e,-c)=1-\mu (e,c)\). That comes at the price that the decision maker cannot be indifferent with full conviction between two non-identical technologies. The survey results show that this condition is at least closely approximated at the average level (Fig. 4), yet the discrepancies were greater for many individuals. In our approach we prefer to keep the flexibility, and so we simply note that with Axioms 2, 3, 4 and 5, \(\mu (e,c)\) for \(e\le 0\) is unambiguously determined by \(\mu (-1,c)\) for \(c<0\). Then \(\mu (-1,\cdot )\) can be used to define the fuzzy willingness-to-accept (fWTA), a counterpart of fWTP. fWTA can also be interpreted in two ways: either as a fuzzy set of amounts the decision maker agrees to receive in exchange for losing one unit of effect, or as a fuzzy assessment of the truthfulness of the statement that it is worth giving up one unit of effect in exchange for a given amount of money.

Definition 2

(fuzzy willingness-to-accept, fWTA) For a given fuzzy relation \(\mu :{\mathbb {R}}^2\rightarrow [0,1]\) define fWTA as a fuzzy set by defining its membership function: \(\text {fWTA}(x)=\mu (-1,-x)\).

An exemplary fWTA membership function is pictured in Fig. 4 as WTA Likert. It comes from translating Likert-scale answers to \(\mu \) values, similarly as for fWTP (and so with similar assumptions that make this process sensible). Again, Axioms 3 and 4 guarantee that \(\text {fWTA}(0)=0\) and \(\text {fWTA}(\cdot )\) is non-decreasing. We define \(\alpha \)-cuts, \(\text {fWTA}_\alpha =\left\{ x\in {\mathbb {R}}:\, \text {fWTA}(x)\ge \alpha \right\} \), for \(\alpha >0\). Just as previously, we could rebuild our preference structure in the left part of the CE-plane using fWTA.

We impose one more axiom that makes the technicalities easier to handle. It simply says that no change of effect can be so favorable (or unfavorable) that it cannot be compensated by a sufficiently large change in cost.

Axiom 6

(Criteria tradeability, non-lexicographic preferences)
$$\begin{aligned}&\forall e>0\, \exists c\in {\mathbb {R}}:\, \mu (e,c)=0,\\&\forall e<0\, \exists c\in {\mathbb {R}}:\, \mu (e,c)=1. \end{aligned}$$

This axiom guarantees that the \(\alpha \)-cuts of fWTP and fWTA are neither empty sets nor the whole real line for \(0<\alpha \le 1\).

In what follows we assume Axioms 1–6 hold. Exemplary values of \(\mu (\cdot ,\cdot )\) are shown in Fig. 2: below the AOF line \(\mu =1\), above the COD line \(\mu =0\), and \(\mu \) changes linearly (along vertical lines) in between. Observe how differently various regions of the CE-plane should be treated. Points immediately below the OA line denote situations in which we surely prefer using the evaluated technology, and so do points below the OF line (and in the whole 4th quadrant). Points within the AOB triangle denote situations in which this preference is somewhat weaker, and so do points in EOF. Then, points in BOC denote only a very weak preference, just as points in DOE.
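A preference relation of this shape can be sketched directly. We assume, for illustration only, a piecewise-linear fWTP with full conviction up to a slope of 0.5 and no conviction above 1.5 (roughly matching Fig. 2), a mirror-image fWTA, and extend both to the whole CE-plane via scale invariance (Axiom 5); all function names and numbers are ours, not the paper’s elicited values:

```python
def fwtp(x):
    """Illustrative fWTP membership: 1 up to 0.5, linearly down to 0 at 1.5."""
    return max(0.0, min(1.0, 1.5 - x))

def fwta(x):
    """Illustrative fWTA membership: 0 up to 0.5, linearly up to 1 at 1.5."""
    return max(0.0, min(1.0, x - 0.5))

def mu(e, c):
    """Fuzzy weak preference of (e, c) over (0, 0).

    Scale invariance projects mu radially: mu(e, c) = mu(1, c/e) = fWTP(c/e)
    for e > 0, and mu(e, c) = mu(-1, c/|e|) = fWTA(c/e) for e < 0.
    """
    if e > 0:
        return fwtp(c / e)
    if e < 0:
        return fwta(c / e)
    return 1.0 if c <= 0 else 0.0  # vertical axis, per Axioms 2-4
```

With these membership functions, e.g., \(\mu (1,0.5)=1\), \(\mu (1,1.5)=0\) and \(\mu (2,2)=0.5\), reproducing the linear change between the AOF and COD lines.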

An important difference from regular CEAC is noticeable. With CEAC we look at the fraction of points below a given line, e.g., AOD or COF, not noticing that reducing the slope in the 1st quadrant makes us focus on regions in which our preference is stronger, while automatically reducing the slope in the 3rd quadrant does the opposite at the same time. The overall effect is then difficult to interpret.

4 Decision analysis with fuzzy preferences and uncertainty

4.1 Definitions

As stated in Sect. 2, we want to extend the definitions presented there for crisp evaluations of WTP/WTA to their fuzzy counterparts. We start with the case when e and c are deterministic (a comparison of technologies represented by a single point in the CE-plane) and next move to the uncertainty case.

We denote \(e=e_1-e_2\) and \(c=c_1-c_2\), i.e., the incremental effect and cost of a new therapy versus some standard, similarly as when expressing \(\mu ^*\) in terms of \(\mu \) at the beginning of Sect. 3.2. In Sect. 2 we were interested in the value of the incremental net benefit, which can be denoted as \(e\times \text {WTP}-c\), and in whether it is greater than or equal to 0. Starting with the latter: \(e\times \text {WTP}-c\ge 0\) means that \((e,c)\) is weakly preferred to (0, 0) when WTP is crisp. Therefore, we extend this condition to the fuzzy case in a direct way by stating that it is represented by \(\mu (e,c)\). Notice that if \(\mu \) is crisp (takes only the values 0 and 1), then this definition is equivalent to the former one (if we assign the value 1 when the inequality is true, and 0 otherwise). We will call \(\mu (e,c)\) the acceptability.

With the incremental net benefit we proceed in the following way. Observe that technology \((0,-e\times \text {WTP}+c)\) offers exactly the same net benefit as \((e,c)\). This suggests that the net benefit may be defined using preferences between various alternatives. Namely, the net benefit of \((e,c)\) is at least x if \((e,c)\) is weakly preferred to \((0,-x)\). We can automatically transfer this to the fuzzy approach, introducing \(\mu \) and using Axiom 1 to reduce the comparison to one against (0, 0). The definition below formalizes this discussion.

Definition 3

(fuzzy net benefit, fNB) For given \((e,c)\in {\mathbb {R}}^2\) define a fuzzy number \(\text {fNB}_{(e,c)}\) by defining its membership function \(\text {fNB}_{(e,c)}:{\mathbb {R}}\rightarrow [0,1]\) as \(\text {fNB}_{(e,c)}(x)=\mu (e,c+x)\) (we will omit the subscript \((e,c)\) when not needed).

\(\text {fNB}_{(e,c)}(x)\) is interpreted as the conviction of the truthfulness of the statement: the net benefit of \((e,c)\) is at least \(x\). \(\text {fNB}(\cdot )\) is non-increasing (Axiom 3) and takes the values 0 and 1 for some arguments (Axiom 6). Additionally, \(\text {fNB}_{(e,c)}(x)\) for any \(x\in {\mathbb {R}}\) weakly increases with e and weakly decreases with c. The \(\alpha \)-cuts \(\text {fNB}_\alpha \) for any \((e,c)\) are left-unbounded intervals. When we need to refer explicitly to the dependence of the \(\alpha \)-cut \(\text {fNB}_\alpha \) on \((e,c)\) we will write \(\text {fNB}_\alpha (e,c)\).
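Definition 3 translates directly into code. A minimal sketch, assuming an illustrative piecewise-linear \(\mu \) (our assumption for demonstration, not elicited values):

```python
def mu(e, c):
    # Illustrative preference: fWTP(x) = clip(1.5 - x), fWTA(x) = clip(x - 0.5),
    # extended to the whole CE-plane by scale invariance.
    if e != 0:
        ratio = c / e
        return max(0.0, min(1.0, (1.5 - ratio) if e > 0 else (ratio - 0.5)))
    return 1.0 if c <= 0 else 0.0

def fnb(x, e, c):
    """fNB_(e,c)(x): conviction that the net benefit of (e, c) is at least x."""
    return mu(e, c + x)
```

For example, `fnb(-0.5, 1, 1)` equals 1 (the decision maker is sure the net benefit of (1, 1) is at least \(-0.5\), since paying 0.5 per unit of effect is surely acceptable under this \(\mu \)), while `fnb(0.5, 1, 1)` equals 0.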

Another definition, more directly related to how the crisp net benefit was defined, can be proposed, but for it to hold we need to assume that the sets \(\left\{ (e,c):\, \mu (e,c)\ge \alpha \right\} \) are closed (which is equivalent to stating that \(\text {fWTP}\) is left-continuous and \(\text {fWTA}\) is right-continuous). The following proposition presents the details.

Proposition 1

For any \((e,c)\in {\mathbb {R}}^2\) and \(\alpha \in ]0,1]\) define
$$\begin{aligned} \tau _\alpha (e,c)=\left\{ \begin{array}{ll} e\times \sup (\text {fWTP} _\alpha )-c &{} \quad \text {for }e\ge 0,\\ e\times \inf (\text {fWTA} _\alpha )-c &{} \quad \text {for }e<0. \end{array}\right. \end{aligned}$$
If \(\forall \alpha \in ]0,1]:\, \left\{ (e,c)\in {\mathbb {R}}^2:\, \mu (e,c)\ge \alpha \right\} \) is closed then \(]-\infty ,\tau _\alpha (e,c)]=\text {fNB} _\alpha (e,c)\).

If the prerequisite of the above proposition holds, then we could use this (perhaps more intuitive) approach to define fNB.
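For an illustrative piecewise-linear fWTP/fWTA (assumed, for demonstration, to have \(\alpha \)-cut bounds \(\sup (\text {fWTP}_\alpha )=1.5-\alpha \) and \(\inf (\text {fWTA}_\alpha )=0.5+\alpha \)), \(\tau _\alpha \) has a simple closed form:

```python
def tau(alpha, e, c):
    """Right end of fNB_alpha(e, c) as in Proposition 1, for illustrative
    alpha-cut bounds sup(fWTP_a) = 1.5 - a and inf(fWTA_a) = 0.5 + a."""
    assert 0 < alpha <= 1
    rate = (1.5 - alpha) if e >= 0 else (0.5 + alpha)
    return e * rate - c
```

Demanding full conviction is the most conservative choice: `tau(1.0, 1, 0)` gives 0.5, while the weaker conviction level `tau(0.25, 1, 0)` gives 1.25.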

We can now introduce uncertainty, which results in various pairs \((e,c)\) possibly representing the consequences of our choice. We assume that uncertainty is quantified. Mathematically, we have a sample space \(\varOmega \), a set of events \({\fancyscript{F}}\), and a probability distribution \({\fancyscript{P}}:{\fancyscript{F}}\rightarrow [0,1]\). Pairs \((e,c)\) depend on \(\omega \in \varOmega \), and to simplify notation we take \(\varOmega ={\mathbb {R}}^2\) and so identify \((e,c)\) with \(\omega \). We denote by E and C the random variables distributed according to \({\fancyscript{P}}\).

Because of the risk neutrality assumption we adopt, we want to average the uncertainty out of fNB. We can use the fact that in our model the fuzzy numbers are determined by the right bounds of their \(\alpha \)-cuts for all \(\alpha \in ]0,1]\). Therefore, the \(\alpha \)-cuts of the expected value of the fuzzy net benefit (fENB) can be defined as intervals whose right bounds are the expected values of these right bounds. A formal definition is presented below.

Definition 4

(fuzzy expected net benefit, fENB) For each \(0<\alpha \le 1\) define a fuzzy number fENB by defining its \(\alpha \)-cuts as \(\text {fENB}_\alpha =]-\infty ,\phi _\alpha ]\), where
$$\begin{aligned} \phi _\alpha =\int _\varOmega \sup \left( {\text {fNB}}_\alpha \left( e,c\right) \right) \,{\mathrm {d}}{\fancyscript{P}}. \end{aligned}$$
For technical completeness define \(\text {fENB}_0={\mathbb {R}}\), and then we can define the membership function as \(\text {fENB}(x)=\sup \left\{ \alpha \in [0,1]:\, x\in \text {fENB}_\alpha \right\} \).

fENB is a fuzzy number. It represents the decision maker’s conviction (at various strengths \(\alpha \)) about the expected value of the net benefit. As we have not extrapolated the decision maker’s preferences to lotteries, we cannot directly interpret fENB in terms of preferences (as we did with fNB). We can construct, however, the following interpretation. Imagine a decision maker considering whether to adopt a new technology described with uncertainty. Assume that she will find out the true \((e,c)\) at some point in time after the decision. Then \(\text {fENB}_\alpha \) represents her expected satisfaction (measured as NB) for various levels of conviction. She will be able to say, e.g.: I am rather convinced that by adopting the new technology I can on average count on such and such NB. Apart from having an interpretation, fENB behaves intuitively in limiting cases: when uncertainty is reduced, fENB converges to fNB (Proposition 4); and when fuzziness disappears, fENB converges to the regular (crisp) ENB (Proposition 3).
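Given a sample from \({\fancyscript{P}}\), the right bound \(\phi _\alpha \) of Definition 4 is simply the sample average of the \(\tau _\alpha \) values. A Monte Carlo sketch, where both the bivariate normal distribution and the illustrative \(\alpha \)-cut bounds inside `tau` are our assumptions:

```python
import random

def tau(alpha, e, c):
    # Illustrative alpha-cut bounds: sup(fWTP_a) = 1.5 - a, inf(fWTA_a) = 0.5 + a.
    return e * ((1.5 - alpha) if e >= 0 else (0.5 + alpha)) - c

def fenb_bound(alpha, sample):
    """Monte Carlo estimate of phi_alpha, the right end of fENB_alpha."""
    return sum(tau(alpha, e, c) for e, c in sample) / len(sample)

rng = random.Random(0)
sample = [(rng.gauss(1.0, 0.5), rng.gauss(1.0, 0.5)) for _ in range(10_000)]
bounds = {a: fenb_bound(a, sample) for a in (0.25, 0.5, 0.75, 1.0)}
# bounds is non-increasing in alpha: stronger conviction forces
# more conservative statements about the expected net benefit.
```

Plotting `bounds` against \(\alpha \) would produce a curve of the kind shown in the left part of Fig. 5.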

Observe that under our definition the \(\alpha \)-cuts of fENB are right-closed, even though in general we did not assume anything about the continuity of \(\mu (\cdot ,\cdot )\), and so about the right-closedness of \(\text {fNB}_\alpha \). This might lead to some strange results, e.g., if \({\fancyscript{P}}\) is concentrated in a single point \((e,c)\) and \(\mu (\cdot ,\cdot )\) is not upper semi-continuous at this point, then \(\text {fNB}\ne \text {fENB}\). We neglect this problem as it is of no practical importance, but an easy solution might be to add an axiom requiring preferences (of any conviction \(\alpha \in ]0,1]\)) to hold in the limit. Formally, we would stipulate that \(\forall \alpha \in ]0,1]:\, \left\{ (e,c)\in {\mathbb {R}}^2:\, \mu (e,c)\ge \alpha \right\} \) is closed.

The left part of Fig. 5 presents an exemplary fENB calculated for the situation illustrated in Fig. 2. The interpretation would be the following: the decision maker is fully convinced that her net benefit on average amounts to at least \(-0.8\); she is somewhat convinced that the average net benefit is above 0 (as she is somewhat convinced that such a favorable WTP/WTA can be applied); and she does not believe at all that an expected value of NB as high as 1.5 can be obtained.
Fig. 5

fENB (left) and fEA (right) for data presented in Fig. 2

We can now introduce uncertainty into the notion of acceptability defined at the beginning of this section. For each \((e,c)\), \(\mu (e,c)\) denotes the confidence that the new technology is acceptable. Under uncertainty we can imagine a decision maker who is going to find out the true \((e,c)\) after the decision. Then the decision maker may sensibly ask a priori about the probability of ascertaining a posteriori that the decision was good. Of course, saying that the decision was good depends on the conviction level required. In other words, we want to define a concept allowing us to express the probability that the decision is good at some confidence level \(\alpha \). Formalities follow.

Definition 5

(fuzzy expected acceptability, fEA) For each \(\alpha \in ]0,1]\) define a fuzzy number fEA by defining its \(\alpha \)-cuts:6
$$\begin{aligned} \text {fEA}_\alpha =\left[ 0,\int _\varOmega {\mathbf {1}}_{[\alpha ,1]}(\mu (e,c)) \,{\mathrm {d}}{\fancyscript{P}}\right] . \end{aligned}$$
For technical simplicity define \(\text {fEA}_0=[0,1]\), and then we can define the membership function as \(\text {fEA}(x)=\sup \left\{ \alpha \in [0,1]:\, x\in \text {fEA}_\alpha \right\} \).

In the right part of Fig. 5 we present an exemplary fEA calculated from the example illustrated in Fig. 2. The interpretation would be the following: the decision maker is fully convinced that the probability (stemming from estimation uncertainty) that the new technology is better exceeds ca. 0.32, she is somewhat convinced (at \(\alpha =0.8\)) that it exceeds ca. 0.4, etc. Notice that fEA could also be interpreted differently: as a description of the probability distribution of \(\mu (e,c)\), from which we could read, e.g., the probability (a priori) that the decision maker will be convinced to such and such a degree (a posteriori) that the decision was good. So there is about a 32 % probability that the decision maker will be fully convinced that the decision was good once she has learned the true \((e,c)\).
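The right bound of \(\text {fEA}_\alpha \) in Definition 5 is the probability that \(\mu (E,C)\ge \alpha \), which is straightforward to estimate by Monte Carlo. A sketch under an illustrative piecewise-linear \(\mu \) (all numbers assumed, not elicited):

```python
import random

def mu(e, c):
    # Illustrative preference built from fWTP(x) = clip(1.5 - x)
    # and fWTA(x) = clip(x - 0.5) via scale invariance.
    if e != 0:
        ratio = c / e
        return max(0.0, min(1.0, (1.5 - ratio) if e > 0 else (ratio - 0.5)))
    return 1.0 if c <= 0 else 0.0

def fea_bound(alpha, sample):
    """Right end of fEA_alpha: estimated P(mu(E, C) >= alpha)."""
    return sum(1 for e, c in sample if mu(e, c) >= alpha) / len(sample)

rng = random.Random(1)
sample = [(rng.gauss(0.5, 1.0), rng.gauss(0.5, 1.0)) for _ in range(20_000)]
# fea_bound(1.0, sample) estimates the probability of being fully
# convinced, a posteriori, that the decision was good.
```

By construction the bound is non-increasing in \(\alpha \): demanding a stronger conviction can only shrink the set of favorable outcomes.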

Observe how the above results differ from the regular CEAC in the right part of Fig. 2. When analyzing the CEAC we would focus on the values of WTP we consider sensible. The values of \(\mu (\cdot ,\cdot )\) assumed (Fig. 2, left part) show that the decision maker is sure it makes sense to pay at least 0.5 for a unit of health \((\mu (1,0.5)=1)\), and sure it does not make sense to pay as much as 1.5 \((\mu (1,1.5)=0)\). CEAC values in the range [0.5, 1.5], denoted by vertical lines in Fig. 2, suggest that the probability of the new technology being the better choice exceeds 55 %, and so suggest that we are on the safe side whatever WTP we take (from the range we are willing to consider). fEA is more conservative: it shows that with full conviction we can only hope for a probability of around 0.32; thus the risk related to this decision is greater.

4.2 Properties

Whether fENB and fEA defined above can be successfully applied in decision support hinges upon their properties. We start with the trivial one.

Proposition 2

\(\text {fENB} (\cdot )\) and \(\text {fEA} (\cdot )\) are non-increasing.

The proposition and its proof are straightforward. Still, as we mentioned in Sect. 2, monotonicity may sometimes be surprisingly violated for tools commonly used in HTA, as is the case with the CEAC (cf. Fig. 2). For fENB and fEA the desired property holds: if I want to be more convinced of the statements I make, these statements must be more conservative.

The next two propositions aim to further motivate the definitions of fENB and fEA. As these concepts combine uncertainty with fuzziness, it is interesting to see how they behave when either of these aspects is removed from the decision problem. As the following propositions state, we then get what we intuitively expect.

Proposition 3

If WTP and WTA are crisp and equal (as crisp numbers), i.e., they are represented in our model as step functions \(\text {fWTP}(x)={\mathbf {1}}_{]-\infty ,t]}(x)\) and \(\text {fWTA}(x)={\mathbf {1}}_{[t,+\infty [}(x)\), \(t\) being the common value, then \(\text {fENB}(x)={\mathbf {1}}_{]-\infty ,{\mathbb {E}}(Et-C)]}(x)\) and \(\text {fEA}(x)={\mathbf {1}}_{]-\infty ,{\mathbb {P}}(Et-C\ge 0)]}(x)\), where E and C are random variables describing the uncertainty of \((e,c)\).

The conclusion is that fENB and fEA are direct extensions of their crisp counterparts (ENB and CEAC, both for given WTP). Therefore, fENB and fEA manage to include fuzziness, but when fuzziness disappears we are brought back to the realm of standard techniques.

fENB may illustrate the impact of changing uncertainty in the CE-plane better than CIs or CEACs. Consider a bivariate random variable \((E,C)\) normally distributed around the vector of means (0, 0). Assume the two dimensions are uncorrelated and have standard deviations of either 5 (both of them) or 1 (both of them). In both cases standard CIs for ICERs cannot be calculated, and the regular CEACs are simply horizontal lines at the 50 % level. Calculating the regular ENB (for any WTP) would yield 0 in both cases (small and large uncertainty) as well. Hence, neither standard tool allows us to visualize the difference in the amount of uncertainty present. Fortunately, fENB is sensitive to the amount of uncertainty, as presented in Fig. 6 (the left part referring to \(\text {SD}=5\), the right one to \(\text {SD}=1\)).
Fig. 6

fENB reacts when uncertainty is reduced from \(\text {SD}=5\) (left) to \(\text {SD}=1\) (right) (for the specification of the model used please refer to the text)
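The sensitivity of fENB to the spread can be reproduced numerically. A sketch using the distribution from the example above together with illustrative \(\alpha \)-cut bounds inside `tau` (the preference numbers are our assumption):

```python
import random

def tau(alpha, e, c):
    # Illustrative bounds: sup(fWTP_a) = 1.5 - a, inf(fWTA_a) = 0.5 + a.
    return e * ((1.5 - alpha) if e >= 0 else (0.5 + alpha)) - c

def phi(alpha, sd, n=50_000, seed=42):
    """Right end of fENB_alpha for (E, C) ~ N((0, 0), diag(sd^2, sd^2))."""
    rng = random.Random(seed)
    return sum(tau(alpha, rng.gauss(0, sd), rng.gauss(0, sd))
               for _ in range(n)) / n

# The crisp ENB is 0 for every WTP in both cases, yet at full
# conviction the fENB bound drifts further below 0 as the spread grows.
wide, narrow = phi(1.0, 5.0), phi(1.0, 1.0)
```

The fully-convinced statement about the expected net benefit is thus markedly more conservative under \(\text {SD}=5\) than under \(\text {SD}=1\), exactly the difference that CIs, CEACs and the crisp ENB fail to show here.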

We showed above how fENB and fEA behave when fuzziness is reduced. In the last proposition we show formally what happens to fENB and fEA when uncertainty gets smaller and smaller (e.g., due to obtaining more data); we consider not a single \((E,C)\) bivariate random variable, but rather a sequence of such variables involving less and less uncertainty.

Proposition 4

Take a sequence of random variables \((E_{(i)},C_{(i)})\), \(i\in \mathbb {N}\), converging in probability to a degenerate distribution located at \((e^*,c^*)\in {\mathbb {R}}^2\). Then the following holds (all limits for \(i\rightarrow +\infty \)):
  • if \(\mu \) is continuous at \((e^*,c^*)\) then \(\forall x\in ]0,1[:\, \text {fEA} _{(i)}(x)\rightarrow \mu (e^*,c^*)\),

  • \(\forall \alpha \in ]0,1]:\, \text {fENB} _{(i),\alpha } \rightarrow \text {fNB} _\alpha (e^*,c^*)\),

where we consider convergence of \(\alpha \)-cuts in terms of Hausdorff distance (in our case—right ends of \(\alpha \)-cuts converge as regular numbers). We added subscript (i) to differentiate fENB and fEA for successive random variables \((E_{(i)},C_{(i)})\) in the sequence.7

Several comments are due here. Convergence in probability of \((E_{(i)},C_{(i)})\) will follow when the sample used to estimate the true effect and cost increments increases in size.8 Continuity of \(\mu \) at \((e^*,c^*)\) means that we are not considering points at which the fuzzy conviction measure changes abruptly whenever we move the slightest bit in any direction. That precludes using the above proposition for fEA at \((e^*,c^*)=(0,0)\), as the converging distribution may still substantially vary the total mass concentrated in the various quadrants of the CE-plane.

Further, observe that different types of convergence are used for fEA and fENB. fEA converges vertically, i.e., for each value of probability \(x\in ]0,1[\), the conviction that x is indeed the probability that the decision to adopt the new technology is right converges to \(\mu (e^*,c^*)\). fENB converges horizontally, i.e., for each value of conviction \(\alpha \in ]0,1]\), the maximal value of ENB obtained with this conviction converges.

The limits of both fEA and fENB are fuzzy numbers, but the limit of fEA is, in a sense, a simpler one as its membership function is constant for almost all points, i.e., for \(x\in ]0,1[\). The limit of fENB is a typical fuzzy number, i.e., fNB.

The behavior of \(\text {fEA}(\cdot )\) at 0 and 1 may be non-intuitive at first sight. \(\text {fEA}(0)=1\) always, by the very definition (Eq. 3). \(\text {fEA}(1)\) need not converge to \(\mu (e^*,c^*)\) if the distribution of \((E_{(i)},C_{(i)})\) allows the slightest possibility of lower values of \(\mu \). Figure 7 illustrates the convergence of fEA (darker colors denote less uncertainty).
Fig. 7

fEA converges to point \(\mu (e,c)\) (inside the ]0, 1[ interval) when uncertainty is reduced (darker color represents smaller uncertainty)

5 Conclusions

Problems with deciding on the level of WTP/WTA stem from the fact that individuals find it hard to give a single most appropriate value, because a wide range of values may look somewhat appropriate, to a greater or lesser degree. We attribute this phenomenon to the fact that WTP/WTA require a decision maker to perform a difficult assessment of how she values life. That is why it is necessary to develop methods accounting for fuzziness in HTA decision making, which is what we do in this paper. As we show, a fuzzy approach to WTP/WTA can be motivated axiomatically, quantified, elicited, and embedded in formal decision support methods.

Standard methods of decision analysis used in HTA, e.g., the CEAC, present results for a whole range of WTP, simply postponing the necessity of deciding on the (range of) WTP in order to know which region of the CEAC to look at. The methods we propose (fENB, fEA) try to take the unknown WTP/WTA out of the picture. The price is that the decision maker needs to select fuzzy confidence levels, so one might say that one parameter was simply replaced by another. In some situations that is truly the case, e.g., when uncertainty is contained in the 1st quadrant (e.g., Fig. 1, left part). Then the CEAC is increasing with WTP, larger WTP means lower \(\mu \), and so lower membership is associated with larger probability in fEA. Importantly, however, our methods provide new insight when uncertainty is scattered over several quadrants, which is often the case in HTA practice. As the example from Fig. 2 shows, CEACs may throw points very different from one another into one bag: e.g., calculating the CEAC with the AOD line as a cutoff, we consider both points that denote outcomes we like with great confidence (1st quadrant) and points we barely like (just below the OD segment). As we discussed in Sect. 4.1, this may result in an overly optimistic assessment. fEA approaches it differently, combining points with an equal threshold of confidence. Changing this threshold fans out the region at which we look. Obenchain (2008) also discussed the rationale for looking at regions fanning out from the 4th quadrant of the CE-plane, but without a formal consideration of the fuzziness of the decision maker’s preferences.

Trying to use the proposed methods in actual decision making would require eliciting the preferences of individuals first. As shown, various techniques could be used. Our results suggest that using a Likert scale and predefined thresholds is better than directly asking for a range. Based on previous suggestions in the literature, we decided to elicit WTP and WTA separately. That permits, first of all, WTP and WTA to differ. Secondly, asking about a money-health trade-off coefficient (either directly or in Likert-scale form) without specifying the context (buying health or selling health, i.e., switching to a more costly/more effective or a less costly/less effective technology) might be difficult for respondents. A given amount might sound appropriate in the context of buying health, and be very unsuitable in the context of selling health. In a sense, the respondents would have to solve two dilemmas at the same time. Looking at Fig. 2 (left part), is the true coefficient equal to the slope of OA, OB or OC, or something else? Taking this approach would lead to WTP being a fuzzy number, i.e., it would have \(\alpha \)-cuts in the form of bounded intervals.

Our survey results also shed some new light on the issue of WTP and WTA not being equal. If we look at the average Likert-based curves in Fig. 4, we see that WTP and WTA look like mirror images to the left of their crossing. To the right, the WTP curve decreases quickly, while WTA increases much more slowly. If we wanted to find a 0.5-conviction interval for WTP/WTA (i.e., values that translate into confidence of 0.25 and 0.75), we would end up with (95,000; 200,000) for WTP, and (110,000; 500,000) for WTA. The upper limits of the conviction intervals differ, then, which means that a large amount is needed to strongly convince individuals that it makes sense to give up some health in exchange for that amount.

Greater differences between WTP and WTA arise for different reasons, though. As mentioned in Sect. 3.1, individuals, when asked about a range of WTP, tend to report ranges of which they are strongly convinced, rather than ranges located centrally and symmetrically on their conviction scale. If the same happens for WTA, then we should expect ranges around (400,000; 1,000,000), as these values provide confidence of around 0.75. In a fuzzy context it is then not correct to say that \(\text {WTA}>\text {WTP}\), but rather that for \(\alpha \)s large enough \(\sup (\text {fWTA}_\alpha )>\sup (\text {fWTP}_\alpha )\), while for small \(\alpha \)s the opposite may hold. We did not ask questions about a range for WTA, as we had not wanted to end up with too lengthy a survey, and we had not foreseen the above-mentioned phenomenon. That might be one of the issues to pursue in the future.

Let us come back to the comparison with standard decision analysis tools. Severens et al. (2005) showed that CEACs can account for differences between WTP and WTA by assuming a fixed ratio \(\frac{\text {WTA}}{\text {WTP}}\) and presenting a regular CEAC as a function of WTP. This ratio can be changed, resulting in several CEACs being plotted. Observe a crucial difference between their approach and ours. In their approach, moving to the right (i.e., increasing WTP) also increases WTA (via the fixed ratio); hence we account for a larger part of the 1st quadrant, but for a smaller part of the 3rd quadrant. In our approach, taking a larger cut of the 1st quadrant means we want to be more permissive in our decisions. That should be accompanied by also taking a bigger part of the 3rd quadrant, and this is what happens in our approach. Hence, the two approaches are very different in their philosophy.

In the paper we suggested some axioms, believing they can be naturally assumed for comparing health technologies. Importantly, the methods we introduce can still be used even if we drop some axioms. Axiom 2 is unimportant in the typical case of a continuous probability distribution. We can further drop Axiom 5 and still meaningfully use fENB and fEA, but then fWTP makes no sense. Dropping Axioms 4 and 6 would lead to technical difficulties, as some \(\alpha \)-cuts \(\text {fNB}_\alpha \) might be empty or unbounded on the right, causing problems in the definition of fENB; fEA, however, is still sensibly defined then. Dropping Axiom 3 does not cause technical problems but makes the whole preference structure very odd. Dropping Axiom 1 would require redefining the whole approach.

We focused on the fuzziness in the preferences of each individual separately, but in general it makes sense to combine the preferences of individuals to make decisions that impact everyone in society. Fuzziness might still be a good way to account for differences between individuals. How convinced are we, as a society, that we value health at a given amount? We could say that, e.g., a society is totally convinced if everyone agrees that it is worth spending this amount, rather convinced if a substantial fraction agrees, etc. Our methodology could also be applied directly under this interpretation.

The very basic rationale for the present paper comes from the inability to decide on a single WTP/WTA value. We must notice, however, that there are situations in which this threshold will be given. Firstly, the decision maker could simply adopt a contractual approach and decide to select the WTP/WTA threshold arbitrarily, not so as to fulfill somebody’s preferences best, but simply to provide a transparent basis for decision making. No preferences are directly addressed, but the decision rule is explicit. Secondly, if an exogenous budget constraint is given, then a constrained optimization problem might be posed, where the decision variables denote which technologies are used (Weinstein and Zeckhauser 1973). Then the WTP/WTA threshold is given endogenously as part of the solution of this problem (informally, as the largest \(\frac{{\mathbb {E}}(C)}{{\mathbb {E}}(E)}\) ratio among the accepted technologies). If we adopted this point of view, then tools such as the CEAC or the 95 % CI for the ICER would also be made obsolete, which is not the case in real-life HTA decision making. Eichler et al. (2004) and Gafni and Birch (2006), among others, present a more general discussion of where the thresholds (hard or soft) could possibly come from.

Further research could address extending the results to the simultaneous comparison of more than two technologies (as standard CEACs can handle). Another line would be to translate other methods used in standard HTA decision making, e.g., the expected value of perfect information, into the fuzzy context. The results presented here can also be treated more generally and applied directly to areas other than HTA in which multiple criteria, uncertainty, and dilemmas about what trade-offs to accept are present.


  1. Defined in the Reimbursement Act to be equal to triple the annual gross domestic product per capita.

  2. There are other criteria that may be used in HTA; e.g., the decision maker may want to give priority to treatments offered to particularly vulnerable populations, such as patients with rare or ultra-rare diseases.

  3. Another issue arises here, beyond the main interest of this paper: on what grounds can we aggregate utilities between patients, neglecting the problem of inter-personal utility comparisons? Briefly, due to ethical concerns we do not want to differentiate the treatment provided to individual patients because of differences in their preferences. That is why in HTA we typically measure the clinical improvement in a single patient objectively: length of life is fully objective, and we can use some definition of health-related quality of life, such as EQ-5D. We can then translate health states into utilities using a single population-based value set. Hence, we do not aggregate utilities but rather apply a single utility function to all patients.

  4. It is sometimes claimed that higher drug prices and higher WTP ratios are justified in these situations as the only way to stimulate the R&D process when the demand is small.

  5. Throughout the paper we use the notation that if X is the name of a fuzzy set, then \(X(\cdot )\) denotes the membership function defining this set, and \(X_\alpha \) denotes its \(\alpha \)-cut. Obviously, the membership function and the \(\alpha \)-cuts can be derived from each other, and both can serve equally well to define a fuzzy set.

  6. By \({\mathbf {1}}_A(x)\) we denote the standard indicator function, whose value is 1 if \(x\in A\), and 0 if \(x\notin A\).

  7. In Sect. 2 we denoted by \(E_i\) and \(C_i\) the random variables describing the uncertainty of the effect and cost assessment of technology i. To clarify, here \(E_{(i)}\) and \(C_{(i)}\) denote a sequence of random variables, with each element of the sequence (indexed by i) measuring the uncertainty of the difference between the effects and costs of the two alternatives being compared.

  8. Whenever the underlying first-order uncertainty permits the central limit theorem to hold, i.e., the inter-patient variability is such that increasing the sample size allows the average effect and cost to be estimated more precisely.

  9. In the standard Euclidean metric in \({\mathbb {R}}^2\), denoted by \(d(\cdot ,\cdot )\).


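The duality between membership functions and \(\alpha \)-cuts mentioned in footnote 5 can also be illustrated numerically: by the decomposition theorem, \(X(x) = \sup \{\alpha : x \in X_\alpha \}\), so the membership function is recoverable from the cuts. A sketch (the grid resolution and function names are our own):

```python
def membership_from_cuts(x, cut, n_levels=1000):
    """Recover X(x) from alpha-cuts via X(x) = sup{alpha : x in X_alpha}
    (decomposition theorem); `cut(alpha)` returns the interval [lo, hi].
    Numerical approximation over a grid of alpha levels."""
    best = 0.0
    for i in range(1, n_levels + 1):
        alpha = i / n_levels
        lo, hi = cut(alpha)
        if lo <= x <= hi:
            best = alpha
    return best

# Triangular fuzzy number peaking at 1 on [0, 2]: cuts are [alpha, 2 - alpha]
tri = lambda a: (a, 2.0 - a)
print(membership_from_cuts(0.5, tri))  # -> 0.5
```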

Acknowledgements

It would not have been possible to collect the survey results presented in the paper without the help of M. Niewada, who facilitated the contact with the respondents. We would like to acknowledge the help of the HTA experts who participated in the survey: K. J. Filipiak, K. Jahnz-Różyk, and the others, who opted to remain anonymous. We also express our gratitude to D. Golicki, T. Macioch, W. Wrona, and again M. Niewada, who commented on the first version of the survey.


References

  1. Arrow, K. (1963). Uncertainty and the welfare economics of medical care. The American Economic Review, 53(5), 141–149.
  2. Arrow, K., & Lind, R. (1970). Uncertainty and the evaluation of public intervention decisions. The American Economic Review, 60, 364–378.
  3. Billingsley, P. (1999). Convergence of probability measures (2nd ed.). New York: Wiley.
  4. Black, W. (1990). The CE Plane: A graphic representation of cost-effectiveness. Medical Decision Making, 10, 212–214.
  5. Bleichrodt, H., Wakker, P., & Johannesson, M. (1997). Characterizing QALYs by risk neutrality. Journal of Risk and Uncertainty, 15, 107–114.
  6. Briggs, A., Claxton, K., & Sculpher, M. (2006). Decision modelling for health economic evaluation. Oxford: Oxford University Press.
  7. Briggs, A., & Fenn, P. (1998). Confidence intervals or surfaces? Uncertainty on the cost-effectiveness plane. Health Economics, 7(8), 723–740.
  8. Briggs, A., Weinstein, M. C., Fenwick, E. A., Karnon, J., Sculpher, M. J., & Paltiel, A. D., on behalf of the ISPOR-SMDM Modeling Good Research Practices Task Force. (2012). Model parameter estimation and uncertainty analysis: A report of the ISPOR-SMDM Modeling Good Research Practices Task Force Working Group-6. Medical Decision Making, 32(5), 722–732.
  9. Claxton, K. (1999). The irrelevance of inference: A decision-making approach to the stochastic evaluation of health care technologies. Journal of Health Economics, 18(3), 341–364.
  10. Devlin, N., & Parkin, D. (2004). Does NICE have a cost-effectiveness threshold and what other factors influence its decisions? A binary choice analysis. Health Economics, 13, 437–452.
  11. Eckermann, S., & Willan, A. (2011). Presenting evidence and summary measures to best inform societal decisions when comparing multiple strategies. Pharmacoeconomics, 29(7), 563–577.
  12. Eichler, H. G., Kong, S., Gerth, W., Mavros, P., & Jönsson, B. (2004). Use of cost-effectiveness analysis in health-care resources allocation decision-making: How are cost-effectiveness thresholds expected to emerge? Value in Health, 7(5), 518–528.
  13. Fenwick, E., Claxton, K., & Sculpher, M. (2001). Representing uncertainty: The role of cost-effectiveness acceptability curves. Health Economics, 10(8), 779–787.
  14. Fenwick, E., O’Brien, B., & Briggs, A. (2004). Cost-effectiveness acceptability curves – facts, fallacies and frequently asked questions. Health Economics, 13, 405–415.
  15. Gafni, A., & Birch, S. (2006). Incremental cost-effectiveness ratios (ICERs): The silence of the lambda. Social Science & Medicine, 62, 2091–2100.
  16. Garber, A. (2000). Advances in cost-effectiveness analysis of health interventions. In A. J. Culyer (Ed.), Handbook of health economics (Vol. 1A, pp. 181–221). Amsterdam: North-Holland.
  17. Gold, M., Siegel, J., Russell, L., & Weinstein, M. (Eds.). (1996). Cost-effectiveness in health and medicine. Oxford: Oxford University Press.
  18. Hunink, M. G., Bult, J. R., de Vries, J., & Weinstein, M. C. (1998). Uncertainty in decision models analyzing cost-effectiveness: The joint distribution of incremental costs and effectiveness evaluated with a nonparametric bootstrap method. Medical Decision Making, 18(3), 337–346.
  19. Jakubczyk, M., & Kamiński, B. (2010). Cost-effectiveness acceptability curves – caveats quantified. Health Economics, 19, 955–963.
  20. Kahneman, D., Knetsch, J., & Thaler, R. (2009). Experimental tests of the endowment effect and the Coase theorem. In E. L. Khalil (Ed.), The new behavioral economics: Tastes for endowment, identity and the emotions (Vol. 3, pp. 119–142). London: Elgar.
  21. Klir, G., & Yuan, B. (1995). Fuzzy sets and fuzzy logic: Theory and applications. Englewood Cliffs, NJ: Prentice Hall.
  22. Lee, K. (2005). First course on fuzzy theory and applications. Berlin: Springer.
  23. Löthgren, M., & Zethraeus, N. (2000). Definition, interpretation and calculation of cost-effectiveness acceptability curves. Health Economics, 9, 623–630.
  24. Moreno, E., Girón, F., Vázquez-Polo, F., & Negrín, M. (2010). Optimal healthcare decisions: Comparing medical treatments on a cost-effectiveness basis. European Journal of Operational Research, 204, 180–187.
  25. Moreno, E., Girón, F., Vázquez-Polo, F., & Negrín, M. (2013). Optimal treatments in cost-effectiveness analysis in the presence of covariates: Improving patient subgroup definition. European Journal of Operational Research, 226, 173–182.
  26. Obenchain, R. (1997). Issues and algorithms in cost-effectiveness inference. Biopharmaceutical Report, 5(2), 1–7.
  27. Obenchain, R. (2008). ICE preference maps: Nonlinear generalizations of net benefit and acceptability. Health Services and Outcomes Research Methodology, 8, 31–56.
  28. O’Brien, B., Gertsen, K., Willan, A., & Faulkner, L. (2002). Is there a kink in consumers’ threshold value for cost-effectiveness in health care? Health Economics, 11, 175–180.
  29. Pliskin, J., Shepard, D., & Weinstein, M. (1980). Utility functions for life years and health status. Operations Research, 28(1), 206–224.
  30. Sadatsafavi, M., Najafzadeh, M., & Marra, C. (2008). Acceptability curves could be misleading when correlated strategies are compared. Medical Decision Making, 28(3), 306–307.
  31. Severens, J., Brunenberg, D., Fenwick, E., O’Brien, B., & Joore, M. (2005). Cost-effectiveness acceptability curves and a reluctance to lose. Pharmacoeconomics, 23(12), 1207–1214.
  32. van Hout, B., Al, M., Gordon, G., & Rutten, F. (1994). Costs, effects and C:E-ratios alongside a clinical trial. Health Economics, 3, 309–319.
  33. Weinstein, M., & Zeckhauser, R. (1973). Critical ratios and efficient allocation. Journal of Public Economics, 2, 147–157.
  34. Zaric, G. (2010). Cost-effectiveness analysis, health-care policy, and operations research models. In Wiley encyclopedia of operations research and management science. Wiley. doi:10.1002/9780470400531.eorms0202.
  35. Zivin, J., & Bridges, J. (2002). Addressing risk preferences in cost-effectiveness analysis. Applied Health Economics and Health Policy, 1(3), 135–139.

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Decision Analysis and Support Unit, Warsaw School of Economics, Warsaw, Poland
