Introduction

Increasing scrutiny of healthcare costs leads to a demand for proof of value for all medical expenditures. Cost-effectiveness analyses (CEA) are intended to provide additional information about the possibilities of maximizing health effects, taking limited health care resources into account [1]. CEA have already become common practice in the evaluation of disease treatment strategies and diagnostic screening programs [2]. Diagnostic imaging (DI) is currently the fastest growing category of medical expenditure [3, 4]. In recent years, an increasing number of CEA of DI technologies have been published [5–12], though broad application has yet to occur. The distinct central role diagnostic imaging plays in medical decision-making, as well as the continued emergence of new and varied imaging technologies, increases the importance of cost-effectiveness evaluation in imaging technology assessment. Several articles provide an overview of the theory of CEA in DI [13–15]. Although they contain excellent technical background, radiologists and other DI professionals may still feel insecure in performing and interpreting CEA, as economic evaluation is not part of medical training. Even for doctors who have received additional training, performing CEA in DI is challenging due to the lack of standardized methodologies. Furthermore, the effects on both costs and health outcome largely depend on the treatment decisions that are made on the basis of the imaging results. Consideration of these remote indirect effects requires more complex methodologies for CEA in DI compared with CEA of therapeutic services [16]. Synthesis of available evidence, incorporated in decision-analytic modelling, forms the link between a diagnostic test and its effects in terms of costs and health outcome. A comprehensive practical guide to the use of decision modelling techniques can be found in the book by Briggs et al. [17]. The aim of this article is to provide an introduction to the tools necessary to perform and interpret CEA. We thereby transfer the theory of evidence synthesis and decision-analytic modelling to practical clinical research by demonstrating key principles and steps of CEA in diagnostic imaging.

Rationale of cost-effectiveness analysis in diagnostic imaging

A cost-effectiveness analysis is the comparative analysis of alternative courses of action in terms of both their costs and consequences. In imaging, these alternative courses of action can be the utilization of different imaging techniques or, more generally, imaging versus no imaging. The rationale of CEA in DI is that the choice of DI test influences both the costs and the effectiveness of disease management. In a conceptual framework developed by Fineberg et al. [18] and modified by others [19], the effectiveness of a diagnostic test is expressed on successive hierarchical levels: technical performance, diagnostic accuracy, diagnostic impact, therapeutic impact and health outcome. Effectiveness in terms of patients’ health outcome is indirectly influenced by the diagnostic test through the medical care decisions based on imaging. Health outcome can also be directly affected by the imaging test itself. Health effects can be physical, for example because of altered treatment, and psychological, for example because of receiving a diagnosis. Direct and indirect health effects can be measured in utility scores, from which Quality Adjusted Life Years (QALYs) can be derived, combining survival and quality of life. Both physical and psychological health conditions are incorporated in a QALY. Costs are directly affected by the cost of the diagnostic test itself and indirectly influenced by the costs of the treatment chosen on the basis of imaging and the resulting costs of patients’ health outcome. Figure 1 illustrates the concept of these direct and indirect effects used in CEA in DI.

Fig. 1 Comprehensive framework of cost-effectiveness analysis in diagnostic imaging

When assessing the cost-effectiveness of DI, the initial question is whether adding an imaging test to a medical pathway improves medical decision-making. Assuming a hypothetical 100 % accuracy for the test, the changes in outcome achieved by adding this particular test to the medical pathway can be assessed. Only if the perfectly accurate test provides added value is it reasonable to continue the analysis with the actual sensitivity and specificity of the test [20]. However, in many clinical situations imaging is already a standard part of disease management. CEA is, therefore, used to compare potential new imaging technologies or imaging strategies with each other as well as with the current reference standard. We will focus on this second scenario throughout the article.

Decision-analytic modelling

While studies are being performed to inform decision makers about optimal DI tests, these studies often focus on the accuracy and short-term effects of the imaging test. However, for decision-making it is also important to know how well a test can help to improve health outcome (e.g., survival and/or quality of life) [16, 21]. In addition, the societal impact may be relevant, including the cost-effectiveness or value for money of a diagnostic test. It is possible to undertake trials that include such long-term consequences, but these trials are often costly and practically challenging: the remoteness of the effects requires long follow-up periods and large sample sizes. Moreover, withholding a non-invasive imaging test that might provide useful diagnostic information may create ethical dilemmas.

Given these limitations, decision-analytic modelling is often used to synthesize data from trials with other available evidence [13]. Hence, the characteristics of a diagnostic test (e.g., sensitivity and specificity) can be linked with long-term patient outcomes. The results of these models tend to better fit the needs of decision makers.

The development and analysis of a decision-analytic model proceeds in a stepwise fashion. Comprehensive guidelines were recently published [22]. We present a six-step methodology for CEA in DI: (1) Defining the decision problem, (2) Choosing and further developing the decision model, (3) Selecting input parameters, (4) Analysis of values and uncertainty, (5) Interpretation of the results, and (6) Transferability and validation.

Defining the decision problem

A research question is developed with a clear statement of the imaging decision problem, the modelling objectives and the scope of the model. Factors to consider are the decision maker (e.g., government or medical doctor), the perspective (e.g., a societal perspective if the government is the decision maker, or a health care perspective for the medical doctor), the comparators (which imaging strategies should be compared in the analysis), the outcomes of the model (e.g., which health consequences are relevant), and the target population (for which patients is the decision relevant). It is important to note that this step should be driven by the decision problem and research question, and not by data availability [23]. We will illustrate how a decision model can be used for medical imaging by addressing the question of whether 18FDG-positron emission tomography computed tomography (18FDG-PET-CT) is cost-effective compared with CT or conventional chest X-ray in the follow-up of patients with non-small cell lung cancer (NSCLC) after radical radiotherapy [6]. The decision maker in this example is the medical doctor with a health care perspective, choosing between three imaging techniques for the follow-up of patients with NSCLC after potentially curative treatment. The expected costs and effects were calculated over a period of five years. Outcome is defined as patients’ health state, measured in QALYs.

Developing the decision model

Following the definition of the decision problem, the second step consists of developing the decision model by choosing an appropriate model structure. In medical imaging, decision trees and Markov models are most commonly used [24].

The decision tree

To visualize and calculate the effects of imaging on costs and health outcome in a static situation and over a short time frame, a flow model in the form of a decision tree can be used, which is easily built in Excel (Microsoft, Redmond, WA) or other spreadsheet software. A hypothetical cohort of patients passes through the model and is divided over the different pathways in the decision tree according to the assigned probabilities. A decision tree starts with a decision node, after which the alternative strategies are depicted. In each strategy, patients in the model do or do not have the disease or condition of interest and undergo an imaging test for diagnostic workup. Medical care decisions are based on the imaging results and determine whether patients undergo further imaging or other diagnostic tests and possible therapy. At the end of the decision tree, the cohort is divided over the different possible health outcomes, determined by the characteristics of the diagnostic imaging and the choice of therapy. These health outcomes can be assigned QALYs, based on whether patients are alive and, if they are alive, whether they experience problems that may impact quality of life. Costs are the sum of the costs that occur in the pathways, such as the costs of the imaging tests and subsequent therapy, and health outcome-related costs. Two or more diagnostic tests can be compared with each other: using the sensitivity and specificity of the different imaging modalities, the probabilities of being accurately diagnosed, with the corresponding optimal treatment, can be assessed in the decision tree model. A schematic decision tree for the example of 18FDG-PET-CT versus CT versus chest X-ray in the follow-up of curatively treated NSCLC is shown in Fig. 2, which is a simplified version of the decision tree used in a recently published study [6].

Fig. 2 Schematic example of a decision tree model
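
As a minimal illustration of how such a decision tree is "rolled back" to expected values, the following Python sketch splits a hypothetical cohort over the four combinations of disease status and test result and weights costs and QALYs by the corresponding probabilities. All probabilities, costs and utilities are illustrative assumptions, not the input values of the published model [6].

```python
def expected_values(prevalence, sensitivity, specificity,
                    cost_test, cost_treat, cost_missed,
                    qaly_treated, qaly_missed, qaly_healthy, qaly_overtreated):
    """Return expected cost and expected QALYs per patient for one strategy."""
    # Split the cohort over the four combinations of disease and test result
    tp = prevalence * sensitivity              # disease present and detected -> treated
    fn = prevalence * (1 - sensitivity)        # disease present but missed
    tn = (1 - prevalence) * specificity        # no disease, correctly reassured
    fp = (1 - prevalence) * (1 - specificity)  # false alarm -> unnecessary treatment

    cost = (cost_test                          # everyone incurs the test cost
            + (tp + fp) * cost_treat           # assume all positives are treated
            + fn * cost_missed)                # delayed detection is more expensive
    qalys = (tp * qaly_treated + fn * qaly_missed
             + tn * qaly_healthy + fp * qaly_overtreated)
    return cost, qalys

# Roll back the tree for two hypothetical imaging strategies
for name, sens, spec, c_test in [("PET-CT", 0.95, 0.90, 1200.0),
                                 ("CT", 0.85, 0.85, 300.0)]:
    cost, qalys = expected_values(prevalence=0.30, sensitivity=sens, specificity=spec,
                                  cost_test=c_test, cost_treat=15000.0, cost_missed=25000.0,
                                  qaly_treated=1.6, qaly_missed=0.8,
                                  qaly_healthy=1.9, qaly_overtreated=1.7)
    print(f"{name}: expected cost €{cost:.0f}, expected QALYs {qalys:.2f}")
```

Setting sensitivity and specificity to 1.0 in this sketch reproduces the "perfect test" analysis described earlier and indicates the maximum value the test could add.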

The Markov model

In non-static situations, such as chronic diseases or malignancies, the health states of patients may change over time, depending on imaging and treatment options. In such situations, a Markov model can be used to reflect changes in health states during the period covered by the CEA. In a Markov model, the clinical situation is described in terms of the conditions patients can be in (‘health states’), how they can move between these states (‘transitions’), and how likely such moves are (‘transition probabilities’) [25]. The time frame of the analysis can be chosen to suit the underlying medical question and the available literature. For CEA of screening imaging tests and follow-up imaging in cancer, a long time frame of several years or a lifetime may be appropriate. For example, in a Markov model used for the decision problem of optimal follow-up imaging after treated NSCLC, patients can be in the health states “no evidence of disease”, “progressive disease” or “dead” (Fig. 3). Progressive disease can be detected (i.e., the true positive test results) or not (the false negative test results). No evidence of disease represents the paths in the decision tree with true negative and false positive test results. At each time point of follow-up imaging, patients either stay in the state they are in or move to a different state, e.g., from no evidence of disease to progressive disease (and vice versa) or from no evidence of disease or progressive disease to dead. Note that in this example the Markov model follows directly from the health outcomes of the decision tree. It is common practice in imaging studies to combine a decision tree for the short-term diagnostic accuracy and treatment decision with a Markov model for the longer-term consequences of the disease.

Fig. 3 Schematic example of a Markov model
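
A cohort Markov model of this kind can be sketched in a few lines of Python; the 3-month transition probabilities, state utilities and state costs below are illustrative assumptions only and do not reproduce the published NSCLC model [6].

```python
import numpy as np

states = ["no evidence of disease", "progressive disease", "dead"]
# Transition matrix for one 3-month cycle; rows = current state, columns = next state
P = np.array([[0.90, 0.08, 0.02],
              [0.10, 0.75, 0.15],
              [0.00, 0.00, 1.00]])

utility = np.array([0.80, 0.55, 0.00])        # quality-of-life weight per state
state_cost = np.array([500.0, 4000.0, 0.0])   # cost per cycle per state (€)

cohort = np.array([1.0, 0.0, 0.0])            # the whole cohort starts disease-free
cycle_years = 0.25
total_qalys = total_cost = 0.0

for _ in range(20):                           # 20 cycles of 3 months = 5 years
    cohort = cohort @ P                       # redistribute the cohort over the states
    total_qalys += float(cohort @ utility) * cycle_years
    total_cost += float(cohort @ state_cost)

print(f"Expected QALYs per patient over 5 years: {total_qalys:.2f}")
print(f"Expected costs per patient over 5 years: €{total_cost:.0f}")
```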

Model input parameters

To predict costs and health outcome as effects of DI, both the direct and indirect effects of the imaging test itself on health outcome and costs should be considered as model input parameters (Fig. 1).

Direct effects

Direct effects are those that apply to all patients undergoing the diagnostic test, regardless of the outcome of the test. Direct effects of imaging modalities on health vary with the modality used, from negligible (e.g., ultrasound) to considerable in invasive techniques such as catheter angiography. The “considerable” effects are mainly a consequence of the risk of complications inherent to these invasive techniques. The use of contrast agents in diagnostic imaging carries the risk of adverse reactions and nephrotoxicity [26–29]. The effects of these sequelae on health status depend on the chemical properties of the contrast agent used and the route of administration (oral, intravenous, intra-arterial, intrathecal). Radiation exposure from diagnostic tests using X-rays has a small but not negligible effect on health status, especially in models addressing a large patient population. The risk of inducing cell mutations is present in all diagnostic modalities using X-rays and increases with radiation dose and exposure time. Data on these risks are provided in the Biological Effects of Ionizing Radiation (BEIR) reports [30]. Besides these stochastic effects of radiation exposure, deterministic effects (tissue reactions) can occur at high doses or long exposure times. These effects occur above a certain threshold and should, therefore, only be considered if there is a possibility that this threshold will be reached. Direct psychological health effects of imaging should be considered in CEA of DI if relevant, for example if imaging is very burdensome [31, 32]. These effects consist of potential short-term psychological effects of the imaging test itself.

Direct costs of diagnostic testing include depreciation of the hardware and the costs of personnel and materials (e.g., contrast agents). Factors such as imaging time and the mode of equipment utilization (24/7 versus office hours) need to be considered in determining these direct costs, since both define the patient throughput of the imaging equipment. These direct costs rise particularly with expensive hardware and long examination times, as in MRI and PET. In addition, costs caused by the adverse events and complications described above need to be included, weighted by their prevalence. These costs comprise additional hospitalization and treatment costs. Finally, the societal costs of radiation-induced malignancies can be attributed as additional costs to those tests for which this is applicable. Ideally, real costs are used in a CEA; tariffs or reimbursement fees should only be used if they accurately reflect real costs.
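
As a simple, hypothetical illustration of how equipment utilization drives these direct costs, the following sketch divides assumed annual equipment and personnel costs by the annual number of examinations and adds per-examination material costs; all figures are invented for illustration.

```python
def cost_per_exam(annual_equipment_cost, annual_staff_cost,
                  exams_per_year, material_cost_per_exam):
    """Approximate direct cost per examination from fixed and variable costs."""
    fixed_share = (annual_equipment_cost + annual_staff_cost) / exams_per_year
    return fixed_share + material_cost_per_exam

# Office-hours use versus 24/7 use of the same hypothetical MRI scanner
print(cost_per_exam(400_000, 300_000, exams_per_year=4_000, material_cost_per_exam=20))  # ≈ €195
print(cost_per_exam(400_000, 350_000, exams_per_year=9_000, material_cost_per_exam=20))  # ≈ €103
```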

Indirect effects

Indirect effects of a diagnostic test depend on the consequences of the test result. Both health status and the costs related to this health status are characteristics of the patient; diagnostic tests do not directly change these parameters. Diagnostic tests guide the management of patients. Assuming optimal circumstantial factors, a “perfect” diagnostic test will theoretically result in optimal management. One may assume that optimal management leads to the best health status (though not necessarily to the lowest costs). “Imperfect” diagnostic tests will in some cases lead to suboptimal management. The effect of diagnostic tests on outcome parameters therefore lies only in the imperfection of the test, usually presented as sensitivity and specificity, or positive and negative predictive values. Indirect psychological effects are gaining more attention and consist of the positive or negative psychological effects of the diagnostic information of the test result on a patient’s view of his or her health [32–34].

Outcome parameters for health status are generally represented as quality-adjusted life years (QALYs), although other outcome parameters might be used for specific study objectives (e.g., life years saved). QALYs are derived from the general life expectancy estimates of the target population and the disutilities (i.e., reductions in quality of life) related to the disease of interest, the treatment, and the diagnostic test. The use of QALYs allows the relative importance of false positive and false negative results of a diagnostic test (reflected in sensitivity and specificity) to be weighed. Costs are calculated from the allocated treatment, medication, hospitalization, etc. Furthermore, known societal costs of specific health conditions can be included where applicable. Table 1 provides a schematic overview of model input parameters for the example of different follow-up imaging strategies in curatively treated NSCLC [6].

Table 1 Schematic example of model input parameters
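
As a minimal illustration of the underlying arithmetic, assuming hypothetical utility weights, the QALYs accrued in each health state are the time spent in that state multiplied by its utility, where the utility is a baseline value minus the applicable disutilities:

\[ \text{QALYs} = \sum_i t_i \, u_i, \qquad u_i = u_{\text{baseline}} - \sum_j d_{ij} \]

For example, two years spent in a health state with a baseline utility of 0.85 and a treatment-related disutility of 0.10 would yield 2 × (0.85 − 0.10) = 1.5 QALYs.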

Analysis of values and uncertainty

Cost-effectiveness modelling serves two purposes. First, it provides an estimate of the expected costs and outcomes for medical decision-making. Second, it assesses the uncertainty around these estimates and their range of validity [36].

The expected costs and outcomes can be obtained by multiplying the probabilities by the relevant costs and health outcomes. This results in average expected values per patient, for costs as well as health outcome. Specific software (e.g., TreeAge Pro, TreeAge Software, Williamstown, MA, U.S.) can be used for this purpose, but Markov models and decision trees are also often analyzed using standard PC software (Microsoft Excel). The quality of the estimates derived from the model directly depends on the quality of the input parameters. In general, input parameters should be as evidence-based as possible. However, model input parameters are surrounded by uncertainty. Parameter uncertainty reflects the fact that we do not know the precise values of the model input parameters. This can result from multiple, conflicting, or low-quality studies, or simply from variation in the data, as reflected in the standard deviation around a mean estimate. Parameter uncertainty can be addressed by a deterministic or probabilistic sensitivity analysis [36]. In a deterministic sensitivity analysis the mean values of the input parameters are varied to assess their impact on the total costs and outcomes of the model. Even when input parameters are not available, hypothetical values and ranges can be entered into the model to obtain hypothetical estimates.
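
A one-way deterministic sensitivity analysis can be sketched as follows: one input parameter (here the test sensitivity) is varied between plausible bounds while all other inputs remain at their base-case values. The model and all parameter values are hypothetical and only serve to illustrate the mechanics.

```python
def model(sensitivity, specificity=0.90, prevalence=0.30,
          cost_test=1200.0, cost_treat=15000.0, cost_missed=25000.0):
    """Hypothetical single-strategy decision-tree model returning (cost, QALYs)."""
    tp = prevalence * sensitivity
    fn = prevalence * (1 - sensitivity)
    tn = (1 - prevalence) * specificity
    fp = (1 - prevalence) * (1 - specificity)
    cost = cost_test + (tp + fp) * cost_treat + fn * cost_missed
    qalys = tp * 1.6 + fn * 0.8 + tn * 1.9 + fp * 1.7
    return cost, qalys

# One-way analysis: vary sensitivity between its assumed lower and upper bounds
for sens in (0.85, 0.90, 0.95, 0.99):
    cost, qalys = model(sensitivity=sens)
    print(f"sensitivity {sens:.2f}: expected cost €{cost:.0f}, expected QALYs {qalys:.2f}")
```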

Alternatively, parameter uncertainty can be addressed by means of a probabilistic sensitivity analysis. Each uncertain model input parameter is assigned a distribution, based on the mean and standard deviation/error found in the literature. Examples of distributions for the input parameters can be found in Table 1. In the probabilistic sensitivity analysis, repeated samples are drawn from these distributions. For each sample, the hypothetical cohort runs through the model based on the sampled values, and costs and outcomes are derived. This results in a range of estimates of costs and outcome, representing the uncertainty in the result of the cost-effectiveness estimation. This range is usually expressed as a 95 % confidence interval (CI), based on the probabilistic analysis. In our example, the expected costs per patient are € 15,266 (95 % CI € 14,072–16,440) and the expected QALYs 1.30 (95 % CI 1.00–1.61), if all patients were to receive PET-CT at three months of follow-up.
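
The sketch below illustrates the mechanics of a probabilistic sensitivity analysis in Python: uncertain probabilities are assigned beta distributions and an uncertain cost a gamma distribution, repeated samples are drawn, the (hypothetical) model is re-run for each sample, and 95 % CIs are read off the resulting distributions of costs and QALYs. All distribution parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n = 10_000                                    # number of Monte Carlo samples

# Uncertain inputs: beta distributions for probabilities, gamma for costs
sensitivity = rng.beta(90, 10, size=n)        # mean ≈ 0.90
specificity = rng.beta(85, 15, size=n)        # mean ≈ 0.85
cost_treat = rng.gamma(shape=100, scale=150, size=n)  # mean ≈ €15,000

prevalence = 0.30
tp = prevalence * sensitivity
fn = prevalence * (1 - sensitivity)
tn = (1 - prevalence) * specificity
fp = (1 - prevalence) * (1 - specificity)

costs = 1200 + (tp + fp) * cost_treat + fn * 25_000    # per-sample expected costs
qalys = tp * 1.6 + fn * 0.8 + tn * 1.9 + fp * 1.7      # per-sample expected QALYs

for label, values in [("costs (€)", costs), ("QALYs", qalys)]:
    lo, hi = np.percentile(values, [2.5, 97.5])
    print(f"{label}: mean {values.mean():,.2f}, 95% CI {lo:,.2f} to {hi:,.2f}")
```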

Interpretation of the results

When interpreting the results of the model, the total estimated costs of the diagnostic test, treatments and health states are compared with the estimated health outcome over the chosen time period. Figure 4 graphically illustrates this comparison of costs versus effectiveness. The imaging strategy that yields the highest number of QALYs is considered the most effective. If a strategy is less costly and more effective, it is cost-effective and superior to the alternative strategy (Fig. 4d). If a strategy is more costly and less effective than its alternative, it is dominated by the alternative (Fig. 4a).

Fig. 4 Cost-effectiveness graph. QALY: quality-adjusted life year. ICER: incremental cost-effectiveness ratio

A decision-analytic perspective implies that for an imaging strategy to be adopted, it has to be cost-effective compared with its next best alternative [37]. If the strategy is more costly and more effective, or less costly and less effective, than the alternative (Fig. 4b, c), incremental cost-effectiveness ratios (ICERs) need to be calculated by dividing the incremental costs by the incremental QALYs. The decision as to whether the strategy is deemed cost-effective then depends on how much society is willing to pay for a QALY gained or lost. In the Netherlands, for example, the informal societal willingness-to-pay (WTP) threshold is € 80,000 per QALY [38]. If the ICER in quadrant b is lower than this level, the strategy is cost-effective compared with the alternative (Fig. 4b). In quadrant c, the ICER needs to be higher than the WTP threshold for the strategy to be cost-effective. To illustrate the results of the probabilistic sensitivity analysis, cost-effectiveness acceptability curves (CEACs) can be calculated [39]. The CEACs shown in Fig. 5 demonstrate the probability that a strategy is cost-effective at different values of willingness to pay for a QALY. This figure shows that if the societal WTP for a QALY is low, conventional follow-up is certainly the most cost-effective strategy. As the willingness to pay increases, the probability that this is true decreases. At a willingness to pay of € 80,000 per QALY, PET-CT-based follow-up and conventional follow-up have a similar probability of being cost-effective, of approximately 48 %. The probability that CT is cost-effective is only 5 %. This implies that, although it is uncertain which diagnostic test is most cost-effective, it is quite certain that it is not CT.

Fig. 5 Schematic example of cost-effectiveness acceptability curves (CEACs). The probability of cost-effectiveness of the three investigated imaging tests is plotted against the willingness to pay for a quality-adjusted life year (QALY) [6]
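
The following sketch shows how an ICER and a CEAC could be derived from probabilistic sensitivity analysis output: the ICER is the ratio of mean incremental costs to mean incremental QALYs, and the CEAC plots, for each willingness-to-pay value, the proportion of samples in which a strategy has the highest net monetary benefit (WTP × QALYs − costs). The per-sample costs and QALYs below are simulated stand-ins, not the published results [6].

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n = 5_000
# Per-sample costs and QALYs for two hypothetical strategies (stand-ins for
# probabilistic sensitivity analysis output; all numbers are illustrative)
cost_a, qaly_a = rng.normal(15_300, 600, n), rng.normal(1.30, 0.15, n)  # e.g., PET-CT
cost_b, qaly_b = rng.normal(14_000, 600, n), rng.normal(1.25, 0.15, n)  # e.g., conventional

# Point-estimate ICER of strategy A versus B: incremental cost per incremental QALY
icer = (cost_a.mean() - cost_b.mean()) / (qaly_a.mean() - qaly_b.mean())
print(f"ICER: €{icer:,.0f} per QALY gained")

# CEAC: probability that A has the higher net monetary benefit at each WTP level
for wtp in (20_000, 50_000, 80_000, 120_000):
    nmb_a = wtp * qaly_a - cost_a
    nmb_b = wtp * qaly_b - cost_b
    print(f"WTP €{wtp:,}/QALY: P(A cost-effective) = {np.mean(nmb_a > nmb_b):.2f}")
```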

Transferability and validation

When developing the model, it is important to consider the decision maker and the jurisdiction in which the decision is made. The values of the input parameters may well differ between countries, and the chosen estimates should be relevant for the given society. This implies, however, that the results of decision-analytic models are not directly transferable to other countries [40]. A decision model developed for one country might need adaptation to support decision-making in another. While direct effects such as test accuracy may be generalizable to other settings, this may not be the case for health outcomes, resource use and, in particular, costs [41]. For example, costs are known to be proportionally lower in Europe than in the United States (U.S.) [14]. This does not necessarily imply that the conclusions of a decision model are not generalizable from Europe to the U.S. or vice versa, because the incremental costs and effects may not differ. In our example, test accuracy and health outcomes will not differ much between countries and are easily transferable. Costs, however, both direct and indirect, may vary considerably between countries. It is, therefore, very important that the input parameters of a model are transparently described. This allows others to explore whether the inputs are relevant for their country or decision problem and, if not, how this may affect the results. It is then possible to recalculate the results based on modified cost parameters. Deterministic sensitivity analyses, in which parameters are varied to explore their impact on the results, may also help to address the transferability of the results to other settings [41].

A transparent description of all inputs and choices is also important for reasons of validity. In every decision-analytic model, the validity of the results is dependent on the validity of the inputs. A model can be validated by means of face validity (evaluation of model structure, data sources, assumptions and results by experts), verification or internal validity (check accuracy of coding), cross validity (comparison of results with other models analyzing the same problem), external validity (comparing model results with real-world results), and predictive validity (comparing model results with prospectively observed events) [42].

Conclusion

This article illustrates the possibilities and challenges of CEA in DI. In order to provide DI professionals with an introduction to the tools necessary to perform and interpret CEA, we describe a comprehensive framework of direct and indirect effects that should be considered, suitable for all imaging modalities. The framework is supported by a six-step methodology for complete and uniform CEA studies in DI.