Background

Screening represents a cornerstone of preventive medicine. Its rationale is to identify disease during an early and pre-symptomatic stage [1]. With appropriate treatment, screening can result in disease prevention for those patients identified as at-risk. Early disease may be easier and less expensive to treat, which positions screening strategies as potentially sound investments for healthcare systems. Several countries have developed national screening programs that have led to increased disease detection rates and prevention [2, 3].

However, screening is not entirely risk-free and usually represents an immediate economic burden for systems with tight budget constraints. Some screening tools are associated with direct health risks (X-rays and radiation), and others might not provide a real additional value if, for instance, no follow-up treatment is available [1]. Additionally, tests need to be sufficiently reliable and accurate, since high proportions of false negatives or false positives might represent worse health outcomes and unnecessary diagnostic costs [4, 5]. To maximize value, an economic evaluation is a useful tool to compare the potential benefits, risks, and costs of different strategies and to inform resource allocation decisions. All health systems have scarce resources and are faced with opportunity costs; this means that any investment in a screening tool will come at the cost of other health services to the detriment of those patients who would have been treated [6].

Recognizing opportunity costs, healthcare systems may require that health interventions are both clinically and cost-effective to be considered for implementation [7]. Cost-effectiveness analyses (CEAs) can be trial-based evaluations that use trial data to compare alternatives [8]; however, these are expensive to conduct and often require large sample sizes to obtain sufficient statistical power [9]. To overcome these challenges, model-based economic evaluations of screening tools have become commonplace. Inputs are obtained from the best available sources and combined in mathematical models that replicate patient use of different strategies and provide a summary of costs and consequences for further analysis and comparison [10]. However, given that screening tools are used early in the treatment pathway, economic evaluations of screening strategies have many specific challenges to overcome. The objective of this study is to provide an overview of the different types of challenges and methodologies reported in the most recent cost-effectiveness analyses of screening strategies.

Methods

Eligibility criteria

A systematic review was conducted to identify the latest cost-effectiveness analyses (CEAs) of screening tools. Review and reporting followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines [11]. Only research articles published in English and in 2017 were eligible for inclusion. CEAs comparing screening strategies versus no screening or other alternatives were included. There were no exclusion criteria based on the disease area. However, studies focusing on genomic screening and screening for blood transfusion, cost-benefit and cost-minimization studies, and review articles, editorial letters, news, study protocols, case reports, posters, and conference abstracts were excluded.

Searches and study selection

We searched the online databases of EMBASE and MEDLINE. Search terms included Medical Subject Headings (MeSH), Emtree, and keywords for “mass screening” or screening, economic evaluation, and cost-effectiveness analysis. The last search was run on August 17, 2017. The search strategies can be found in Appendix 1 and Appendix 2. Two independent authors (NI and ES) screened all titles and abstracts. Any reference included by either reviewer at this stage was included for full-text review. This section was conducted independently and in duplicate. Disagreements at this stage were settled by discussion until a consensus was reached by both authors (NI and ES).

Data extraction

We extracted the study characteristics and findings including the population, disease/condition, screening tools (strategies), comparators, perspective, time horizon, discounting, outcome or effectiveness measures (i.e., expected life years, quality-adjusted life years, cases detected), and incremental cost-effectiveness ratios (ICERs). A description of the findings was portrayed in a narrative synthesis. Results were compared to an economic evaluation focused on the early diagnosis and treatment of psoriatic arthritis (PsA) that is currently being developed by the authors (NI and ES).
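
For reference, the ICER extracted from each study is the difference in cost between two strategies divided by the difference in effect. The following is a minimal sketch in Python; the function name and the cost and QALY values are hypothetical and do not come from any included study.

```python
def icer(cost_new, effect_new, cost_old, effect_old):
    """Incremental cost-effectiveness ratio of a new strategy versus a comparator."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when both strategies have equal effects")
    return delta_cost / delta_effect


# Hypothetical per-patient values (illustrative only).
screening = {"cost": 1200.0, "qalys": 10.5}
no_screening = {"cost": 800.0, "qalys": 10.25}

ratio = icer(screening["cost"], screening["qalys"],
             no_screening["cost"], no_screening["qalys"])
print(f"ICER: {ratio:.0f} per QALY gained")
```

A decision maker would compare this ratio with a cost-effectiveness threshold; the threshold itself varies by jurisdiction and is not specified here.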

Results

A total of 1059 records were found after 109 duplicates were removed. Two hundred nineteen articles were included for full-text assessment after 840 were excluded during the abstract screening stage (Fig. 1). Finally, 68 economic evaluations of screening tools were narratively synthesized (Table 1). A total of 26 studies (38.2%) evaluated the screening tools for cancer, 6 (8.8%) for hepatic disease, 5 (7.3%) for sexually transmitted disease, and 4 (5.8%) for heart disease. Twenty-nine (42.6%) added a “no screening” alternative for comparison. Thirty-five (51.4%) used quality-adjusted life years (QALYs) as the main outcome. Fifty-three studies (77.9%) modeled treatment options that followed screening and diagnostic testing. Finally, 7 studies (10.3%) concluded that the screening tool(s) they were evaluating were not cost-effective compared to current practice. The rest concluded that the implementation of screening tools had a high probability of being cost-effective. However, some specific recommendations regarding target populations, cost-effectiveness thresholds, and screening frequencies were made by some CEAs. Reported challenges and limitations of the economic evaluations were divided into three categories. The first one pertains to the screening pathway. It takes into account the test availability and sequencing, treatment options, accuracy, and patient compliance. The second describes the pre-symptomatic disease, prevalence, progression, and treatment effects. Finally, challenges with non-health benefits and spillovers are reported.

Fig. 1 PRISMA flowchart. The PRISMA flow diagram details the search and study inclusion/exclusion process. It is a graphical representation of the flow of citations throughout the review.

Table 1 Study characteristics

Screening pathway

The value of the screening test is dependent on the full screening pathway. This refers to the screening test and the subsequent follow-up undertaken because of the results of the screening. The review identified multiple studies that evaluated different screening pathways by modifying the order in which screening tests were administered [12,13,14,15,16,17]. This allowed investigators to determine trade-offs between potential screening sequences. However, these models are dependent on data availability, and many different types of evidence are necessary to inform the screening pathway, including screening and diagnostic test accuracy and screening compliance. Most studies explored challenges such as conditional test accuracy, a lack of a diagnostic gold standard, outcomes of false positives and false negatives, or screening compliance.

Accuracy

Twenty-five studies (36.7%) explicitly reported challenges regarding screening test accuracy [18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43]. One common challenge was the lack of data on test accuracy. In some cases, authors had to assume the accuracy of the screening test [28, 30, 32, 33]; more commonly, it was assumed that tests had the same performance regardless of prior testing [19, 34]. This assumption is particularly important when different sequences of screening and diagnostic tests are being evaluated. Accuracy assumptions were often tested using different combinations of sensitivity and specificity. Barzi et al. modeled a hypothetical test and, through model iterations, determined the combination of test sensitivity and specificity that would yield optimal results in terms of cost-effectiveness [19]. Crowson et al. undertook a two-way sensitivity analysis of sensitivity and specificity to determine their importance to health outcomes and costs [23]. Sensitivity analyses are useful tools to evaluate the uncertainty around test accuracy estimates. These analyses allow a threshold to be determined at which a specific screening tool would result in a cost-effective strategy.
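
A two-way sensitivity analysis of this kind can be sketched as a grid over sensitivity and specificity. The example below is only an illustration under stated assumptions: the one-step decision tree, the function name, and every default value (prevalence, costs, QALY gain, willingness to pay) are hypothetical and are not taken from Barzi et al., Crowson et al., or any other included study.

```python
def net_benefit_of_screening(sensitivity, specificity, prevalence=0.05,
                             test_cost=50.0, followup_cost=500.0,
                             qaly_gain_per_true_positive=0.2,
                             willingness_to_pay=30000.0):
    """Incremental net monetary benefit per person screened versus no screening.

    All defaults are hypothetical. Every screen-positive person incurs the
    follow-up cost; only true positives gain QALYs.
    """
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    incremental_cost = test_cost + (true_pos + false_pos) * followup_cost
    incremental_qalys = true_pos * qaly_gain_per_true_positive
    return willingness_to_pay * incremental_qalys - incremental_cost


sensitivities = [0.5, 0.6, 0.7, 0.8, 0.9]
specificities = [0.5, 0.6, 0.7, 0.8, 0.9]

# Two-way grid: every combination of sensitivity and specificity.
grid = {(se, sp): net_benefit_of_screening(se, sp)
        for se in sensitivities for sp in specificities}

# One row per sensitivity, one column per specificity.
for se in sensitivities:
    row = ["yes" if grid[(se, sp)] > 0 else "no " for sp in specificities]
    print(f"sensitivity {se:.1f}: {' '.join(row)}")
```

Cells marked "yes" are combinations where screening has a positive net benefit at the assumed willingness to pay; the boundary between "yes" and "no" cells is the accuracy threshold described above.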

To understand the implications of screening on patients’ health, it is important to model the outcomes of any follow-up diagnostic tests. However, one common difficulty is that there is usually no information on the accuracy of the diagnostic test in the screen-positive population. A few assumptions were made to account for this uncertainty. A study in Thailand for non-alcoholic fatty liver disease used pooled estimates of diagnostic accuracy from a meta-analysis assuming independence between the screening and diagnostic accuracy [33]. Chowers et al. tested different accuracy rates for HIV diagnostic tests with sensitivity analyses [21]. Other studies assumed specific accuracy estimates (usually 100%) and acknowledged the limitations, such as potentially overestimating cost-effectiveness estimates by excluding pertinent costs associated with misclassified patients [22, 29, 44].
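
The independence assumption can be made explicit with a short calculation. The sketch below assumes a pathway in which only screen-positive patients receive the diagnostic test and in which the two tests are conditionally independent given true disease status; the function name and accuracy values are hypothetical.

```python
def serial_pathway_accuracy(screen_se, screen_sp, diag_se, diag_sp):
    """Accuracy of 'screen, then confirm screen-positives with a diagnostic test'.

    Assumes the two tests are conditionally independent given true disease
    status, i.e., the diagnostic test performs the same in screen-positives
    as in the population where its accuracy was measured.
    """
    # A diseased person is detected only if both tests are positive.
    pathway_se = screen_se * diag_se
    # A healthy person is misclassified only if both tests are falsely positive.
    pathway_sp = 1 - (1 - screen_sp) * (1 - diag_sp)
    return pathway_se, pathway_sp


se, sp = serial_pathway_accuracy(0.80, 0.90, 0.95, 0.98)
print(f"pathway sensitivity: {se:.3f}, specificity: {sp:.3f}")
```

Setting the diagnostic sensitivity and specificity to 1.0 reproduces the "100% accurate diagnostic test" assumption: the pathway then inherits the sensitivity of the screening test and misclassifies no healthy patients. If the tests are positively correlated in practice, the independence assumption will misstate both quantities.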

False positive and negative outcomes

Screening and diagnostic accuracy determine the proportion of patients who will continue to receive treatment or further follow-up. It is important to understand the health outcomes of all patients screened. Patients identified as false positive or false negative are particularly difficult to consider in cost-effectiveness analysis given the lack of data on these patients. Costs and outcomes for patients who followed incorrect screening and treatment pathways were included in 22 (32.3%) of the studies [12, 17, 18, 21, 23,24,25, 29, 36, 40, 42, 43, 45,46,47,48,49,50,51,52,53,54]. Even though some cost-effectiveness analyses identified false positives in the screening pathways, one alternative was to assume 100% accurate diagnostic tests; this meant patients identified incorrectly during screening would never go on to inappropriate treatment [29, 42, 49]. In these cases, there were extra diagnostic costs, but no treatment-specific costs or outcomes were pertinent. Health outcomes may be overestimated when assuming 100% accurate diagnostic tests. Alternatively, some studies assumed that diagnostic tests were not perfect and included costs and health consequences of the incorrect treatment of false positive patients, such as healthy patients receiving unnecessary treatment and having side effects [17, 43, 48, 53, 54]. Whenever a treatment poses a considerable threat to false positives (or a considerable monetary cost), CEAs should acknowledge and include these scenarios. When false negative patients were modeled, it was assumed that they would progress at the same rate as untreated patients and were usually identified as being sick once symptoms appeared [17, 21, 45, 46, 48]. This is comparable to the pathway for all sick patients under a “no screening” arm. A high proportion of false negatives (i.e., tests with low sensitivity) will translate to fewer identified sick patients. Depending on the disease, tests, costs, and health outcomes, a CEA could evaluate whether repeated testing is worth implementing to reduce this proportion of patients. Four studies failed to model false positives and/or negatives after acknowledging their potential effect on the evaluation [12, 18, 25, 36].

Compliance

Screening pathways are greatly altered by different rates of participation and compliance. Screening is only effective if the target population and healthcare providers are engaged. Twenty-nine evaluations (42.6%) identified patient participation and compliance as an important model parameter [12, 14, 16,17,18,19,20, 25, 27, 28, 32, 36, 37, 43,44,45,46, 48,49,50,51, 55,56,57,58,59,60,61,62]. Morton et al. reported that the results of a national breast cancer screening program in the UK would be impacted by the proportion of the at-risk population who decided to participate [50]. Lower compliance translates to lower screening and diagnostic costs, but also represents a higher burden of disease if non-compliers are diagnosed at later and more expensive-to-treat stages of disease. Screening can also raise costs without improving health outcomes if identified patients fail to follow further recommended treatment due to unreliable testing. John et al. also modeled non-compliers who had a chance of getting sick and being identified by opportunistic screening [48]. Additionally, studies such as that conducted by Aronsson et al. explain how compliance rates are dependent on the screening tool to be evaluated [12, 19]. They model colonoscopy and fecal immunochemical tests (FIT) to screen for colorectal cancer, and take into account the different compliance rates for each alternative. Since colonoscopy is expected to make people more uncomfortable than the FIT, fewer people are expected to comply with the former [12, 19]. To test this, willingness-to-pay to avoid colonoscopy was estimated [12]. However, information about the compliance rates for different screening tests was rarely available. Ten studies (14.7%) assumed a 100% compliance rate [17, 25, 27, 28, 32, 45, 51, 55, 57, 59, 63]. The effect of this assumption on cost-effectiveness estimates depends on the specific evaluation being conducted, specifically the trade-off between lower screening costs and worse health outcomes due to unidentified disease.
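
The trade-off between screening costs and late-stage disease can be illustrated with a simple cohort calculation. This is a sketch under simplifying assumptions rather than a model from any included study: all inputs are hypothetical, cases that are not screen-detected are assumed to be treated at a late stage, and follow-up costs for false positives are ignored.

```python
def program_outcomes(participation, population=10_000, prevalence=0.02,
                     sensitivity=0.85, test_cost=40.0,
                     early_treatment_cost=5_000.0, late_treatment_cost=20_000.0):
    """Screening cost, treatment cost and late-stage cases for a participation rate.

    Hypothetical inputs. Cases that are not screen-detected (non-participants
    and false negatives) are assumed to be treated at a late stage.
    """
    cases = population * prevalence
    detected_early = cases * participation * sensitivity
    late_cases = cases - detected_early
    screening_cost = population * participation * test_cost
    treatment_cost = (detected_early * early_treatment_cost
                      + late_cases * late_treatment_cost)
    return {"screening_cost": screening_cost,
            "treatment_cost": treatment_cost,
            "late_cases": late_cases}


for rate in (1.0, 0.7, 0.4):
    result = program_outcomes(rate)
    print(f"participation {rate:.0%}: "
          f"screening cost {result['screening_cost']:,.0f}, "
          f"treatment cost {result['treatment_cost']:,.0f}, "
          f"late-stage cases {result['late_cases']:.0f}")
```

Under these inputs, assuming 100% participation gives the highest screening cost and the fewest late-stage cases, so the direction of any bias in the cost-effectiveness estimate depends on how the two effects balance.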

Pre-symptomatic disease

Disease prognosis and patient evolution from pre-symptomatic stages of disease were modeled in most cases to estimate aggregate costs and outcomes. All included studies but 2 (3%) [38, 64] explicitly commented on challenges encountered while trying to adequately model disease progression and patient transition through health states. Pre-symptomatic disease refers to the point in progression when the disease is developing but no symptoms are apparent. This is the point when screening tools are useful but usually when there is very little data about progression of the disease. Once identified as having a disease, more data is available for modeling cost-effectiveness.

Prevalence/incidence

Screening models often focus on at-risk populations. Incidence rates are used to determine the proportion of patients who enter the models at pre-symptomatic stages. This is useful for scenarios with repeated screening procedures, as a dynamic model can be developed to evaluate repeated screening processes while taking into account new at-risk patients [47]. On the other hand, some studies included population-specific incidence rates [19, 35, 47]. A different approach consists of evaluating one-time-only screening procedures targeting prevalent disease [49]. Deciding between repeated versus one-time testing depends on the type of disease and population of the evaluation. A one-time test for tuberculosis might be appropriate for immigrant populations, while testing for lung cancer among smokers is recommended to be carried out repeatedly. The sequence and frequency of tests can be tested through modeling to determine the cost-effective option. Sensitivity analyses determined that cost-effectiveness estimates were highly sensitive to changes in prevalence and incidence estimates [25, 49, 65]. Testing for a rare disease might not be cost-effective compared to testing for a common disease given a similar health and economic burden.
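
The sensitivity of results to prevalence follows directly from Bayes' rule. The sketch below, with hypothetical accuracy and cost values, shows how the positive predictive value and the screening cost per case detected change as prevalence changes while test accuracy is held constant.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of screen-positive people who truly have the disease (Bayes' rule)."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


def screening_cost_per_case_detected(prevalence, sensitivity, test_cost):
    """Screening cost per person divided by the expected true positives per person."""
    return test_cost / (prevalence * sensitivity)


# Hypothetical test: 90% sensitivity, 95% specificity, cost of 20 per test.
for prevalence in (0.001, 0.01, 0.10):
    ppv = positive_predictive_value(prevalence, 0.90, 0.95)
    cost = screening_cost_per_case_detected(prevalence, 0.90, test_cost=20.0)
    print(f"prevalence {prevalence:.3f}: PPV {ppv:.3f}, cost per case {cost:,.0f}")
```

With the same test, a rarer disease yields a lower positive predictive value and a higher screening cost per detected case, which is consistent with the sensitivity analyses reported above.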

Pre-symptomatic disease progression

Once an at-risk population is identified, some cost-effectiveness analyses focused on modeling the pre-symptomatic stages of disease. There is a time interval before clinical symptoms appear and after disease onset where disease is identifiable by screening tools. This timeframe, also called sojourn time, is a major challenge for CEA since progression of pre-symptomatic disease is often unknown (Table 2). Uncertainty around sojourn time was tested by 3 studies (4.4%) [36, 60, 66]. van Luijt et al. determined a fixed preclinical stage of breast cancer where disease could be identified by screening [67]. This study also allowed for disease regression or progression to more advanced pre-symptomatic stages. Atkin et al. modeled similar pre-symptomatic stages for colorectal cancer and adenoma [18]. Sensitivity analyses made it possible to estimate the effect of varying the interval for sojourn time on cost-effectiveness. These studies concluded that longer sojourn time represented improved disease identification rates.
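
The link between sojourn time and detection can be illustrated with a simple closed-form model. The assumptions below are ours and are not drawn from the reviewed studies: sojourn time is exponentially distributed, disease onset is uniformly distributed within a screening interval, screening is repeated at a fixed interval, and each screen has the same independent sensitivity.

```python
import math


def screen_detection_probability(mean_sojourn, interval, sensitivity=1.0):
    """Probability that a case is found by screening before symptoms appear.

    Simplifying assumptions (not taken from any reviewed study): sojourn time
    is exponential with mean `mean_sojourn`, onset is uniform within a
    screening interval, screens repeat every `interval` years, and each
    screen has the same independent `sensitivity`.
    """
    # Probability that the pre-symptomatic phase lasts one more full interval.
    survive_interval = math.exp(-interval / mean_sojourn)
    # Probability of still being pre-symptomatic at the first screen after onset.
    reach_first_screen = (mean_sojourn / interval) * (1 - survive_interval)
    # Geometric series over later screens for cases missed at earlier ones.
    return sensitivity * reach_first_screen / (1 - (1 - sensitivity) * survive_interval)


for mean_sojourn in (1.0, 2.0, 4.0):
    p = screen_detection_probability(mean_sojourn, interval=2.0, sensitivity=0.8)
    print(f"mean sojourn {mean_sojourn:.0f} y: P(screen-detected) = {p:.2f}")
```

In this sketch, a longer mean sojourn time increases the share of cases detected by screening for a fixed interval and sensitivity, in line with the conclusion of the studies cited above.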

Table 2 Summary of methodological issues and suggestions to develop CEAs of screening tools

Modeling patient progression during the sojourn time, i.e., through pre-symptomatic health states, remains a challenge. Three studies extrapolated progression rates from symptomatic disease stages to pre-symptomatic disease [18, 56, 68]. In some cases, fast-progressing disease may cause death before diagnosis. Death rates for pre-symptomatic disease were available for colorectal cancer using Kaplan-Meier estimators from lifetime data [18], health state-specific mortality risks in chronic kidney disease [69], and gastric cancer [62]. Additionally, based on differential progression rates and life expectancy, two studies evaluated the potential effect of lead time bias in their analyses [35, 54]. This bias explains how early diagnosed patients might not experience an increase in expected survival, but instead spend longer periods under treatment. This effect gives the illusion of higher survival expectancy [35], resulting in biased cost-effectiveness estimates. Survival has a major impact on health-related outcomes in CEAs, and assuming a higher rate will overestimate the health benefits. This is one example of a model input that is likely to affect the cost-effectiveness of a screening tool and should be tested in sensitivity analyses. Yang et al. used population matching (cancer cases vs general population) and a difference-in-differences methodology to determine if early diagnosis provided improved life expectancy [54]. Both studies showed differential survival rates favoring patients who were diagnosed early after accounting for potential lead time bias [35, 54].
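
Lead time bias can be shown with a single hypothetical patient whose age at death is unaffected by earlier detection. The ages below are invented for illustration, and the correction shown (subtracting the lead time) is a simplified stand-in, not the matching and difference-in-differences approach used by Yang et al.

```python
def observed_survival(death_age, diagnosis_age):
    """Survival measured from the time of diagnosis."""
    return death_age - diagnosis_age


# Hypothetical patient whose age at death is NOT changed by earlier detection.
death_age = 70.0
symptomatic_diagnosis_age = 65.0
screen_diagnosis_age = 62.0

lead_time = symptomatic_diagnosis_age - screen_diagnosis_age
survival_without_screening = observed_survival(death_age, symptomatic_diagnosis_age)
survival_with_screening = observed_survival(death_age, screen_diagnosis_age)

# Subtracting the lead time removes the artificial gain.
corrected_survival = survival_with_screening - lead_time
apparent_gain = survival_with_screening - survival_without_screening
true_gain = corrected_survival - survival_without_screening
print(f"apparent gain: {apparent_gain} years, gain after correction: {true_gain} years")
```

A model that used the apparent gain as a survival benefit would credit screening with life years that the patient does not actually gain.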

Treatment effect and health outcomes

According to the WHO, screening interventions are expected to provide treatment alternatives for those patients with identified cases of disease [1]. However, 15 studies (22%) failed to model a treatment pathway [22,23,24, 26, 29, 31, 38, 44, 57, 64, 70,71,72,73,74]. The main outcomes captured by these studies were the following: cases detected, missed cases, avoided cases, and identified true positives and true negatives. Decision trees were most commonly used for these modeling tasks. However, these models are insufficient for making reimbursement decisions, since efficacious interventions or therapies are required to follow screening and diagnostic procedures to improve patients’ health. Without these benefits, evaluations of screening procedures do not capture all consequences, leading to incomplete CEAs. On the other hand, studies that modeled treatment pathways captured different health outcomes to evaluate cost-effectiveness of screening strategies. Quality-adjusted life years (QALYs) were estimated by 39 studies (57.3%) [12,13,14,15, 17, 18, 21, 27, 28, 32,33,34,35, 37, 39, 40, 42, 45,46,47,48,49,50,51, 53,54,55,56,57, 59, 62, 65, 67,68,69, 72, 74,75,76,77,78], and expected life years (ELYs) by 10 (14.7%) [18, 19, 36, 43, 52, 58, 65, 66, 79, 80]. Utilities were widely used, and the following challenges and methodologies were reported. Chowers et al. acknowledged having underestimated QALY outcomes in their prenatal HIV screening evaluation by excluding maternal utility measures. Additionally, treatment for false positives and its repercussions were excluded, even though treatment for healthy newborns is expected to cause disutility [21]. Ferguson et al. reported difficulty in assigning utilities for patients with undiagnosed chronic kidney disease. Therefore, they assumed similar utilities for undiagnosed and diagnosed cases [69]. Cheng et al. extrapolated already estimated utility weights for pre-symptomatic hepatitis A to model the preclinical stage of hepatitis E [76]. Assumptions around utility estimates are common, but require careful consideration to avoid a deviation from the initial target population. Although health outcomes are most often captured after treatment begins, some models included screening and diagnostic specific health effects. Risk of perforation due to colonoscopy was included by Atkin et al. in their colorectal cancer CEA [18]. Yang et al. included radiation-induced cancer cases from radiography screening [54]. Failing to include potentially negative health effects of screening tests will overestimate the health benefits and potentially underestimate associated costs.
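
As a reminder of how utility assumptions feed into QALY estimates, the following sketch sums yearly utility weights with discounting. The utility profiles are hypothetical, and the 3% discount rate is an assumed default rather than a value taken from the reviewed studies.

```python
def discounted_qalys(yearly_utilities, discount_rate=0.03):
    """Sum of yearly utility weights, discounted to present value.

    Year 0 is not discounted; the 3% default is a commonly used rate and is
    an assumption here, not a value taken from the reviewed studies.
    """
    return sum(utility / (1 + discount_rate) ** year
               for year, utility in enumerate(yearly_utilities))


# Hypothetical utility profiles over five years.
early_treatment = [0.85, 0.85, 0.80, 0.80, 0.75]
late_treatment = [0.85, 0.70, 0.60, 0.50, 0.40]

gain = discounted_qalys(early_treatment) - discounted_qalys(late_treatment)
print(f"discounted QALY gain: {gain:.3f}")
```

Because the QALY gain is a difference between two utility profiles, an assumption such as equal utilities for undiagnosed and diagnosed cases changes one profile directly and therefore changes the estimated gain.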

Some studies reported uncertainty around treatment efficacy inputs [15, 44, 65, 66, 75]. Sensitivity and scenario analyses were broadly used to account for this uncertainty. Not surprisingly, cost-effectiveness estimates were influenced by treatment efficacy of early treatment and uptake [65, 66, 75]. A few studies conducted value of information analyses to estimate the value of collecting further information to resolve decision uncertainty [18, 44, 75].
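
One common value of information measure is the expected value of perfect information (EVPI), which compares the expected net benefit when the best strategy can be chosen for each realization of the uncertain inputs with the expected net benefit of the single strategy that is best on average. The sketch below is a hypothetical Monte Carlo illustration; the efficacy distribution, costs, and willingness to pay are invented and do not reproduce the analyses in the cited studies.

```python
import random


def evpi(net_benefit_samples):
    """Expected value of perfect information from simulated net benefits.

    `net_benefit_samples` is a list of tuples, one per simulation draw, with
    one net monetary benefit per strategy.
    """
    n_draws = len(net_benefit_samples)
    n_strategies = len(net_benefit_samples[0])
    # Best achievable if uncertainty were resolved before choosing.
    with_perfect_info = sum(max(draw) for draw in net_benefit_samples) / n_draws
    # Best achievable when one strategy must be chosen for all draws.
    with_current_info = max(
        sum(draw[i] for draw in net_benefit_samples) / n_draws
        for i in range(n_strategies))
    return with_perfect_info - with_current_info


rng = random.Random(0)  # fixed seed so the illustration is reproducible
draws = []
for _ in range(5000):
    efficacy = rng.betavariate(20, 30)  # hypothetical early-treatment efficacy
    nb_no_screening = 0.0
    # Hypothetical: 5% prevalence, 30,000 per QALY, 550 screening cost per person.
    nb_screening = 30000 * 0.05 * efficacy - 550.0
    draws.append((nb_no_screening, nb_screening))

print(f"EVPI per person: {evpi(draws):.2f}")
```

An EVPI close to zero would suggest that further research on the uncertain input is unlikely to change the decision, whereas a larger value indicates that resolving the uncertainty could be worthwhile.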

Non-health costs and outcomes, and spillovers

CEAs take into account the costs and outcomes of specific interventions and compare them to determine if they provide enough benefits relative to the cost compared to the next best alternative. However, not all potential benefits and costs are necessarily health related. The perspective of a CEA determines what kind of effects and costs will be included. A healthcare perspective seeks to compare costs and consequences that directly pertain to the healthcare sector. They generally focus on health-related outcomes [81]. Alternatively, a societal perspective attempts to capture all relevant costs and outcomes, health-related or not. Transportation costs, out-of-pocket expenses, and productivity losses are a few examples. These analyses evaluate the trade-off between health and any other outcome, but this information is rarely known, i.e., societal preferences between health and productivity or educational benefits [81]. This review identified 38 (55.8%) and 15 (22%) studies that developed their analyses under a healthcare [12, 14, 18, 20,21,22,23, 25, 27, 29, 34,35,36,37,38,39,40, 42, 44, 46,47,48,49,50,51,52,53,54,55, 58, 60, 62, 65, 66, 68, 69, 75, 78] and societal perspective [13, 15, 17, 19, 28, 33, 41, 43, 56, 57, 59, 61, 67, 76, 80], respectively. The following were specific studies that included non-health costs and/or outcomes: Cressman et al. estimated the productivity loss of lung cancer patients who had been previously working before starting treatment [56]. Phisalprapa et al. included non-medical costs (transportation, meals, accommodations, and facilities) in their evaluation of non-alcoholic fatty liver disease [33]. Pil et al. used a patient questionnaire to assess indirect costs in their skin cancer screening CEA related to productivity loss, morbidity, and early mortality [59]. Sharma et al. included patient transportation costs [61]. The decision to include indirect (or non-medical) costs and outcomes depends on the decision maker’s perspective. 
The societal perspective allows a thorough analysis by including a broader spectrum of the associated consequences. However, including all indirect outcomes or externalities might prove a difficult task, and missing important outcomes will render the evaluation incomplete and possibly biased. It is also true that although most studies considering a societal perspective focused on costs, there was one that also included non-health benefits or outcomes. Chen et al. compared the benefits of the different types of education that children received after being screened and treated for neonatal hearing loss. Children who were successfully identified and treated for hearing loss were expected to have better educational outcomes [45]. Sensitivity analyses determined that cost-effectiveness estimates were most affected by the inclusion of the societal costs [80].

One concern of adopting a societal perspective is the implicit assumptions on how resources should be distributed; for example, including productivity costs (an important part of non-health outcomes) generally benefits treatments of the working age population at the cost of children and seniors [82]. Prusa et al. developed a CEA of toxoplasmosis screening for children in Austria. Besides considering the projected lifetime productivity loss of the affected children, they also considered the productivity loss of parents [80]. Consequences (health-related or not) that fall on third or external parties are called spillover effects [83]. Spillover effects were not identified or modeled in any other study. Basu and Meltzer argue that CEAs might better reflect all associated costs and outcomes by considering spillovers [83]. CEAs that focus on screening tools have specific challenges to address regarding spillovers or externalities, especially health-related ones. False positive tests for venereal diseases, for instance, can have negative consequences for families and third parties in terms of anxiety, stress, and divorce. On the other hand, there are potential positive spillovers. For example, screening tests might have a modest capacity to identify similar conditions. This review did not identify studies that included benefits of such opportunistic identification.

Discussion

This study reviewed the latest CEAs of screening tools and provided a thorough breakdown of challenges and suggestions to overcome them. The included studies mentioned several assumptions and methodological alternatives that were grouped in four major categories: the screening pathway, pre-symptomatic disease, treatment outcomes, and spillovers and externalities. To capture all important costs and outcomes of a screening tool, screening pathways should be modeled through the treatment of the patient. Also, false positive and false negative patients are likely to have important costs and benefits and should be included in the analysis. As these patients are difficult to identify in regular data sources, common treatment patterns should be used to determine how these patients are likely to be treated. Many assumptions are needed when modeling screening tools. It is important that these assumptions are clearly indicated and that the consequences of these assumptions are tested in sensitivity analyses. These include assumptions such as the independence of consecutive tests, the level of patient and provider compliance with guidelines, and sojourn times, i.e., the time between when a patient can be identified by a screening test and when they would have been identified due to symptoms. As data is rarely available regarding the progression of undiagnosed patients, extrapolation from diagnosed patients may be necessary. Not surprisingly, different scenarios concluded that longer sojourn times were likely to result in improved health outcomes. This becomes one of the main drivers of the effectiveness of a screening test, besides the accuracy with which it identifies patients correctly. This was particularly true when available treatment was capable of modifying disease progression. Finally, non-health costs and outcomes were observed for studies that developed their analyses under a societal perspective. These were not consistently reported, most likely due to different guidelines from decision makers.

This review thoroughly examined the latest methodological challenges associated with modeling CEAs of screening tools. However, some limitations are to be noted. Studies focusing on genomic and blood transfusion screening tests were excluded. Genomic screening was excluded because a recent paper evaluated CEAs of genomic screening tests [84]. Blood transfusion tests were excluded because different issues arise when testing blood for treatment rather than testing patients for disease [85]. Challenges and methodologies of CEAs are expected to vary considerably between these groups. Finally, studies were limited to 2017 to capture the most recent state of the art in this area. We were interested in the latest available evidence to appropriately review the most up-to-date methodologies for modeling screening tools from a health economic perspective. However, all diseases were included to avoid disease-specific issues and to provide a broad learning across disease areas.

Conclusion

Many new screening tools are being developed and require cost-effectiveness analyses to support their value proposition. Screening tools should follow diagnostic guidelines, but have additional challenges given that sojourn times and pre-symptomatic progression data are rarely known. Current cost-effectiveness analyses extrapolate pre-symptomatic progression from symptomatic patients and thoroughly test assumptions in sensitivity analyses, including sojourn times. By following these methodological suggestions, screening tool evaluations are expected to become a better reflection of medical practice and to provide better quality evidence for decision makers making difficult trade-offs between funding screening interventions and other health technologies.