Key Points for Decision Makers

This review summarises the methodologies and results of all model-based economic evaluations focussing on tests used in the diagnosis of PAD.

The review highlights the limited amount of model-based economic evaluation literature available in this clinical area, in particular for tests used in a primary care setting.

Methods and findings highlighted in this review may be used to support future modelling work in related areas.

1 Introduction

Peripheral arterial disease (PAD) is a common condition in which atherosclerotic narrowing, or occlusion, in the arteries restricts blood supply to the leg muscles [1]. PAD can cause leg pain on walking (intermittent claudication) and, in more severe cases, may cause ulcers or gangrene and could potentially lead to amputation. The prevalence of PAD in the UK ranges from 3 to 10%, rising to 15–20% in the elderly [2]. One in five people aged 65–75 years in the UK have clinical evidence of PAD, although only a quarter of them are symptomatic [3]. Due to underlying atherosclerosis, patients with PAD tend to have reduced functional capacity [4, 5] and an increased risk of cardiovascular morbidity, i.e. myocardial infarction and stroke, and mortality [6,7,8].

Detecting PAD early gives the opportunity to control its associated vascular risk factors, reduce adverse cardiovascular events and avoid the need for surgery. The National Institute for Health and Care Excellence (NICE) clinical guidelines for PAD (CG147) [9] recommend diagnosis through symptoms and signs and measurement of the ankle brachial pressure index (ABPI) in primary care. In a secondary care setting, diagnosis is often made through imaging techniques such as magnetic resonance angiography (MRA), digital subtraction angiography (DSA) and duplex ultrasound (DUS), amongst others.

As part of a wider study exploring the costs and effects of introducing a new test for the diagnosis of PAD, a systematic review of existing economic evidence in the area was required. Although information on the relative costs and effects of alternative methods to detect PAD is sparse, a limited number of economic decision modelling analyses have been conducted in this area. The aim of this review was to provide a summary of existing model-based economic evaluation literature, up until the year 2017, on currently available methods to detect PAD in either a primary or secondary care setting in order to support future model-based economic evaluations comparing methods of diagnosis in this clinical area.

The specific objectives of this review were to:

  1. map the relevant economic evidence base for model-based economic evaluations of methods to detect PAD in both primary and secondary care;

  2. assess the methodological quality of the identified studies;

  3. identify key strengths and weaknesses of the identified studies when comparing the different diagnostic tests; and

  4. highlight the cost-effectiveness evidence on existing methods of diagnosis.

2 Methods

A systematic literature review was conducted to identify model-based economic evaluations of diagnostic tests to detect PAD in either a primary or secondary care setting. The review was conducted in accordance with the Centre for Reviews and Dissemination’s guidance for undertaking reviews in healthcare [10]. Studies (individual papers) were required to meet each of the following criteria in order to be included in the review:

  • population were adults (> 18 years) with suspected PAD or at risk of PAD, adults with intermittent claudication or adults with PAD undergoing further diagnostic testing;

  • included an intervention targeted at the detection of PAD (in primary or secondary care); and

  • involved a full model-based economic evaluation (study in which a comparison of two or more interventions or care alternatives is undertaken, and in which both the costs and outcomes of the alternatives are examined).

There were no restrictions on the type of comparator or the outcomes that needed to be included in the study. There were no restrictions placed on the publication year. Conference abstracts were excluded due to concerns about quality, and the potential for there to be insufficient detail reported. Studies were excluded if they were not in the English language or if they did not meet the inclusion criteria described.

2.1 Search Strategy

Systematic searches were undertaken in the following databases:

  • UK National Health Service (NHS) Economic Evaluation Database (EED).

  • MEDLINE.

  • Cochrane Central.

  • Database of Abstracts of Reviews of Effects (DARE).

  • Cochrane Database of Systematic Reviews.

  • Health Technology Assessment (HTA) database.

The searches were undertaken by an information specialist (SR) with experience in devising search strategies for economic evaluations. Searches were iterative to take into account any terms, phrases or concepts that were discovered during other parts of the review. These searches were undertaken in June 2017. Records were downloaded from databases and then imported into EndNote® X7 (Thomson Reuters, Toronto, ONT, Canada) bibliographic software, where duplicate records were removed and the remaining records were screened. The complete search strategy, designed to run in MEDLINE (OVID), is outlined in Electronic Supplementary Material (Appendix 1).

2.2 Study Selection

The searches identified 419 publications. Two systematic reviewers (JO’C and EM) independently screened the titles and abstracts to identify potentially relevant studies. Disagreements were resolved by a third reviewer (DC). Four papers were excluded as duplicates and 409 studies were excluded because they did not meet all of the inclusion criteria, were conference abstracts or were not in the English language. Six studies were deemed potentially eligible for inclusion following title and abstract screening, and the full text of each of these papers was examined. Once again, the two reviewers independently screened the full publications and five of the six studies were considered appropriate for inclusion in the review. The sixth study was excluded as it did not meet all of the inclusion criteria. The references of these five studies were hand-searched and two additional potentially relevant studies were identified. When independently checked by the two reviewers, these papers were deemed to meet the inclusion criteria for the review. Therefore, seven studies were included in the final review. The flow chart of the selection process is depicted in Fig. 1.
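The selection counts reported above can be checked with simple arithmetic. The following is a minimal sketch in Python; the variable names are illustrative and are not taken from the review itself, and only the numbers stated in this section are used.

```python
# Counts as reported in Sect. 2.2 (variable names are illustrative).
identified = 419                  # publications returned by the searches
duplicates = 4                    # excluded as duplicate studies
excluded_title_abstract = 409     # excluded at title and abstract screening

# Papers taken forward for full-text screening
full_text_assessed = identified - duplicates - excluded_title_abstract

excluded_full_text = 1            # did not meet all inclusion criteria
added_from_references = 2         # found by hand-searching reference lists

# Studies included in the final review
included = full_text_assessed - excluded_full_text + added_from_references

print(f"Full text assessed: {full_text_assessed}, included: {included}")
```

The two derived figures agree with the six full-text papers and seven included studies described in the text.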

Fig. 1 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart of selection process for included studies

2.3 Data Extraction

Relevant data from included studies were extracted accordingly. Data extracted from each study included the setting and patient population, the perspective of the study and the type of economic evaluation conducted. Additionally, the type of model structure, the comparators included in the analysis, and the time cycle and time horizon of the models were identified and extracted for each study.

2.4 Quality Assessment

Following data extraction, a quality assessment of the included studies was undertaken using the Philips checklist, a quality assessment tool developed for decision-analytic modelling studies [11]. In this tool, studies are assessed under the three general components of a model (‘structure’, ‘data’ and ‘consistency’), and results of the quality assessment are presented under these headings. The full list of questions included in the checklist is presented in Table 3 in Electronic Supplementary Material (Appendix 2).

3 Results

Seven studies were included in the review. All studies identified for this review were published between 1995 and 2014, with no published studies identified from before or after these dates. An overview of the included studies is first presented, followed by a description of the data extracted from each study. Next, an assessment of the methodological quality of included studies, and a description of their shortcomings, is presented. Finally, the cost-effectiveness evidence on existing methods of diagnosis is presented. A brief overview of the included studies is presented in Sect. 3.1, with a full overview presented in Table 1.

Table 1 Overview of included studies

3.1 Overview of Included Studies

  • Vaidya et al. [12] evaluated the lifetime cost-effectiveness of the strategy of selective PAD screening with ABPI and consequent preventive treatment compared with no screening and no preventive treatment.

  • Coffi et al. [13] compared duplex scanning in combination with arterial DSA with two other diagnostic strategies: duplex scanning plus supplementary angiography if duplex scanning is inconclusive, and duplex scanning plus confirmative angiography if duplex scanning is either inconclusive or shows lesions.

  • Visser et al. [14] compared the cost-effectiveness of gadolinium-enhanced MRA, colour-guided DUS and intra-arterial DSA used in a variety of diagnostic strategies.

  • Visser et al. [15] compared the cost-effectiveness of multi-detector row computed tomography angiography (CTA) with that of gadolinium-enhanced MRA.

  • Collins et al. [16] compared the cost-effectiveness of DUS, MRA and CTA for the diagnosis and assessment of symptomatic lower-limb PAD.

  • Yin et al. [17] compared the cost-effectiveness of MRA with conventional angiography.

  • Visser et al. [18] compared the cost-effectiveness of MRA, duplex ultrasonography and DSA.

3.2 Data Extraction

3.2.1 Setting and Patient Population

Four studies were performed in the Netherlands [12,13,14,15], one study in the UK [16] and two studies in the USA [17, 18]. The patient population in four studies [12, 13, 16, 17] was patients with PAD, or suspected PAD. One of these studies [12] specifically looked at patients with suspected PAD at high risk of experiencing acute cardiovascular events (asymptomatic patients over the age of 55 years), and another [17] looked at patients with suspected PAD which was limb-threatening (comparison of interventions for pre-operative evaluation). Another concentrated on patients with PAD [13], while the other focused on patients either with intermittent claudication or with limb-threatening ischaemia, who needed to undergo lower-limb vascular imaging to formulate an appropriate treatment plan for their condition [16]. The patient population in the remaining three studies [14, 15, 18] was patients with intermittent claudication, with one of these studies [18] specifically looking at patients with lifestyle-limiting intermittent claudication. Patients in these three studies [14, 15, 18] were at a stage where they would require a diagnostic test as part of pre-treatment imaging work-up.

3.2.2 Perspective

The majority of the studies (five of seven) were conducted from a societal perspective [12, 14, 15, 17, 18]. One study [13] was conducted from a provider (hospital) perspective, and another [16] was conducted from a UK NHS perspective.

3.2.3 Type of Economic Evaluation

Two studies performed both a cost-effectiveness analysis (CEA) and a cost-utility analysis (CUA) [12, 16]. One study performed a CEA only [13], and four studies performed a CUA only [14, 15, 17, 18]. One study [15] additionally undertook a threshold analysis. Of the studies that conducted a CEA, one study [12] used life-years gained as an outcome measure, one study [13] used an additional correctly identified case and one study [16] used a correctly diagnosed patient as the outcome measure. All of the studies that conducted a CUA [12, 14,15,16,17,18] used quality-adjusted life-years (QALYs) gained as the outcome measure.

3.2.4 Model Structure and Comparators

Four studies [12, 15, 16, 18] were model-based economic evaluations involving a decision tree combined with a Markov state transition model. Of these, one study [12] evaluated a screening strategy using the ABPI; one study [15] evaluated multi-detector row CTA, as compared with gadolinium-enhanced MRA; one study [16] compared contrast angiography with MRA, DUS and CTA; and one study [18] evaluated pre-treatment work-up using MRA, DUS or intra-arterial DSA.

Two studies [13, 17] were model-based economic evaluations involving a decision tree only. Of these, one study [13] compared duplex scanning in combination with arterial DSA with two other diagnostic strategies (duplex scanning plus supplementary angiography if duplex scanning is inconclusive, and duplex scanning plus confirmative angiography if duplex scanning is either inconclusive or shows lesions). The other study [17] compared MRA with conventional angiography.

One study [14] was a model-based economic evaluation involving a Markov Monte Carlo model embedded in a large decision-analytic model. This study compared gadolinium-enhanced MRA, colour-guided DUS and intra-arterial DSA used in a variety of diagnostic strategies.
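To illustrate the general shape of a decision tree combined with a Markov state transition model, the sketch below may be helpful. It is not the model of any included study: the health states, transition probabilities, utilities, prevalence and test accuracy values are all hypothetical, chosen only to show how the short-term diagnostic branches can seed a long-term cohort model with annual cycles. Costs, discounting and the consequences of false-positive results are deliberately left out.

```python
def decision_tree(prevalence, sensitivity, specificity):
    """Split a cohort into the four outcome branches of a diagnostic test."""
    return {
        "true_positive": prevalence * sensitivity,
        "false_negative": prevalence * (1 - sensitivity),
        "true_negative": (1 - prevalence) * specificity,
        "false_positive": (1 - prevalence) * (1 - specificity),
    }

def start_distribution(branches):
    """Map test outcomes to starting health states (a simplifying assumption)."""
    return {
        "treated_pad": branches["true_positive"],
        "untreated_pad": branches["false_negative"],
        "no_pad": branches["true_negative"] + branches["false_positive"],
        "dead": 0.0,
    }

# Hypothetical annual transition probabilities; each row sums to 1.
TRANSITION = {
    "treated_pad": {"treated_pad": 0.95, "dead": 0.05},
    "untreated_pad": {"untreated_pad": 0.92, "dead": 0.08},
    "no_pad": {"no_pad": 0.97, "dead": 0.03},
    "dead": {"dead": 1.0},
}
# Hypothetical utility weights per health state.
UTILITY = {"treated_pad": 0.75, "untreated_pad": 0.65, "no_pad": 0.85, "dead": 0.0}

def markov_cohort(start, transition, utilities, cycles):
    """Run the cohort through annual cycles; return undiscounted QALYs per person."""
    states = list(start)
    dist = dict(start)
    qalys = 0.0
    for _ in range(cycles):
        # Advance the cohort by one cycle, then credit utility for occupied states
        dist = {s: sum(dist[r] * transition[r].get(s, 0.0) for r in states) for s in states}
        qalys += sum(dist[s] * utilities[s] for s in states)
    return qalys

def qalys_for_test(sensitivity, specificity, prevalence=0.2, cycles=30):
    """Chain the decision tree into the Markov model for one diagnostic test."""
    branches = decision_tree(prevalence, sensitivity, specificity)
    return markov_cohort(start_distribution(branches), TRANSITION, UTILITY, cycles)

print(qalys_for_test(0.90, 0.95), qalys_for_test(0.60, 0.95))
```

Under these assumed inputs, a more sensitive test places more of the diseased cohort in the treated state and so accrues more QALYs, which is the mechanism by which diagnostic accuracy feeds into long-term outcomes in this kind of structure.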

3.2.5 Time Horizon and Time Cycle

Four studies used a 1-year time cycle and lifetime time horizon [12, 14, 15, 18]. Although three of these did not explicitly state the time cycle used [14, 15, 18], it did appear to be 1 year based on the parameters included in the model. One study included a short-term model with a 1-year time horizon, and a long-term model with a lifetime time horizon, the latter of which used a 1-year time cycle [16]. For two studies, time cycle and time horizon were not relevant as the model structure was a decision tree [13, 17].

3.3 Methodological Quality of the Included Studies

The results of the quality assessments are presented in Table 3 in Electronic Supplementary Material (Appendix 2), with a description of the findings presented below. Results are discussed under the headings provided in the framework used (structure, data and consistency) [11].

3.3.1 Structure

General All studies clearly stated a decision problem, and the specified objectives of all models were consistent with these stated decision problems. None of the included studies clearly specified the primary decision maker, although it did always appear to be the healthcare provider. All of the included studies stated and justified the scope of their models. The outcomes of the models of the included papers were all consistent with the overall objectives of the model. For all studies, the model’s inputs were consistent with the chosen perspective.

Model Structure Four of the included studies’ models were based on previously published models [14,15,16, 18] and two others stated that the model structure was based on reviewing the published literature [12, 13]. The final study [17] used a de novo decision-analytic model in their analysis. Of the studies that used previous literature to inform their model structure, all but one [13] clearly referenced the sources used to develop their models so it would be possible for the reader to refer to that published literature to assess the quality and appropriateness of the models included. For all studies, the structure of the model was consistent with a coherent theory of the health condition under evaluation. All of the included studies clearly stated the structural assumptions used in their models and these were appropriately justified. For all studies, these structural assumptions were reasonable and consistent with the stated scope and perspective, where stated.

Comparators All of the included studies provided a clear description of the options under evaluation in their models. For only one study could it be said that all feasible and practical options were not evaluated [16]. This study provided a justification for this exclusion, which was corroborated by expert opinion and, therefore, can be considered appropriate.

Model Type and Time Horizon The model types chosen by the included studies were all appropriate. Five studies [12, 14,15,16, 18] had sufficiently lengthy time horizons (lifetime) to assess the long-term differences (in terms of costs and effects) between the options. All of these studies used 1-year time cycles. For two studies, time cycle and time horizon were not relevant as the model structure was a decision tree [13, 17]. The duration of treatment and duration of treatment effect were described and justified in all relevant studies.

Face Validity of Structure For the majority of the studies [12,13,14,15,16,17], the disease states/pathways used in their models reflect the underlying biological process of the condition. For one study [18], the initial decision tree is appropriate; however, the longer-term Markov model was not well-described and so its appropriateness is unclear. For the studies where cycle length was applicable [12, 14,15,16, 18], the cycle length was explicitly stated in only two cases [12, 16].

3.3.2 Data

Data Identification For all studies, the identification of data was reasonably well-reported. Five studies did not discuss alternative sources of data [12,13,14,15, 18], nor the reason(s) for selecting the included data over alternative data. From the remaining studies, one study [17] described and justified the choices made between data sources, and one study [16] used pooled estimates of the identified studies. Five studies justified the process of selecting key parameters and used systematic methods to identify the data required for the model [12, 14,15,16,17]. Only two studies discussed quality assessment of the included data [16, 17]. Two studies [14, 16] discussed the use of expert opinion to populate their model. Of these, only one [16] described and justified the expert opinion used.

Data Analysis Only three studies [14,15,16] provided a justification for the choice of baseline data used. Four studies [12, 14,15,16] appropriately described the calculation of the transition probabilities used. None of the studies discussed the application of a half-cycle correction to costs and outcomes. None of the studies derived relative treatment effects using trial data. Five of the studies required an extrapolation of results to final outcomes [12, 14, 15, 17, 18]. Of these, two studies described these extrapolations [12, 17] and one study justified the methods used and used alternative extrapolation assumptions in the sensitivity analysis [17]. The other three studies [14, 15, 18] did not go into detail about the extrapolation of results.
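For readers unfamiliar with the half-cycle correction mentioned above, a minimal sketch follows. It assumes a hypothetical two-state (alive/dead) model with a constant annual probability of death, and it uses the trapezoidal form of the correction, which is one common convention rather than the only one. None of the values come from the included studies.

```python
def state_trace(p_die, cycles):
    """Proportion of the cohort alive at the start of each cycle, plus the final end point."""
    alive = [1.0]
    for _ in range(cycles):
        alive.append(alive[-1] * (1 - p_die))
    return alive

def life_years(alive, half_cycle=True):
    """Life-years per person over the trace, with or without a half-cycle correction."""
    if half_cycle:
        # Average the membership at the start and end of each cycle,
        # as if deaths occurred halfway through the cycle on average
        return sum((a + b) / 2 for a, b in zip(alive, alive[1:]))
    # Uncorrected: credit only the membership at the end of each cycle
    return sum(alive[1:])

trace = state_trace(p_die=0.10, cycles=20)
print(life_years(trace, half_cycle=True), life_years(trace, half_cycle=False))
```

With end-of-cycle counting, the uncorrected total is lower than the corrected total by half of the difference between the first and last values of the trace, so the size of the adjustment depends on cycle length and on how quickly the cohort leaves the state.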

Utilities Six studies conducted a CUA [12, 14,15,16,17,18], either on its own or alongside another form of analysis, and all six of these studies clearly referenced the source of the utility weights used in their studies (five used EQ-5D scores [12, 14,15,16, 18], while one used the Quality of Well-Being Scale [17]). Only one study [15] provided a justification of the methods of derivation used to derive the utility weights used; the remaining studies did not discuss methods of derivation and only referenced the sources of the data used. The utilities incorporated in the models were appropriate for the majority of these studies [14,15,16, 18]; however, one study did not provide sufficient detail [12], while another study used a number of assumptions to derive utility values [17].

Costs One study used published literature to identify relevant health state costs [12], but systematic methods to identify this literature were not reported. Additional treatment, travel and productivity loss costs were sourced from routine sources available in the country where the study was conducted, and all can be considered appropriate. One study carried out a micro-costing exercise (focusing on personnel, material and overheads) [13] to estimate the costs associated with alternative diagnostic strategies. All relevant costs were included in the analysis, and all were identified using appropriate sources of data. One study followed the Dutch guidelines for cost calculations in healthcare [14] and included direct medical costs (personnel, materials, housing, equipment, hospital admissions and overheads) and direct non-medical costs to the patient, including travel expenses and patient time. All costs included were identified using appropriate sources. These same cost categories were presented in another study [18], but from a US perspective, with appropriate data included and suitable identification methods applied. One study used a combination of data from Medicare reimbursement rates and data identified in the literature to include in the economic model [15]. The process of identifying cost data from the literature in this study was not well-reported. One study sourced the cost of the test, and patient management, from the literature, with additional medication costs identified using routine data sources [16]. One study focused on the costs to the hospital in carrying out the test, and as a result of subsequent patient management, and additional productivity loss costs due to the hospitalisation of the patient [17]. All studies included appropriate cost data given the perspective of the analysis.

Data Referencing All of the seven included papers referenced the data incorporated in their respective studies. The process of data incorporation was transparently presented in four studies [12, 14, 16, 17]. One study provided limited details of data incorporation [13] and two studies did not discuss data incorporation [15, 18].

Uncertainty None of the seven included papers addressed all four principal types of uncertainty: methodological uncertainty, structural uncertainty, parameter uncertainty and patient population-related uncertainty (i.e. uncertainty related to potential patient heterogeneity). Only one study [16] provided any justification for this omission. Only two studies [13, 14] addressed methodological uncertainties in their models. Four studies [13, 15, 16, 18] made efforts to address the structural uncertainties within their models. None of the included studies discussed the issue of heterogeneity in their studies. For all studies, the method of assessing parameter uncertainty was appropriate.

3.3.3 Consistency

None of the included studies discussed whether the mathematical logic of their models was tested and, so, it is not clear if this task was undertaken. However, the conclusions provided by each of the included studies can be considered valid given the data presented. None of the studies reported counterintuitive results. Four studies [12, 13, 16, 18] discussed the results of previous models and discussed their results in relation to these previous studies. One study [14] discussed another study broadly by comparing the US with the Dutch results; as such, this discussion was more on the generalisability of the results across these two countries rather than a calibration of the study’s results. The two remaining studies [15, 17] discussed previous studies but a comparison of results was not formally conducted or presented.

3.4 Cost-Effectiveness Results

An overview of the base-case cost-effectiveness findings from each of the included studies is presented in Table 2. Of the seven studies included in the final review, only one [12] assessed the cost-effectiveness of an intervention typically used in a primary care setting for the diagnosis of PAD: ABPI. The remaining six studies [13,14,15,16,17,18] focused on the cost-effectiveness of imaging techniques that would typically be conducted in secondary care. The study which explored the potential cost-effectiveness of ABPI [12] found that the intervention of “screening with ABPI and providing treatment to test positive patients” dominated (was cheaper and more effective than) the comparator of “no testing and no preventive treatment”. From the results of the other six studies [13,14,15,16,17,18], there was no one definitive cost-effective method of diagnosis as there was wide variation in the types of intervention assessed, as well as the combination of tests and techniques included in the individual analyses.
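The notion of dominance used above can be made concrete with a short sketch. The function below compares two strategies on incremental cost and incremental effect and reports either dominance or an incremental cost-effectiveness ratio (ICER). The function name and the example figures are hypothetical and are not results from any included study.

```python
def compare(cost_new, effect_new, cost_old, effect_old):
    """Classify a new strategy against a comparator; return (label, ICER or None)."""
    d_cost = cost_new - cost_old
    d_effect = effect_new - effect_old
    if d_cost == 0 and d_effect == 0:
        return "equivalent", None
    if d_cost <= 0 and d_effect >= 0:
        # No more costly and no less effective, and strictly better on at least one
        return "dominant", None
    if d_cost >= 0 and d_effect <= 0:
        # No less costly and no more effective, and strictly worse on at least one
        return "dominated", None
    # Costs and effects move in the same direction, so a trade-off exists
    return "icer", d_cost / d_effect

# Hypothetical example: screening is cheaper and yields more QALYs than no screening
print(compare(cost_new=900, effect_new=10.2, cost_old=1000, effect_old=10.0))
# Hypothetical example: a test that costs more and yields more QALYs
print(compare(cost_new=1500, effect_new=10.5, cost_old=1000, effect_old=10.0))
```

When a strategy is dominant, no ICER is reported because there is no trade-off between cost and effect to summarise; an ICER is only informative when the new strategy is both more costly and more effective (or less costly and less effective) and must be judged against a willingness-to-pay threshold.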

Table 2 Summary of base-case cost-effectiveness results from included studies

4 Discussion

A systematic review of the literature, up until the year 2017, to identify model-based economic evaluations comparing techniques for detecting PAD, was conducted. Seven studies were included in the full review, all of which were published between 1995 and 2014.

The population included in the analyses was variable, with two studies focusing on patients with suspected PAD [12, 17], one study focusing on patients with clearly defined PAD [13], one study focusing on patients with intermittent claudication or limb-threatening ischaemia [16] and three studies focusing on patients with intermittent claudication [14, 15, 18]. Notably, four of the studies were performed in the Netherlands [12,13,14,15]; however, two of these studies were conducted by the same lead author and were based on a similar analysis [14, 15]. The most common type of model structure used amongst included studies was a Markov state transition model embedded in a larger decision-analytic model, which was used in the three studies published by Visser et al. [14, 15, 18], and in the studies by Vaidya et al. [12] and Collins et al. [16].

In terms of the model structure in included studies, these were of adequate quality, with the majority of the included studies reporting and justifying the relevant information associated with model structure. However, of the included studies that used previously published models in their analyses, the methods used to develop these model structures were not well-reported and readers would need to refer to the references provided to assess their quality and appropriateness.

The reporting of the data within the included studies could have been improved. Only one study [16] described the use of systematic methods to identify the data required for the model, and described a quality assessment of the data used. Only three studies [14,15,16] provided a justification for the choice of baseline data used. All of the seven included studies referenced the data incorporated in their studies. The process of data incorporation was transparently presented in four studies [12, 14, 16, 17]. One study provided limited details of data incorporation [13] and two studies did not discuss data incorporation [15, 18]. Finally, none of the seven studies addressed all four principal types of uncertainty, with only one study [16] providing any justification for this omission.

The conclusions provided in each of the included studies may be considered valid given the data presented in the studies. The majority of studies drew comparisons between their own findings and previous research, and the studies that did not make a formal comparison still placed their results in the context of existing evidence.

The cost-effectiveness findings from each of the studies are not presented with the intention of outlining a definitively cost-effective technique for the detection of PAD. The interventions described in many of the studies are paired with a variety of complementary tests/treatments and the results can only be considered relevant in light of the context in which they are presented. However, it is worth noting that only one of the included studies [12] involved an evaluation of a test typically used for the diagnosis of PAD in a primary care setting: ABPI. All other studies evaluated tests which are more commonly utilised in secondary care, such as imaging techniques.

A limitation of the review is that the number of studies identified is quite low, and this sample may not be considered robust enough to make any definitive conclusions about the quality of reporting of modelling studies in this clinical area. This was partially due to the focus on model-based studies specifically, with all other economic evaluations excluded from the review.

This systematic review shows the limited amount of model-based economic evaluation literature that currently exists in this clinical area and the scope that exists for future work. In particular, there is a lack of related work involving tests that would generally be used in primary care and there is scope for future work to focus on the costs and health outcomes of such tests. Additionally, any future reviews in this area may choose to include all economic evaluations, rather than focusing on modelling studies specifically, to determine how much more data would be available if this exclusion criterion was not applied.

5 Conclusion

This review brings together all applied modelling methods for tests used in the diagnosis of PAD, the results of which could be used to inform future model-based economic evaluations in this field. The limited cost-effectiveness information available on tests typically used for the detection of PAD in a primary care setting, in particular, highlights the importance of future work in this area.