Key Points for Decision Makers

This study has shown that a discrete-choice experiment (DCE) can be successfully used to help understand which characteristics of a new diagnostic device are important to its users.

DCEs can be used in the primary care setting to aid decision making relating to the implementation of medical technology.

Decision makers can make informed commissioning decisions using information from DCEs, relating to the use of existing and new medical devices.

1 Introduction

Peripheral arterial disease (PAD) can be asymptomatic, can cause leg pain on walking, or may progress to cause ulcers or gangrene, leading to limb amputation [1, 2]. Symptomatic PAD affects approximately 5% of people aged 55 years and over [3], and 100,000 people are diagnosed with PAD each year in the UK. Because PAD is usually associated with atherosclerosis, these people are approximately sixfold more likely to suffer a heart attack or stroke than those without PAD. Detecting PAD early allows cardiovascular risk factors to be controlled early, and allows the implementation of lower cost therapies that prevent costly and harmful longer-term adverse events from occurring. A new test for early PAD detection, together with a formal evaluation of that test, would be of value, but integral to that evaluation is the need to understand whether the test would be adopted in practice. This issue can be explored using economic approaches that quantify the strength of preference for the test among those who would use it in practice.

The National Institute for Health and Care Excellence (NICE) guidance for PAD testing [4] in primary care is to use the ankle-brachial pressure index (ABPI); however, this test is time-consuming and technically challenging to perform. There are currently no easy-to-use PAD detection devices on the market that require minimal training and that have been informed by the preferences of UK-based primary care clinicians. Besides ABPI limb pressure measurements, there are other approaches that can be used, including the potentially very low-cost vascular optical technique known as photoplethysmography (PPG) [5]. PPG measures the pulse at peripheral sites such as the finger or toe, with the pulse having distinct characteristics when there is PAD present in the limb being studied. A novel rapid PAD detector device based on multi-site PPG (base technology called MPPG) [5, 6] is being developed for primary care use. This device aims to be comfortable for patients, non-invasive, low-cost, safe, and easy and quick to use by a range of clinicians.

In order to incorporate the preferences of primary care practitioners and maximise the suitability of a new PAD detection device for general practitioners (GPs) and practice nurses, a discrete choice experiment (DCE) was conducted among clinical practitioners. The DCE facilitated the identification and exploration of the extent to which practitioners value different aspects of the proposed MPPG-based device.

In the DCE reported in this paper, the diagnostic technology is described in terms of its characteristics or attributes. The extent to which a practitioner values the technology would depend on the level of those attributes (see Table 1). By presenting questions that compare two profiles that differ by the level of the attributes presented, it is possible to estimate the uptake of a technology that might be configured in different ways [7].

Table 1 Attributes and levels included in the discrete choice experiment

1.1 Objective

The overall aim of this study was to determine the strength of practitioners’ preferences, using a DCE, for different characteristics describing a new diagnostic device for the detection of PAD. In order to address the study aim, the DCE answered the following key questions.

  • What key characteristics should be considered in the development of a novel rapid PAD detection device (based on MPPG), from the perspective of primary care practitioners?

  • What are the relative preferences for different levels of these attributes among practitioners?

  • In what ways can the results from the DCE be used to inform future product investment and development decisions?

2 Methods

DCEs are an economic technique used to explore preferences for different types of service, policy or intervention [8], and have been used extensively to explore patient, provider and policy-maker preferences for different characteristics of goods and services [9]. Their use in health care is widespread, and recent reviews have identified several hundred DCEs reported in recent years [10, 11]. The design of the DCE in the current study followed well-documented guidelines for best practice [7, 12, 13]. The methods that were employed can be broken down into four key steps: Step 1: Identification of attributes and levels; Step 2: Experimental design; Step 3: Data collection; and Step 4: Data analysis and interpretation.

2.1 Step 1: Identification of Attributes and Levels

Two main sources were used to inform the list of attributes and levels that were included in the DCE: an expert panel (comprising the clinical specialists and device designers in the project team), and preceding qualitative work that was undertaken in the wider device development project. This qualitative work comprised five focus groups exploring views on the device, which fed into the identification and wording of the attributes contained in the DCE study. The focus groups were conducted between December 2014 and February 2016: three were in general practices, and two were with Tissue Viability Nurses. The breakdown of health care practitioners was 14 GPs and 20 nurses (a mix of District Nurses, Tissue Viability Nurses and Practice Nurses). Specifically, details on the (1) sensitivity and specificity of the test, and (2) device acceptability, ease of interpretation, and confidence in findings, were provided from this qualitative work. All of the information elicited from the consultations was consolidated, facilitating the creation of key attributes and associated levels that may influence the uptake of the new PAD detection device, from the perspective of practitioners. In addition to this, the clinicians and device developers who formed part of the study team also had input into the final refinement of attributes and levels, prior to piloting the DCE survey. Finally, further qualitative work, in the form of think-aloud [14] interviews with clinicians, was conducted as part of the health economics DCE to finalise attributes and levels for inclusion (further details are provided in the Think-Aloud Pre-Testing and Data Collection section below). Table 1 shows the finalised list of attributes and levels that was included in the DCE. An experimental design, generated for the DCE, is described in the next step.

2.2 Step 2: Experimental Design

All possible combinations of attributes and levels described in Table 1 generated a large number of profiles, i.e. 1536 (\(2^{2} \times 3 \times 4^{2} \times 8\)). Each profile is made up of one level for each attribute. A set of choice scenarios can be defined where each profile is compared with another profile. The total number of scenarios generated is therefore too large to be considered by individuals. Consequently, an efficient design was selected that allowed for all main effects to be estimated [7, 10]. The design for the DCE was generated using the Ngene design software [15]. The best design generated by Ngene was chosen with the aim of minimising the standard errors [16] (the reliability of the model parameters to be estimated can be quantified in terms of the asymptotic standard errors and covariances; thus improvements in reliability suggested a reduction in the asymptotic standard errors) [17]. We chose an efficient design that consisted of 20 choices. As this was felt to be too many for a single respondent, and in order to reduce respondent fatigue, two blocks of choice scenarios were generated, each made up of 10 scenarios, and each respondent was randomly assigned to one block of choice scenarios.
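The arithmetic behind the full factorial and the blocking can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the level counts are read off the factorial quoted above, the attributes are labelled generically because the mapping of level counts to named attributes is not given in this section, and the split of the 20 choice sets into "first 10" and "last 10" is a placeholder, not the blocking that Ngene actually produced.

```python
import itertools
import random

# Level counts per attribute, read off the factorial 2^2 x 3 x 4^2 x 8.
# Which named attribute has which count is NOT stated in this section,
# so the six attributes are left unlabelled (assumption for illustration).
LEVEL_COUNTS = [2, 2, 3, 4, 4, 8]


def full_factorial(level_counts):
    """Return every profile as a tuple of level indices, one per attribute."""
    return list(itertools.product(*(range(n) for n in level_counts)))


def assign_block(rng, n_blocks=2):
    """Randomly assign one respondent to a block (0-based index)."""
    return rng.randrange(n_blocks)


profiles = full_factorial(LEVEL_COUNTS)
n_profiles = len(profiles)

# Number of unordered pairs of distinct profiles: far too many to present.
n_pairs = n_profiles * (n_profiles - 1) // 2

# Hypothetical identifiers for the 20 choice sets of the efficient design,
# split into two blocks of 10 (placeholder split, not the Ngene output).
choice_set_ids = list(range(20))
blocks = [choice_set_ids[:10], choice_set_ids[10:]]

# Seeded generator so the illustration is reproducible.
rng = random.Random(0)
respondent_block = assign_block(rng)
```

With these level counts the full factorial contains 1536 profiles and more than a million possible pairwise comparisons, which is why a 20-choice efficient design split across two blocks was used instead.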

An example pairwise choice set is shown in Fig. 1. Practitioners were asked to choose their preferred scenario from each pairwise choice set. The full online questionnaire incorporated (1) questions on background data (including age, sex, profession, and experience with using PAD detection devices); (2) introductory text explaining the DCE task, and an example DCE question (this was an independent question that was not generated from the experimental design; the same question was shown to all survey respondents) [see the electronic supplementary material, which illustrates the DCE section of the online survey]; (3) the main DCE survey, comprising the 10 DCE choice questions presented to respondents, followed by a ranking exercise in which respondents were asked to rank the six attributes making up the choice sets; and (4) additional questions used to explain how individuals had answered the DCE questions. In the final section, a question inviting additional comments on the diagnosis of PAD was included, together with a final question on whether any key attributes were missing from the DCE.

Fig. 1 Screenshot of an example discrete-choice experiment scenario. PAD peripheral arterial disease

2.3 Step 3: Data Collection

2.3.1 Recruitment of Participants and Consent

The DCE was converted into an online survey by a market research company (Research Now), who conducted the DCE to standards described in the Market Research Society’s Code of Conduct [18]. An incentive in the form of a prize draw (an iPad) was offered for full completion of the survey. All participants were provided with written information on the study before taking part, and indicated their consent before data collection took place.

Clinicians involved in the care pathway for the diagnosis of PAD were recruited via a number of avenues in order to achieve the required sample size. The National Health Service (NHS) North of England Commissioning Support Unit, and North East and North Cumbria Clinical Research Network, helped with the recruitment of clinicians in the following ways:

  • the survey link was sent to GP practices to reach GPs, practice nurses and other non-medically qualified clinicians;

  • district and community nurses who are employed by acute trusts could be contacted using existing links to cascade the survey widely.

In addition to the above, clinical conferences, such as the ‘Issues and Answers in Cardiovascular Disease’ primary care conference that took place in Nottingham, UK, 4–5 November 2016, were targeted for recruitment. This conference was specifically chosen because the conference delegates comprised GPs and nurses working in the primary care setting. The link for the survey was made available on the conference website on 3rd November 2016 and was live for a full week following the conference so practitioners could go back and take the survey at their convenience.

2.3.2 Sample Size

Optimal sample size requirements for DCEs depend on knowledge of the true choice probabilities, which are not known prior to undertaking the research [19]. For this reason, DCE sample size estimates are generally based on previous research, rules-of-thumb and budget constraints. Previous DCE studies have shown that robust choice models can be estimated in samples as small as 50 respondents [19, 20]. Given the number of attributes included in the DCE, it was estimated that a minimum sample size of 100 (i.e. 50 per block) would provide sufficient statistical power.

2.3.3 Think-Aloud Pre-testing and Data Collection

The DCE was pre-tested and piloted prior to the main survey data collection. A convenience sample of four clinicians, comprising one GP and three nurses who were independent of the research team, reviewed the survey. A think-aloud approach was used with clinicians to identify any necessary adaptation of the pairwise scenarios and overall survey. An iterative process was taken to incorporate feedback from each think-aloud participant.

The first participant (nurse) suggested making some changes to the introduction section (e.g. making it shorter). This participant confirmed that they understood each of the attributes and associated levels and did not suggest any further changes. The second participant (GP) suggested making the cost for the introductory test DCE question the same in the two alternatives to encourage participants to fully take in the other attributes in the first instance. They also suggested that the original maximum value be reduced (originally set at £6000 with a view to getting feedback from practitioners). The third participant (nurse) suggested that the maximum value for the cost of the device should be £2500 and that more cost levels should be added at the lower cost end of the scale. The fourth participant (nurse) gave positive feedback regarding the final levels of the cost attribute and did not suggest making any changes. Overall, the feedback indicated that the DCE task was well understood.

2.4 Analysis of the Discrete-Choice Experiment (DCE) Data

A conditional logit regression model (i.e. multinomial model) was used to investigate the main effect parameters, applied on the full respondent sample. The utility function (\(\mu\)) modelled is based on Eq. (1). The model was implemented in STATA software (version 14.0; StataCorp LLP, College Station, TX, USA) [21]. The model used assumed that all attributes have an independent influence on practitioners’ preference(s). The functional form incorporated dummy attribute-level coefficients (as per Table 1) so that:

$$\mu_{qj} = \alpha + \lambda^{\prime}{\mathbf{X}}_{qj} + \varepsilon_{qj} ,$$
(1)

where \(\mu_{qj}\) = the indirect utility of individual \(q\) for alternative \(j\), \(\alpha\) = the alternative-specific constant term (for choosing scenario A in the current context), \(\lambda^{\prime}{\mathbf{X}}_{qj}\) = the vector of preferences for the attributes and associated levels included in the DCE survey, for each of the choice tasks included in the DCE survey (t = 1,…, T), and \(\varepsilon_{qj}\) = the random element that is added to reflect the unobservable factors affecting the estimation of the indirect utility function.
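As a reading aid, Eq. (1) can be linked to the choice probabilities that a conditional logit model estimates. The sketch below assumes the standard conditional logit setup, in which the random terms \(\varepsilon_{qj}\) are independent and identically distributed type I extreme value. The symbol \(V_{qj}\) for the systematic part of utility and the labels A and B for the two scenarios in a choice set are notation introduced here for illustration and are not taken from the original analysis.

$$V_{qA} = \alpha + \lambda^{\prime}{\mathbf{X}}_{qA}, \qquad V_{qB} = \lambda^{\prime}{\mathbf{X}}_{qB},$$

$$P_{q}(A) = \frac{\exp (V_{qA})}{\exp (V_{qA}) + \exp (V_{qB})} = \frac{1}{1 + \exp \left[ - \left( \alpha + \lambda^{\prime}({\mathbf{X}}_{qA} - {\mathbf{X}}_{qB}) \right) \right]}.$$

Under these assumptions, only differences in attribute levels between the two scenarios affect the predicted choice, and \(\alpha\) captures any tendency to choose scenario A that the attributes do not explain.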

A conditional logit model was used to establish whether the six attributes presented in the choice scenarios were statistically significant predictors of participants’ preferences. The model was run on the full study sample and also on the GP-only subsample in order to investigate potential differences in preferences between GPs and non-GPs. Dummy coding was used for all attributes, with the exception of the cost of device, which was assumed to be a continuous variable. Marginal rates of substitution (MRS) [13] were used to estimate trade-offs between attributes and levels. The MRS allows us to look at the trade-offs respondents would be willing to make between attributes when each is compared against a common numeric value scale. In the current study, the cost of device attribute serves as this common scale, i.e. as the denominator against which all other attributes are compared. Attribute levels with negative preferences indicate that respondents would prefer not to move from the reference level, while attribute levels with positive preferences indicate that respondents would prefer to move to that level from the reference level.
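The MRS calculation described above can be illustrated with a minimal Python sketch. It assumes the conventional definition in which the coefficient of an attribute level is divided by the cost coefficient and the sign is reversed, so that a negative cost coefficient yields a positive willingness to pay. The function name and the coefficient values are hypothetical and are not estimates from this study.

```python
def marginal_rate_of_substitution(beta_attribute, beta_cost):
    """MRS of an attribute level against cost: -beta_attribute / beta_cost.

    With a negative cost coefficient, a positive result can be read as the
    amount respondents would pay to move from the reference level to this level.
    """
    if beta_cost == 0:
        raise ValueError("cost coefficient is zero; MRS is undefined")
    return -beta_attribute / beta_cost


# Hypothetical coefficients for illustration only (NOT taken from Table 3).
beta_level = 0.50    # utility of an attribute level relative to its reference
beta_cost = -0.001   # utility change per additional £1 of device cost

# Approximately 500, i.e. about £500 for the move from the reference level.
wtp = marginal_rate_of_substitution(beta_level, beta_cost)
```

Because the cost coefficient sits in the denominator, the reliability of every MRS estimate depends on how well that single coefficient is estimated; if its sign is not the expected negative one, the sign of the ratio flips and it can no longer be read as a willingness to pay.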

3 Results

Between July and December 2016, 140 individuals consented to participate in the study, of whom 116 (83%) completed all 10 of the DCE scenarios; the data from these individuals were used in the DCE analysis. Of the 24 individuals who consented but did not complete all of the DCE scenarios, two partially completed the DCE (one individual dropped out after completing the example DCE question and the first DCE scenario, while the other dropped out after completing the third DCE scenario) and the remaining 22 individuals dropped out of the survey after partially completing the descriptive characteristics questions (e.g. sex, clinical discipline). Participant characteristics are described in Table 2. The majority of the sample comprised doctors (n = 95), followed by nurses (n = 17). One Health Care Assistant completed the survey and three individuals who did not specify their clinical discipline also completed the survey.

Table 2 Characteristics of the DCE sample

Table 3 shows the results of the conditional logit model for the full study sample, as well as for the GP-only subsample. Here, positive coefficients represent a positive preference (utility) associated with a particular level of an attribute compared with the reference level, whereas negative coefficients represent a negative preference (disutility). Statistically significant differences are marked. Scenario A (the hypothetical scenario presented on the left-hand side of the screen) was the most frequently chosen option, making up 72% of responses.

Table 3 Results of the conditional logit regression model

For the full participant sample, clinicians had overall strong (statistically significant) preferences for five of the six attributes considered in the DCE. Estimates for the display attribute indicate a preference for the ‘two traffic lights’ results display (as opposed to the ‘three traffic lights’ results display, which was used for the reference category). There is a slight overlap between the confidence intervals for the scale and two traffic lights levels of the display attribute, indicating the possibility that the latter might be preferred to the former. There was a preference for manual input of the test results into patient records. Practitioners had a strong preference for training to be delivered via a paper manual and training video or interactive online training. In terms of power supply (for charging the device), there was a strong preference for the use of disposable batteries with an indicator light to show the low battery life of the device. Practitioners did not show any preference between the two levels for portability of the device. For the GP-only subsample, the main difference, compared with the full study sample, was that none of the levels for the device display attribute was statistically significant. In addition to this, only the disposable battery level for the power supply attribute was statistically significant. The magnitudes and signs of the other statistically significant coefficients were similar to those of the full study sample.

The attribute for the cost of the device was treated as a continuous variable. Higher costs are usually associated with negative utility (individuals would typically prefer to pay a lower amount for a given good or service), and the coefficient was therefore expected to be negative; however, this was not observed in this context. The coefficient on the cost attribute was positive and statistically significant, suggesting that there is a willingness to pay more for the new device.

MRS were calculated for the statistically significant levels for each of the attributes, to obtain willingness-to-pay estimates. However, given that the cost of device attribute was problematic to estimate, these MRS estimates also have limited reliability. Nevertheless, from the ranking of the attributes (Table 4), conducted after completing the DCE section of the survey, it can be seen that the cost attribute was deemed an important consideration for a new medical device. For the majority of the study sample (63%), the cost of the device was ranked either first or second out of the six attributes presented. The display and portability of the device were also among the attributes ranked highly by respondents (ranked number 1 by at least 41% of respondents). The training and power supply attributes were not ranked highly among the six characteristics presented to practitioners.

Table 4 Ranking of DCE attributes

4 Discussion

This study shows that DCEs can be used to elicit the preferences of clinical practitioners in informing the design of a new medical device. This is the first published DCE we are aware of that has investigated the preferences of primary care practitioners in the UK for a new medical device. It is also the first DCE we are aware of to explore preferences for a novel PAD detection device for primary care. Practitioners generally had strong preferences for five of the six attributes that were used to describe the characteristics of the PAD detection device.

4.1 Interpretation of Findings

Practitioners particularly favoured manual, as opposed to automated, integration of test result data onto patient records. The feedback obtained from the pre-pilot think-aloud interviews supported this outcome. Feedback from one of the pre-pilot respondents was that automated integration could cause compatibility issues with the record-keeping system in the future. The compatibility issue flagged the requirement for an ongoing maintenance and service contract to be available to keep the integration software up-to-date with NHS systems (and other software updates). This issue was thought to be a barrier to using a new device where data integration was automated, and, while flagged as a particular problem, it also highlights a wider issue about compatibility with systems between localities, e.g. between countries. It is unclear whether the ability to integrate with standard information systems would be a barrier or facilitator in different countries; however, it is clear that if integration was required, the solution would need to be tailored to each system.

A significantly higher value was placed on a device using disposable batteries as the power supply compared with all other alternatives. Again, feedback from the pre-pilot think-aloud interviews supported this outcome. In two of the think-aloud interviews, participants mentioned that devices such as this, which may be purchased for group use rather than for individual staff, are unlikely to be kept charged. A further issue brought to light in the think-aloud conversations was that when a device like this is intended for use in a home setting, access to a mains electricity socket in patients’ homes may be a problem. Moreover, even when a plug socket is available, use of a mains connection may hinder interaction with the patient and ease of movement around the patient, especially when the patient is in a lying position. The findings need to be considered in line with the context of the situation; for example, environmental concerns about disposable items such as batteries (and the cost of their disposal) may reduce the uptake of a device if this was the sole option.

In terms of the cost of the device, contrary to what would have been expected, a significantly less-favourable attitude towards a higher priced device was not observed. This may reflect the lack of suitable devices currently available and participants’ willingness to pay more to have a more suitable device available. Within the pre-pilot think-aloud interviews, respondents were very conscious of the cost of the device and that a realistic price, in comparison with similar devices, should be set. When the price levels for the device were set within the DCE, we deliberately set prices in small increments and took into account the maximum price range participants suggested in the think-aloud interviews. This was done in order to ensure the cost attribute was not the sole deciding factor when practitioners were considering the two hypothetical scenarios. The coefficient for the cost attribute was positive and significant. There are three possible explanations for this finding: (1) practitioners mainly considered the levels of the remaining five attributes, and, in the majority of the choice sets, the alternative that was chosen was coincidentally the more expensive option; (2) the sample size was too small to distinguish between the different levels of the cost attribute; or (3) unlike the vast majority of DCE studies in health care settings to date that have been conducted in patient samples, the cost is not borne by the DCE participant directly, hence they may be relatively insensitive to impacts upon health system budgets as opposed to direct personal budgets. The conditional logit model did not converge when the cost attribute was estimated as a categorical variable as opposed to a continuous variable, potentially supporting the latter explanation.

4.2 Limitations of the Study

This study had some limitations. First, a main-effects-only design was used. This type of design assumes that all attributes are independent of each other, and were valued as such by study participants (i.e. all interactions between attributes were zero). Given the context of the current study, this may be reasonable since the six different medical device design characteristics can be treated as independent. In addition to this, we did not include an opt-out option in the choice tasks, therefore participants were asked to make a forced choice between the two scenarios presented, and did not have the ability to choose neither option, which might have indicated their preference for the device that they currently use. A future study could estimate uptake rates for MPPG-based technology when compared with commonly used existing devices for ABPI measurements. Related to this, future studies could also assess the issue of preference heterogeneity using more complex model specifications such as the mixed logit model. Second, this study used an online survey whereby the practitioners’ answers were collected electronically. Although the survey was easier to administer in an electronic format, an interviewer-based administration may have been more reliable in terms of ensuring participants truly understood the task being asked of them. Nevertheless, only three (3%) participants found the DCE questions difficult to answer. Added to this, the sample mainly consisted of GPs, and it is not yet known whether nurses would have had different preferences to GPs. It is worth noting that in practice currently, the ABPI test is usually administered by a nurse rather than a GP. Finally, the results of this study may not be generalisable to the secondary care setting as the device is currently being developed specifically for users in the primary care setting. Additionally, the majority of respondents came from the North of England, so it is difficult to evaluate how representative the findings are of the wider UK context.

When developing the DCE, choices had to be made about how many attributes and levels could be included. In particular the device display attribute was the only one that included a pictorial set of levels. This might have resulted in study participants treating this attribute differently to the others. Some potentially relevant attributes, such as the type of measurement probe, were not considered because they were not identified as sufficiently important in preparatory work.

4.3 Lessons for Future Studies Using DCE to Inform the Design of New Medical Devices/Conclusion

This study has shown that a DCE can be useful in helping to understand which attributes of a medical device are key to its users. This in turn can help identify design characteristics of the device that would maximise the benefit to users, and hence uptake. Typically, attributes and levels included in the DCE can be set to describe the current state (in this context, the currently used device in general practices, which is the ABPI). Levels within device characteristics/attributes can then be altered in order to explore relative preferences for changes from the current device to the new device, allowing predictions of uptake of the new device [7]. In the current study, device characteristics and their associated levels described the configuration of the proposed new MPPG-based device. The characteristics of the device are very different to ABPI in a number of ways. For example, there is usually no display available with the manual ABPI method (except in the case of auto ABPI devices), therefore this attribute is not a characteristic that is transferable to MPPG technology. Although we cannot estimate uptake rates for MPPG-based technology when compared with commonly used existing devices for ABPI measurements, we do know that display and cost were the characteristics of a new PAD detection device ranked highest by GPs. Given that ABPI does not provide a results display, this might have a strong influence on the uptake of MPPG technology for PAD detection.