Key Points for Decision Makers

Use of expert judgement is widespread in NICE guidance making, but there is no standard approach across programmes and little use of published tools or protocols for expert elicitation.

Current practice could be improved by using a tool or protocol to facilitate more formal methods of expert elicitation. The tool or protocol would need to meet a number of requirements in order to be adopted by NICE, including fitting into the time and resource constraints of the guidance-making process.

None of the tools and protocols identified in this study were entirely appropriate for use by NICE. However, information on the available tools and protocols could be of use to international HTA agencies and other groups interested in conducting expert elicitation to inform healthcare decision making.

1 Background

Expert judgement has many uses in the context of health technology assessment (HTA) and economic evaluation, and can be broadly categorised into expert opinion and expert elicitation (Box 1) [1]. Expert opinion involves qualitative expression of an individual’s judgement (e.g. asking an expert to define the natural history of a disease). Expert elicitation involves collecting quantitative information from experts (e.g. using elicitation to obtain estimates of model inputs, and often their distributions, as well as associated uncertainty). Although generally acknowledged to be a poor quality of evidence [2], experts are routinely relied upon to fill gaps in higher quality evidence during development of evidence-based recommendations.

Box 1 Definitions of expert elicitation and expert opinion used in this study [1]

There is no universal agreement on the most appropriate technique for expert elicitation, although a number of protocols and methodologies are available [3,4,5,6]. Several digital tools and software to support expert elicitation exist [7], some of which are described in further detail in this paper.

1.1 Use of Expert Judgement in HTA and Guideline Production

The National Institute for Health and Care Excellence (NICE) uses information from both professional and lay experts in the production of guidance [8]. Professional experts are individuals with knowledge of the experience and views of practitioners, government and policy, or research and practice. Lay experts are those with experience of using health or care services, carers, advocates, or members or officers of a voluntary or community organisation potentially affected by NICE guidance. This article focuses predominantly on professional experts, referred to simply as ‘experts’ in some instances. Points referring to lay experts are clearly indicated.

There are several programmes within NICE which produce guidance on a variety of technologies, interventions, and services [9]. Within each programme, one or more committees are responsible for considering the available evidence and making recommendations for guidance.

1.2 Objectives

The aims of this study were:

  • To review current use of experts across NICE guidance-making programmes and to identify how the use of expert judgement could be improved during HTA and guidance development;

  • To assess tools and protocols that could assist with the elicitation of information from experts for use by NICE and in HTA worldwide. We considered a ‘tool’ to be an interactive digital resource used to assist a user with expert elicitation, and a ‘protocol’ as a guide that provided methodology on how to design, conduct or best report expert elicitation.

2 Methods

2.1 Review of Current Use of Expert Judgement in NICE Guidance

We obtained methods and process manuals (March 2017) for each NICE guidance-making programme from the NICE website and extracted information on the use of professional and lay experts and/or expert judgement.

In March and April 2017, we conducted interviews with NICE staff members, NICE committee chairs, and individuals with experience of undertaking contracted work for NICE programmes (specifically in those programmes where external groups undertake expert elicitation). We used semi-structured interview guides (“Appendix 1”). Questions were adapted as necessary to reflect the differing roles of interviewees representing the various programmes, lay experts and those working outside of NICE. The interviews aimed to capture how NICE uses experts during guidance making, including numbers of experts consulted, their identification, the process of gathering expert judgement, use of elicited information by NICE programmes, and the opportunities/challenges associated with current processes. We conducted interviews with representatives from the following guidance-making programmes:

  • NICE guidelines.

  • Diagnostics guidance.

  • Highly specialised technologies guidance.

  • Interventional procedures guidance.

  • Medical technologies guidance.

  • Technology appraisals guidance.

We also interviewed a representative from the Public Involvement Programme in order to understand the use of lay experts across all guidance-making programmes.

Interviewees were informed that their full responses were confidential, with themes and examples being reported rather than full interview transcripts. This allowed for more open discussion during interviews.

We identified key issues emerging from interviews and the review of manuals to determine a set of requirements for a tool or protocol to assist with expert elicitation at NICE. Although we did not ask interviewees directly for a list of requirements, their responses fed into the list presented in this paper and, thus, we judge this list to be representative of the requirements of NICE programmes using expert elicitation. Furthermore, four authors of this paper have experience conducting informal expert elicitation for the development of NICE guidance and so could provide insight into the issues emerging from interviews and the relevance of the identified requirements.

2.2 Review of Tools and Protocols for Expert Elicitation

We conducted a targeted literature search to identify tools and protocols designed to support expert elicitation. We included resources meeting the following criteria:

  • Tool or software designed and used for eliciting data from experts published or updated since 2005;

  • Protocol (i.e. a guide) describing the elicitation of data from experts published since 2005.

Only tools and protocols published or updated since 2005 were included, as those developed before the publication of a seminal text in the field [4] were judged less relevant. We did not consider tools and protocols designed for related purposes, such as decision analysis. Further, we excluded case studies reporting expert elicitation in a single or small number of instances (as opposed to guides reporting how expert elicitation should be undertaken).

First, we undertook an Ovid MEDLINE search in March 2017 to find systematic reviews of methods for eliciting probability distributions from experts in the context of HTA and economic evaluation (search strategy provided in “Appendix 2”). We screened records to identify eligible systematic reviews and assessed studies included in reviews for relevance. Second, we conducted targeted searching of Google Scholar for published work and Google for grey literature and online tools. We consulted websites describing tools for expert elicitation and identified ongoing work by contacting UK-based researchers known to have published in the area and checking funding body webpages. Grey literature searches were not limited to tools used in the context of HTA and economic evaluation.

A single reviewer (MJ) conducted study selection and a second reviewer (JC) confirmed the relevance of included results. We extracted relevant information on tools and protocols and, where required, contacted tool developers to request additional information. We assessed the suitability of tools and protocols first against the criteria emerging from interviews and second through consultations with NICE staff or stakeholders involved in expert elicitation (for example, a model developer from NICE guidelines who is required to elicit model input parameters from expert committee members).

Our review focused on tools and protocols designed specifically for expert elicitation. However, we also assessed a generic online survey tool suitable for the collection of expert opinion following discussion with NICE staff, who judged that a tool for obtaining qualitative information from experts could be helpful.

3 Results

3.1 Review of Current Use of Expert Judgement in NICE Guidance

From March to June 2017, we conducted 12 semi-structured interviews with NICE staff. Further interviewees included two NICE committee chairs, two health economists contracted to work on NICE guidelines, and a professor who led an independent external group contracted to critically appraise evidence for guidance.

Table 1 summarises the methods we used to review the use of expert judgement and the principal ways in which experts are involved in guidance making in each of the NICE programmes (i.e. if they are full committee members, external advisers or both).

Table 1 Summary of use of experts by NICE guidance-making programmes

Sections 3.1.1 and 3.1.2 describe the ways in which expert opinion and expert elicitation are used in guidance making. It is worth noting that although we present these activities separately they are not typically discrete processes. For example, in the questionnaires used by the interventional procedures programme there are questions around both expert elicitation and expert opinion.

3.1.1 Use of Expert Opinion

Qualitative expert opinion plays an important role in developing NICE guidance and is collected at various stages across all NICE programmes. Box 2 summarises the use of expert opinion, based on both the review of manuals and the interviews conducted.

Box 2 Use of expert opinion at NICE

Generally, expert opinion is sought in an unstructured way, during committee meetings (in person), by telephone or email. This process is flexible, and experts are asked anything that committee members and chairs deem appropriate. Many of the programmes also use standardised questionnaires to gather expert opinion, particularly at the scoping stage. In some cases, an interactive approach may be used. For example, staff working on NICE guidelines have facilitated workshops with professional experts for ‘conceptual mapping’ of clinical pathways in complex disease areas.

The use of opinion from lay experts is more limited. Their input generally relates to their own areas of expertise, notably patient experience, quality of life, and any undocumented side effects or benefits of the technology/service under evaluation.

3.1.2 Use of Expert Elicitation

All NICE programmes use quantitative information elicited from experts in some circumstances. However, this is less common than the use of expert opinion and is generally restricted to cases where there are significant gaps in the evidence base. Where expert elicitation is used, information is usually elicited from professional experts. The input of lay experts is sought where there are questions related to their experience of a technology, service or disease area that the committee must take into account to make an informed recommendation (although this typically involves opinion gathering rather than elicitation of quantitative values).

Four of the programmes producing guidance (technology appraisal guidance, highly specialised technologies guidance, medical technologies guidance, and diagnostics guidance) contract independent external groups to review evidence. In these programmes, it is more likely that the external group (or company submitting evidence) carries out expert elicitation, rather than staff from the programme itself. Hence, for all four programmes, we sought advice from these external groups. Interviewees from the technology appraisal programme advised that external evidence review groups used professional experts to estimate quantitative information for parameterising economic models. Based on the experience of the interviewee, these inputs would normally be parameters such as resource use data, rather than the efficacy of a treatment. Typically, experts are asked for a central estimate and some measure of uncertainty around this, for example, minimum and maximum plausible values. In a previous appraisal, experts were asked to plot data on a histogram in order to elicit the distribution of the input value. Information from experts was aggregated by linear pooling and a mean value, distribution and associated uncertainty were calculated [10].
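As an illustration only, the histogram-based elicitation and linear pooling described above can be sketched in a few lines of Python. The sketch assumes that each expert allocates a number of 'chips' to a shared set of bins and that all experts receive equal weight; the bin midpoints, the chip counts, and the function names are hypothetical and are not taken from the appraisal cited [10].

```python
from typing import List, Tuple


def normalise(chips: List[int]) -> List[float]:
    """Convert one expert's histogram (chips per bin) into probabilities."""
    total = sum(chips)
    if total <= 0:
        raise ValueError("each expert must place at least one chip")
    return [c / total for c in chips]


def linear_pool(histograms: List[List[int]]) -> List[float]:
    """Equal-weight linear pool: average the experts' bin probabilities."""
    probs = [normalise(h) for h in histograms]
    n_bins = len(probs[0])
    if any(len(p) != n_bins for p in probs):
        raise ValueError("all experts must use the same bins")
    return [sum(p[i] for p in probs) / len(probs) for i in range(n_bins)]


def pooled_mean_sd(midpoints: List[float], pooled: List[float]) -> Tuple[float, float]:
    """Mean and standard deviation of the pooled (discrete) distribution."""
    mean = sum(m * p for m, p in zip(midpoints, pooled))
    var = sum(p * (m - mean) ** 2 for m, p in zip(midpoints, pooled))
    return mean, var ** 0.5


# Hypothetical data: three experts, four bins (e.g. clinic visits per year).
bin_midpoints = [1.0, 2.0, 3.0, 4.0]
histograms = [
    [2, 6, 2, 0],
    [0, 5, 5, 0],
    [0, 2, 6, 2],
]
pooled = linear_pool(histograms)
mean, sd = pooled_mean_sd(bin_midpoints, pooled)
```

Because the pool is a mixture of the individual histograms, its spread reflects both each expert's own uncertainty and the disagreement between experts.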

Within the NICE guidelines programme, interviewees described a variety of uses and techniques for expert elicitation. Elicitation ranged from structured approaches with a large group of experts (for example, the full committee) to less formal exercises with smaller numbers of experts. Experts were typically asked for means and ranges of quantitative estimates and means across experts were calculated. One health economist with experience of working on NICE clinical guidelines described using a modified Delphi technique via an online survey tool to elicit information from professional experts. This involved multiple rounds of elicitation, with responses shared after each round, and experts allowed to refine their answers. Other structured approaches to elicitation included using online questionnaires with a subgroup of topic-specific committee members in order to obtain initial estimates of model inputs, which were then presented to the full committee.
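The multi-round structure of a modified Delphi exercise can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used by the interviewee: the feedback shared after each round is assumed to be a simple anonymised summary, and the `move_halfway` rule is a hypothetical stand-in for the revisions that real experts would make themselves.

```python
from statistics import mean, median
from typing import Callable, Dict, List


def round_feedback(estimates: Dict[str, float]) -> Dict[str, float]:
    """Anonymised summary of one round, shared with every expert."""
    values = list(estimates.values())
    return {
        "median": median(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


def run_delphi(
    initial: Dict[str, float],
    revise: Callable[[str, float, Dict[str, float]], float],
    n_rounds: int = 2,
) -> List[Dict[str, float]]:
    """Run n_rounds of revision after the initial round.

    revise(expert_id, own_estimate, feedback) returns the expert's new estimate.
    The returned history holds the estimates from every round, initial first.
    """
    history = [dict(initial)]
    for _ in range(n_rounds):
        current = history[-1]
        feedback = round_feedback(current)
        history.append({e: revise(e, v, feedback) for e, v in current.items()})
    return history


def move_halfway(expert_id: str, own: float, feedback: Dict[str, float]) -> float:
    """Hypothetical revision rule: move halfway towards the group median."""
    return own + 0.5 * (feedback["median"] - own)


# Hypothetical initial estimates (e.g. mean procedure duration in minutes).
initial = {"expert_a": 10.0, "expert_b": 20.0, "expert_c": 40.0}
history = run_delphi(initial, move_halfway, n_rounds=2)
final_mean = mean(history[-1].values())
```

Keeping the full history, rather than only the final round, allows the range of estimates in each round to be reported alongside the final mean across experts.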

Expert elicitation may also be conducted via questionnaires in other programmes. An interviewee from the interventional procedures programme reported asking experts for quantitative information in the questionnaires they complete prior to committee meetings. Some of these questions may include pre-defined ranges for quantitative estimates. For example: “Please estimate the proportion of doctors in your specialty who are doing this procedure (choose one): (a) More than 50% of specialists engaged in this area of work; (b) 10–50% of specialists engaged in this area of work; (c) Fewer than 10% of specialists engaged in this area of work; (d) Cannot give an estimate”. The interviewee reported that a single round of expert elicitation was conducted, and individual answers were not shared across experts. In general, it appeared to be rare for multiple rounds of elicitation to be considered necessary for the production of guidance.

Interviewees indicated that in many cases elicitation is informal and may be conducted over the telephone or by email with individual experts, or with groups or individuals during committee meetings. For example, during a meeting, the committee may ask experts about the mean duration of a procedure where this has not been reported in the evidence base. In these instances, it is unlikely that the uncertainty around experts’ estimates would be quantified as it may be judged unnecessary or impractical to do so.

3.1.3 Potential for Improvements in the Use of Expert Opinion and Expert Elicitation

The most frequently reported challenge for guidance programmes was the limited time available for obtaining and using expert judgement (i.e. expert opinion or expert elicitation). Recruiting and retaining experts can be challenging and time-consuming, and interviewees suggested that the use of online surveys could help to improve engagement, particularly in the case of lay experts who may be recruited through their clinicians and asked to return a paper questionnaire. In other cases, use of survey software for elicitation or gathering opinions may save time because experts would not need to attend face-to-face meetings. Use of software that automatically collated experts’ responses would reduce administration time. However, it was noted that NICE staff and committee members often prefer to speak with experts in person in order to obtain the most appropriate information possible for the guidance (via expert opinion).

Interviewees with experience of conducting expert elicitation thought that this process could be improved if a protocol was available that provided advice on the best approach to elicitation and how to avoid bias. A tool or protocol would avoid the need to begin afresh when designing each elicitation. If a tool or protocol led to time savings, expert elicitation could be used more frequently and robustly across all programmes. Another suggestion was to provide training to experts prior to elicitation, which would allow them to provide information better tailored to the evidence gaps. However, this might have recruitment and resource implications.

A tool or protocol to support expert elicitation would be of particular benefit for internal and external modellers working on NICE guidelines, for external academic groups contracted by the diagnostics, medical technologies, technology appraisals, and highly specialised technologies programmes, and for companies submitting to NICE.

3.2 NICE Requirements from a Tool or Protocol for Expert Elicitation

Our findings suggest that any single existing protocol or tool is unlikely to meet all of the needs of the different NICE programmes. However, resources that do not fulfil all requirements of all programmes could still be of use in guidance making.

A tool to support the use of expert elicitation in NICE guidance making would ideally be flexible, affordable, user-friendly, and usable within current timeframes for guidance making, as well as meeting a number of technical requirements. Specific requirements are described in more detail in Table 2, alongside an analysis of the extent to which the two most suitable tools met these criteria.

Table 2 NICE requirements from a tool and the extent to which two tools met these criteria

The majority of requirements that emerged from our review of manuals and interview data referred to tools for expert elicitation (or possibly a tool-protocol combination, i.e. a tool where a protocol was available to use alongside it). However, we also identified some key features of expert elicitation for which an ideal protocol would provide information on best practice: selection of experts (including numbers of experts to include); avoidance of bias (including during recruitment); weighting of expert judgements according to the quality of responses; reporting of expert elicitation exercises; and use of elicitation outputs in decision making.

3.3 Review of Protocols and Online Tools for Expert Elicitation

Targeted, pragmatic searches identified 1017 records, of which 16 tools [3, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25] and four protocols [1, 26,27,28] for expert elicitation were deemed suitable for inclusion.

Further details of the tools identified and an analysis of their potential for use by NICE are presented in “Appendix 3”. Although no tool met all of the requirements identified in our review, five were considered further. In the case of these tools, it was judged either that they were likely to be of use for expert elicitation by NICE, or that insufficient information was available in the public domain (i.e. in published studies or on the tools’ websites) to be able to exclude them. The tools’ developers were contacted to obtain further information and the suitability of the tools was analysed in light of this information.

Two tools, Excalibur [11] and PEGS (prior elicitation graphical software) [24], were subsequently excluded because they required elicitation to be conducted during face-to-face interviews with experts and there were concerns around how user-friendly the software was, particularly because associated training was not available for either. The remaining three tools were judged to be potentially more useful to NICE. These were MATCH [17], SHELF [3] (these two tools use much of the same functionality, as described in Sect. 3.3.1), and ExpertLens [21].

3.3.1 MATCH and SHELF

MATCH [17] is an online interface for the SHELF package, which is considered to represent good practice in expert elicitation [29]. The SHELF package includes a series of documents that provide extensive information on designing and conducting expert elicitation, which could be considered a protocol [3]. A potential advantage of using MATCH is that it provides a user-friendly online platform and, unlike SHELF, does not require use of R statistical software. For these reasons, we assessed MATCH rather than SHELF for use by the NICE guidance-making programmes. All of the information provided with SHELF is applicable to MATCH and is freely available online.

MATCH fulfils some of NICE’s requirements from a tool. However, it does not allow for pooling of estimates and must be used with one expert at a time, or by obtaining consensus opinions from multiple experts, face-to-face. In addition, facilitators would require training in the use of the SHELF methodology to use MATCH. Training is available but could be costly should a large number of staff require training.
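To indicate the kind of calculation that underlies elicitation with one expert at a time, the sketch below converts a single expert's median and quartiles into a normal distribution using only the Python standard library. It is a simplified illustration and should not be read as the fitting procedure implemented in SHELF or MATCH: the choice of a normal distribution, the assumption of roughly symmetric judgements, the function name, and the example values are all assumptions made here.

```python
from statistics import NormalDist


def normal_from_quartiles(lower_q: float, median: float, upper_q: float) -> NormalDist:
    """Fit a normal distribution to one expert's elicited quartiles.

    Assumes roughly symmetric judgements: the mean is set to the elicited
    median and the standard deviation is derived from the interquartile range.
    """
    if not lower_q < median < upper_q:
        raise ValueError("expected lower quartile < median < upper quartile")
    z75 = NormalDist().inv_cdf(0.75)  # standard normal upper quartile (about 0.674)
    sigma = (upper_q - lower_q) / (2 * z75)
    return NormalDist(mu=median, sigma=sigma)


# Hypothetical judgement: median 10, quartiles 8 and 12 (e.g. bed days).
fitted = normal_from_quartiles(8.0, 10.0, 12.0)
interval_95 = (fitted.inv_cdf(0.025), fitted.inv_cdf(0.975))
```

When an expert's quartiles are clearly asymmetric around the median, a skewed distribution would be more appropriate than the normal distribution assumed in this sketch.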

3.3.2 ExpertLens

ExpertLens [21] is an online tool for eliciting information from a large number of experts. It uses a modified Delphi technique, and elicitation occurs online in a series of rounds. The tool fulfils some of NICE’s requirements. However, it does not capture the uncertainty associated with elicited values and encourages experts to agree estimates by consensus. Therefore, the tool would not be appropriate as it currently stands. At present, the tool is best suited to the collection of expert opinion. However, with the addition of a bolt-on module to enable elicitation of input values and associated uncertainty, this tool could meet the technical requirements for use in NICE guidance making.

Table 2 summarises the extent to which MATCH and ExpertLens met NICE’s requirements (considering that MATCH employs SHELF functionality but with a more user-friendly interface, and so the latter was not considered further).

3.3.3 Protocols for Expert Elicitation

Of the four included protocols, none met all the requirements of the NICE programmes. However, as the requirements that emerged from our review referred largely to characteristics of a tool or a combined tool and protocol (see Sect. 3.2), it was anticipated that this would be the case. Nevertheless, each protocol presented some information that could be of use in expert elicitation to inform guidance making. We extracted this information and present this below.

With regard to the selection of experts, Durbach et al. [26] state that the number of experts will vary depending on factors including time and resource constraints. Hoffmann et al. [28] recommend use of snowball sampling and sample size calculations to select and determine numbers of experts; however, this may not be feasible within current NICE timeframes. These protocols agree that efforts should be made to prevent bias or overconfidence in experts’ judgements and that experts should be asked to capture the uncertainty associated with their judgements by providing highest and lowest plausible values in addition to a central estimate. The US Environmental Protection Agency [27] suggests various methods to reduce bias, including providing background briefing information, training experts, piloting, and documenting all stages of elicitation. In addition, it describes various methods for weighting and combining expert responses. Durbach et al. [26] recommend using linear pooling to combine expert judgements and assigning weightings to each expert, although no information on how weightings should be calculated is given. Iglesias et al. [1] present detailed reporting guidelines for expert elicitation exercises and recommend that full results should be presented and interpreted, ensuring that uncertainty is described, in order to inform decision making.
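A weighted linear pool of judgements given as lowest, central and highest plausible values might be sketched as below. Two assumptions are made purely for illustration and are not drawn from the protocols reviewed: each expert's three values are treated as the minimum, mode and maximum of a triangular distribution, and the weights are supplied by the analyst (the protocols do not specify how weightings should be calculated).

```python
from typing import List, Sequence, Tuple


def triangular_moments(low: float, mode: float, high: float) -> Tuple[float, float]:
    """Mean and variance of a triangular distribution on [low, high]."""
    if not low <= mode <= high or low == high:
        raise ValueError("expected low <= mode <= high with low < high")
    mean = (low + mode + high) / 3
    var = (low**2 + mode**2 + high**2 - low * mode - low * high - mode * high) / 18
    return mean, var


def weighted_pool(
    judgements: Sequence[Tuple[float, float, float]],
    weights: Sequence[float],
) -> Tuple[float, float]:
    """Mean and variance of the weighted linear pool (a mixture distribution)."""
    if len(judgements) != len(weights) or sum(weights) <= 0:
        raise ValueError("need one positive-sum weight per expert")
    total = sum(weights)
    w: List[float] = [x / total for x in weights]  # normalise weights to sum to 1
    moments = [triangular_moments(*j) for j in judgements]
    pooled_mean = sum(wi * m for wi, (m, _) in zip(w, moments))
    # Second moment of a mixture is the weighted sum of component second moments.
    second_moment = sum(wi * (v + m**2) for wi, (m, v) in zip(w, moments))
    return pooled_mean, second_moment - pooled_mean**2


# Hypothetical judgements about a proportion: (lowest, central, highest).
judgements = [(0.10, 0.20, 0.30), (0.20, 0.30, 0.40)]
equal_mean, equal_var = weighted_pool(judgements, [1, 1])
skewed_mean, skewed_var = weighted_pool(judgements, [3, 1])
```

With equal weights the pooled mean lies midway between the two experts' means; increasing the first expert's weight pulls the pooled mean towards that expert's central estimate.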

3.3.4 Ongoing Research

A project funded by the Medical Research Council is currently being conducted at the University of York and aims to develop a reference protocol for elicitation of quantitative values from experts [30]. The protocol will be designed for use in healthcare decision making and it is anticipated that it will be suitable for use by model developers involved in the production of NICE guidance. It is likely to be available in early 2019. The project does not aim to develop a tool for use alongside the protocol, although it is possible that this will occur as follow-up work.

3.4 Online Survey Tools for Collating Expert Opinion

A standard tool for gathering expert opinion might also be of use to the NICE guidance-making programmes. At present, many of the programmes ask experts to provide opinions using paper questionnaires and synthesising this information can be time-consuming. Therefore, we assessed a generic online survey tool (Snap Surveys) for suitability and consulted staff members across guidance-making programmes as to whether use of the tool would be beneficial.

Overall, responses suggest that an online survey tool could represent an efficient method for obtaining information from experts. In some programmes, use of the tool could allow for a greater number of experts to be engaged at lower cost, or for experts to be consulted at more time points during the guidance-making process. A representative from the interventional procedures programme advised that they are already piloting use of electronic surveys to gather information from experts. However, online surveys could not completely replace expert opinions provided at committee meetings because it is not known in advance what questions will be asked.

An online survey tool could also have the potential to be used for expert elicitation if clear guidelines were given around structuring questions, recruitment, informing experts of the purpose of the exercise, analysis of responses and presentation of results. Alternatively, if a bolt-on module was developed for ExpertLens as described above, this tool could also be used for both collecting expert opinion and expert elicitation.

4 Discussion

All NICE guidance-making programmes use expert judgement to inform decision making and, especially in cases where there is a paucity of evidence, this judgement can be highly influential in the development of guidance. Experts’ roles vary depending on the NICE programme of interest, the aims of that programme, and the way in which expert judgement is incorporated. The role of professional experts is two-fold. First, they provide expert opinion, including background information and sense checking clinical and economic evidence. Second, quantitative information is elicited to fill gaps in the evidence base. Roles for lay experts are more limited and do not involve elicitation of quantitative information.

Currently, there is little use of published tools or protocols for expert elicitation across NICE, with no standard approach specified in the methods and process manuals. Further, existing timelines in certain programmes are often a constraint to conducting formal elicitation. Interviewees judged that current practice could be improved by using more formal methods for expert elicitation guided by a tool or protocol. This would improve the reliability of elicited data, the sample sizes used in elicitations, and the information generated about uncertainty. However, time and resource constraints mean that a tool or protocol for use by NICE would need to adopt a streamlined process of expert elicitation. In a paper published since our review, the authors compared remote elicitation with face-to-face elicitation and reported that issues with recruitment, expert attrition and the time needed to obtain data remained prevalent when remote elicitation was used [31]. A tool or protocol may also require adaptation for the use of lay experts, although a survey tool for completing questionnaires online and synthesising answers could significantly cut down administration time and may improve engagement.

None of the tools and protocols for expert elicitation identified in the pragmatic review were entirely appropriate for use in NICE guidance making. Two tools for expert elicitation, ExpertLens and MATCH, do not currently meet NICE’s requirements but could potentially be adapted for use in guidance making through work with the tools’ developers. In particular, development of a module for ExpertLens could allow it to be adapted for expert elicitation of input values for decision models. Ongoing research that aims to develop a reference protocol for expert elicitation in healthcare decision making could also be of use. However, because the protocol is still in development it was not possible to assess it against NICE’s requirements.

Online survey tools for gathering expert opinion could be utilised by NICE to reach a wider pool of both professional and lay experts in a cost-effective manner. However, use of surveys could not entirely replace conversations with experts. Rather, a survey tool may be useful for collecting and analysing information typically obtained through questionnaires, or for gaining additional information.

4.1 Limitations

Due to the pragmatic nature of this review (including searching primarily for systematic reviews), we are unlikely to have identified every available tool or protocol to support expert elicitation. Further, by focusing primarily on tools for expert elicitation in the context of HTA, we may have missed relevant knowledge and resources from other sectors. The small sample size of individuals from each of the NICE guidance-making programmes and external evidence review groups is a further limitation to this research, especially considering the apparent diversity in the use of expert judgement across programmes. The processes and opinions described were typically based upon the views of one NICE staff member and either a committee chair or an individual from an external organisation involved in guidance development.

4.2 Conclusions

This review of current NICE processes for using expert judgement and suitability of tools, software and protocols to assist with this process identifies valuable information relating to potential areas for improvement and further research at NICE. We outline the specific requirements of NICE from a tool or protocol for expert elicitation and explore how existing tools/protocols could be adapted to meet these requirements. These findings could inform international HTA agencies and other groups interested in conducting expert elicitation to inform healthcare decision making.