Key Points for Decision Makers

• The National Institute for Health and Clinical Excellence (NICE) has issued recommendations for the use of technologies within the context of evidence development since it was first established

• The guidance referring to evidence development usually takes the form of recommending a technology is used ‘only in’ the context of research, but recommendations of evidence collection alongside approval have also been issued

• The incremental cost-effectiveness ratios (ICERs) of these technologies were usually higher than the standard threshold range and there was usually uncertainty around the magnitude of clinical effect

• The use of ‘only in research’ and ‘approval with research’ recommendations has decreased over recent years and is rare for technologies appraised through the single technology appraisal process

• A transparent and systematic framework for the use of recommendations including evidence development would be beneficial

1 Introduction

There has been growing interest in the inclusion of formal requirements for the collection of further evidence within reimbursement decisions as part of the technology approval processes undertaken by healthcare agencies [1–7]. A recent review found that five countries have implemented ‘coverage with evidence development’ schemes: Australia, France, Sweden, the UK and the USA [2]. In the UK, most national decisions about which health technologies should be used routinely in the NHS are made by the National Institute for Health and Clinical Excellence (NICE). In addition to recommendations of whether a technology should be approved for routine use or not, it has also been established that NICE has the option of recommending the use of a technology in the context of evidence development, including the collection of data within registries, prospective cohort studies and pragmatic randomized trials [8].

A range of terminology is used within agencies and the literature to refer to these types of recommendations that link reimbursement decisions with requests for further evidence development. In addition, the recommendations can be implemented in different ways. Two distinct types of recommendations directly incorporating evidence development can be termed as recommendations of use ‘only in research’ (OIR) or ‘approval with research’ (AWR). The distinction between these two forms of guidance is principally the extent of coverage that each confers: whether all patients taking the technology must participate in the research programme or if non-participants can also routinely access the technology providing the research gets conducted. There is, however, a lack of consensus on the circumstances under which such schemes should be recommended. NICE provides its Appraisal Committees with general guidance on the health technology assessment (HTA) methodologies and social value judgements it considers to be most appropriate for the formulation of NICE guidance [9–11]. These documents include guidance on the assessment of effectiveness, cost effectiveness and other considerations. With regard to research recommendations, NICE states that its:

advisory bodies may sometimes recommend that an intervention is used only within a research programme. They should consider whether the intervention is reasonably likely to benefit patients and the public, how easily the research can be set up or whether it is already planned or in progress, how likely the research is to provide further evidence, and whether the research is good value for money. [10]

The documents do not distinguish explicitly between alternative types of research recommendations, and do not describe any formal mechanisms for linking the decisions to funding for the research.

Other recent developments include linking reimbursement with evidence of outcomes generated after approval. For example, in 2009 NICE established a formal process for the consideration of ‘patient access schemes’ (PASs), which are aimed at enabling patients to gain access to high-cost drugs by improving their cost effectiveness [12]. Whilst this formal process is new, ‘access’ or ‘risk-sharing’ schemes have previously been adopted by the Department of Health in the UK, for example the risk-sharing scheme for interferon beta [13]. Importantly, these new PASs do not necessarily require the collection of additional evidence; rather they could include a simple price discount or other cost reduction. It is the requirement for evidence collection that characterizes the difference between OIR/AWR recommendations and the broader range of conditional reimbursement recommendations.

There are pressures on reimbursement and HTA agencies to make rapid and clear decisions about approval and reimbursement when a technology is first launched within the respective healthcare system. In response to such pressures in the UK, NICE established a faster process for appraisal, the single technology appraisal (STA) process, to issue guidance closer to the time of marketing authorization. Recommendations including requirements for evidence development may be particularly valuable for technologies appraised early in the product history, when the evidence base is least mature and there may be substantial uncertainty in cost effectiveness. However, it is unclear what impact the introduction of the STA process at NICE has had on the use of OIR/AWR recommendations.

Previous research has identified that NICE in the UK uses some forms of recommendations with evidence development [14–16]. NICE has itself considered some potential issues that could be taken into account when formulating these recommendations through its ‘Citizens Council’ [17]. However, the extent to which the stated criteria employed by NICE have been considered in the formulation of guidance has not been previously examined and no clear guidance has been issued on when NICE advisory bodies should consider recommending research rather than standard ‘accept’ or ‘reject’ decisions. The primary aim of this review was to identify where OIR/AWR recommendations were made or considered in the development of NICE guidance. Secondary aims were to identify the considerations that led to the recommendations for further research, to identify any common characteristics in appraisals including OIR/AWR recommendations and to assess the implementation of the OIR/AWR recommendations based on reviews of published guidance. This review forms part of a larger piece of research to establish an improved framework for formulating approval and research recommendations under uncertainty at NICE [18, 19].

2 Methods

A systematic review of NICE technology appraisal (TA) documents was conducted. The aim of the systematic review was to identify those pieces of guidance where OIR or AWR recommendations were proposed.

2.1 Inclusion and Exclusion Criteria

All NICE TA guidance up to January 2010 was considered for inclusion in the review. This included all draft and final guidance documents. The document containing the Committee’s intended final recommendations is the ‘Final Appraisal Determination’ or FAD. These are made publicly available and can be appealed by specific stakeholders before becoming final guidance to the NHS. In 2002, the NICE process was amended to also publish draft guidance documents in the form of ‘Appraisal Consultation Documents’ or ACDs for public consultation. Where changes in guidance are made following consultation or appeal, there may be multiple ACDs or FADs related to a single appraisal; all versions of the ACDs and FADs were reviewed. The document published as final formal guidance to the NHS is referred to here as the final guidance document.

NICE guidance documents are published in a standardized format with the guidance to the NHS presented in section 1. The rest of the guidance document provides an overview of the evidence, an explanation of how the evidence was interpreted by the Committee, and additional information to assist the implementation of the guidance. Each guidance document usually includes a section detailing key evidence gaps or suggestions for further related research. The guidance is not conditional upon the fulfilment of these recommendations and they do not form part of the mandatory guidance to the NHS, and are therefore not defined as OIR/AWR recommendations for this review. For inclusion, the guidance documents (draft or final) had to refer to requested, ongoing or planned research in the ‘Guidance’ section (section 1) of the documents. The research recommendations could be framed either as OIR or AWR based on the following definitions:

  • OIR: a recommendation stating that the technology should not be used routinely and advocating that further research should be conducted in the Guidance section.

  • AWR: a recommendation stating that the technology should be used routinely and advocating that further research should be conducted in the Guidance section.
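The two definitions above can be read as a simple decision rule. The following Python sketch is illustrative only: the function name `classify_recommendation` and its two boolean inputs are hypothetical labels introduced here and are not part of the data extraction template used in the review.

```python
from typing import Optional


def classify_recommendation(routine_use_recommended: bool,
                            research_in_guidance_section: bool) -> Optional[str]:
    """Return 'OIR', 'AWR', or None for a single recommendation.

    Both arguments are hypothetical flags that a reviewer would set after
    reading section 1 ('Guidance') of a draft or final guidance document.
    """
    # Research mentioned only outside section 1 does not meet the
    # inclusion criterion, so the recommendation is neither OIR nor AWR.
    if not research_in_guidance_section:
        return None
    # Routine use plus research in section 1 is AWR; research without
    # routine use is OIR.
    return "AWR" if routine_use_recommended else "OIR"
```

Under this reading, a recommendation that the technology be used "only within clinical trials" would be coded with `routine_use_recommended=False` and so classified as OIR.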

Only documents that have been made publicly available were included; specifically ACDs for TAs 1–43, except 32, were not made publicly available. Documents that have been publicly released but later removed from the NICE website were included in the review (for example, guidance that has been replaced by a subsequent review), and have been obtained directly from NICE where appropriate. Draft recommendations that request further clarification or analysis from the sponsor of the technology (sometimes referred to as ‘minded no’ recommendations in the STA process) are excluded as they require the reanalysis of existing data rather than additional data collection. Guidance including PASs could be categorized as OIR/AWR providing that the guidance was conditional on the access scheme, and that the scheme contained a requirement for further research or the collection of further data. The documents containing OIR/AWR recommendations were cross-checked with a review of OIR recommendations compiled by NICE to check for potential omissions [16].

2.2 Data Extraction and Analysis

Data from each document that included OIR and/or AWR recommendations were extracted using a template developed specifically for the project. Where recommendations changed between draft and final guidance, explanations for the change were reviewed and assessed. Extracted data included: background details on the appraisal and the technology under consideration, and estimates of cost effectiveness. Also included in the data extraction template was a categorization of potential issues that could have led to the issue of OIR/AWR recommendations. Thematic content analysis of the ‘Committee considerations’ section of the guidance documents was conducted to extract the considerations leading to the recommendations. An initial categorization of themes was developed following a review of the literature conducted in parallel as part of a wider research project. This was amended following a review of a sample of guidance documents. The guidance documents were reviewed to identify the stated reasons for the OIR/AWR recommendation and coded according to one or more of the pre-determined categories. The focus was on extracting the stated rationale for the guidance rather than inferring what the Committee’s considerations could or should have been. The specific items of data extracted are reported in Table 1. Data were extracted by one reviewer (JY) and a sample was cross-checked by another reviewer (LL). The data were analysed to identify common characteristics of appraisals that included OIR and/or AWR recommendations, and to assess if there were differences according to the type of recommendation (OIR or AWR).

Table 1 Data extracted

2.3 Examination of Updated Appraisals

Each piece of NICE guidance is considered for review after a specified time from publication, usually after 3 years. Those OIR/AWR recommendations that have been reviewed by a later appraisal were identified. In order to determine if the OIR/AWR recommendation had been implemented, appraisals reviewing previous guidance were examined for new evidence submitted since the original appraisal specifically relating to the OIR/AWR recommendation. Details of changes to the recommendations in the guidance were also recorded.

3 Results

3.1 Characteristics of Appraisals with OIR and AWR Recommendations

Of the 184 appraisals conducted up to January 2010, 40 included OIR/AWR recommendations in the draft and/or final guidance. A list of all appraisals including OIR and AWR recommendations is provided in the Supplementary Table (Online Resource). Most guidance with OIR was not specific about the research to be conducted and often referred to the technology being ‘not recommended except within clinical trials’ or being ‘recommended only within clinical trials’. For example, guidance on mycophenolate mofetil (MMF) for use in paediatric renal transplantation stated: “The use of MMF in corticosteroid reduction or withdrawal strategies for child and adolescent renal transplant recipients is recommended only within the context of randomised clinical trials” (TA99 [20]). There were some exceptions to this. For example, the OIR guidance on spinal cord stimulation provided more information on the type of information the research should provide: “Spinal cord stimulation is not recommended as a treatment option for adults with chronic pain of ischaemic origin except in the context of research as part of a clinical trial. Such research should be designed to generate robust evidence about the benefits of spinal cord stimulation (including pain relief, functional outcomes and quality of life) compared with standard care” (TA159 [21]).

AWR guidance was routinely more specific about the type of evidence collected. A detailed AWR recommendation was made for the use of inhaled insulin in a subgroup of people: “Data on the use of inhaled insulin according to this guidance should be collected as part of a coordinated prospective observational study. The data collected should include individual patient outcomes, adverse events and measurements of lung function” (TA113 [22]). Similarly guidance on etanercept and infliximab for the treatment of arthritis was prescriptive regarding the data to be collected: “All clinicians prescribing etanercept or infliximab should (with the patient’s consent) register the patient with the Biologics Registry established by the BSR [British Society for Rheumatology] and forward information on dosage, outcome and toxicity on a 6-monthly basis” (TA36 [23]).

Multiple ACDs were issued for some appraisals and the 31 ACDs containing OIR/AWR recommendations relate to 25 appraisals. Of the 31 ACDs, 26 (84 %) included OIR recommendations and five (16 %) AWR recommendations. All of the 29 FADs included in the review relate to a unique appraisal and were all published as final guidance for the appraised technology: 25 (86 %) were OIR recommendations and four (14 %) were AWR recommendations. OIR recommendations were much more common than AWR recommendations. Changes to the inclusion of OIR/AWR recommendations between draft and final guidance were more common than suggested by the summary numbers; only 14 appraisals included OIR/AWR recommendations in both draft and final guidance (ACDs were unavailable for a further 12 appraisals). (See the Supplementary Table [Online Resource] for further details.) Where an OIR/AWR recommendation was removed after consultation, the final guidance usually recommended the technology for all or a specific subgroup of patients.
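The counts reported in this section can be reconciled with a short calculation. The Python sketch below uses only figures stated in Section 3.1; the variable names and the `percent` helper are illustrative and introduced here.

```python
# Counts reported in Section 3.1.
acd_appraisals = 25   # appraisals with an OIR/AWR recommendation in at least one ACD
fad_appraisals = 29   # appraisals with an OIR/AWR recommendation in the FAD
in_both = 14          # appraisals with OIR/AWR in both draft and final guidance
all_appraisals = 184  # appraisals conducted up to January 2010

# Inclusion-exclusion: appraisals with OIR/AWR in draft and/or final guidance.
draft_or_final = acd_appraisals + fad_appraisals - in_both


def percent(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


acd_oir_pct = percent(26, 31)   # OIR share of the 31 ACDs
acd_awr_pct = percent(5, 31)    # AWR share of the 31 ACDs
fad_oir_pct = percent(25, 29)   # OIR share of the 29 FADs
fad_awr_pct = percent(4, 29)    # AWR share of the 29 FADs
final_share_pct = percent(fad_appraisals, all_appraisals)
```

The inclusion-exclusion step gives the 40 appraisals with OIR/AWR recommendations in draft and/or final guidance, and the rounded shares match the percentages quoted in the text.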

A single piece of NICE guidance can include several recommendations related to multiple technologies, multiple indications or different settings for the use of the technology. Over half of the OIR/AWR recommendations specified the need for further research in particular subgroups of patients (52 % of OIR/AWR recommendations in final guidance documents). In approximately a quarter of cases, the OIR/AWR recommendations targeted a subset of the technologies included in the appraisal.

Overall, 16 % of all appraisals included an OIR/AWR recommendation in the final guidance. Table 2 shows a recent decline in the frequency of guidance including OIR/AWR recommendations. There were no apparent differences in the decline between OIR and AWR recommendations. No final guidance included OIR/AWR recommendations in 2007, which is the year following the introduction of the STA process. Differences in the frequency of OIR/AWR recommendations were observed between the two NICE appraisal processes. Of appraisals issued through the multiple technology appraisal (MTA) process, OIR or AWR recommendations were included in draft guidance of 23 appraisals and final guidance of 28 appraisals. These 28 TAs account for 19 % of all final guidance issued within the MTA process. In the STA process, only two ACDs and one piece of final guidance contained OIR/AWR recommendations. This accounts for just 2 % of all final guidance issued through the STA process up to the time the review was conducted.

Table 2 The number of OIR/AWR recommendations by year of publication

The data were examined for differences in the use of OIR/AWR recommendations according to general disease area and the type of technology under appraisal. In absolute terms, OIR/AWR recommendations were more common for cancer treatments, accounting for over a third of all the OIR/AWR recommendations in the final guidance, followed by musculoskeletal conditions (n = 7), which accounted for almost a quarter of cases identified. However, NICE has appraised a large number of treatments for cancer: 28 % of all published appraisals over the review period. Only 7 % of all NICE TA guidance has related to musculoskeletal conditions and so it appears that a disproportionate amount of these have included OIR/AWR recommendations compared with appraisals for other conditions.

Just over half of the appraisals with OIR/AWR recommendations related to drugs (n = 16; 55 %). However, taking into account the total number of TAs published relating to drugs, the use of OIR/AWR recommendations appears to be on average less common for these appraisals: 11 % of all drug appraisals within the period contained OIR/AWR recommendations, compared with 47 % of all guidance on therapeutic or surgical procedures and 27 % of all guidance on devices.

3.2 Cost Effectiveness of Technologies with OIR/AWR Recommendations

NICE requires all appraisals to include an assessment of cost effectiveness usually framed as an incremental cost per QALY. All appraisals that included an OIR/AWR recommendation considered the cost effectiveness of the technologies. Most of the guidance documents reported several different estimates of incremental cost effectiveness based on analyses submitted by different stakeholders, relating to different uses of the technology or based on different sets of assumptions or evidence. However, a formal assessment of cost effectiveness was not always conducted or reported in the ACD or FAD for the use of the technology specified in the OIR/AWR recommendation. Table 3 shows the base-case incremental cost-effectiveness ratios (ICERs) for the overall population and for the specific OIR/AWR indication where this differs. Ideally, the ICER considered most plausible by the Appraisal Committee after reviewing the evidence would be taken to reflect the NICE view of cost effectiveness. As this was not always reported, the base-case estimate from the independent Assessment Group is also reported in the table.

Table 3 ICERs of technologies with OIR/AWR recommendations (in FADs only) [n (%)]

Most of the reported ICERs were higher than the £20,000–30,000 threshold range employed by NICE. Only guidance phrased as AWR reported ICERs within the £20,000–30,000 threshold range for the AWR indication as preferred by the Committee. The two ICERs reported as above this range for AWR guidance and considered plausible by the Committee related to two technologies reviewed within one appraisal and were only marginally above £30,000. In some cases, ICERs were reported but were based on analyses that did not use the QALY as the outcome measure. For example, TA5 on the use of liquid-based cytology reported ICERs of £1,100 and £2,500 per life-year gained depending upon the length of the screening interval [24]. Where ICERs were not directly reported, there was often an indication of whether the technology was considered to be cost effective. For example, TA8 on hearing aid technology stated that: “Whilst it is impossible, on the basis of the present evidence, to estimate meaningful cost-utility ratios … additional spending on this service, if appropriately targeted, has the potential to be highly cost effective.” [25]
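As a reminder of how an ICER is compared with the threshold range, the following Python sketch computes the ratio of incremental cost to incremental QALYs and places it relative to the £20,000–30,000 range. The function names are illustrative and introduced here, and the sketch does not represent how the Appraisal Committee reaches its judgements, which involve considerations beyond a single ratio.

```python
LOWER_THRESHOLD = 20_000  # GBP per QALY gained, lower bound of the range cited in the text
UPPER_THRESHOLD = 30_000  # GBP per QALY gained, upper bound of the range cited in the text


def icer(incremental_cost: float, incremental_qalys: float) -> float:
    """Incremental cost per QALY gained for a technology versus its comparator."""
    if incremental_qalys <= 0:
        # The ratio is not meaningful when the technology yields no QALY gain.
        raise ValueError("incremental QALYs must be positive")
    return incremental_cost / incremental_qalys


def position_relative_to_range(icer_value: float) -> str:
    """Place an ICER below, within, or above the threshold range."""
    if icer_value < LOWER_THRESHOLD:
        return "below range"
    if icer_value <= UPPER_THRESHOLD:
        return "within range"
    return "above range"
```

For example, a hypothetical technology costing an additional £14,000 for 0.5 additional QALYs has an ICER of £28,000 per QALY, which falls within the range.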

3.3 Considerations Leading to OIR/AWR Guidance

The frequency of technologies considered cost effective by the Appraisal Committee when used in the context of the OIR/AWR recommendation is presented in Table 4. In most cases (79 % of FADs with OIR/AWR recommendations), the technology was not cost effective and an OIR recommendation was issued. There were only a couple of cases where OIR recommendations were made for technologies considered likely to be cost effective based on the accepted analyses. These appraisals (TA5 [24] and TA51 [26]) requested that pilot implementation programmes be undertaken prior to the large-scale and routine introduction of the technologies into the NHS. In the small number of cases where AWR recommendations were issued, the technologies were usually considered to be cost effective. In the single exception, the ICER for the technology was higher than, but close to the upper bound of, the usual threshold range and reported to be in the range of £27,000–35,000 per additional QALY gained (TA36 [23]).

Table 4 Type of recommendation and conclusion regarding cost effectiveness (in FADs only) [n]

Table 5 shows the stated rationale for issuing the OIR/AWR recommendations. Of the pieces of final guidance that did not explain the rationale for the OIR/AWR recommendation, three were issued prior to a section on the Committee’s considerations being routinely included in the documents (TA5, TA6, TA17). The OIR in the other appraisal related to three specific subgroups of patients: two were not referred to in the Committee’s considerations at all and it was stated that there was “no clinical or modelling evidence, or expert opinion” to support the use of the technology in the third subgroup (TA75 [27]).

Table 5 Types of reasons for including research recommendations within the guidance (n)

A need for further evidence on the relative effectiveness of the intervention in the overall population or the OIR/AWR subgroup was the most commonly cited reason for issuing the OIR/AWR recommendation (Table 5). Several reasons were cited in support of most recommendations and the need for further evidence on relative effectiveness, either in the overall population or the OIR/AWR subgroup, was cited in 19 (66 %) FADs identified in the review, and most of these included OIR recommendations. Of those citing a lack of sufficient evidence on relative effectiveness, only one included an AWR recommendation (TA113 [22]). This appraisal noted a gap in the evidence on clinical effectiveness for the highly selective subgroup of patients targeted in the recommendations and that the cost-effectiveness estimates were sensitive to these estimates. It recommended that the data would be most appropriately collected through a registry.

There was a greater range of considerations cited for draft AWR recommendations than for the final guidance. All final AWR guidance documents referred to the need for long-term data, and most also referred to a need for additional data on adverse effects. These two considerations were also referred to in a small number of OIR final guidance documents. A need for longer-term data was also frequently cited as a consideration leading to the OIR/AWR guidance. Uncertainty in the cost-effectiveness estimates was a common consideration; however, in all cases this was coupled with a need for further clinical evidence. No guidance (draft or final) cited concern about investment and reversal costs as a rationale for OIR/AWR recommendations. However, TA51 on computerized cognitive behavioural therapy (CCBT) did suggest concerns regarding the levels of training required for the implementation of a recommendation to routinely introduce CCBT into the NHS: “Further information is required about the extent of training needed and circumstances under which different staff could provide support for users of CCBT” (TA51 [26]). Concern about the budget impact of introducing the technology or the potential impact on ongoing research did not lead to the OIR/AWR recommendation in any of the appraisals.

NICE routinely considers the list price of technologies (e.g. as reported in the British National Formulary for drugs) and possible changes in price over time are not usually taken into account. However, a system for considering reductions in the costs of treatment through PASs has been established. None of the OIR/AWR guidance identified within the review included a PAS; however, one appraisal that included an OIR recommendation at an earlier stage of development later included a PAS (TA129). This appraisal included an OIR recommendation in the draft guidance and stated that the technology was not recommended except for use in well-designed clinical studies and that the Committee was not persuaded of its cost effectiveness; however, this was subsequently amended to a ‘reject’ recommendation after concerns were raised about whether the research would be conducted. The final guidance approved the technology following the offer of a PAS, which would reduce the cost of providing treatment. The PAS was designed to offer a rebate to the NHS when patients’ disease responds less than partially to treatment; however, there was no formal requirement for data analysis and reporting beyond the level of rebate and it is therefore not categorized as an OIR recommendation here. In another appraisal, the OIR recommendation was revised to an approval after the Committee revised their estimates of cost effectiveness based on discounted prices of the technology along with further information on quality-of-life improvements (TA166).

Considerations around ethical implications and whether uncertainties in the evidence base would resolve over time were not explicitly stated as reasons for issuing OIR or AWR recommendations. In addition, the relative costs and benefits of conducting research were not reported as considerations of the Committee when formulating its research recommendations.

Most of the appraisals that required relative effectiveness data recommended experimental research designs for evidence collection (Table 6). Two appraisals that cited a need for further evidence on relative effectiveness in the final guidance recommended observational studies due to anticipated difficulties in conducting randomized controlled trials (RCTs) in the specific OIR patient population (TA37) or indication (TA167). There were changes in the recommended type of research between draft and final guidance, which were mainly due to changes in the target OIR/AWR population (e.g. TA68) or the recommendation of a broader type of research (e.g. TA89).

Table 6 The type of research recommended [n (%)]

3.4 Review of Updated Appraisals

Among the OIR/AWR recommendations in the final guidance, ten were updated following a review, including two that were incorporated into clinical guidelines (CGs). Table 7 provides details of the appraisals, whether additional evidence was provided and the change to the OIR/AWR recommendation (new evidence for other recommendations included within the guidance is not noted in the table).

Table 7 New evidence on the OIR/AWR recommendation provided at a review of the guidance

In the majority of reviewed appraisals, new evidence informing the OIR/AWR recommendation was available for the review. In three cases, no new evidence was provided that was specific to the OIR/AWR indication. For the review of TA6, no new RCT data were available for the OIR recommendation, which was made more restrictive in the review guidance. New evidence on clinical effectiveness was not available for the review of TA33, and although new data on adverse effects were provided, they were considered inadequate and no change was made to the OIR recommendation. The OIR recommendation was removed from the review of TA37 despite a lack of new evidence presented. In this case, there had been a reduction in demand for the drug in this setting (it had since become licensed and NICE approved for treatment of an earlier stage of disease) and there were concerns about the feasibility of future data collection.

4 Discussion

This review has found that NICE issued OIR/AWR recommendations in 16 % of its published TA guidance. These recommendations have most frequently taken the form of OIR; however, a handful of recommendations were phrased as AWR. Proportionately more OIR/AWR recommendations were issued for appraisals of procedures and devices than of pharmaceuticals. The most common reason cited for OIR/AWR recommendations was uncertainty regarding relative effectiveness, creating a need for further evidence. Potential investment and reversal costs have not explicitly led to OIR/AWR recommendations.

Some authors have suggested that, along with other criteria, OIR should be used only when the expected net benefit from the technology is likely to be positive [28]. This review has highlighted that the majority of OIR recommendations issued by NICE were for technologies considered unlikely to be cost effective based on the evidence available at the time of the appraisal. Arguably, for an OIR recommendation to be of value to decision makers, it should have the potential to reverse the decision rejecting the claim for reimbursement. None of the guidance identified included an explicit consideration of the likelihood of the technology being cost effective based on the further evidence within the rationale for the OIR recommendation. The review did, however, identify occasional use of OIR recommendations for technologies where the best available evidence suggests that they may be cost effective: early appraisals of liquid-based cytology for cervical cancer screening and CCBT. In both of these cases, the implementation of routine use of the technologies in the NHS could have required substantial infrastructure or training requirements and possibly significant irreversible costs. In both cases, the initial guidance recommended research in the form of ‘pilot implementation projects’. In many cases, the ICER considered most plausible by the Appraisal Committee was not stated; this was more common for OIR than for AWR recommendations. This could be as a result of the need for reassurance that the technologies were cost effective prior to issuing a recommendation that would lead to routine provision of the technology (albeit with research required).

NICE has recently issued a categorization of all of its TA guidance [29]. There are some differences between the NICE categorization and the results of this review owing to differences in the definitions employed. The most notable differences relate to the classification of AWR guidance. NICE does not use the terminology of AWR in its classification system. However, this review has identified a small number of appraisals that apparently fall into this category of approving a technology for use, and also recommending research within the guidance to the NHS. In all of these cases, observational studies and/or data collection through disease registers were recommended and the technologies were considered to be cost effective in most cases. The NICE categorization only refers to final guidance. Of the four pieces of final AWR guidance identified in this review, NICE categorized one as recommended, two as ‘optimized’ and one as OIR. The lack of a formal category of AWR recommendations by NICE most likely reflects its remit, which is to make recommendations on the best use of technologies within the NHS rather than to make recommendations on research to research funders. Despite that, there is clearly ambiguity in the terminology used in the guidance and differences in the interpretation of recommendations for research made within the Guidance sections of the documents.

One striking finding from this review is the decline in the use of OIR/AWR recommendations over the past 5 years, which coincides with the introduction of the STA process in 2006 [30, 31]. Only one appraisal conducted through this process, which is now the most commonly used process for new technologies, included an OIR recommendation in the final guidance, and no STA appraisals included AWR recommendations. At first glance, this may appear counter-intuitive: technologies appraised through this process are usually new and therefore have a more limited evidence base than technologies appraised through the MTA process. However, it could also be that the STA process has started to shift the burden of proof of effectiveness and cost effectiveness onto the manufacturers and sponsors of technologies. Recommendations to the NHS regarding the research of these technologies may then be seen as less relevant. The infrequency of OIR/AWR recommendations within STA guidance could also reflect tighter time and resource constraints in the production of STA guidance.

The rarity of OIR/AWR recommendations in the STA process does not fully account for the reduction in the use of OIR/AWR recommendations over time, as there has also been a decline in their use within the MTA process. This decline could also be linked to an increased opportunity to negotiate on the costs of technologies through PASs; however, there was no evidence of this based on the review and there are usually no specific research requirements within PASs considered by NICE. There have been several other changes to the NICE process within the period covered by this review, including the introduction and update of two important documents underpinning the NICE appraisals: NICE’s Guide to Methods of Technology Appraisal in 2004 [32] and 2008 [9] and NICE’s Social Value Judgements in 2005 [33] and 2008 [10]. It is difficult to assess the direct impact of these documents as a delayed impact following their introduction is likely. There has, however, been no explicit policy change recorded in these documents or elsewhere to explain the decline in OIR/AWR recommendations. It is possible that with increasing experience NICE has found OIR/AWR recommendations to be less useful, or that the lack of a formal link to funders and funding for these recommendations has created difficulty in their implementation. It may also be that with experience NICE has found itself to have insufficient time, resources and expertise to adequately develop and prioritize research recommendations within its guidance.

A recent paper reporting a consensus statement developed by a group of academics and some policy makers from Australia, Canada, the UK and the USA on the use of coverage with evidence development recommended that any such guidance should clearly specify the objective of the recommendation and that this should inform the design of the evidence development, which should also be clearly specified [34]. It also recommended that the design of the evidence development should clearly reflect the healthcare system and its objectives, and that the governance for the research should be independent of vested interests [34]. Whilst these aims are to be applauded, they do not appear to have been widely implemented as yet and no direct formal policy changes in the HTA or reimbursement agencies have been reported (although it is noted that a review of NICE methods in the UK is ongoing). The review presented here has found that the use of OIR/AWR recommendations by NICE meets some of these objectives but still has a way to go. The broad type of research required was usually specified as was the rationale that led to the decision. However, detailed research design requirements and recommendations as to research governance were rarely specified. This likely reflects the lack of a formal process for developing and funding NICE OIR/AWR recommendations.

There have been a few recent developments at NICE, which have increased the potential for further research alongside approval, although their impact remains to be proven. One opportunity has arisen through the supplementary guidance to NICE Committees about technologies used to treat patients at the end of life [35]. This guidance describes criteria for when the Committees should consider departing from the usual criteria for cost effectiveness. In addition, the guidance states that when recommending a treatment under the end-of-life criteria:

[NICE] will normally recommend to the Department of Health that it should give consideration to a data collection exercise for treatment recommended for use on the basis of the criteria set out in section 2. The purpose of this will be to assess the extent to which the anticipated survival gains are evident when the treatments involved are used in routine practice. The outcome of this exercise will be evaluated when the guidance for that treatment is reviewed. [35]

However, in practice the uptake of this recommendation appears to be limited, and in an early review of the policy, it was noted that implementation of such schemes had proven problematic and was likely to be particularly difficult for non-cancer treatments [36]. The second opportunity has arisen through the introduction of PASs in 2009. However, whilst these allow for additional data collection, this is not a formal requirement for all schemes, and most PASs agreed since the process was formally established have been based solely on reducing the cost of treatment, for example through simple price discounts or rebates for some specified cycles of therapy [12]. Finally, NICE has recently issued a guide to assist the production of all of its research recommendations, including those not forming part of the mandatory guidance [11]. It states that relationships with key funders of research in the UK are now integral within the NICE processes and that NICE is proactively exploring further relationships. If this succeeds, it could potentially enhance the use and implementation of OIR/AWR guidance in the future, but it is currently too early to judge.

A limitation of this analysis is its reliance on the documented considerations of the Appraisal Committee in formulating its recommendations; it is unclear whether these summaries, which in some cases are fairly brief, fully reflect all the considerations that led to the recommendations, including those for OIR/AWR. In addition, identification of the ICERs considered most plausible by the Committee was not always possible; the ICERs for the specific OIR/AWR indications were frequently unclear or unavailable from the documentation. However, the clarity of reporting of the ICERs accepted by the Committee appears to have improved over recent years. Finally, focusing on the reviews as an indication of the success of the OIR/AWR recommendations could bias towards a positive finding, as a lack of new evidence could have led to the postponement of planned reviews. However, information from the NICE website suggests that research to potentially inform a review is being conducted in most of the appraisals including OIR/AWR recommendations [29]. Sixteen OIR/AWR appraisals have been considered for review: six have been postponed pending the reporting of ongoing research and a further six are ongoing or scheduled. Only two reviews have been cancelled due to the lack of new evidence and a further two cancelled due to the technology becoming obsolete. Although this review has focused on the TA programme within NICE, the types of considerations in its other guidance programmes are likely to be similar. Further research could examine this empirically within NICE or across other reimbursement agencies.

5 Conclusions

This review has revealed that NICE has used OIR/AWR recommendations since its inception, but these recommendations appear to be on the decline. As a proportion of guidance issued, the use of OIR/AWR recommendations has been more common for appraisals of procedures and devices than for pharmaceuticals. The most commonly cited reason for issuing OIR/AWR guidance has been a need for further evidence on the relative clinical effectiveness of the technology, either for the licensed population or for a subgroup. Consideration of cost effectiveness is routine within the appraisal process, and OIR guidance has mostly been issued for technologies that have not been considered to be cost effective based on the evidence available at the time of the appraisal. Neither the potential impact of a routine approval recommendation on ongoing research nor the risk of incurring irrecoverable costs appears to have explicitly led to any OIR/AWR recommendations. This review has highlighted the characteristics of technologies that are more likely to receive OIR or AWR recommendations, particularly with regard to cost effectiveness, uncertainty in relative effectiveness, and uncertainty about long-term effects or adverse events. The development of a formal policy on the types of considerations that lead to OIR/AWR recommendations at NICE could improve the transparency and predictability of decision making.