1 Introduction

Ten years have passed since the publication of the EQ-5D-5L descriptive system [1] and five years since a value set for England was first published [2]. These years have been squandered by a protracted and misguided debate in the UK, culminating in the National Institute for Health and Care Excellence (NICE) rejecting the published value set [3]. In this editorial, I provide one outsider's view of the debacle and present some ideas for discussion that might inform future decisions about new methods.

2 A Little History

Publication of the final (peer-reviewed) version of the value set in August 2017 [3] triggered a flurry of editorials and commentaries [4, 5, 6]. Contributors to the debate varied in their perspectives on whether or not NICE should recommend the use of the 5L value set, but all were broadly supportive of a ‘pause’ [4] and of the ‘understandable care’ [5] and ‘caution’ [6] of NICE's approach [7].

A team was appointed to develop a new value set for the UK, and there is, as far as I understand it, a clear roadmap to the adoption of the EQ-5D-5L. In the interim, NICE has recommended methods for mapping from 5L data to the old 3L value set. But sticking plasters can themselves be replaced; NICE recently proposed an alternative to its earlier recommended mapping function [8, 9].
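
For readers unfamiliar with what such a mapping involves, the sketch below illustrates the general shape of the exercise: a 5L profile (five dimensions, each scored from level 1, no problems, to level 5, extreme problems) is converted to a 3L-equivalent utility. The simple additive form and the decrement values are hypothetical placeholders of my own; the mappings actually recommended by NICE [8, 9] are lookup tables and statistical models fitted to valuation data.

```python
# Illustrative sketch only. The recommended mappings [8, 9] are derived from
# statistical models fitted to real valuation data; the additive form and the
# decrements below are hypothetical placeholders.

# Hypothetical utility decrement per level above 1, for each EQ-5D dimension.
HYPOTHETICAL_DECREMENTS = {
    "mobility": 0.05,
    "self_care": 0.04,
    "usual_activities": 0.04,
    "pain_discomfort": 0.06,
    "anxiety_depression": 0.05,
}


def mapped_3l_utility(profile: dict) -> float:
    """Map a 5L profile (dimension -> level 1-5) to a 3L-equivalent utility."""
    utility = 1.0  # full health anchored at 1
    for dimension, level in profile.items():
        if not 1 <= level <= 5:
            raise ValueError(f"{dimension} level must be 1-5, got {level}")
        utility -= HYPOTHETICAL_DECREMENTS[dimension] * (level - 1)
    return utility


if __name__ == "__main__":
    # A respondent with slight mobility problems, moderate problems with usual
    # activities, and slight pain/discomfort.
    profile = {
        "mobility": 2,
        "self_care": 1,
        "usual_activities": 3,
        "pain_discomfort": 2,
        "anxiety_depression": 1,
    }
    print(f"Mapped 3L-equivalent utility: {mapped_3l_utility(profile):.3f}")
```

Nothing here should be read as the recommended crosswalk itself; the point is only that mapping substitutes a modelled 3L score for a directly valued 5L one.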

The new valuation study and the various mapping function recommendations are second-best solutions to a problem of NICE's making. But NICE is not the only culprit here. The events highlight some troubling tendencies in our collective attitude to methodological progress in health technology assessment (HTA). Based on the advice of numerous academics, NICE has adopted a pernicious kind of status quo bias, born of a misunderstanding of what it means to deliver consistent decision making.

3 Conceptual Consistency

NICE publishes methodological guidance for the conduct of technology appraisals [10]; it will soon publish a new manual. Health economists in the UK (and further afield) take this seriously. Guidance facilitates consistency. Without consistency in the approach used by analysts, results would be less comparable and NICE's job of supporting an efficient allocation of resources would become more difficult.

The scope of NICE's decision making should determine the appropriate methods to be employed. It would be an error to conflate the underlying objectives and scope of decision making with the specific methods. The old distinction between normative and positive economics may be helpful here; “the positive deals with evaluating means while the normative deals with evaluating ends” [11]. EQ-5D value sets represent means, not ends.

When NICE reviews its methods guidance, as it has been doing for the past year or so [12], it considers matters more fundamental than the latest technical developments in methodology. Certain distinctions that appear methodological may be more appropriately framed as conceptual or normative. Examples relevant to health state valuation include the sources of values (e.g. ‘public’ vs ‘patient’ values [13]) and the objects of value (e.g. whether to consider outcomes ‘beyond health’ [14]).

Let us consider the example of patient values or preferences. NICE recommends the use of public values in the estimation of health state utility values. Other HTA agencies make other recommendations. There are sound reasons to favour either approach (or both, or an alternative). Importantly, the two approaches do not differ by mere measurement error or bias; they rest on fundamentally different conceptual bases and ethical justifications. A shift from one to the other is rightly contentious and contested in the literature; no approach can ever be proven superior.

HTA agencies may reasonably shift from one approach to another. However, this would render appraisals under each conceptual regime incomparable. For example, if NICE recommended using patient preferences to value health states, it would not be possible to judge the value of technologies historically evaluated on the basis of societal values. Past decisions could be deemed ‘wrong’ based on the current scope of decision making. Thus, there is merit in HTA agencies supporting conceptual consistency through time. There may be a need to make changes in light of new thinking, but conceptual consistency ensures that decisions about care provided by the National Health Service (NHS) can be justified based on current priorities. If the conceptual basis for the HTA process changes, then the provision of technologies evaluated on the basis of retired concepts is (to a greater or lesser extent) undermined.

This has little to do with what I would typically call ‘methodology’.

4 Methodological Fluidity

Quantitative and qualitative methods are subject to constant development. Undoubtedly, new methods require testing; researchers must assess the extent to which methods are valid in identifying the truth that we seek. This applies to methods for health state valuation and HTA more broadly. Recent decades have seen extensive development in methods for discrete choice experiments [15], eliciting values for states ‘worse than dead’ [16], and various other tweaks specific to EQ-5D valuation [17].

It is reasonable to make methodological recommendations based on the latest research and developments. Changes in such recommendations need not be considered in the same way that a conceptual shift, such as a change in the source of valuation, must be considered. Relevant methods, in this case, may relate to preference elicitation, including data collection strategies and methods for modelling data. It is on these grounds that the EQ-5D-5L value set for England faced criticism [18].

The favourable properties of the EQ-5D-5L descriptive system, compared with the 3L, have been demonstrated extensively [19]; the 5L describes 3125 distinct health states against the 3L's 243, reducing ceiling effects and improving sensitivity. This should not come as a surprise; researchers developed the 5L precisely because a long history of methodological work suggested that this would be the case. NICE's apprehensions lie with the value set. Here too, and for similar reasons, the more modern methods used for preference elicitation offer substantial advantages [17, 20, 21].

5 A Different Kind of Decision Problem

When assessing technologies, a necessary step is to identify a reasonable comparator. In most cases, alternatives should be compared with the status quo. NICE has failed to recognise the pertinence of this principle in the adoption of new methods.

The decision to reject the 5L value set seemingly gave no consideration to the alternative, which is to continue using the Measurement and Valuation of Health (MVH) study 3L value set developed in the early nineties [22]. Any criticism of the 5L value set for England is irrelevant unless the same criticism cannot be levelled at the MVH value set. Show me one defence of the 3L value set in comparison with the 5L value set! And yet, here we find ourselves, using the 3L value set.

Normative statements cannot be tested. The validity of methods to derive a value set can be tested. However, they cannot—or at least should not—be tested against some unattainable perfection. In lieu of a methodological ‘gold standard’ (as noted by Werner Brouwer and Denzil Fiebig in their expert reviews of the value set [23, 24]), perfection became the comparator. Show me a quantitative study without error or room for improvement! This was a predictable problem for the ‘quality assurance’ process [25].

The methods that we employ must be evaluated against the next best alternative. Perversely, methodological developments since the 5L value set for England study was conducted have been used against it. On this basis, as methods are continually refined, newly published value sets will always fall victim to future improvements.

The 5L saga is a textbook example of the perfect being the enemy of the good. In the context of an HTA system that is heavily driven by cost per quality-adjusted life-year (QALY) analyses, it has likely cost us dearly.
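
To make the stakes concrete, the toy calculation below (with invented cost and effect figures) shows how the same treatment effect can produce different cost-per-QALY results, and potentially a different verdict against NICE's conventional £20,000-£30,000 per QALY range, purely because of the value set used to score health states.

```python
# Toy illustration of why the choice of value set matters in a cost-per-QALY
# framework. All cost, duration, and utility figures are invented.

def icer(delta_cost, delta_qalys):
    """Incremental cost-effectiveness ratio: extra cost per QALY gained."""
    return delta_cost / delta_qalys

extra_cost = 12_000.0    # extra cost of the new treatment (hypothetical)
duration_years = 2.0     # duration of the health improvement (hypothetical)

# The same pre-treatment health state scored with two hypothetical value sets.
baseline_utilities = {"value set A": 0.60, "value set B": 0.75}

for label, baseline_utility in baseline_utilities.items():
    # QALYs gained: utility improvement (to full health) multiplied by duration.
    qalys_gained = (1.0 - baseline_utility) * duration_years
    print(f"{label}: {qalys_gained:.2f} QALYs gained, "
          f"ICER = £{icer(extra_cost, qalys_gained):,.0f} per QALY")
```

In this contrived case, one value set yields an ICER of £15,000 per QALY and the other £24,000 per QALY; the technology and the evidence are identical, and only the scoring of health states differs.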

6 Concluding Remarks

The NHS is habitually mocked for using fax machines and Windows XP because better technologies are available. Could this be a distinctly British problem?

NICE will soon publish a new manual for HTAs and, based on the most recent draft, it is likely to maintain its recommendation against using the EQ-5D-5L [8]. The guidance that NICE provides for methods of technology assessment, and the process for revising these recommendations, must recognise that conceptual consistency and methodological fluidity are compatible principles. And researchers should cease resting on their laurels.

NICE’s reference case should be revised regularly, including explicit reconsideration of the recommended measurement instrument to estimate QALYs. Where there have been methodological advances within a conceptually consistent framework, new measures should be recommended. In some cases, it may be appropriate for NICE to reconsider the conceptual basis for outcome measurement, which would require a normative case to be made.

That the EQ-5D-5L represents methodological developments should not act as a barrier to its use by NICE. Results derived using the 5L value set will differ from those derived using the 3L value set [26], but this is not relevant to the decision problem so long as both seek to measure the same thing and are conceptually consistent. While we await the new UK value set, NICE should recommend the 5L value set for England. Axe the fax! To hell with the 3L!