Special Edition on Utility Measurement, PharmacoEconomics
The authors of this special series of papers about utility estimation have provided a very useful overview of the field. The papers have emerged from a collaboration between academics at Sheffield University and experts from industry (at Takeda Pharmaceuticals). This is clearly a very productive partnership, partly because it combines and balances academic rigour and industry practicalities. This is an informative set of papers that will be useful for anyone working in the fields of pharmacoeconomics and health technology assessment. This collection of papers also adds to the existing literature by providing an update on some important issues. For example, the paper on international guidelines is very useful because these issues can change quite quickly [1]. The current conclusions on the role of different types of measures are also an important update [2, 3].
Reading this overview of the field prompts thoughts about what the field has been able to achieve, what challenges lie ahead of us and where the use of preference-based measures of health (PBMs) has done little to help decision makers. I will go through some thoughts on these issues in turn.
One interesting general conclusion is that the generic PBMs remain in a dominant position as the choice of outcome measure. The team conclude that condition-specific PBMs have a role to play where there is good evidence that generic measures are not able to measure changes in health. If there is no evidence for insensitivity in generic measures then the results from the condition-specific measure should be limited to sensitivity analyses (in a decision model). Generic measures, particularly the EQ-5D, have been criticized as insensitive and unable to measure the obvious benefits of different treatments. For example, one recent study claimed that EQ-5D-3L lacks sensitivity for measurement in cardiovascular disease [5]. In recent years, there has been huge interest in the development of condition-specific measures to improve measurement accuracy. However, for many condition-specific measures, there is no evidence for a measurement advantage over tools like EQ-5D [6, 7]. In addition, some condition-specific measures also have limitations regarding their ability to measure complications or adverse events. Condition-specific measures have not offered the advantages that were anticipated. Many researchers are now interested in the potential for bolt-on measures to improve measurement accuracy. Here the generic measure has an additional dimension added to it to reflect a specific issue that is not measured in the original descriptive system (e.g. vision, hearing or itch). The lessons from condition-specific PBMs suggest that there needs to be a clear rationale for the development of any bolt-on, which should be based on measurement insensitivity of the original instrument in a disease area. This rationale needs to be based on evidence. The lessons from the development of condition-specific measures suggest that there may be no a priori reason to believe that a bolt-on instrument will provide greater measurement sensitivity.
While a lot of progress has been made in the development and use of PBMs, we still face important challenges. Much of the academic research in this field has focused on the development and assessment of PBMs rather than the practicalities of including them in studies. These data are primarily used to support the development of decision models. Many models, however, are gross simplifications of clinical reality. This may be perceived as adequate for decisions, but oversimplification and the use of assumptions may bias our interpretation of the benefits of a treatment. It is also worth reflecting both on how unrepresentative clinical trial data can be and on what is, in my view, the mistaken belief that data from randomised trials should be at the pinnacle of a methods hierarchy. Study entry criteria, protocol-driven care procedures and placebo effects are probably all sources of considerable bias in the collection of subjective endpoints in clinical trials. Clinical trials are typically international studies that include patients from many countries. Their EQ-5D scores may be weighted by UK preferences for submissions to NICE, but to what extent are the data from different countries representative of UK patients? There is very little understanding of how representative clinical trial health-related quality-of-life (HRQOL) data are of routine clinical practice anywhere. We also know very little about what is actually happening during data collection at clinical sites because it is not routinely reported. Do patients complete forms on their own or do research staff read the questions out to them? Do research staff decide not to bother certain patients with questionnaire completion for whatever reason? If patients complete other assessments on the same day, does this influence HRQOL scores? These influences are largely ignored, but they are sources of potential error. There is a lot of work to do to understand how we can capture HRQOL data while avoiding these issues.
Technological innovations are part of the solution. More data, routinely collected data and data from more meaningful timepoints are also part of the solution.
Lastly, we should reflect on areas where PBMs are failing to help decision makers. Aggregating health or HRQOL data to estimate a QALY gain may be sensible for prevalent chronic disease but seems less suited to assessing rare diseases. Our methods have been somewhat blindsided by the huge increase in orphan drugs. Some of the most challenging decisions that payers face relate to the adoption of orphan drugs. Submissions often describe the benefits of a treatment using poor-quality data (small samples, no comparator data) and predict huge health gains (perhaps 20–30 QALYs per patient); the treatments are very high cost and potentially offer very high value, but also represent a huge opportunity cost. Our methods of QALY estimation are often not able to provide decision makers with accurate estimates of the health gain associated with these treatments. Our measures are not valid for use in young children (where such conditions often emerge); the valuation of such states presents philosophical and practical challenges; and measurement methods that are available to us typically require a proxy assessment (with inherent bias). Even in older children and adults, where tools do exist, the imprecision of utility estimates makes any estimate of QALY gain hugely uncertain where we have very small trials. This is a challenge that the field must address.
Compliance with Ethical Standards
This article is published in a special edition journal supplement wholly funded by Takeda Pharmaceutical International AG, Zurich, Switzerland. The author has not received any payment from Takeda and has no known potential conflicts of interest.
1. Rowen D, Azzabi Zouraq I, Chevrou-Severac H, van Hout B. International regulations and recommendations for utility data for health technology assessment. Pharmacoeconomics. doi: 10.1007/s40273-017-0544-y. (Current issue).
2. Rowen D, Brazier J, Ara R, Azzabi Zouraq I. The role of condition-specific preference-based measures. Pharmacoeconomics. doi: 10.1007/s40273-017-0546-9. (Current issue).
3. Brazier J, Ara R, Rowen D, Chevrou-Severac H. A review of generic preference-based measures for use in cost-effectiveness models. Pharmacoeconomics. doi: 10.1007/s40273-017-0545-x. (Current issue).
5. Kularatna S, Byrnes J, Chan YK, Carrington MJ, Stewart S, Scuffham PA. Comparison of contemporaneous responses for EQ-5D-3L and Minnesota Living with Heart Failure; a case for disease specific multiattribute utility instrument in cardiovascular conditions. Int J Cardiol. 2017;15(227):172–6.