Quality of Life Research, Volume 18, Issue 4, pp 509–518

The precision of health state valuation by members of the general public using the standard gamble

  • Ken Stein (corresponding author)
  • Matthew Dyer
  • Ruairidh Milne
  • Alison Round
  • Julie Ratcliffe
  • John Brazier



Precision is a recognised requirement of patient-reported outcome measures, but no previous studies have examined the precision of methods for obtaining health state values from the general public based on specific health state descriptions or vignettes. The methodological requirements of policy makers internationally are driving growth in the use of methods to obtain utilities from the general public to inform cost per quality-adjusted life-year (QALY) analyses of health technologies being considered for adoption by health systems.


Using an internet panel of members of the UK general public, the precision of five comparisons of treatment outcomes, based on health state descriptions, was assessed against the results of clinical trials that showed a statistically and clinically significant improvement. Health states were developed to depict the baseline and post-treatment states from these exemplar clinical trials. Preferences for health states were obtained over the internet using a bottom-up titrated standard gamble, and the summary health state values corresponding to the treatment and comparator groups within each exemplar study were compared. Results are considered in the context of various estimates for the minimally important difference in utility values.
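
To illustrate the general idea of a bottom-up titrated standard gamble, the following is a minimal sketch and not the study's actual elicitation software. It assumes the usual standard gamble framing (a certain health state versus a gamble offering full health with probability p and death otherwise), a fixed grid of probabilities, and a midpoint rule for the indifference point. The function name `titrate_bottom_up`, the `steps` parameter and the simulated respondent are hypothetical.

```python
from typing import Callable


def titrate_bottom_up(prefers_gamble: Callable[[float], bool], steps: int = 20) -> float:
    """Estimate the utility of one health state on a 0-1 scale.

    The probability p of full health in the gamble starts at 0 and is
    raised in equal increments until the respondent first prefers the
    gamble to remaining in the health state for certain.
    """
    for i in range(steps + 1):
        p = i / steps
        if prefers_gamble(p):
            # Assumption: the indifference point lies between the last
            # rejected probability and the first accepted one, so the
            # midpoint is taken as the utility estimate.
            lower = max(i - 1, 0) / steps
            return (lower + p) / 2
    # The gamble was never preferred, even at p = 1.
    return 1.0


def simulated_respondent(p: float) -> bool:
    """Hypothetical respondent who accepts the gamble once p reaches 0.72."""
    return p >= 0.72


estimate = titrate_bottom_up(simulated_respondent)
```

With a grid of 20 steps the estimate can only resolve utilities to within half a step (0.025), which is one reason the choice of increment matters when small differences between states are of interest.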


Participation among members of the internet panel in the five exemplars ranged from 27 to 59. In four of the five exemplars, the utility-based estimates of treatment benefit showed significant differences between groups and were greater than an assumed minimally important difference of 0.1. Mean utility differences between groups were: 0.23 (computerised cognitive behavioural therapy for depression, P < 0.001), 0.11 (hip resurfacing for hip osteoarthritis, P < 0.001), 0.0005 (cognitive behavioural therapy for insomnia, P = 0.98), 0.15 (pulmonary rehabilitation for COPD, P < 0.001) and 0.11 (infliximab for Crohn’s disease, P < 0.001). The confidence intervals around the estimates of utility-based treatment effect in three of the five examples did not exclude the possibility of a difference smaller than a minimally important difference of 0.1. Recent empirical evidence suggests a lower minimally important difference (0.03) may be more appropriate, in which case our results provide further reassurance of preservation of precision in health state description and valuation.
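
The comparison of a confidence interval with a minimally important difference (MID) can be sketched as below. This is an illustration under stated assumptions, not the analysis used in the study: it assumes each respondent valued both the treatment and comparator states (so differences are paired), uses a normal approximation for the interval (a t-based interval would be somewhat wider for small panels), and runs on made-up values. The names `mean_diff_ci`, `excludes_smaller_than_mid` and `hypothetical_diffs` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_diff_ci(diffs, level=0.95):
    """Return (mean, lower, upper) for the mean of paired utility differences.

    Assumption: a normal approximation is used for the interval.
    """
    n = len(diffs)
    m = mean(diffs)
    se = stdev(diffs) / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return m, m - z * se, m + z * se


def excludes_smaller_than_mid(diffs, mid):
    """True if the whole interval lies above the MID, so that a
    difference smaller than the MID is excluded."""
    _, lower, _ = mean_diff_ci(diffs)
    return lower > mid


# Hypothetical per-respondent differences (treatment minus comparator);
# these are NOT data from the study.
hypothetical_diffs = [0.10, 0.20, 0.15, 0.05, 0.25, 0.10, 0.20, 0.15]

m, lower, upper = mean_diff_ci(hypothetical_diffs)
for mid in (0.1, 0.03):
    print(mid, excludes_smaller_than_mid(hypothetical_diffs, mid))
```

The same interval can pass or fail depending on the MID chosen, which is why the definition of the MID bears directly on any judgement about precision.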


The precision of estimates of treatment effects based on preference data obtained from disease-specific measurements in clinically significant studies of health technologies was acceptable using an internet-based panel of members of the general public and the standard gamble. Definition of the minimally important difference in utility estimates is required to adequately assess precision and should be the subject of further research.


Keywords: Utility, Preferences, Internet, Public, Precision




NHS R&D Programme; National Institute for Health and Clinical Excellence (NICE); NHS Quality Improvement Scotland (NHSQIS). We are extremely grateful to the following for their help: the members of the internet panel, the patients and clinicians who provided help in the development of health state descriptions, Joanne Perry for her project support, Dan Fall (University of Sheffield) and Stephen Elliott (Llama Digital) for website development.

Competing interests


Authors’ contributions

K.S., R.M., J.B. and A.R. conceived the study and, with J.R., designed the evaluation. M.D. developed some of the health state descriptions and contributed to data collection. All authors contributed to the drafting of this report.



Copyright information

© Springer Science+Business Media B.V. 2009

Authors and Affiliations

  • Ken Stein (1, corresponding author)
  • Matthew Dyer (1)
  • Ruairidh Milne (2)
  • Alison Round (1)
  • Julie Ratcliffe (3)
  • John Brazier (3)

  1. Peninsula Technology Assessment Group, Peninsula Medical School, University of Exeter, Exeter, UK
  2. University of Southampton, Southampton, UK
  3. University of Sheffield, Sheffield, UK
