Quality of Life Research, Volume 2, Issue 6, pp 441–449

Psychometric considerations in evaluating health-related quality of life measures

  • R. D. Hays
  • R. Anderson
  • D. Revicki
Research Papers


How does one determine if a measure of health-related quality of life (HRQL) is adequate for clinical trials? Psychometric methods are frequently used to answer this question. What is psychometrics all about? In this paper we address these questions, discussing common psychometric evaluation procedures applied to HRQL measures. Specifically, we discuss issues regarding the evaluation of reliability and validity (including responsiveness).

Key words

Reliability, validity, responsiveness





Copyright information

© Rapid Communications of Oxford Ltd 1993

Authors and Affiliations

  • R. D. Hays (1)
  • R. Anderson (2)
  • D. Revicki (3)

  1. Social Policy Department, RAND, Santa Monica, USA
  2. Bowman Gray School of Medicine, Winston-Salem, USA
  3. Battelle Medical Technology Assessment and Policy Research Center, Washington, DC, USA
