Patient-Reported Outcome Measures: Development and Psychometric Evaluation

Biopharmaceutical Applied Statistics Symposium

Part of the book series: ICSA Book Series in Statistics (ICSABSS)

Abstract

This chapter provides an accessible introduction to the development and psychometric evaluation of patient-reported outcome (PRO) measures designed to assess key endpoints in clinical trials, with the ultimate goal of supporting approval and/or labeling claims for pharmaceutical products. Although many of our recommendations apply broadly to the development of PRO measures for clinical trials in any country and for other types of patient-based research (such as observational studies), this chapter focuses primarily on assembling and documenting the types of evidence needed to facilitate reviews of key study endpoints by the United States (US) Food and Drug Administration (FDA).

Acknowledgements

We thank Lauren Nelson for helpful comments on earlier versions of this chapter. In addition, we thank Lindsey Norcross and Jason Mathes for their editorial and graphical support.

Author information

Corresponding author

Correspondence to Lori D. McLeod.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

McLeod, L.D., Fehnel, S.E., Cappelleri, J.C. (2018). Patient-Reported Outcome Measures: Development and Psychometric Evaluation. In: Peace, K., Chen, DG., Menon, S. (eds) Biopharmaceutical Applied Statistics Symposium. ICSA Book Series in Statistics. Springer, Singapore. https://doi.org/10.1007/978-981-10-7829-3_13
