Measuring individual true change with PROMIS using IRT-based plausible values

  • Special Section: Methodologies for Meaningful Change
Quality of Life Research

Abstract

Aims

A primary advantage of IRT-based patient-reported outcome measures such as PROMIS short forms and computer-adaptive tests is that each estimate of the latent trait comes with a standard error. Such measurement error needs to be acknowledged, in particular when monitoring individual patients over time. In this study, we use plausible values to account for measurement error and analyze the probability of true within-individual change.

Methods

We used data from a longitudinal, observational study of stable and exacerbated COPD patients (N = 185) who provided PROMIS Physical Function and Fatigue T-scores over 3 months. At each measurement occasion, we imputed 1000 plausible values from each score's posterior distribution. These were then used to calculate the probability of true change relative to a pre-specified threshold, such as a minimally important difference supported by the literature, or \(\Delta T\text{-score} > 0\). We demonstrate the assessment of change in individuals and in groups, across different measures (short forms and CATs), and at various levels of confidence.
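The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the posterior at each occasion is approximated by a normal distribution centred on the T-score with the reported standard error as its standard deviation, the two occasions are treated as independent, and the function name `prob_true_change` and the example scores are hypothetical.

```python
import numpy as np


def prob_true_change(t1, se1, t2, se2, threshold=0.0, n_draws=1000, seed=0):
    """Estimate P(true change > threshold) from plausible values.

    Assumption (not from the source): each posterior is Normal(T-score, SE)
    and the two measurement occasions are independent.
    """
    rng = np.random.default_rng(seed)
    pv1 = rng.normal(t1, se1, n_draws)  # plausible values at occasion 1
    pv2 = rng.normal(t2, se2, n_draws)  # plausible values at occasion 2
    # Share of paired draws whose difference exceeds the threshold
    return float(np.mean(pv2 - pv1 > threshold))


# Hypothetical patient: T-score rises from 45 to 52, SE = 3 at both occasions
p_any_increase = prob_true_change(45.0, 3.0, 52.0, 3.0, threshold=0.0)
p_above_mid = prob_true_change(45.0, 3.0, 52.0, 3.0, threshold=2.0)
print(p_any_increase, p_above_mid)
```

For a measure where lower scores mean improvement, such as Fatigue, the probability of a decrease can be obtained by swapping the two occasions in the call. A stricter threshold can only lower the estimated probability, because both calls reuse the same seed and therefore the same draws.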

Results

Using plausible value imputation and with 95% certainty, 47.5% of participants in the exacerbated group reported less fatigue, compared with 26.5% of participants in the stable group. A comparison of short forms and CATs suggests that CATs detect change better than short forms. We also illustrate the method using an individual's probability of change at different time points.
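A group-level summary of this kind can be sketched as follows: compute each participant's probability of change, then report the share of participants whose probability reaches the chosen certainty level. The sketch again assumes independent normal posteriors; the function name `proportion_changed`, the `direction` argument, and the three patients are made up for illustration and are not taken from the study data.

```python
import numpy as np


def proportion_changed(t1, se1, t2, se2, threshold=0.0, certainty=0.95,
                       direction=-1, n_draws=1000, seed=0):
    """Return per-person change probabilities and the share reaching `certainty`.

    direction=-1 counts decreases (e.g. less fatigue); +1 counts increases.
    Assumption (not from the source): independent Normal(T-score, SE) posteriors.
    """
    rng = np.random.default_rng(seed)
    t1, se1, t2, se2 = (np.asarray(a, dtype=float) for a in (t1, se1, t2, se2))
    # One row per draw, one column per person
    pv1 = rng.normal(t1, se1, size=(n_draws, t1.size))
    pv2 = rng.normal(t2, se2, size=(n_draws, t2.size))
    p_change = np.mean(direction * (pv2 - pv1) > threshold, axis=0)
    return p_change, float(np.mean(p_change >= certainty))


# Three hypothetical patients with Fatigue T-scores at two occasions
p, share = proportion_changed(
    t1=[60.0, 60.0, 60.0], se1=[3.0, 3.0, 3.0],
    t2=[48.0, 59.0, 60.0], se2=[3.0, 3.0, 3.0],
)
print(p, share)
```

With these made-up numbers only the first patient, whose score drops by 12 points, clears the 95% bar, so the reported share is one in three.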

Conclusion

Plausible values offer a flexible way to include measurement error in analyses of individuals and at the sample level. Assessing the probability of true change can complement existing distribution-based approaches and facilitates the interpretation of improvement or decline.
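The link to distribution-based approaches can be made explicit with a short worked equation. It rests on an assumption that the abstract does not state: the posteriors at the two occasions are independent and approximately normal, with means \(\hat{T}_1, \hat{T}_2\) and standard deviations \(SE_1, SE_2\). The symbols \(\delta\) (threshold), \(M\) (number of plausible values), and \(\theta_t^{(m)}\) (the \(m\)-th plausible value at occasion \(t\)) are notation introduced here for illustration.

```latex
\[
P(\theta_2 - \theta_1 > \delta \mid \text{data})
\;\approx\; \frac{1}{M}\sum_{m=1}^{M}
  \mathbf{1}\!\left[\theta_2^{(m)} - \theta_1^{(m)} > \delta\right]
\;\approx\; \Phi\!\left(
  \frac{(\hat{T}_2 - \hat{T}_1) - \delta}{\sqrt{SE_1^2 + SE_2^2}}
\right)
\]
```

The first approximation is the Monte Carlo estimate from plausible values; the second holds only under the normality and independence assumptions. With \(\delta = 0\), requiring 95% certainty amounts to \((\hat{T}_2 - \hat{T}_1)/\sqrt{SE_1^2 + SE_2^2} > 1.645\), which has the same form as a one-sided reliable change criterion.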

Data availability

Data for this study are publicly available in the Harvard HealthMeasures Dataverse at https://doi.org/10.7910/DVN/UOQNJF.

Code availability

The code to reproduce the analyses is available on the Open Science Framework (https://osf.io/dbgnv/files/osfstorage).

Funding

The second author is partially funded by National Science Foundation Grant ECR-1760491.

Author information

Corresponding author

Correspondence to Emily H. Ho.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors and uses publicly available data.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 20 KB)

About this article

Cite this article

Ho, E.H., Verkuilen, J. & Fischer, F. Measuring individual true change with PROMIS using IRT-based plausible values. Qual Life Res 32, 1369–1379 (2023). https://doi.org/10.1007/s11136-022-03264-2

  • DOI: https://doi.org/10.1007/s11136-022-03264-2
