Patient-reported outcome (PRO) analyses often involve calculating raw change scores, but limitations of this approach are well documented. Regression estimators can incorporate information about measurement error and potential covariates, potentially improving change estimates. Yet, adoption of these regression-based change estimators is rare in clinical PRO research.
Both simulated and PROMIS® pain interference items were used to calculate change employing three methods: raw change scores and regression estimators proposed by Lord and Novick (LN) and Cronbach and Furby (CF). In the simulated data, estimators’ ability to recover true change was compared. Standard errors of measurement (SEM) and estimation (SEE) with associated 95% confidence limits were also used to identify criteria for significant improvement. These methods were then applied to real-world data from the PROMIS® study.
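To make the comparison concrete, the following is a minimal sketch of a regression-based change estimator under classical test theory. It is not the authors' code and does not reproduce the exact LN or CF formulas. It assumes uncorrelated measurement errors, known reliabilities, and observed scores with unit variance. The function names (`simulate_scores`, `regression_change`) and all parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np

def simulate_scores(n, rel=0.8, true_corr=0.7, mean_change=0.3, seed=0):
    """Simulate true and observed scores at two time points (illustrative values)."""
    rng = np.random.default_rng(seed)
    # Observed variance is fixed at 1, so the true-score variance equals the reliability.
    cov = rel * np.array([[1.0, true_corr], [true_corr, 1.0]])
    true = rng.multivariate_normal([0.0, mean_change], cov, size=n)
    err = rng.normal(0.0, np.sqrt(1.0 - rel), size=(n, 2))
    return true, true + err

def regression_change(obs, rel1, rel2):
    """Best linear predictor of true change from both observed scores.

    Assumes uncorrelated errors, so that cov(T1, T2) = cov(X1, X2)
    and var(T_t) = rel_t * var(X_t).
    """
    mu = obs.mean(axis=0)
    S = np.cov(obs, rowvar=False)
    # Covariance of true change (T2 - T1) with X1 and with X2.
    c = np.array([S[0, 1] - rel1 * S[0, 0], rel2 * S[1, 1] - S[0, 1]])
    w = np.linalg.solve(S, c)
    return (mu[1] - mu[0]) + (obs - mu) @ w

def rmse(estimate, target):
    return float(np.sqrt(np.mean((estimate - target) ** 2)))

true, obs = simulate_scores(5000)
true_change = true[:, 1] - true[:, 0]
raw = obs[:, 1] - obs[:, 0]
reg = regression_change(obs, 0.8, 0.8)

print("RMSE raw change:       ", rmse(raw, true_change))
print("RMSE regression change:", rmse(reg, true_change))
print("SD raw / SD regression:", raw.std(), reg.std())
```

Under these assumptions, the regression estimate shrinks each person's change toward the group mean change, which is why its distribution is less dispersed than that of the raw differences.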
In the simulation, both regression estimators reduced variability by almost half compared with raw change scores. Compared to CF, the LN regression better recovered true simulated differences. Analysis of the PROMIS® data showed similar patterns: change score distributions from the regression estimators showed less dispersion. Using distribution-based approaches to calculate thresholds for significant within-patient change, smaller changes could be detected using both regression estimators.
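As an illustration of the distribution-based quantities involved, the sketch below computes the textbook classical test theory SEM and SEE, along with a 95% threshold for a raw change score. The threshold assumes equal SEMs and independent errors at the two time points. The score standard deviation of 10 (the T-score metric) and the reliability of 0.90 are assumed values, not figures from the study, and the helper names are hypothetical.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - r)."""
    return sd * math.sqrt(1.0 - reliability)

def see(sd, reliability):
    """Standard error of estimation: SD * sqrt(r * (1 - r))."""
    return sd * math.sqrt(reliability * (1.0 - reliability))

def raw_change_threshold(sd, reliability, z=1.96):
    """95% threshold for a raw difference, assuming equal SEMs and independent errors."""
    return z * math.sqrt(2.0) * sem(sd, reliability)

sd, rel = 10.0, 0.90  # assumed values for illustration only
sem_val = sem(sd, rel)
see_val = see(sd, rel)
threshold = raw_change_threshold(sd, rel)

print(f"SEM = {sem_val:.2f}, 95% limits = +/-{1.96 * sem_val:.2f}")
print(f"SEE = {see_val:.2f}, 95% limits = +/-{1.96 * see_val:.2f}")
print(f"Raw change threshold = {threshold:.2f}")
```

Because SEE equals SEM multiplied by the square root of the reliability, it is always the smaller of the two when reliability is below 1. This is consistent with regression-based scores supporting narrower confidence limits.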
These results suggest that calculating change using regression estimates may increase measurement sensitivity. Using these scores in lieu of raw differences can help better identify individuals who experience real underlying change in PROs over the course of a trial, and can enhance established methods for identifying thresholds for meaningful within-patient change in PROs.
The PROMIS® 1 Wave 2 Pain Depression dataset can be requested here: https://doi.org/10.7910/DVN/ZDIITC.
Computerized adaptive testing
Cronbach & Furby (complete estimator)
Classical test theory
Graded response model
Lord & Novick
Multivariate normal distribution
Patient-reported outcome measurement information system
Standard error of measurement
Standard error of prediction
U.S. Food and Drug Administration. (2019). Patient-focused Drug Development Guidance Public Workshop - Discussion document: Incorporating clinical outcome assessments into endpoints for regulatory decision-making. Retrieved from https://www.fda.gov/media/132505/download
Coon, C. D., & Cook, K. F. (2018). Moving from significance to real-world meaning: Methods for interpreting change in clinical outcome assessment scores. Quality of Life Research, 27, 33–40.
U.S. Food and Drug Administration. (2018). Patient-Focused Drug Development Guidance Public Workshop: Methods to identify what is important to patients and select, develop or modify fit-for-purpose clinical outcomes assessments.
Kim-Kang, G., & Weiss, D. J. (2008). Adaptive measurement of individual change. Zeitschrift für Psychologie / Journal of Psychology, 216, 49–58.
Lord, F. M. (1958). Further problems in the measurement of growth. Educational and Psychological Measurement, 18, 437–451.
Lord, F. M. (1956). The measurement of growth. ETS Research Bulletin Series, 1956, i–22.
McNemar, Q. (1958). On growth measurement. Educational and Psychological Measurement, 18, 47–55.
Cronbach, L. J., & Furby, L. (1970). How we should measure change–or should we? Psychological Bulletin, 74, 68–80.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Addison-Wesley Pub Co, Reading.
Cascio, W. F., & Kurtines, W. M. (1977). A practical method for identifying significant change scores. Educational and Psychological Measurement, 37, 889–895. https://doi.org/10.1177/001316447703700411
Cella, D., Riley, W., Stone, A., Rothrock, N., Reeve, B., Yount, S., Amtmann, D., Bode, R., Buysse, D., Choi, S., Cook, K., Devellis, R., Dewalt, D., Fries, J. F., Gershon, R., Hahn, E. A., Lai, J. S., Pilkonis, P., Revicki, D., … Hays, R. (2010). The patient-reported outcomes measurement information system (PROMIS) developed and tested its first wave of adult self-reported health outcome item banks: 2005–2008. Journal of Clinical Epidemiology, 63, 1179–1194. https://doi.org/10.1016/j.jclinepi.2010.04.011
Segawa, E., Schalet, B., & Cella, D. (2020). A comparison of computer adaptive tests (CATs) and short forms in terms of accuracy and number of items administrated using PROMIS profile. Quality of Life Research, 29, 213–221.
Amtmann, D. (2016). PROMIS 1 Wave 2 Pain. Harvard Dataverse, V1. https://doi.org/10.7910/DVN/ESOAH5. UNF:6:TYzYcoNorGguhqSjkVdL2Q== [fileUNF]
Amtmann, D., Cook, K. F., Jensen, M. P., Chen, W.-H., Choi, S., Revicki, D., Cella, D., Rothrock, N., Keefe, F., Callahan, L., & Lai, J.-S. (2010). Development of a PROMIS item bank to measure pain interference. Pain, 150, 173–182.
Samejima, F. (1994). Estimation of reliability coefficients using the test information function and its modifications. Applied Psychological Measurement, 18, 229–244.
R Core Team. (2020). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Retrieved from https://www.R-project.org
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48, 1–29. https://doi.org/10.18637/jss.v048.i06
Van der Elst, W., Molenberghs, G., Hilgers, R. D., Verbeke, G., & Heussen, N. (2019). CorrMixed: Estimate Correlations Between Repeatedly Measured Endpoints (E.g., Reliability) Based on Linear Mixed-Effects Models. R package version 1.0.
Revelle, W. (2021). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston, Illinois, USA. R package version 2.1.9. Retrieved from https://CRAN.R-project.org/package=psych
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Taylor & Francis Group.
The current project did not have explicit extramural funding sources. All authors are employees of their respective institutions.
Conflict of interest
The authors have no competing interests to declare.
Ethical approval and consent to participate
Consent for publication
Cite this article
Andrae, D.A., Foster, B. & Peipert, J.D. Comparison of raw and regression approaches to capturing change on patient-reported outcome measures. Qual Life Res 32, 1381–1390 (2023). https://doi.org/10.1007/s11136-022-03196-x