Biases in the Retrospective Calculation of Reliability and Responsiveness from Longitudinal Studies

  • Geoff Norman
  • Paul Stratford
  • Glenn Regehr


We critically examine the common practice in quality of life assessment of identifying improved, unchanged and worsened subsamples retrospectively using some form of global rating, and then calculating test-retest reliability coefficients from the unchanged subsample and responsiveness coefficients from the changed subsample. We use data derived from Monte Carlo simulations to examine the relation between measures of reliability and responsiveness derived from retrospective studies using an unchanged subsample and coefficients based on prospective studies with known treatment effect sizes. We also use actual data from longitudinal studies to examine the fit between simulated and published data. Our results show that calculation of reliability from an unchanged subsample leads to an inflation of the computed coefficient from a typical range of 0.6–0.8 up to 0.85–0.95. We similarly demonstrate that responsiveness coefficients based on the changed subsamples overestimate the responsiveness of the instrument, so that even in situations where there is no overall change, the methods lead to an acceptably large responsiveness coefficient. Based on these results, we conclude that retrospective methods of calculating reliability and responsiveness coefficients based on unchanged samples lead to upwardly biased estimates, and should be discontinued.
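The selection effect described above can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation; it is a minimal illustration under stated assumptions: every subject has zero true change, the instrument's true test-retest reliability is set to 0.7, and the retrospective global rating is modelled as the observed change score plus independent noise. The function names, noise level and cutoff are hypothetical choices made for the example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sample_sd(values):
    """Sample standard deviation (n - 1 denominator)."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def simulate(n=20000, reliability=0.7, rating_noise_sd=0.5, cutoff=0.5, seed=1):
    """Simulate a two-occasion study in which nobody truly changes.

    Assumption (illustrative only): the global rating tracks the OBSERVED
    change score, so it carries measurement error along with it.
    """
    rng = random.Random(seed)
    # True-score SD is fixed at 1, so this error SD yields the target reliability.
    err_sd = math.sqrt((1 - reliability) / reliability)

    rows = []
    for _ in range(n):
        true_score = rng.gauss(0, 1)
        x1 = true_score + rng.gauss(0, err_sd)  # first administration
        x2 = true_score + rng.gauss(0, err_sd)  # second administration, no true change
        rating = (x2 - x1) + rng.gauss(0, rating_noise_sd)
        rows.append((x1, x2, rating))

    # Retrospective grouping by global rating.
    unchanged = [(a, b) for a, b, g in rows if abs(g) < cutoff]
    improved_change = [b - a for a, b, g in rows if g >= cutoff]

    r_full = pearson([a for a, _, _ in rows], [b for _, b, _ in rows])
    r_unchanged = pearson([a for a, _ in unchanged], [b for _, b in unchanged])

    # Responsiveness-style ratio: mean change in the rated-improved subsample
    # divided by the SD of change in the rated-unchanged subsample.
    sd_unchanged = sample_sd([b - a for a, b in unchanged])
    ratio = (sum(improved_change) / len(improved_change)) / sd_unchanged
    return r_full, r_unchanged, ratio


r_full, r_unchanged, ratio = simulate()
print(f"test-retest r, full sample:         {r_full:.2f}")
print(f"test-retest r, 'unchanged' subset:  {r_unchanged:.2f}")
print(f"responsiveness-style ratio:         {ratio:.2f}")
```

Under these assumptions the full-sample correlation stays close to the configured reliability, while the correlation in the rated-unchanged subsample is higher, because selecting on the rating removes subjects whose observed scores happened to diverge. The ratio is also well above zero even though no subject truly changed, which mirrors the overestimation of responsiveness described in the abstract.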


Change Score; Reliability Coefficient; Minimally Important Difference; Clinical Epidemiology; Responsiveness Coefficient
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Science+Business Media Dordrecht 2002

Authors and Affiliations

  • Geoff Norman
    • 1
  • Paul Stratford
    • 1
  • Glenn Regehr
    • 1
  1. McMaster University and University of Toronto, Canada
