Bland, J. M., & Altman, D. G. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, 1, 307–310.
Cronbach, L. (1947). Test “reliability”: Its meaning and determination. Psychometrika, 12, 1–16.
Guttman, L. (1945). A basis for analyzing test-retest reliability. Psychometrika, 10, 255–282.
Hays, R. D., Anderson, R., & Revicki, D. (1993). Psychometric considerations in evaluating health-related quality of life measures. Quality of Life Research, 2, 441–449.
Luiz, R. R., Costa, A. J., Kale, P. L., & Werneck, G. L. (2003). Assessment of agreement of a quantitative variable: a new graphical approach. Journal of Clinical Epidemiology, 56, 963–967.
McArdle, J. J., & Woodcock, R. W. (1997). Expanding test-retest designs to include developmental time-lag components. Psychological Methods, 2, 403–435.
McGraw, K. O., & Wong, S. P. (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1, 30–46.
Monti, K. (1995). Folded empirical distribution function curves-Mountain plots. The American Statistician, 49, 342–345.
Morrison, D. (1981). A stochastic model for test-retest correlations. Psychometrika, 46, 143–151.
Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.
Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86, 420–428.
Streiner, D. L., & Norman, G. R. (1989). Health measurement scales: A practical guide to their development and use. Oxford: Oxford University Press.
Terwee, C. B., Bot, S. D., De Boer, M. R., van der Windt, D. A., Knol, D. L., Dekker, J., et al. (2007). Quality criteria were proposed for measurement properties of health status questionnaires. Journal of Clinical Epidemiology, 60, 34–42.
Tisak, J., & Tisak, M. S. (1996). Longitudinal models of reliability and validity: A latent curve approach. Applied Psychological Measurement, 20, 275–288.