Abstract
Reliability of change scores from a pretest-posttest design is important for establishing the usefulness of change scores in drawing inferences about pretest-posttest differences. Besides the traditional sum score-based classical test theory approach, an item level classical test theory approach has been proposed to assess change score reliability, and it has been demonstrated to be superior to the traditional sum score-based approach. However, both the item level and the sum score-based approaches are biased in the case of multidimensionality and correlated errors. Therefore, in this chapter, two factor analysis approaches to the item level classical test theory approach are presented. These approaches treat the item level data explicitly as ordinal and allow various psychometric aspects of the data to be investigated, including multidimensionality, carry-over effects, and response shifts. As a result, the factor analysis approaches can be used to assess whether the results from the classical test theory approaches can be trusted. The classical test theory approaches and the factor analysis approaches are studied in a simulation and applied to a real dataset pertaining to life satisfaction.
Notes
- 1. In the traditional classical test theory approach, positive residual correlations will increase \( \sigma_{\mathrm{pre},\mathrm{post}} \), which biases reliability downwards. In the item level classical test theory approach, positive residual correlations will decrease \( \sigma_{E_{D_j}}^2 \) (as \( \sigma_{E_{D_j}}^2 = \sigma_{E_j^{(\mathrm{pre})}}^2 + \sigma_{E_j^{(\mathrm{post})}}^2 - 2\sigma_{E_j^{(\mathrm{pre})},E_j^{(\mathrm{post})}} \)), which biases reliability upwards. This is also what Gu et al. found in their simulations; the bias was, however, much larger for the traditional approach.
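The decomposition in the note above can be illustrated numerically. The sketch below uses assumed error variances (not values from the chapter) to show how a positive residual covariance between an item's pretest and posttest errors shrinks the error variance of its difference score, which is why ignoring such a covariance biases the item level reliability estimate upwards:

```python
# Illustrative sketch (assumed numbers, not from the chapter): effect of
# positively correlated pre/post measurement errors on the error variance
# of an item's difference score D_j = X_j(post) - X_j(pre).

var_e_pre = 1.0   # error variance of item j at pretest (assumed)
var_e_post = 1.0  # error variance of item j at posttest (assumed)

for cov_e in (0.0, 0.3):  # residual covariance between pre and post errors
    # sigma^2_{E_Dj} = sigma^2_{E_pre} + sigma^2_{E_post} - 2 * cov
    var_e_diff = var_e_pre + var_e_post - 2 * cov_e
    print(f"residual covariance {cov_e:.1f}: "
          f"error variance of D_j = {var_e_diff:.1f}")
```

With zero residual covariance the difference-score error variance is 2.0; with a covariance of 0.3 it drops to 1.4. A reliability estimator that wrongly assumes uncorrelated errors thus attributes too little of the difference-score variance to error.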
References
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88(3), 588–606. https://doi.org/10.1037/0033-2909.88.3.588
Bollen, K. A. (1989). Structural equations with latent variables. Wiley.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136–162). Sage. https://doi.org/10.1177/0049124192021002005
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334. https://doi.org/10.1007/BF02310555
Cronbach, L. J., & Furby, L. (1970). How we should measure “change”: Or should we? Psychological Bulletin, 74(1), 68–80. https://doi.org/10.1037/h0029382
Diener, E. D., Emmons, R. A., Larsen, R. J., & Griffin, S. (1985). The satisfaction with life scale. Journal of Personality Assessment, 49(1), 71–75. https://doi.org/10.1207/s15327752jpa4901_13
Dolan, C. V. (1994). Factor analysis of variables with 2, 3, 5 and 7 response categories: A comparison of categorical variable estimators using simulated data. British Journal of Mathematical and Statistical Psychology, 47(2), 309–326. https://doi.org/10.1111/j.2044-8317.1994.tb01039.x
Dunn, T. J., Baguley, T., & Brunsden, V. (2014). From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. British Journal of Psychology, 105(3), 399–412. https://doi.org/10.1111/bjop.12046
Fokkema, M., Smits, N., Kelderman, H., & Cuijpers, P. (2013). Response shifts in mental health interventions: An illustration of longitudinal measurement invariance. Psychological Assessment, 25(2), 520–531. https://doi.org/10.1037/a0031669
Gu, Z., Emons, W. H., & Sijtsma, K. (2018). Review of issues about classical change scores: A multilevel modeling perspective on some enduring beliefs. Psychometrika, 83(3), 674–695. https://doi.org/10.1007/s11336-018-9611-3
Gu, Z., Emons, W. H., & Sijtsma, K. (2021). Estimating difference-score reliability in pretest–posttest settings. Journal of Educational and Behavioral Statistics, 46(5), 592–610. https://doi.org/10.3102/1076998620986948
Guttman, L. (1945). A basis for analyzing test-retest reliability. Psychometrika, 10(4), 255–282. https://doi.org/10.1007/BF02288892
Howard, G. S., & Dailey, P. R. (1979). Response-shift bias: A source of contamination of self-report measures. Journal of Applied Psychology, 64(2), 144–150. https://doi.org/10.1037/0021-9010.64.2.144
Hunt, T. (2013). Lambda4: Collection of internal consistency reliability coefficients (Version 3.0) [Computer software]. CRAN. https://CRAN.R-project.org/package=Lambda4
Linn, R. L., & Slinde, J. A. (1977). The determination of the significance of change between pre- and posttesting periods. Review of Educational Research, 47(1), 121–150. https://doi.org/10.3102/00346543047001121
Lord, F. M. (1963). Elementary models for measuring change. In C. W. Harris (Ed.), Problems in measuring change (pp. 21–38). The University of Wisconsin Press.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Addison-Wesley.
Mackinnon, S. P., Ray, C. M., Firth, S. M., & O’Connor, R. M. (2019). Perfectionism, negative motives for drinking, and alcohol-related problems: A 21-day diary study. Journal of Research in Personality, 78, 177–188. https://doi.org/10.1016/j.jrp.2018.12.003
Mackinnon, S. P., Ray, C. M., Firth, S. M., & O’Connor, R. M. (2021). Data from “Perfectionism, negative motives for drinking, and alcohol-related problems: A 21-day diary study”. Journal of Open Psychology Data, 9(1), 1. https://doi.org/10.5334/jopd.44
McConnel, K., Strand, I. E., & Valdes, S. (1998). Testing temporal reliability and carry-over effect: The role of correlated responses in test-retest reliability studies. Environmental and Resource Economics, 12(3), 357–374. https://doi.org/10.1023/A:1008264922331
McDonald, R. P. (1978). Generalizability in factorable domains: “Domain Validity and Generalizability”. Educational and Psychological Measurement, 38(1), 75–79. https://doi.org/10.1177/001316447803800111
McDonald, R. P. (1999). Test theory: A unified treatment. Erlbaum.
Mellenbergh, G. J. (1996). Measurement precision in test score and item response models. Psychological Methods, 1(3), 293. https://doi.org/10.1037/1082-989X.1.3.293
Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58(4), 525–543. https://doi.org/10.1007/BF02294825
Oort, F. (2005). Using structural equation modeling to detect response shifts and true change. Quality of Life Research, 14(3), 587–598. https://doi.org/10.1007/s11136-004-0830-y
Oort, F. J., Visser, M. R., & Sprangers, M. A. (2009). Formal definitions of measurement bias and explanation bias clarify measurement and conceptual perspectives on response shift. Journal of Clinical Epidemiology, 62(11), 1126–1137. https://doi.org/10.1016/j.jclinepi.2009.03.013
Rhemtulla, M., Brosseau-Liard, P. É., & Savalei, V. (2012). When can categorical variables be treated as continuous? A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions. Psychological Methods, 17(3), 354–373. https://doi.org/10.1037/a0029315
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. https://doi.org/10.18637/jss.v048.i02
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores (Monograph No. 17). Psychometric Society. https://doi.org/10.1007/BF03372160
Schermelleh-Engel, K., Moosbrugger, H., & Müller, H. (2003). Evaluating the fit of structural equation models: Tests of significance and descriptive goodness-of-fit measures. Methods of Psychological Research Online, 8(2), 23–74. http://www.mpr-online.de
Sijtsma, K., & Pfadt, J. M. (2021). Part II: On the use, the misuse, and the very limited usefulness of Cronbach’s alpha: Discussing lower bounds and correlated errors. Psychometrika, 86(4), 843–860. https://doi.org/10.1007/s11336-021-09789-8
Sijtsma, K., & Van der Ark, L. A. (2021). Measurement models for psychological attributes. Chapman and Hall/CRC. https://doi.org/10.1201/9780429112447
Sprangers, M. A., & Schwartz, C. E. (1999). Integrating response shift into health-related quality of life research: A theoretical model. Social Science & Medicine, 48(11), 1507–1515. https://doi.org/10.1016/s0277-9536(99)00045-3
Takane, Y., & De Leeuw, J. (1987). On the relationship between item response theory and factor analysis of discretized variables. Psychometrika, 52(3), 393–408. https://doi.org/10.1007/BF02294363
Williams, B. J., & Kaufmann, L. M. (2012). Reliability of the go/no go association task. Journal of Experimental Social Psychology, 48(4), 879–891. https://doi.org/10.1016/j.jesp.2012.03.001
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Molenaar, D. (2023). A Factor Analysis Approach to Item Level Change Score Reliability. In: van der Ark, L.A., Emons, W.H.M., Meijer, R.R. (eds) Essays on Contemporary Psychometrics. Methodology of Educational Measurement and Assessment. Springer, Cham. https://doi.org/10.1007/978-3-031-10370-4_7
Print ISBN: 978-3-031-10369-8
Online ISBN: 978-3-031-10370-4