Abstract
Reliability of change scores from a pretest-posttest design is important for establishing the usefulness of change scores in drawing inferences about pretest-posttest differences. Besides the traditional sum score-based classical test theory approach, an item level classical test theory approach has been proposed to assess change score reliability, and it has been demonstrated to be superior to the traditional sum score-based approach. However, both the item level and the sum score-based approaches are biased in the case of multidimensionality and correlated errors. Therefore, in this chapter, two factor analysis approaches to the item level classical test theory approach are presented. These approaches treat the item level data explicitly as ordinal and allow various psychometric aspects of the data to be investigated, including multidimensionality, carry-over effects, and response shifts. As a result, the factor analysis approaches can be used to assess whether the results from the classical test theory approaches can be trusted. The classical test theory approaches and the factor analysis approaches are studied in a simulation and applied to a real dataset pertaining to life satisfaction.
Notes
- 1. In the traditional classical test theory approach, positive residual correlations will increase \( \sigma_{\mathrm{pre},\mathrm{post}} \), which biases reliability downwards. In the item level classical test theory approach, positive residual correlations will decrease \( \sigma_{E_{D_j}}^2 \) (as \( \sigma_{E_{D_j}}^2 = \sigma_{E_j^{(\mathrm{pre})}}^2 + \sigma_{E_j^{(\mathrm{post})}}^2 - 2\sigma_{E_j^{(\mathrm{pre})},E_j^{(\mathrm{post})}} \)), which biases reliability upwards. This is also what Gu et al. found in their simulations; the bias was, however, much larger for the traditional approach.
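The decomposition in the note above can be illustrated numerically. The sketch below uses assumed error variances (not values from the chapter) to show how a positive residual covariance between an item's pretest and posttest errors shrinks the error variance of its difference score, which is why ignoring such a covariance biases the item level reliability estimate upwards:

```python
# Illustrative sketch (assumed numbers, not from the chapter): effect of
# positively correlated pre/post measurement errors on the error variance
# of an item's difference score D_j = X_j(post) - X_j(pre).

var_e_pre = 1.0   # error variance of item j at pretest (assumed)
var_e_post = 1.0  # error variance of item j at posttest (assumed)

for cov_e in (0.0, 0.3):  # residual covariance between pre and post errors
    # sigma^2_{E_Dj} = sigma^2_{E_pre} + sigma^2_{E_post} - 2 * cov
    var_e_diff = var_e_pre + var_e_post - 2 * cov_e
    print(f"residual covariance {cov_e:.1f}: "
          f"error variance of D_j = {var_e_diff:.1f}")
```

With zero residual covariance the difference-score error variance is 2.0; with a covariance of 0.3 it drops to 1.4. A reliability estimator that wrongly assumes uncorrelated errors thus attributes too little of the difference-score variance to error.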
References
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88(3), 588–606. https://doi.org/10.1037/0033-2909.88.3.588
Bollen, K. A. (1989). Structural equations with latent variables. Wiley.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136–162). Sage. https://doi.org/10.1177/0049124192021002005
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334. https://doi.org/10.1007/BF02310555
Cronbach, L. J., & Furby, L. (1970). How we should measure “change”: Or should we? Psychological Bulletin, 74(1), 68–80. https://doi.org/10.1037/h0029382
Diener, E. D., Emmons, R. A., Larsen, R. J., & Griffin, S. (1985). The satisfaction with life scale. Journal of Personality Assessment, 49(1), 71–75. https://doi.org/10.1207/s15327752jpa4901_13
Dolan, C. V. (1994). Factor analysis of variables with 2, 3, 5 and 7 response categories: A comparison of categorical variable estimators using simulated data. British Journal of Mathematical and Statistical Psychology, 47(2), 309–326. https://doi.org/10.1111/j.2044-8317.1994.tb01039.x
Dunn, T. J., Baguley, T., & Brunsden, V. (2014). From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. British Journal of Psychology, 105(3), 399–412. https://doi.org/10.1111/bjop.12046
Fokkema, M., Smits, N., Kelderman, H., & Cuijpers, P. (2013). Response shifts in mental health interventions: An illustration of longitudinal measurement invariance. Psychological Assessment, 25(2), 520–531. https://doi.org/10.1037/a0031669
Gu, Z., Emons, W. H., & Sijtsma, K. (2018). Review of issues about classical change scores: A multilevel modeling perspective on some enduring beliefs. Psychometrika, 83(3), 674–695. https://doi.org/10.1007/s11336-018-9611-3
Gu, Z., Emons, W. H., & Sijtsma, K. (2021). Estimating difference-score reliability in pretest–posttest settings. Journal of Educational and Behavioral Statistics, 46(5), 592–610. https://doi.org/10.3102/1076998620986948
Guttman, L. (1945). A basis for analyzing test-retest reliability. Psychometrika, 10(4), 255–282. https://doi.org/10.1007/BF02288892
Howard, G. S., & Dailey, P. R. (1979). Response-shift bias: A source of contamination of self-report measures. Journal of Applied Psychology, 64(2), 144–150. https://doi.org/10.1037/0021-9010.64.2.144
Hunt, T. (2013). Lambda4: Collection of internal consistency reliability coefficients (Version 3.0) [Computer software]. CRAN. https://CRAN.R-project.org/package=Lambda4
Linn, R. L., & Slinde, J. A. (1977). The determination of the significance of change between pre- and posttesting periods. Review of Educational Research, 47(1), 121–150. https://doi.org/10.3102/00346543047001121
Lord, F. M. (1963). Elementary models for measuring change. In C. W. Harris (Ed.), Problems in measuring change (pp. 21–38). The University of Wisconsin Press.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Addison-Wesley.
Mackinnon, S. P., Ray, C. M., Firth, S. M., & O’Connor, R. M. (2019). Perfectionism, negative motives for drinking, and alcohol-related problems: A 21-day diary study. Journal of Research in Personality, 78, 177–188. https://doi.org/10.1016/j.jrp.2018.12.003
Mackinnon, S. P., Ray, C. M., Firth, S. M., & O’Connor, R. M. (2021). Data from “Perfectionism, negative motives for drinking, and alcohol-related problems: A 21-day diary study”. Journal of Open Psychology Data, 9(1), 1. https://doi.org/10.5334/jopd.44
McConnel, K., Strand, I. E., & Valdes, S. (1998). Testing temporal reliability and carry-over effect: The role of correlated responses in test-retest reliability studies. Environmental and Resource Economics, 12(3), 357–374. https://doi.org/10.1023/A:1008264922331
McDonald, R. P. (1978). Generalizability in factorable domains: “Domain Validity and Generalizability”. Educational and Psychological Measurement, 38(1), 75–79. https://doi.org/10.1177/001316447803800111
McDonald, R. P. (1999). Test theory: A unified treatment. Erlbaum.
Mellenbergh, G. J. (1996). Measurement precision in test score and item response models. Psychological Methods, 1(3), 293. https://doi.org/10.1037/1082-989X.1.3.293
Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58(4), 525–543. https://doi.org/10.1007/BF02294825
Oort, F. (2005). Using structural equation modeling to detect response shifts and true change. Quality of Life Research, 14(3), 587–598. https://doi.org/10.1007/s11136-004-0830-y
Oort, F. J., Visser, M. R., & Sprangers, M. A. (2009). Formal definitions of measurement bias and explanation bias clarify measurement and conceptual perspectives on response shift. Journal of Clinical Epidemiology, 62(11), 1126–1137. https://doi.org/10.1016/j.jclinepi.2009.03.013
Rhemtulla, M., Brosseau-Liard, P. É., & Savalei, V. (2012). When can categorical variables be treated as continuous? A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions. Psychological Methods, 17(3), 354–373. https://doi.org/10.1037/a0029315
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. https://doi.org/10.18637/jss.v048.i02
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores (Monograph No. 17). Psychometric Society. https://doi.org/10.1007/BF03372160
Schermelleh-Engel, K., Moosbrugger, H., & Müller, H. (2003). Evaluating the fit of structural equation models: Tests of significance and descriptive goodness-of-fit measures. Methods of Psychological Research Online, 8(2), 23–74. http://www.mpr-online.de
Sijtsma, K., & Pfadt, J. M. (2021). Part II: On the use, the misuse, and the very limited usefulness of Cronbach’s alpha: Discussing lower bounds and correlated errors. Psychometrika, 86(4), 843–860. https://doi.org/10.1007/s11336-021-09789-8
Sijtsma, K., & Van der Ark, L. A. (2021). Measurement models for psychological attributes. Chapman and Hall/CRC. https://doi.org/10.1201/9780429112447
Sprangers, M. A., & Schwartz, C. E. (1999). Integrating response shift into health-related quality of life research: A theoretical model. Social Science & Medicine, 48(11), 1507–1515. https://doi.org/10.1016/s0277-9536(99)00045-3
Takane, Y., & De Leeuw, J. (1987). On the relationship between item response theory and factor analysis of discretized variables. Psychometrika, 52(3), 393–408. https://doi.org/10.1007/BF02294363
Williams, B. J., & Kaufmann, L. M. (2012). Reliability of the go/no go association task. Journal of Experimental Social Psychology, 48(4), 879–891. https://doi.org/10.1016/j.jesp.2012.03.001
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Molenaar, D. (2023). A Factor Analysis Approach to Item Level Change Score Reliability. In: van der Ark, L.A., Emons, W.H.M., Meijer, R.R. (eds) Essays on Contemporary Psychometrics. Methodology of Educational Measurement and Assessment. Springer, Cham. https://doi.org/10.1007/978-3-031-10370-4_7
Print ISBN: 978-3-031-10369-8
Online ISBN: 978-3-031-10370-4