Tests of Measurement Invariance Without Subgroups: A Generalization of Classical Methods

Psychometrika

Abstract

The issue of measurement invariance commonly arises in factor-analytic contexts, with methods for assessment including likelihood ratio tests, Lagrange multiplier tests, and Wald tests. These tests all require advance definition of the number of groups, group membership, and offending model parameters. In this paper, we study tests of measurement invariance based on stochastic processes of casewise derivatives of the likelihood function. These tests can be viewed as generalizations of the Lagrange multiplier test, and they are especially useful for: (i) identifying subgroups of individuals that violate measurement invariance along a continuous auxiliary variable without prespecified thresholds, and (ii) identifying specific parameters impacted by measurement invariance violations. The tests are presented and illustrated in detail, including an application to a study of stereotype threat and simulations examining the tests’ abilities in controlled conditions.
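The core idea in the abstract (order the casewise likelihood derivatives by an auxiliary variable, accumulate them, and look for systematic drift) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a univariate normal model with parameters (mean, variance) in place of a factor model, estimates the information matrix by the outer product of scores, and summarizes the process with a double-maximum statistic. The function name `score_fluctuation` and the simulated data are hypothetical.

```python
import numpy as np


def score_fluctuation(y, v):
    """Cumulative score process of a normal model, ordered by auxiliary variable v.

    Illustrative assumption: y ~ N(mu, sigma2) with both parameters estimated
    by maximum likelihood. Returns the scaled process and its double maximum.
    """
    y = np.asarray(y, dtype=float)
    n = y.size

    # Order the cases by the auxiliary variable.
    y = y[np.argsort(v, kind="stable")]

    # Maximum likelihood estimates under the invariance (no-change) hypothesis.
    mu = y.mean()
    sigma2 = y.var()

    # Casewise derivatives of the log-likelihood with respect to (mu, sigma2).
    scores = np.column_stack([
        (y - mu) / sigma2,
        ((y - mu) ** 2 - sigma2) / (2.0 * sigma2 ** 2),
    ])

    # Outer-product estimate of the information matrix and its inverse root,
    # used to decorrelate and standardize the score components.
    info = scores.T @ scores / n
    evals, evecs = np.linalg.eigh(info)
    info_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T

    # Scaled cumulative sums; the path returns to zero at the last case
    # because the scores sum to zero at the ML estimates.
    process = np.cumsum(scores, axis=0) @ info_inv_sqrt / np.sqrt(n)

    # Double maximum: largest absolute excursion over cases and parameters.
    double_max = np.abs(process).max()
    return process, double_max


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    v = rng.uniform(size=400)              # continuous auxiliary variable
    y_stable = rng.normal(size=400)        # parameters constant along v
    y_shift = y_stable + 1.5 * (v > 0.5)   # mean changes at v = 0.5

    for label, y in [("stable", y_stable), ("shift", y_shift)]:
        process, stat = score_fluctuation(y, v)
        # The column with the largest excursion hints at the affected parameter.
        worst = np.abs(process).max(axis=0).argmax()
        print(label, round(float(stat), 2), ["mu", "sigma2"][worst])
```

When the parameters are stable along the auxiliary variable, each column of the process behaves approximately like a Brownian bridge, so large excursions signal a violation. The location of the peak suggests where along the auxiliary variable the change occurs, and the column containing it suggests which parameter is affected, which corresponds to uses (i) and (ii) named in the abstract.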

Acknowledgements

This work was supported by National Science Foundation grant SES-1061334. The authors thank Jelte Wicherts, who generously shared data for the stereotype threat application, Yves Rosseel, who provided feedback and code for performing the tests with the lavaan package, Kris Preacher, who provided helpful comments on the manuscript, and the participants of the Psychoco 2012 workshop on psychometric computing for helpful discussion.

Corresponding author

Correspondence to Edgar C. Merkle.

About this article

Cite this article

Merkle, E.C., Zeileis, A. Tests of Measurement Invariance Without Subgroups: A Generalization of Classical Methods. Psychometrika 78, 59–82 (2013). https://doi.org/10.1007/s11336-012-9302-4

  • DOI: https://doi.org/10.1007/s11336-012-9302-4
