Measuring Growth in a Longitudinal Large-Scale Assessment with a General Latent Variable Model

von Davier, Matthias; Xu, Xueli; Carstensen, Claus H.

doi:10.1007/s11336-011-9202-z

Published: 02 February 2011

Volume 76, pages 318–336, (2011)

Psychometrika

Matthias von Davier¹, Xueli Xu¹ & Claus H. Carstensen²

Abstract

This research uses extensions of longitudinal item response theory (IRT) models to analyze and compare group-specific growth in large-scale assessments of educational outcomes.

A general discrete latent variable model was used to specify and compare two types of multidimensional item response theory (MIRT) models for longitudinal data: (a) a model that treats repeated measurements as multiple, correlated variables over time and (b) a model that assumes one common variable over time plus additional variables that quantify change. Using extensions of these MIRT models, we address the problem of modeling and comparing group-specific growth in observed and unobserved subpopulations. The analyses presented in this paper aim to answer the question of whether academic growth is homogeneous across types of schools defined by academic demands and curricular differences. To address this research question, we compared three models: (a) a model with a single two-dimensional ability distribution, (b) a model assuming multiple populations, defined by type of school, with potentially different two-dimensional ability distributions, and (c) a model that assumes the observations are sampled from a discrete mixture of (unobserved) populations, allowing for differences across schools with respect to mixing proportions. For this purpose, we specified a hierarchical-mixture distribution variant of the two MIRT models. Model (c) is a growth-mixture MIRT model that allows the mixing proportions to vary across clusters in a hierarchically organized sample. We applied the proposed models to the PISA-I-Plus data to assess learning and change across multiple subpopulations. The results of this study support the hypothesis of differential growth.
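To make the two longitudinal parameterizations concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a Rasch-type item function and two measurement occasions, whereas the general latent variable model used in the paper is broader. All function names, the normal draws for ability and change, and the standard deviations are hypothetical choices made only for illustration. The first two functions show that a model with one ability per occasion and a model with a common ability plus a change variable give the same response probability when the occasion-2 ability equals the common ability plus the change. The last function sketches the hierarchical-mixture idea: each school has its own mixing proportions over unobserved growth classes.

```python
import math
import random


def rasch_prob(theta, difficulty):
    """P(correct) under a Rasch-type item function (an assumption of this sketch)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def prob_model_a(theta_t1, theta_t2, difficulty, occasion):
    """Parameterization (a): one ability per occasion; the two may be correlated."""
    theta = theta_t1 if occasion == 1 else theta_t2
    return rasch_prob(theta, difficulty)


def prob_model_b(theta_common, change, difficulty, occasion):
    """Parameterization (b): a common ability plus a change variable at occasion 2."""
    theta = theta_common + (change if occasion == 2 else 0.0)
    return rasch_prob(theta, difficulty)


def draw_person(school_mixing, class_means, rng):
    """Growth-mixture sketch: pick an unobserved class using the school's own
    mixing proportions, then draw (common ability, change) around the class means.
    The normal draws and their spreads are illustrative assumptions."""
    latent_class = rng.choices(range(len(school_mixing)), weights=school_mixing)[0]
    mean_ability, mean_change = class_means[latent_class]
    ability = rng.gauss(mean_ability, 1.0)
    change = rng.gauss(mean_change, 0.5)
    return latent_class, ability, change


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical classes: (mean ability, mean change) for low- and high-growth groups.
    class_means = [(0.0, 0.2), (0.5, 0.8)]
    # Hypothetical school whose students mostly belong to the high-growth class.
    latent_class, ability, change = draw_person([0.3, 0.7], class_means, rng)
    p1 = prob_model_b(ability, change, difficulty=0.0, occasion=1)
    p2 = prob_model_b(ability, change, difficulty=0.0, occasion=2)
    print(latent_class, round(p1, 3), round(p2, 3))
```

In this sketch, differential growth corresponds to classes with different mean change, and differences between schools enter only through the mixing proportions passed to `draw_person`.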



Author information

Authors and Affiliations

ETS, Princeton, NJ, USA
Matthias von Davier & Xueli Xu
Bamberg University, Bamberg, Germany
Claus H. Carstensen

Corresponding author

Correspondence to Matthias von Davier.

Additional information

Any opinions expressed in this paper are those of the author(s) and not necessarily of Educational Testing Service.

About this article

Cite this article

von Davier, M., Xu, X. & Carstensen, C.H. Measuring Growth in a Longitudinal Large-Scale Assessment with a General Latent Variable Model. Psychometrika 76, 318–336 (2011). https://doi.org/10.1007/s11336-011-9202-z

Received: 30 March 2009
Revised: 20 August 2010
Published: 02 February 2011
Issue Date: April 2011
DOI: https://doi.org/10.1007/s11336-011-9202-z
