Behavior Research Methods, Volume 46, Issue 3, pp 823–840

A generalized longitudinal mixture IRT model for measuring differential growth in learning environments

  • Damazo T. Kadengye
  • Eva Ceulemans
  • Wim Van den Noortgate


This article describes a generalized longitudinal mixture item response theory (IRT) model that allows for detecting latent group differences in item response data obtained from electronic learning (e-learning) environments or other learning environments that result in large numbers of items. The described model can be viewed as a combination of a longitudinal Rasch model, a mixture Rasch model, and a random-item IRT model, and it includes some features of the explanatory IRT modeling framework. The model assumes the possible presence of latent classes in item response patterns, due to initial person-level differences before learning takes place, to latent class-specific learning trajectories, or to a combination of both. Moreover, it allows for differential item functioning over the classes. A Bayesian model estimation procedure is described, and the results of a simulation study are presented that indicate that the parameters are recovered well, particularly for conditions with large item sample sizes. The model is also illustrated with an empirical sample data set from a Web-based e-learning environment.
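As a rough illustration of the kind of data such a model targets, the sketch below simulates binary item responses in which persons belong to latent classes that differ in starting level and in growth rate, and in which item difficulties are random and shift slightly between classes. The functional form, the function names (`simulate_responses`, `proportion_correct`), and all parameter values are assumptions chosen for illustration; they are not taken from the article, and the article's Bayesian estimation procedure is not reproduced here.

```python
import math
import random


def simulate_responses(n_persons=200, n_items=50, n_times=4,
                       class_probs=(0.5, 0.5),
                       class_intercepts=(0.0, -0.5),
                       class_slopes=(0.2, 0.6),
                       item_sd=1.0, dif_sd=0.3, person_sd=1.0, seed=1):
    """Simulate responses from an illustrative longitudinal mixture Rasch model.

    Assumed form (hypothetical, not the article's exact specification):
        logit P(y_pit = 1) = mu_g + theta_p + delta_g * t - beta_ig
    where g is the latent class of person p, t is the measurement occasion,
    and beta_ig is a random item difficulty that may differ across classes.
    """
    rng = random.Random(seed)
    n_classes = len(class_probs)
    # Random item difficulties, plus class-specific shifts (DIF across classes).
    base = [rng.gauss(0.0, item_sd) for _ in range(n_items)]
    beta = [[b + rng.gauss(0.0, dif_sd) for _ in range(n_classes)] for b in base]
    records = []
    for p in range(n_persons):
        # Latent class membership and person-specific deviation in ability.
        g = rng.choices(range(n_classes), weights=class_probs)[0]
        theta = rng.gauss(0.0, person_sd)
        for t in range(n_times):
            for i in range(n_items):
                eta = class_intercepts[g] + theta + class_slopes[g] * t - beta[i][g]
                prob = 1.0 / (1.0 + math.exp(-eta))
                y = 1 if rng.random() < prob else 0
                records.append((p, g, i, t, y))
    return records


def proportion_correct(records, latent_class, time):
    """Observed proportion of correct responses for one class at one occasion."""
    ys = [y for (_, g, _, t, y) in records if g == latent_class and t == time]
    return sum(ys) / len(ys)


if __name__ == "__main__":
    data = simulate_responses()
    for g in (0, 1):
        trajectory = [round(proportion_correct(data, g, t), 3) for t in range(4)]
        print("class", g, trajectory)
```

Under these assumed settings, the second class starts lower but improves faster, which is the sort of class-specific learning trajectory the abstract describes. In practice class membership is unobserved and would have to be inferred from the response patterns.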


Keywords: Item response theory, E-learning, Modeling of growth, Mixture models


Author Note

Kind acknowledgments to Han L. J. van der Maas, University of Amsterdam, and to for providing the data set from the Maths Garden learning environment. We also thank two anonymous reviewers and Editor in Chief Gregory Francis for their insightful comments and suggestions.


Copyright information

© Psychonomic Society, Inc. 2013

Authors and Affiliations

  • Damazo T. Kadengye (1, 2, 3)
  • Eva Ceulemans (2)
  • Wim Van den Noortgate (1, 2)

  1. Faculty of Psychology and Educational Sciences and ITEC–iMinds, University of Leuven–Kulak, Kortrijk, Belgium
  2. Centre for Methodology of Educational Research, University of Leuven, Leuven, Belgium
  3. Faculty of Psychology and Educational Sciences, KU Leuven–Kulak, Kortrijk, Belgium
