Volume 83, Issue 2, pp 279–297

Response Mixture Modeling: Accounting for Heterogeneity in Item Characteristics across Response Times

  • Dylan Molenaar
  • Paul de Boeck


In item response theory modeling of responses and response times, it is commonly assumed that the item responses have the same characteristics across the response times. However, heterogeneity might arise in the data if subjects resort to different response processes when solving the test items. These differences may be within-subject effects, that is, a subject might use a certain process on some of the items and a different process with different item characteristics on the other items. If the probability of using one process over the other process depends on the subject’s response time, within-subject heterogeneity of the item characteristics across the response times arises. In this paper, the method of response mixture modeling is presented to account for such heterogeneity. Contrary to traditional mixture modeling where the full response vectors are classified, response mixture modeling involves classification of the individual elements in the response vector. In a simulation study, the response mixture model is shown to be viable in terms of parameter recovery. In addition, the response mixture model is applied to a real dataset to illustrate its use in investigating within-subject heterogeneity in the item characteristics across response times.
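The within-subject mixture described above can be sketched as a small simulation. This is a minimal illustration only, with assumed functional forms and parameter values (two 2PL-type processes and a logistic link from log response time to class probability), not the authors' exact model:

```python
import numpy as np

rng = np.random.default_rng(1)

n_persons, n_items = 500, 20
theta = rng.normal(0.0, 1.0, n_persons)                         # latent ability
tau = rng.normal(0.0, 0.5, n_persons)                           # latent speed
log_rt = rng.normal(-tau[:, None], 0.5, (n_persons, n_items))   # log response times

# Two sets of item characteristics, one per response process (assumed values):
# the "slow" process is taken to be more discriminating and more difficult.
a_fast, b_fast = 1.0, -0.5
a_slow, b_slow = 2.0, 0.5

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Within-subject mixture: each element of the response vector is classified
# separately, with the probability of the slow process increasing with the
# response time on that item (slope 2.0 is an arbitrary assumed value).
p_slow = sigmoid(2.0 * log_rt)
use_slow = rng.random((n_persons, n_items)) < p_slow

# Item characteristics differ across response times because each element
# follows the process it was classified into.
p_correct = np.where(
    use_slow,
    sigmoid(a_slow * (theta[:, None] - b_slow)),
    sigmoid(a_fast * (theta[:, None] - b_fast)),
)
responses = (rng.random((n_persons, n_items)) < p_correct).astype(int)
```

Note how classification operates on individual person-by-item elements (`use_slow` is an `n_persons × n_items` matrix), in contrast to traditional mixture modeling, which would assign each person's full response vector to a single class.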


Keywords: item response theory · response time modeling · mixture modeling

Supplementary material 1 (zip, 6 KB)



Copyright information

© The Psychometric Society 2018

Authors and Affiliations

  1. Psychological Methods, Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
  2. Ohio State University, Columbus, USA
