Volume 82, Issue 4, pp 1126–1148

Modelling Conditional Dependence Between Response Time and Accuracy

  • Maria Bolsinova
  • Paul de Boeck
  • Jesper Tijmstra


The assumption of conditional independence between response time and accuracy given speed and ability is commonly made in response time modelling. However, this assumption might be violated in some cases, meaning that the relationship between the response time and the response accuracy of the same item cannot be fully explained by the correlation between overall speed and ability. We propose to explicitly model the residual dependence between time and accuracy by incorporating the effects of the residual response time on the intercept and the slope parameter of the IRT model for response accuracy. We present an empirical example of a violation of conditional independence from a low-stakes educational test and show that our new model reveals interesting phenomena about the dependence of the item properties on whether the response is relatively fast or slow. For more difficult items, responding slowly is associated with a higher probability of a correct response, whereas for easier items, responding slowly is associated with a lower probability of a correct response. Moreover, for many of the items, slower responses were less informative about ability because their discrimination parameters decreased with residual response time.
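The idea of letting the residual response time shift both the intercept and the slope of the accuracy model can be illustrated with a minimal sketch. This is only one possible parameterisation, written under stated assumptions, and not necessarily the form used in the paper: the lognormal form of the response time model, the log-linear form of the slope, and all names (`residual_log_time`, `p_correct`, `a0`, `a1`, `b0`, `b1`, `time_intensity`, `speed`, `sd`) are hypothetical and chosen for illustration.

```python
import math


def residual_log_time(log_t, time_intensity, speed, sd):
    """Standardised residual of a log response time.

    Assumes (hypothetically) a lognormal response time model in which the
    expected log time is the item's time intensity minus the person's speed.
    """
    return (log_t - (time_intensity - speed)) / sd


def p_correct(theta, z, a0, a1, b0, b1):
    """Probability of a correct response in a 2PL-type model whose slope and
    intercept both depend on the residual response time z.

    Assumed form: slope = exp(a0 + a1 * z), intercept = b0 + b1 * z.
    Setting a1 = b1 = 0 recovers a model with conditional independence.
    """
    slope = math.exp(a0 + a1 * z)  # exponent keeps the slope positive
    intercept = b0 + b1 * z
    return 1.0 / (1.0 + math.exp(-(slope * theta + intercept)))


# Hypothetical difficult item: negative baseline intercept, positive time effect.
hard_fast = p_correct(theta=0.0, z=-1.0, a0=0.0, a1=-0.2, b0=-1.0, b1=0.5)
hard_slow = p_correct(theta=0.0, z=1.0, a0=0.0, a1=-0.2, b0=-1.0, b1=0.5)

# Hypothetical easy item: positive baseline intercept, negative time effect.
easy_fast = p_correct(theta=0.0, z=-1.0, a0=0.0, a1=-0.2, b0=1.0, b1=-0.5)
easy_slow = p_correct(theta=0.0, z=1.0, a0=0.0, a1=-0.2, b0=1.0, b1=-0.5)
```

With these made-up parameter values, the slow response has the higher success probability on the difficult item and the lower one on the easy item, and a negative `a1` makes the slope shrink as `z` grows, which mirrors the pattern described in the abstract.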


Keywords: response times, hierarchical model, conditional independence, item response theory

Supplementary material

Supplementary material 1: 11336_2016_9537_MOESM1_ESM.txt (txt, 50 KB)



Copyright information

© The Psychometric Society 2016

Authors and Affiliations

  • Maria Bolsinova (1, 2, 3; corresponding author)
  • Paul de Boeck (4, 5)
  • Jesper Tijmstra (6)

  1. Utrecht University, Utrecht, The Netherlands
  2. CITO, Dutch National Institute for Educational Measurement, Arnhem, The Netherlands
  3. Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
  4. Ohio State University, Columbus, USA
  5. KU Leuven, Leuven, Belgium
  6. Tilburg University, Tilburg, The Netherlands
