Psychometrika, 72:287

A Hierarchical Framework for Modeling Speed and Accuracy on Test Items

  • Wim J. van der Linden


Current modeling of response times on test items has been strongly influenced by the paradigm of experimental reaction-time research in psychology. For instance, some of the models have a parameter structure that was chosen to represent a speed-accuracy tradeoff, while others equate speed directly with response time. Also, several response-time models seem to be unclear as to the level of parametrization they represent. A hierarchical framework for modeling speed and accuracy on test items is presented as an alternative to these models. The framework allows a “plug-and-play approach” with alternative choices of models for the response and response-time distributions as well as the distributions of their parameters. Bayesian treatment of the framework with Markov chain Monte Carlo (MCMC) computation facilitates the approach. Use of the framework is illustrated for the choice of a normal-ogive response model, a lognormal model for the response times, and multivariate normal models for their parameters with Gibbs sampling from the joint posterior distribution.
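The "plug-and-play" composition described above can be illustrated with a small simulation. The sketch below assumes the specific choices named in the abstract: a normal-ogive model for the responses and a lognormal model for the response times, with person parameters for ability (θ) and speed (τ) and item parameters (a, b) for discrimination/difficulty and (α, β) for time discrimination/intensity. All numeric values are hypothetical, chosen only for illustration; the exact parameterization is one plausible reading of the framework, not a reproduction of the paper's estimation procedure.

```python
# Minimal simulation sketch of the hierarchical speed-accuracy framework.
# Assumptions (illustrative, not from the abstract): the response model is
# Pr{U = 1} = Phi(a * (theta - b)) (normal ogive), and the response-time
# model is log T ~ N(beta - tau, alpha^-2) (lognormal). Parameter values
# are made up for demonstration.
import math
import random

def phi(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simulate(theta, tau, items, rng):
    """Draw one (response, response time) pair per item for a test taker
    with ability `theta` and speed `tau`."""
    data = []
    for a, b, alpha, beta in items:
        p = phi(a * (theta - b))          # normal-ogive success probability
        u = 1 if rng.random() < p else 0  # scored response (0/1)
        # Lognormal response time: faster persons (larger tau) get shorter times.
        t = math.exp(rng.gauss(beta - tau, 1.0 / alpha))
        data.append((u, t))
    return data

rng = random.Random(1)
# Hypothetical item parameters (a, b, alpha, beta) for two items:
items = [(1.2, 0.0, 2.0, 4.0), (0.8, -0.5, 1.5, 3.5)]
print(simulate(theta=0.5, tau=0.2, items=items, rng=rng))
```

In the full hierarchical framework the person pair (θ, τ) and the item quadruple (a, b, α, β) would themselves be drawn from (multivariate) normal population distributions, and the model would be fitted by Gibbs sampling from the joint posterior rather than simulated forward as here.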

Key words

hierarchical modeling; item response theory; Gibbs sampler; Markov chain Monte Carlo estimation; speed-accuracy tradeoff; response times



Copyright information

© The Psychometric Society 2007

Authors and Affiliations

  1. Department of Research Methodology, Measurement, and Data Analysis, University of Twente, AE Enschede, The Netherlands
