A Generalized Speed–Accuracy Response Model for Dichotomous Items

van Rijn, Peter W.; Ali, Usama S.

doi:10.1007/s11336-017-9590-9

Published: 21 November 2017

Psychometrika, Volume 83, pages 109–131 (2018)

Abstract

We propose a generalization of the speed–accuracy response model (SARM) introduced by Maris and van der Maas (Psychometrika 77:615–633, 2012). In these models, the scores that result from a scoring rule that incorporates both the speed and accuracy of item responses are modeled. Our generalization is similar to that of the one-parameter logistic (or Rasch) model to the two-parameter logistic (or Birnbaum) model in item response theory. An expectation–maximization (EM) algorithm for estimating model parameters and standard errors was developed. Furthermore, methods to assess model fit are provided in the form of generalized residuals for item score functions and saddlepoint approximations to the density of the sum score. The presented methods were evaluated in a small simulation study, the results of which indicated good parameter recovery and reasonable type I error rates for the residuals. Finally, the methods were applied to two real data sets. It was found that the two-parameter SARM showed improved fit compared to the one-parameter SARM in both data sets.
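
For reference, the analogy drawn in the abstract can be illustrated with the standard Rasch (1PL) and Birnbaum (2PL) item response functions. The sketch below shows only these textbook IRT models, not the SARM itself; the function names `irf_1pl` and `irf_2pl` and the symbols `theta` (ability), `a` (discrimination), and `b` (difficulty) are illustrative assumptions, not notation taken from the article.

```python
import math


def irf_1pl(theta: float, b: float) -> float:
    """Rasch (1PL) probability of a correct response.

    All items share the same slope; only the difficulty b varies by item.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def irf_2pl(theta: float, a: float, b: float) -> float:
    """Birnbaum (2PL) probability of a correct response.

    Adds an item-specific discrimination a that scales the slope.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # With a = 1 the 2PL reduces to the 1PL.
    print(irf_1pl(1.0, 0.5), irf_2pl(1.0, 1.0, 0.5))
    # A larger discrimination sharpens the curve around theta = b.
    print(irf_2pl(1.0, 2.0, 0.5))
```

Setting `a = 1` for every item recovers the one-parameter model, which is the sense in which the two-parameter model generalizes it.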

Notes

The executable and software manual are freely available for noncommercial use on request at sarm@ets.org.
This data set was suggested by an anonymous reviewer.
See http://hvandermaas.socsci.uva.nl/Homepage_Han_van_der_Maas/Chess_Psychology.html.

References

Andersen, E. B. (1973). Conditional inference and multiple choice questionnaires. British Journal of Mathematical and Statistical Psychology, 26, 31–44. https://doi.org/10.1111/j.2044-8317.1973.tb00504.x.
Biehler, M., Holling, H., & Doebler, P. (2015). Saddlepoint approximations of the distribution of the person parameter in the two parameter logistic model. Psychometrika, 80, 665–688. https://doi.org/10.1007/s11336-014-9405-1.
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. Lord & M. Novick (Eds.), Statistical theories of mental test scores (pp. 397–472). Reading, MA: Addison-Wesley.
Butler, R. W. (2007). Saddlepoint approximations with applications. Cambridge: Cambridge University Press.
De Boeck, P., Chen, H., & Davison, M. (2017). Spontaneous and imposed speed of cognitive test responses. British Journal of Mathematical and Statistical Psychology, 70, 225–237. https://doi.org/10.1111/bmsp.12094.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood estimation from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39, 1–38.
Fahrmeir, L., & Tutz, G. (2001). Multivariate statistical modelling based on generalized linear models (2nd edn). Berlin: Springer. https://doi.org/10.1007/978-1-4757-3454-6.
Goldhammer, F. (2015). Measuring ability, speed, or both? Challenges, psychometric solutions, and what can be gained from experimental control. Measurement, 13, 133–164. https://doi.org/10.1080/15366367.2015.1100020.
Haberman, S. J. (2006). Joint and conditional estimation for implicit models for tests with polytomous item scores (ETS Research Report RR-06-03). Princeton, NJ: Educational Testing Service. https://doi.org/10.1002/j.2333-8504.2006.tb02009.x.
Haberman, S. J. (2013). A general program for item-response analysis that employs the stabilized Newton–Raphson algorithm (ETS research report RR-13-32). Princeton, NJ: Educational Testing Service. https://doi.org/10.1002/j.2333-8504.2013.tb02339.x.
Haberman, S. J. (2016). Exponential family distributions relevant to IRT. In W. J. van der Linden (Ed.), Handbook of item response theory, volume two: Statistical tools (pp. 47–70). Boca Raton, FL: CRC Press.
Haberman, S. J., & Sinharay, S. (2013). Generalized residuals for general models for contingency tables with application to item response theory. Journal of the American Statistical Association, 108, 1435–1444. https://doi.org/10.1080/01621459.2013.835660.
Haberman, S. J., Sinharay, S., & Chon, K. H. (2013). Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions. Psychometrika, 78, 417–440. https://doi.org/10.1007/s11336-012-9305-1.
Hooker, G., Finkelman, M., & Schwartzman, A. (2009). Paradoxical results in multidimensional item response theory. Psychometrika, 74, 419–442. https://doi.org/10.1007/S11336-009-9111-6.
Kim, S. (2012). A note on the reliability coefficients for item response model-based ability estimates. Psychometrika, 77, 153–162. https://doi.org/10.1007/s11336-011-9238-0.
Kim, S. (2013). Generalization of the Lord–Wingersky algorithm to computing the distributions of summed test scores based on real-number item scores. Journal of Educational Measurement, 50, 381–389.
Lee, Y. H., & Chen, H. (2011). A review of recent response-time analyses in educational testing. Psychological Test and Assessment Modeling, 3, 359–379.
Lord, F. M. (1975). Formula scoring and number right scoring. Journal of Educational Measurement, 12, 7–11. https://doi.org/10.1111/j.1745-3984.1975.tb01003.x.
Lord, F. M., & Wingersky, M. S. (1984). Comparison of “IRT” true-score and equipercentile observed-score equatings. Applied Psychological Measurement, 8, 453–461.
Louis, T. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 44, 226–233. https://doi.org/10.2307/2345828.
Luce, R. D. (1986). Response times. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780195070019.001.0001.
Maris, G., & van der Maas, H. L. J. (2012). Speed-accuracy response models: Scoring rules based on response time and accuracy. Psychometrika, 77, 615–633. https://doi.org/10.1007/s11336-012-9288-y.
Marsman, M. (2014). Plausible values in statistical inference. Doctoral dissertation, University of Twente, Enschede.
Meng, X. L., & Rubin, D. (1991). Using EM to obtain asymptotic variance-covariance matrices: The SEM algorithm. Journal of the American Statistical Association, 86, 899–909.
Naylor, J. C., & Smith, A. F. M. (1982). Applications of a method for efficient computation of posterior distributions. Applied Statistics, 31, 214–225. https://doi.org/10.2307/2347995.
Ranger, J., & Kuhn, J. T. (2012). A flexible latent trait model for response times in tests. Psychometrika, 77, 31–47. https://doi.org/10.1007/s11336-011-9231-7.
Ranger, J., Kuhn, J. T., & Gaviria, J. L. (2015). A race model for responses and response times in tests. Psychometrika, 80, 791–810. https://doi.org/10.1007/s11336-014-9427-8.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Paedagogiske Institut.
Roskam, E. E. (1997). Models for speed and time-limit tests. In R. K. Hambleton & W. J. van der Linden (Eds.), Handbook of modern item response theory (pp. 187–208). New York: Springer.
Rouder, J. N., Sun, D., Speckman, P. L., Lu, J., & Zhou, D. (2003). A hierarchical Bayesian statistical framework for response time distributions. Psychometrika, 68, 589–606.
Spearman, C. (1927). The abilities of men. London: MacMillan.
Thurstone, L. L. (1919). A scoring method for mental tests. Psychological Bulletin, 16, 235–240.
Thurstone, L. L. (1937). Ability, motivation, and speed. Psychometrika, 2, 249–254.
Tuerlinckx, F., & de Boeck, P. (2005). Two interpretations of the discrimination parameter. Psychometrika, 70, 629–650. https://doi.org/10.1007/s11336-000-0810-3.
Tuerlinckx, F., Molenaar, D., & van der Maas, H. L. J. (2016). Diffusion-based item response modeling. In W. J. van der Linden (Ed.), Handbook of item response theory (Vol. 1, pp. 283–302). Boca Raton, FL: Chapman & Hall/CRC Press.
van der Linden, W. J. (2007). A hierarchical framework for modeling speed and accuracy on test items. Psychometrika, 72, 287–308. https://doi.org/10.1007/s11336-006-1478-z.
van der Linden, W. J. (2008). Using response times for item selection in adaptive testing. Journal of Educational and Behavioral Statistics, 33, 5–20. https://doi.org/10.3102/1076998607302626.
van der Linden, W. J. (2009). Conceptual issues in response-time modeling. Journal of Educational Measurement, 46, 247–272.
van der Maas, H., & Wagenmakers, E. J. (2005). A psychometric analysis of chess expertise. American Journal of Psychology, 118, 29–60.
van Rijn, P. W., & Ali, U. S. (2017). A comparison of item response models for accuracy and speed of item responses with applications to adaptive testing. British Journal of Mathematical and Statistical Psychology, 70, 317–345. https://doi.org/10.1111/bmsp.12101.
van Rijn, P. W., & Ali, U. S. (2018, in press). SARM: A computer program for estimating speed-accuracy response models (ETS Research Report). Princeton, NJ: Educational Testing Service.
van Rijn, P. W., & Rijmen, F. (2015). On the explaining-away phenomenon in multivariate latent variable models. British Journal of Mathematical and Statistical Psychology, 68, 1–22. https://doi.org/10.1111/bmsp.12046.
Yuan, K. H., Cheng, Y., & Patton, J. (2014). Information matrices and standard errors for MLEs of item parameters in IRT. Psychometrika, 79, 232–254. https://doi.org/10.1007/S11336-013-9334-4.

Acknowledgements

Funding was provided by Educational Testing Service. The authors would like to thank Rebecca Zwick, Yi-Hsuan Lee, Fred Robin, and three anonymous reviewers for their comments on earlier drafts of the paper.

Author information

Authors and Affiliations

ETS Global, Amsterdam, The Netherlands
Peter W. van Rijn
Educational Testing Service, Princeton, NJ, USA
Usama S. Ali
South Valley University, Qena, Egypt
Usama S. Ali

Corresponding author

Correspondence to Peter W. van Rijn.

About this article

Cite this article

van Rijn, P.W., Ali, U.S. A Generalized Speed–Accuracy Response Model for Dichotomous Items. Psychometrika 83, 109–131 (2018). https://doi.org/10.1007/s11336-017-9590-9

Received: 15 October 2015
Published: 21 November 2017
Issue Date: March 2018
DOI: https://doi.org/10.1007/s11336-017-9590-9
