A Hierarchical Model for Accuracy and Choice on Standardized Tests
This paper assesses the psychometric value of allowing test-takers choice in standardized testing. New theoretical results examine the conditions under which allowing choice improves score precision. A hierarchical framework is presented for jointly modeling the accuracy of cognitive responses and item choices. The statistical methodology is disseminated in the ‘cIRT’ R package. An ‘answer two, choose one’ (A2C1) test administration design is introduced to avoid challenges associated with nonignorable missing data. Experimental results suggest that the A2C1 design and payout structure encouraged subjects to choose items consistent with their cognitive trait levels. Substantively, the experimental data suggest that item choices yielded information and discrimination ability comparable to those of cognitive items. Given that there are no clear guidelines for writing more or less discriminating items, one practical implication is that choice can serve as a mechanism to improve score precision.
Keywords: high-stakes testing, item response theory, Thurstonian models, Bayesian statistics, choice
This research was made possible by a grant from the Illinois Campus Research Board. The authors acknowledge undergraduate research assistants Yusheng Feng, Simon Gaberov, Kulsumjeham Siddiqui, and Darren Ward for assistance with data collection.