Abstract
Three plausible assumptions of conditional independence in a hierarchical model for responses and response times on test items are identified. For each of the assumptions, a Lagrange multiplier test of the null hypothesis of conditional independence against a parametric alternative is derived. The tests have closed-form statistics that are easy to calculate from the standard estimates of the person parameters in the model. In addition, simple closed-form estimators of the parameters under the alternatives of conditional dependence are presented, which can be used to explore model modification. The tests were applied to a data set from a large-scale computerized exam and showed excellent power to detect even minor violations of conditional independence.
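For orientation, the general shape of a Lagrange multiplier (score) statistic can be sketched as below. This is the textbook form only, not the specific closed-form statistics derived in the article. The symbols are notation assumed here for illustration: $\eta$ collects the parameters of the null model, $\delta$ collects the parameters added under the alternative of conditional dependence, and $I$ is the information matrix partitioned accordingly.

```latex
% Generic Lagrange multiplier (score) test of H_0: \delta = 0.
% All symbols are illustrative assumptions, not the article's notation.
\[
  h(\eta,\delta) = \frac{\partial \log L(\eta,\delta)}{\partial \delta},
  \qquad
  \mathrm{LM} = h(\hat{\eta}, 0)^{\top} \, W^{-1} \, h(\hat{\eta}, 0),
\]
\[
  W = I_{\delta\delta} - I_{\delta\eta} \, I_{\eta\eta}^{-1} \, I_{\eta\delta},
\]
% with every block of I evaluated at (\hat{\eta}, 0), where \hat{\eta} is the
% maximum likelihood estimate under H_0. Under H_0, LM is asymptotically
% chi-square distributed with \dim(\delta) degrees of freedom.
```

Because every quantity is evaluated at the estimates under the null hypothesis, a statistic of this kind requires fitting only the null model, which is consistent with the abstract's remark that the tests can be computed from the standard estimates.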
Additional information
This study received funding from the Law School Admission Council (LSAC). The opinions and conclusions contained in this paper are those of the author and do not necessarily reflect the policy and position of LSAC.
Wim J. van der Linden is now at CTB/McGraw-Hill.
Rights and permissions
Open Access: This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
van der Linden, W.J., Glas, C.A.W. Statistical Tests of Conditional Independence Between Responses and/or Response Times on Test Items. Psychometrika 75, 120–139 (2010). https://doi.org/10.1007/s11336-009-9129-9
DOI: https://doi.org/10.1007/s11336-009-9129-9