Abstract
Response time, a data source conveniently captured by computerized tests, can also provide informative evidence for the validity of test scores, offering insight into the parts of a test that test takers linger over and the parts they glide through. Given these hopes for, and expectations of, response time data, we should critically evaluate how such data are studied and, more importantly, the extent to which they live up to their promise. We begin this chapter by defining response time and briefly discussing its use as validity evidence. We then describe the typical uses of response time data and review the different approaches to studying them in the research literature. The chapter closes with an evaluation of response time data as validity evidence and suggests conditions that would facilitate better use of such data for validation purposes.
Cite this chapter
Li, Z., Banerjee, J., Zumbo, B.D. (2017). Response Time Data as Validity Evidence: Has It Lived Up To Its Promise and, If Not, What Would It Take to Do So. In: Zumbo, B., Hubley, A. (eds) Understanding and Investigating Response Processes in Validation Research. Social Indicators Research Series, vol 69. Springer, Cham. https://doi.org/10.1007/978-3-319-56129-5_9
Print ISBN: 978-3-319-56128-8
Online ISBN: 978-3-319-56129-5