Abstract
Response time, a data source conveniently captured by computerized tests, can also provide informative evidence for the validity of test scores, offering insight into the parts of a test that test takers linger over and the parts they glide through. Given these hopes for, and expectations of, response time data, we should critically evaluate how such data are studied and, more importantly, the extent to which they live up to their promise. We begin this chapter by defining response time and briefly discussing its use as validity evidence. We then describe the typical uses of response time data and review the different approaches to studying them in the research literature. The chapter closes with an evaluation of response time data as validity evidence and suggests conditions that would facilitate better use of such data for validation purposes.
Cite this chapter
Li, Z., Banerjee, J., Zumbo, B.D. (2017). Response Time Data as Validity Evidence: Has It Lived Up To Its Promise and, If Not, What Would It Take to Do So. In: Zumbo, B., Hubley, A. (eds) Understanding and Investigating Response Processes in Validation Research. Social Indicators Research Series, vol 69. Springer, Cham. https://doi.org/10.1007/978-3-319-56129-5_9
Print ISBN: 978-3-319-56128-8
Online ISBN: 978-3-319-56129-5