Abstract
A new response time-based method for coding omitted item responses in computer-based testing is introduced and illustrated with empirical data. The method is derived from the theory of missing data of Rubin and colleagues and is embedded in an item response theory framework. Its basic idea is to use item response times to test statistically, for each individual item, whether omitted responses are missing completely at random (MCAR) or missing due to a lack of ability and thus not at random (MNAR), with fixed Type I and Type II error levels. If the MCAR hypothesis is retained, omitted responses are coded as not administered (NA); otherwise, they are coded as incorrect (0). The empirical illustration draws on the responses given by N = 766 students to 70 items of a computer-based ICT skills test. The new method is compared with the two common deterministic methods of scoring all omitted responses as 0 or all as NA. Response time thresholds between 18 and 58 s were identified. More omitted responses were recoded to 0 (61%) than to NA (39%). The differences in item difficulty were larger when the new method was compared with deterministically scoring omitted responses as NA than when it was compared with scoring them as 0. The variances and reliabilities obtained under the three methods differed only slightly. The paper concludes with a discussion of the practical relevance of the observed effect sizes and with recommendations for applying the new method in the early stages of data processing.
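The recoding logic described in the abstract can be sketched in a few lines. The sketch below is a simplified, hypothetical illustration, not the authors' implementation: it assumes the per-item response time thresholds (which the paper derives from a statistical MCAR test with fixed Type I and Type II error levels) are already given, and it reads an omission with a response time at or above the item's threshold as an engaged but failed attempt (MNAR, scored 0) and one below the threshold as not administered (MCAR, scored NA). All function and variable names are invented for this example.

```python
OMITTED = None  # marker for an omitted response in the raw data


def recode_omissions(responses, times, thresholds):
    """Recode omitted responses item by item using response-time thresholds.

    responses  : dict item_id -> raw response (OMITTED if the item was skipped)
    times      : dict item_id -> response time in seconds
    thresholds : dict item_id -> response-time threshold in seconds

    An omission with time >= threshold is scored as incorrect (0, MNAR);
    an omission below the threshold is scored as not administered ("NA", MCAR).
    Answered items are passed through unchanged.
    """
    recoded = {}
    for item, resp in responses.items():
        if resp is OMITTED:
            recoded[item] = 0 if times[item] >= thresholds[item] else "NA"
        else:
            recoded[item] = resp
    return recoded
```

For example, with a threshold of 18 s, an omission after 40 s would be recoded to 0, while an omission after 5 s would become "NA"; the substantive work of the method lies in justifying each item's threshold, which this sketch takes as input.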
Notes
At this point, we discuss missing responses at a general level. A distinction between the types “missing by design”, “not reached”, and “omitted” is introduced later in the text in the section “Types of Missing Item Responses”.
References
Arbuckle JL (1996) Full information estimation in the presence of incomplete data. In: Marcoulides GA, Schumacker RE (eds) Advanced structural equation modeling. Lawrence Erlbaum Publishers, Mahwah, pp 243–277
Champely S (2017) pwr: basic functions for power analysis [software]. R package version 1.2-1
Chen HY, Little RJA (1999) A test of missing completely at random for generalised estimating equations with missing data. Biometrika 86:1–13
R Core Team (2017) R: a language and environment for statistical computing [software]. R Foundation for Statistical Computing, Vienna
De Ayala RJ (2009) The theory and practice of item response theory. The Guildford Press, New York
De Ayala RJ, Plake BS, Impara JC (2001) The impact of omitted responses on the accuracy of ability estimation in item response theory. J Educ Meas 38:213–234
Diggle PJ (1989) Testing for random dropouts in repeated measurement data. Biometrics 45:1255–1258
Enders CK (2010) Applied missing data analysis. Guilford Press, New York
Enders CK, Bandalos DL (2001) The relative performance of full information maximum likelihood estimation for missing data in structural equation models. Struct Equ Model 8:430–457
Engelhardt L, Naumann J, Goldhammer F, Frey A, Wenzel SFC, Hartig K, Horz H (2018) Convergent evidence for validity of a performance-based ICT skills test. Eur J Psychol Assess
Finch H (2008) Estimation of item response theory parameters in the presence of missing data. J Educ Meas 45:225–245
Frey A, Hartig J, Rupp A (2009) Booklet designs in large-scale assessments of student achievement: theory and practice. Educ Meas: Issues Pract 28:39–53
Glas CAW, Pimentel J (2008) Modeling nonignorable missing data in speeded tests. Educ Psychol Measur 68:907–922
Goldhammer F, Kroehne U (2014) Controlling individuals’ time spent on task in speeded performance measures: experimental time limits, posterior time limits, and response time modeling. Appl Psychol Meas 38:255–267
Goldhammer F, Martens T, Lüdtke O (2017) Conditioning factors of test-taking engagement in PIAAC: an exploratory IRT modelling approach considering person and item characteristics. Large-scale Assess Educ 5(1):18
Hedges LV (1981) Distribution theory for Glass’s estimator of effect size and related estimators. J Educ Stat 6:107–128
Holman R, Glas CAW (2005) Modelling non-ignorable missing data mechanisms with item response theory models. Br J Math Stat Psychol 58:1–17
International ICT Literacy Panel (2002) Digital transformation: a framework for ICT literacy. Princeton, NJ. http://www.ets.org/research/policy_research_reports/publications/report/2002/cjik
Kim KH, Bentler PM (2002) Tests of homogeneity of means and covariance matrices for multivariate incomplete data. Psychometrika 67:609–624
Köhler C, Pohl S, Carstensen CH (2015) Taking the missing propensity into account when estimating competence scores: evaluation of item response theory models for nonignorable omissions. Educ Psychol Measur 75:850–874
Kong XJ, Wise SL, Bhola DS (2007) Setting the response time threshold parameter to differentiate solution behavior from rapid-guessing behavior. Educ Psychol Meas 67:606–619
Lee Y-H, Chen H (2011) A review of recent response-time analyses in educational testing. Psychol Test Assess Model 53:359–379
Little RJA (1988) A test of missing completely at random for multivariate data with missing values. J Am Stat Assoc 83:1198–1202
Little RJA, Rubin DB (2002) Statistical analysis with missing data, 2nd edn. Wiley, Hoboken
Lord FM (1974) Estimation of latent ability and item parameters when there are omitted responses. Psychometrika 39(2):247–264
Ludlow LH, O’Leary M (1999) Scoring omitted and not-reached items: practical data analysis implications. Educ Psychol Measur 59:615–630
Mislevy RJ, Wu PK (1996) Missing responses and IRT ability estimation: Omits, choice, time limits, and adaptive testing (RR-96-30-ONR). Educ Testing Serv, Princeton
Neyman J, Pearson ES (1933) On the testing of statistical hypotheses in relation to probability a priori. Proc Camb Philos Soc 29:492–510
O’Muircheartaigh C, Moustaki I (1999) Symmetric pattern models: a latent variable approach to item non-response in attitude scales. J R Stat Soc, Ser A 162:177–194
OECD (2016) Technical report of the survey of adult skills (PIAAC), 2nd edn. Author, Paris
Robitzsch A (2016) Zu nichtignorierbaren Konsequenzen des (partiellen) Ignorierens fehlender Item Responses im Large-Scale Assessment [On non-negligible consequences of (partially) ignoring missing item responses in large-scale assessments]. In: Suchań B, Wallner-Paschon C, Schreiner C (eds) PIRLS & TIMSS 2011—die Kompetenzen in Lesen, Mathematik und Naturwissenschaft am Ende der Volksschule: Österreichischer Expertenbericht. Leykam, Graz, pp 55–64
Robitzsch A, Kiefer T, Wu M (2018) TAM: test analysis modules [software]. R package version 2.9-35
Rose N, von Davier M, Xu X (2010) Modeling nonignorable missing data with item response theory (ETS Research Report No. RR-10-11). Educational Testing Service, Princeton
Rubin DB (1976) Inference and missing data. Biometrika 63:581–592
Rubin DB (1987) Multiple imputation for nonresponse in surveys. Wiley, New York
Rutkowski L, Gonzales E, von Davier M, Zhou Y (2014) Assessment design for international large-scale assessments. In: Rutkowski L, von Davier M, Rutkowski D (eds) Handbook of international large-scale assessment: background, technical issues, and methods of data analysis. CRC Press, Boca Raton, pp 75–95
Satterthwaite FE (1946) An approximate distribution of estimates of variance components. Biometr Bull 2:110–114. https://doi.org/10.2307/3002019
van der Linden WJ (2009) Conceptual issues in response-time modeling. J Educ Meas 46:247–272
Warm TA (1989) Weighted likelihood estimation of ability in item response theory. Psychometrika 54:427–450
Weeks JP, von Davier M, Yamamoto K (2016) Using response time data to inform the coding of omitted responses. Psychol Test Assess Model 58:671–701
Welch BL (1947) The generalization of “Student’s” problem when several different population variances are involved. Biometrika 34:28–35. https://doi.org/10.1093/biomet/34.1-2.28
Wenzel SFC, Engelhardt L, Hartig K, Kuchta K, Frey A, Goldhammer F, Naumann J, Horz H (2016) Computergestützte, adaptive und verhaltensnahe Erfassung Informations- und Kommunikationstechnologie-bezogener Fertigkeiten (ICT-Skills) [Computerized, adaptive, and behaviorally oriented measurement of information and communication technology-related skills (ICT skills)]. In: Bundesministerium für Bildung und Forschung (BMBF) Referat Bildungsforschung (ed) Forschungsvorhaben in Ankopplung an Large-Scale Assessments. Silber Druck, Niestetal, pp 161–180
Wilkinson L, Task Force on Statistical Inference, American Psychological Association, Science Directorate (1999) Statistical methods in psychology journals: guidelines and explanations. Am Psychol 54:594–604. https://doi.org/10.1037/0003-066X.54.8.594
Wise SL, Kingsbury GG (2016) Modeling student test-taking motivation in the context of an adaptive achievement test. J Educ Meas 53:86–105
Wise SL, Ma L (2012) Setting response time thresholds for a CAT item pool: the normative threshold method. Paper presented at the annual meeting of the National Council on Measurement in Education, Vancouver, Canada
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Cite this article
Frey, A., Spoden, C., Goldhammer, F. et al. Response time-based treatment of omitted responses in computer-based testing. Behaviormetrika 45, 505–526 (2018). https://doi.org/10.1007/s41237-018-0073-9