Response time-based treatment of omitted responses in computer-based testing

Original Paper · Behaviormetrika

Abstract

A new response time-based method for coding omitted item responses in computer-based testing is introduced and illustrated with empirical data. The method is derived from the theory of missing-data problems of Rubin and colleagues and is embedded in an item response theory framework. Its basic idea is to use item response times to test statistically, for each individual item, whether omitted responses are missing completely at random (MCAR) or missing due to a lack of ability and thus not at random (MNAR), with fixed type-1 and type-2 error levels. If the MCAR hypothesis is retained, omitted responses are coded as not administered (NA); otherwise, they are coded as incorrect (0). The empirical illustration draws on the responses given by N = 766 students to 70 items of a computer-based ICT skills test. The new method is compared with the two common deterministic methods of scoring all omitted responses as 0 or all as NA. Response time thresholds between 18 and 58 s were identified. More omitted responses were recoded into 0 (61%) than into NA (39%). The differences in item difficulty were larger when the new method was compared with deterministically scoring omitted responses as NA than when it was compared with scoring them as 0. The variances and reliabilities obtained under the three methods showed only small differences. The paper concludes with a discussion of the practical relevance of the observed effect sizes and with recommendations for applying the new method in the early stage of data processing.
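The abstract's decision rule can be sketched in code. The following is a minimal, hypothetical illustration, not the paper's actual procedure: it uses a Welch two-sample t-test on log response times (the paper cites Welch 1947 and Satterthwaite 1946) as a stand-in for the per-item MCAR test, and `crit` is a placeholder critical value in place of the paper's calibrated type-1/type-2 error control.

```python
import math

def welch_t(x, y):
    # Welch's t statistic with Satterthwaite degrees of freedom
    # for two independent samples with unequal variances.
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    se2 = vx / nx + vy / ny
    if se2 == 0:
        return 0.0, float(nx + ny - 2)
    t = (mx - my) / math.sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df

def code_omissions(rt_omitted, rt_answered, crit=1.97):
    # Per-item decision: if the response times of omitted responses are
    # statistically indistinguishable from those of answered responses,
    # the MCAR hypothesis is retained and omissions are coded as not
    # administered ("NA"); otherwise they are scored as incorrect ("0").
    # `crit` is a placeholder, not the paper's calibrated critical value.
    t, _ = welch_t([math.log(r) for r in rt_omitted],
                   [math.log(r) for r in rt_answered])
    return "NA" if abs(t) < crit else "0"
```

In the paper this decision is made separately for each of the 70 items, which is how item-specific response time thresholds (18–58 s) arise.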


Notes

  1. At this point, we discuss missing responses at a general level. A distinction between the types “missing by design”, “not reached”, and “omitted” is introduced later in the text in the section “Types of Missing Item Responses”.

References

  • Arbuckle JL (1996) Full information estimation in the presence of incomplete data. In: Marcoulides GA, Schumacker RE (eds) Advanced structural equation modeling. Lawrence Erlbaum Publishers, Mahwah, pp 243–277

  • Champely S (2017) pwr: basic functions for power analysis [software]. R package version 1.2-1

  • Chen HY, Little RJA (1999) A test of missing completely at random for generalised estimating equations with missing data. Biometrika 86:1–13

  • R Core Team (2017) R: a language and environment for statistical computing [software]. R Foundation for Statistical Computing, Vienna

  • De Ayala RJ (2009) The theory and practice of item response theory. The Guilford Press, New York

  • De Ayala RJ, Plake BS, Impara JC (2001) The impact of omitted responses on the accuracy of ability estimation in item response theory. J Educ Meas 38:213–234

  • Diggle PJ (1989) Testing for random dropouts in repeated measurement data. Biometrics 45:1255–1258

  • Enders CK (2010) Applied missing data analysis. Guilford Press, New York

  • Enders CK, Bandalos DL (2001) The relative performance of full information maximum likelihood estimation for missing data in structural equation models. Struct Equ Model 8:430–457

  • Engelhardt L, Naumann J, Goldhammer F, Frey A, Wenzel SFC, Hartig K, Horz H (2018) Convergent evidence for validity of a performance-based ICT skills test. Eur J Psychol Assess

  • Finch H (2008) Estimation of item response theory parameters in the presence of missing data. J Educ Meas 45:225–245

  • Frey A, Hartig J, Rupp A (2009) Booklet designs in large-scale assessments of student achievement: theory and practice. Educ Meas: Issues Pract 28:39–53

  • Glas CAW, Pimentel J (2008) Modeling nonignorable missing data in speeded tests. Educ Psychol Measur 68:907–922

  • Goldhammer F, Kroehne U (2014) Controlling individuals’ time spent on task in speeded performance measures: experimental time limits, posterior time limits, and response time modeling. Appl Psychol Meas 38:255–267

  • Goldhammer F, Martens T, Lüdtke O (2017) Conditioning factors of test-taking engagement in PIAAC: an exploratory IRT modelling approach considering person and item characteristics. Large-scale Assess Educ 5(1):18

  • Hedges LV (1981) Distribution theory for Glass’s estimator of effect size and related estimators. J Educ Stat 6:107–128

  • Holman R, Glas CAW (2005) Modelling non-ignorable missing data mechanisms with item response theory models. Br J Math Stat Psychol 58:1–17

  • International ICT Literacy Panel (2002) Digital transformation: a framework for ICT literacy. Princeton, NJ. http://www.ets.org/research/policy_research_reports/publications/report/2002/cjik

  • Kim KH, Bentler PM (2002) Tests of homogeneity of means and covariance matrices for multivariate incomplete data. Psychometrika 67:609–624

  • Köhler C, Pohl S, Carstensen CH (2015) Taking the missing propensity into account when estimating competence scores: evaluation of item response theory models for nonignorable omissions. Educ Psychol Measur 75:850–874

  • Kong X, Wise J, Bhola SL (2007) Setting the response time threshold parameter to differentiate solution behavior from rapid-guessing behavior. Educ Psychol Meas 67:606–619

  • Lee Y-H, Chen H (2011) A review of recent response-time analyses in educational testing. Psychol Test Assess Model 53:359–379

  • Little RJA (1988) A test of missing completely at random for multivariate data with missing values. J Am Stat Assoc 83:1198–1202

  • Little RJA, Rubin DB (2002) Statistical analysis with missing data, 2nd edn. Wiley, Hoboken

  • Lord FM (1974) Estimation of latent ability and item parameters when there are omitted responses. Psychometrika 39(2):247–264

  • Ludlow LH, O’Leary M (1999) Scoring omitted and not-reached items: practical data analysis implications. Educ Psychol Measur 59:615–630

  • Mislevy RJ, Wu PK (1996) Missing responses and IRT ability estimation: omits, choice, time limits, and adaptive testing (RR-96-30-ONR). Educational Testing Service, Princeton

  • Neyman J, Pearson ES (1933) On the testing of statistical hypotheses in relation to probability a priori. Proc Camb Philos Soc 29:492–510

  • O’Muircheartaigh C, Moustaki I (1999) Symmetric pattern models: a latent variable approach to item non-response in attitude scales. J R Stat Soc, Ser A 162:177–194

  • OECD (2016) Technical report of the survey of adult skills (PIAAC), 2nd edn. Author, Paris

  • Robitzsch A (2016) Zu nichtignorierbaren Konsequenzen des (partiellen) Ignorierens fehlender Item Responses im Large-Scale Assessment [On non-negligible consequences of (partially) ignoring missing item responses in large-scale assessments]. In: Suchań B, Wallner-Paschon C, Schreiner C (eds) PIRLS & TIMSS 2011—die Kompetenzen in Lesen, Mathematik und Naturwissenschaft am Ende der Volksschule: Österreichischer Expertenbericht. Leykam, Graz, pp 55–64

  • Robitzsch A, Kiefer T, Wu M (2018) TAM: test analysis modules [software]. R package version 2.9-35

  • Rose N, von Davier M, Xu X (2010) Modeling nonignorable missing data with IRT (ETS Research Report No. RR-10-11). Educational Testing Service, Princeton

  • Rubin DB (1976) Inference and missing data. Biometrika 63:581–592

  • Rubin DB (1987) Multiple imputation for nonresponse in surveys. Wiley, New York

  • Rutkowski L, Gonzales E, von Davier M, Zhou Y (2014) Assessment design for international large-scale assessments. In: Rutkowski L, von Davier M, Rutkowski D (eds) Handbook of international large-scale assessment: background, technical issues, and methods of data analysis. CRC Press, Boca Raton, pp 75–95

  • Satterthwaite FE (1946) An approximate distribution of estimates of variance components. Biometr Bull 2:110–114. https://doi.org/10.2307/3002019

  • van der Linden WJ (2009) Conceptual issues in response-time modeling. J Educ Meas 46:247–272

  • Warm TA (1989) Weighted likelihood estimation of ability in item response theory. Psychometrika 54:427–450

  • Weeks JP, von Davier M, Yamamoto K (2016) Using response time data to inform the coding of omitted responses. Psychol Test Assess Model 58:671–701

  • Welch BL (1947) The generalization of “Student’s” problem when several different population variances are involved. Biometrika 34:28–35. https://doi.org/10.1093/biomet/34.1-2.28

  • Wenzel SFC, Engelhardt L, Hartig K, Kuchta K, Frey A, Goldhammer F, Naumann J, Horz H (2016) Computergestützte, adaptive und verhaltensnahe Erfassung Informations- und Kommunikationstechnologie-bezogener Fertigkeiten (ICT-Skills) [Computerized, adaptive, and behaviorally oriented measurement of information and communication technology-related skills (ICT skills)]. In: Bundesministerium für Bildung und Forschung (BMBF) Referat Bildungsforschung (ed) Forschungsvorhaben in Ankopplung an Large-Scale Assessments. Silber Druck, Niestetal, pp 161–180

  • Wilkinson L, Task Force on Statistical Inference, American Psychological Association, Science Directorate (1999) Statistical methods in psychology journals: guidelines and explanations. Am Psychol 54:594–604. https://doi.org/10.1037/0003-066X.54.8.594

  • Wise SL, Kingsbury GG (2016) Modeling student test-taking motivation in the context of an adaptive achievement test. J Educ Meas 53:86–105

  • Wise SL, Ma L (2012) Setting response time thresholds for a CAT item pool: the normative threshold method. Paper presented at the annual meeting of the National Council on Measurement in Education, Vancouver, Canada

Author information

Corresponding author

Correspondence to Andreas Frey.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

About this article

Cite this article

Frey, A., Spoden, C., Goldhammer, F. et al. Response time-based treatment of omitted responses in computer-based testing. Behaviormetrika 45, 505–526 (2018). https://doi.org/10.1007/s41237-018-0073-9
