Abstract
This experimental study was designed to quantify, by means of the Rasch model (RM), the effects of three instruction/scoring conditions on student measures and on the reliability of a multiple-choice achievement test in a field context. Examinees took the test under one of three conditions, which differed only in the instructions provided. Predictions regarding performance indicators were fulfilled, and the expected differences in reliability favoring omission-inducing instructions did appear. This difference in reliability was found for both Rasch and raw data; the failure of previous studies to corroborate this prediction can therefore be attributed to the lack of important consequences of test scores for the students. The RM served to quantify neatly the differences between instructions promoting guessing and instructions promoting omission under uncertainty, showing that the recommendation to omit is not only educationally but also psychometrically sound.
Cite this article
Delgado, A.R. Using the Rasch model to quantify the causal effect of test instructions. Behavior Research Methods 39, 570–573 (2007). https://doi.org/10.3758/BF03193027