Volume 80, Issue 4, pp 1105–1122

Analyzing Test-Taking Behavior: Decision Theory Meets Psychometric Theory

  • David V. Budescu
  • Yuanchao Bo


We investigate the implications of penalizing incorrect answers to multiple-choice tests, from the perspective of both test-takers and test-makers. To do so, we use a model that combines a well-known item response theory model with prospect theory (Kahneman and Tversky, Prospect theory: An analysis of decision under risk, Econometrica 47:263–91, 1979). Our results reveal that when test-takers are fully informed of the scoring rule, the use of any penalty has detrimental effects for both test-takers (they are always penalized in excess, particularly those who are risk averse and loss averse) and test-makers (the bias of the estimated scores, as well as the variance and skewness of their distribution, increase as a function of the severity of the penalty).


multiple-choice tests, guessing, formula scoring, partial information, decision theory, loss aversion, mis-calibration of probabilities



We wish to thank Drs. Jason Dana, Tzur Karelitz, Charles Lewis, Yigal Attali, and three anonymous reviewers for their thoughtful comments on earlier versions of the paper. This work was supported, in part, by the Anastasi Fellowship at Fordham University.


  1. Bar-Hillel, M., Budescu, D. V., & Attali, Y. (2005). Scoring and keying multiple choice tests: A case study in irrationality. Mind and Society, 4, 2–12.
  2. Bechger, T. M., Maris, G., & Verstralen, H. H. F. M. (2005). The Nedelsky model for multiple-choice items. In A. van der Ark, M. Croon, & K. Sijtsma (Eds.), New developments in categorical data analysis for the social and behavioral sciences (Chapter 10). New York: Lawrence Erlbaum.
  3. Bereby-Meyer, Y., Meyer, J., & Budescu, D. V. (2003). Decision making under internal uncertainty: The case of multiple-choice tests with different scoring rules. Acta Psychologica, 112, 207–220.
  4. Bereby-Meyer, Y., Meyer, J., & Flascher, O. M. (2002). Prospect theory analysis of guessing in multiple choice tests. Journal of Behavioral Decision Making, 15, 313–327.
  5. Ben-Simon, A., Budescu, D. V., & Nevo, B. (1997). A comparative study of measures of partial knowledge in multiple-choice tests. Applied Psychological Measurement, 21, 65–88.
  6. Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord, & M. R. Novick (Eds.), Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
  7. Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29–51.
  8. Booij, A. S., Van Praag, B. M. S., & Van de Kuilen, G. (2010). A parametric analysis of prospect theory’s functionals for the general population. Theory and Decision, 68, 115–148.
  9. Budescu, D. V., & Bar-Hillel, M. (1993). To guess or not to guess: A decision theoretic view of formula scoring. Journal of Educational Measurement, 30, 227–291.
  10. De Finetti, B. (1965). Methods for discriminating levels of partial knowledge concerning a test item. British Journal of Mathematical and Statistical Psychology, 18, 87–123.
  11. Diamond, J., & Evans, W. (1973). The correction for guessing. Journal of Educational Research, 43, 181–191.
  12. Delgado, A. R. (2007). Using the Rasch model to quantify the causal effect of test instructions. Behavioral Research Methods, 39, 570–573.
  13. Embretson, S. E., & Reise, S. (2000). Item response theory for psychologists. Mahwah, NJ: Erlbaum Publishers.
  14. Espinosa, M. P., & Gardeazabal, J. (2010). Optimal correction for guessing in multiple-choice tests. Journal of Mathematical Psychology, 54, 415–425.
  15. Espinosa, M. P., & Gardeazabal, J. (2013). Do students behave rationally in multiple choice tests? Evidence from a field experiment. Journal of Economics and Management, 9, 107–135.
  16. Frary, R. B. (1988). Formula scoring of multiple choice tests (Correction for guessing). Educational Measurement: Issues and Practice, 7, 33–38.
  17. Holzinger, K. J. (1924). On scoring multiple response tests. Journal of Educational Psychology, 15, 445–447.
  18. Johnson, T. R., Budescu, D. V., & Wallsten, T. S. (2001). Averaging probability judgments: Monte Carlo analyses of asymptotic diagnostic values. Journal of Behavioral Decision Making, 14, 123–140.
  19. Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263–291.
  20. Kahneman, D., & Tversky, A. (2000). Choices, values and frames. New York: Cambridge University Press.
  21. Karelitz, T. M., & Budescu, D. V. (2013). The effect of the raters’ marginal distributions on their matched agreement: A rescaling framework for interpreting Kappa. Multivariate Behavioral Research, 48(6), 923–952.
  22. Karmarkar, U. S. (1978). Subjectively weighted utility: A descriptive extension of the expected utility model. Organizational Behavior and Human Performance, 21, 61–72.
  23. Kruskal, W. H. (1958). Ordinal measures of association. Journal of the American Statistical Association, 53, 814–861.
  24. Lichtenstein, S., Fischhoff, B., & Phillips, L. (1982). Calibration and probabilities: The state of the art to 1980. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 306–334). Cambridge: Cambridge University Press.
  25. Lord, F. M. (1975). Formula scoring and number-right scoring. Journal of Educational Measurement, 12, 7–12.
  26. Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
  27. Markowitz, H. M. (1959). Portfolio selection: Efficient diversification of investments. New York: Wiley.
  28. Merkle, E. C., Smithson, M., & Verkuilen, J. (2011). Hierarchical models of simple mechanisms underlying confidence in decision making. Journal of Mathematical Psychology, 55, 57–67.
  29. Nedelsky, L. (1954). Absolute grading standards for objective tests. Educational and Psychological Measurement, 14, 3–19.
  30. Samejima, F. (1970). A new family of models for the multiple-choice item (Office of Naval Research Rep. 79–4, N400014–77-C-0360). Knoxville: University of Tennessee, Department of Psychology.
  31. San Martín, E., del Pino, G., & De Boeck, P. (2006). IRT models for ability-based guessing. Applied Psychological Measurement, 30, 183–203.
  32. Stott, H. P. (2006). Cumulative prospect theory’s functional menagerie. Journal of Risk and Uncertainty, 32, 101–130.
  33. Thissen, D., & Steinberg, L. (1984). A response model for multiple choice items. Psychometrika, 49, 501–519.
  34. Thurstone, L. L. (1919). A method for scoring tests. Psychological Bulletin, 16, 235–240.
  35. Traub, R. E., Hambleton, R. K., & Singh, B. (1969). Effects of promised reward and threatened penalty on performance of a multiple choice vocabulary test. Educational and Psychological Measurement, 29, 847–861.
  36. Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5, 297–323.
  37. Tversky, A., & Kahneman, D. (1991). Loss aversion in riskless choice: A reference-dependent model. The Quarterly Journal of Economics, 106, 1039–1061.
  38. Wallsten, T. S., & Budescu, D. V. (1983). Encoding subjective probabilities: A psychological and psychometric review. Management Science, 29, 151–173.
  39. Williams, C. A. (1966). Attitudes towards speculative risk as an indicator of attitudes towards pure risk. Journal of Risk and Insurance, 33, 577–587.
  40. Wright, G., & Ayton, P. (1994). Subjective probability. Chichester: Wiley.

Copyright information

© The Psychometric Society 2014

Authors and Affiliations

  1. Department of Psychology, Fordham University, Bronx, USA
