Simulation Study of Scoring Methods for Various Multiple-Multiple-Choice Items

  • Sayaka Arai
  • Hisao Miyano
Conference paper
Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS, volume 265)


The multiple-choice (MC) format is the most widely used format in objective testing. “Select all the choices that are true” items, also called multiple-multiple-choice (MMC) items, are a variation of the MC format that gives no instructions about how many correct choices may be selected. Although many studies have been conducted and various scoring methods for MMC items have been compared, the results have often been inconsistent. Arai and Miyano (Bull Data Anal Japan Classif Soc 6:101–112, 2017) proposed new scoring methods and compared their scoring features by conducting numerical simulations of a few MMC item patterns. In this study, we conducted numerical simulations of all other plausible MMC item patterns to examine the relationships between examinees’ abilities (true scores) and the scores given by each scoring method. We illustrated the effects of the total number of choices and the number of correct choices for each scoring method.
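
The kind of simulation described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the response model (each choice is judged correctly and independently with probability `ability`), the two generic scoring rules (all-or-nothing and proportion of matched choices), and all function and parameter names. It does not reproduce the scoring methods proposed by Arai and Miyano (2017) or the simulation design of this study.

```python
import random


def simulate_response(n_choices, n_correct, ability, rng):
    """Simulate one examinee's response to one MMC item.

    Assumption (not from the paper): the examinee judges each choice
    independently and gets each judgment right with probability `ability`.
    """
    # Answer key: the first n_correct choices are true, the rest are false.
    key = [i < n_correct for i in range(n_choices)]
    # A correct judgment reproduces the key; an incorrect one flips it.
    response = [k if rng.random() < ability else (not k) for k in key]
    return key, response


def score_all_or_nothing(key, response):
    """Full credit only if every choice is marked exactly as in the key."""
    return 1.0 if key == response else 0.0


def score_proportion_matched(key, response):
    """Partial credit: fraction of choices whose marking matches the key."""
    matches = sum(k == r for k, r in zip(key, response))
    return matches / len(key)


def mean_scores(n_choices, n_correct, ability, n_trials=10000, seed=0):
    """Average both scores over repeated simulated responses."""
    rng = random.Random(seed)
    total_aon = 0.0
    total_pm = 0.0
    for _ in range(n_trials):
        key, response = simulate_response(n_choices, n_correct, ability, rng)
        total_aon += score_all_or_nothing(key, response)
        total_pm += score_proportion_matched(key, response)
    return total_aon / n_trials, total_pm / n_trials


if __name__ == "__main__":
    # Compare the two rules across ability levels for a 5-choice item.
    for ability in (0.5, 0.7, 0.9):
        aon, pm = mean_scores(n_choices=5, n_correct=2, ability=ability)
        print(f"ability={ability:.1f}  all-or-nothing={aon:.3f}  matched={pm:.3f}")
```

Under this simplified model, the expected all-or-nothing score falls as the number of choices grows (it is `ability ** n_choices`), while the expected proportion-matched score stays at `ability`. Because the model treats true and false choices symmetrically, the number of correct choices does not affect either rule here; a model in which examinees treat true and false choices differently would be needed to study that effect.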


Multiple-multiple-choice items; Scoring method


  1. Albanese, M. A., & Sabers, D. L. (1988). Multiple true-false items: A study of interitem correlations, scoring alternatives, and reliability estimation. Journal of Educational Measurement, 25(2), 111–123.
  2. Arai, S., & Miyano, H. (2017). Scoring method for “Select All the Choices That Are True” items. Bulletin of Data Analysis of Japanese Classification Society, 6(1), 101–112 (in Japanese).
  3. Cronbach, L. J. (1941). An experimental comparison of the multiple true-false and multiple multiple-choice tests. Journal of Educational Psychology, 32(7), 533.
  4. Domnich, A., Panatto, D., Arata, L., Bevilacqua, I., Apprato, L., Gasparini, R., et al. (2015). Impact of different scoring algorithms applied to multiple-mark survey items on outcome assessment: An in-field study on health-related knowledge. Journal of Preventive Medicine and Hygiene, 56(4), E162.
  5. Haladyna, T. M. (2004). Developing and validating multiple-choice test items. Mahwah, NJ: Lawrence Erlbaum Associates.
  6. Mayekawa, S. (2018, November 15). Lazy R packages. Retrieved from
  7. Ripkey, D. R., Case, S. M., & Swanson, D. B. (1996). A “new” item format for assessing aspects of clinical competence. Academic Medicine, 71(10), S34–6.
  8. Sokal, R. R., & Sneath, P. H. A. (1963). Principles of numerical taxonomy. W. H. Freeman and Company.
  9. Tsai, F. J., & Suen, H. K. (1993). A brief report on a comparison of six scoring methods for multiple true-false items. Educational and Psychological Measurement, 53(2), 399–404.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. National Center for University Entrance Examinations, Tokyo, Japan
